AlwaysON Secondary database going to “Not Synchronizing/ Suspect” State!

In this blog post I will share an issue we had with a database which is configured with AlwaysON. Before proceeding any further, the environment which we’ve got is:

Each node has Windows Server 2008R2(With all the service packs and hot fixes recommended for AlwaysON)
Running on top of VMware VShpere 5.1
SQL Server 2012(SP1) Enterprise Edition
RAM: 10 GB (8 GB assigned to SQL Server).
2 VCPU’s.
Availability Mode- Synchronous Commit

Issue: Daily around 5 AM, the secondary database is going to “Not Synchronizing/Suspect” state and until we fix this the T-Log on primary grows and all that normal jazz once the AlwaysON databases get out of Sync…(See below)

So, what’s happening?
The App team is performing data load daily around 4.30 AM. Okay…So what’s bad about that? They are loading ~30 Million Records daily, in a single transaction. Oops!!!…
From SQL Server error logs, we see the below message:
Message
AlwaysOn Availability Groups data movement for database ‘Test_DB’ has been suspended for the following reason: “system” (Source ID 2; Source string: ‘SUSPEND_FROM_REDO‘). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online.
This message is always accommodated with another message(Shown below):
Message
The instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. Rerun your statement when there are fewer active users. Ask the database administrator to check the lock and memory configuration for this instance, or to check for long-running transactions.

Ummm…This doesn’t looks good. If you are thinking, what Locks have to do with AlwaysON secondaries, let me tell you this. With Database Mirroring and AlwaysON Rollbacks/Redo thread will also take a lock on the secondary side to avoid any other transaction to interrupt REDO process, thus guaranteeing consistency. If for some reason SQL Server is not able to acquire locks for redo thread it won’t synchronize the database starting that point. (It’s by design).

In our case what’s happening was SQL Server was running out of memory and was not able to acquire any further locks(Remember, each lock structure in SQL Server will need certain amount of memory).Basically, it says “Since I wasn’t able to acquire a lock during the REDO, I don’t know what else happened at that time and I can’t guarantee the database to be consistent. So…am not going to synchronize from this point and I will suspend the data movement and also take the database to Suspect state”).

From AlwaysON standpoint, Suspending Synchronization when the REDO thread encounters any error is by design and is done on purpose by SQL Server.

To avoid this, all they(App team) have to do is optimize their load process to better manage lock acquisition.(We are not being granted any more memory on these boxes unfortunately).

Bottom Line: Avoid huge transactions on tiny SQL Servers. Try to split the transactions into multiple chunks especially when dealing with millions/billions of rows. That helps in general many ways, not just in this particular scenario.

Have a safe and happy long weekend guys!

AlwaysON Secondary database going to “Not Synchronizing/ Suspect” State!

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112