The database level health detection failover option is introduced in this article.
In addition to the existing checks, the new implementation adds checks for the following errors.
| Error | Cause | Documentation |
| --- | --- | --- |
| 605 | Page or allocation corruption. | |
| 823 | Checkpoint failures. | |
| 829 | Disk corruption. | |
| 832 | Hardware or memory corruption. | |
| 1101 | No disk space available in a filegroup. | |
| 1105 | No disk space available in a filegroup. | |
| 5102 | Missing filegroup ID requests. | |
| 5180 | Wrong file ID requests. | |
| 5515 | | |
| 5534 | Log corruption due to FILESTREAM operation log record. | |
| 5535 | FILESTREAM data container corruption. | |
| 9004 | Log corruption. | |
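As a rough illustration of how a monitoring script might use the list above, here is a minimal Python sketch that flags whether an error number belongs to the additional checks. The names `DB_HEALTH_ERRORS` and `is_db_health_error` are hypothetical and are not part of SQL Server. This is not how the engine implements the checks internally.

```python
# Error numbers taken from the table above.
# The constant and function names are illustrative assumptions.
DB_HEALTH_ERRORS = frozenset(
    {605, 823, 829, 832, 1101, 1105, 5102, 5180, 5515, 5534, 5535, 9004}
)


def is_db_health_error(error_number: int) -> bool:
    """Return True if the error number is one of the additional checks listed above."""
    return error_number in DB_HEALTH_ERRORS


if __name__ == "__main__":
    for number in (823, 9004, 18456):
        print(number, is_db_health_error(number))
```

A script like this could be used to filter error numbers collected from your own monitoring before deciding which alerts relate to database level health detection.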
If this feature is enabled, the default failover policy needs to be changed to make sure the availability group (AG) can fail over successfully.
The default “Maximum restarts in the specified period” is 1 in 1 hour.
The default “Maximum failures in the specified period” is 1 in 6 hours.
With these settings, if error 823 is reported and cannot be repaired from the secondary replica, the AG might not fail over as expected.
Recommended setting: “Maximum failures in the specified period” >= “Maximum restarts in the specified period” + 1, at a minimum. That way, once all restart attempts have finished and the issue is still detected, the next failure triggers a failover.
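The recommended relationship between the two settings can be sketched as a small check. This is a minimal Python illustration of the rule stated above. The function names are hypothetical, and the sketch deliberately ignores the time windows (1 hour and 6 hours) that the real cluster policy also takes into account.

```python
def min_recommended_failures(max_restarts: int) -> int:
    """Smallest 'maximum failures' value that satisfies the recommendation."""
    return max_restarts + 1


def policy_meets_recommendation(max_failures: int, max_restarts: int) -> bool:
    """Check the rule: maximum failures >= maximum restarts + 1."""
    return max_failures >= min_recommended_failures(max_restarts)


if __name__ == "__main__":
    # Values described as defaults in the text: 1 restart, 1 failure.
    print(policy_meets_recommendation(max_failures=1, max_restarts=1))
    # Raising the failure count by one satisfies the recommendation.
    print(policy_meets_recommendation(max_failures=2, max_restarts=1))
```

With the values described as defaults above (1 restart, 1 failure), the check does not pass. Raising the maximum failures to 2 satisfies it.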