AlwaysOn availability groups introduce a flexible failover policy that governs how SQL Server instance health is monitored for the availability group resource.
Legacy clustered SQL Server used a LooksAlive check that performed a lightweight test of SQL Server process health, while the legacy IsAlive check connected to SQL Server and executed a simple query.
The AlwaysOn flexible failover policy offers a more comprehensive, configurable health monitoring model. When creating or modifying an availability group, you can set the failure_condition_level property. This property accepts values from 1 to 5: level 1 performs the most lightweight checks, and each higher level adds checks, up to level 5, which includes the most comprehensive internal SQL Server health monitoring.
For more information on availability group flexible failover policy settings, see 'Flexible Failover Policy for Automatic Failover of an Availability Group (SQL Server)'.
The following discussion gives greater detail on the implementation of Windows cluster LooksAlive and IsAlive by SQL Server 2012 AlwaysOn failover cluster instance (SQLFCI) and availability groups.
Once an availability group is created, the host process of the SQL resource DLL sets up health monitoring with SQL Server and begins periodic LooksAlive and IsAlive operations to satisfy health monitoring.
The resource DLL establishes an ODBC connection and begins receiving the results of sp_server_diagnostics at an interval of one third of the availability group's HEALTH_CHECK_TIMEOUT setting.
The resource DLL then begins health monitoring. In SQL Server 2012, LooksAlive and IsAlive perform identical checks to monitor SQL Server instance health.
Under normal operating conditions, LooksAlive executes every second, checking the health of SQL Server based on the availability group resource's failure_condition_level.
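As a rough illustration of the timing described above, the following sketch computes how often sp_server_diagnostics results would arrive for a given HEALTH_CHECK_TIMEOUT. The function name is hypothetical, and the 30000 ms input is the documented default for HEALTH_CHECK_TIMEOUT. This is arithmetic only, not the resource DLL's code.

```python
def diagnostics_interval_ms(health_check_timeout_ms: int) -> float:
    """Return one third of HEALTH_CHECK_TIMEOUT (hypothetical helper)."""
    if health_check_timeout_ms <= 0:
        raise ValueError("HEALTH_CHECK_TIMEOUT must be positive")
    return health_check_timeout_ms / 3


# With the default HEALTH_CHECK_TIMEOUT of 30000 ms,
# a result set is expected roughly every 10 seconds.
print(diagnostics_interval_ms(30000))  # 10000.0
```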
The following describes the algorithm used by LooksAlive and IsAlive to detect SQL Server instance health.
1. Health monitoring cannot be disabled. The health check therefore always performs the checks associated with failure_condition_level=1 and, at minimum, does the following:
   - Check that the SQL Server process is running by performing a query service state operation.
   - Check the health of the lease mechanism.
2. If failure_condition_level is 2 or greater, in addition to the checks above:
   - Check that the last result set of sp_server_diagnostics was received within the time period defined by the availability group's HEALTH_CHECK_TIMEOUT.
3. If failure_condition_level is 3 or greater, in addition to the checks above:
   - Check the system component results returned from the last sp_server_diagnostics result set for an error condition.
4. If failure_condition_level is 4 or greater, in addition to the checks above:
   - Check the resource component results returned from the last sp_server_diagnostics result set for an error condition.
5. If failure_condition_level is 5, in addition to the checks above:
   - Check the query_processing component results returned from the last sp_server_diagnostics result set for an error condition.
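The cumulative nature of the levels above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the resource DLL's actual implementation: the HealthSnapshot class, its field names, and the is_healthy function are all hypothetical, and each component's state is reduced to a single boolean error flag.

```python
from dataclasses import dataclass


@dataclass
class HealthSnapshot:
    """Hypothetical view of what the health check knows at one point in time."""
    service_running: bool              # outcome of the query service state operation
    lease_healthy: bool                # state of the lease mechanism
    seconds_since_last_result: float   # age of the last sp_server_diagnostics result set
    system_error: bool                 # system component reported an error
    resource_error: bool               # resource component reported an error
    query_processing_error: bool       # query_processing component reported an error


def is_healthy(snap: HealthSnapshot, failure_condition_level: int,
               health_check_timeout_s: float) -> bool:
    """Apply the checks for levels 1 through failure_condition_level."""
    if not 1 <= failure_condition_level <= 5:
        raise ValueError("failure_condition_level must be between 1 and 5")
    # Level 1: always performed, because monitoring cannot be disabled.
    if not (snap.service_running and snap.lease_healthy):
        return False
    # Level 2 and above: the last result set must be recent enough.
    if (failure_condition_level >= 2
            and snap.seconds_since_last_result > health_check_timeout_s):
        return False
    # Level 3 and above: system component.
    if failure_condition_level >= 3 and snap.system_error:
        return False
    # Level 4 and above: resource component.
    if failure_condition_level >= 4 and snap.resource_error:
        return False
    # Level 5: query_processing component.
    if failure_condition_level >= 5 and snap.query_processing_error:
        return False
    return True


# A resource component error is ignored at level 3 but reported at level 4.
snap = HealthSnapshot(True, True, 5.0, False, True, False)
print(is_healthy(snap, 3, 30.0))  # True
print(is_healthy(snap, 4, 30.0))  # False
```

Because each level only adds checks, raising failure_condition_level can turn a healthy verdict into an unhealthy one for the same snapshot, but never the reverse.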