Monitoring AlwaysOn Health with PowerShell - Part 2: Advanced Cmdlet Usage
Published Mar 23 2019 12:38 PM 1,098 Views
Microsoft
First published on MSDN on Feb 13, 2012

Part 2 – Advanced Usage of AlwaysOn Health Cmdlets


In this part, I’ll cover some advanced features of the AlwaysOn health cmdlets.  Please see Part 1 for a basic overview of these cmdlets. We’ll look at how we can drill down into unhealthy objects to discover individual policy failures, and how to include custom policies in health evaluation.  We’ll also talk about some performance considerations.  Please note that all PowerShell commands demonstrated below should be executed within the SQLPS shell.


Recall the AlwaysOn health cmdlets are:



  • Test-SqlAvailabilityGroup

  • Test-SqlAvailabilityReplica

  • Test-SqlDatabaseReplicaState


AlwaysOn Health Model


Read this blog for an in depth look at the AlwaysOn health model.  To briefly summarize: to compute the health of an availability group, availability replica, or availability database, we execute a collection of Policy-Based Management (PBM) policies that are installed along with SQL Server 2012.  These policies test various aspects of your high availability configuration, and based on the results we compute an overall health state of the object in question.  Some policies are categorized as more critical than others, so that one policy failure might result in a warning and another in a critical error.  Users can add their own policies to this collection to extend the health model.


Viewing Policy Evaluation Details


Using our AlwaysOn health cmdlets, you can obtain policy evaluation details in two ways:



  1. Every health result returned by a Test-Sql* cmdlet has a field, PolicyEvaluationDetails.  This field stores an array of policy evaluation details, one entry for each policy executed.

  2. Every Test-Sql* cmdlet has a –ShowPolicyDetails parameter.  If you specify this parameter, the cmdlet will enumerate the policy evaluation details instead of returning a single result.


Here are some examples.  Last time we were investigating the availability group ‘VLEApplication’ which had a critical error.


PS > cd SQLSERVER:\SQL\WSNAVELY1-SH21Q\DEFAULT\AvailabilityGroups
PS > Test-SqlAvailabilityGroup .\VLEApplication

HealthState            Name
-----------            ----
Error                  VLEApplication


We know that the availability group is in a critical state, but we can’t immediately see the policy failures responsible for this.  We’ll first demonstrate method (1) from above for getting this information, namely accessing the PolicyEvaluationDetails field on the object returned by the Test-SqlAvailabilityGroup cmdlet.


PS > $result = Test-SqlAvailabilityGroup .\VLEApplication
PS > $result.PolicyEvaluationDetails | ft Result,Name -AutoSize

Result  Name
------  ----
True    AlwaysOnAgOnlineStateHealthPolicy
True    AlwaysOnAgWSFClusterHealthPolicy
False   AlwaysOnAgAutomaticFailoverHealthPolicy
True    AlwaysOnAgReplicasConnectionHealthPolicy
False   AlwaysOnAgReplicasDataSynchronizationHealthPolicy
True    AlwaysOnAgReplicasRoleHealthPolicy
False   AlwaysOnAgSynchronousReplicasDataSynchronizationHealthPolicy


Note I apply some simple formatting to the output (ft, alias for Format-Table). We see that three policies failed (returned a result of False):



  • AlwaysOnAgAutomaticFailoverHealthPolicy

  • AlwaysOnAgReplicasDataSynchronizationHealthPolicy

  • AlwaysOnAgSynchronousReplicasDataSynchronizationHealthPolicy


At this point, I can connect to the server instance with SQL Server Management Studio and determine the conditions monitored by these policies, and take appropriate action.


Now I’ll demonstrate method (2) from above, using the –ShowPolicyDetails parameter.


PS > Test-SqlAvailabilityGroup .\VLEApplication -ShowPolicyDetails | ft Result,Name -AutoSize

Result  Name
------  ----
True    AlwaysOnAgOnlineStateHealthPolicy
True    AlwaysOnAgWSFClusterHealthPolicy
False   AlwaysOnAgAutomaticFailoverHealthPolicy
True    AlwaysOnAgReplicasConnectionHealthPolicy
False   AlwaysOnAgReplicasDataSynchronizationHealthPolicy
True    AlwaysOnAgReplicasRoleHealthPolicy
False   AlwaysOnAgSynchronousReplicasDataSynchronizationHealthPolicy


Again I apply some simple formatting to the result.  We see that the output of method (1) and (2) is identical.


Including User Policies


You are able to extend the default AlwaysOn health model with your own policies.  I won’t go into details here about how to create these custom policies, but please refer to this post for more information.  By default, the AlwaysOn health cmdlets won’t evaluate user policies; however you can force them to do so by specifying the AllowUserPolicies parameter, supported by all three cmdlets.  This is a simple switch parameter:


Test-SqlAvailabilityGroup .\VLEApplication -ShowPolicyDetails -AllowUserPolicies


Performance Considerations


The AlwaysOn health cmdlet follow this general pattern: accept an object from the pipeline, refresh its properties, then execute a collection of PBM policies against the object.  The objects in this case are part of the SQL Server Management Object (SMO) framework.  Test-SqlAvailabilityGroup accepts objects of type AvailabilityGroup , Test-SqlAvailabilityReplica objects of type AvailabilityReplica , and Test-SqlDatabaseReplicaState objects of type DatabaseReplicaState .  If you are monitoring many objects – say, you have an availability group with 100 databases – this process will be very chatty, since each object refresh will result in a query to the server.


In this scenario, you can suppress the refresh behavior of these cmdlets: simply specify the –NoRefresh parameter.  In this case you will be responsible for ensuring that the SMO objects passed to the cmdlets have recent data.  Let’s consider the scenario mentioned: our availability group “LargeAvailabilityGroup” has 100 databases across two replicas, for a total of 200 databases, and you want to evaluate the health of all these databases.  First let’s use the naïve approach.


PS > $serverObj = New-Object Microsoft.SqlServer.Management.Smo.Server("WSNAVELY1-SH21Q")
PS > $ag = $serverObj.AvailabilityGroups["LargeAvailabilityGroup"]
PS > Measure-Command { $ag.DatabaseReplicaStates | Test-SqlDatabaseReplicaState }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 23
Milliseconds      : 430


This took about 20 seconds to execute, rather a long time.  Now we’ll demonstrate an optimized approach.  First, we’ll want to turn off the auto-refresh behavior of the cmdlet using the –NoRefresh parameter as mentioned.  Next, we need to figure out an efficient way to load the data we need from the server.  Fortunately, SMO provides a mechanism for efficiently loading data for collections of objects, the “SetDefaultInitFields” method on the SMO Server object.  For a detailed explanation of how this mechanism works, please refer to these blog posts by Michiel Wories:



Here’s a complete example with our cmdlet:


PS > $ServerObj = New-Object Microsoft.SqlServer.Management.Smo.Server("WSNAVELY1-SH21Q")
PS > $ServerObj.SetDefaultInitFields([Microsoft.SqlServer.Management.Smo.AvailabilityGroup], $true)
PS > $ServerObj.SetDefaultInitFields([Microsoft.SqlServer.Management.Smo.AvailabilityReplica], $true)
PS > $ServerObj.SetDefaultInitFields([Microsoft.SqlServer.Management.Smo.DatabaseReplicaState], $true)
PS > Measure-Command { $ag.DatabaseReplicaStates | Test-SqlDatabaseReplicaState -NoRefresh }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 7
Milliseconds      : 317


This time the evaluation took about 7 seconds, much faster.  The primary difference here is the number of queries issued.  The first example issues a query for each database, while the second example issues a single query that fetches all the required data.


We've now covered most of the advanced usage scenarios for these cmdlets.  In Part 3 we will combine what we've learned into a simple script.

Version history
Last update:
‎Mar 23 2019 12:38 PM
Updated by: