First published on MSDN on Jan 24, 2013
In this blog I will discuss how Failover Clustering communicates with cluster resources, along with how clustering detects and recovers when something goes wrong. For the sake of simplicity I will use a Virtual Machine as an example throughout this blog, but the logic is generic and applies to all workloads.
When a Virtual Machine is clustered, a cluster “Virtual Machine” resource is created to control that VM. The “Virtual Machine” resource and its associated resource DLL communicate with the VMMS service to tell the VM when to start and when to stop, and they also perform health checks to ensure the VM is OK.
Resources all run in a component of the Failover Clustering feature called the Resource Hosting Subsystem (RHS). User actions on the VM map to entry point calls that RHS makes to resources, such as Online, Offline, IsAlive, and LooksAlive. The full list of resource DLL entry-point functions is in the Failover Cluster API documentation.
In most cases where a resource goes unresponsive and clustering needs to recover, the entry points involved are LooksAlive and IsAlive, which are health checks on the resource.
Health check calls to the resource continue constantly while resources are online. If a resource returns a failure for the lightweight LooksAlive health check, RHS will then immediately do a more comprehensive health check and call IsAlive to see if the resource is really healthy. A resource is considered failed as the result of an IsAlive failure.
Think of it like this… Every 60 seconds RHS calls IsAlive, basically asking the resource “Are you ok?”, and the resource responds to RHS “Yes, I am doing fine.” This periodic health check goes on and on… until something happens to the resource and it doesn’t respond. Think of it like a dropped call on your cell phone: how long are you willing to sit there going “Hello? Hello? Hello?” before you give up and call the person back, basically resetting the connection?
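The polling cadence behind these health checks can be inspected from PowerShell. The following is a minimal sketch, assuming the FailoverClusters module is installed and the command is run on a node of an existing cluster. It reads the LooksAlivePollInterval and IsAlivePollInterval common properties (in milliseconds) for the “Virtual Machine” resource type. The values you see are specific to your cluster and should not be read as guaranteed defaults.

```powershell
# Assumes the FailoverClusters module and an existing cluster (run on a cluster node).
Import-Module FailoverClusters

# Show how often RHS runs the lightweight (LooksAlive) and comprehensive (IsAlive)
# health checks for resources of the "Virtual Machine" type. Values are in milliseconds.
Get-ClusterResourceType "Virtual Machine" |
    Select-Object Name, LooksAlivePollInterval, IsAlivePollInterval
```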
Failover Clustering has this same concept. RHS will sit there waiting for the resource to respond to an IsAlive call, and eventually it will give up and need to take recovery action. By default RHS will wait for 5 minutes for the resource to respond to an entry point call to it. This is configurable with the resource DeadlockTimeout common property.
To modify the DeadlockTimeout property of an individual resource, use the following PowerShell command (the value is in milliseconds, so 300000 is the 5-minute default):
(Get-ClusterResource "Resource Name").DeadlockTimeout = 300000
Or, if you want to modify the DeadlockTimeout for all resources of a given type, you can set it at the resource type level with the following syntax (this example applies to all virtual machine resources):
(Get-ClusterResourceType "Virtual Machine").DeadlockTimeout = 300000
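Before changing anything, it can help to see what each resource is currently set to. This is a read-only sketch, assuming the FailoverClusters module is available on a cluster node. The DeadlockTimeoutMinutes column is a calculated column added here for readability and is not a cluster property.

```powershell
# Assumes the FailoverClusters module and an existing cluster (run on a cluster node).
Import-Module FailoverClusters

# List every resource with its current DeadlockTimeout (milliseconds),
# plus the same value converted to minutes in a calculated column.
Get-ClusterResource |
    Select-Object Name, ResourceType, DeadlockTimeout,
        @{ Name = 'DeadlockTimeoutMinutes'; Expression = { $_.DeadlockTimeout / 60000 } } |
    Sort-Object Name
```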
Resources are expected to respond to an IsAlive or LooksAlive within a few hundred milliseconds, so waiting 5 minutes for a resource to respond is a really long time. Something pretty bad has happened if a resource that normally responds in milliseconds suddenly takes longer than 5 minutes. So it is generally recommended to stay with the default values.
If the resource doesn’t respond in 5 minutes, RHS decides that there must be something wrong with the resource and that it should take recovery action to get it back up and running. Remember that the resource has gone silent; RHS has no idea what is wrong with it. The only way to recover is to terminate the RHS process; RHS then restarts, which in turn restarts the resource, and everything is back up and running. You may also see the associated entries in the System event log:
Event ID 1230
Cluster resource ‘Resource Name’ (resource type ‘Resource Type Name’, DLL ‘DLL Name’) did not respond to a request in a timely fashion. Cluster health detection will attempt to automatically recover by terminating the Resource Hosting Subsystem (RHS) process running this resource.
Event ID 1146
The cluster Resource Hosting Subsystem (RHS) stopped unexpectedly. An attempt will be made to restart it. This is usually associated with recovery of a crashed or deadlocked resource.
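These two events can be pulled from the System log with Get-WinEvent. The sketch below runs against the local node only. The provider name Microsoft-Windows-FailoverClustering is an assumption about how these events are registered on your system, so drop the ProviderName entry if it returns nothing.

```powershell
# Look for RHS recovery events (1230 and 1146) in the System log on the local node.
# -ErrorAction SilentlyContinue suppresses the error raised when no events match.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'Microsoft-Windows-FailoverClustering'  # assumed provider name
    Id           = 1230, 1146
} -MaxEvents 50 -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, Message |
    Format-List
```

The message text of Event ID 1230 names the resource, resource type, and DLL involved, which tells you where to start looking.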
The next layer of protection is that when clustering issues a request to terminate the RHS process, it will wait four times the DeadlockTimeout value (which equates to 20 minutes by default) for the RHS process to terminate. If RHS does not terminate in 20 minutes, clustering will deem that the server has some serious health issues and will bugcheck the server to force failover and recovery. The bugcheck code will be Stop 0x0000009E (Parameter1, Parameter2, 0x0000000000000005, Parameter4). Note that Parameter3 will always be a value of 0x5 if the bugcheck is the result of an RHS process failing to terminate.
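The 20-minute figure follows directly from the default DeadlockTimeout. A quick sketch of the arithmetic, using variable names chosen here purely for illustration:

```powershell
# DeadlockTimeout is stored in milliseconds; 300000 ms is the 5-minute default.
$deadlockTimeoutMs = 300000

# Clustering waits four times DeadlockTimeout for the RHS process to terminate.
$terminateWaitMs = 4 * $deadlockTimeoutMs

# Convert both values to minutes for readability.
[TimeSpan]::FromMilliseconds($deadlockTimeoutMs).TotalMinutes   # 5
[TimeSpan]::FromMilliseconds($terminateWaitMs).TotalMinutes     # 20
```

If you raise DeadlockTimeout, the wait before a bugcheck grows by the same factor of four.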
This is the way clustering is designed to work… it is monitoring the health of the system, it detects something is wrong, and recovers. This is a good thing!
Impact of RHS Recovery
The Resource Hosting Subsystem (RHS) is the process which hosts resources. If multiple resources are online on a given node, they may share a common RHS process. For example, if you had 5 clustered VMs running on the same node, the resources associated with those VMs would all be running in the same RHS process.
There are some side effects from terminating the RHS process when a resource goes unresponsive. If there are multiple resources hosted on that node, they may be hosted in the same RHS process. That means when RHS terminates and restarts to recover an individual resource, all resources being hosted in that specific RHS process are also restarted. With Windows Server 2008 R2 if you have 5 VMs running on a node, all 5 VMs are going to get restarted.
If a resource becomes unresponsive and causes an RHS crash, the cluster service will deem that specific resource to be suspect and in need of isolation. Think of it as one strike and you are out! The cluster service will automatically set the resource common property SeparateMonitor to mark that resource to run in its own dedicated RHS process, so that if the resource becomes unresponsive again, it will not affect others. This setting is also configurable: you can manually enable a resource to run in its own RHS process, or you can stop a resource from running in its own RHS process once the past issue that isolated it has been addressed.
To modify the SeparateMonitor property of an individual resource, use the following PowerShell command:
(Get-ClusterResource "Resource Name").SeparateMonitor = 0
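To see which resources the cluster has already isolated, you can filter on the same property. This is a sketch, assuming the FailoverClusters module on a cluster node. "Resource Name" is a placeholder, and the last command changes cluster configuration, so run it only for a resource you intend to isolate.

```powershell
# Assumes the FailoverClusters module and an existing cluster (run on a cluster node).
Import-Module FailoverClusters

# Resources currently flagged to run in their own dedicated RHS process.
Get-ClusterResource |
    Where-Object { $_.SeparateMonitor } |
    Select-Object Name, ResourceType, OwnerGroup, OwnerNode

# Manually isolate a resource (placeholder name) in its own RHS process.
(Get-ClusterResource "Resource Name").SeparateMonitor = 1
```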
The impact of running resources in their own dedicated RHS process is that each RHS process consumes a little more system resources. If you open Task Manager you will see a series of “Failover Cluster Resource Host Subsystem” processes running, each consuming a few MB of RAM.
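The same view is available from PowerShell. This sketch assumes it is run on a cluster node, where the RHS processes appear under the process name rhs. WorkingSetMB is a calculated column added here for readability, and StartTime may be blank unless the session is elevated.

```powershell
# List the RHS processes on the local node with their memory footprint.
# -ErrorAction SilentlyContinue suppresses the error if no rhs process is running.
Get-Process -Name rhs -ErrorAction SilentlyContinue |
    Select-Object Id, StartTime,
        @{ Name = 'WorkingSetMB'; Expression = { [math]::Round($_.WorkingSet64 / 1MB, 1) } }
```

A StartTime that is much more recent than the others can hint that a particular RHS process was recycled.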
In general clustering will self-manage misbehaving resources. Resources will be given a chance to play nicely with everyone else, and if they don’t they will be automatically isolated to minimize impact. So it is generally recommended to stay with the default values.
There are some feature enhancements in Windows Server 2012 to mitigate the impact of non-responsive resource recovery.
Resource Re-attach: When a resource goes unresponsive the RHS process will recycle just as before, but any healthy resources in a running state will re-attach to the new RHS process without having to be restarted. This means that the impact of recovery is reduced: just 1 VM gets restarted and the other 4 are not impacted.
Additionally resources can also be marked with the SeparateMonitor property to run in their own dedicated RHS process in Windows Server 2012, as they could in previous releases.
Everything we have discussed in this blog to this point has described the expected behavior of how Failover Clustering recovers when something goes wrong with a resource and it becomes unresponsive. Now the most important question… What do you do about it?
Troubleshooting Steps:
Advanced Troubleshooting:
NOTE: DebugBreakOnDeadlock will only create a dump if the RHS process itself deadlocks, not a resource. When a resource deadlocks, RHS will attempt to terminate it. As part of the termination, it should create a WER Report with a small heap and process dump. Once those are completed, RHS will terminate the resource. If the termination succeeds, RHS itself does not deadlock, so no dump is created. This setting may therefore not be needed.
The key take-away is that RHS recovery is expected behavior for a resource that has become unresponsive. To address the root cause issue you need to dig in to which resource is failing and then by understanding what it was attempting to do, you can identify why it didn’t respond.
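One way to start digging is to generate the cluster log and search it for the resource named in Event ID 1230. The sketch below assumes the FailoverClusters module on a cluster node. The output folder and the "Resource Name" search string are placeholders, and the 60-minute window is an arbitrary choice.

```powershell
# Assumes the FailoverClusters module and an existing cluster (run on a cluster node).
Import-Module FailoverClusters

# Hypothetical output folder; create it if it does not exist.
$destination = 'C:\Temp\ClusterLogs'
New-Item -ItemType Directory -Path $destination -Force | Out-Null

# Generate the cluster log from each node, covering the last 60 minutes.
Get-ClusterLog -Destination $destination -TimeSpan 60

# Search the generated logs for lines mentioning the resource (placeholder name).
Select-String -Path (Join-Path $destination '*.log') -Pattern 'Resource Name' -SimpleMatch
```

The log entries just before the timeout usually show which entry point call the resource was handling when it stopped responding.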
For additional information on troubleshooting resources that result in RHS recovery, see the blogs below. Microsoft support is also available to assist in advanced debugging to help you identify root cause.
Resource Hosting Subsystem (RHS) In Windows Server 2008 Failover Clusters
http://blogs.technet.com/b/askcore/archive/2009/11/23/resource-hosting-subsystem-rhs-in-windows-ser...
Thanks!
Elden Christensen
Principal PM Manager
Clustering & High-Availability
Microsoft