Customizing Managed Availability
Published Aug 13 2013 09:40 AM 62.3K Views

Exchange Server 2013 introduces a new feature called Managed Availability, which is a built-in monitoring system with self-recovery capabilities.

If you’re not familiar with Managed Availability, it’s a good idea to read these posts:

As described in the above posts, Managed Availability performs continuous probing to detect possible problems with Exchange components or their dependencies, and it performs recovery actions to make sure the end user experience is not impacted due to a problem with any of these components.

However, there may be scenarios where the out-of-box settings may not be suitable for your environment. This blog post guides you on how to examine the default settings and modify them to suit your environment.

Managed Availability Components

Let’s start by finding out which health sets are on an Exchange server:

Get-HealthReport -Identity Exch2

This produces the output similar to the following:

image

 

Next, use Get-MonitoringItemIdentity to list out the probes, monitors, and responders related to a health set. For example, the following command lists the probes, monitors, and responders included in the FrontendTransport health set:

Get-MonitoringItemIdentity -Identity FrontendTransport -Server exch1 | ft name,itemtype –AutoSize

This produces output similar to the following:

image

You might notice multiple probes with same name for some components. That’s because Managed Availability creates a probe for each resource. In following example, you can see that OutlookRpcSelfTestProbe is created multiple times (one for each mailbox database present on the server).

image

 

Use Get-MonitoringItemIdentity to list the monitoring Item Identities along with the resource for which they are created:

Get-MonitoringItemIdentity -Identity Outlook.Protocol -Server exch1 | ft name,itemtype,targetresource –AutoSize

image

 

Customize Managed Availability

Managed Availability components (probes, monitors and responders) can be customized by creating an override.

There are two types of override: local override and global override. As their names imply, a local override is available only on the server where it is created, and a global override is used to deploy an override across multiple servers.

Either override can be created for a specific duration or for a specific version of servers.

Local Overrides

Local overrides are managed with the *-ServerMonitoringOverride set of cmdlets. Local overrides are stored under following registry path:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ExchangeServer\v15\ActiveMonitoring\Overrides\

The Microsoft Exchange Health Management service reads this registry path every 10 minutes and loads configuration changes. Alternatively, you can restart the service to make the change effective immediately.

You would usually create a local override to:

  • Customize a managed availability component that is server-specific and not available globally; or
  • Customize a managed availability component on a specific server.

Global Overrides

Global overrides are managed with the *-GlobalMonitoringOverride set of cmdlets. Global overrides are stored in the following container in Active Directory:

CN=Overrides,CN=Monitoring Settings,CN=FM,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=Contoso,DC=com

Get Configuration Details

The configuration details of most of probes, monitors, and responders are stored in the respective crimson channel event log for each monitoring item identity, you would examine these first before deciding to change.

In this example, we will explore properties of a probe named “OnPremisesInboundProxy”, which is part of the FrontendTransport Health Set. The following script lists detail of the OnPremisesInboundProxy probe:

(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ProbeDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "OnPremisesInboundP
roxy"}

You can also use Event Viewer to get the details of probe definition. The configuration details most of the Probes are stores in the ProbeDefinition channel:

  1. Open Event Viewer, and then expand Applications and Services Logs\Microsoft\Exchange\ActiveMonitoring\ProbeDefinition.
  2. Click on Find, and then enter OnPremisesInboundProxy.
  3. The General tab does not show much detail, so click on the Details tab, it has the configuration details specific to this probe. Alternatively, you can copy the event details as text and paste it into Notepad or your favorite editor to see the details.

Override Scenarios

Let’s look at couple real-life scenarios and apply our learning so far to customize managed availability to our liking, starting with local overrides.

Creating a Local Override

In this example, an administrator has customized one of the Inbound Receive connectors by removing the binding of loopback IP address. Later, they discover that the FrontEndTransport health set is unhealthy. On further digging, they determine that the OnPremisesInboundProxy probe is failing.

To figure out why the probe is failing, you can first list the configuration details of OnPremisesInboundProxy probe.

(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ProbeDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "OnPremisesInboundProxy"}

Name : OnPremisesInboundProxy

WorkItemVersion : [null]

ServiceName : FrontendTransport

DeploymentId : 0

ExecutionLocation : [null]

CreatedTime : 2013-08-06T12:54:29.7571195Z

Enabled : 1

TargetPartition : [null]

TargetGroup : [null]

TargetResource : [null]

TargetExtension : [null]

TargetVersion : [null]

RecurrenceIntervalSeconds : 300

TimeoutSeconds : 60

StartTime : 2013-08-06T12:54:36.7571195Z

UpdateTime : 2013-08-06T12:48:27.1418660Z

MaxRetryAttempts : 1

ExtensionAttributes : <ExtensionAttributes><WorkContext><SmtpServer>127.0.0.1</SmtpServer><Port>25</Port><HeloDomain>InboundProxyProbe</HeloDomain><MailFrom Username="inboundproxy@contoso.com"/><MailTo Select="All" Username="HealthMailboxdd618748368a4935b278e884fb41fd8a@FM.com"/><Data AddAttributions="false">X-Exchange-Probe-Drop-Message:FrontEnd-CAT-250 Subject:Inbound proxy probe</Data><ExpectedConnectionLostPoint>None</ExpectedConnectionLostPoint></WorkContext></ExtensionAttributes>

The ExtentionAttributes property above shows that the probe is using 127.0.0.1 for connecting to port 25. As that is the loopback address, the administrator needs to change the SMTP server in ExtentionAttributes property to enable the probe to succeed.

You use following command to create a local override, and change the SMTP server to the hostname instead of loopback IP address.

Add-ServerMonitoringOverride -Server ServerName -Identity FrontEndTransport\OnPremisesInboundProxy -ItemType Probe -PropertyName ExtensionAttributes -PropertyValue '<ExtensionAttributes><WorkContext><SmtpServer>Exch1.contoso.com</SmtpServer><Port>25</Port><HeloDomain>InboundProxyProbe</HeloDomain><MailFrom Username="inboundproxy@contoso.com" /><MailTo Select="All" Username="HealthMailboxdd618748368a4935b278e884fb41fd8a@FM.com" /><Data AddAttributions="false">X-Exchange-Probe-Drop-Message:FrontEnd-CAT-250Subject:Inbound proxy probe</Data><ExpectedConnectionLostPoint>None</ExpectedConnectionLostPoint></WorkContext></ExtensionAttributes>' -Duration 45.00:00:00

The probe will be created on the specified server in following registry path:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ExchangeServer\v15\ActiveMonitoring\Overrides\Probe

You can use following command to verify if the probe is effective:

(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ProbeDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "OnPremisesInboundProxy"}

Creating a Global Override

In this example, the organization has an EWS application that is keeping the EWS app pools busy for complex queries. The administrator discovers that the EWS App pool is recycled during long running queries, and that the EWSProxyTestProbe probe is failing.

To find out the details of EWSProxyTestProbe, run the following:

(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ProbeDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "EWSProxyTestProbe"}

Next, change the timeout interval for EWSProxyTestProbe to 25 seconds on all servers running Exchange Server 2013 RTM CU2:

  1. Use following command to get version information for Exchange 2013 RTM CU2 servers:

    Get-ExchangeServer | ft name,admindisplayversion

  2. Use following command to create a new global override:

    Add-GlobalMonitoringOverride -Identity "EWS.Proxy\EWSProxyTestProbe" -ItemType Probe -PropertyName TimeoutSeconds -PropertyValue 25 –ApplyVersion “15.0.712.24”

Override Durations

Either of above mentioned overrides can be created for a specific Duration or for a specific Version of Exchange servers. The override created with Duration parameter will be effective only for the period mentioned. And maximum duration that can be specified is 60 days. For example, an override created with Duration 45.00:00:00 will be effective for 45 days since the time of creation.

The Version specific override will be effective as long as the Exchange server version matches the value specified. For example, an override created for Exchange 2013 CU1, with version “15.0.620.29” will be effective until the Exchange server version changes. The override will be ineffective if the Exchange server is upgraded to different version of Cumulative Update or Service Pack.

Hence, if you need an override in effect for longer period, create the override using ApplyVersion parameter.

Removing an Override

Finally, this last example shows how to remove the local override that was created for the OnPremisesInbounxProxy probe.

Remove-ServerMonitoringOverride -Server ServerName -Identity FrontEndTransport\OnPremisesInboundProxy -ItemType Probe -PropertyName ExtensionAttributes

 

Conclusion

Managed Availability performs gradual recovery actions to automatically recover from failure scenarios. The overrides help you customize the configuration of the Managed Availability components to suite to your environment. The steps mentioned in the document can be used to customize Monitors & Probes as required.

Special thanks to Abram Jackson, Scott Schnoll, Ben Winzenz, and Nino Bilic for reviewing this post. Smile

Bhalchandra Atre

6 Comments
Not applicable

Does it really have to be this difficult to monitor health?  Why the needs to be so PowerShell dependent?  We know PS is a great tool but it's almost as though we're headed back to the days of 1975 and everything is available only via command line.  I understand the granularity is certainly very worthwhile to those who have the time to drill this far.  However most of us do not.  A lack of monitoring and reporting on Exchange has been an issue since the product was first shipped. Providing better high-level analysis tools in ADDITION to the above would be much more warmly received.  Combined with the changes in Win Server 2012/EMC to EAC and a removal of such tools as ExBPA, it's made the job more difficult than it needs to be.  Thankfully there's people like Steve Goodman around making our lives easier, filling in the gaps on health reporting.

Not applicable

Any chance you could provide an example of how you would override the default 100GB free-space setting for database volumes?  I.e. how would you determine what the property name is for the below message.  Note this would only be used on a development or test environment.

Database drive ***** is below critical threshold on space for last 00:15:00.

Threshold: 100GB

Error: The performance counter '****Free Megabytes' sustained a value of '3,960.00', for the '15' minute(s) interval starting at '22/08/2013 8:42:00 p.m.'. Additional information: None. Trigger Name:DatabaseDriveSpaceTrigger. Instance:*****

Not applicable

Could you speak a little further on action taken, e.g., the loopback on 25 port... in case it fails I'm assuming it's mailing All, but what are the other options?

Thanks

Not applicable

SAT #, Thank you for the comment. I agree the GUI management option would make life little more easier. But,

the best part for now is we have built in monitoring and recovery system.

Matt # The drive space threshold of 100 GB is hard coded, there isn't an option to override the free space threshold.

Jack # The other options you see are used for composing the test email

Not applicable

How can i override search functionality, e.g: whenever outlook or owa search occurs i want "catch" and customize it ?

Another Q: any way to override the search service and ask him to NOT to index specific items

Not applicable

Eran, I guess your queries are w.r.t customizing "Search", not related to managed availability.

Version history
Last update:
‎Jul 01 2019 04:14 PM
Updated by: