It's not uncommon to want to monitor a multi-instance perfmon counter such as LogicalDisk\% Free Space and be alerted as soon as any one instance drops below the threshold. Authors will typically try to do this by building a monitor through the UI.
You set the threshold, set the critical and healthy states, and save the new monitor to an MP (hopefully not the Default Management Pack).
The end result likely doesn't behave quite as you expected when more than one instance is associated with the selected perfmon counter.
The problem here is in how the counters are processed. Each Logical Disk instance is processed one after the other. If you have 4 logical disks, 2 of which are healthy and 2 of which are unhealthy, the healthy disks overwrite the state of the unhealthy disks as the monitor processes each instance in turn. If the last disk processed is unhealthy, the state stays unhealthy. If the last disk processed is healthy, the state stays healthy, and a notification may have been fired on each state transition along the way.
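The overwrite behavior described above can be sketched in a few lines of Python. This is only a conceptual model, not how the OpsMgr agent is implemented: the function name `run_naive_monitor`, the state labels, and the sample disk values are all made up for illustration.

```python
def run_naive_monitor(samples, threshold):
    """Model one monitor whose single state is overwritten by each instance in turn.

    samples: list of (instance_name, percent_free) tuples, in processing order.
    Returns the final state and the number of state transitions along the way.
    """
    state = None
    transitions = 0
    for _instance, free_pct in samples:
        new_state = "Healthy" if free_pct > threshold else "Critical"
        # Every flip after the first sample is a transition that could notify.
        if state is not None and new_state != state:
            transitions += 1
        state = new_state
    return state, transitions


# Two healthy and two unhealthy disks (hypothetical values).
disks = [("C:", 55.0), ("D:", 12.0), ("E:", 8.0), ("F:", 70.0)]
print(run_naive_monitor(disks, threshold=30.0))  # -> ('Healthy', 2)

# Same disks, different processing order: the last instance decides the state.
reordered = [("C:", 55.0), ("F:", 70.0), ("D:", 12.0), ("E:", 8.0)]
print(run_naive_monitor(reordered, threshold=30.0))  # -> ('Critical', 1)
```

In the first run the monitor ends up healthy even though two disks are below the threshold, and it changed state twice to get there.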
One solution to this is to discover each Logical Disk instance and then target your unit monitor at those instances using the discovery data to filter which instance of Logical Disk you wish to check.
With R2 there's a new way to do this. It requires building a new monitor type, but it eliminates the need to discover the individual instances of a perfmon counter. So you now have a choice: write a discovery or create a monitor type.
The key to building a monitor type that can handle multi-instance counters is a pair of new elements that can be used within a <ConditionDetection> element: <EmptySet>[Passthrough|Block]</EmptySet> and <SetEvaluation>[All|Any]</SetEvaluation>. These elements let you control how the condition detection behaves when it receives input that contains multiple data items.
<EmptySet> specifies what to do with empty data sets. The default (and old behavior) is 'Passthrough', which means the empty set is passed on to the next module in the chain. 'Block' keeps the empty set from going any further.
<SetEvaluation> specifies how to determine whether the condition is met when multiple data items are passed. 'All' tells the condition detection to pass only if all data items meet the expression criteria. 'Any', the default behavior, tells the condition detection to pass along any data items that meet the expression criteria.
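To make the two settings concrete, here is a small Python model of a set-based expression filter. It follows only the description above. The function name `expression_filter`, the use of `None` to mean "nothing is passed on", and the assumption that <EmptySet> applies to an empty incoming set are illustrative choices, not documented module behavior.

```python
def expression_filter(items, predicate, set_evaluation="Any", empty_set="Passthrough"):
    """Return the items passed to the next module, or None if nothing is passed on."""
    if not items:
        # EmptySet decides whether an empty set continues down the chain.
        return [] if empty_set == "Passthrough" else None

    matching = [item for item in items if predicate(item)]
    if set_evaluation == "All":
        # Pass only when every item meets the criteria.
        return list(items) if len(matching) == len(items) else None
    # "Any": pass along the items that meet the criteria, if there are any.
    return matching if matching else None


free_space = [55.0, 12.0, 8.0, 70.0]  # hypothetical % Free Space samples

print(expression_filter(free_space, lambda v: v <= 30.0, "Any", "Block"))       # -> [12.0, 8.0]
print(expression_filter(free_space, lambda v: v > 30.0, "All", "Passthrough"))  # -> None
print(expression_filter([], lambda v: v <= 30.0, "Any", "Block"))               # -> None
print(expression_filter([], lambda v: v > 30.0, "All", "Passthrough"))          # -> []
```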
Currently these tags are not used in any standard MonitorType definitions, so to make use of them you have to define a new monitor type. The following XML defines a simple threshold-style monitor type that lets me determine whether any instance of a performance counter is at or below the configured threshold.
<MonitorTypes>
  <UnitMonitorType ID="CustomSimpleThreshold.ErrorOnAnyBelowThreshold" Accessibility="Public">
    <MonitorTypeStates>
      <MonitorTypeState ID="AboveThreshold" NoDetection="false" />
      <MonitorTypeState ID="BelowThreshold" NoDetection="false" />
    </MonitorTypeStates>
    <Configuration>
      <xsd:element name="ComputerName" type="xsd:string" />
      <xsd:element name="CounterName" type="xsd:string" />
      <xsd:element name="ObjectName" type="xsd:string" />
      <xsd:element name="InstanceName" type="xsd:string" />
      <xsd:element name="AllInstances" type="xsd:boolean" />
      <xsd:element name="Frequency" type="xsd:unsignedInt" />
      <xsd:element name="Threshold" type="xsd:double" />
    </Configuration>
    <OverrideableParameters>
      <OverrideableParameter ID="Frequency" Selector="$Config/Frequency$" ParameterType="int" />
      <OverrideableParameter ID="Threshold" Selector="$Config/Threshold$" ParameterType="double" />
    </OverrideableParameters>
    <MonitorImplementation>
      <MemberModules>
        <DataSource ID="DS_PerfData" TypeID="Performance!System.Performance.DataProvider">
          <ComputerName>$Config/ComputerName$</ComputerName>
          <CounterName>$Config/CounterName$</CounterName>
          <ObjectName>$Config/ObjectName$</ObjectName>
          <InstanceName>$Config/InstanceName$</InstanceName>
          <AllInstances>$Config/AllInstances$</AllInstances>
          <Frequency>$Config/Frequency$</Frequency>
        </DataSource>
        <ConditionDetection ID="AboveThresholdDetection" TypeID="System!System.LogicalSet.ExpressionFilter">
          <Expression>
            <SimpleExpression>
              <ValueExpression>
                <XPathQuery Type="Double">Value</XPathQuery>
              </ValueExpression>
              <Operator>Greater</Operator>
              <ValueExpression>
                <Value Type="Double">$Config/Threshold$</Value>
              </ValueExpression>
            </SimpleExpression>
          </Expression>
          <EmptySet>Passthrough</EmptySet>
          <SetEvaluation>All</SetEvaluation>
        </ConditionDetection>
        <ConditionDetection ID="BelowThresholdDetection" TypeID="System!System.LogicalSet.ExpressionFilter">
          <Expression>
            <SimpleExpression>
              <ValueExpression>
                <XPathQuery Type="Double">Value</XPathQuery>
              </ValueExpression>
              <Operator>LessEqual</Operator>
              <ValueExpression>
                <Value Type="Double">$Config/Threshold$</Value>
              </ValueExpression>
            </SimpleExpression>
          </Expression>
          <EmptySet>Block</EmptySet>
          <SetEvaluation>Any</SetEvaluation>
        </ConditionDetection>
      </MemberModules>
      <RegularDetections>
        <RegularDetection MonitorTypeStateID="BelowThreshold">
          <Node ID="BelowThresholdDetection">
            <Node ID="DS_PerfData" />
          </Node>
        </RegularDetection>
        <RegularDetection MonitorTypeStateID="AboveThreshold">
          <Node ID="AboveThresholdDetection">
            <Node ID="DS_PerfData" />
          </Node>
        </RegularDetection>
      </RegularDetections>
    </MonitorImplementation>
  </UnitMonitorType>
</MonitorTypes>
The AboveThresholdDetection checks whether all of the incoming data items are above the threshold; when this holds true, the state of the monitor is set to AboveThreshold. The BelowThresholdDetection checks whether any of the incoming data items are at or below the threshold (its operator is LessEqual); when this holds true, the state of the monitor is set to BelowThreshold. Note that these two conditions are mutually exclusive: they can't both be true.
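The way the two detections combine into a single state can be modeled with Python's built-in `all()` and `any()`. This is a rough sketch of the logic only. The function name `monitor_state` and the sample values are hypothetical, and treating an empty set that passes through AboveThresholdDetection as the AboveThreshold state is an assumption.

```python
def monitor_state(values, threshold):
    """Return the monitor type state for one batch of counter samples."""
    # AboveThresholdDetection: Operator=Greater, SetEvaluation=All.
    # all() over an empty list is True, which mirrors EmptySet=Passthrough here.
    above_fires = all(v > threshold for v in values)

    # BelowThresholdDetection: Operator=LessEqual, SetEvaluation=Any.
    # any() over an empty list is False, which mirrors EmptySet=Block here.
    below_fires = any(v <= threshold for v in values)

    # The two conditions are complements, so exactly one of them fires.
    assert above_fires != below_fires
    return "AboveThreshold" if above_fires else "BelowThreshold"


print(monitor_state([55.0, 12.0, 8.0, 70.0], 30.0))  # -> BelowThreshold
print(monitor_state([55.0, 70.0], 30.0))             # -> AboveThreshold
print(monitor_state([30.0], 30.0))                   # -> BelowThreshold
```

Because the whole set is evaluated at once, the order in which the instances arrive no longer affects the resulting state.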
Suppose we create a unit monitor using the new monitor type, such as the following example for LogicalDisk\% Free Space:
<UnitMonitor ID="MultiInstancePerfCounterUnitMonitors.CustomFreeDiskSpace" Accessibility="Public" Enabled="false" Target="Windows!Microsoft.Windows.OperatingSystem" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="CustomSimpleThreshold.ErrorOnAnyBelowThreshold" ConfirmDelivery="true">
  <Category>PerformanceHealth</Category>
  <AlertSettings AlertMessage="MultiInstancePerfCounterUnitMonitors.CustomFreeDiskSpace_AlertMessageResourceID">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="AboveThreshold" MonitorTypeStateID="AboveThreshold" HealthState="Success" />
    <OperationalState ID="BelowThreshold" MonitorTypeStateID="BelowThreshold" HealthState="Error" />
  </OperationalStates>
  <Configuration>
    <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
    <CounterName><![CDATA[% Free Space]]></CounterName>
    <ObjectName>LogicalDisk</ObjectName>
    <InstanceName />
    <AllInstances>true</AllInstances>
    <Frequency>300</Frequency>
    <Threshold>30</Threshold>
  </Configuration>
</UnitMonitor>
We get a monitor that alerts if the free space on one or more logical disks is at or below 30%, and that causes only a single state change for the monitored system.
Hopefully you find this helpful as you build out custom monitoring in your environment.
Mike Guthrie