By Priscilla 'Nini' Ikhena, Program Manager - Microsoft OMS Log Analytics team
Are you interested in easily troubleshooting issues in your Exchange environment? A little while ago, we made near-real-time performance data collection available in OMS, so you can now track metrics (which are important for managing Exchange) in your environment in addition to logs.
I recently set up a small Exchange environment in Azure and have been tracking different metrics in OMS using search queries based on the Exchange Management Pack monitors, and now you can do the same! Before you begin searching, be sure to add the necessary event logs and enable the performance counters you intend to collect data for.
Some of the queries I’ve been tracking:
Average percentage of time that the processor is executing application or operating system processes (should be less than 75% on average)
Counter Name: Process\% Processor Time
Query: Type=Perf ObjectName="Process" CounterName="% Processor Time" | measure avg(Average) by Computer | where AggregatedValue > 75
Note: If the aggregated average value goes above 75, the query returns one or more results, which raises the count on my dashboard tile above 0 and highlights the tile!
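To make the aggregation in this query concrete, here is a minimal Python sketch of what `measure avg(Average) by Computer | where AggregatedValue > 75` computes. It is only an illustration, not how OMS evaluates the search: the record layout (a `Computer` and an `Average` field per Perf record, as the query implies), the function names, and the sample values are all assumptions.

```python
from collections import defaultdict


def average_by_computer(records):
    """Emulate `measure avg(Average) by Computer`: mean of Average per computer."""
    totals = defaultdict(lambda: [0.0, 0])  # computer -> [running sum, sample count]
    for rec in records:
        bucket = totals[rec["Computer"]]
        bucket[0] += rec["Average"]
        bucket[1] += 1
    return {computer: total / count for computer, (total, count) in totals.items()}


def over_threshold(records, threshold=75.0):
    """Emulate `| where AggregatedValue > threshold`."""
    averages = average_by_computer(records)
    return {computer: value for computer, value in averages.items() if value > threshold}


# Hypothetical sample records; real Perf records carry more fields.
samples = [
    {"Computer": "EXCH01", "Average": 90.0},
    {"Computer": "EXCH01", "Average": 80.0},
    {"Computer": "EXCH02", "Average": 40.0},
    {"Computer": "EXCH02", "Average": 60.0},
]

print(over_threshold(samples))  # {'EXCH01': 85.0}
```

Only computers whose averaged value exceeds the threshold survive the final filter, which is why a healthy environment returns no results for this search.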
Recovery Action Failed
Log Name: Microsoft-Exchange-ManagedAvailability/RecoveryActionLogs
Query: Type=Event Source=Microsoft-Exchange-ManagedAvailability EventLog: "Microsoft-Exchange-ManagedAvailability/RecoveryActionLogs"
MS Exchange Frontend Transport Service has not been running for a period of time
Log Name: Application
Query: Type=Event EventLog: Application Source: "MSExchangeFrontEndTransport"
The connection between the Client Access server and the Mailbox server failed
Log Name: Application
Query: Type=Event EventLog=Application Source=ActiveSync EventID=1022
The Availability service could not successfully send a proxy Web request to another instance of the Exchange Availability service that is running in a different Active Directory site or forest.
Log Name: Application
Query: Type=Event EventLog=Application EventID=4002 Source="MSExchange Autodiscover"
A setting in the Web.config file was not valid and has been reset to the default value
Log Name: Application
Query: Type=Event EventLog=Application EventID=1033 Source=ActiveSync
Percentage of the free usable space on my disk drive
Counter Name: LogicalDisk\% Free Space
Query: Type=Perf ObjectName=LogicalDisk CounterName="% Free Space"
Low Disk Space - Free Disk Space is less than 10%
Counter Name: LogicalDisk\% Free Space
Query: Type=Perf ObjectName="LogicalDisk" CounterName="% Free Space" | measure avg(Average) by Computer | where AggregatedValue < 10
Average number of bytes transferred to or from the disk during write or read operations
Counter Name: LogicalDisk\Avg. Disk Bytes/Transfer
Query: Type=Perf ObjectName=LogicalDisk CounterName="Avg. Disk Bytes/Transfer"
Amount of virtual memory in use
Counter Name: Memory\% Committed Bytes In Use
Query: Type=Perf ObjectName=Memory CounterName="% Committed Bytes In Use" | measure avg(Average) by Computer
Amount of physical memory available for running processes
Counter Name: Memory\Available MBytes
Query: Type=Perf ObjectName=Memory CounterName="Available MBytes"
Percentage of elapsed time the processor spends in user mode
Counter Name: Processor\% User Time
Query: Type=Perf ObjectName=Processor CounterName="% User Time"
Rate at which bytes are sent and received over each adapter
Counter Name: Network Interface\Bytes Total/sec
Query: Type=Perf ObjectName="Network Interface" CounterName="Bytes Total/Sec"
LDAP Search Time is beyond the warning threshold
Counter Name: MSExchange ADAccess Domain Controllers\LDAP Search Time
Query: Type=Perf ObjectName="MSExchange ADAccess Domain Controllers" CounterName="LDAP Search Time" | measure avg(Average) by Computer | where AggregatedValue > 50
LDAP Read Time is beyond the warning threshold
Counter Name: MSExchange ADAccess Domain Controllers\LDAP Read Time
Query: Type=Perf ObjectName="MSExchange ADAccess Domain Controllers" CounterName="LDAP Read Time" | measure avg(Average) by Computer | where AggregatedValue > 50
Exchange ActiveSync could not access a mailbox on a Mailbox server because the Mailbox server is offline.
Log Name: Application
Query: Type=Event EventLog=Application EventID=1023 Source=ActiveSync
Length of the output packet queue, in packets
Counter Name: Network Interface\Output Queue Length
Query: Type=Perf ObjectName="Network Interface" CounterName="Output Queue Length"
Memory leak occurs
Counter Name: Process\Private Bytes
Query: Type=Perf ObjectName=Process CounterName="Private Bytes"
Client RPC Average Latencies are very high
Counter Name: MSExchange RpcClientAccess\RPC Averaged Latency
Query: Type=Perf ObjectName="MSExchange RpcClientAccess" CounterName="RPC Averaged Latency" | measure avg(Average) by Computer | where AggregatedValue > 250
Getting data on Message Tracking Report
Counter Name: MSExchange Message Tracking\Get-MessageTrackingReport Task Executed
Query: Type=Perf ObjectName="MSExchange Message Tracking" CounterName="Get-MessageTrackingReport Task Executed"
Counter Name: MSExchange Message Tracking\Get-MessageTrackingReport Task Executed/Sec
Query: Type=Perf ObjectName="MSExchange Message Tracking" CounterName="Get-MessageTrackingReport Task Executed/Sec"
The Exchange Transport service is rejecting message submissions due to memory consumption higher than the configured threshold
Log Name: Application
Query: Type=Event EventLog: Application EventID=15007 Source=MSExchangeTransport
Outlook Web Access was unable to read or update some of its configuration settings
Log Name: Application
Query: Type=Event EventLog: Application EventID=64 Source="MSExchange OWA"
Exchange Direct Push has detected that the configuration value for the minimum heartbeat interval is set to a value that is too low
Log Name: Application
Query: Type=Event EventLog: Application EventID=1011 Source=ActiveSync
Unable to add an email address because it is invalid
Log Name: Application
Query: Type=Event EventLog: Application EventID=1 Source=InternetProxy
The database engine lost one page of corrupted data
Log Name: Application
Query: Type=Event EventLog: Application EventID=500 Source=ESE
MSExchangeMailSubmission: there is no available Hub Transport server in the local site
Log Name: Application
Query: Type=Event EventLog: Application EventID=1008 Source=MSExchangeMailSubmission
Outlook Web Access is not available for one of the mailboxes in a mailbox database
Log Name: Application
Query: Type=Event EventLog: Application EventID=57 Source="MSExchange OWA"
I then saved these searches and added them to my dashboard:
Note: I’ve set thresholds on these tiles so that a tile gets highlighted whenever it has an unusual number of logs. For example, if the ‘Exchange ActiveSync could not access a mailbox on a Mailbox server’ tile above reads greater than 0, it means there was at least one instance of this happening in the Exchange environment, which may lead to messages not being delivered.
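The threshold rule described in this note can be sketched in a few lines of Python. This is purely illustrative and is not how the OMS dashboard is implemented: the function name, the tile names used as keys, and the counts are assumptions; the only thing taken from the text is that a tile is highlighted when its result count exceeds the threshold set for it.

```python
def highlighted_tiles(tile_counts, thresholds, default_threshold=0):
    """Return the names of tiles whose result count exceeds their threshold."""
    return [
        name
        for name, count in tile_counts.items()
        if count > thresholds.get(name, default_threshold)
    ]


# Hypothetical tile names and result counts.
counts = {
    "ActiveSync could not access a mailbox": 2,
    "Recovery Action Failed": 0,
    "Low Disk Space": 1,
}
# Hypothetical thresholds; tiles without an entry fall back to 0.
limits = {"Low Disk Space": 1}

print(highlighted_tiles(counts, limits))
# ['ActiveSync could not access a mailbox']
```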
You can do this by selecting “Customize” at the bottom of the Dashboard page, selecting the tile you’re interested in, and then selecting “Edit”:
Additionally, once you have added the right events and counters, a quick and easy way to add these search queries to your Saved Searches is to copy and paste the PowerShell script below into Windows PowerShell ISE. Be sure to download the ArmClient command-line tool before running this code. More information on ArmClient is available here: ArmClient – A command line tool for Azure API:
--
--
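As a rough illustration of what such a script has to send for each saved search, the Python sketch below builds a resource path and JSON body for one query. Everything in it is an assumption rather than the author's script: the `savedSearches` path under `Microsoft.OperationalInsights`, the `2015-03-20` API version, the property names, and the placeholder subscription, resource group, and workspace values should all be checked against the Log Analytics REST API documentation for your workspace. The sketch only prints the values and does not call ArmClient.

```python
import json

# Assumption: API version for the saved-search endpoint; verify before use.
API_VERSION = "2015-03-20"


def saved_search_request(subscription_id, resource_group, workspace,
                         search_id, display_name, query, category="Exchange"):
    """Build the (path, JSON body) pair for creating one saved search."""
    path = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.OperationalInsights/workspaces/{workspace}"
        f"/savedSearches/{search_id}?api-version={API_VERSION}"
    )
    body = {
        "properties": {
            "Category": category,
            "DisplayName": display_name,
            "Query": query,
            "Version": 1,
        }
    }
    return path, json.dumps(body)


# Placeholder identifiers; replace with your own values.
path, body = saved_search_request(
    "my-subscription-id", "my-resource-group", "my-workspace",
    "exchange-activesync-1023",
    "ActiveSync could not access a mailbox",
    "Type=Event EventLog=Application EventID=1023 Source=ActiveSync",
)
print(path)
print(body)
```

The resulting pair could then be handed to an ArmClient `put` call, one saved search at a time.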
For your convenience, here’s a list of counters and event logs I added to my workspace before searching:
Performance Counters | Event Logs |
Process\% Processor Time | Application |
MSExchange Database\I/O Database Reads Average Latency | Microsoft-Exchange-ManagedAvailability/RecoveredActionResults |
MSExchange Database\I/O Database Writes Average Latency | |
LogicalDisk\% Free Space | |
LogicalDisk\Avg. Disk Bytes/Transfer | |
Memory\% Committed Bytes In Use | |
Memory\Available MBytes | |
Processor\% User Time | |
Network Interface\Bytes Total/sec | |
MSExchange ADAccess Domain Controllers\LDAP Search Time | |
MSExchange ADAccess Domain Controllers\LDAP Read Time | |
Network Interface\Output Queue Length | |
Process\Private Bytes | |
MSExchange RpcClientAccess\RPC Averaged Latency | |
MSExchange Message Tracking\Get-MessageTrackingReport Task Executed | |
MSExchange Message Tracking\Get-MessageTrackingReport Task Executed/Sec | |
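Each counter in this list maps onto a search query in the same way: the part before the backslash becomes `ObjectName` and the part after it becomes `CounterName`. The Python sketch below shows that mapping; `perf_query` is a hypothetical helper written for illustration, not part of OMS, and it always quotes both values even where the queries above leave single-word names unquoted.

```python
def perf_query(counter_path):
    """Turn an 'Object\\Counter' path into a basic Type=Perf search query."""
    # Split on the first backslash only, since counter names may contain '/'.
    object_name, counter_name = counter_path.split("\\", 1)
    return (
        f'Type=Perf ObjectName="{object_name.strip()}" '
        f'CounterName="{counter_name.strip()}"'
    )


# A few counters from the list above.
for counter in [
    "LogicalDisk\\% Free Space",
    "Memory\\Available MBytes",
    "Network Interface\\Output Queue Length",
]:
    print(perf_query(counter))
```

A threshold check can then be appended to the generated string in the same style as the queries earlier in this post, for example `| measure avg(Average) by Computer | where AggregatedValue < 10`.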
What’s Next?
Moving forward, we are thinking about building an OMS Exchange Solution that will help you better manage your Exchange environment by providing assessment, suggestions, and monitoring across all portfolios.
I hope this has been helpful! Enjoy searching and please post feedback or questions on UserVoice or leave a comment below!
- Nini