Exchange 2010 features a new resource protection mechanism - user throttling. This feature is designed to limit the amount of resources a single user or application can take up on a CAS to prevent poorly written applications from causing denial of service (DoS) to the rest of the users. You can read about throttling in Understanding Client Throttling Policies. If any of the terminology in this post sounds unfamiliar, please refer to this documentation.
While Exchange 2010 RTM shipped with user throttling "off" by default (most limits were set to infinite), after more testing in Exchange 2010 SP1, we've come up with a tighter set of limits for the throttling policies, and have thus turned user throttling on by default.
We have also changed what happens when users exceed their budget in some cases. In the Exchange 2010 RTM version, Exchange rejected any Exchange Web Services (EWS), Exchange ActiveSync (EAS) and Outlook Web App (OWA) requests made by users who exceeded their budget. We've improved on this idea in SP1 for the EWS and ActiveSync protocols: instead of rejecting the call, Exchange delays it just long enough for the budget to recharge back into the positive and then executes the request. This means that end users will generally see fewer errors from the ActiveSync client or EWS application. In some rare conditions, such as when the caller exceeds the maximum number of connections or subscriptions in EWS, we'll still reject the request.
The longest a single request can be delayed is one minute, but this would be an extreme case, and one that would signify that something is wrong either on the server or with the caller. Typically, users and applications will not encounter throttling (except perhaps when a user is syncing a whole mailbox). However, some resource-heavy applications may start to get throttled in SP1. If throttling does kick in, the delays will typically be short enough that users won't notice any effect. Even so, we've provided ways to gain insight into what the user's experience is like due to throttling.
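To make the delay-instead-of-reject behavior concrete, here is a minimal Python sketch. It is not Exchange's implementation: the `Budget` class, the constant recharge rate and the arbitrary budget units are all assumptions made for illustration. Only the one-minute cap comes from the description above.

```python
from dataclasses import dataclass

# From the post: a single request is delayed for at most one minute.
MAX_DELAY_SECONDS = 60.0


@dataclass
class Budget:
    """Hypothetical per-user budget; not the actual Exchange data structure."""
    balance: float              # remaining budget (arbitrary units); negative means over budget
    recharge_per_second: float  # assumed constant recharge rate, must be > 0


def delay_before_execute(budget: Budget) -> float:
    """Return how many seconds to hold a request so the budget is no longer negative."""
    if budget.balance >= 0:
        return 0.0  # within budget: run immediately
    needed = -budget.balance / budget.recharge_per_second
    return min(needed, MAX_DELAY_SECONDS)  # never wait longer than the cap
```

In this model a caller that is slightly over budget waits a fraction of a second, while a caller that is badly over budget hits the one-minute cap, which is the "something is wrong" case described above.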
There are two main ways to monitor throttling: by watching performance counters and by looking at IIS logs. First, SP1 offers the following useful performance counters (one instance per CAS process) under the MSExchange Throttling category on a CAS:
- Max Delay Per Minute - this value represents the longest amount of time in msec that anyone was delayed due to throttling in the past minute.
- Max Effective Time In * - this set of counters reports the values that the throttling policy would need to be set to for all requests encountered in the past minute to go through unthrottled.
- Users Delayed X Milliseconds - the number of users who saw delays greater than "X" (see Delay Time Threshold) milliseconds in the past minute.
- Users X Times OverBudget - the number of users whose requests were rejected more than "X" times in the past minute (see OverBudgetThreshold).
- OverBudgetThreshold - the "X" value for the "Users X Times OverBudget" counter.
- Delay Time Threshold - the "X" value for the "Users Delayed X Milliseconds" counter.
- Total Unique Budgets - the number of unique budgets (i.e., callers/users) seen in the past minute.
- Unique Budgets OverBudget - the number of unique budgets that went over budget in the past minute.
The general rule is that if the "Unique Budgets OverBudget" counter is graphing a line that's close to the "Total Unique Budgets" line, then most of the users in your system are getting throttled. You can further refine that by checking how many users are seeing rejections vs how many are getting delayed by viewing the appropriate "Users X times ..." counter. Finally, you can see if and how much users are delayed by viewing the "Max Delay Per Minute" counter. Also, all of these counters are saved off to SCOM once every minute.
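The rule of thumb above can be written as a small check. This is only a sketch: the two inputs are values you have read from the "Unique Budgets OverBudget" and "Total Unique Budgets" counters yourself, and the 0.8 threshold for "close to" is an assumption, not a number from Exchange.

```python
def throttled_share(unique_budgets_overbudget: int, total_unique_budgets: int) -> float:
    """Fraction of budgets seen in the past minute that went over budget."""
    if total_unique_budgets <= 0:
        return 0.0  # no callers seen, so nobody was throttled
    return unique_budgets_overbudget / total_unique_budgets


def most_users_throttled(unique_budgets_overbudget: int,
                         total_unique_budgets: int,
                         threshold: float = 0.8) -> bool:
    """True when the over-budget line is 'close to' the total line.

    The default threshold is an illustrative assumption; pick your own.
    """
    return throttled_share(unique_budgets_overbudget, total_unique_budgets) >= threshold
```

For example, 45 over-budget callers out of 50 gives a share of 0.9 and would be flagged, while 3 out of 50 would not.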
If you do determine that many of your users are getting throttled, you can dig into the IIS logs to understand why. As of SP1, only ActiveSync, OWA and EWS log throttling info to IIS. By searching the IIS logs for specific users or for the string "overbudget", you can see which requests they have been making and which of those went over budget. You can refer to Budget Snapshots in the IIS Logs for a breakdown of the different parts of the budget.
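The log search described above can be scripted. The following Python sketch makes no assumption about the layout of the throttling fields: it only skips the "#" directive lines found in W3C-format IIS logs and keeps entries that contain "overbudget" (case-insensitively), optionally narrowed to a user string. The function name and parameters are hypothetical.

```python
from typing import Iterable, Iterator, Optional


def find_overbudget_entries(lines: Iterable[str],
                            user: Optional[str] = None) -> Iterator[str]:
    """Yield log entries that mention 'overbudget', optionally for one user."""
    marker = "overbudget"
    wanted_user = user.lower() if user else None
    for line in lines:
        if line.startswith("#"):
            continue  # W3C log directives (#Fields, #Date, ...) are not requests
        lowered = line.lower()
        if marker not in lowered:
            continue  # request stayed within budget, or carries no throttling info
        if wanted_user and wanted_user not in lowered:
            continue  # over budget, but for a different caller
        yield line.rstrip("\n")
```

Because the function accepts any iterable of lines, you can pass it an open log file directly and print or count whatever it yields.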
If you do determine that your users or applications are throttled too much by your standards and their scenarios are in fact legitimate, then you can tweak the throttling settings to reflect your environment's use by:
- Turning throttling off
- Running your regular traffic through Exchange
- Watching what the "Max Effective Time In *" counters report over the course of a few days
- Setting the throttling policies to those values. To do this, call Get-ThrottlingPolicy | ? { $_.IsDefault } | Set-ThrottlingPolicy <new param values>
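As a rough sketch of the third step, the Python snippet below reduces several days of per-minute counter readings to one candidate value per counter by taking the peak. Taking the maximum is an assumption about how to read "what the counters report". The counter names are whatever you exported them as, and the function name is hypothetical.

```python
from typing import Dict, Iterable


def suggested_limits(samples: Dict[str, Iterable[float]]) -> Dict[str, float]:
    """Map each counter name to the highest value observed for it.

    Counters with no readings are left out rather than guessed at.
    """
    result: Dict[str, float] = {}
    for name, values in samples.items():
        observed = list(values)
        if observed:
            result[name] = max(observed)
    return result
```

The peak is used because each reading describes the policy value under which the past minute's requests would have gone through unthrottled, so the largest reading covers every minute observed. You may prefer to add headroom on top of it.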
Alternatively, if it is an EWS application using a service account that becomes throttled, and you determine that it is not resource intensive to the Exchange server, you should create a new, custom throttling policy for it. To do this:
- Call New-ThrottlingPolicy and set the proper parameters (refer to Exchange documentation at the top of the document for explanation of the parameters)
- Call Get-Mailbox <mailbox of the service account that the app is using> | Set-Mailbox -ThrottlingPolicy:<the policy that you just created>
The changes will be picked up within 15 minutes, or immediately after you recycle the EWS app pool in IIS. Please note that custom policies are meant as one-off solutions when a few applications or users are getting throttled and the load they are putting on the system is actually legitimate. You shouldn't update everyone's link to a custom policy - if you need to change throttling settings for the majority of your users, edit the default policy. For more information on throttling please refer to the official documentation linked at the top of this article.
You Had Me at EHLO.