Store Driver Fault Isolation Improvements in Exchange 2010 SP1
Published Apr 11 2011 09:30 AM 81.7K Views

Background

The Exchange Store Driver is a core transport component which lives both on the Mailbox server role (as the mail submission service) and the Hub server role. It is responsible for:

  • Retrieving messages from the mailbox server that have been submitted by end-users and submitting those to the Hub transport role for categorization and routing.
  • Delivering messages to the appropriate mailbox server based on the location of the recipient's mailbox.
  • Serving as an extensibility platform for both mail submission & delivery. Store Driver currently hosts a number of agents that extend the functionality of Exchange. Examples include Inbox Rules, Conversations, meeting forward notifications, etc.

Exchange 2010 is currently being utilized in Live@EDU, as well as the upcoming Office 365. As you can probably imagine, the Exchange servers that run in those datacenters are loaded and pushed harder than almost any other Exchange server imaginable. Prior to SP1, there were several problems that were encountered with mail delivery to the Exchange mailbox store. In particular, there was a need to make sure that a handful of recipients did not starve the rest of the mail delivery system.

While many of you may not have noticed this problem, Microsoft has seen many cases of this type over the years, often isolated to a single event such as an inadvertent public folder replication storm.

This was despite the message throttling that was already available. Transport roles have also had functionality to avoid resource starvation known as Back Pressure, but this was not designed to protect the system from messages that were already in the Local Delivery queue.

Changes in SP1

In order to further protect both the Mailbox servers and Hub servers from resource starvation, new thread limits were introduced in SP1:

Key: <add key="RecipientThreadLimit" value="1" />
Description: Limit beyond which no more threads can be allocated to the recipient for delivery. Note: If this is increased, you should increase MaxMailboxDeliveryPerMdbConnections as well, so that slow or hung deliveries to a single recipient will not block delivery for the entire MDB.
Scenario: A flood of messages to a single mailbox, or a performance problem associated with a single mailbox, has minimal impact on delivery to the rest of the mailboxes in the database.
Error in Connectivity Log: Throttled delivery for recipient <recipient> due to concurrency limit <limit>

Key: <add key="MaxMailboxDeliveryPerMdbConnections" value="2" />
Description: The maximum number of concurrent connections to a single "healthy" Mailbox Database. Database health is determined by the Health Monitor API and recorded in the connectivity logs as -1 or a value from 0 to 100, with 100 being healthy.
Scenario: Connections that hang to a single problematic database have minimal impact on delivery of other queued messages.
Error in Connectivity Log: Throttled Delivery due to server limit for <server FQDN> with threshold

Note: These keys are not present in the EdgeTransport.exe.config file by default.
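
Because the keys are absent by default, it can be handy to check which limits are actually in effect on a given server. The following is a minimal Python sketch, not an official tool. It assumes the keys sit in the standard .NET <appSettings> section of EdgeTransport.exe.config and falls back to the SP1 values listed above (1 and 2) when a key is missing. The function name and the inline sample are illustrative only.

```python
import xml.etree.ElementTree as ET

# SP1 values from the article, used when a key is absent from the file.
SP1_DEFAULTS = {
    "RecipientThreadLimit": 1,
    "MaxMailboxDeliveryPerMdbConnections": 2,
}

# Hypothetical, trimmed-down config used only to exercise the function.
SAMPLE_CONFIG = """<configuration>
  <appSettings>
    <add key="RecipientThreadLimit" value="2" />
  </appSettings>
</configuration>"""


def effective_limits(config_xml):
    """Return both thread limits, using the SP1 defaults for missing keys."""
    root = ET.fromstring(config_xml)
    limits = dict(SP1_DEFAULTS)
    # Assumption: the keys are <add> elements directly under <appSettings>.
    for element in root.findall("./appSettings/add"):
        key = element.get("key")
        if key in limits:
            limits[key] = int(element.get("value"))
    return limits


if __name__ == "__main__":
    for name, value in effective_limits(SAMPLE_CONFIG).items():
        print(name, "=", value)
```

To run this against a real file, read its contents first and pass the text to effective_limits. Work on a copy rather than the live configuration file.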

Is it possible to have too much protection?

Unfortunately, there are two scenarios after applying SP1 where we are seeing customers with messages backing up in the queue. The temporary error message is:

432 4.3.2 STOREDRV.Deliver; recipient thread limit exceeded

As you can probably guess, the two scenarios are:

  • Journaling
  • Public Folders

In both cases, the deliveries are going to a single recipient (or a very small number of recipients), so the limit is most likely to be hit during heavy mail flow. The screenshot below was taken from a lab server while reproducing the issue:

Figure 1: Messages backed up in mailbox delivery queue due to recipient thread limits being exceeded

You can see a history of 4.3.2 events in the connectivity logs on your Hub Transport servers (in \Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\Connectivity\CONNECTLOGxxxxxxxx-x.LOG), like:

#Software: Microsoft Exchange Server
#Version: 14.0.0.0
#Log-type: Transport Connectivity Log
#Date: 2011-01-12T00:00:00.775Z
#Fields: date-time,session,source,Destination,direction,description
 
2011-01-12T00:00:00.775Z,08CD7F1200CBDBD0,MapiSubmission,5f24e416-c380-41b5-bfe0-37b6f1091f49,>,"Failed; HResult: 1140850693; DiagnosticInfo: Stage:LoadItem, SmtpResponse:432-4.3.2 STOREDRV; mailbox server is too busy"
 
2011-01-12T00:00:00.775Z,08CD7F1200CBDBD0,MapiSubmission,5f24e416-c380-41b5-bfe0-37b6f1091f49,-,RegularSubmissions: 0 ShadowSubmissions: 0 Bytes: 0 Recipients: 0 Failures: 1 ReachedLimit: False Idle: False
 
2011-01-12T00:00:05.792Z,08CD7F1200CBDBD1,MapiSubmission,5f24e416-c380-41b5-bfe0-37b6f1091f49,+,Win2k8R2Ex14.dom2k8r2ex14.lab
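
To get a feel for how often these events occur, the log can be summarized with a short script. The sketch below is a rough Python example under stated assumptions: it treats the log as CSV using the field order from the #Fields header above, and it counts any description containing a 432 response with the 4.3.2 status code. The function name is made up for illustration, and the sample data is copied from the excerpt above.

```python
import csv
import re
from collections import Counter

# Field order taken from the "#Fields:" header of the sample log.
FIELDS = ["date-time", "session", "source", "Destination", "direction", "description"]

# Matches both "432 4.3.2" and "432-4.3.2" as seen in the article.
THROTTLE_PATTERN = re.compile(r"432[ -]4\.3\.2")

SAMPLE_LOG = '''#Software: Microsoft Exchange Server
#Fields: date-time,session,source,Destination,direction,description
2011-01-12T00:00:00.775Z,08CD7F1200CBDBD0,MapiSubmission,5f24e416-c380-41b5-bfe0-37b6f1091f49,>,"Failed; HResult: 1140850693; DiagnosticInfo: Stage:LoadItem, SmtpResponse:432-4.3.2 STOREDRV; mailbox server is too busy"
2011-01-12T00:00:00.775Z,08CD7F1200CBDBD0,MapiSubmission,5f24e416-c380-41b5-bfe0-37b6f1091f49,-,RegularSubmissions: 0 ShadowSubmissions: 0 Bytes: 0 Recipients: 0 Failures: 1 ReachedLimit: False Idle: False
2011-01-12T00:00:05.792Z,08CD7F1200CBDBD1,MapiSubmission,5f24e416-c380-41b5-bfe0-37b6f1091f49,+,Win2k8R2Ex14.dom2k8r2ex14.lab
'''


def count_432_events(log_text):
    """Count 432 4.3.2 responses per (source, destination) pair."""
    # Drop the "#" header lines and any blank lines before CSV parsing.
    data_lines = [line for line in log_text.splitlines()
                  if line.strip() and not line.startswith("#")]
    counts = Counter()
    for record in csv.DictReader(data_lines, fieldnames=FIELDS):
        description = record["description"] or ""
        if THROTTLE_PATTERN.search(description):
            counts[(record["source"], record["Destination"])] += 1
    return counts


if __name__ == "__main__":
    for (source, destination), hits in count_432_events(SAMPLE_LOG).items():
        print(source, destination, hits)
```

A count that keeps climbing for the same destination over several log files suggests the backlog is not a one-time event.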

In some cases, simply leaving the servers alone should cause the queues to slowly drain. In other cases, it may be necessary to take further action because the problem has persisted for a while or because it isn’t part of a one-time event like a Public Folder replication storm.

Alright, I understand. Now how do I fix my situation?

Like every other throttling & performance related feature that has ever been in Exchange, the solution isn’t exactly straightforward. For starters, we’ve seen some level of success simply incrementing both values by one, as follows:

<add key="RecipientThreadLimit" value="2" />
<add key="MaxMailboxDeliveryPerMdbConnections" value="3" />

In fact, there has been enough success with this that we’re considering changing the defaults in a future rollup or service pack. Of course, when we do, the new defaults will only apply to those who haven’t already modified the values. Although it has not yet been released, KB 2491972 will discuss the change when the time comes.

So if +1 is better, then why not +2 or more?

Well, the problem is that all Exchange servers will ultimately be bound by some hardware resource. You don’t want to introduce thrashing or resource starvation due to a public folder or journaling server event that now impacts all users as well. In addition, there are other limits that control the maximum number of threads that can service these types of deliveries. In short, we are not recommending going above 4 for MaxMailboxDeliveryPerMdbConnections, and RecipientThreadLimit should always be at least one fewer.
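
The guidance in the previous paragraph boils down to two simple rules, which can be checked before a change is rolled out. The snippet below is an illustrative Python sketch only. The function name and messages are invented for this example, and the rules encode nothing beyond what is stated above: no more than 4 for MaxMailboxDeliveryPerMdbConnections, and RecipientThreadLimit at least one fewer.

```python
def check_limits(recipient_thread_limit, max_mdb_connections):
    """Return a list of warnings for values outside the article's guidance."""
    problems = []
    # Rule 1: do not go above 4 connections per mailbox database.
    if max_mdb_connections > 4:
        problems.append(
            "MaxMailboxDeliveryPerMdbConnections above 4 is not recommended")
    # Rule 2: the per-recipient limit should be at least one fewer.
    if recipient_thread_limit > max_mdb_connections - 1:
        problems.append(
            "RecipientThreadLimit should be at least one fewer than "
            "MaxMailboxDeliveryPerMdbConnections")
    return problems


if __name__ == "__main__":
    # The "+1" values suggested earlier in the article.
    print(check_limits(2, 3))
    # A pair that breaks both rules.
    print(check_limits(5, 5))
```

An empty list means the pair is within the stated guidance. It says nothing about whether the hardware can sustain the extra load, which still needs monitoring.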

In an extreme case, if you are in a very busy environment with dedicated journaling mailbox servers, it may be worth considering having dedicated hub transport servers to go along with that. Of course, they can be virtualized or multi-role boxes, but in order to be isolated, the whole bunch will need to be in a dedicated site. Then, you would be able to increase these limits without risking delivery to other mailboxes.

Of course, this is an extreme example. For the rest of us, it may be as simple as trying the +1 approach while carefully monitoring the hub & mailbox servers.

Scott Landry, Steve Schiemann, Frank Brown and Jason Nelson

7 Comments
Not applicable

Good to know!  As I am about to bring on users to Exchange 2010 that require journaling, this will be very helpful should we encounter any issues.  Thanks!

Not applicable

Even in a modestly sized exchange deployment < 1 million messages a day, setting MaxMailboxDeliveryPerMdbConnections to 4 and RecipientThreadLimit to 3 is going to be vastly inadequate for anyone doing transport level journaling that would have been fine pre-SP1.

The designers of these new safeguards didn't really take into account journaling or email archiving needs outside of Exchange 2010's limited new 2nd tier dumpster approach. What is really needed is a way to set high transport limits for privileged classes of message delivery while maintaining reasonable thresholds for other "normal" users.

Not applicable

Could you please explain whether the mail submission service of MB server moves a message from Outbox to the queue of HT server or just sends a notification to HT and then HT's Store Driver extracts the message? Different sources provide different answers  :(   Ex2007 documentation was less ambiguous at this point

Not applicable

@Mikhail: The Mail Submission service notifies Hub Transport servers in the same Active Directory site. The Store driver on the Hub Transport server retrieves the message from the Mailbox server and places it in the Submission queue.

More details in Understanding Message Routing > Receiving Messages for Routing > Retrieving Messages from a Mailbox Server.

Note, you can always provide us feedback about our documentation from the topic page on TechNet by clicking (the stars next to) Rate and provide feedback, located in the top right corner of the page.

Not applicable

How relevant ! Great post !

I've just blogged about that same issue, well in another related situation..

ilantz.wordpress.com/.../432-4-3-2-storedrv-and-store-driver-throttling

Could you please show all the options in edgetransport.exe.config, like the "hidden" one I've managed to find?

<add key="MailboxDeliveryThrottlingEnabled" value="False" />

Thanks again !

Copper Contributor

Hi guys, good evening. Can you help me with a question? I am not that experienced in Exchange, but I have an issue with this NDR: 432 4.3.2 STOREDRV.Deliver; recipient thread limit exceeded. My boss will not authorize me to apply this solution of adding those two lines to the EdgeTransport config, because he saw in another forum that it does not work. The link where it is stated that the solution is not useful is: https://social.technet.microsoft.com/Forums/exchange/en-US/022acffe-d80d-4fd6-8c01-2bc5f020ec91/432-... recipient-thread-limit-exceeded-in-the-queue-of-mails-exchange-2010? forum = exchange2010

Our environment is different from that of the person in that thread: we have two servers with the roles divided into Mailbox and CAS/HUB, Exchange 2010 SP1 Rollup 32. We do not have a DAG like he does.

Another question I have is whether those two lines should be added only to the CAS/HUB server or also to the Mailbox server, since both servers have the same file in the same path: E:\Program Files\Microsoft\Exchange Server\V14\Bin\EdgeTransport.exe.config

Could you also tell me how you reproduced the error that appears in Figure 1 of this blog, so that I can apply the solution in my lab and show that it works?

Awaiting your kind reply and thanks for your attention. Best regards.

Copper Contributor

Rohan Suresh Dhanwate 

Version history
Last update: Jul 01 2019 03:58 PM