The Exchange Store Driver is a core transport component which lives both on the Mailbox server role (as the mail submission service) and the Hub server role. It is responsible for:
Exchange 2010 is currently being utilized in Live@EDU, as well as the upcoming Office 365. As you can probably imagine, the Exchange servers that run in those datacenters are loaded and pushed harder than almost any other Exchange server imaginable. Prior to SP1, there were several problems that were encountered with mail delivery to the Exchange mailbox store. In particular, there was a need to make sure that a handful of recipients did not starve the rest of the mail delivery system.
While many of you may not have noticed this problem, Microsoft has seen many of these types of cases over the years; often isolated to a single event like an inadvertent public folder replication storm.
This was despite the message throttling that was already available. Transport roles have also had functionality to avoid resource starvation known as Back Pressure, but this was not designed to protect the system from messages that were already in the Local Delivery queue.
In order to further protect both the Mailbox servers and Hub servers from resource starvation, new thread limits were introduced in SP1:
Note: If this is increased, you should increase MaxMailboxDeliveryPerMdbConnections as well, so that slow or hung deliveries to a single recipient will not block delivery for the entire MDB.
Database health is determined by the Health Monitor API and recorded in the connectivity logs as a value between -1, 0-100. 100 being healthy.
Note: These keys are not present in the EdgeTransport.exe.config file by default.
Unfortunately, there are two scenarios after applying SP1 where we are seeing customers with messages backing up in the queue. The temporary error message is:
432 4.3.2 STOREDRV.Deliver; recipient thread limit exceeded
As you can probably guess, the two scenarios are:
In both cases, the deliveries are occurring to a single recipient (or very small number of recipients). This is likely to occur during heavy mail flow. The screen shot below was taken from a lab server while reproducing the issue:
Figure 1: Messages backed up in mailbox delivery queue due to recipient thread limits being exceeded (click here for larger screenshot)
You can see a historical history of 4.3.2 events in connectivity logs on your Hub Transport servers (in the \Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\Connectivity\CONNECTLOGxxxxxxxx-x.LOG), like:
#Software: Microsoft Exchange Server#Version: 184.108.40.206#Log-type: Transport Connectivity Log#Date: 2011-01-12T00:00:00.775Z#Fields: date-time,session,source,Destination,direction,description 2011-01-12T00:00:00.775Z,08CD7F1200CBDBD0,MapiSubmission,5f24e416-c380-41b5-bfe0-37b6f1091f49,>,"Failed; HResult: 1140850693; DiagnosticInfo: Stage:LoadItem, SmtpResponse:432-4.3.2 STOREDRV; mailbox server is too busy" 2011-01-12T00:00:00.775Z,08CD7F1200CBDBD0,MapiSubmission,5f24e416-c380-41b5-bfe0-37b6f1091f49,-,RegularSubmissions: 0 ShadowSubmissions: 0 Bytes: 0 Recipients: 0 Failures: 1 ReachedLimit: False Idle: False 2011-01-12T00:00:05.792Z,08CD7F1200CBDBD1,MapiSubmission,5f24e416-c380-41b5-bfe0-37b6f1091f49,+,Win2k8R2Ex14.dom2k8r2ex14.lab
In some cases, simply leaving the servers alone should cause the queues to slowly drain. In other cases, it may be necessary to take further action because the problem has persisted for a while or because it isn’t part of a one-time event like a Public Folder replication storm.
Like every other throttling & performance related feature that has ever been in Exchange, the solution isn’t exactly straight forward. For starters, we’ve seen some level of success simply incrementing both values up one, as follows:
<add key="RecipientThreadLimit" value="2" /><add key="MaxMailboxDeliveryPerMdbConnections" value="3" />
In fact, there has been enough success with this that we’re considering changing the defaults in a future rollup or service pack. Of course, when we do, the new defaults will only apply to those who haven’t already modified the values. Although it has not yet been released, KB 2491972 will discuss the change when the time comes.
So if +1 is better, then why not +2 or more?
Well, the problem is that all Exchange servers will ultimately be bound by some hardware resource. You don’t want to introduce thrashing or resource starvation due to a public folder or journaling server event that now impacts all users as well. In addition, there are other limits that control the maximum number of threads that can service these types of deliveries. In short, we are not recommending going above 4 for MaxMailboxDeliveryPerMdbConnections, and RecipientThreadLimit should always be at least one fewer.
In an extreme case, if you are in a very busy environment with dedicated journaling mailbox servers, it may be worth considering having dedicated hub transport servers to go along with that. Of course, they can be virtualized or multi-role boxes, but in order to be isolated, the whole bunch will need to be in a dedicated site. Then, you would be able to increase these limits without risking delivery to other mailboxes.
Of course, this is an extreme example. For the rest of us, it may be as simple as trying the +1 approach while carefully monitoring the hub & mailbox servers.
Scott Landry, Steve Schiemann, Frank Brown and Jason Nelson