Over the past 6 months, our BPOS support team has been trying to get to the bottom of an issue some customers were experiencing with Outlook Anywhere: Outlook (2007 and 2010) reports that it is connected, but when a user tries to send email, the message sits in the Outbox and is not sent until Outlook is restarted. The issue was finally determined to be related to “keep alives” between the client and server and timeouts on network devices between the end user and the Exchange CAS.
While you may not be using BPOS, this issue can be seen in ANY Exchange environment (including Office 365) where Outlook Anywhere is used, so I want to make you aware of it and of our recommendation to resolve it.
By default, Outlook Anywhere opens two connections to the Exchange CAS, called RPC_InData and RPC_OutData. The client-to-server connection uses a default timeout of 12 minutes (720 seconds) of inactivity, and the server-to-client timeout is 15 minutes (900 seconds).
These default Keep-Alive intervals are NOT aggressive enough for some of today’s home networking devices and for network devices on the Internet that drop idle connections quickly. Some of those devices drop TCP connections after as little as 5 minutes (300 seconds) of inactivity. When one or both of the two default connections are dropped, the connection to the Exchange server is essentially broken and not usable.
To address this issue, we are recommending setting a registry key on the Exchange CAS to change the default Keep-Alive from 15 minutes (900 seconds) to 2 minutes (120 seconds):
MinimumConnectionTimeout DWORD 0x00000078 (120)
When present, this setting specifies the minimum connection timeout used by the client and RPC Proxy, in seconds. The actual timeout used is the lower of this value and the IIS idle connection timeout. If zero, or the key is not present, the IIS idle connection timeout is used. Used only in RPC over HTTP v2. When changes are made to this value on the RPC Proxy, IIS must be restarted for the change to take effect. See http://msdn.microsoft.com/en-us/library/windows/desktop/aa373592(v=vs.85).aspx for reference.
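The rule for picking the actual timeout can be sketched in a few lines of Python. This is only an illustration of the description above, not code from the RPC Proxy; the function name `effective_timeout` and the sample IIS idle timeout values are assumptions made for the example.

```python
from typing import Optional


def effective_timeout(minimum_connection_timeout: Optional[int],
                      iis_idle_timeout: int) -> int:
    """Return the timeout (in seconds) that would actually be used.

    minimum_connection_timeout: the registry value, or None if the key
    is not present. iis_idle_timeout: the IIS idle connection timeout.
    Both names are hypothetical and exist only for this sketch.
    """
    # Key missing (None) or set to zero: the IIS idle timeout is used.
    if not minimum_connection_timeout:
        return iis_idle_timeout
    # Otherwise the lower of the two values wins.
    return min(minimum_connection_timeout, iis_idle_timeout)


print(effective_timeout(120, 900))   # 120
print(effective_timeout(None, 900))  # 900
print(effective_timeout(0, 900))     # 900
print(effective_timeout(120, 60))    # 60
```

Note the last case: if the IIS idle connection timeout is already lower than the registry value, the registry value changes nothing.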
The Outlook client honors this new default when it connects to the server, so both the Outlook client and the server now send a Keep-Alive packet after 2 minutes of inactivity, effectively maintaining both of the TCP connections that are needed.
This change has almost negligible impact on the Exchange server, as it simply sets the Keep-Alive interval. Setting the timeout to 2 minutes puts it below the 5-minute timeout that a device between the user and the Exchange server may be using, and therefore allows the connections to "stay alive".
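To make the arithmetic concrete, here is a minimal Python sketch that compares each Keep-Alive interval mentioned in this post against a device that drops connections after 5 minutes of inactivity. The helper name `survives_idle` is hypothetical, and the model is deliberately simplified: it assumes the device resets its idle timer whenever a Keep-Alive packet passes through.

```python
def survives_idle(keep_alive_seconds: int,
                  device_idle_timeout_seconds: int) -> bool:
    """True if a Keep-Alive is sent before the device's idle timeout expires."""
    return keep_alive_seconds < device_idle_timeout_seconds


DEVICE_IDLE_TIMEOUT = 300  # 5 minutes, the aggressive case described above

# Intervals taken from this post, in seconds.
intervals = {
    "client default": 720,
    "server default": 900,
    "recommended": 120,
}

results = {name: survives_idle(seconds, DEVICE_IDLE_TIMEOUT)
           for name, seconds in intervals.items()}

for name, ok in results.items():
    print(f"{name}: {'stays alive' if ok else 'dropped by device'}")
```

Only the 120-second interval fires before the 300-second device timeout; both defaults leave the connection idle long enough to be dropped.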
Does this issue affect just OA or the other services (i.e. EAS, OWA) as well?
The core issue, where Outlook thinks it's connected but isn't because of the timeout, only affects OA. EAS, OWA, etc. use different mechanisms to handle it.
Is there any way to test this with a client-side setting without having to make the global change on the server?
We have encountered the very same issue with a Hosted Exchange service at Telus.com. What a pain :(
While monitoring our internal office network for TCP/IP timeouts, we observe that while Outlook is connected to the Exchange server (using RPC over HTTPS), Outlook opens and maintains five or so individual TCP/IP sessions on port 443 (HTTPS), and these connections do not time out when the issue is encountered.
Reading your blog post, you suggest that when the problem is encountered, it is because one or more of these TCP/IP sessions have timed out, but in reality they are up with consistent uptime. Furthermore, whenever the problem is encountered, Outlook reports (in the lower left corner) that it is "connected" to the Exchange server. I take it the connection is only half open: it can receive, but it cannot send. While Outlook is in that state, when a new email is composed and we press TO: to perform a GAL lookup, Outlook pauses and reports that the connection to the Exchange server is unavailable.
I would very much appreciate it if you could offer further clarification.
It's not a new topic, and I have always handled it by setting the KeepAliveTime to 270 seconds (which is a bit lower than the 300-second default setting in HLBs and most *Nix-based network devices). I've been doing that since my very first Exchange 2010 implementation; several hundred thousand mailboxes later, I still apply the same configuration with the same success. I have never had to set MinimumConnectionTimeout.
Configuring MinimumConnectionTimeout should basically have the same effect as the TCP KeepAliveTime, but I'm wondering which one works better, or whether they are complementary. For sure, the TCP KeepAliveTime applies to everything, while MinimumConnectionTimeout appears to apply only to MSRPC-based connections.
Value name: KeepAliveTime
Value Type: REG_DWORD-Time in milliseconds
Valid Range: 1-0xFFFFFFFF
Default: 7,200,000 (two hours)
This value controls how frequently TCP tries to verify that an idle connection is still intact by sending a keep-alive packet. If the remote computer is still reachable, it acknowledges the keep-alive packet. Keep-alive packets are not sent by default. You can use a program to configure this value on a connection. The recommended value setting is 300,000 (5 minutes).
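Since the description mentions that a program can configure keep-alive on an individual connection, here is a rough Python sketch of what that could look like, using the 270-second value from above. It only illustrates the per-socket approach and is not how Outlook or Exchange do it; the helper name `enable_keepalive` and the 1-second retry interval are assumptions. Note that the registry value is in milliseconds, while the per-socket options on Unix-like systems are in seconds.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_seconds: int,
                     interval_seconds: int = 1) -> None:
    """Turn on TCP keep-alive for one socket (hypothetical helper)."""
    # Keep-alive packets are not sent unless the option is enabled.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: (on/off, idle time in ms, retry interval in ms).
        sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                   (1, idle_seconds * 1000, interval_seconds * 1000))
    else:
        # Linux and similar: values are in seconds; options may be absent.
        if hasattr(socket, "TCP_KEEPIDLE"):
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE,
                            idle_seconds)
        if hasattr(socket, "TCP_KEEPINTVL"):
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL,
                            interval_seconds)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock, idle_seconds=270)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
print("keep-alive enabled:", keepalive_on)
```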
Great article, but I think something might be missing.
We needed to restart the server, as iisreset didn't work.
This might be more related to the TCP stack than it appears at first sight.
I've seen this registry key in a Microsoft Exchange 2010 SP1 tips & tricks presentation.
The recommended value is 120 sec.
"The actual timeout used is the lower of this value and the IIS idle connection timeout": unless I'm wrong, the IIS default connection timeout is 120 sec.
So it's not clear to me. Is this registry key needed because it handles some timeout cases, or is it useless?