This tip comes to you from Frank Plawetzki, a Senior PFE based in Germany. He focuses on Exchange and, among other things, deals with performance and monitoring issues.


Symptoms

Recently I was troubleshooting an issue where the thread that connects to the global catalog (GC) stayed on “connecting” in the Outlook Connection Status window:

Outlook hung on connecting to Global Catalog

During troubleshooting we realized that the Outlook clients could not even create new profiles in online mode, although the intended mode of operation for the clients was cached mode. These were newly connected clients using Outlook Anywhere, so they did not yet have an offline address book available.

Needless to say, the working experience for the affected clients was very bad. As soon as users started any operation in Outlook that requires access to the address book (such as choosing additional recipients for a reply e-mail or checking the Member Of attribute for a user), or even simply tried to open the address book, Outlook became completely unresponsive and hung.

As I mentioned, this was an Outlook Anywhere scenario, so the Outlook client mainly connects to the mailbox server via the proxy server on port 6001. When you start Outlook, there may not be a GC connection right away. Only once the user performs an operation that requires the GC, such as checking the group membership of an address book entry, does the client establish a connection thread to the GC. This connection also goes through the proxy server and then connects on port 6004 to DSProxy on the mailbox server, which proxies the connection onwards to the GCs when everything runs well.

Troubleshooting

Our first troubleshooting step was to check the IIS logs on the proxy server, which was an Exchange 2007 CAS server in this case. The IIS logs showed lots of 401.1 0 entries, but because of the hardware load balancer in front of the proxy server there was not much to gain from the IIS logs with regard to the client IP addresses being used.
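When an IIS log is large, tallying status codes per HTTP method can make a burst of 401 responses easier to spot. The following is only a minimal sketch in Python, assuming the log is in the W3C extended format with a `#Fields:` directive that includes `cs-method` and `sc-status`. The function name `tally_requests` is hypothetical and not part of any Exchange or IIS tooling.

```python
from collections import Counter


def tally_requests(log_lines):
    """Count (cs-method, sc-status) pairs in W3C extended log lines."""
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # Column names follow the directive, separated by spaces.
            fields = line.split()[1:]
            continue
        # Skip blank lines, other directives, and rows seen before any header.
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        counts[(row.get("cs-method"), row.get("sc-status"))] += 1
    return counts
```

To use it, pass an open log file (it iterates line by line) and inspect the returned counter, for example with `most_common()`.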

So we decided to use RPCPING to test the connection from the client all the way to the mailbox server. The RPCPING syntax and usage are nicely described in this KB article.

In our case we were using this syntax to query the DSProxy port 6004 on the mailbox server:

rpcping -t ncacn_http -o RpcProxy=mail.fourthcoffee.com -P "bastianb,fourthcoffee,*" -H 1 -F 3 -a connect -u 9 -v 3 -s ExchangeCAS.fourthcoffee.com -I "bastianb,fourthcoffee,*" -e 6004
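If the same test has to be repeated against more than one endpoint (for example 6001 and 6004), generating the command can avoid retyping mistakes. This is a rough sketch in Python, not a replacement for the KB article. The helper name `build_rpcping_args` is hypothetical, and the switches are simply copied from the command above.

```python
def build_rpcping_args(proxy, server, user, domain, endpoint):
    """Return the rpcping argument list from the article for one endpoint."""
    # '*' in place of the password, exactly as in the command above.
    cred = f"{user},{domain},*"
    return [
        "rpcping",
        "-t", "ncacn_http",
        "-o", f"RpcProxy={proxy}",
        "-P", cred,
        "-H", "1",
        "-F", "3",
        "-a", "connect",
        "-u", "9",
        "-v", "3",
        "-s", server,
        "-I", cred,
        "-e", str(endpoint),
    ]


if __name__ == "__main__":
    # Print one command per endpoint for review before running anything.
    for port in (6001, 6004):
        print(" ".join(build_rpcping_args(
            "mail.fourthcoffee.com", "ExchangeCAS.fourthcoffee.com",
            "bastianb", "fourthcoffee", port)))
```

On a Windows client the list could be handed to `subprocess.run`, assuming rpcping is on the PATH.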

Since those were Windows 7 clients, RPCPING ships in the box and was available right away for troubleshooting on the clients.

First issue identified

Several RPCPING attempts either timed out or came back with status 1722, which translates to RPC_S_SERVER_UNAVAILABLE. At this point I decided to check whether the mailbox server was even listening on port 6004. It could have been that, for example, a third-party application was blocking this port on the mailbox server.
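A plain TCP connect from a machine inside the network that can reach the mailbox server directly is one quick way to tell "nothing is listening" apart from other failures. The sketch below is in Python and assumes direct network access to the server, which an Outlook Anywhere client on the internet would not have. The function name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling it with the mailbox server name and 6004 returns False both when the port is closed and when a firewall silently drops the traffic, so a False result only narrows the search; it does not identify the cause.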

A NETSTAT -ANO revealed that the mailbox server was indeed not listening on port 6004. DSProxy is part of the Microsoft Exchange System Attendant service (SA), also known as MAD.EXE.
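Scanning a long netstat listing by eye is error-prone, so the check can be scripted. The following Python sketch assumes English-language `netstat -ano` output, where TCP rows have five columns and the state reads LISTENING; on a localized Windows the state text differs. The function name `listening_pids` is hypothetical.

```python
def listening_pids(netstat_output, port):
    """Return the PIDs that netstat -ano reports as LISTENING on a TCP port."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected TCP row: Proto, Local Address, Foreign Address, State, PID.
        if len(fields) != 5 or fields[0] != "TCP" or fields[3] != "LISTENING":
            continue
        # The port is whatever follows the last colon (works for IPv6 too).
        local_port = fields[1].rsplit(":", 1)[-1]
        if local_port == str(port):
            pids.add(int(fields[4]))
    return pids
```

An empty result for port 6004 matches the situation described above. A non-empty result gives the PID, which can then be compared with the PID of MAD.EXE in Task Manager.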

The Exchange 2007 mailbox server was part of a Cluster Continuous Replication (CCR) cluster, so the SA lives as a resource in the cluster manager. Luckily it does not have dependencies on other resources, so we could restart the service with minimal user impact.

Since the SA is a cluster resource, the procedure for restarting it is to open Cluster Administrator (CLUADMIN) and perform a “take offline”, “bring online” cycle on the System Attendant resource.

Right after that, the mailbox server started to listen on port 6004 again.

Second issue identified

However, this did not resolve the issue for all affected users. Some users still saw the GC thread stuck on “connecting”. At this point, I started to wonder whether the hardware load balancer was configured correctly.

In Outlook Anywhere, RPC is split into two connections, RPC_IN_DATA and RPC_OUT_DATA, which you will see in the IIS logs. For Outlook Anywhere to work correctly across a hardware load balancer, stickiness for the Outlook Anywhere traffic must be configured on the load balancer; otherwise your clients will be affected by a split RPC_IN_DATA and RPC_OUT_DATA connection issue. This issue is described in the blog post ‘How does Outlook Anywhere work (and not work)?’
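To illustrate why stickiness matters, here is a toy model in Python. It does not represent any particular load balancer product; the backend names and the two selection strategies are assumptions made purely for illustration. It shows how the two connections of one session can land on different backends when each connection is balanced independently.

```python
import itertools
import zlib

# Hypothetical backends behind the load balancer.
BACKENDS = ["backend-a", "backend-b"]


def make_round_robin(backends):
    """Each new connection goes to the next backend, regardless of client."""
    cycle = itertools.cycle(backends)
    return lambda client: next(cycle)


def make_sticky(backends):
    """Every connection from the same client key goes to the same backend."""
    return lambda client: backends[zlib.crc32(client.encode()) % len(backends)]


def open_session(pick_backend, client):
    """Open the two connections that make up one Outlook Anywhere session."""
    return {
        "RPC_IN_DATA": pick_backend(client),
        "RPC_OUT_DATA": pick_backend(client),
    }


def is_split(session):
    """True if the two halves of the session ended up on different backends."""
    return session["RPC_IN_DATA"] != session["RPC_OUT_DATA"]
```

With the round-robin picker the very first session is split across both backends, while the sticky picker always keeps both halves together for a given client key.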

Resolution

If you are affected by this issue, the RPC_IN_DATA and RPC_OUT_DATA connections will end up at different endpoints on different GCs, and therefore a successful GC connection from the client will not be possible.

To check whether split RPC connections were biting us, taking the hardware load balancer out of the picture (for example by using local hosts files) was not an option in this case. Therefore I suggested pinning the mailbox server to a single GC. We accomplished this with the following cmdlet:

Set-ExchangeServer -Identity <mailbox server> -StaticGlobalCatalogs <single GC> -StaticDomainControllers <single GC> -StaticConfigDomainController <single GC>

Right after that we moved one of the affected mailboxes to this mailbox server, which was now bound to a single GC, and waited for the AD replication to finish. And voilà, everything started to work as expected! The Outlook clients were now able to establish the GC connection very quickly and open the address book, and users could work fluently again.


Original content from Frank Plawetzki; posted by MSPFE editor Arvind Shyamsundar