I’m sure this has been covered by many other sources on the internet, but I thought I’d put down my thoughts on the matter as many people still don’t understand why the correct load balancing configuration is important.
I’ve been involved in a number of Exchange Server 2010 deployments over the last couple of months, and most of them were upgrades on hosted platforms from Exchange Server 2007 to Exchange Server 2010.
What I noticed in these Exchange Server 2007 deployments was that the load on the Client Access Servers (CAS) was somewhat skewed. This makes sense, because the load balancing was configured for source IP persistence.
What does this mean in a hosting environment?
Well, firstly all clients are connecting to the messaging platform over the Internet behind a NATed IP.
You could potentially have a tenant with 1,000 users behind a single IP. The hosting environment won’t have any visibility into the internal IPs and thus only sees the source IP of the external interface on the tenant’s firewall. If source IP persistence is configured on the load balancer, it will basically send all traffic for that source IP to one CAS server (give or take a few connections).
Something like this:
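To put rough numbers on the skew, here is a minimal Python sketch. It assumes a pool of four CAS servers and models source IP persistence as a hash of the source IP; real load balancers usually keep a persistence table instead, but the effect is the same. The server names, tenant IPs and user counts are made up for illustration.

```python
import hashlib
from collections import Counter

CAS_SERVERS = ["cas1", "cas2", "cas3", "cas4"]  # hypothetical pool members


def pick_by_source_ip(source_ip, servers=CAS_SERVERS):
    """Source IP persistence: the same source IP always maps to the same server."""
    digest = hashlib.sha256(source_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


def simulate(tenants):
    """tenants maps a public (NATed) IP to the number of users behind it."""
    load = Counter({server: 0 for server in CAS_SERVERS})
    for public_ip, user_count in tenants.items():
        # Every user behind the NAT presents the same source IP,
        # so the whole tenant lands on a single CAS server.
        load[pick_by_source_ip(public_ip)] += user_count
    return load


if __name__ == "__main__":
    # Hypothetical tenants: one large, two small.
    tenants = {"203.0.113.10": 1000, "198.51.100.7": 20, "192.0.2.55": 15}
    for server, users in sorted(simulate(tenants).items()):
        print(server, users)
```

Whichever server the large tenant hashes to carries at least 1,000 users, while the others carry a handful or nothing at all.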
This concept is the same for corporate enterprises running their own on-premises Exchange Server 2010 solution. The reason I’m saying it affects corporate deployments as well is that most mobile phones connect to the internet via NATed IPs behind the carrier firewall. So mobile phone ActiveSync connections from a specific carrier will be sent to one CAS box.
So how do we fix this? Configure the correct persistence.
First we check that the load balancer is on the Exchange qualification program for load balancers. The main reason for this is that we’ll know if it was tested and reviewed by Microsoft and the partner for the type of load balancing we want to do. It’s also a very good resource to find deployment guides on the specific load balancer.
When I deployed the Exchange Server 2010 solution we incorporated cookie based persistence on the load balancers for the customer. We did not configure SSL offloading. To keep things simple we configured an SSL bridge, whereby the load balancer decrypts the packets, reads the cookies, then re-encrypts the packets before sending them to the CAS boxes.
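As a rough illustration of what the load balancer does once it has decrypted a request, here is a minimal Python sketch of cookie based persistence. It is not any vendor's implementation: the cookie name and server names are hypothetical, and a real load balancer would normally encode or encrypt the pool member inside the cookie rather than store its name in plain text.

```python
import itertools

CAS_SERVERS = ["cas1", "cas2", "cas3"]  # hypothetical pool members
COOKIE_NAME = "LBPERSIST"               # hypothetical cookie name


class CookiePersistenceBalancer:
    """Toy model of load balancer generated cookie persistence."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._round_robin = itertools.cycle(self._servers)

    def route(self, request_cookies):
        """Return (server, cookies_to_set) for one decrypted HTTP request."""
        server = request_cookies.get(COOKIE_NAME)
        if server in self._servers:
            # Returning client: honour the cookie and set nothing new.
            return server, {}
        # New client (or stale cookie): pick the next server and remember it.
        server = next(self._round_robin)
        return server, {COOKIE_NAME: server}


if __name__ == "__main__":
    lb = CookiePersistenceBalancer(CAS_SERVERS)
    first_server, set_cookies = lb.route({})         # first request, no cookie yet
    second_server, _ = lb.route(set_cookies)         # client sends the cookie back
    print(first_server, second_server)
```

Because the decision is made per client session and not per source IP, users behind the same NATed address are spread across the pool.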
Implementing cookie based persistence can be tricky, but it can also be very easy. It really depends on the person responsible for the load balancer, a role that usually falls to the networking or security team. Personally, I put in a lot of effort to understand how the specific vendor’s load balancer works. I find that this makes discussions with the network engineer easier. If the engineer understands the concepts on the Exchange side and the impact, then it is much easier to implement the correct solution.
What protocols require persistence?
I’ve detailed the recommendations on the specific services below that will help you determine the correct persistence method for optimal load balancing.
Hopefully, this helps some administrators and implementers understand the concept better. As I mentioned earlier, I personally do a lot of research during my planning and deployment phases to help ease configuration on firewalls, load balancers and such.
Some references that you will find very valuable:
Until next time…..happy load balancing :-)
With OWA and EAS having different persistence recommendations, does this mean each service must use its own VIP (as they both use 443), so two virtual servers, each with the recommended persistence type?
I would prefer all Exchange services under a single VIP, but what would be the best choice for this?
I've updated the EAS recommendation to make it a bit more clear - it's also based on cookie persistence using the Authorization header, so you can certainly use a single VIP.
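To make the idea concrete, here is a minimal Python sketch of persistence keyed on the Authorization header. It is an assumption-laden illustration, not a vendor configuration: the server names are hypothetical, the lookup is modelled as a hash where a real load balancer would typically use a persistence table, and the fallback to source IP when the header is missing is my own choice for the sketch.

```python
import hashlib

CAS_SERVERS = ["cas1", "cas2", "cas3"]  # hypothetical pool members


def persistence_key(headers, source_ip):
    """Use the Authorization header as the key, else fall back to source IP.

    Assumes header names have already been normalised to 'Authorization'.
    """
    auth = headers.get("Authorization")
    return auth if auth else source_ip


def pick_server(headers, source_ip, servers=CAS_SERVERS):
    """Map the persistence key to a pool member (hash used for illustration)."""
    key = persistence_key(headers, source_ip)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


if __name__ == "__main__":
    # Hypothetical device sending Basic credentials from two different IPs.
    headers = {"Authorization": "Basic dXNlcjE6cGFzcw=="}
    print(pick_server(headers, "203.0.113.10"))
    print(pick_server(headers, "198.51.100.7"))  # same server as above
```

The same user sticks to one CAS even when the source IP changes, and different users behind one NATed IP can land on different servers.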
I prefer using a single virtual server for all my HTTPS traffic and, depending on the load balancer, I also prefer to configure advanced health monitors on the member pool to check that each service is responding with the correct HTTP response strings. You can associate multiple health monitors with a pool (depending on the load balancer) and configure it to stop sending traffic if one monitor fails, or two, or all of them.
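The monitor logic can be sketched in a few lines of Python. This is only a model of the decision, not a load balancer configuration: the monitor names, the expected response text and the `min_passing` parameter are all hypothetical.

```python
def response_matches(status_code, body, expected_status=200, expected_text=""):
    """A single monitor passes if the status and the expected string both match."""
    return status_code == expected_status and expected_text in body


def member_is_up(monitor_results, min_passing=None):
    """monitor_results maps monitor name -> True/False.

    min_passing=None means every monitor must pass; otherwise the member
    stays in rotation as long as at least min_passing monitors pass.
    """
    required = len(monitor_results) if min_passing is None else min_passing
    passing = sum(1 for ok in monitor_results.values() if ok)
    return passing >= required


def available_members(pool_results, min_passing=None):
    """Return the pool members that should keep receiving traffic."""
    return [
        member
        for member, results in pool_results.items()
        if member_is_up(results, min_passing)
    ]


if __name__ == "__main__":
    # Hypothetical monitor results per CAS server.
    pool = {
        "cas1": {"owa": True, "eas": True, "ews": True},
        "cas2": {"owa": True, "eas": False, "ews": True},
    }
    print(available_members(pool))                 # strict: all monitors must pass
    print(available_members(pool, min_passing=2))  # tolerant: two of three is enough
```

How strict to be is a design choice: requiring every monitor to pass removes a server when a single service breaks, while a lower threshold keeps it in rotation for the services that still work.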
Hope that answers your question :-)
How about if you are using TMG and load balancing external clients to a server farm of CAS servers, so no HLB? From what we see, only cookie auth and source IP based load balancing methods are available. Would cookie auth cause issues for clients, and under what circumstances?
TMG can load balance HTTP traffic and insert a cookie (a load balancer generated cookie) to provide affinity. However, AFAIK the client needs to be able to receive arbitrary cookies so that these cookies are included in all future requests that the client makes to the server. EAS and OA do not support this, so you’ll be stuck with source IP.
Excellent article, thank you! Any experience as to the persistence requirements around an ADFS or ADFS proxy farm using WID or SQL?
Sorry for the late reply, I didn't get the mail notification for the comment.
In an ADFS scenario you don't really need the session to persist to a specific server, as the ADFS cookie is saved on the machine during the secure token assignment. The cookie is removed when the browser closes and also during the sign-out process. You could essentially still use an SSL bridge between the load balancer and the ADFS farm and set up HTTPS health monitors against the ADFS federation service URL.
For external traffic to the ADFS proxy farm you would also configure an SSL bridge and use the load balancer generated cookie to avoid skew towards a particular ADFS proxy server, or you could even use round robin load balancing on the load balancer.
The recommendation on TechNet for ActiveSync is somewhat ambiguous. What would your recommendation be for load balancing ActiveSync where client certificate authentication is being used?
Seems that SSL Session ID is the only possible option here as you can't terminate SSL on the load balancer (as you then can't authenticate to the CAS with the client cert) and it does not seem possible to 'pass-through' the client certificate if using End to End SSL on the load balancer.
Any words of wisdom?
I'm not sure what load balancing solution you are using, but F5 LTM has a feature called Proxy SSL. You can still use cookie persistence based on the Authorization header, and you can configure Proxy SSL to allow the client to authenticate directly against the CAS servers with the supplied certificate.
The Proxy SSL feature enables direct client-server authentication by creating an SSL tunnel between the client and server. It then forwards the SSL handshake messages from the client to the server and vice versa.
Check it out here: support.f5.com/.../15.html
Thanks for the quick feedback. We are using a Cisco ACE, and so far as I can tell it does have an SSL proxy capability, but all the configuration points to it needing to terminate the SSL connection on the Cisco and then re-establish a new SSL session from the Cisco (as the client) to the CAS (as the server), and this is where it breaks down.
I guess this is a potential anomaly with the Cisco product. The only thing we can find is that if you want the ACE to send the user cert, you have to load each individual user cert onto the ACE, which is completely impractical. There doesn't seem to be any configuration option that says "pass through the client authentication cert".
We are using an F5 load balancer to connect to Exchange 2010 CAS. I've created multiple virtual servers (three, to be precise) for handling MAPI, Public Folder and Address Book traffic on ports 135, 59531 and 59532 respectively, with source based affinity for all three VIPs. Once in a while we can't pull up the address book when users click on the address book icon in Outlook. Have you noticed that behavior, and what would you recommend we check?
The first thing I would do is check each CAS in the pool by bypassing the F5, to determine whether it's a load balancer issue or a specific server issue. In your situation clients are connecting to different servers in the pool depending on the connection count, so today you might connect to server 1 and tomorrow to server 2. So check that OAB is working correctly for all the OAB vdirs on each of your CAS servers.
Does the load balancing on the Cisco Catalyst 6504 switch have the ability to achieve session affinity?
Can Exchange 2010 use IIS ARR?
How do you configure IIS ARR for Exchange 2010?
Is it the same as in Exchange 2013?