Hey folks, I’m here to tell you that I too now have my own story of getting burned by a quirk of claims authentication that I wish had been made clearer to me. It is such a fundamental aspect of deploying claims that I want to call it out front and center here so that the same thing doesn’t happen to you.
Very simply stated, if you’re using claims authentication, you MUST use affinity in your load balancing solution. TechNet does describe this, but only as a very brief side note, and not in an appropriately convincing fashion. The article is at http://technet.microsoft.com/en-us/library/cc288475.aspx and says this:
Note: If you use SAML token-based authentication with AD FS on a SharePoint Foundation 2010 farm that has multiple Web servers in a load-balanced configuration, there might be an effect on the performance and functionality of client Web-page views. When AD FS provides the authentication token to the client, that token is submitted to SharePoint Foundation 2010 for each permission-restricted page element. If the load-balanced solution is not using affinity, each secured element is authenticated to more than one SharePoint Foundation 2010 server, which might result in rejection of the token. After the token is rejected, SharePoint Foundation 2010 redirects the client to authenticate again back to the AD FS server. After this occurs, an AD FS server might reject multiple requests that are made in a short time period. This behavior is by design, to protect against a denial of service attack. If performance is adversely affected or pages do not load completely, consider setting network load balancing to single affinity. This isolates the requests for SAML tokens to a single Web server.
I’ll take the hit for not noticing this and not taking it more seriously, but I’m blogging about this now so hopefully you won’t have to. I’ve italicized the words in the note that clearly do not do this justice (nor should it be a note for that matter – it should be in big bold letters). If you don’t use affinity you will see some of these kinds of issues occur:
In short, there should be no confusion or waffling on this issue going forward – for SharePoint 2010, if you are going to use claims authentication, USE AFFINITY WITH YOUR LOAD BALANCER!
UPDATE 6/22/2012 - My friend Mark P. correctly points out that this affinity is required for FBA too, as well as SAML claims. Make sure you are on top of this for both!
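To make the failure mode concrete, here is a toy model in Python. It is only a sketch and not how SharePoint is implemented: it assumes each web front end keeps its issued tokens in its own memory, so a token accepted by one server is rejected by the others, as the TechNet note describes. The names `ToyWfe`, `count_authentications`, `round_robin` and `single_affinity` are all made up for this illustration.

```python
import itertools


class ToyWfe:
    """Hypothetical web front end that only trusts tokens it issued itself."""

    def __init__(self, name):
        self.name = name
        self._cache = set()                # in-memory only, not shared with other servers
        self._counter = itertools.count(1)

    def authenticate(self):
        """Issue a new token and remember it locally."""
        token = f"{self.name}-token-{next(self._counter)}"
        self._cache.add(token)
        return token

    def accepts(self, token):
        return token in self._cache


def count_authentications(servers, route, request_count):
    """Send request_count requests from one client; count forced (re)authentications."""
    cookie = None
    authentications = 0
    for i in range(request_count):
        server = servers[route(i)]
        if cookie is None or not server.accepts(cookie):
            cookie = server.authenticate()  # the client now holds only the newest token
            authentications += 1
    return authentications


def round_robin(server_count):
    """No affinity: request i goes to server i modulo the farm size."""
    return lambda i: i % server_count


def single_affinity(_server_count):
    """Affinity: every request from this client goes to the same server."""
    return lambda i: 0


if __name__ == "__main__":
    for label, make_route in (("no affinity", round_robin), ("single affinity", single_affinity)):
        farm = [ToyWfe("WFE1"), ToyWfe("WFE2")]
        print(label, count_authentications(farm, make_route(len(farm)), 10))
```

In this model the client authenticates once with affinity, but on every single request without it, which is the redirect storm that can make AD FS start rejecting requests.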
As I understand it, the cookie issued by one WFE in an NLB cluster is not valid on the other WFEs? If we set affinity to single and then have to bring one WFE down for maintenance, everybody on that WFE will have to log in again, and that's not a desired effect.
Is there a workaround? Something related to the certificates used?
Hi Luis, your assessment is correct, and no, there isn't a work-around at this time.
Hi Steve, and thanks for the reply,
Do you think that replacing SPTokenCache may be the way? SPTokenCache has an internal SPSecurityTokenCache that keeps the data in memory, and we could provide a database to persist the data, but there are some internal static functions that are called from other classes/methods, so I'm not sure whether that's a good way to work around it.
No, I can't personally imagine trying to replace the SPTokenCache. It's hard to believe that it would be easy or supported. Certainly a lot more work and a lot more unknowns than just enabling affinity on your load balancer.
Steve, thank you very much for keeping this blog; it is a constant source of information for solution architects like me. We recently encountered this issue and tried to resolve it by adding a custom set of CookieTransforms in OnServiceConfigurationCreated in global.asax. Our thinking was that the cookie is encrypted using the machine's local DPAPI, and that this is why the second node rejects it, but it didn't help. In the process we discovered that the only transform SPTokenCache appears to apply to the cookie is DeflateCookieTransform. Without the RsaEncryptionCookieTransform, is the FedAuth cookie secure at all?
Another question: when you configure ISA, for example, you can set either a cookie-based or a source-IP-based load-balancing mechanism. In most cases you need cookie-based. For something executed on the server, though, you would want source-IP in order to make sure the request stays on the same server. How can this be achieved?
Continuing on the above - an example would be calling a web service from WFE1. In this case, if you do not handle the cookies during the calls, you would want the request to stay on WFE1, not go to WFE2 as it will return a 403 in this case.
Just wanted to say, thanks for the insight you provide in this post. It is much appreciated.
I'm really glad I ran into your post. I have an interesting situation at my current client and I'm trying to figure out if this scenario is even feasible. They want their session to essentially never time out... ever. The only exception to this is if the user clears their browser cache and the FedAuth cookie is removed.
So, my first thought was I would set the STS token lifetime to something like 5 years and use persistent cookies. This seems to work fine....until we deploy to a load balanced environment. My theory is that the sticky session in the load balancer has a session timeout of 20 minutes. So, in effect, after 20 minutes of inactivity, the user could hit a different WFE regardless of the FedAuth cookie that was generated by the STS at the original point of authentication. When this happens, they will need to reauthenticate since the token generated from another WFE will not authenticate properly. Is there an effective way around this? I don't think setting the Sticky Session timeout to something huge is secure or even possible.
Hi, we are running into a similar problem with a Cisco ACE load balancer. We are using Azure, ACS and claims auth to log in to a SharePoint 2010 site behind HTTPS. The problem we hit when adding the second authoring server is what you describe above: we get redirected to the authentication page. ACE does support SSL ID stickiness, but it cannot terminate it... what is your opinion about that?
Sharing a tip about how to know if your sticky session config is really sticky.
You will need to do this on each of your SharePoint Web Front End servers.
1) Open IIS Manager and highlight the SharePoint "site" (WebApplication) you are working with.
2) Go to the HTTP Response Headers and then click Add to add a new HTTP Response header that IIS sends back on all responses.
3) In the Name: field, put in the name of the HTTP header you want to use. Call it something like X-SPServer; it only needs to start with X-. In the Value: field, put in the name of your SharePoint server, or a unique string that will help you identify this particular server.
NOTE: This will cause the web application to restart (its application domain recycles), because the entry gets written to the web.config file of the SharePoint web app.
Then when troubleshooting you just need to look at the HTTP response headers using developer tools (F12), Firebug, HttpFox, Fiddler, etc. They will tell you exactly which SharePoint server you are interacting with, and if it switches to another server you can easily see that.
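If you would rather script that check than watch the headers by hand, here is a minimal Python sketch. It assumes each WFE already returns the X-SPServer header set up in the steps above; the function names and the URL in the usage note are made up for illustration. Redirects are deliberately not followed so that the header is read from the SharePoint response rather than from the identity provider.

```python
import http.cookiejar
import urllib.error
import urllib.request


def find_server_switches(header_values):
    """Return (request_index, previous, current) for each change of server."""
    switches = []
    for index in range(1, len(header_values)):
        previous, current = header_values[index - 1], header_values[index]
        if current != previous:
            switches.append((index, previous, current))
    return switches


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop at the first response instead of following it to the login page."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def sample_server_header(url, header_name="X-SPServer", count=20):
    """Request url count times in one cookie-keeping session; collect the header."""
    jar = http.cookiejar.CookieJar()  # keeps any load balancer affinity cookie
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar), _NoRedirect()
    )
    seen = []
    for _ in range(count):
        try:
            with opener.open(url, timeout=10) as response:
                seen.append(response.headers.get(header_name))
        except urllib.error.HTTPError as err:
            # Redirect and access-denied responses should still carry the IIS header.
            seen.append(err.headers.get(header_name))
    return seen
```

A run might look like `find_server_switches(sample_server_header("https://sharepoint.example.com/"))`, where the URL is a placeholder for your own site. An empty list means every response came from the same server during the sample; it does not prove that affinity survives the load balancer's idle timeout, so repeat the run after a pause if that is what you are chasing.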
Just to make your readers aware: this scenario (same as ours) requires sticky sessions, but be even more cautious if you are planning to deploy SharePoint 2010 to Azure VMs, as Azure DOES NOT support sticky sessions, and as a result you can only get away with using 1 WFE when using claims/FBA auth!
Great article, Steve, thanks for sharing. I have configured my sticky session to last 60 minutes and my FedAuth cookie to last 8 hours, but some users are still experiencing "page not found" errors from time to time.
PING is my identity provider. Any thoughts?
I'm having some issues in a hardware NLB environment. My question is whether the sticky session requirement applies only to the SharePoint WFEs or also to the ADFS WFEs?
My metrics at this time are:
- Least Connections
- Client IP Affinity