SharePoint and SID History not playing well together

Hi,

I struck a problem at a customer and the impact, while it seemed minor on the surface, was actually a big deal for their migration project. In fact, the large team they had assembled to migrate users from one forest to a new forest had stopped work while this issue was investigated.

It relates to SID History and the way Windows queries for and caches Name-to-SID and SID-to-Name lookups from AD. This cache was causing SharePoint to think that a user who wanted to log on was actually a user from the wrong domain, so it would create a new identity for that person within SharePoint.

The scenario is actually very close to this one:

http://blogs.technet.com/b/rgullick/archive/2010/05/15/sharepoint-people-picker.aspx

But the workaround that we found would resolve the problem while they were migrating was pretty cool, so I thought I’d save it for all eternity here as a blog.

It boils down to this:

The LsaCache stores the previously looked-up domain user names and their SIDs. When you ask a DC that holds users carrying both the new SID and the migrated SID (in SID History) at the same time, the DC always links the migrated SID to the new user name, not the old user name. If we can artificially fill the LsaCache on our servers with mappings for OLD USERNAME = OLD SID, then we can act as though no resources have migrated yet.

Here’s the scenario, where users were migrated with SID History from child1.domainA.com to domainB.com:

image 

  1. CHILD1\bob logs onto a workstation in CHILD1 and opens the SPS site in DOMAINB (intranet.domainB.com)
  2. SPS asks IIS, which asks Windows for a local DC to resolve a remote SID: S-1-5-21-[SID_for_CHILD1]-1010
  3. The local DC finds the SID assigned to the migrated user in the global catalog
  4. The local DC returns the account name of the migrated user, DOMAINB\bob
  5. The SPS server adds the result to its LsaCache as a mapping for this SID to the DOMAINB account

So we can see from the picture above that the LsaCache (the table in the bottom right of the drawing) has a mapping for NEW USERNAME = OLD SID, but we want OLD USERNAME = OLD SID.

So, let’s warm up the LsaCache so it looks the way we’d like it to:

image

  1. SPS constantly runs a script to query for the name CHILD1\bob
  2. The local DC queries its Global Catalog and does NOT have a record for this username
  3. The local DC must do its own LSA query to a DC in the domain CHILD1 for this name
  4. The remote DC in CHILD1 finds the user and replies with the SID: S-1-5-21-[SID_for_CHILD1]-1010
  5. The CHILD1 DC returns this to the DOMAINB DC (the DOMAINB DC caches this result in its own LsaCache)
  6. The local DC returns this result to the SPS server
  7. The SPS server adds this entry to its LsaCache

Ah ha! Now our cache looks the way we’d like it, where OLD USERNAME = OLD SID. This way when a query for OLD SID is made, the result from cache will return OLD USERNAME.

image 

  1. CHILD1\bob logs onto a workstation in CHILD1 and opens the SPS site in DOMAINB (intranet.domainB.com)
  2. SPS does NOT ask the local DC for the remote SID, it uses its LsaCache
  3. The LsaCache on SPS replies that the username for the SID S-1-5-21-[SID_for_CHILD1]-1010 is CHILD1\bob

The important step here is the red X where there IS NO STEP. What I mean is that the SharePoint server never talked to the DC to get the OLD SID lookup to return a result, meaning that we relied totally on the warmed up cache on the SPS alone.
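To make the three pictures above a bit more concrete, here is a toy model of the cold and warmed cache in Python. It is only a sketch under assumptions: the class names, the placeholder SIDs and the oldest-entry-first eviction are all made up for illustration, and the real LsaCache internals are not described in this post.

```python
from collections import OrderedDict

# Placeholder SIDs for illustration only; these are not real SIDs.
OLD_SID = "S-1-5-21-CHILD1-1010"
NEW_SID = "S-1-5-21-DOMAINB-2020"


class ToyDC:
    """Stands in for the local DC and its global catalog (hypothetical)."""

    def name_to_sid(self, name):
        return {"CHILD1\\bob": OLD_SID, "DOMAINB\\bob": NEW_SID}[name]

    def sid_to_name(self, sid):
        # The old SID is found in the SID History of the migrated account,
        # so a SID-to-Name query is answered with the DOMAINB name.
        return {OLD_SID: "DOMAINB\\bob", NEW_SID: "DOMAINB\\bob"}[sid]


class ToyLsaCache:
    """Bounded SID -> name cache. Eviction policy is an assumption."""

    def __init__(self, dc, max_size=128):
        self.dc = dc
        self.max_size = max_size
        self.entries = OrderedDict()

    def _store(self, sid, name):
        self.entries[sid] = name
        self.entries.move_to_end(sid)
        while len(self.entries) > self.max_size:
            self.entries.popitem(last=False)  # drop the oldest entry

    def lookup_name(self, name):
        """Name-to-SID query; the answer is cached as SID -> name."""
        sid = self.dc.name_to_sid(name)
        self._store(sid, name)
        return sid

    def lookup_sid(self, sid):
        """SID-to-Name query; answered from cache when possible."""
        if sid in self.entries:
            return self.entries[sid]
        name = self.dc.sid_to_name(sid)
        self._store(sid, name)
        return name


dc = ToyDC()

cold = ToyLsaCache(dc)
print(cold.lookup_sid(OLD_SID))   # cold cache: the DC answers with the new name

warm = ToyLsaCache(dc)
warm.lookup_name("CHILD1\\bob")   # the warming query
print(warm.lookup_sid(OLD_SID))   # warm cache: answered locally with the old name
```

In this model a cache that is too small loses the warmed entry as soon as other lookups arrive, which is the reason the cache size matters in the steps that follow.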

This relies on the LsaCache on the SPS server ALWAYS having the entry for the SID from the CHILD1 domain matching the CHILD1 username, and never matching the DOMAINB username. The only way to ensure this is:

  1. Constantly query from the SPS server for the name CHILD1\username for every user in DOMAINB which has been migrated from CHILD1 and has its SIDHistory migrated with it. Use a tool which invokes LookupAccountName() to locate the SID for the username: CHILD1\username. LookupAccountName is explained here: http://msdn.microsoft.com/en-us/library/aa379159(v=vs.85). I had access to a private tool which would do these queries for us. I suspect that PsGetSid from Sysinternals would be able to help out here too, but we never tried it.
  2. The LsaCache on SPS must be large enough to ensure that the queried entries are never overwritten by entries from DOMAINB. Set the registry value HKLM\System\CurrentControlSet\Control\Lsa\LsaLookupCacheMaxSize (DWORD) = 0x2000 (8192 decimal). If this value does not exist, the system uses a default cache size of 128 entries, which is overwritten too quickly on busy SPS servers. 8192 entries on a pair of load-balanced servers should be able to hold all SIDs for all users accessing the SPS site in the 2 forests (if your forests have more users, you’ll need to increase this).
  3. This is a workaround. The real fix is for users who are migrated from child1.domainA.com to domainB.com with SIDHistory to start using their migrated accounts immediately. After the migration, their CHILD1 accounts should be disabled or deleted, and SIDHistory should be removed from the DOMAINB accounts. Operationally this is very difficult, as it does not allow for an easy testing path or roll-back path.
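For anyone without a lookup tool to hand, the warming query in step 1 could be sketched in Python with ctypes along these lines. This is not the private tool mentioned above and it has not been run against the customer environment; `lookup_account_name` and `warm_cache` are hypothetical helper names, and the ctypes part only works on Windows. It calls LookupAccountNameW and ConvertSidToStringSidW from advapi32.

```python
import time


def lookup_account_name(account):
    """Resolve DOMAIN\\user to a SID string via LookupAccountNameW (Windows only)."""
    import ctypes
    from ctypes import wintypes

    advapi32 = ctypes.WinDLL("advapi32", use_last_error=True)
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    advapi32.LookupAccountNameW.argtypes = [
        wintypes.LPCWSTR,                # lpSystemName (None = local machine)
        wintypes.LPCWSTR,                # lpAccountName
        ctypes.c_void_p,                 # Sid buffer
        ctypes.POINTER(wintypes.DWORD),  # cbSid
        wintypes.LPWSTR,                 # ReferencedDomainName buffer
        ctypes.POINTER(wintypes.DWORD),  # cchReferencedDomainName
        ctypes.POINTER(ctypes.c_int),    # peUse
    ]
    advapi32.LookupAccountNameW.restype = wintypes.BOOL
    advapi32.ConvertSidToStringSidW.argtypes = [
        ctypes.c_void_p, ctypes.POINTER(wintypes.LPWSTR)]
    advapi32.ConvertSidToStringSidW.restype = wintypes.BOOL
    kernel32.LocalFree.argtypes = [ctypes.c_void_p]
    kernel32.LocalFree.restype = ctypes.c_void_p

    sid = ctypes.create_string_buffer(68)  # 68 = SECURITY_MAX_SID_SIZE
    sid_size = wintypes.DWORD(ctypes.sizeof(sid))
    domain = ctypes.create_unicode_buffer(256)
    domain_size = wintypes.DWORD(len(domain))
    use = ctypes.c_int()
    if not advapi32.LookupAccountNameW(
            None, account, sid, ctypes.byref(sid_size),
            domain, ctypes.byref(domain_size), ctypes.byref(use)):
        raise ctypes.WinError(ctypes.get_last_error())

    sid_text = wintypes.LPWSTR()
    if not advapi32.ConvertSidToStringSidW(sid, ctypes.byref(sid_text)):
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        return sid_text.value
    finally:
        kernel32.LocalFree(sid_text)  # the string was allocated by the API


def warm_cache(accounts, lookup=lookup_account_name, rounds=1, pause_seconds=0.0):
    """Query every account name `rounds` times; return the last SID seen for each."""
    resolved = {}
    for round_number in range(rounds):
        for account in accounts:
            try:
                resolved[account] = lookup(account)
            except OSError:
                resolved[account] = None  # lookup failed; keep going
        if round_number < rounds - 1:
            time.sleep(pause_seconds)
    return resolved
```

On the SPS server you would call something like `warm_cache(migrated_accounts, rounds=1000, pause_seconds=60)`, where `migrated_accounts` is your own list of CHILD1\username strings for the migrated users. The round count and the pause are placeholders; how often to repeat the queries is up to you.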

To view the actions as they are performed by LSA Lookups, add these 2 DWORDs to the registry under HKLM\System\CurrentControlSet\Control\Lsa\:

  • LspDbgTraceOptions = 0x1 (1 means “log to a file”, the file is C:\Windows\Debug\Lsp.log)
  • LspDbgInfoLevel = 0x88888888 (all 8s in hex means “log as verbose as possible”)

These keys are explained here:

http://technet.microsoft.com/en-us/library/ff428139(v=ws.10).aspx
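If you would rather script the registry values mentioned in this post (the cache size and the two debug values) than type them into regedit, a small Python sketch that builds `reg add` command lines could look like this. The helper names are made up for illustration, and the script only builds the commands; it does not run them.

```python
# Key and values as given in this post.
LSA_KEY = r"HKLM\System\CurrentControlSet\Control\Lsa"

LSA_VALUES = {
    "LsaLookupCacheMaxSize": 0x2000,   # 8192 cache entries
    "LspDbgTraceOptions": 0x1,         # log to a file
    "LspDbgInfoLevel": 0x88888888,     # most verbose logging
}


def reg_add_command(name, value):
    """Build the argument list for one 'reg add' call that sets a DWORD."""
    return ["reg", "add", LSA_KEY, "/v", name,
            "/t", "REG_DWORD", "/d", str(value), "/f"]


def all_commands(values=LSA_VALUES):
    """Build one command per registry value."""
    return [reg_add_command(name, value) for name, value in values.items()]


for command in all_commands():
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` from an elevated prompt on the SPS server. Leave the two debug values out of `LSA_VALUES` if you only want to change the cache size.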

So, all in all it’s a little complicated, but the workaround of increasing LsaLookupCacheMaxSize and constantly running a script on the SPS server to query for the SIDs of usernames in CHILD1 (with a filter to target only users who had been migrated to domainB) worked well for the customer.

Comments
  • When you say the process that must be run to constantly query for the SID of CHILD1 users, what do you mean by "constantly"?  Just long enough to keep them in cache?  What if you have an environment where a lot of lookups are done?  I think I am having this same problem on one of our customer's SharePoint farms, during a long-timeframe domain migration.  There are disabled accounts in the new domain due to Exchange migration.  I currently have an open Premier support case for this.  Can you help?

  • @Brandon: Yes, the idea is to keep the cache flooded with the entries you want, so "constantly" is up to you to decide.

    If you point the people who are working on your Premier case to this blog and give them my name, I may be able to help them along if needed. I won't be able to help directly just now though.

  • How horrible.  Why are customers not asking Microsoft to fix this issue in SharePoint rather than applying dodgy hacks like this?

  • @Bruce - Would you like to contact me so you can be that customer and start that process up?

  • This is covered in the August 2012 CU.

    The Product Group has added a command line that changes the behavior of this scenario.  When you run this command:

    stsadm.exe -o setproperty -propertyname "HideInactiveProfiles" -propertyvalue "true"

    It will bypass the disabled accounts and query the active domain.  The underlying issue is still unsupported (duplicate SIDs) but SharePoint now has more flexibility to deal with these situations.

  • That is great news Brandon. Thanks for sharing it. I've let my customer know that the stsadm tool has been updated with this workaround.

  • Thanks, so does that mean disregard the workaround in the article and go for the stsadm.exe setting as described above?

  • @MarkM: Yes.

  • Do you know if this fix also made it into the 2007 product line in a post-SP3 CU? We're having the same issue at work. The fix above worked great on our 2013 farm, but we also have a 2007 farm that's experiencing issues similar to what's described here and under similar circumstances (a prolonged 12-month AD migration project).

  • @Jack Fruh: I'm no SharePoint person, but according to comments above, this entire workaround has been superseded by the August 2012 Cumulative Update. The link to this for SharePoint 2007 is here: http://support.microsoft.com/kb/2687330

    You run this command: stsadm.exe -o setproperty -propertyname "HideInactiveProfiles" -propertyvalue "true"

  • Thanks Craig! We used the CU fix on a 2013 and 2010 farm, we also had a 2007 farm that unfortunately pre-dated August 2012. A shotgun update wasn't an option so I used the approach you outlined in the original article.
    I wrote about it here, complete with powershell scripts to query AD for the userlist: http://sharepointjack.com/2014/active-directory-migration-woes-part-2/
