Hi AskPFEPlat readers. Tom Moser here. A question I get on a pretty frequent basis from my larger, multi-forest enterprise customers is:
“Do I need to add subnets from Forest A to Forest B so that clients find the correct DC across the trust?”
I usually try to answer that question with a lot of words, a little whiteboarding, and a lot of pointing. I thought, “this needs pictures…” so here you go.
If you’re in a hurry to get back to /r/sysadmin, the short answer is no. If you want to know why, keep reading. Then maybe cross post this for me there.
*** Point of Clarification ***
This post is about a scenario where the subnets in the two forests do not overlap (i.e., a client’s IP address from Forest A is not covered by any subnet in Forest B). This would typically occur in resource forest scenarios with separate networks. For example: federating via trust with Microsoft online services or a trust between a corporate forest and a perimeter forest. Everything you’re about to read below assumes that the client IP from Forest A is not covered by any subnet in Forest B.
In cases where the two forests have conflicting subnets (for example, 10.1.1.0/24 means site “Detroit” in Forest A, but means site “Siberia” in Forest B), there are additional considerations. We will cover these in a later post.
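If you want a quick way to check which of the two scenarios you're in, here's a minimal sketch using Python's standard ipaddress module. The function name and the Forest B subnet list are made up for illustration; in practice you'd export the subnet objects from Sites and Services in the other forest.

```python
import ipaddress

def covering_subnets(client_ip, subnets):
    """Return the subnets that contain client_ip, most specific first."""
    ip = ipaddress.ip_address(client_ip)
    networks = [ipaddress.ip_network(s) for s in subnets]
    matches = [n for n in networks if ip in n]
    # Longest prefix first, since the most specific subnet is the interesting one.
    matches.sort(key=lambda n: n.prefixlen, reverse=True)
    return [str(n) for n in matches]

# Hypothetical subnet definitions exported from Forest B.
forest_b_subnets = ["192.168.50.0/24", "192.168.60.0/24"]

# An empty list means the client IP is not covered in Forest B,
# which is the scenario this post assumes.
print(covering_subnets("10.1.1.50", forest_b_subnets))
```

If the list comes back non-empty, you're in the conflicting-subnet case and the rest of this post doesn't fully apply.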
First, let’s talk about how your workstation, or any domain member, finds a domain controller at startup. To demo this, I configured port mirroring on my VMs in Hyper-V and intercepted the entire network conversation on another VM. For the purposes of demonstration, I’ve filtered the traffic to just DNS, LDAP, and Netlogon responses.
At startup, the first thing a domain member needs to do is authenticate. Almost. Before that, it needs to find a (hopefully local) domain controller. It does this by sending a DNS query to its primary DNS server. The query is simply looking for an LDAP server in the DNS domain of the workstation. My client queried DC01 (primary DNS) for _ldap._tcp.dc._msdcs.corp.milt0r.com (Figure 1).
Figure 1 - First DNS queries at start
The first frame shows the DNS query, and the second shows the response. In the response data, we get a list of all of the SRV records (Figure 2). Examining the frame details for the response, we can see all of the DCs with an LDAP SRV record registered in the global SRV list. This is the list of all DCs in my forest that are configured to globally register SRV records.
Figure 2 - DNS Response Frame Details
Next, the client picks one of those “ARecord” entries and queries the hostname. Here (Figure 3) it queries its DNS server for the IP address of dc04.corp.milt0r.com and receives a response of 10.2.1.11.
Figure 3 - DNS Query: Round 2
You can see that based only on that initial query for _ldap._tcp.dc._msdcs.corp.milt0r.com we’ve resolved an IP address for a DC that is (well, should be) hosting the LDAP service.
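As for how the client "picks one" of those SRV entries: SRV records carry a priority and a weight, and the general rule from RFC 2782 is lowest priority first, then a weighted random choice. Here's a rough sketch of that selection. It is not Netlogon's actual code, and the record values below are illustrative rather than copied from my trace.

```python
import random
from collections import namedtuple

SrvRecord = namedtuple("SrvRecord", "target port priority weight")

def pick_srv(records, rng=random):
    """Pick one SRV record: lowest priority wins, then weighted random."""
    if not records:
        return None
    lowest = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == lowest]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights are zero, so every candidate is equally likely.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]

# Illustrative records; priority 0 and weight 100 are the usual AD defaults.
records = [
    SrvRecord("dc01.corp.milt0r.com", 389, 0, 100),
    SrvRecord("dc02.corp.milt0r.com", 389, 0, 100),
    SrvRecord("dc04.corp.milt0r.com", 389, 0, 100),
]
print(pick_srv(records).target)
```

With equal priorities and weights, any DC in the list can come back, which is why my client ended up talking to a DC in another site first.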
Netlogon now has what it needs to contact the DC. Using the IP address it resolved, the client sends a UDP ping in the form of a UDP LDAP query to the DC (Figure 4).
Figure 4 - UDP LDAP “Ping” Conversation
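For the curious, the "ping" is a search against the DC's rootDSE for the Netlogon attribute, with a filter describing the client. Below is a small sketch that only builds the filter string, based on my reading of the LDAP ping description in the MS-ADTS protocol documentation. The helper name and the default NtVer flag value are assumptions for illustration, and a real client may include additional filter elements.

```python
import struct

def ldap_ping_filter(dns_domain, host, nt_ver=0x00000006):
    """Build an LDAP-ping style filter string.

    nt_ver is a flags value; the default here is illustrative only.
    """
    # NtVer travels as a 4-byte little-endian value, hex-escaped in the filter.
    nt_ver_bytes = struct.pack("<I", nt_ver)
    escaped = "".join("\\{:02x}".format(b) for b in nt_ver_bytes)
    return "(&(DnsDomain={})(Host={})(NtVer={}))".format(dns_domain, host, escaped)

print(ldap_ping_filter("corp.milt0r.com", "WIN8"))
```

The DC answers that search with the Netlogon response you see in the next frame.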
DC04 responds to the LDAP “ping” in the form of a Netlogon SAM Response. If no response is received, it tries another DC. The payload contains (Figure 5):
Figure 5 - More Frame Details
Check that out. ClientSiteName. The response from the UDP LDAP query tells my client which site it (the client) belongs to. Now you know. That value ends up getting written under Netlogon’s Parameters key in the registry on the client machine (HKLM\System\CurrentControlSet\Services\Netlogon\Parameters\DynamicSiteName).
But we still aren’t connected to a local DC. That DcSiteName property indicates that DC04 is in CORPDR. We want a local DC. If you wanted to go way off into the weeds here, you could enable Netlogon debug logging and look for MAILSLOT entries in the log. There you’d see (Figure 6):
Figure 6 - Netlogon.log with debug level logging
It shows us right in the logs that DC04 isn’t a local DC and that it’s going to try to find a DC in a closer site. Immediately after that, we see another DNS query (Figure 7). This query is for another LDAP SRV record, but this time it looks a little different:
Figure 7 - Site specific DNS query
Instead of querying for any DC as the service did at start, the service now performs a site-specific query using _ldap._tcp.CORPHQ._sites.dc._msdcs.corp.milt0r.com. DNS returns a response that contains the SRV records for all of the domain controllers in the CORPHQ site. The frame details for the DNS reply contain (Figure 8):
Figure 8 - Site specific DNS reply - Frame Details
Each ARecord entry contains info about an SRV record. Now we know that DC02 is hosting LDAP on 389. Next we would expect a DNS query for DC02 (Figure 9):
Figure 9 - DNS Query for DC A Record
Based on the response, Netlogon tries the UDP LDAP query again, this time to 10.1.1.11 (in the log above). And there you have it. From this point on, any process using DCLocator or DsGetDcName should use the site specific queries.
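To put the two query shapes side by side, here's a minimal sketch of the decision we just watched in the trace: build the domain-wide name, compare DcSiteName to ClientSiteName, and build the site-specific name if they differ. The function names are mine, not anything Netlogon exposes.

```python
def dc_locator_srv_name(dns_domain, site=None):
    """Build the LDAP SRV name, site-specific when a site is given."""
    if site:
        return "_ldap._tcp.{}._sites.dc._msdcs.{}".format(site, dns_domain)
    return "_ldap._tcp.dc._msdcs.{}".format(dns_domain)

def next_query(dns_domain, dc_site, client_site):
    """Return the site-specific name to query next, or None if the
    responding DC is already in the client's site."""
    if client_site and dc_site.lower() != client_site.lower():
        return dc_locator_srv_name(dns_domain, client_site)
    return None

# First query at startup: no site known yet.
print(dc_locator_srv_name("corp.milt0r.com"))
# DC04 answered from CORPDR but told the client it belongs in CORPHQ.
print(next_query("corp.milt0r.com", "CORPDR", "CORPHQ"))
```

The second line printed matches the name in Figure 7.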
“But how does that help you cross-forest?” you ask. Great question.
I have a forest trust configured between corp.milt0r.com and dmz.milt0r.com. I’ve also got a stub zone configured on my primary DNS server pointing to dmz.milt0r.com. From my client, Win8, I’ll open cmd.exe and run nltest to find a DC in the other forest (Figure 10).
Figure 10 – nltesting…
Examining the network trace shows some interesting information. When my machine started, it performed that generic query to the global SRV list. When I crossed my forest trust and needed to find a DC, it did this (Figure 11):
Figure 11 - Cross-trust DNS Query – Site specific
The first DNS query aimed at the trusting forest looked for _ldap._tcp.CORPHQ._sites.dc._msdcs.dmz.milt0r.com. Weird. We don’t even know if that’s a valid site in dmz.milt0r.com… and according to that “Name Error” message in the response, it isn’t! Based on the capture, we now know that the first DNS query for a DC in another forest looks for a DC in a site whose name matches the client’s site in its own forest. Since it didn’t find one, it falls back to the global list (Figure 12).
Figure 12 - Cross-trust DNS Query – Non-site specific
This one returns a response. From there, we see behavior similar to what we saw in the local forest: a DNS query for the hostname we want to use, then one of the UDP LDAP “pings” (Figure 13).
Figure 13 - A Record Query and LDAP Ping
And then the corresponding MAILSLOT entry in the Netlogon debug log (Figure 14).
Figure 14 - More netlogon logs.
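The cross-forest lookup order from the trace can be sketched as a tiny simulation. The "DNS zone" here is just a dictionary standing in for the dmz.milt0r.com zone, a missing key plays the role of the Name Error, and the DC's FQDN is my assumption for illustration. This is a model of the behavior I observed, not Netlogon's implementation.

```python
def locate_srv(dns_zone, dns_domain, client_site):
    """Try the site-specific SRV name first, then the domain-wide name.

    dns_zone maps lowercase SRV names to lists of DC host names.
    Returns (targets, names_tried).
    """
    names = []
    if client_site:
        names.append("_ldap._tcp.{}._sites.dc._msdcs.{}".format(client_site, dns_domain))
    names.append("_ldap._tcp.dc._msdcs.{}".format(dns_domain))

    attempts = []
    for name in names:
        attempts.append(name)
        targets = dns_zone.get(name.lower())  # None simulates a Name Error
        if targets:
            return targets, attempts
    return [], attempts

# Hypothetical contents of the dmz.milt0r.com zone: no CORPHQ site yet.
dmz_zone = {
    "_ldap._tcp.dc._msdcs.dmz.milt0r.com": ["olddc01.dmz.milt0r.com"],
}
targets, tried = locate_srv(dmz_zone, "dmz.milt0r.com", "CORPHQ")
print(tried)    # both names, site-specific first
print(targets)  # answered from the global list
```

Two names tried, one answer from the global list: exactly the pattern in Figures 11 and 12.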
By now, you’ve probably guessed it. You simply need to create a site in the trusting forest that has the same name as the site in the trusted forest. If I jump on to OLDDC01 in the DMZ forest and fire up Sites and Services, I can add a new site (Figure 15):
Figure 15 - It's a 2003 DC, hence OLDDC01.
The site doesn’t need to actually contain any domain controllers. You’ll just want to ensure that it’s connected to a site that you want to service the authentications and LDAP queries. The rest will happen via automatic site coverage. DCs linked to empty sites recognize that the other site has no DCs and register SRV records there to ensure that clients find a “close enough” domain controller. When that happens, you’ll see a message from Netlogon like the one below (Figure 16).
Figure 16 - Automatic Site Coverage event log message
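If you're wondering which site ends up covering the empty one, here's a simplified sketch. It follows my understanding of the documented tie-breaking (lowest site link cost, then the site with the most DCs, then alphabetical order) and only looks at directly linked sites, so treat it as an approximation. The site names other than CORPHQ, the DC counts, and the costs are made up.

```python
def covering_site(empty_site, dc_counts, site_links):
    """Pick the site whose DCs would cover empty_site.

    dc_counts:  {site name: number of DCs in that site}
    site_links: {(site_a, site_b): cost} for direct site links
    """
    candidates = []
    for (a, b), cost in site_links.items():
        if empty_site not in (a, b):
            continue
        other = b if a == empty_site else a
        if dc_counts.get(other, 0) > 0:
            # Sort key: lowest cost, then most DCs, then alphabetical name.
            candidates.append((cost, -dc_counts[other], other))
    if not candidates:
        return None
    return min(candidates)[2]

# Hypothetical DMZ forest topology with the new, empty CORPHQ site.
dc_counts = {"CORPHQ": 0, "DMZ-Main": 2, "DMZ-Branch": 1}
site_links = {
    ("CORPHQ", "DMZ-Main"): 100,
    ("CORPHQ", "DMZ-Branch"): 300,
}
print(covering_site("CORPHQ", dc_counts, site_links))
```

The takeaway is the same either way: the site link you attach to the empty site decides who answers for it.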
And now when running that nltest command again, we observe:
Figure 17 - Cross-trust DNS Query - Site Specific and successful!
The first query we see (Figure 17) is, once again, site specific… except this time it returns a successful response. Next, it queries the A record and finally performs the LDAP ping. From there, authentication occurs against the site-specific foreign DC. By matching the site name, we’re able to predict and control which DCs the client will use across the forest trust.
My hope is that by this point you have an understanding of how DCLocator works for finding DCs in the domain of the workstation and that you understand how to get it to find specific DCs across a forest trust. Long story short, you don’t need to go out and register all of your subnets in the trusting forest, but you do need to have site names that match, as well as a topology that matches how you’d like the traffic to flow between the two forests. You won’t want to place a matching site name off of some site that contains DCs but is poorly connected and unreliable.
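As a starting point for auditing your own environment, here's a small sketch that compares two lists of site names and reports which client-side sites have no same-named site in the other forest. How you export the site lists is up to you; the function name and sample data are hypothetical.

```python
def sites_missing_in_trusting_forest(client_forest_sites, trusting_forest_sites):
    """Return client-forest site names with no same-named site in the
    trusting forest (comparison ignores case)."""
    existing = {s.lower() for s in trusting_forest_sites}
    return sorted(s for s in client_forest_sites if s.lower() not in existing)

# Sample data: the corp forest sites from this post, and a hypothetical
# DMZ forest that only has a matching CORPHQ site so far.
corp_sites = ["CORPHQ", "CORPDR"]
dmz_sites = ["Default-First-Site-Name", "corphq"]
print(sites_missing_in_trusting_forest(corp_sites, dmz_sites))
```

Any site in the output is one whose clients will fall back to the global list when they cross the trust.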
Please keep in mind that caveat I mentioned at the beginning. This methodology works great when the subnet in Forest A doesn’t exist in Forest B. There could, however, be some unintended results and issues if conflicting subnet definitions exist. Watch for another post on that very soon.
Big thanks to fellow PFE Matt Reynolds for his assistance with this post and for doing some labbing and confirmation around these topics. His contribution provided the info for the follow-up post. Look for that in the next few weeks.
Thanks for reading and post any questions in the comments below!
- Tom "I'm gonna need a 10-20 on that DC" Moser
Awesome post! Thanks for the great detail on the whole process. Just one minor correction... I believe the second sentence under "Anybody home?" should read "the client sends a UDP ping in the form of a UDP LDAP query to the DC."
Thanks, Steve! I've made the correction.
So what happens if there are no matching site names in Forest B? Will the client do a global lookup for any DC so that it can find a DC to talk to? If this is the case, it seems that it would only add a few seconds at most to the boot process while the client waits for the non-existing site lookup to fail before it switches to a global lookup.
You are correct. If there are no matching site names in Forest B, it will fall back to the global list and could get any domain controller in the forest. For most customers, this is OK. When you get into larger customers with hundreds of DCs and sites around the world, with varying levels of connectivity and latency, this becomes problematic and somewhat unpredictable. Throw firewalls in the mix and you've got a recipe for slow logon/authentication issues. It can also make troubleshooting more difficult by requiring that you investigate every domain controller in the event of issues instead of just a subset contained in a site.
Much of that can be solved by properly modifying DNSAvoidRegisterRecords and preventing the DCs in branch or poorly connected locations from registering in the global lists in the first place. Unfortunately that is rarely applied properly in my experience. A combination of the steps outlined in the post with DNSAvoidRegisterRecords applied would make for consistent, predictable access to directory services across forest boundaries.
What kind of network monitoring software did you use?
I used Netmon 3.4 for all of the traces. I configured port mirroring from my Win8 client to another client machine to capture all traffic from boot.
I look forward to seeing your post on overlapping subnets. I have made some changes so the site names match up, but I am still not getting a local DC when performing cross-forest queries on my domain name.
Thanks for this article. I learned some things!
Interesting and informative...Thank You.
This is a great detailed article and this all makes sense to me but I want to see that follow-up on special considerations for conflicting networks. I am curious how you would handle it since there are so many different ways to manipulate how the SRV records are created using site coverage and/or GPO settings.
How is this affected by a one-way forest trust?