In this blog post we’re going to go over a few techniques that are a bit old school but will come in handy for understanding how things work, even if you ultimately use a great monitoring suite like MOM. Now, there are great articles here and here that describe good general ways to start checking your AD replication, and the information in those articles still applies. In this post we’re going to go a bit past them, and a bit to the side of them.
Before we go further we need to go over USN Highwater-marks and Up to Dateness vectors and how they are used. In my experience these are the two data points in tracking updates that are the most confusing in Active Directory replication.
Of course, USNs are Update Sequence Numbers: an ever-increasing counter assigned to updates, unique per domain controller. As updates are received from peer replicas, or as updates originate at that domain controller itself, the next USN in the series is used to signify that update. In other words, USNs are local numbers on each DC. However, those local USNs are monitored by peer domain controllers, which look at the most recent and highest USN in order to help decide whether any of those updates need to be replicated in. If they are not needed then they can be discarded, which is what propagation dampening is.
A recent supportability article had excellent explanations of up-to-dateness vector and high water mark which I’m pasting below:
For each directory partition that a destination domain controller stores, USNs are used to track the latest originating update that a domain controller has received from each source replication partner, as well as the status of every other domain controller that stores a replica of the directory partition. When a domain controller is restored after a failure, it queries its replication partners for changes with USNs that are greater than the USN of the last change that the domain controller received from each partner before the time of the backup.
The following two replication process values contain USNs. Source and destination domain controllers use them to filter updates that the destination domain controller requires.
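To make the two values concrete, here is a minimal toy model in Python of how a source could use them to filter updates. It is only a sketch of the idea, not how the directory service is implemented: the Change class, the changes_to_send function, and the sample USNs are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    local_usn: int    # USN the source DC assigned when it applied the update
    origin_dc: str    # DC where the update was originally made
    origin_usn: int   # USN the originating DC assigned to the update
    description: str


def changes_to_send(source_changes, high_water_mark, utd_vector):
    """Return the changes one destination still needs from one source."""
    needed = []
    for change in source_changes:
        # High-water mark: the destination has already seen everything
        # from this source up to and including this local USN.
        if change.local_usn <= high_water_mark:
            continue
        # Up-to-dateness vector: the destination already holds this
        # originating update, received by some other path, so it is
        # filtered out (propagation dampening).
        if change.origin_usn <= utd_vector.get(change.origin_dc, 0):
            continue
        needed.append(change)
    return needed


# Hypothetical data: three changes sitting on the source DC.
source_changes = [
    Change(101, "server12", 500, "already received from this source"),
    Change(102, "server15", 900, "already received via another DC"),
    Change(103, "server17", 103, "new originating update"),
]
high_water_mark = 101
utd_vector = {"server12": 500, "server15": 900, "server17": 50}

for change in changes_to_send(source_changes, high_water_mark, utd_vector):
    print(change.description)
```

In this toy data only the third change survives both filters. The first is at or below the high-water mark, and the second is already covered by the destination's up-to-dateness vector entry for server15.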
Let’s dig in with a scenario where you are the admin and you have noticed that there is a replication backlog at some AD sites. In this situation we have anecdotal complaints from our help desk that users are created in New York but it is hours, or occasionally even days, before we see those users on DCs in the Los Angeles site. Although it’s sometimes wise to take help desk reports with a grain of salt, this isn’t something you want to ignore.
We have three sites (Los Angeles, Kansas City and New York) and we have DCs in each site. For the question at hand we need to figure out whether there is, in fact, a replication backlog and, if so, how big it is. Repadmin.exe, since it is the Swiss Army knife of AD replication tools, would be the first tool to use (repadmin /showrepl * /csv, that is). However, it is entirely possible to have a backlog of updates between two replicas and not see constant or even intermittent errors from them if they are replicating, albeit replicating slowly.
Now let’s see why the USNHighwater-mark and Up-to-Dateness Vectors are important in tracking updates by using the command “repadmin /showutdvec <hostname> <distinguished name of naming context>”. To understand what is happening between the three DCs Server15 in LA, Server17 in KC, and Server12 in NY we will need to run the showutdvec command once on each server and then examine the results.
Ran on or against Server15:
LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45
KansasCity\server17 @ USN 35282103 @ Time 2009-09-17 12:51:15
NewYork\server12 @ USN 1581572 @ Time 2009-09-21 13:54:39
Ran on or against Server17:
KansasCity\server17 @ USN 36483665 @ Time 2009-09-21 10:54:41
Ran on or against Server12:
KansasCity\server17 @ USN 35295102 @ Time 2009-09-18 07:03:08
Let’s take KC and LA and compare them (NY’s value for server17, 35295102, is in the same range):
KC LOCALLY: server17 @ USN 36483665
LOS ANGELES: server17 @ USN 35282103
Now subtract what LA knows of KC having from what KC has as high water mark:
36483665 minus 35282103 = 1201562
So there is a difference of 1,201,562 between what the Kansas City server named Server17 has and what its peers think it has. This tells us that Server17 has received (from some other DC not listed above) or originated approximately 1.2 million updates and that the LA and New York servers have not processed those updates yet. This also tells us that the KC DC Server17 is receiving inbound updates from the other two sites just fine.
That suggests a replication backlog, since the up-to-dateness vector entry (that USN number above) for Server17 which the LA and NY servers have retained locally for tracking is lower than the USN high-water mark that is actually on the KC server itself.
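If you have more than a handful of DCs, doing that subtraction by hand gets tedious. Below is a rough Python sketch that parses saved showutdvec-style lines in the format shown above and computes the gap for each peer. The function names are made up for this example, and the regular expression assumes the output lines look exactly like the ones in this post.

```python
import re

# Matches lines such as:
# KansasCity\server17 @ USN 35282103 @ Time 2009-09-17 12:51:15
LINE = re.compile(r"^(?P<site>[^\\]+)\\(?P<dc>\S+) @ USN (?P<usn>\d+) @ Time ")


def parse_utdvec(text):
    """Map DC name -> USN for every showutdvec-style line in text."""
    usns = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            usns[match.group("dc").lower()] = int(match.group("usn"))
    return usns


def usn_gap(local_view, peer_view, dc):
    """How far a peer's view of dc lags behind dc's own USN."""
    return parse_utdvec(local_view)[dc] - parse_utdvec(peer_view)[dc]


# Output captured in this post (raw strings keep the backslashes intact).
server15_view = r"""
LosAngeles\server15 @ USN 16531174 @ Time 2009-09-21 13:54:45
KansasCity\server17 @ USN 35282103 @ Time 2009-09-17 12:51:15
NewYork\server12 @ USN 1581572 @ Time 2009-09-21 13:54:39
"""
server17_view = r"KansasCity\server17 @ USN 36483665 @ Time 2009-09-21 10:54:41"
server12_view = r"KansasCity\server17 @ USN 35295102 @ Time 2009-09-18 07:03:08"

print("LA behind KC by", usn_gap(server17_view, server15_view, "server17"))
print("NY behind KC by", usn_gap(server17_view, server12_view, "server17"))
```

With the values captured above this reports a gap of 1201562 for the LA server and 1188563 for the NY server. Keep in mind that the gap is an upper bound on the backlog, not a count of updates the peers actually need.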
Are all of these updates ones that the NY and LA DCs actually need? Perhaps not; it simply depends on the nature of the updates. More than likely propagation dampening will occur as the replicas try to process the updates from KC. Propagation dampening is the routine which assesses whether a received update is needed by the local domain controller or not. If the update is not needed then it is discarded. For those unneeded updates you would see an event like the one below, following a similar event ID 1240, if you have your NTDS diagnostic logging for Replication events turned up:
9/20/2009 10:35:30 AM Replication 1239 Servername
Internal event: The attribute of the following object was not sent to the following directory service because its up-to-dateness vector indicates that the change is redundant.
<distinguishedname of object>
directory service GUID:
That leads us to the question of how to find out more about what those updates are.
To do that we can issue an LDAP query against KC’s DC Server17 for all of the objects that were changed most recently. First we take the USN high-water mark for the given partition from our showutdvec output above and subtract a number from it in order to display the most recent updates on that DC. In our scenario that would be 36483665, and we will subtract 1000 (giving 36482665) in order to query for the most recent 1000 updates.
1. Open LDP.EXE.
2. From the Connection menu select Connect and then press OK in the Connect dialogue that appears.
3. From the Connection menu select Bind and then press OK in the Bind dialogue that appears.
4. Next, click on the Browse menu and select Search.
5. Enter the partition’s distinguished name in the BaseDN field (DC=<partname>,DC=com).
6. Paste the following in the filter field: (usnchanged>=36482665)
7. Select Subtree search.
8. Click on Options and change the size limit to 5000.
9. Still in Options add the following to the Attributes list (each entry separated by semicolon) to those already present: usnchanged;whenchanged
10. Then click Run.
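The filter in step 6 is just the high-water mark minus the number of recent USNs you want to look at. If you repeat this check often, a tiny helper like the Python sketch below saves the arithmetic. The function name is made up for this example; the filter syntax is the same one pasted into LDP above.

```python
def recent_changes_filter(high_water_mark: int, count: int) -> str:
    """Build an LDAP filter covering the last `count` USNs on a DC."""
    if count < 0 or count > high_water_mark:
        raise ValueError("count must be between 0 and the high-water mark")
    return f"(usnchanged>={high_water_mark - count})"


# Values from the scenario in this post.
print(recent_changes_filter(36483665, 1000))
```

Note that the search can return fewer objects than the number of USNs covered, since an object that changed several times only carries its latest uSNChanged value.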
And here is a sample of our result set:
>> Dn: CN=Test134417,OU=Accounting,DC=treyresearch,DC=com
4> objectClass: top; person; organizationalPerson; user;
1> cn: Test134417;
1> distinguishedName: CN=Test134417,OU=Accounting,DC=treyresearch,DC=com;
1> whenChanged: 09/13/2009 15:11:26 Central Standard Time;
1> uSNChanged: 36483650;
1> name: Test134417;
1> canonicalName: treyresearch.com/Accounting/Test134417;
>> Dn: CN=Test134418,OU=Accounting,DC=treyresearch,DC=com
1> cn: Test134418;
1> distinguishedName: CN=Test134418,OU=Accounting,DC=treyresearch,DC=com;
1> uSNChanged: 36483649;
1> name: Test134418;
1> canonicalName: treyresearch.com/Accounting/Test134418;
In this case, after a large sampling of all of the most recent updates to occur on the KC DC, we see that someone or something is creating users named Test<number> in the Accounting OU. Is it some provisioning software that the accounting department uses? A migration from another directory? What if the objects were of some other type, something unique enough to be immediately understood? These are all questions that you can apply to a concern like this once you have an idea about those updates you were looking for.
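When the result set runs to thousands of entries, eyeballing it for a pattern is not practical. As a rough sketch, the Python below tallies the Dn lines of saved LDP output by parent container so that a hot spot like the Accounting OU stands out. The function names are made up for this example, and the comma split is deliberately naive: it assumes no escaped commas in the object's relative distinguished name.

```python
from collections import Counter


def parent_container(dn):
    """Return everything after the first RDN (naive comma split)."""
    return dn.split(",", 1)[1]


def tally_by_container(ldp_output):
    """Count returned objects per parent container."""
    counts = Counter()
    prefix = ">> Dn:"
    for line in ldp_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            dn = line[len(prefix):].strip()
            counts[parent_container(dn)] += 1
    return counts


# A trimmed copy of the sample result set from this post.
sample = """\
>> Dn: CN=Test134417,OU=Accounting,DC=treyresearch,DC=com
1> uSNChanged: 36483650;
>> Dn: CN=Test134418,OU=Accounting,DC=treyresearch,DC=com
1> uSNChanged: 36483649;
"""

for container, count in tally_by_container(sample).most_common():
    print(count, container)
```

The same approach works for grouping by objectClass or by a name prefix such as Test if the container alone does not explain where the updates are coming from.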