Clean that Active Directory forest of lingering objects

So, you want to clean up your forest of lingering objects before you set your forest to strict?

Good choice! This little database inconsistency can cause big business continuity issues. Switching to strict replication consistency while lingering objects still exist in the forest can cause replication outages, which bring business continuity issues of their own.

   

Alphabet soup in this blog:

TSL = tombstone lifetime

DC = domain controller

GC = global catalog server

W2K = Windows 2000 Server

W2K3 = Windows Server 2003

IFM = install from media

USN = update sequence number

GUID = globally unique identifier

FQDN = fully qualified domain name

WR = writable

RO = read only

DN = distinguished name

NC = naming context (aka partition)

NDNC = non-domain NC

RPC = remote procedure call

Nwr = # of writable DCs

Nro = # of read only DCs

   

   

What are lingering objects?

Lingering objects are objects that exist on one or more DCs but do not exist on other DCs hosting the same partition. They may be introduced in any partition except the schema. They are essentially the result of object delete operations that did not successfully replicate to every DC/GC that hosts the partition of the deleted object. Eventually the tombstoned (deleted) object is garbage collected, which purges the object from the database and destroys all knowledge of the delete. Any DC that never received the delete still holds a live copy, and that copy is the lingering object. They can be introduced through a few mechanisms:

  • Failing replication for more than the tombstone lifetime (TSL)
  • System state restores using a backup that is older than TSL
  • Dcpromos using IFM media that is older than TSL.

   

Do you have lingering objects in your forest?

If you answer any of the following questions with a YES, then lingering objects may exist in your forest.

Has any DC (or one or more partitions on a DC) ever failed to receive inbound replication for more than the tombstone lifetime (TSL) configured on the forest? (The default is 60 days for forests that started with W2K, and 180 days if the first DC in the forest is W2K3 SP1.)

Has any DC been successfully restored using a backup that was older than TSL?

Has a DC ever been promoted with IFM method using IFM media that was older than TSL?
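
Each of the three questions above comes down to the same date comparison: is some timestamp (last good inbound replication, backup date, IFM media date) older than TSL? Here is a minimal sketch of that check. The function name and the dates are made up for illustration, and you would have to supply the real timestamps and your forest's TSL yourself.

```python
from datetime import datetime, timedelta

def older_than_tsl(timestamp, now, tsl_days):
    """Return True if `timestamp` is more than `tsl_days` days before `now`."""
    return now - timestamp > timedelta(days=tsl_days)

# Hypothetical dates, for illustration only.
now = datetime(2007, 11, 8)
last_inbound_replication = datetime(2007, 8, 1)  # 99 days before `now`

# The same gap is a problem with a 60 day TSL but not with a 180 day TSL.
print(older_than_tsl(last_inbound_replication, now, 60))
print(older_than_tsl(last_inbound_replication, now, 180))
```

A True result only means lingering objects may exist. It does not prove that any object was deleted during the gap.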

   

There are other types of database consistency problems beyond the above that will be treated as lingering objects by the OS quarantine logic when Strict Replication Consistency is enforced.

  • USN rollback: See https://support.microsoft.com/kb/875495
  • Abandoned deletes: This is a fairly unknown (and should be rare) phenomenon where an object is deleted on a DC, which replicates the tombstone to a RO neighbor and then dies, is force demoted, or is restored before successfully replicating the tombstone to a writable neighbor. Eventually, after TSL, the GCs garbage collect these objects, which remain alive on the writable DCs for the partition.

   

So how do you clean a forest of lingering objects?

There are a few methods available. This blog will cover using repadmin.exe /removelingeringobjects. The following steps assume all DCs are running W2K3. I plan to write a future blog on other methods that can be used when W2K DCs are in the mix.

   

The command to clean out lingering objects looks like the following.

repadmin /removelingeringobjects <targetDCFQDN> <sourceDCguid> <partitionLDAPdn>

It specifies a target DC by FQDN, a source DC by GUID, and an NC to be cleaned by DN. The target DC is cleaned using the source DC as the reference for the comparison. The reference DC must always be writable for the partition being cleaned; the target DC may be WR or RO.

It can be run in advisory mode to have the DC report an event identifying each lingering object.

repadmin /removelingeringobjects <targetDCFQDN> <sourceDCguid> <partitionLDAPdn> /ADVISORY_MODE
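
If you end up scripting many of these runs, it helps to build the argument list in one place. The following is only a sketch: the helper name is made up, the target FQDN is an assumed example, and the GUID and NC are borrowed from the sample events later in this post. It builds the command and prints it; it does not execute repadmin.

```python
def rlo_command(target_fqdn, source_guid, nc_dn, advisory=False):
    """Build the repadmin argument list in the form shown above."""
    args = ["repadmin", "/removelingeringobjects", target_fqdn, source_guid, nc_dn]
    if advisory:
        # Report lingering objects in the event log without removing them.
        args.append("/ADVISORY_MODE")
    return args

# Assumed example values, for illustration only.
cmd = rlo_command(
    "w2k3entr2-vm3.tailspintoys.com",
    "150efcda-20b4-4f1f-9b48-705665bfc095",
    "DC=tailspintoys,DC=com",
    advisory=True,
)
print(" ".join(cmd))
```

Starting with advisory=True lets you review what would be removed before you commit to a real run.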

   

This command must be run 2(Nwr-1) times to clean the writable DCs for the NC. For NCs that have RO copies (all domain NCs), it must also be run Nro more times.

Per NC, that works out to 2(N-1) runs for the configuration NC and each NDNC, and 2(Nwr-1) + Nro runs for each domain NC, where N = # of DCs hosting the partition. Sum over all NCs for the forest total.

An example forest of 10 GCs, 5 domain NCs (2 DCs each), and 6 application partitions (forestdnszones hosted on all 10 DCs and domaindnszones in each domain hosted on each DC in their respective domains) will require 96 executions of repadmin.
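
The 96 can be checked with a few lines of arithmetic. This sketch assumes my reading of the example: the configuration NC and forestdnszones are writable on all 10 DCs, each domaindnszones is writable on the 2 DCs of its domain, each domain NC has 2 writable DCs plus 8 GCs holding a read only copy, and the schema is left out. The partition labels are just placeholders.

```python
def runs_for_nc(writable, read_only=0):
    """Executions needed for one NC: 2(Nwr-1) + Nro."""
    return 2 * (writable - 1) + read_only

# (writable DCs, read only DCs) per NC in the example forest.
forest = {
    "Configuration": (10, 0),
    "ForestDnsZones": (10, 0),
}
for i in range(1, 6):
    forest[f"DomainDnsZones (domain {i})"] = (2, 0)
    forest[f"Domain {i}"] = (2, 8)

total = sum(runs_for_nc(w, ro) for w, ro in forest.values())
print(total)  # 18 + 18 + 5*2 + 5*10 = 96
```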

Consider the following illustration that explains how the above methodology is the most efficient and thorough approach possible with repadmin /removelingeringobjects.

   

DC1,2,3,4 all host a writable copy of domain A. DC5,6,7,8,9,10 host a read only copy of domain A.

DC1 will be chosen as an initial target for this illustration. DC1 may be clean or dirty with respect to lingering objects.

1) Clean a target DC.

    • Repadmin /removelingeringobjects <DC1> <DC2guid> <domain A LDAP DN>
    • Repadmin /removelingeringobjects <DC1> <DC3guid> <domain A LDAP DN>
    • Repadmin /removelingeringobjects <DC1> <DC4guid> <domain A LDAP DN>

DC1 is now clean as compared to DC2,3,4.

DC1 now becomes the source to be used to clean DC2,3,4.

2) Clean remaining DCs using the target in 1) above as the source DC.

    • Repadmin /removelingeringobjects <DC2> <DC1guid> <domain A LDAP DN>
    • Repadmin /removelingeringobjects <DC3> <DC1guid> <domain A LDAP DN>
    • Repadmin /removelingeringobjects <DC4> <DC1guid> <domain A LDAP DN>

DC2,3,4 are now clean with respect to DC1. This approach makes DC1,2,3,4 consistent with each other.

At this point any writable DC for domain A can be used as a source to clean the DCs hosting a read only copy of domain A.

DC1 will be chosen as the source DC for cleaning the DCs hosting read only copies of domain A.

3) Clean all DCs hosting a read only copy of domain A.

    • Repadmin /removelingeringobjects <DC5> <DC1guid> <domain A LDAP DN>
    • Repadmin /removelingeringobjects <DC6> <DC1guid> <domain A LDAP DN>
    • Repadmin /removelingeringobjects <DC7> <DC1guid> <domain A LDAP DN>
    • Repadmin /removelingeringobjects <DC8> <DC1guid> <domain A LDAP DN>
    • Repadmin /removelingeringobjects <DC9> <DC1guid> <domain A LDAP DN>
    • Repadmin /removelingeringobjects <DC10> <DC1guid> <domain A LDAP DN>

At this point all DCs hosting a read only copy of domain A are consistent with each other and are consistent* with the writable DCs for domain A.

* The abandoned delete scenario is not addressed with the above method. There is no in-box method to discover, report on, and remove objects that are lingering in the writable copies as compared to the read only copies. Working with Microsoft PSS is currently necessary to leverage an internal tool that compares LDIFDE.exe dumps and reports on lingering objects in the writable partition.
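
The three steps in the illustration can be generated mechanically for any NC. Below is a rough sketch that produces the ordered list of (target, source GUID, NC) runs. The function name, the placeholder DC names and GUIDs, and the domain A DN are all made up for illustration. It only plans the runs and does not call repadmin.

```python
def cleanup_plan(nc_dn, writable, read_only=()):
    """Return ordered (target, source_guid, nc_dn) runs for one NC.

    `writable` and `read_only` are lists of (name, guid) pairs.
    The first writable DC plays the role of DC1 in the illustration.
    """
    (first_name, first_guid), rest = writable[0], writable[1:]
    plan = []
    # Step 1: clean the first writable DC against every other writable DC.
    for _, guid in rest:
        plan.append((first_name, guid, nc_dn))
    # Step 2: clean the other writable DCs against the now clean first DC.
    for name, _ in rest:
        plan.append((name, first_guid, nc_dn))
    # Step 3: clean every read only copy against the first DC.
    for name, _ in read_only:
        plan.append((name, first_guid, nc_dn))
    return plan

# Placeholder names and GUIDs matching the illustration.
wr = [(f"DC{i}", f"DC{i}guid") for i in range(1, 5)]
ro = [(f"DC{i}", f"DC{i}guid") for i in range(5, 11)]
for target, source_guid, nc in cleanup_plan("DC=a,DC=example,DC=com", wr, ro):
    print("repadmin /removelingeringobjects", target, source_guid, nc)
```

For the illustration this yields 12 runs, which agrees with 2(Nwr-1) + Nro for 4 writable and 6 read only DCs. For NCs with no read only copies, leave read_only empty and only steps 1 and 2 are produced.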

   

So, how do you apply the above methodology to your forest?

Simple! Of course, you must have RPC connectivity between each source and target identified in the repadmin command.

Apply steps 1 & 2 for all non-domain partitions. This means the configuration partition and all application partitions.

Apply steps 1 & 2 & 3 for all domain partitions.

*** Note ***  

There is a tool available that calls the same API (namely DsReplicaVerifyObjects https://msdn.microsoft.com/en-us/library/ms676035(VS.85).aspx ) used by repadmin /rlo and automates the above process of cleaning all NCs in a forest using a single command line: repldiag.exe https://www.codeplex.com/ActiveDirectoryUtils/Release/ProjectReleases.aspx?ReleaseId=13664

 

What default logging of the process is provided during the exercise?

Every target DC will log details about the cleaning exercise such as a start event, an event for each lingering object purged, and a finish event summarizing the number of lingering objects removed.

The following is an example of the start of a clean cycle on a particular NC.

Event Type: Information
Event Source: NTDS Replication
Event Category: Replication
Event ID: 1937
Date:  11/8/2007
Time:  1:38:23 PM
User:  TAILSPINTOYS\Administrator
Computer: W2K3ENTR2-VM3
Description:
Active Directory has begun the removal of lingering objects on the local domain controller. All objects on this domain controller will have their existence verified on the following source domain controller.
 
Source domain controller:
150efcda-20b4-4f1f-9b48-705665bfc095._msdcs.tailspintoys.com 
 
Objects that have been deleted and garbage collected on the source domain controller yet still exist on this domain controller will be deleted. Subsequent event log entries will list all deleted objects.

Note:   This is worth repeating. "Objects that have been deleted and garbage collected on the source domain controller yet still exist on this domain controller will be deleted." 

If you run the same cleanup command multiple times, you may see 1945 events for deleted objects that were cleaned only because they happened to have been garbage collected on the source DC used in the clean command.  This is of no concern, as those objects would have been purged on the next run of the garbage collection process anyway.  This is more likely in larger, more dynamic environments.

Next are the events specifying the objects deemed lingering that were deleted.  There will be one for every object deleted, so be sure the DS event log is large enough to hold all these events for reporting, and so that other unrelated events are not lost to a full event log.

Event Type: Warning
Event Source: NTDS Replication
Event Category: Replication
Event ID: 1945
Date:  11/8/2007
Time:  1:38:52 PM
User:  TAILSPINTOYS\Administrator
Computer: W2K3ENTR2-VM3
Description:
Active Directory will remove the following lingering object on the local domain controller because it had been deleted and garbage collected on the source domain controller without being deleted on this domain controller. 
 
Object:
CN=retail1003,OU=retail,DC=tailspintoys,DC=com 
Object GUID:
5e83e965-f802-4d7a-8372-d35a43820515
Source domain controller:
150efcda-20b4-4f1f-9b48-705665bfc095._msdcs.tailspintoys.com

Finally, there is a summary event detailing the number of lingering objects deleted on the server.

Event Type: Information
Event Source: NTDS Replication
Event Category: Replication
Event ID: 1939
Date:  11/8/2007
Time:  1:38:52 PM
User:  TAILSPINTOYS\Administrator
Computer: W2K3ENTR2-VM3
Description:
Active Directory has completed the removal of lingering objects on the local domain controller. All objects on this domain controller have had their existence verified on the following source domain controller.
 
Source domain controller:
150efcda-20b4-4f1f-9b48-705665bfc095._msdcs.tailspintoys.com 
Number of objects deleted:
16 
 
Objects that were deleted and garbage collected on the source domain controller yet existed on the local domain controller were deleted from the local domain controller. Past event log entries list these deleted objects.
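
For reporting, the three event types above can be tallied from a text copy of the log. The sketch below assumes the events have already been exported as plain text in the layout shown in this post; how you export them is up to you. The function name is made up. It collects the DN from each 1945 event and the count from the 1939 summary so the two can be compared.

```python
def summarize_cleanup(log_text):
    """Return (removed_dns, reported_count) from exported event text."""
    lines = [ln.strip() for ln in log_text.splitlines()]
    event_id = None
    removed = []
    reported = None
    for i, line in enumerate(lines):
        has_next = i + 1 < len(lines)
        if line.startswith("Event ID:"):
            event_id = int(line.split(":", 1)[1])
        elif event_id == 1945 and line == "Object:" and has_next:
            removed.append(lines[i + 1])   # DN is on the following line
        elif event_id == 1939 and line == "Number of objects deleted:" and has_next:
            reported = int(lines[i + 1])   # count is on the following line
    return removed, reported

# Trimmed excerpt of the events shown above.
sample = """Event ID: 1945
Description:
Object:
CN=retail1003,OU=retail,DC=tailspintoys,DC=com
Object GUID:
5e83e965-f802-4d7a-8372-d35a43820515
Event ID: 1939
Description:
Number of objects deleted:
16
"""
removed, reported = summarize_cleanup(sample)
print(len(removed), "objects listed;", reported, "reported by the summary event")
```

If the number of 1945 events found is lower than the count in the 1939 event, some events were probably lost to a full log or left out of the export (the excerpt here shows only one of the 16).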
