There will be times when you have to make big changes in your Active Directory. Sometimes those big changes mean deleting a lot of objects. I’ve personally needed to match customer environments by creating tens of thousands of AD objects just to have the beginnings of a matching environment. For my test forests I can leave those objects around after I’m done and not have to worry about things.
But if I have a production forest I will probably want to delete unused objects. I’ll also want to reclaim that disk space and the performance lost to indexes filled with these lingering object references.
A more pointed scenario would be maverick provisioning software that created a massive number of unwanted new user objects. Those objects would replicate throughout their domain as well as to the global catalogs across the forest, bloating the AD database.
Such a thing could turn a 50 MB Active Directory database into a 50 GB one in pretty short order.
Whatever the chain of events that got you to this point, you are now in the position of cleanup: deleting all of the unwanted objects. Let’s take a moment for a quick, high-level run-through of how the object deletion process works in AD. When you delete an object in Active Directory Users and Computers, what really happens is that all but a few attributes of that object are discarded, the object is moved to the Deleted Objects container, and it receives a time stamp showing when it was marked for deletion.
This tombstoned object is retained for a length of time known as the tombstone lifetime (TSL). At the end of that period the object will be removed by a thread that runs on each domain controller at startup and about every 12 hours thereafter. That thread is called garbage collection. Picture it as a dumpster-carrying trash truck that pulls up to each deleted object and quickly checks whether the time since deletion exceeds the TSL. If it does, the object goes into the dumpster (figuratively speaking) and is finally removed from the database. This process is also explained here.
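To make that lifecycle concrete, here’s a minimal Python sketch of the tombstone model. This is a toy model, not AD’s internals: the retained attribute set is an illustrative subset, and the sample object and GUID are made up.

```python
from datetime import datetime, timedelta, timezone

TOMBSTONE_LIFETIME = timedelta(days=180)  # newer forests default to 180 days; older ones use 60

# Attributes that survive tombstoning (illustrative subset, not the full list)
RETAINED_ATTRS = {"objectGUID", "objectSid", "distinguishedName", "lastKnownParent"}

def tombstone(obj: dict, when: datetime) -> dict:
    """Toy model of deletion: strip most attributes, mark deleted, stamp the time."""
    stripped = {k: v for k, v in obj.items() if k in RETAINED_ATTRS}
    stripped["isDeleted"] = True
    stripped["whenDeleted"] = when
    return stripped

def expired(ts_obj: dict, now: datetime) -> bool:
    """Garbage collection removes the tombstone once the TSL has fully elapsed."""
    return now - ts_obj["whenDeleted"] > TOMBSTONE_LIFETIME

user = {
    "objectGUID": "hypothetical-guid",
    "cn": "jdoe",
    "mail": "jdoe@contoso.com",
    "distinguishedName": "CN=jdoe,DC=contoso,DC=com",
}
ts = tombstone(user, datetime(2011, 1, 1, tzinfo=timezone.utc))
print("cn" in ts)                                              # False: most attributes are gone
print(expired(ts, datetime(2011, 3, 1, tzinfo=timezone.utc)))  # False: only ~59 days elapsed
print(expired(ts, datetime(2011, 8, 1, tzinfo=timezone.utc)))  # True: past 180 days, eligible
```

The point of the sketch is simply that a “deleted” object lingers as a stripped-down tombstone until the TSL has fully elapsed.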
To do that very quickly (without waiting out the TSL of 60 or 180 days) you would have to do something we don’t recommend: alter your tombstone lifetime (TSL) to a shorter interval, after which garbage collection will remove the objects the next time it runs. We have a KB article that covers the problems you can see when altering this value and why it is generally a bad idea. For the sake of this article we’re going to assume you’re either a Cowboy Admin who has lowered the TSL despite Microsoft’s recommendations, or you have the patience of a saint (Saint Admin? Feels like we should have a patron saint, doesn’t it?).
But once it’s complete you notice that, while the DIT has decreased a little, it hasn’t shrunk anywhere near its original size. What’s going on? Didn’t garbage collection take out all the trash?
The reason the DIT only decreased a small amount is that the dumpster was too small to hold that entire set of deleted objects. There are simply more expired tombstones (objects deleted longer ago than the tombstone lifetime) than fit in one load. Seriously.
When the garbage collection thread runs it takes a batch of 5000 objects that were deleted longer ago than the tombstone lifetime. Once it has removed that batch from the database it pauses to let more important AD business take place. In practice this means only one batch may be processed during a 12-hour garbage collection interval, and you then wait for the next collection to see the next 5000 removed.
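The arithmetic behind that wait is easy to sketch. This assumes the worst case described above: exactly one 5000-object batch per 12-hour interval, and the 250,000-tombstone figure is just an example.

```python
import math

BATCH_SIZE = 5000      # tombstones removed per garbage collection pass
INTERVAL_HOURS = 12    # default garbage collection interval

def hours_to_clear(expired_tombstones: int) -> int:
    """Worst case: one batch per interval, so the last batch is
    collected after ceil(n / 5000) intervals."""
    passes = math.ceil(expired_tombstones / BATCH_SIZE)
    return passes * INTERVAL_HOURS

# 250,000 expired tombstones -> 50 passes -> 600 hours, or 25 days
print(hours_to_clear(250_000) / 24)  # 25.0
```

At one batch per pass, even a few hundred thousand leftover tombstones translate into weeks of waiting.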
Is there a way to speed that process up if you need to? Yes, there is.
You can initiate garbage collection manually by writing to a published rootDSE operational attribute (often loosely called a control). This doesn’t alter which objects are collected, nor how many go into the dumpster. It simply runs garbage collection right away rather than waiting for the next 12-hour interval to pass.
You can use LDP.EXE to perform this modify. Here are the steps:
1. In Ldp.exe, on the Browse menu, click Modify, and leave the Distinguished name box empty (an empty DN targets the rootDSE).
2. In the Edit Entry Attribute box, type DoGarbageCollection.
3. In the Values box, type 1.
4. Set the Operation value to Add, click Enter, and then click Run.
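If you’d rather script it than click through LDP, the same rootDSE modify can be expressed as an LDIF fragment (the empty dn: line targets the rootDSE; the file name below is just an example):

```
dn:
changetype: modify
add: doGarbageCollection
doGarbageCollection: 1
-
```

Importing it with ldifde -i -f dogc.ldf against the domain controller has the same effect as the LDP steps above.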
It’s possible that the garbage collection you start this way could be preempted in favor of more important tasks like AD replication, just as the scheduled garbage collection can be. If that happens, simply repeat the steps above until all of the objects are removed.
How can you tell when they’re all gone? We have a KB article that goes over how to view your deleted objects. Note that you may need to raise the size limit variable if you have a large number of deleted objects.
What about all of that free space? Can we get it all back just by doing the garbage collection and removing all of the objects that qualify?
Online defragmentation may reclaim some of that space (an online defrag occurs as part of garbage collection), but the best thing to do is boot into Directory Services Restore Mode (DSRM) and run an offline defragmentation of the database.
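The offline defragmentation itself is done with NTDSUTIL while booted into DSRM. A typical session looks like the following; the target path C:\Compact is just an example:

```
ntdsutil
activate instance ntds      (Server 2008 only; skip this line on Server 2003)
files
compact to C:\Compact
quit
quit
```

When compaction finishes, follow the instructions the tool prints: copy the compacted ntds.dit over the original and delete the old log files before booting back into normal mode.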
Keep in mind that garbage collection is not replicated in any way, so the garbage collection and database defragmentation routine needs to be performed on each domain controller individually. It isn’t necessarily a problem to force garbage collection on only some domain controllers, but you may see performance differences between those that have had the trash taken out and those that haven’t yet.
For IT folks who are remote from their domain controllers there’s a nice little option for booting to DSRM without resorting to F8 on a keyboard that may be thousands of miles away. Just go to Run, type MSCONFIG, and press Enter. In Server 2003, on the BOOT.INI tab, check the /SAFEBOOT switch and select the DSREPAIR option; in Server 2008, on the Boot tab, check Safe boot and select Active Directory repair.
This scenario may not result from a sudden creation of a huge number of objects; it could be the result of gradual database growth over years in production. The resolution is the same in either case: follow those same steps and just take out the trash.
Garbage Collection Scheduling Enhancements
The process for completing garbage collection has changed in Windows Server 2003 to improve storage conditions in the directory database. Garbage collection removes a maximum of 5,000 objects per pass to avoid indefinitely delaying other directory service tasks. However, the rate at which remaining tombstones are deleted when more than 5,000 tombstones have expired has increased from Windows 2000 Server to Windows Server 2003, as follows:
Windows 2000 Server: If collection stops because of the 5,000-object limit (rather than by running out of objects to collect), the next garbage collection pass is scheduled for half the normal garbage collection interval (by default, every 6 hours instead of 12 hours). Garbage collection continues running at this accelerated pace until all objects have been collected.
Windows Server 2003: Rather than waiting a set time to remove a subsequent set of 5,000 tombstones, a domain controller continues deleting tombstones according to CPU availability. If no other process is using the CPU, garbage collection proceeds. Removing tombstones in this way keeps the database size from increasing inordinately as a result of the inability of garbage collection to fully complete removal of all tombstones during a garbage collection interval.
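Using the same one-batch-per-pass worst case as before, the Windows 2000 acceleration can be quantified with a small sketch. The model (a first pass at the normal interval, then half-interval passes while the limit keeps being hit) is an assumption drawn from the description above.

```python
import math

def w2k_hours_to_clear(expired_tombstones: int, interval: float = 12.0) -> float:
    """Windows 2000 model: first pass after a normal interval; while the
    5000-object limit keeps being hit, further passes run at half interval."""
    passes = math.ceil(expired_tombstones / 5000)
    if passes <= 1:
        return interval
    return interval + (passes - 1) * interval / 2

# 250,000 tombstones: 50 passes -> 12 + 49 * 6 = 306 hours,
# versus 600 hours at a fixed 12-hour schedule
print(w2k_hours_to_clear(250_000))  # 306.0
```

Windows Server 2003 does better still: because it keeps collecting whenever the CPU is idle, a large backlog can in principle drain within a single interval rather than over many.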