I was recently asked to help out on an issue with a similar theme to other cases we have seen over the years. The topic has never been one that has generated a high number of calls to us but the calls we have received are not easy ones to get an initial handle on.
I was very surprised that I couldn’t find a single comprehensive “soup to nuts” technical article or KB about this topic. This blog post is here to do that from the Active Directory angle. So let’s talk about how printers are published to Active Directory, problems that may arise, and what to do about them.
First, it’s important to understand how printers are published to AD. Basically, when you set up a printer on a computer, the wizard offers you the choice to publish it to AD if the computer is domain joined. Choosing that option tells the local computer’s (the print server’s, really) printer spooler service to communicate with Active Directory and to create a printqueue object there that represents the printer being published, along with its essential properties that will come in handy for users to find and use it.
The printer spooler service will attempt to communicate with AD when it starts or restarts in order to determine whether the printers it hosts are still resident in AD or whether they need to be re-published. For those who may be interested, it does this by querying AD via LDAP for those printers. The directory replies with the info on whether they are there or not, and based on that reply the printer spooler service decides what to do.
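To make the LDAP lookup described above a little more concrete, here is a minimal sketch of the kind of search filter that would find the printqueue objects published by one print server. The exact filter the spooler sends is not documented here, so the use of the serverName attribute, the function names, and the server name in the usage note are all assumptions for illustration only.

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special inside an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def printqueue_filter(server_name: str) -> str:
    """Build a filter for printQueue objects published by one print server.

    Assumption: matching on serverName is only an illustration; the real
    spooler may search on different attributes.
    """
    return "(&(objectClass=printQueue)(serverName={}))".format(
        escape_filter_value(server_name)
    )
```

Passing a hypothetical server name such as print01.barrel.of.monkeys.fun to printqueue_filter yields a filter string that can be pasted into any LDAP query tool to see which printers that server currently has published.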
This is all about making it easy for people on their workstations to look in AD for printers to print to, and to help ensure that those printers are actually still on the network to send your print job to.
Why is that an important thing to have the print server do? Because we provide a mechanism in AD called the printer pruner. It was noticed early on that printers could potentially become stale objects in AD very easily, particularly if there are no real limitations on allowing users elevated privileges on their computers. That is a common struggle in the IT world, where the principle of least privilege is often construed to mean that the IT organization is the least privileged. But I digress.
The printer pruner service is actually a single thread running in the spooler service on domain controllers. It checks periodically to see whether the printqueue objects are still being used, and by default it does this three times over an eight hour period. This behavior can be altered by settings, and those settings can be distributed to the DCs via group policy.
But what can happen if the print server notices that the printqueue objects in AD that represent its printers have been removed, and then there are problems communicating with the DCs to create them anew? You could run into a scenario where the print server creates those objects on several different domain controllers at approximately the same time.
If that happens you will likely end up with two objects (at least) for each actual printer: one which is the actual printqueue object for that printer, and one that has a similar name and directory location but a funky looking name which contains the moniker “CNF:<GUID>” in the middle of it.
When creating Active Directory our development team was aware that there would be times when extra logic and behavior would need to be in place to deal with problems like this. The link below has a top level overview of the different types of directory issues that can be seen. It should be noted that you can actually have a combination of these issues in some cases.
Troubleshooting Directory Data Problems
So we were talking about these printqueue CNF objects. CNFs, or conflict objects, are created when two updates for the same object are received by two different DCs at approximately the same time. These could potentially be created by an administrator manually creating an object of the same name in the same directory location at approximately the same time on two different domain controllers. These changes are then replicated around to those two domain controllers’ peers until the updates reach convergence, at which point the two updates must be dealt with in some manner by the domain controller which is receiving them both. In other words, the receiving domain controller must decide which update to keep and which to discard. There are a few different variations on this conflict scenario which can occur, such as the conflict being only one update for an object, but the concept is the same.
The logic behind how to decide which update to keep and which to discard is outlined in more detail in the article below as what we call a “sibling name conflict”. The discarded update for object adds and similar conflicted updates becomes an object with CNF in the name, along with the GUID of the originating object. It should be noted that the printer pruner will not necessarily discard printqueue CNFs, though it may.
Multimaster Conflict Resolution Policy
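As a rough illustration of the sibling name conflict logic described above, the sketch below compares the replication stamps on the name of two same-named objects and renames the loser. It assumes the commonly documented ordering of a stamp (version, then originating time, then originating DC GUID) and the conflict name format of the original name, a linefeed character, “CNF:”, and the object’s GUID. The class and function names are hypothetical, and comparing GUIDs as plain strings is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class Stamp:
    """Replication stamp on an attribute (simplified)."""
    version: int
    originating_time: datetime
    originating_dsa_guid: str  # simplification: compared as a plain string

    def sort_key(self) -> Tuple[int, datetime, str]:
        return (self.version, self.originating_time, self.originating_dsa_guid)


@dataclass(frozen=True)
class Candidate:
    """One of two objects created with the same name under the same parent."""
    rdn_value: str
    object_guid: str
    name_stamp: Stamp


def resolve_sibling_name_conflict(a: Candidate, b: Candidate) -> Tuple[Candidate, str]:
    """Return the object that keeps its name and the new RDN value for the other."""
    if a.name_stamp.sort_key() > b.name_stamp.sort_key():
        winner, loser = a, b
    else:
        winner, loser = b, a
    # The loser keeps its data but is renamed with a linefeed plus CNF:<objectGUID>.
    conflict_rdn = f"{loser.rdn_value}\nCNF:{loser.object_guid}"
    return winner, conflict_rdn
```

Note that neither object is thrown away in this sketch: both survive replication, which is why the leftover CNF object has to be cleaned up by hand later.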
As an example, here’s the distinguished name for one of those problem objects: CN=Third Floor Laserjet\0ACNF:f529dgt-7452-jd46-w3u8-c21d41fddf356,CN=LostAndFound,DC=barrel,DC=of,DC=monkeys,DC=fun.
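If you have a long list of distinguished names to sift through, a small helper like the sketch below can pick out the conflict objects. It assumes the DNs are in string form with the linefeed shown as \0A, as in the example above, and it only inspects the leftmost name component. The function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches a leftmost RDN of the form <attr>=<name>\0ACNF:<guid>.
# Escaped characters inside the name (such as "\,") are skipped as pairs.
_CNF_RDN = re.compile(
    r"^(?P<attr>[A-Za-z]+)=(?P<name>(?:[^,\\]|\\.)*?)\\0ACNF:(?P<guid>[^,]+)(?:,|$)"
)


def parse_conflict_dn(dn: str) -> Optional[Tuple[str, str]]:
    """Return (original name, GUID text) if the DN names a conflict object, else None."""
    match = _CNF_RDN.match(dn)
    if match is None:
        return None
    return match.group("name"), match.group("guid")
```

Run against the example DN above, this returns the original printer name and the GUID portion; run against an ordinary printqueue DN, it returns None.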
So you’re probably asking yourself what was the “lingering object” part of this? It was in the blog post title, right?
In the recent case that spurred this blog post, the printqueue objects were CNFs which only resided in the Global Catalog partitions for that domain on some Global Catalog DCs. So they were in fact lingering CNF objects.
Lingering objects can occur from a sequence of events where one domain controller goes out of contact with its peers for longer than the tombstonelifetime (the interval a deleted object is retained in Deleted Objects before it is actually deleted, also known as garbage collected), and during that interval one or more objects are deleted from the other domain controllers and that deletion update never reaches the out of contact domain controller. That’s a pretty complex sentence that just means that the object has been really deleted everywhere except that one out-of-contact DC, and not even the update saying “go delete that object” remains to tell that DC (once connected again) to do so.
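The timing condition in that sentence boils down to one comparison, sketched below. The function and parameter names are hypothetical, and the tombstone lifetime has to be supplied by the caller because it differs from forest to forest.

```python
from datetime import datetime, timedelta


def may_hold_lingering_objects(
    last_successful_replication: datetime,
    now: datetime,
    tombstone_lifetime_days: int,
) -> bool:
    """True if a DC has been out of contact for longer than the tombstone lifetime.

    In that case, deletions made elsewhere may already have been garbage
    collected, so the DC can no longer learn about them through replication.
    """
    out_of_contact = now - last_successful_replication
    return out_of_contact > timedelta(days=tombstone_lifetime_days)
```

A True result only says that lingering objects are possible on that DC; whether any actually exist depends on what was deleted elsewhere during the outage.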
Lingering objects can be ones in the domain writable partitions or in the Global Catalog non-writable partition, or a combination of different places in different partitions on different domain controllers. You can even have CNF objects which are lingering objects, which is in fact what our customer had seen.
The after the fact resolution steps for this type of scenario belie the complexity of the behavior underlying it. In most cases it is acceptable to simply delete the CNFs which were created, after verifying that the “real” object exists and is the one being used, not the CNF. In the case where these objects reside on non-writables (in Global Catalog partitions) as lingering objects, rehosting the non-writable partition using repadmin.exe is an acceptable alternative. For cases where the objects are lingering on writable partitions, the “repadmin /removelingeringobjects” switch is the way to go.
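For the writable partition case, the sketch below assembles a “repadmin /removelingeringobjects” command line without running it. It assumes the commonly documented argument order (the DC to clean, the GUID of a known-good source DC, then the naming context) and adds /advisory_mode by default so that the first pass only reports what would be removed. The function name and the values in the usage note are hypothetical; check repadmin’s own help output on your system before relying on it.

```python
from typing import List


def remove_lingering_objects_command(
    dest_dc: str,
    source_dsa_guid: str,
    naming_context: str,
    advisory: bool = True,
) -> List[str]:
    """Build (but do not execute) a repadmin /removelingeringobjects command."""
    argv = [
        "repadmin",
        "/removelingeringobjects",
        dest_dc,          # DC suspected of holding lingering objects
        source_dsa_guid,  # GUID of a DC known to have good data
        naming_context,   # partition to compare, e.g. the domain DN
    ]
    if advisory:
        # Report only; nothing is deleted in advisory mode.
        argv.append("/advisory_mode")
    return argv
```

Joining the result with spaces for a hypothetical DC named dc02.barrel.of.monkeys.fun and the partition DC=barrel,DC=of,DC=monkeys,DC=fun gives a line that can be reviewed and then pasted into an elevated command prompt.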
Unfortunately there is no way to tell precisely what the originating cause(s) of this type of issue were unless you examine the print server’s event logs for network problems or actually have network captures of the printer spooler network traffic as it happens. As far as the events go, they may only give symptoms of the underlying difficulty, which is most commonly network oriented. In instances where there is no remaining data from the print server or elsewhere from when the CNFs were created, we can only infer that there was network difficulty based on the objects we see left behind.
This has been a good example topic that I hope helps folks gain knowledge of AD replication in general as well as assist people who are seeing this specific problem in particular.