DNS scavenging is a great answer to a problem that has been nagging everyone since RFC 2136 came out way back in 1997. Despite many clever methods of ensuring that clients and DHCP servers that perform dynamic updates clean up after themselves, DNS can sometimes get messy. Remember that old test server that you built two years ago that caught fire before it could be used? Probably not. DNS still remembers it though. There are two big issues with DNS scavenging that seem to come up a lot:
"I'm hitting this 'scavenge now' button like a snare drum and nothing is happening. Why?"
"I woke up this morning, my DNS zones are nearly empty and Active Directory is sitting in a corner rocking back and forth crying. What happened?"
This post should help us figure out when the first issue will happen and completely avoid the second. We'll go through how scavenging is set up, then I'll give you my best practices.
Scavenging will help you clean up old unused records in DNS. Since "clean up" really means "delete stuff", a good understanding of what you are doing and a healthy respect for "delete stuff" will keep you out of the hot grease. Because deletion is involved, there are quite a few safety valves built into scavenging that take a long time to pop. When enabling scavenging, patience is required. It will work just fine, but not today!
Note: For purposes of this discussion we are going to concentrate on the most common Windows DNS scenario: Windows Server 2003 DNS servers hosting AD integrated zones.
Scavenging is set in three places on a Windows Server: on the individual resource record, on the zone, and on the server.
It must be set in all three places or nothing happens.
To see the scavenging setting on a record hit View | Advanced in the DNS MMC then bring up properties on a record.
Scavenging gets set on a resource record in one of three ways. The first is by someone coming in here, checking the "Delete this record when it becomes stale" checkbox and hitting apply. When you hit apply the time of day will be rounded down to the nearest hour and applied as the timestamp on the record. Static records have a timestamp of 0 indicating do not scavenge.
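The rounding rule above can be illustrated with a minimal Python sketch. This is not how the DNS server implements it; the function name and the sample time are made up for illustration, and only the "round down to the hour" and "0 means static" rules come from the text.

```python
from datetime import datetime

# Per the text, a timestamp of 0 marks a static record that is never scavenged.
STATIC_TIMESTAMP = 0

def round_down_to_hour(moment: datetime) -> datetime:
    """Drop minutes, seconds and microseconds so only the whole hour remains."""
    return moment.replace(minute=0, second=0, microsecond=0)

# Hypothetical moment at which someone ticks the checkbox and hits Apply.
applied_at = datetime(2008, 1, 1, 10, 47, 12)
print(round_down_to_hour(applied_at))  # 2008-01-01 10:00:00
```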
The second is when a record gets created by a client machine registering using dynamic DNS. Windows clients will attempt to dynamically update DNS every 24 hours. All DDNS records get set to scavenge. When a record is first created by a client that has no existing record it is considered an "Update" and the timestamp is set. If the client has an existing host record and changes the IP of the host record this is also considered an "Update" and the timestamp is set. If the client has an existing host record with the same IP address then this is considered a "Refresh" and the timestamp may or may not get changed depending on zone settings. More on this later.
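To keep the "Update" versus "Refresh" distinction straight, here is a rough Python sketch of the rules just described. The function name and arguments are hypothetical and only model the decision; the real server logic is not shown anywhere in this post.

```python
from typing import Optional

def classify_registration(existing_ip: Optional[str], registered_ip: str) -> str:
    """Classify a dynamic registration using the rules described in the text.

    existing_ip is None when the client has no host record yet.
    """
    if existing_ip is None:
        return "update"   # brand new record: timestamp is set
    if existing_ip != registered_ip:
        return "update"   # IP changed: timestamp is set
    return "refresh"      # same IP: timestamp may or may not change

print(classify_registration(None, "192.168.1.50"))            # update
print(classify_registration("192.168.1.50", "192.168.1.51"))  # update
print(classify_registration("192.168.1.50", "192.168.1.50"))  # refresh
```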
The third way to set scavenging on records is by using DNScmd.exe with the /ageallrecords switch. Let's pause here for a few moments to consider a few important words: All, Records, Delete, Stuff. If you actually run this command against a zone it will truly set scavenging and a timestamp on all records in the zone, including static records that you never want to be scavenged. Because of the time it takes scavenging to do its thing, people find this command and get tempted to give it a try. Do not. It will delete stuff. Have patience instead.
Once a timestamp is set on a record it will replicate around to all servers that host the zone. There is one caveat to this. If scavenging is not enabled on the zone that hosts the record then it will never scavenge so the timestamp is essentially irrelevant. The timestamp may get updated on the server where the client dynamically registers but it will not replicate around to the other servers in the zone.
Before a server will even look at a record to see if it will be scavenged the zone must have scavenging enabled. To access the scavenging settings for a zone right click the zone, select properties then on the general tab hit the "Aging" button. This screen is universal for the zone. If you view it on any DNS server where this zone is replicated it will be the same.
When you first set scavenging on a zone the timestamp seen at the bottom (reload zone if you don't see it) will be set to the current time of day rounded down to the nearest hour plus the Refresh interval. This also gets reset any time the zone is loaded or any time dynamic updates get enabled on the zone.
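As a quick illustration of that arithmetic, here is a small Python sketch. The function name is hypothetical, and the 7 day Refresh interval and the sample date are assumed values for the example only.

```python
from datetime import datetime, timedelta

def zone_can_be_scavenged_after(set_at: datetime, refresh_interval: timedelta) -> datetime:
    """Current time rounded down to the hour, plus the Refresh interval."""
    rounded = set_at.replace(minute=0, second=0, microsecond=0)
    return rounded + refresh_interval

# Assumed example: scavenging enabled on the zone at 10:47 with a 7 day Refresh interval.
enabled_at = datetime(2008, 1, 1, 10, 47)
print(zone_can_be_scavenged_after(enabled_at, timedelta(days=7)))  # 2008-01-08 10:00:00
```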
The "zone can be scavenged after" timestamp is the first of your safety valves. It gives clients time to get their record timestamp updated before the big axe swings. Since new record timestamps are not replicated while zone scavenging is disabled this also gives replication time to get things in order.
The next safety valves are the Refresh and No-refresh intervals. Both of these must elapse before a record can be deleted.
The No-refresh interval is a period of time during which a resource record cannot be refreshed. Recall from earlier that a refresh is a dynamic update where we are not changing the host/IP of a resource record, just touching the timestamp. If a client changes the IP of a host record this is considered an "update" and is exempt from the No-refresh interval. The purpose of a No-refresh interval is simply to reduce replication traffic. A change to a record means a change that must be replicated.
After the (Record Timestamp) + (No-refresh interval) elapses we enter the Refresh interval. The Refresh interval is the time when refreshes to the timestamp are allowed. This is the time when good things must happen. The client is allowed to come in and update its timestamp. This timestamp will be replicated around and the No-refresh interval begins again. If for some reason the client fails to update its record during the Refresh interval, it becomes eligible to be scavenged. Will it disappear immediately? Probably not, but it is certainly possible.
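The life of a record timestamp can be sketched as three phases. The Python below is only a model of the description above: the function name is made up, the 7 day intervals are assumed example values, and the exact handling of the boundary instants is my assumption.

```python
from datetime import datetime, timedelta

def record_phase(timestamp: datetime, now: datetime,
                 no_refresh: timedelta, refresh: timedelta) -> str:
    """Return which phase a record is in relative to its timestamp."""
    if now <= timestamp + no_refresh:
        return "no-refresh"   # refreshes are ignored, updates still allowed
    if now <= timestamp + no_refresh + refresh:
        return "refresh"      # the client may touch the timestamp again
    return "eligible"         # both intervals elapsed: may be scavenged

# Assumed example values: both intervals set to 7 days.
stamp = datetime(2008, 1, 1, 10, 0)
week = timedelta(days=7)
print(record_phase(stamp, datetime(2008, 1, 5, 10, 0), week, week))   # no-refresh
print(record_phase(stamp, datetime(2008, 1, 10, 10, 0), week, week))  # refresh
print(record_phase(stamp, datetime(2008, 1, 16, 10, 0), week, week))  # eligible
```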
Note: When setting Refresh and No-Refresh intervals be sure to allow enough time for clients to get several registration attempts during a Refresh interval. Failure to do so could allow a record to become eligible for scavenging simply from a failed refresh attempt.
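One way to sanity check an interval is to count how many registration attempts fit inside it. This Python sketch assumes the 24 hour registration cycle mentioned earlier for Windows clients; the function names and the threshold of three attempts are hypothetical choices, not a documented rule.

```python
from datetime import timedelta

# Hypothetical threshold for what counts as "several" attempts.
MIN_ATTEMPTS = 3

def refresh_attempts(refresh_interval: timedelta,
                     attempt_period: timedelta = timedelta(hours=24)) -> int:
    """Number of whole registration attempts that fit in the Refresh interval."""
    return refresh_interval // attempt_period

def has_enough_attempts(refresh_interval: timedelta,
                        attempt_period: timedelta = timedelta(hours=24)) -> bool:
    return refresh_attempts(refresh_interval, attempt_period) >= MIN_ATTEMPTS

print(refresh_attempts(timedelta(days=7)))     # 7
print(has_enough_attempts(timedelta(days=7)))  # True
print(has_enough_attempts(timedelta(days=2)))  # False
```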
One last thing before we leave the zone setting behind. If you right click on your server you will see the option to "Set Aging/Scavenging for All Zones...". Selecting this will take you to a screen similar to the one above. What does this do? This sets the default settings that will be used if a new zone is created by this server. Unless you check the subsequent box "Apply these settings to the existing Active Directory-integrated zones" it will not touch existing zones.
So you now have a resource record set to scavenge and a zone set to scavenge. All that is left is for somebody to come along, check all the timestamps and delete some stuff. This can be done by any server that hosts the AD integrated zone.
Setting scavenging on the server is done by right clicking the server in the MMC, selecting properties, going to the advanced tab and checking the "Enable automatic scavenging of stale records" checkbox.
The Scavenging Period is how often this particular server will attempt to scavenge. When a server scavenges it will log a DNS event 2501 to indicate how many records were scavenged. An event 2502 will be logged if no records were scavenged. Only one server is required to scavenge since the zone data is replicated to all servers hosting the zone.
Tip: You can tell exactly when a server will attempt to scavenge by taking the timestamp on the most recent 2501/2502 event and adding the Scavenging period to it.
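The tip boils down to one addition. Here is a tiny Python sketch of it; the function name and the sample event time are made up, and the 7 day Scavenging Period is just an assumed value.

```python
from datetime import datetime, timedelta

def next_scavenge_attempt(last_event_time: datetime,
                          scavenging_period: timedelta) -> datetime:
    """Time of the most recent 2501/2502 event plus the Scavenging Period."""
    return last_event_time + scavenging_period

# Assumed example: last 2501/2502 event logged at 6:00 on 1/3/2008, period of 7 days.
last_event = datetime(2008, 1, 3, 6, 0)
print(next_scavenge_attempt(last_event, timedelta(days=7)))  # 2008-01-10 06:00:00
```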
Although you can set every server hosting the zone to scavenge I recommend just having one. The logic for this is simple: If the one server fails to scavenge the world won't end. You'll have one place to look for the culprit and one set of logs to check. If on the other hand you have many servers set to scavenge you have many logs to check if scavenging fails. Worse yet, if things start disappearing unexpectedly you don't want to go hopping from server to server looking for 2501 events.
To facilitate strict control over which server is scavenging for a zone you can use DNSCmd.exe to specify exactly which servers may scavenge. For example the following command will make it so that only 192.168.1.1 and 192.168.1.2 DNS servers are allowed to scavenge on the contoso.com zone:
DNSCmd . /ZoneResetScavengeServers contoso.com 192.168.1.1 192.168.1.2
With the server now scavenging, zones enabled for scavenging, and resource records set, what actually happens when the server does its thing?
When the last 2501/2502 event + the server scavenging period comes around, the server is going to make a scavenging attempt. You can also manually initiate an attempt by right clicking the server and selecting "Scavenge Stale Resource Records". Note that manually making an attempt in no way bypasses the safety valves. These are the final safety valves before we "delete stuff":
If all of the above checks are good then the zone is ready to be scavenged. At this point the scavenging server checks the timestamp on each individual resource record. If the current date/time is greater than the timestamp + No-refresh + Refresh then the record is deleted.
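The per-record check can be modeled in a few lines of Python. This is a sketch of the rule as stated, not the server's code: the function name, the record names and the 7 day intervals are hypothetical, and static records are represented here as None rather than a timestamp of 0.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

def stale_records(records: Dict[str, Optional[datetime]], now: datetime,
                  no_refresh: timedelta, refresh: timedelta) -> List[str]:
    """Names of records whose timestamp + No-refresh + Refresh is in the past.

    A timestamp of None stands for a static record, which is never scavenged.
    """
    window = no_refresh + refresh
    return [name for name, stamp in records.items()
            if stamp is not None and now > stamp + window]

# Hypothetical zone contents; intervals assumed to be 7 days each.
zone = {
    "oldtestserver": datetime(2007, 11, 1, 9, 0),  # has not registered in months
    "workstation1": datetime(2008, 1, 8, 14, 0),   # refreshed recently
    "mailserver": None,                            # static record
}
week = timedelta(days=7)
print(stale_records(zone, datetime(2008, 1, 10, 6, 0), week, week))  # ['oldtestserver']
```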
Here is how I set scavenging up on a preexisting zone. This procedure is designed for maximum safety. Using default settings this process can take as long as 4-5 weeks (2 weeks for the Sanity phase, 2-3 weeks for the Enable phase).
Sift through your DNS records looking for any records older than the Refresh + No-refresh interval. If you see any then something has gone wrong with the dynamic registration process and it must be corrected before proceeding. A thorough check at this point is the most important step in setup.
Things to check if you find old records:
Do not proceed unless you can explain any outdated records. In the next phase they will be deleted.
The final step is to actually enable scavenging. Enable scavenging on the single server you used the /ZoneResetScavengeServers command on.
Once enabled, create a new test record and enable it for scavenging. Then map out the point in time when this record will disappear. Here is how:
Let's look at an example with the following assumptions:
Given these assumptions you can rub your temples for a bit and predict that the record will be deleted at approximately 6am on 1/10/2008.
Once scavenging is enabled you can check back periodically to look for the 2501 and 2502 events to see how things are going. You can also come back at the predicted date and time and see if your test record disappeared.
What's the process for checking the age of DNS records? I've exported the DNS data to a text file but it states "[AGE:3579465]" on all records. What does this mean and how do I interpret these numbers?
Also I can't find a way of distinguishing between dynamic and static DNS entries, how can I confirm a static entry?
Thanks in advance.
The AGE value is a number of hours. The record's timestamp is calculated by adding that number of hours to the date 1/1/1601.
So in C# you can calculate it like this:
DateTime rootTime = new DateTime(1601, 1, 1);
DateTime expires = rootTime.AddHours(age);
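For anyone without a C# compiler handy, the same conversion can be sketched in Python. The function name is made up, and I am assuming the hour count is relative to midnight on 1/1/1601. Since the article says static records carry a timestamp of 0, the sketch returns None for those.

```python
from datetime import datetime, timedelta
from typing import Optional

EPOCH_1601 = datetime(1601, 1, 1)

def age_to_datetime(age_hours: int) -> Optional[datetime]:
    """Convert an [AGE:n] value (hours since 1/1/1601) to a date and time.

    Returns None for 0, which the article describes as a static record.
    """
    if age_hours == 0:
        return None
    return EPOCH_1601 + timedelta(hours=age_hours)

print(age_to_datetime(3579465))  # 2009-05-06 09:00:00
```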
This is a great blog post and explains a lot of things, but I have a problem I am not able to resolve.
It is said that in "Sanity Check" phase, most important step is to check for old Time Stamps for records. I found a script and exported Time Stamp information from one of the DCs, only to find out that if I export from another DC, Time Stamp data is different.
As it happens, Time Stamp info is not replicated between DCs if the zone is not set to be Aging.
How can I check my records if I have a lot of DCs (DNS servers) in different locations/AD sites, and DDNS process works on all of these servers?
How can I be sure what will be deleted, and which server should I choose to perform scavenging?
Here is the script for exporting Time Stamp information from your DNS server:
On Error Resume Next
Const SERVER_NAME = "<DNSServer>"
Const DOMAIN_NAME = "<DomainNameToQuery>"
Const WBEM_RETURN_IMMEDIATELY = &h10
Const WBEM_FORWARD_ONLY = &h20
Set objWMIService = GetObject("winmgmts:\\" & SERVER_NAME & "\root\MicrosoftDNS")
Set colItems = objWMIService.ExecQuery("SELECT * FROM MicrosoftDNS_AType", "WQL", _
WBEM_RETURN_IMMEDIATELY + WBEM_FORWARD_ONLY)
For Each objItem In colItems
    If InStr(1, objItem.DomainName, DOMAIN_NAME, vbTextCompare) > 0 Then
        WScript.Echo "DnsServerName: " & objItem.DnsServerName
        WScript.Echo "DomainName: " & objItem.DomainName
        WScript.Echo "Name: " & objItem.OwnerName
        WScript.Echo "IPAddress: " & objItem.IPAddress
        ' TimeStamp is the number of hours since 1/1/1601; 0 means no timestamp is set
        If objItem.TimeStamp > 0 Then
            WScript.Echo "Timestamp: " & DateAdd("h", objItem.TimeStamp, #1/1/1601#)
        Else
            WScript.Echo "Timestamp: Not Set"
        End If
    End If
Next
Question regarding Sanity Check (the most important step):
We have a lot of DCs which are DNS servers, and the Time Stamp data for records is not the same on all servers, since this information is not replicated before enabling Aging and Scavenging.
How can we make sure that no important Host record will be deleted, when we have inconsistent time stamps among different DNS servers?
I have the same question, I see inconsistent record time stamps across my DNS servers. Even after forcing replication across the Active Directory Integrated Zone.
If anybody has info on why, or how to fix, would appreciate it.
Would sure be nice to know WHICH records were removed and WHY in the default logging. Presumably we can get this info with debug logging, but who runs debug logging on their DNS servers 24/7?
Twice now I've had ACTIVE systems purged from DNS by scavenging. Yes, upon reboot the system will re-register its record, but the business processes that access these systems via DNS 24/7 FAIL until the record is manually recreated or the system is rebooted. FLAWED DESIGN!!
Really, it's a great article about DNS scavenging.
I need help.
We are trying to understand why our AD has grown compared to last year.
When we ran the ADRAP tool it showed DNS nodes (30010),
tombstoned DNS nodes (200001), total (230011).
This might be because of a low scavenging/aging value being set.
What would be the best scavenging settings to use other than the defaults?
Server properties is set to Default 7 days
no-refresh interval set to 2 days
refresh set to 2 days
DHCP lease is 8 hrs.
Can anyone please suggest the best settings for our environment, keeping in mind that the DHCP lease is 8 hrs?
Thanks in advance
I'd recommend that you export DNS records to Excel and examine their timestamps during the sanity check phase. There are excellent instructions on how to do this here:
This will allow you to determine exactly which records will be deleted by the scavenging process.
So the big question here is how can you log the records that were scavenged? I get a 2501 stating x records or nodes scavenged. Is there a way to log all the records that were removed?
The article was very informative. I'm lucky I found a link to it. I have a question about the relationship between the lease time in DHCP and the scavenge period-no refresh interval-refresh interval. If I need to set my DHCP lease to one day what must I set my scavenging parameters at?
Has anyone experienced issues with Linux servers falling out of DNS? We are using a product called Likewise Domain join to place Linux servers in the AD domain. After a period of time, server A records are falling out of the DNS forward lookup zones, but only some of them, not all.
Great article! Any reason why this would delete some of my server A records? I set it up, and for some reason some of them got deleted. In ALL cases, the servers have static IPs with dynamic DNS registration. Thanks.
I have clients with DHCP reservations; leases expire in 30 days and scavenging is enabled for 7 days. So the clients try to renew every 15 days (1/2 of 30) but they're scavenged every 7, so host records are disappearing. Is it just a matter of extending the refresh, no-refresh, and scavenging period to longer than 15 days?
So the Setup Phase involves turning off scavenging on all servers. Then configuring it in the zones. If you haven't actually enabled scavenging at the server level yet and it was never on to begin with, why wouldn't there be older dates still? Nothing is cleaning them up yet as scavenging needs to be set at all three of record, zone, and server for any clean up to happen.
I am missing something in the chain of thought here.