News

  • Welcome to the blog for the Microsoft CSS Enterprise Platforms Networking team.

    Disclaimer: All postings are provided "AS IS" with no warranties, and confer no rights. This weblog does not represent the thoughts, intentions, plans or strategies of Microsoft. Because a weblog is intended to provide a semi-permanent point-in-time snapshot, you should not consider out of date posts to reflect current thoughts and opinions.

Preparing the Network for NLB 2008

Windows Server 2008 is here, along with a new version of Network Load Balancing (NLB).  Just as in previous versions, NLB continues to provide an excellent option for scaling many kinds of applications and promoting higher availability.  And while the deployment and configuration of NLB are fairly straightforward, it’s important to ensure the network environment is ready for NLB.

Unicast

If you choose to deploy NLB using unicast, all of the NLB adapters will share a Cluster MAC address, in addition to the Virtual IP (VIP) address.  The idea behind the shared MAC is that when a host communicates with the MAC address for the NLB Cluster, all of the NLB nodes will respond, making it impossible for the switch to associate the MAC address to a particular port.  This in turn will cause the switch to simply flood the frames destined to the Cluster MAC out all of its ports, ensuring that all of the NLB nodes receive the frames.  Problems may arise when using multi-layer switches or virtual network environments if the switch does associate the Cluster MAC or the Virtual IP to a specific port.  In this case, only one NLB node will receive traffic destined to the Virtual IP address of the Cluster, preventing the remaining NLB nodes from sharing the load.  One way to get around this issue is to employ a hub.  By connecting all the NLB nodes into a hub, and then connecting the hub to a port on the switch, all of the NLB nodes will receive the traffic destined to the Cluster.  Another solution is to configure port mirroring on the switch to ensure traffic sent to one of the NLB ports is replicated to all of them.

As mentioned earlier, unicast NLB relies on switch “flooding” behavior to function properly.  If you want to limit the flooded traffic on your network, you can create a separate VLAN encompassing only the ports the NLB nodes are connected to.
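To make the flooding behavior concrete, here is a minimal Python sketch of a learning switch. It is a toy model only, not how any particular switch is implemented; the class name, port numbers, and the example MAC value are all made up for illustration.

```python
class LearningSwitch:
    """Toy layer-2 switch: forward to a learned port, flood unknown destinations."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on as a source

    def learn(self, src_mac, in_port):
        self.mac_table[src_mac] = in_port

    def forward(self, dst_mac, in_port):
        """Return the set of ports a frame destined to dst_mac is sent out of."""
        if dst_mac in self.mac_table:
            return {self.mac_table[dst_mac]}
        return self.ports - {in_port}  # unknown destination: flood


CLUSTER_MAC = "02-bf-0a-00-00-64"  # example value only
NLB_PORTS = {1, 2, 3}              # ports the NLB nodes are plugged into
CLIENT_PORT = 4

switch = LearningSwitch(NLB_PORTS | {CLIENT_PORT})

# Desired case: the switch has no entry for the Cluster MAC, so it floods.
flooded = switch.forward(CLUSTER_MAC, CLIENT_PORT)

# Problem case: the switch has associated the Cluster MAC with port 1.
switch.learn(CLUSTER_MAC, 1)
pinned = switch.forward(CLUSTER_MAC, CLIENT_PORT)

print("flooded to ports:", sorted(flooded))
print("pinned to ports:", sorted(pinned))
```

In the first case every NLB port receives the frame; in the second only the node on port 1 does, which is the failure described above.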

Multicast

You can also opt to deploy NLB using multicast.  With multicast, each NLB node effectively has two MAC addresses: a physical MAC and a multicast MAC.  Switches typically do not associate ports with a multicast MAC address, so the traffic will be flooded out all ports.  The flooding of the multicast traffic may cause unintended network performance issues.  To resolve these issues, you can configure the switch with static mappings of the multicast MAC and the ports that the NLB nodes are connected to.
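When you build the static mappings you need the cluster MAC itself. The Python sketch below assumes NLB's default address derivation (02-BF followed by the four octets of the VIP for unicast, 03-BF followed by the same octets for multicast). That derivation is not stated in this post, and the function name is made up, so treat the result as a starting point and confirm the actual value in the cluster properties before configuring the switch.

```python
import ipaddress


def nlb_cluster_mac(vip, mode="unicast"):
    """Derive the default NLB cluster MAC from the Virtual IP (assumed convention)."""
    prefixes = {"unicast": (0x02, 0xBF), "multicast": (0x03, 0xBF)}
    if mode not in prefixes:
        raise ValueError("mode must be 'unicast' or 'multicast'")
    octets = ipaddress.IPv4Address(vip).packed  # the four bytes of the VIP
    return "-".join(f"{b:02X}" for b in bytes(prefixes[mode]) + octets)


print(nlb_cluster_mac("10.0.0.100", "unicast"))
print(nlb_cluster_mac("10.0.0.100", "multicast"))
```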

NLB Manager

One other point to keep in mind when deploying Windows Server 2008 Network Load Balancing is that the NLB Manager from Windows Server 2003 cannot be used to manage Windows Server 2008 NLB nodes.  You can manage the Windows Server 2008 nodes with the NLB Manager on a Windows Server 2008 server or with Windows Vista if you have the Remote Server Administration Tools (RSAT) installed.

For more information on deploying NLB, including upgrading from Windows Server 2003 NLB, check out the following article:

http://technet2.microsoft.com/windowsserver2008/en/library/d7c4efd2-3cf0-4b3d-9207-4746cab1f9aa1033.mspx?mfr=true

- Baruch Frost

New Networking-related KB articles for the week of April 26 - May 2

Here are the latest Networking-related Knowledge Base articles:

951764  How to enable the port scalability feature for RPC proxies and for applications in Windows Server 2008

950499  You may be unable to use the "netsh interface" context in some Server Core installations of Windows Server 2008

951598  On a computer that is running an Itanium-based version of Windows Server 2008, the Ftp.exe utility crashes when you run the "mput" command

947557  The WINS automatic scavenging process may not start as expected at the expiration of the configured interval on a Windows Server 2008-based computer

951745  After you install a non-English-language Input Method Editor on a Windows Vista-based computer, you cannot enter any numeric character in the WEP box when you try to join a secure wireless network

951025  The Server service and the Workstation service do not start in Windows 2000, and you receive a "The specified file could not be found" error message

951656  UPnP devices may not be displayed in the "My Network Places" folder after you restart a Windows XP-based computer

- Mike Platts

Windows XP Service Pack 3 has been released!

The latest Service Pack for Windows XP, SP3, is now available for download.  Of note in this release, Windows XP with Service Pack 3 will have the ability to be a NAP (Network Access Protection) client.  Also, Wi-Fi Protected Access 2 (WPA2) support is now included (previously available as a separate download for Windows XP SP2).

Windows XP SP3 Released to Web (RTW), now available on Windows Update and Microsoft Download Center

Service Pack 3 Resources for IT Professionals (Microsoft TechNet)

How to obtain the latest Windows XP service pack (Microsoft KnowledgeBase)

List of fixes that are included in Windows XP Service Pack 3 (Microsoft KnowledgeBase)

Thanks to Boyd Benson for his assistance with this post.

-Mike Platts

New Networking-related KB articles for the week of April 19-25

Here are the latest Networking-related KB articles:

948927  Error message when you use SmartCard-only authentication to log on to a Windows Vista-based client computer in a wireless network environment: "Cannot connect to <SSID>: Please contact network administrator"

950923  The SNMP Event Log Extension Agent does not initialize correctly on a computer that is running Windows Vista with Service Pack 1 or Windows Server 2008

949127  You cannot establish a wireless connection by using EAP authentication on a Windows XP-based client computer if the Service Set Identifier (SSID) includes a comma

- Mike Platts

New Troubleshooting section on Technet for Windows Server 2008

A Troubleshooting section was recently added to the Windows Server 2008 Technical Library.  This new section contains troubleshooting information for many areas, including Networking.

The Networking area has quite a number of subsections, including wired and wireless services, network Plug and Play, transports, NDIS, and more.

- Mike Platts

New Networking-related KB articles for the week of April 12-18

Here are the latest Networking-related KB articles:

948505  The gethostbyname function unexpectedly returns the IP addresses in numeric order on a Windows Vista-based computer or on a Windows Server 2008-based computer

951422  The WTSQuerySessionInformation function on a Windows Server 2008-based terminal server returns ambiguous IPv6 address data

948572  A handle leak occurs in a Server Message Block (SMB) session between two Windows Vista-based computers or between two Windows Server 2008-based computers

951037  Information about the TCP Chimney Offload feature in Windows Server 2008

942567  Description of the Windows Vista Feature Pack for Wireless

948180  Error message when you try to automatically connect to a wireless access point that uses shared-mode network authentication in Windows Vista: "Windows cannot connect to <access_point>"

- Mike Platts

Some batch files to take the pain out of capturing traces along with other relevant information

NetMon is a common tool that comes in handy for revealing the behaviour behind various problems. When network engineers receive traces for review, they need some basic information before they can start work:

  1. On which machine was the trace taken?
  2. Were simultaneous traces collected? Often, network issues are revealed only if we have traces from both ends. Consider a simple scenario where client machines are not able to open intranet websites. It’s possible that requests from the client machine get dropped by an intermediate device and never reach the server. Such issues are revealed only if we have traces run simultaneously on the client machine and the server.
  3. What are the IPs of the machines involved?

Well, these are exactly the pain points that I am trying to address using DOS batch scripting and the “nmcap” command-line utility of NetMon 3.

Let’s start with “simultaneous traces”.

@echo off

if "%1"=="" goto Usage

REM Following line is wrapped

start /min cmd.exe /c nmcap /network * /capture /file c:\trace\%computername%.cap /stopwhen /frame "ipv4.DestinationAddress==4.3.2.1"

start /min cmd.exe /c psexec \\%1 "nmcap" /network * /capture /file c:\%1.cap /stopwhen /frame "ipv4.DestinationAddress==4.3.2.1"

echo Press any key to stop the tracing

pause

psexec \\%1 "ping" -n 1 4.3.2.1

ping -n 1 4.3.2.1

ipconfig/all > c:\trace\%computername%.txt

psexec \\%1 "ipconfig" /all > \\%computername%\C$\trace\%1.txt

copy \\%1\C$\%1.cap c:\trace\%1.cap

del \\%1\C$\%1.cap

goto :EOF

:Usage

echo Usage:

echo %0 "remote machine host name"

echo In this case, it is assumed that the directory c:\trace exists on the executing machine and files are stored only in this location.

The above script uses a Sysinternals tool called psexec to run commands on a remote machine. See http://technet.microsoft.com/en-us/sysinternals/bb897553.aspx for more information. You can actually use psexec to run the command on several machines at the same time.

The above script starts tracing on the computer it is executed on and one specified in the argument simultaneously (ok, nearly simultaneously!) and waits for a keystroke (any) to stop it. Thus, one can simply run the script, reproduce the problem, go back to the original command prompt and hit a key! Traces from both machines and also ipconfig information are dumped as computername.cap & .txt in the folder c:\trace on the executing machine.

Note

1. nmcap cannot create the folder specified. So one has to ensure that c:\trace exists or replace that with an existing folder name.

2. This script relies on file copy to get all files in one location, so please do not expect to capture a file copy issue with this one! For that case, change the script to leave the capture files on the local machines.

3. psexec & nmcap require administrative rights on the machines – thus the logged on user should have admin rights on both machines.

To keep things simple, create a folder c:\trace (or any other name you like, as long as you modify the script accordingly), put the script and the required utilities (such as psexec.exe) in that folder, and then run the script from there.

Is ipconfig information not enough? Want to gather MPS reports just then so that the error you just reproduced is captured?

Use MPSRPT_NETWORK /Q in place of ipconfig/all > c:\trace\%computername%.txt. In my experience with MPS, it is best to let it write the cab file %COMPUTERNAME%_MPSReports.CAB to its default location, %systemroot%\MPSReports\Network\Bin\Reports\Cab. Of course, the MPS Reports file name and storage path will differ depending on the speciality.

Thus:

@echo off

if "%1"=="" goto Usage

REM Following line is wrapped

start cmd.exe /c nmcap /network * /capture /file c:\trace\%computername%.cap /stopwhen /frame "ipv4.DestinationAddress==4.3.2.1"

start cmd.exe /c psexec \\%1 "nmcap" /network * /capture /file c:\%1.cap /stopwhen /frame "ipv4.DestinationAddress==4.3.2.1"

echo Press any key to stop the tracing

pause

psexec \\%1 "ping" -n 1 4.3.2.1

ping -n 1 4.3.2.1

MPSRPT_NETWORK /Q

psexec \\%1 "MPSRPT_NETWORK" /Q

copy \\%1\C$\%1.cap c:\trace\%1.cap

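REM %systemroot% expands to a local path such as C:\Windows and is not valid inside a UNC path; the ADMIN$ share points at the remote %systemroot%

copy \\%1\admin$\MPSReports\Network\Bin\Reports\Cab\*.CAB c:\trace\*.cab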

copy %systemroot%\MPSReports\Network\Bin\Reports\Cab\*.CAB c:\trace\*.cab

del \\%1\C$\%1.cap

goto :EOF

:Usage

echo Usage:

echo %0 "remote machine host name"

echo In this case, it is assumed that the directory c:\trace exists on the executing machine and files are stored only in this location.

Want to gather traces and capture a particular event in the Event Logs? A great place to get information on this is Paul Long’s blog http://blogs.technet.com/netmon/archive/2007/02/22/eventmon-stopping-a-capture-based-on-an-eventlog-event.aspx. Again here, I tried to extend it to take traces simultaneously:


@echo off

if "%1"=="" goto Usage

if "%2"=="" goto Usage

REM Following line is wrapped

start cmd.exe /c nmcap /network * /capture /file c:\trace\%computername%.cap /stopwhen /frame "ipv4.DestinationAddress==4.3.2.1" /DisableConversations

start cmd.exe /c psexec \\%1 "nmcap" /network * /capture /file c:\%1.cap /stopwhen /frame "ipv4.DestinationAddress==4.3.2.1" /DisableConversations

cscript //NoLogo EvtMon.vbs %2 %3

psexec \\%1 "ping" -n 1 4.3.2.1

ping -n 1 4.3.2.1

ipconfig/all > c:\trace\%computername%.txt

psexec \\%1 "ipconfig" /all > \\%computername%\C$\trace\%1.txt

copy \\%1\C$\%1.cap c:\trace\%1.cap

del \\%1\C$\%1.cap

goto :EOF

:Usage

echo Usage:

echo %0 remotecomputer EventNumber [LogFile]

echo Logfile is optional. If used, the eventlog name

echo file ie, application, system, security, etc...

echo In this case, it is assumed that the directory c:\trace exists on the executing machine and files are stored only in this location.

Getting to see the pattern? Modify the above to gather whatever information you like.

Modify the nmcap command line to capture on a particular network connection, use a capture filter or start & stop using time as a trigger. For example:

@echo off

if "%1"=="" goto Usage

if "%2"=="" goto Usage

REM Following line is wrapped

start cmd.exe /c nmcap /network %2 /capture /file %3%computername%.cap /stopwhen /timeafter %1 /DisableConversations

ipconfig/all > %3%computername%.txt

goto :EOF

:Usage

echo Usage:

echo %0 time networknumber Capturepath

echo.

echo time is the time in seconds for which you want the trace to run

echo.

echo use * for networknumber to capture on all available networks or use from list below

nmcap/displaynetworks

echo.

echo ensure that you complete the capture path with a "\" eg: "c:\trace\". In case path is omitted, the files are saved in the current directory (as the command prompt)

If you are modifying a script for one specific case, hard-code the values instead of using arguments.

One situation I can’t help sharing is when I discovered a certain pattern in a chain of traces (taken using .chn). Each file, after applying the display filter, was pretty small, and it was simply crazy to refer to several files to explain the pattern. Use the following to merge them:

nmcap /InputCapture 10.cap 11.cap 12.cap 13.cap 14.cap 15.cap 16.cap 17.cap 18.cap 19.cap /Capture /File trace10-19.cap

10.cap, 11.cap etc are the discrete filtered files while trace10-19.cap is the resultant file.
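Typing out a long run of file names gets tedious, so here is a small Python sketch that builds the same merge command for a numbered range. It only assembles the argument list using the nmcap switches shown above; the function name is made up, and the sketch assumes the files are named N.cap in the current directory.

```python
def nmcap_merge_command(first, last, output):
    """Build the nmcap argument list that merges first.cap .. last.cap into output."""
    inputs = [f"{n}.cap" for n in range(first, last + 1)]
    return ["nmcap", "/InputCapture", *inputs, "/Capture", "/File", output]


cmd = nmcap_merge_command(10, 19, "trace10-19.cap")
print(" ".join(cmd))
```

The printed line can be pasted into a command prompt on a machine where nmcap is installed, or the list can be handed to subprocess.run(cmd, check=True).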

Contributed by: Rajeev Narshana

Windows Server 2008 Technical Library

The Windows Server 2008 Technical Library is a great resource for finding information about performing common tasks and operations for Windows technologies.  The content in the networking section of this library offers some great information on many areas, like DHCP, Network Access Protection (NAP), Network Policy Server (NPS), Netsh, RRAS, SNMP, Windows Firewall with Advanced Security (WFAS), and more.

Looking for information about Windows Firewall with Advanced Security and IPsec?  The guide provides documentation regarding product evaluations, getting started, planning architecture, deployment, operations and troubleshooting.

How about NAP Step-by-Step Guides for IPsec, 802.1X, VPN or DHCP NAP Enforcement?

Some sections currently have placeholders that indicate "This document is not yet available."  Keep checking back, as these sections will be updated when the content is available.

Similar content for Windows 2003 is also available in the Windows Server 2003: Operations guide.

 

-Michael Vargo

New Networking-related KB articles for the week of April 5-11

Here are the most recent networking-related KB articles:

951008  No firewall is enabled after you upgrade a Windows Server 2003-based NAT/Basic Firewall router to Windows Server 2008

951005  The Network Policy Server may not log successful authentication events or failed authentication events in Event Viewer in Windows Server 2008

951006  Hyper-V virtual machines cannot reach the network when the vLan tagging is enabled on a Windows Server 2008-based computer

950094  A program may stop responding when it calls RPC functions on a Windows Server 2003-based computer that has remote access connections or VPN connections established

945553  MS08-020: Vulnerability in DNS client could allow spoofing

950676  PING commands to a Windows Server 2003-based computer may fail if you have enabled the Routing and Remote Access service by configuring the "NAT and basic firewall" and "LAN routing" services

946565  On a Windows Server 2003-based computer that has the update from security bulletin MS07-062 installed, you may experience a memory leak in DNS

951013  Error message when you establish an outgoing remote access connection in Windows Server 2003: "Error 734 - The PPP link control protocol terminated" or "TCP/IP CP reported error 31: A device attached to the system is not functioning"

947334  Error message when you try to connect a Windows Vista-based computer to a network projector in Windows Meeting Space: "The Network Projector could not be added to the meeting"

950134  On a Windows Vista-based computer, an application that uses the EnableStatic method of the Win32_NetworkAdapterConfiguration class may not always set a static IP address for a network adapter

949984  Changes to the 802.1X-based wired network connection settings in Windows XP Service Pack 3

- Mike Platts

How to benefit from Link-Local Multicast Name Resolution.

In a nutshell, Link-Local Multicast Name Resolution (LLMNR) resolves single label names (like: COMPUTER1), on the local subnet, when DNS devolution is unable to resolve the name.  This is helpful if you are in an Ad-Hoc network scenario, or in a scenario where DNS entries do not include hosts on the local subnet.

In order to benefit from LLMNR, you need to enable Network Discovery on all nodes on the local subnet.  In Microsoft operating systems, this option and LLMNR functionality are only included on Windows Vista and Windows Server 2008.

My testing of LLMNR has uncovered a couple of points of interest:

  • If Network Discovery is not enabled on a client, it will still send out an LLMNR request unless it has been disabled via group policy.  To disable LLMNR via group policy, set the following group policy value:

    Group Policy = Computer Configuration\Administrative Templates\Network\DNS Client\Turn off Multicast Name Resolution. (Enabled = Don't use LLMNR, Disabled = Use LLMNR)

  • However, a host will not respond to the LLMNR request if Network Discovery is not enabled. 

This limitation is important because, by default, a network where LLMNR is likely to be most useful is an Ad-Hoc network, such as a few friends at a coffee shop on a Wi-Fi network.  In these scenarios, Network and Sharing Center is most likely going to classify the network as a Public network.  This classification, in addition to enforcing the public firewall profile, will turn off Network Discovery, File Sharing, Public Folder Sharing and Printer Sharing.  Therefore, none of the hosts will respond to LLMNR requests since Network Discovery is turned off.

Network Discovery can be turned on in these scenarios by going to the Control Panel and double clicking Network and Sharing Center.  Then, under Sharing and Discovery, select Network Discovery.  Click the option Turn on Network Discovery and click Apply.  You will be prompted to accept the associated security risk of being discoverable on a public network.  After enabling Network Discovery on each host, they will respond to LLMNR requests and you will be able to resolve the IP of computers by single label name.
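If you want to see what an LLMNR request looks like on the wire, the Python sketch below builds one for a single-label name. It relies on details from RFC 4795 that are not covered in this post: LLMNR reuses the DNS message layout and, on IPv4, sends queries to multicast group 224.0.0.252 on UDP port 5355. The function name and the query ID are made up, and the sketch only builds the packet; it does not send anything.

```python
import struct

LLMNR_IPV4_GROUP = "224.0.0.252"  # IPv4 link-local multicast group (RFC 4795)
LLMNR_PORT = 5355                 # UDP port (RFC 4795)


def build_llmnr_query(name, query_id=0x1234, qtype=1):
    """Build an LLMNR query for a single-label name (qtype 1 = A record)."""
    label = name.encode("ascii")
    if not 1 <= len(label) <= 63 or b"." in label:
        raise ValueError("expected a single label of 1-63 ASCII characters")
    # Header: ID, flags (all zero for a query), QDCOUNT=1, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    # Question: length-prefixed label, root terminator, QTYPE, QCLASS=IN
    question = bytes([len(label)]) + label + b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question


packet = build_llmnr_query("COMPUTER1")
print(len(packet), "bytes:", packet.hex())
```

Sending the packet with a UDP socket to (LLMNR_IPV4_GROUP, LLMNR_PORT) while capturing in NetMon is one way to check which hosts on the subnet actually respond.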

For a very good description of what Link-Local Multicast Name Resolution is, and how it works, see this article from The Cable Guy : http://technet.microsoft.com/en-us/library/bb878128.aspx

 

Windows Wireless and Cisco ACS Machine Access Restriction don’t always play nice together

Recently, I have been seeing a number of customers reporting problems attempting to implement a specific feature of Cisco's Access Control Server (ACS) 4.0.  This feature is called Machine Access Restriction (MAR).

The issue that arises is that this feature can sometimes inadvertently lock out a legitimate client, forcing the client to reboot in order to regain access to the network.  First, let’s talk about what this feature does.

MAR basically attempts to solve a common problem inherent in most of the current and popular EAP methods, namely that machine authentication and user authentication are separate, unrelated processes. Clients can be authenticated as a machine or as a user, but not as both simultaneously. How do you tie a user's authentication to trusted hardware that also needs to authenticate?

Take, for example, a Windows Vista client.  As the machine boots it will present its credentials to RADIUS for validation.  This is a distinct individual process.  When the user logs in, the RADIUS server handles the user's authentication also as a separate process and has no method of checking to see if the hardware the user is logging in from is trusted or belongs to the domain. 

MAR addresses this by keeping track of all the machine authentications.  It tracks the Calling-Station-Id attribute of all successful machine authentications and stores them in a database for a defined period of time, the default being 24 hours. When a user authentication is processed, MAR will check to see if the Calling-Station-Id of the user's request matches a record in its database of valid machines.  If it matches, the user is allowed to authenticate; if it does not the request is rejected.
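The bookkeeping described above can be sketched in a few lines of Python. This is only a model of the behavior as described here, not Cisco's implementation; the class name, method names, and the Calling-Station-Id values are made up for illustration.

```python
from datetime import datetime, timedelta


class MarCache:
    """Toy model of a MAR cache keyed by Calling-Station-Id."""

    def __init__(self, ttl=timedelta(hours=24)):
        self.ttl = ttl
        self._machines = {}  # Calling-Station-Id -> time of last successful machine auth

    def record_machine_auth(self, calling_station_id, now):
        self._machines[calling_station_id] = now

    def allow_user_auth(self, calling_station_id, now):
        """Allow a user authentication only if the machine entry is still cached."""
        seen = self._machines.get(calling_station_id)
        return seen is not None and now - seen < self.ttl


mar = MarCache()
boot = datetime(2008, 5, 5, 8, 0)
mar.record_machine_auth("00-11-22-33-44-55", boot)

same_day = mar.allow_user_auth("00-11-22-33-44-55", boot + timedelta(hours=9))
next_day = mar.allow_user_auth("00-11-22-33-44-55", boot + timedelta(hours=25))
unknown = mar.allow_user_auth("66-77-88-99-AA-BB", boot)
print(same_day, next_day, unknown)
```

Note that only a machine authentication refreshes the entry. A user authentication from the same Calling-Station-Id 25 hours after the last machine authentication is rejected, no matter how often the user authenticated in between.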

This sounds great, so what is the problem?

Well, while MAR is a tremendous step forward in addressing the problem described above, it does not take into account how the Windows wireless supplicant works.  I described the Windows 802.1x process briefly above, and the point that should be taken away is that when no user is logged on, Windows will use its machine credentials to establish the network connection.  When the user logs on, we change security contexts so that the logged on user credentials are used for all subsequent 802.1x authentication prompts.  There are some exceptions to this, but suffice it to say that whatever context we are in, we will only use one set of credentials.

Let’s take a look at the next two scenarios and we will see why MAR can sometimes cause problems.

Scenario 1:

Mr. CEO has been working all day on his laptop connected via a wireless connection.  At the end of the day, he simply closes the lid and heads home.  This puts the laptop in hibernation.  The next day, Mr. CEO comes back into the office and powers his laptop up.  Now he is unable to establish a wireless connection.  Yesterday he had no problems whatsoever, but today he cannot connect.  What has happened?

When you hibernate your Windows system, you take a snapshot of the system in its current state, including the context of who is logged on.  During the night, the MAR cached entry for Mr. CEO's laptop expired and was purged.  However, when the laptop was powered up, it did not do machine authentication. It instead went straight into a user authentication since that was what the hibernation recorded.  The only way to resolve this is to log off Mr. CEO or to reboot his computer.

Scenario 2:

As system administrator, you have implemented 802.1x authentication on your wired network.  You have implemented a MAR cache timer of 24 hours.  The first day goes off without a hitch, but the second day you start seeing support calls that users cannot access the network.  The third day you see even more reported issues.  However, not everyone is experiencing the problem.

The problem is the same as above - the MAR cache entries are expiring.  When the end users reboot or log off, the problem is resolved because the machine authentication resets an entry in the MAR cache.  The reason this issue is not seen across the board is because some users actually log off whereas others only lock their machines at the end of the day.

Conclusion

Although MAR is a good feature, it has potential to cause network disruption.  These disruptions can be difficult to troubleshoot until you understand the way it works. When implementing MAR, it is important to educate your end users on how to properly shut down computers and to log off every machine at the end of the day.

Service Pack 1 for Windows Vista available for download!

The Networking Team's own David Pracht contributed information about Service Pack 1 for Windows Vista and how to get it over at the Windows Vista Now blog.  Check it out!

Windows Vista SP1 is here!!!

Don't be afraid of DNS Scavenging. Just be patient.

DNS Scavenging is a great answer to a problem that has been nagging everyone since RFC 2136 came out way back in 1997.  Despite many clever methods of ensuring that clients and DHCP servers that perform dynamic updates clean up after themselves, DNS can sometimes get messy.  Remember that old test server that you built two years ago that caught fire before it could be used?  Probably not.  DNS still remembers it though.  There are two big issues with DNS scavenging that seem to come up a lot:

"I'm hitting this 'scavenge now' button like a snare drum and nothing is happening.  Why?"

or

"I woke up this morning, my DNS zones are nearly empty and Active Directory is sitting in a corner rocking back and forth crying.  What happened?"

This post should help us figure out when the first issue will happen and completely avoid the second.  We'll go through how scavenging is set up, then I'll give you my best practices.

Scavenging setup

Scavenging will help you clean up old unused records in DNS.  Since "clean up" really means "delete stuff", a good understanding of what you are doing and a healthy respect for "delete stuff" will keep you out of the hot grease.  Because deletion is involved, there are quite a few safety valves built into scavenging that take a long time to pop.  When enabling scavenging, patience is required.  It will work just fine, but not today!

Note: For purposes of this discussion we are going to concentrate on the most common Windows DNS scenario: Windows Server 2003 DNS servers hosting AD integrated zones.

Scavenging is set in three places on a Windows Server:

  1. On the individual resource record to be scavenged.
  2. On a zone to be scavenged.
  3. At one or more servers performing scavenging.

It must be set in all three places or nothing happens.

Scavenging settings on a Resource Record

To see the scavenging setting on a record hit View | Advanced in the DNS MMC then bring up properties on a record. 

Scavenging gets set on a resource record in one of three ways.  The first is by someone coming in here, checking the "Delete this record when it becomes stale" checkbox and hitting apply.  When you hit apply the time of day will be rounded down to the nearest hour and applied as the timestamp on the record.  Static records have a timestamp of 0 indicating do not scavenge. 

The second is when a record gets created by a client machine registering using dynamic DNS.  Windows clients will attempt to dynamically update DNS every 24 hours.  All DDNS records get set to scavenge.  When a record is first created by a client that has no existing record it is considered an "Update" and the timestamp is set.  If the client has an existing host record and changes the IP of the host record this is also considered an "Update" and the timestamp is set.  If the client has an existing host record with the same IP address then this is considered a "Refresh" and the timestamp may or may not get changed depending on zone settings.  More on this later.

The third way to set scavenging on records is by using DNScmd.exe with the /ageallrecords switch.  Let's pause here for a few moments to consider a few important words: All, Records, Delete, Stuff.  If you actually run this command against a zone it will truly set scavenging and a timestamp on all records in the zone, including static records that you never want to be scavenged.  Because of the time it takes scavenging to do its thing, people find this command and get tempted to give it a try.  Do not.  It will delete stuff.  Have patience instead.

Once a timestamp is set on a record it will replicate around to all servers that host the zone.  There is one caveat to this.  If scavenging is not enabled on the zone that hosts the record then it will never scavenge so the timestamp is essentially irrelevant.  The timestamp may get updated on the server where the client dynamically registers but it will not replicate around to the other servers in the zone.

Scavenging Settings at the Zone

Before a server will even look at a record to see if it will be scavenged the zone must have scavenging enabled.  To access the scavenging settings for a zone right click the zone, select properties then on the general tab hit the "Aging" button.  This screen is universal for the zone.  If you view it on any DNS server where this zone is replicated it will be the same.

When you first set scavenging on a zone the timestamp seen at the bottom (reload zone if you don't see it) will be set to the current time of day rounded down to the nearest hour plus the Refresh interval.  This also gets reset any time the zone is loaded or any time dynamic updates get enabled on the zone. 
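As a quick worked example of that arithmetic, here is a Python sketch. The function names are made up, and the 7-day Refresh interval is just an example value; substitute whatever your zone is configured with.

```python
from datetime import datetime, timedelta


def round_down_to_hour(t):
    """Drop minutes, seconds and microseconds."""
    return t.replace(minute=0, second=0, microsecond=0)


def zone_scavenge_after(enabled_at, refresh_interval):
    """Time enabled (rounded down to the hour) plus the Refresh interval."""
    return round_down_to_hour(enabled_at) + refresh_interval


enabled = datetime(2008, 3, 10, 14, 37, 12)
print(zone_scavenge_after(enabled, timedelta(days=7)))
```

So a zone enabled for scavenging at 2:37 PM on March 10 with a 7-day Refresh interval cannot be scavenged before 2:00 PM on March 17.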

The "zone can be scavenged after" timestamp is the first of your safety valves.  It gives clients time to get their record timestamp updated before the big axe swings.  Since new record timestamps are not replicated while zone scavenging is disabled this also gives replication time to get things in order.

Refresh and No-Refresh intervals

The next safety valves are the Refresh and No-refresh intervals.  Both of these must elapse before a record can be deleted.

The No-refresh interval is a period of time during which a resource record cannot be refreshed.  Recall from earlier that a refresh is a dynamic update where we are not changing the host/IP of a resource record, just touching the timestamp.  If a client changes the IP of a host record this is considered an "update" and is exempt from the No-refresh interval.  The purpose of a No-refresh interval is simply to reduce replication traffic.  A change to a record means a change that must be replicated.

After the (Record Timestamp) + (No-refresh interval) elapses we enter the Refresh interval.  The refresh interval is the time when refreshes to the timestamp are allowed.  This is the time when good things must happen.  The client is allowed to come in and update its timestamp.  This timestamp will be replicated around and the No-refresh interval begins again.  If for some reason the client fails to update its record during the refresh interval, it becomes eligible to be scavenged.  Will it disappear immediately?  Probably not, but it is certainly possible.

Note: When setting Refresh and No-Refresh intervals be sure to allow enough time for clients to get several registration attempts during a Refresh interval.  Failure to do so could allow a record to become eligible for scavenging simply from a failed refresh attempt.
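The life cycle of a record timestamp can be summarized with a small Python sketch. This only models the intervals as described in this post, not the DNS server's actual code; the function name and the state labels are made up, None stands in for the timestamp of 0 on static records, and the 7-day intervals are example values.

```python
from datetime import datetime, timedelta


def record_state(timestamp, now, no_refresh, refresh):
    """Classify a record by where 'now' falls relative to its timestamp."""
    if timestamp is None:
        return "static"                 # timestamp of 0: never scavenged
    refresh_opens = timestamp + no_refresh
    stale_at = refresh_opens + refresh
    if now < refresh_opens:
        return "no-refresh"             # refreshes are ignored, updates still accepted
    if now < stale_at:
        return "refresh"                # the client may refresh its timestamp
    return "eligible"                   # both intervals elapsed: can be scavenged


week = timedelta(days=7)
stamp = datetime(2008, 3, 1, 10, 0)
for days in (3, 10, 15):
    print(days, record_state(stamp, stamp + timedelta(days=days), week, week))
```

An "eligible" record is only deleted if the zone is past its "can be scavenged after" time and a server with scavenging enabled actually runs a scavenging pass.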

One last thing before we leave the zone setting behind.  If you right click on your server you will see the option to "Set Aging/Scavenging for All Zones...".  Selecting this will take you to a screen similar to the one above.  What does this do?  This sets the default settings that will be used if a new zone is created by this server.  Unless you check the subsequent box "Apply these settings to the existing Active Directory-integrated zones" it will not touch existing zones.

Scavenging settings on the Server

So you now have a resource record set to scavenge and a zone set to scavenge.  All that is left is for somebody to come along, check all the timestamps and delete some stuff.  This can be done by any server that hosts the AD integrated zone.

Setting scavenging on the server is done by right clicking the server in the MMC, selecting properties, going to the advanced tab and checking the "Enable automatic scavenging of stale records" checkbox.

The Scavenging Period is how often this particular server will attempt to scavenge.  When a server scavenges it will log a DNS event 2501 to indicate how many records were scavenged.  An event 2502 will be logged if no records were scavenged.  Only one server is required to scavenge since the zone data is replicated to all servers hosting the zone.

Tip: You can tell exactly when a server will attempt to scavenge by taking the timestamp on the most recent 2501/2502 event and adding the Scavenging period to it.

Although you can set every server hosting the zone to scavenge I recommend just having one.  The logic for this is simple: If the one server fails to scavenge the world won't end.  You'll have one place to look for the culprit and one set of logs to check.  If on the other hand you have many servers set to scavenge you have many logs to check if scavenging fails.  Worse yet, if things start disappearing unexpectedly you don't want to go hopping from server to server looking for 2501 events.

To facilitate strict control over which server is scavenging for a zone you can use DNSCmd.exe to specify exactly which servers may scavenge.  For example the following command will make it so that only 192.168.1.1 and 192.168.1.2 DNS servers are allowed to scavenge on the contoso.com zone:

DNSCmd . /ZoneResetScavengeServers contoso.com 192.168.1.1 192.168.1.2

With the server now scavenging, zones enabled for scavenging, and resource records set, what actually happens when the server does its thing?

The scavenging process and final safety valves

When the last 2501/2502 event + the server scavenging period comes around the server is going to make a scavenging attempt.  You can also manually initiate an attempt by right clicking the server and selecting "Scavenge Stale Resource Records".  Note that manually making an attempt in no way bypasses the safety valves.  These are the final safety valves before we "delete stuff":

  • Is scavenging enabled on the zone?  Pretty self explanatory.
  • Is dynamic update enabled on the zone?  If it's not there is a good chance timestamps will be old enough that mass deletions can occur.
  • Is the scavenging server listed as one of the "Scavenge Servers" for the zone?
  • Are we past the "zone can be scavenged after" timestamp on the zone?  This gives the clients and AD replication time to get things squared away before we start.
  • Has it been longer than a refresh interval since this zone was last replicated in Active Directory?  If scavenging gets enabled on a server that has replication issues this will prevent it from tombstoning a bunch of records that may be perfectly fine on other servers.

If all of the above checks are good then the zone is ready to be scavenged.  At this point the scavenging server checks the timestamp on each individual resource record.  If the current date/time is greater than the timestamp + No-refresh + Refresh then the record is deleted.
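
The checks above can be sketched roughly as follows.  This is only a model of the decision order described in this post, not the DNS server's actual code: the ZoneState class, its field names and both function names are hypothetical, the zone is assumed to have an explicit scavenge server list, and the AD replication check is left out because the sketch has no way to model replication state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ZoneState:
    # Hypothetical model of the zone settings discussed in this post.
    scavenging_enabled: bool
    dynamic_update_enabled: bool
    scavenge_servers: list          # assumed to be an explicit list of IPs
    can_scavenge_after: datetime    # the "zone can be scavenged after" time
    no_refresh: timedelta
    refresh: timedelta

def blocked_reasons(zone, server_ip, now):
    """Return the safety valves that would stop a scavenging attempt."""
    reasons = []
    if not zone.scavenging_enabled:
        reasons.append("scavenging disabled on zone")
    if not zone.dynamic_update_enabled:
        reasons.append("dynamic update disabled on zone")
    if server_ip not in zone.scavenge_servers:
        reasons.append("server not in scavenge server list")
    if now <= zone.can_scavenge_after:
        reasons.append("before 'zone can be scavenged after' time")
    return reasons

def stale_records(zone, records, now):
    """Names whose timestamp + No-refresh + Refresh is already in the past."""
    limit = zone.no_refresh + zone.refresh
    return [name for name, stamp in records.items() if now > stamp + limit]

zone = ZoneState(True, True, ["192.168.1.1"], datetime(2008, 1, 8),
                 timedelta(days=7), timedelta(days=7))
now = datetime(2008, 2, 1)
records = {"old": datetime(2008, 1, 1), "fresh": datetime(2008, 1, 30)}
if not blocked_reasons(zone, "192.168.1.1", now):
    print(stale_records(zone, records, now))
```

Returning the list of reasons rather than a simple yes/no makes it easy to see which safety valve is holding things up.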

My best practices

Here is how I set scavenging up on a preexisting zone.  This procedure is designed for maximum safety.  Using default settings this process can take as long as 4-5 weeks (2 weeks Sanity phase, 2-3 weeks for Enable phase)

Setup phase
  1. Turn off scavenging on all servers.  To confirm scavenging won't inadvertently run use the DNSCmd /ZoneResetScavengeServers to confine scavenging to a single server then ensure this server has scavenging disabled.
  2. Turn on scavenging on the zones you wish to scavenge.  Set the refresh and No-refresh intervals as desired.  If you want things to scavenge more aggressively I would recommend lowering the No-refresh interval at the cost of some replication traffic.  Leave the refresh at the default.
  3. Take today's date and add the Refresh and No-Refresh intervals to it.  Come back in a few weeks when this time has elapsed.  Seriously, you can't rush this.

Sanity check phase

Sift through your DNS records looking for any records older than the Refresh + No-Refresh interval.  If you see any then something has gone wrong with the dynamic registration process and it must be corrected before proceeding.  A thorough check at this point is the most important step in setup.

Things to check if you find old records:

  • Does an IPConfig /registerdns work?
  • Who is the owner of the record (see security tab in the record properties)?
  • Was the record statically created by an admin then later enabled for scavenging?  If so you may need to delete the record to clear ownership and run an IPConfig /registerdns to get it updated.
  • Is the server replicating OK with AD?

Do not proceed unless you can explain any outdated records.  In the next phase they will be deleted.

Enable phase

The final step is to actually enable scavenging.  Enable scavenging on the single server you used the /ZoneResetScavengeServers command on.

Once enabled create a new test record and enable it for scavenging.  Then map out the point in time when this record will disappear.  Here is how:

  1. Start with the timestamp on the record
  2. Add the refresh interval
  3. Add the no refresh interval
  4. The result will be your "eligible to scavenge" time.  The record will not disappear at this time though.  It's just eligible.
  5. Check your DNS event logs for 2501 and 2502 events to find what hour the DNS server is doing a scavenging run.
  6. Find the most recent 2501/2502 event and keep adding the server's Scavenging Period (from server properties | advanced tab) to it until you land on a time later than your "eligible to scavenge" time.  This is the point in time when the test record you just created will disappear.

Let's look at an example with the following assumptions:

  • Zone is set to a 3 day Refresh and a 3 day No-Refresh interval
  • Server Scavenging period is set to 3 days
  • Last DNS Event id 2501 or 2502 occurred at 6am on 1/1/2008
  • We have a record with a timestamp of 1/1/2008 at 12:00 noon

Given these assumptions you can rub your temples for a bit and predict that the record will be deleted at approximately 6am on 1/10/2008.
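
If you would rather not rub your temples, the same arithmetic can be sketched in a few lines of Python.  The function name predicted_deletion is made up for this illustration, and the sketch assumes the server keeps scavenging on a fixed schedule measured from the last 2501/2502 event.

```python
from datetime import datetime, timedelta

def predicted_deletion(record_stamp, no_refresh, refresh, last_run, period):
    """Return (eligible time, first scavenging run after that time)."""
    eligible = record_stamp + no_refresh + refresh
    run = last_run + period
    # Step forward one Scavenging Period at a time until we pass 'eligible'.
    while run <= eligible:
        run += period
    return eligible, run

three_days = timedelta(days=3)
eligible, deleted = predicted_deletion(
    record_stamp=datetime(2008, 1, 1, 12, 0),  # record timestamp, noon
    no_refresh=three_days,
    refresh=three_days,
    last_run=datetime(2008, 1, 1, 6, 0),       # last 2501/2502 event, 6am
    period=three_days,                         # server Scavenging Period
)
print(eligible)  # 2008-01-07 12:00:00
print(deleted)   # 2008-01-10 06:00:00
```

The record becomes eligible at noon on 1/7/2008, just after that day's 6am run, so it survives until the next run at 6am on 1/10/2008.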

Once scavenging is enabled you can check back periodically to look for the 2501 and 2502 events to see how things are going.  You can also come back at the predicted date and time and see if your test record disappeared.

That's it!

New update available for Windows Server 2003 SP2 systems to disable Scalable Networking Pack features

As you may know, Service Pack 2 for Windows Server 2003 included the Scalable Networking Pack (or SNP) which allowed for increased performance in many situations by allowing some TCP functionality to be handled by the network driver and network adapter instead of the Windows TCP/IP stack itself.  This functionality was enabled by default in Service Pack 2.

There have been some problems seen in some environments where Windows Server 2003 SP2 has been deployed on systems that support the SNP features.  Issues like this have been discussed in several previously published Knowledge Base articles.

There is now a new update available that will turn off the Scalable Networking Pack features on Windows Server 2003 Service Pack 2 systems.  The article lists a number of symptoms that have been seen when Windows Server 2003 SNP is enabled and links to download the update for x86, x64, and Itanium-based systems:

An update to turn off default SNP features is available for Windows Server 2003-based and Small Business Server 2003-based computers

Windows Server 2008 Launch!

HEROES happen {here}

The day is finally here!  Windows Server 2008, along with SQL Server 2008 and Visual Studio 2008, is launching!  Check out our main corporate page at http://www.microsoft.com for information, or go direct to the HEROES happen {here} page:

Heroes Happen Here

Here's a link straight to the Windows Server 2008 launch page:

Windows Server 2008

Heroes Happen Here :: Products :: Windows Server 2008

Launch events

Attend a launch event near you and you'll leave with a promotional kit with versions of all three of the products!  At the time of this writing, the first two launch events (today in L.A. and March 4th in New York) are sold out.  You can register to attend other launch events, however, here:

http://www.microsoft.com/heroeshappenhere/register/default.mspx

Test Drives and Videos

Some great scenarios may be found here, showcasing the latest technologies new in Windows Server 2008.  Interesting to me in particular are the demos for Enabling a Remote Workforce (which involves the use of TS Gateway) and Protecting a critical server from Malware (which takes advantage of NAP and other cool new features):

Heroes Happen Here :: Test Drive

Enjoy!

Mike Platts
