News

  • Welcome to the blog for the Microsoft CSS Enterprise Platforms Networking team.

    Disclaimer: All postings are provided "AS IS" with no warranties, and confer no rights. This weblog does not represent the thoughts, intentions, plans or strategies of Microsoft. Because a weblog is intended to provide a semi-permanent point-in-time snapshot, you should not consider out of date posts to reflect current thoughts and opinions.

DNS Client Resolver Behavior

The following question comes up from time to time and for various reasons. What is the expected name resolution behavior of the DNS client resolver on Windows XP or Windows Vista? This may be for a single or for multiple network interfaces. So I thought I would put together a brief overview of what you would see on the network for DNS name resolution for different interface configurations. I am including network captures of three different scenarios that illustrate the expected behavior. This is just a quick overview; there is additional documentation available that covers how the ordering of the Preferred and Alternate DNS servers can change per interface, so I am not going to cover that here.

Scenario 1

A single network interface with a Preferred and Alternate DNS configured.
Preferred - 192.168.0.10
Alternate - 192.168.0.100

image

From the capture you will see the following behavior:

  1. Send a DNS query to the Preferred DNS server.
  2. If there is no response within 1 second then send a DNS query to the Alternate DNS server.
  3. If there is no response within 1 second send a DNS query again to the Preferred DNS server.
  4. If there is no response within 2 seconds send a DNS query to both the Preferred and Alternate DNS servers.
  5. If there is no response within 4 seconds again send a DNS query to both the Preferred and Alternate DNS servers.
  6. If there is still no response after 7 seconds, the process times out.

Notice that the whole process takes about 15 seconds.
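To make the arithmetic behind the 15 seconds explicit, here is a minimal sketch that turns the steps above into a timeline. The names `WAITS`, `ROUNDS`, and `query_schedule` are purely illustrative; only the wait values and the server order come from the capture described above.

```python
from itertools import accumulate

# Seconds the resolver waits after each of the five query rounds (from the steps above).
WAITS = [1, 1, 2, 4, 7]

# Servers contacted in each round for Scenario 1.
ROUNDS = [
    ["Preferred"],
    ["Alternate"],
    ["Preferred"],
    ["Preferred", "Alternate"],
    ["Preferred", "Alternate"],
]


def query_schedule(waits=WAITS, rounds=ROUNDS):
    """Return [(send_time, servers), ...] and the total time until the timeout."""
    # Round 1 goes out at t=0; each later round goes out after the previous waits.
    send_times = [0] + list(accumulate(waits))[:-1]
    return list(zip(send_times, rounds)), sum(waits)


schedule, total = query_schedule()
for send_time, servers in schedule:
    print(f"t={send_time:>2}s  query -> {', '.join(servers)}")
print(f"timeout at t={total}s")
```

The five rounds go out at 0, 1, 2, 4, and 8 seconds, and the final 7 second wait brings the total to 15 seconds.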

Scenario 2

Two network interfaces each with a Preferred and Alternate DNS server configured.
Interface 1:
Preferred DNS server - 192.168.0.10
Alternate DNS server - 192.168.0.100

Interface 2:
Preferred DNS server - 10.10.10.10
Alternate DNS server - 10.10.10.11

image

From the capture you will see the following behavior:

  1. Send a DNS query to the Preferred DNS server on Interface 1.
  2. If there is no response within 1 second then send a DNS query to the Preferred DNS server on Interface 2 and the Alternate DNS server on Interface 1.
  3. If there is no response within 1 second then send a DNS query to the Preferred DNS server on Interface 1 and the Alternate DNS server on Interface 2.
  4. If there is no response within 2 seconds send a DNS query to ALL DNS servers.
  5. If there is no response within 4 seconds again send a DNS query to ALL DNS servers.
  6. If there is still no response after 7 seconds the process times out.

Again, notice that the whole process takes about 15 seconds.

Confused yet? If so, maybe this table will help simplify things. Let's say we have two interfaces, each with two DNS servers configured. The interfaces are numbered 1 and 2 and the DNS servers are A, B, C, and D.

Query        Interface 1 (DNS A, B)    Interface 2 (DNS C, D)
1st Query    A
2nd Query    B                         C
3rd Query    A                         C, D
4th Query    A, B                      C, D
5th Query    A, B                      C, D

Scenario 3

Just for fun, let’s see what happens if you add additional DNS servers to the first interface.
Interface 1:
Preferred DNS server - 192.168.0.10
Alternate DNS server - 192.168.0.100
Additional DNS server - 192.168.0.200
Additional DNS server - 192.168.0.250

Interface 2:
Preferred DNS server - 10.10.10.10
Alternate DNS server - 10.10.10.11

image

From the capture you will see the following behavior:

  1. Send a DNS query to the Preferred DNS server on Interface 1.
  2. If there is no response within 1 second then send a DNS query to the Preferred DNS server on Interface 2 and the Alternate DNS server on Interface 1.
  3. If there is no response within 1 second then send a DNS query to the Preferred DNS server on Interface 1 and the Alternate DNS server on Interface 2.
  4. If there is no response within 2 seconds send a DNS query to ALL DNS servers.
  5. If there is no response within 4 seconds again send a DNS query to ALL DNS servers.
  6. If there is still no response after 7 seconds the process times out.

This is the same behavior as Scenario 2; we just have more DNS servers.

Query        Interface 1 (DNS A, B, C, D)    Interface 2 (DNS E, F)
1st Query    A
2nd Query    B                               E
3rd Query    C                               F
4th Query    A, B, C, D                      E, F
5th Query    A, B, C, D                      E, F

Notice that there are still only 5 queries and the whole process still takes about 15 seconds. It is not likely that many people would run into this particular scenario, but it is interesting to see how things behave.

Hope that helps clear up any questions.

- Clark Satter

SNMP Traps in Windows Server

What is an SNMP Trap?

An SNMP trap is an alert message, carrying summary information about an event, that an SNMP agent sends to its configured SNMP manager. It notifies administrators about an event that occurred on the SNMP agent. There is a separate service called the SNMP Trap service which runs in Microsoft operating systems and listens for traps on UDP port 162 by default.

How to install it?

When you install the SNMP service on any Microsoft Windows operating system except Windows Vista and Windows Server 2008, the SNMP Trap service is installed along with the SNMP Service. In Windows Vista and Windows Server 2008, the SNMP Trap service is by default installed but set to manual and is thus in a stopped state.

The SNMP Trap service runs using the Local Service account in Windows. The SNMP Trap service was dependent on the Event Log service up until Windows Server 2003 but since Windows Vista and Windows Server 2008, the SNMP Trap service has been independent.

I want my SNMP manager to listen for SNMP Traps on a different UDP port. Is this possible?

Yes, open the file named “Services”, which is located in %systemroot%\system32\drivers\etc.

Change the port number on the following line of the file to your custom port number.

snmptrap 162/udp snmp-trap #SNMP trap

Save the file under its original name, with no extension. Restart the SNMP Trap service, then run the following command in a Command Prompt: Netstat -ano. You should see the SNMP Trap service listening on the new port number.
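If you want to double check the edit before restarting the service, a small script can read the entry back. This is only a sketch: the helper names `parse_services_line` and `trap_port` are made up for illustration, and the sample lines stand in for the real file in %systemroot%\system32\drivers\etc.

```python
def parse_services_line(line):
    """Parse one services-file line into (name, port, protocol, aliases)."""
    entry = line.split("#", 1)[0].split()  # drop the trailing comment, if any
    if len(entry) < 2:
        return None  # blank or comment-only line
    port, _, protocol = entry[1].partition("/")
    return entry[0], int(port), protocol, entry[2:]


def trap_port(lines, protocol="udp"):
    """Return the port configured for the snmptrap entry, or None if absent."""
    for line in lines:
        parsed = parse_services_line(line)
        if parsed and parsed[0] == "snmptrap" and parsed[2] == protocol:
            return parsed[1]
    return None


# Sample content standing in for the real file; the last line is the one quoted above.
sample = [
    "# comment line",
    "snmp 161/udp #SNMP",
    "snmptrap 162/udp snmp-trap #SNMP trap",
]
print(trap_port(sample))
```

To run it against the real file, pass the lines of the opened file to `trap_port` instead of the sample list.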

What does “Send Authentication Trap” mean?

An SNMP agent sends Authentication traps to its configured trap destination list in the following situations:

  • When an SNMP query is sent from an SNMP manager which is not listed in the Permitted Managers list of the SNMP agent.
  • When an SNMP query is sent from an SNMP manager which is listed in the Permitted Managers list, but the community name in the SNMP query doesn't match the community name configured on the security tab of the SNMP agent, for example when the community name is misspelled (it is case sensitive).
  • When both of the above conditions are true in a given situation.

An agent sends traps to all the trap destinations of all the communities, provided these community names are configured in the Security tab of the agent. So if multiple trap destinations are configured with multiple community names, a trap message is sent to all the destinations of all the communities specified on the trap tab. This happens three times in succession after each access violation. However, a trap message to a trap destination will carry the community name specified in the SNMP agent for that trap destination.

Make sure of the following:

  • The name configured in the security tab is case sensitive, which means that it must be in the same case as the community name received in the query. The community names configured in the trap tab are not case sensitive. For example, if a community name called "TEST" is configured in the security tab, the equivalent name "test" can be configured in the trap tab for sending traps to the specified trap destinations.
  • The “SNMP Agent Service” is not in a disabled state on the SNMP agent device.
  • UDP Port 162 is open on any firewalls involved. In Windows Vista one exception rule is pre-defined in the firewall configuration settings for SNMP trap, but it is disabled by default. It needs to be enabled.

How do I test if my SNMP Manager is able to receive SNMP Traps?

You may have 3rd party applications which make use of the built-in SNMP trap service to receive traps and then react to the trap. If you find that your SNMP manager application is not receiving traps, first make sure the built in SNMP Trap Service is able to receive traps. If the SNMP Trap service is able to receive traps then it’s the application which is not working the way it should.

To check the functionality of the built-in SNMP Trap service, do the following:

  1. Create a new folder under any drive (For example: C:\snmputil) on the SNMP Manager machine which is configured to listen for the traps.
  2. Copy the “snmputil.exe” utility to the newly created folder.
    Snmputil.exe is available from the Windows 2000 and Windows Server 2003 Resource Kits.
  3. Open up a Command Prompt and change to the directory where you have the snmputil.exe (in our example it is C:\snmputil) and run the following command: “Snmputil trap”.
    You will see the following output:
           snmputil: listening for traps...
               Let the command run and do not close the Command Prompt window.
  4. Stop and Restart the SNMP Service on any SNMP Agent which is configured to send traps to the SNMP Manager mentioned in step 1 above.
  5. If the test is successful, you should see the below output in the SNMP Manager Command Prompt window on the SNMP manager machine. This will show that traps generated by the agent are being received.
    Refer to http://support.microsoft.com/kb/323340 to learn more about snmputil.exe.

snmputil: listening for traps...
Incoming Trap:
generic = 0
specific = 0
enterprise = .iso.org.dod.internet.private.enterprises.microsoft.software.systems.os.windowsNT.server
agent = 10.10.10.100
source IP = 10.10.10.100
community = public
Incoming Trap:
generic = 3
specific = 0
enterprise = .iso.org.dod.internet.private.enterprises.microsoft.software.systems.os.windowsNT.server
agent = 10.10.10.100
source IP = 10.10.10.100
community = public
variable = interfaces.ifTable.ifEntry.ifIndex.1
value = Integer32 1
Incoming Trap:
generic = 3
specific = 0
enterprise = .iso.org.dod.internet.private.enterprises.microsoft.software.systems.os.windowsNT.server
agent = 10.10.10.100
source IP = 10.10.10.100
community = public
variable = interfaces.ifTable.ifEntry.ifIndex.262147
value = Integer32 262147

Below are different types of traps that are built-in and are enabled by default in Windows:

  • Coldstart or Warmstart: The agent reinitialized its configuration tables.
  • Linkup or Linkdown: A network interface card (NIC) on the agent either fails or reinitializes.
  • Authentication fails: This happens when an SNMP agent gets a request from an unrecognized community name.
  • egpNeighborloss: Agent cannot communicate with its EGP (Exterior Gateway Protocol) peer.
  • Enterprise specific: Vendor specific error conditions and error codes.
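The "generic" field in the snmputil output corresponds to these trap types. As a rough sketch, the mapping below uses the SNMPv1 generic trap numbers (this numbering comes from the SNMPv1 standard, not from this post) and a hypothetical helper, `generic_trap_names`, to label each trap in a captured output.

```python
# SNMPv1 generic trap codes (numbering assumed from the SNMPv1 standard).
GENERIC_TRAPS = {
    0: "coldStart",
    1: "warmStart",
    2: "linkDown",
    3: "linkUp",
    4: "authenticationFailure",
    5: "egpNeighborLoss",
    6: "enterpriseSpecific",
}


def generic_trap_names(output):
    """Return the trap-type name for every 'generic = N' line in snmputil output."""
    names = []
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "generic":
            code = int(value.strip())
            names.append(GENERIC_TRAPS.get(code, f"unknown ({code})"))
    return names


# Shortened version of the output shown earlier.
sample = """Incoming Trap:
generic = 0
specific = 0
Incoming Trap:
generic = 3
specific = 0
"""
print(generic_trap_names(sample))
```

Read this way, the sample output earlier shows one coldStart trap followed by two linkUp traps, which is what you would expect after restarting the SNMP Service on the agent.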

Refer to http://support.microsoft.com/kb/172879 for more information on SNMP traps.

- Arun Kumar (P)

Missing Network Map under Network and Sharing Center in Windows Vista or Windows Server 2008

Windows Vista and Windows Server 2008 introduced many new and exciting enhancements for improving and changing Personal and Enterprise computing. The introduction of the Network and Sharing Center, improvements in the Security Center and the Network Map of neighboring devices using Link Layer Topology Discovery are some of the highlights.

Network and Sharing Center displays all the information related to network connectivity and sharing capabilities of a Windows Vista or Windows Server 2008 machine. It also introduces a Network Map under Network and Sharing Center which determines the connectivity of the machine with the Internet.

image

The protocol used to create the Network Map on Windows Vista and Windows Server 2008 is Link Layer Topology Discovery (LLTD).

Sometimes the Network Map which is present under the Network and Sharing Center of Windows Server 2008 or Windows Vista does not display the exact network topology via which the machine is connected to the Internet. In this article, we will discuss various possibilities of troubleshooting this situation.

image

image

QUESTION: How do you troubleshoot a situation where the Network and Sharing Center on Windows Vista or Windows Server 2008 shows "You are currently not connected to any networks."?

We will be taking a typical example where we are unable to view the Network Map on a Windows Server 2008 machine. In our scenario, we also take into account that the machine is joined to a domain. The issue is seen with all the domain user accounts that can logon to the machine.

Network and Sharing Center displays as "You are currently not connected to any networks" and also a Red-X appears on the Network Map. However, there are no connectivity issues on the server, neither incoming nor outgoing. The machine can traverse through all the LAN subnets and the Internet (if directly connected to it).

ANSWER:
We start the troubleshooting with the basic steps:

1) Network List Service and Network Location Awareness Service
Check for the status of these services and their dependencies; they should be started. They are required for a machine to populate the Network Map of Network and Sharing Center.

  • Network Location Awareness Service Dependencies
    http://msdn.microsoft.com/en-us/library/ms739931(VS.85).aspx
        Network Store Interface Service
            NSI proxy service
        Remote Procedure Call (RPC)
            DCOM server Process Launcher
        TCP/IP Protocol Driver
  • Network List Service Dependencies
        Network location Awareness Service
        Remote Procedure Call (RPC)
            DCOM server Process Launcher

2) Link Layer Topology Discovery Mapper I/O Driver.
This component is required for creating a full Network Map for the machine by discovering the devices connected to the network. It is not responsible for creating the Network Map under the Network and Sharing Center; its core job is to detect the devices that are connected around your machine. Please refer to the specification linked below for further explanation of the Link Layer Topology Discovery Mapper I/O Driver. Please make sure that it is checked on the Network Connection under “This connection uses the following items:”

image

Link Layer Topology Discovery (LLTD) Protocol Specification
http://msdn.microsoft.com/en-us/library/cc233983(PROT.10).aspx

3) Checking permissions of the components involved with the Network List Service.

DLLs:
    - Files: netprofm.dll and netprof.dll
    - Location: %SystemRoot%\System32\

Registry:
    - Service Location: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\netprofm

The Network List Service runs as the LocalService account. Please make sure that there are read permissions on the above locations for LocalService.

4) If the issue still persists
If you have verified that steps 1 – 3 gave a positive result for your machine and the Network Map is still not shown under Network and Sharing Center, check to see if the Network Map appears when logging onto the machine using one of the Local machine’s user accounts (non-domain account); preferably Administrator. If the map is still not seen, this means that there is a permission issue for the domain user accounts on this machine.
A workaround to test for the above symptom is to add LocalService to the Administrators group under the Local Users and Groups console. (NOTE: This is not a recommended solution, and the change must be undone after confirming that it works.)

If the above test was successful, this proves that after providing the LocalService account elevated privileges or administrative rights, the issue disappears. So what is making the Local Service incapable of displaying the Network Map for Network and Sharing Center?
We troubleshoot this using a utility known as Process Monitor, which is available for download at: http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx

1.    Download the Zip file containing Process Monitor and extract ProcMon.exe from it to a location on the Windows Server 2008 machine.
2.    Since the Network List service runs under shared mode with Svchost.exe, we need to separate its shared mode for tracking the PID (Process ID) of the service in ProcMon.exe.
3.    Start a Command Prompt.
4.    Check for the svchost.exe status:
        C:\> tasklist /svc

image

5.    In the example above, the Network List Service (netprofm) is running in shared mode with EventSystem, LanmanWorkstation, Nsi etc. with a common PID of 1496.
6.    Configure the Network List Service to run in its own instance of SvcHost.exe by executing the following:
        C:\> sc config netprofm type= own
7.    Now restart the Network List service by executing the following:
        Net stop netprofm
        Net start netprofm

image

8.    Again run <tasklist /svc /fi "imagename eq svchost.exe"> to compare the results.

image

9.    Please note that in the above example, the Network List Service (netprofm) is now running in its own instance of svchost.exe with a PID of 4860 (PIDs are assigned dynamically and will differ on every machine). Make a note of the PID assigned in your case for later use.

10.    Launch ProcMon.exe and then open Network and Sharing Center.

11.    Once the Network and Sharing Center window displays the incomplete Network Map, stop the ProcMon log by unchecking Capture Events (Ctrl+E) option under the File Menu of Procmon.exe.

12.    Now we need to filter this result with the PID of the Network List Service (netprofm). Press the Filter button (Ctrl+L) under Filter menu of Procmon.exe.

13.    In the Process Monitor Filter dialog as shown below, select PID – IS – 4860 (enter the PID noted down in Step 9), click ‘Add’, and then ‘OK’.

image

14.    You should see an “ACCESS DENIED” result for that PID on a registry location.

15.    If you double click on this event, you will see a result similar to the following:

Date & Time    :    28-01-09 10:32:31 PM
Event Class    :    Registry
Operation    :    RegOpenKey
Result        :    ACCESS DENIED
Path        :    HKLM\Software\Microsoft\Windows NT\CurrentVersion\NetworkList\Profiles\{A82736A5-7B12-4C33-94D6-8FF78B91750A}
TID        :    3608
Duration    :    0.0000704
Desired Access:    Read/Write

Description    :    Host Process for Windows Services
Company    :    Microsoft Corporation
Name        :    svchost.exe
Version    :    6.00.6001.18000
Path        :    C:\Windows\System32\svchost.exe
Command Line:    C:\Windows\System32\svchost.exe -k LocalService
PID        :    380
Parent PID    :    564
Session ID    :    0
User        :    NT AUTHORITY\LOCAL SERVICE
Auth ID    :    00000000:000003e5
Architecture    :    64-bit
Virtualized    :    False
Integrity    :    System
Started    :    28-01-09 10:30:48 PM
Ended        :    (Running)

Interpreting the results of the ProcMon analysis

While accessing the registry location HKLM\Software\Microsoft\Windows NT\CurrentVersion\NetworkList\Profiles\{A82736A5-7B12-4C33-94D6-8FF78B91750A}

Network List Service received an Access is Denied error message. It does not have the authority to access this location in the registry. Since the Network List service runs under the privileges of Local Service, it works when we add the Local Service account to the local BuiltIn\Administrators group of the machine.

We need to check the permissions at the following registry location: HKLM\Software\Microsoft\Windows NT\CurrentVersion\NetworkList\. Administrators and Netprofm should be present there, as below:

image

If netprofm is not present under the Permissions tab, we need to add it there to make the Network Map appear under the Network and Sharing Center. However, the account we are planning to add to the registry key is a Service SID, which is a new security feature introduced in Windows Server 2008.

Windows Service Hardening restricts critical Windows services from performing abnormal activities in the file system, registry, network, or other areas that could be used by malware. For example, the Remote Procedure Call (RPC) service can be restricted from replacing system files or modifying the registry.

In our case, the service in question is netprofm which is the Network List Service on Windows Server 2008. The service SID for this can be checked using the below commands:

C:\> sc qsidtype netprofm

[SC] QueryServiceConfig2 SUCCESS

SERVICE_NAME: netprofm

SERVICE_SID_TYPE:  UNRESTRICTED

Checking the SID for the service is possible, using the following command:

C:\> sc showsid netprofm

NAME: netprofm

SERVICE SID: S-1-5-80-3635958274-2059881490-2225992882-984577281-633327304
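For the curious, the service SID is derived from the service name rather than assigned randomly. The sketch below rests on an assumption that is not taken from this post: that the five sub-authorities after S-1-5-80 are the SHA-1 hash of the upper-cased service name (encoded as UTF-16LE), read as five little-endian 32-bit integers. The function name `service_sid` is illustrative only.

```python
import hashlib
import struct


def service_sid(service_name):
    """Derive a service SID string from a service name.

    Assumption: sub-authorities = SHA-1(upper-cased name as UTF-16LE),
    split into five little-endian unsigned 32-bit integers.
    """
    digest = hashlib.sha1(service_name.upper().encode("utf-16-le")).digest()
    sub_authorities = struct.unpack("<5I", digest)  # 20 bytes -> 5 integers
    return "S-1-5-80-" + "-".join(str(value) for value in sub_authorities)


print(service_sid("netprofm"))
```

If the assumption holds, the printed value can be compared against the sc showsid output above. Because the name is upper-cased before hashing, "netprofm" and "Netprofm" give the same SID.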

To add a Service SID to a resource:

1.    Right-click the registry location HKLM\Software\Microsoft\Windows NT\CurrentVersion\NetworkList\ and then choose Permissions...
2.    Click Add...
3.    Please choose your computer for “From this location:” instead of your Active Directory domain, by clicking on Locations...
4.    Under “Enter the object names to select” type NT SERVICE\netprofm and then click on Check Names.
5.    Press OK to confirm.

If we type only the service name of netprofm like any other Builtin Service or user account, it would not be searchable. Therefore, we have to use NT SERVICE\Netprofm to make it searchable.

Once we have netprofm in place, click on Advanced as we need to specify special permissions (as per the below figure) for netprofm.

image

After completing the registry changes, please reboot the machine for the changes to take effect. This time the Network Map under Network and Sharing Center should come up even with the Domain credentials.

- Manuj Bhatia

Spurious WINS Event ID 4224 on Windows Server 2008

I recently fielded a call from a customer who was seeing Event ID 4224 errors on a Windows Server 2008 machine. The machine hosted both DNS and WINS. After a restart of the WINS Server service, “Event 4224 – WINS encountered a database error” was logged in the System Event Log. The same behavior had been seen on a couple of other servers hosting WINS. All were Windows Server 2008.

Log Name:       System
Source:         Wins
Date:           4/24/2009 4:19:10 PM
Event ID:       4224
Task Category:  None
Level:          Error
Keywords:       Classic
User:           N/A
Computer:       MachineName.corp.contoso.com
Description:
WINS encountered a database error. This may or may not be a serious error. WINS will try to recover from it. You can check the database error events under 'Application Log' category of the Event Viewer for the Exchange Component,  ESENT, source to find out more details about database errors.  If you continue to see a large number of these errors consistently over time (a span of few hours), you may want to restore the WINS database from a backup. 

It turns out that the behavior is very reproducible and (the good part) harmless. While certainly being annoying there is in fact no relation to an actual issue with WINS or the WINS database in particular in this case. I reproduced the Event with a brand new server install where only the locally registered WINS records existed. This behavior is expected given the combination of function calls that are made when there is a service restart. As you might have already guessed, restarting the server may also generate the Event. In the interest of preventing a call to Microsoft Support to discuss this Event ID, I thought it would be good to give you the “skinny” on it.

Obviously, WINS Event ID 4224 is a valid error and can be related to a legitimate problem. Caution should be taken to avoid dismissing any diagnostic Events, including this one, without adequate investigation.

- Pete Sullivan

Active Route gets removed on Windows Server 2008 offline Cluster IP Address

We have received calls about adding static routes on Windows Server 2008 Failover Clustering nodes and wanted to pass along some important information regarding this. The issue is that when you add a static persistent route to a network adapter that is on a Windows Server 2008 Failover Cluster and take a Clustered IP Address offline (or move it to another node), the “Active” route is removed and no connections can be made using this route even though it still shows as persistent. Once you bring the Clustered IP Address back online, the active route is returned.

I want to mention that the networking architecture in Windows Server 2008 Failover Clustering has been rewritten from the ground up, and I will not go into the specifics of it here. You can read more about it in the blog, “What is a Microsoft Failover Cluster Adapter anyway?”.

On to the problem and resolution. As a little setup, here is the configuration that I want to discuss.

ClusterNode1
Physical IP Address: 10.44.60.4
Physical Subnet Mask: 255.255.0.0
Default Gateway: 10.44.60.1

ClusterNode2
Physical IP Address: 10.44.60.3
Physical Subnet Mask: 255.255.0.0
Default Gateway: 10.44.60.1

Failover Cluster Virtual IP Address
IP Address: 10.44.60.6

I also have a backup server that I use to create backups using an IP Address of 10.51.0.1 and subnet mask 255.255.0.0 that will use the same default gateway above. Most Network Administrators would use the following ROUTE.EXE command to add a persistent static route to the local tables so that a connection can be made.

route -p add 10.51.0.0 mask 255.255.0.0 10.44.60.1
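To see why the route is needed at all, here is a quick sketch using Python's standard ipaddress module with the addresses from this example. The variable and function names are illustrative only.

```python
import ipaddress

# Networks and hosts from the example configuration above.
local_subnet = ipaddress.ip_network("10.44.0.0/255.255.0.0")   # cluster nodes' subnet
backup_route = ipaddress.ip_network("10.51.0.0/255.255.0.0")   # destination of the static route
backup_server = ipaddress.ip_address("10.51.0.1")
gateway = ipaddress.ip_address("10.44.60.1")


def is_on_link(address, subnet=local_subnet):
    """True if the address can be reached directly on the local subnet."""
    return address in subnet


print("backup server on-link:", is_on_link(backup_server))
print("gateway on-link:", is_on_link(gateway))
print("static route covers backup server:", backup_server in backup_route)
```

The backup server is not on the nodes' 10.44.0.0 subnet, so traffic to it depends on a route entry such as the one added above, while the gateway itself is reachable on-link.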

So with everything online (including the Failover Cluster Virtual IP Address) on ClusterNode1, I can do a ROUTE PRINT command to display my IP Address version 4 table and see this. As a side note, I am just pulling the necessary information from the Route Table.

C:\>route print -4
IPv4 Route Table
==================================================================
Active Routes:
Network Destination  Netmask          Gateway     Interface  
10.44.0.0            255.255.0.0      On-link     10.44.60.4  <<---
10.44.60.4           255.255.255.255  On-link     10.44.60.4  <<--- Phys. Node IP Address
10.44.60.6           255.255.255.255  On-link     10.44.60.4  <<--- Clustered IP Address
10.44.255.255        255.255.255.255  On-link     10.44.60.4  <<---
10.51.0.0            255.255.0.0      10.44.60.1  10.44.60.4  <<--- Static Route added
224.0.0.0            240.0.0.0        On-link     10.44.60.4  <<---
255.255.255.255      255.255.255.255  On-link     10.44.60.4  <<---
===================================================================
Persistent Routes:
Network Address      Netmask          Gateway Address   Metric
10.51.0.0            255.255.0.0      10.44.60.1        1     <<--- Persistent Route
===================================================================

As long as the Clustered IP Address of 10.44.60.6 is online on this node, all is well. However, if I were to take the 10.44.60.6 IP Address offline, things change.

C:\>cluster res "IP Address 10.44.60.6" /offline
Taking resource ''IP Address 10.44.60.6'' offline...
Resource                 Group         Node             Status
--------------------     ----------    ---------------  ------
IP Address 10.44.60.6    Data Group    ClusterNode1     Offline

C:\>route print -4
IPv4 Route Table
==================================================================
Active Routes:
Network Destination  Netmask          Gateway     Interface  
10.44.0.0            255.255.0.0      On-link     10.44.60.4  <<---
10.44.60.4           255.255.255.255  On-link     10.44.60.4  <<--- Phys. Node IP Address
10.44.255.255        255.255.255.255  On-link     10.44.60.4  <<---
224.0.0.0            240.0.0.0        On-link     10.44.60.4  <<---
255.255.255.255      255.255.255.255  On-link     10.44.60.4  <<---
===================================================================
Persistent Routes:
Network Address      Netmask          Gateway Address   Metric
10.51.0.0            255.255.0.0      10.44.60.1        1     <<--- Persistent Route
===================================================================

Notice here that both the Clustered IP Address 10.44.60.6 and the 10.51.0.0 “Active” route have been removed. Because the 10.51.0.0 route is gone, connectivity to the backup server (10.51.0.1) is lost. If you bring the Clustered IP Address 10.44.60.6 online again, the “Active” routes are repopulated and connectivity to the backup server is restored.

C:\>cluster res "IP Address 10.44.60.6" /online
Bringing resource ''IP Address 10.44.60.6'' online...
Resource                 Group         Node             Status
--------------------     ----------    ---------------  ------
IP Address 10.44.60.6    Data Group    ClusterNode1     Online

C:\>route print -4
IPv4 Route Table
==================================================================
Active Routes:
Network Destination  Netmask          Gateway     Interface  
10.44.0.0            255.255.0.0      On-link     10.44.60.4  <<---
10.44.60.4           255.255.255.255  On-link     10.44.60.4  <<--- Phys. Node IP Address
10.44.60.6           255.255.255.255  On-link     10.44.60.4  <<--- Clustered IP Address
10.44.255.255        255.255.255.255  On-link     10.44.60.4  <<---
10.51.0.0            255.255.0.0      10.44.60.1  10.44.60.4  <<--- Static Route added
224.0.0.0            240.0.0.0        On-link     10.44.60.4  <<---
255.255.255.255      255.255.255.255  On-link     10.44.60.4  <<---
===================================================================
Persistent Routes:
Network Address      Netmask          Gateway Address   Metric
10.51.0.0            255.255.0.0      10.44.60.1        1     <<--- Persistent Route
===================================================================

According to our Networking Development Groups, the recommendation actually is that on-link routes should be added with a 0.0.0.0 entry for the next hop, not with the local address (particularly because the local address might be deleted) and with the interface specified. Specifying this entry will be the support stance moving forward from a networking perspective when you are talking about adding static persistent routes on any machine (Cluster or not).

The ROUTE.EXE command has additional METRIC and IF (interface) parameters; specifying the interface binds the route to the card itself.

C:\>route /?

Manipulates network routing tables.

ROUTE [-f] [-p] command [destination]
                  [MASK netmask]  [gateway] [METRIC metric]  [IF interface]

  interface    the interface number for the specified route.
  METRIC       specifies the metric, ie. cost for the destination.

So the first thing you need to do is determine the interface number so that the route can be bound to it. Both the ROUTE PRINT and NETSH commands list the interfaces at the top of their output, similar to this:

C:\>route print
IPv4 Route Table
============================================================
Interface List
23 ...00 15 5d 4a ac 06 ...... Local Gigabit Controller
19 ...00 15 5d 4a ac 01 ...... Local Gigabit Controller #2
18 ...00 15 5d 4a ac 00 ...... Local Gigabit Controller #3
============================================================

-or-

C:\>netsh int ipv4 show int
Idx  Met   MTU       State        Name
---  ---  -----      -----------  -------------------
18   50 4294967295  connected    Local Gigabit Controller #3
19    5   1500      connected    Local Gigabit Controller #2
23    5   1500      connected    Local Gigabit Controller

I can go into the Network and Sharing Center, if I have to, to see which card is on this network. In my particular case, the “Local Gigabit Controller #3” is the one I want to use. So to get my persistent route to stay even though the Clustered IP Address goes offline, my command would be the following. Note that the METRIC value is optional.

route -p add 10.51.0.0 mask 255.255.0.0 0.0.0.0 metric 276 if 18

Now, the “Active” route will stay and you will have connectivity regardless of whether a Clustered IP Address is online or offline. If you are adding a static persistent route, start specifying 0.0.0.0 and the interface, as this is the supported command form going forward from a networking perspective. This will result in proper functioning whether or not Failover Clustering is configured.
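As a rough illustration of the two steps above (find the interface index, then build the command), here is a minimal Python sketch. The helper names `interface_index` and `persistent_onlink_route` are hypothetical, and the parser assumes the Interface List rows look like the `route print` sample shown earlier; it only builds the command string and does not run it.

```python
import re

# Matches one row of the "Interface List" printed by `route print`, e.g.
# "18 ...00 15 5d 4a ac 00 ...... Local Gigabit Controller #3"
_ROW = re.compile(r"^\s*(\d+)\s*\.{3}(?:[0-9a-fA-F]{2} ?)+\s*\.+\s*(.+?)\s*$")


def interface_index(route_print_output, adapter_name):
    """Return the interface index for adapter_name, or None if it is not listed."""
    for line in route_print_output.splitlines():
        match = _ROW.match(line)
        if match and match.group(2) == adapter_name:
            return int(match.group(1))
    return None


def persistent_onlink_route(destination, netmask, if_index):
    """Build a persistent route command with 0.0.0.0 as next hop, bound to if_index."""
    return f"route -p add {destination} mask {netmask} 0.0.0.0 if {if_index}"
```

Passing the sample output and "Local Gigabit Controller #3" yields index 18, and the built command matches the one above minus the optional METRIC.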

- John Marlin

The Cable Guy returns!

I am very happy to announce that Joseph Davies has returned to publish The Cable Guy column again on TechNet. In his column for May, found here, he discusses DirectAccess and the Thin Edge Network. Welcome back, Cable Guy!

- Mike Platts

RPC to Go v.3 – Named Pipes

In this blog I’d like to give some information on what Named Pipes are, what a Named Pipes connection looks like, and some common things to look for if there is an issue. I do hope you have read RPC to Go v.1 and v.2. It makes this much easier to understand. Either way, I’ll try to keep this blog from getting complicated. I considered making it ‘less technical,’ but if you’re reading about RPC, you’re likely technical to a degree.

A customer case I worked on inspired this blog. I went onsite as they had a problem that was approaching its first birthday. The resolution was two-fold. The initial problem was resolved using Process Explorer to find handle leaks (tool available at www.sysinternals.com ). The other issue could only be found by someone having a watchful eye on the flow of traffic.

This particular issue involved one client machine (XP SP2), a web server (Server 2003 with IIS), and a database server (Server 2003 with SQL 2005). The users accessed a web-based application in which they performed multiple tasks in one window. During certain tasks, they received varying errors. Most of the errors pointed to a file not existing (even though it did) or to a query response timing out.

After the initial issue was resolved, I crossed my fingers and had the customer start testing. I focused on three key areas: the process explorer (or task manager), the number of concurrent connections to both servers (using “netstat –ano” from a command prompt at the servers), and a network capture. As the number of connections to the server grew I noticed that the response time from the server to the clients increased. I checked process explorer. System resources were minimal on the application server. The same was true for the SQL server.

I filtered my capture for web (TCP port 80) and SQL (TCP port 1433 natively) traffic. These connections opened and closed with no issues. Becoming more perplexed, I removed the filter. I noticed an alarming number of connections on port 445 between the web and SQL servers. The SQL server sent an overwhelming number of its responses with STATUS_PIPE_DISCONNECTED. As it turned out, in the application code, the “call socket” function for the SQL query procedure was written to use Named Pipes to connect to the SQL server. We configured SQL to listen for Named Pipes, which must be done explicitly, and the issue was resolved.

WHAT ARE NAMED PIPES?

Named Pipes, like RPC, allow inter-process communications (IPC). According to MSDN, named pipes are used to transfer data between processes that are not related processes and between processes on different computers. The differences between inter-process communications protocols include their capabilities and the way the socket connections are made. For a complete list of differences, visit the following link.

http://msdn.microsoft.com/en-us/library/aa365574(VS.85).aspx#base.using_pipes_for_ipc

I just want to focus on a Named Pipes connection establishment. Unlike RPC, there is no end point mapper (EPM) connection. Inter-process communications between separate hosts must accomplish several things:

  • A connection must be established on a pre-determined port number.
  • A level of authentication must be agreed upon and then met.
  • The remote process must be identified (even though the listening port is specified)
    • RPC uses a Universally Unique Identifier (UUID) for process identification
    • Named Pipes uses \\MachineName\IPC$

In order to accomplish all three, Named Pipes uses the Microsoft implementation of CIFS – the Server Message Block protocol, or SMB (blog coming within a couple months for clarity – promise). In the interim, here’s a great link: http://msdn.microsoft.com/en-us/library/aa365233.aspx

Once there’s a connection to the IPC$ tree, subsequent RPC binds and calls are encapsulated in SMB and sent over the wire using TCP port 139 or 445 natively. When binding to a new process, an SMB Create Andx Request is sent. The request for the new process is \\machinename\IPC$\ServiceName, for example “wkssvc,” which is the workstation service.

The RPC portion does have typical RPC information. There is still a UUID, a transfer syntax and OpNum. There is also an ‘AssociationGroupID.’ The latter isn’t covered in previous blogs and is overkill for this blog.

WHAT TO LOOK FOR
  • Does my traffic hit the wire? If no:
    • Make sure the name is resolvable.
    • Make sure there is a route to the resolved IP address.
  • Hits the wire, but no connection is made to the remote host:
    • Make sure the remote host is listening on port 139 or 445. The Server service must start.
    • Check for firewall issues at the client, server and intermediate network devices.
  • Makes a connection but is reset during the SMB portion:
    • Make sure the authenticating user has rights to the IPC$ share on the remote host.
    • Check your auth type. The source host must have a route to the DC for Kerberos auth.
  • SMB works fine, but RPC call gets a BIND_NAK response.
    • Verify the remote service is running. (Check the UUID in the RPC portion of the frame.) Refer to previous “RPC to Go” versions for more troubleshooting info.
  • Looking at a network capture? IPC$ is good about sending error messages back. There aren’t many blogs for these errors. Carefully examine where, in transmission, the errors occurred.
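The first two checks in the list above (is the name resolvable, and is the remote host listening on port 139 or 445) can be scripted. The following is a minimal sketch in Python; the function name `check_tcp_listener` is hypothetical, and a failed connection cannot tell you whether the cause is a stopped Server service, a firewall, or a missing route.

```python
import socket


def check_tcp_listener(host, ports=(445, 139), timeout=2.0):
    """Return the resolved IPv4 address and the ports that accepted a TCP connection."""
    result = {"resolved": None, "open_ports": []}
    try:
        # Step 1: is the name resolvable?
        info = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
        result["resolved"] = info[0][4][0]
    except socket.gaierror:
        return result
    for port in ports:
        try:
            # Step 2: does the remote host accept a TCP connection on this port?
            with socket.create_connection((result["resolved"], port), timeout=timeout):
                result["open_ports"].append(port)
        except OSError:
            pass  # closed, filtered by a firewall, or no route to the host
    return result
```

If `resolved` is None, start with name resolution; if `open_ports` is empty, move on to the listener and firewall checks.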

NETWORK CAPTURES

This is a concise example of a Named Pipes connection to a server named Fabfile-1 using remote registry. I’ll show the SMB tree connection to IPC$, the connection to the remote procedure, and the relevant RPC info.

1. In the Tree Connect Request, notice the file structure:

65 08:43:34.724815 192.168.3.100 192.168.3.5 SMB Tree Connect AndX Request, Path: \\FABFILE-1\IPC$

2. In the Create Andx Request, notice WinReg is requested as a file in the IPC$ share

67 08:43:34.725799 192.168.3.100         192.168.3.5           SMB      NT Create AndX Request, FID: 0x4000, Path: \winreg
SMB (Server Message Block Protocol)
    SMB Header
        Server Component: SMB
        [Response in: 68]
        SMB Command: NT Create AndX (0xa2)
        NT Status: STATUS_SUCCESS (0x00000000)
        Flags: 0x18
        Flags2: 0xc807
        Process ID High: 0
        Signature: 0000000000000000
        Reserved: 0000
       Tree ID: 2048  (\\FABFILE-1\IPC$)
        Process ID: 792
        User ID: 2048
        Multiplex ID: 192
    NT Create AndX Request (0xa2)
      [FID: 0x4000 (\winreg)]

3. In the encapsulated RPC bind, notice the UUID for WINREG and the ‘x86’ transfer syntax. This is presented to the remote process when a procedure call is made.

Ctx Item[1]: ID:0
        Context ID: 0
        Num Trans Items: 1
        Abstract Syntax: WINREG V1.0
            Interface: WINREG UUID: 338cd001-2244-31f1-aaaa-900038001003
            Interface Ver: 1
            Interface Ver Minor: 0
       Transfer Syntax[1]: 8a885d04-1ceb-11c9-9fe8-08002b104860 V2
            Transfer Syntax: 8a885d04-1ceb-11c9-9fe8-08002b104860  ver: 2

Hopefully you’ve gained some insight into Named Pipes, its connection establishment, and some “gotchas” to keep an eye on. I do hope the “RPC to Go” series saves a couple of support calls in the near future.

-Rich Chambers

Source IP address selection on a Multi-Homed Windows Computer

There is often confusion about how a computer chooses which adapter to use when sending traffic.

This blog describes the process by which a network adapter is chosen for an outbound connection on a multiple-homed computer, and how a local source IP address is chosen for that connection.

What is Source IP address selection

Source IP address selection is the process by which the stack chooses an IP address.

Windows XP and Windows Server 2003 are based on the weak host model.

When a Windows Sockets program binds to a socket, one of the parameters that is passed in the bind() call is the local (source) IP address that should be used for outbound packets. Most programs do not have any knowledge of network topology, so they specify IPADDR_ANY instead of a specific IP address in their bind() call. IPADDR_ANY tells the stack that the program is going to let the stack choose the best local IP address to use.

Windows XP behavior

KB175396 - Windows Socket Connection from a Multiple-Homed Computer

The TCP/IP component of all Microsoft Windows operating systems prior to Windows Vista is based on a Weak Host model. This model gives program developers the greatest amount of leeway when they design programs that use the network and are compatible with Microsoft products. This model also puts the responsibility of the behavior of the networking program on the developers, because the developers specify how the program accesses the TCP/IP stack and responds to incoming and outgoing frames.

On a computer that has one network adapter, the IP address that is chosen is the Primary IP address of the network adapter in the computer. However, on a multiple-homed computer, the stack must first make a choice. The stack cannot make an intelligent choice until it knows the target IP address for the connection.

When the program makes a connect() call to a target IP address, or makes a send() call to send a UDP datagram, the stack references the target IP address, and then examines the IP route table so that it can choose the best network adapter over which to send the packet. After this network adapter has been chosen, the stack reads the Primary IP address associated with that network adapter and uses that IP address as the source IP address for the outbound packets.

Example:
Source supplied in the call: IPADDR_ANY
Target IP:192.168.1.5
Route Table:
Nic 1 - 192.168.1.10/32
Nic 1 - 192.168.1.11/32
Nic 2 - 10.0.0.10/32
Nic 2 - 10.0.0.11/32
The chosen source IP:192.168.1.10
The chosen source NIC: Nic 1
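
To watch the stack make this choice on a live machine, a program can leave the local address unspecified, call connect(), and then ask the socket which local address it was given. The sketch below uses Python's socket module rather than Winsock directly; the function name is hypothetical, and the address returned depends on the local route table.

```python
import socket


def source_ip_chosen_by_stack(target_ip, target_port=53):
    """Return the local IP the stack picks for target_ip when none is specified."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # No bind() call, so the local address is left for the stack to choose.
        # connect() on a UDP socket sends no packet; it only makes the stack
        # consult the route table and fix the local address for this socket.
        sock.connect((target_ip, target_port))
        return sock.getsockname()[0]
```

Calling it with a target on each attached network shows which source address the stack would use for that destination.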

If the program specifies a source IP address to use in the bind() call, that IP address is used as the source IP address for connections sourced from that socket. However, the route table is still used to route the outbound IP datagrams, based on the target IP address. As a result of this behavior, the source IP address may not be the one associated with the network adapter that is chosen to send the packets.

Example:
Source supplied in the call:10.0.0.10
Target IP:192.168.1.5
Route Table:
Nic 1 - 192.168.1.10/32
Nic 1 - 192.168.1.11/32
Nic 2 - 10.0.0.10/32
Nic 2 - 10.0.0.11/32
The chosen source IP:10.0.0.10
The chosen source Nic: Nic 1 <- Note this is not the Nic the source IP is on.

Summary

If a source IP is not given, the Primary IP address of the adapter with a route that most closely matches the target IP address is used to source the packet, and the adapter that the Primary IP is associated with is used as the source adapter.

If the source IP is specified, the adapter that is used to send the packet is the one with a route that most closely matches the target IP address, and this may not be the adapter that is associated with the source IP.

Windows Vista/Windows Server 2008 behavior

Windows Vista and later are based on the strong host model. In the strong host model, the host can only send packets on an interface if the interface is assigned the source IP address of the packet being sent. Also the concept of a primary IP address does not exist.

As in XP, if a program doesn't specify a source IP, the stack references the target IP address, and then examines the entire IP route table so that it can choose the best network adapter over which to send the packet. After the network adapter has been chosen, the stack picks an address on that adapter using the address selection process defined in RFC 3484 and uses that IP address as the source IP address for the outbound packets.

Example:

Source supplied in the call: IPADDR_ANY
Target IP:192.168.1.5
Route Table:
Nic 1 - 192.168.2.10/32
Nic 1 - 192.168.1.11/32
Nic 2 - 10.0.0.10/32
Nic 2 - 10.0.0.11/32
The chosen source IP:192.168.1.11
The chosen source NIC: Nic 1

If the program specifies a source IP address, that IP address is used as the source IP address for connections sourced from that socket and the adapter associated with that source IP is used as the source interface. The route table is searched but only for routes that can be reached from that source interface.

Example:
Source supplied in the call:10.0.0.10
Target IP:192.168.1.5
Route Table:
Nic 1 - 192.168.1.10/32
Nic 1 - 192.168.1.11/32
Nic 2 - 10.0.0.10/32
Nic 2 - 10.0.0.11/32
The chosen source IP:10.0.0.10
The chosen source Nic: Nic 2 <- Note this is the Nic the source IP is on.
Note: the packet would be sent to the default gateway associated with Nic 2.
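
The four examples above can be reproduced with a small model. This is only a sketch under simplifying assumptions: the route lookup is replaced by "the NIC holding the address that shares the most leading bits with the target", the first address listed for a NIC stands in for its Primary IP address, and the function names are hypothetical. It is not how the Windows stack is implemented.

```python
import ipaddress


def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def best_nic(nics, target):
    """Pick the NIC holding the address closest to the target (stand-in for a route lookup)."""
    return max(nics, key=lambda nic: max(common_prefix_len(ip, target) for ip in nics[nic]))


def select(nics, target, source=None, strong_host=False):
    """Return (source_ip, outgoing_nic) under the weak or strong host model."""
    if source is None:
        nic = best_nic(nics, target)
        if strong_host:
            # Vista and later: no primary address; take the closest match on the NIC.
            return max(nics[nic], key=lambda ip: common_prefix_len(ip, target)), nic
        return nics[nic][0], nic  # XP/2003: first entry plays the Primary IP address
    if strong_host:
        # The packet must leave on the interface that owns the source address.
        owner = next(nic for nic, ips in nics.items() if source in ips)
        return source, owner
    return source, best_nic(nics, target)  # weak host: route lookup ignores the source
```

With the example addresses, a source of 10.0.0.10 and a target of 192.168.1.5 give Nic 1 under the weak host model and Nic 2 under the strong host model.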

RFC 3484 and Source IP address selection

The last thing I want to talk about is RFC 3484.

Even though RFC 3484 says it applies only to IPv6, in Windows implementations IPv4 follows the same rules when possible.

Windows source IPv4 address selection:
Rule 1: Prefer same address (applies)
Rule 2: Prefer appropriate scope (applies)
Rule 3: Avoid deprecated addresses (applies)
Rule 4: Prefer home addresses (does not apply to IPv4)
Rule 5: Prefer outgoing interface (applies)
Rule 6: Prefer matching label (does not apply to IPv4)
Rule 7: Prefer public addresses (does not apply to IPv4)
Rule 8a: Use longest matching prefix with the next hop IP address (not in the RFC)
"If CommonPrefixLen(SA, D) > CommonPrefixLen(SB, D), then prefer SA. Similarly, if
CommonPrefixLen(SB, D) > CommonPrefixLen(SA, D), then prefer SB."
This says that the IP address with the most high-order bits matching the next hop
will be used.
Note: Rule 8 (Use longest matching prefix) is similar to Rule 8a, except that the match
is with the destination IP address rather than the next hop IP address.

For example, consider the following addresses:

Client machine
IP Address
192.168.1.14 /24
192.168.1.68 /24
Default Gateway
192.168.1.127

The client machine will use the 192.168.1.68 address because it has the longest matching prefix.

To see this more clearly, consider the IP addresses in binary:

 

11000000 10101000 00000001 00001110 = 192.168.1.14 (Bits matching the gateway = 25)
11000000 10101000 00000001 01000100 = 192.168.1.68 (Bits matching the gateway = 26)
11000000 10101000 00000001 01111111 = 192.168.1.127
The 192.168.1.68 address has more matching high order bits with the gateway address 192.168.1.127. Therefore, it is used for off-link communication.
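
The bit counting above is easy to reproduce. Here is a minimal Python sketch of CommonPrefixLen for IPv4 addresses, applied to the next hop as Rule 8a describes. The function names are hypothetical, and the sketch covers only this one rule, not the full selection process.

```python
import ipaddress


def common_prefix_len(a, b):
    """Count the leading bits that two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def pick_source(candidates, next_hop):
    """Rule 8a as described above: prefer the longest match with the next hop."""
    return max(candidates, key=lambda ip: common_prefix_len(ip, next_hop))


gateway = "192.168.1.127"
for ip in ("192.168.1.14", "192.168.1.68"):
    print(ip, common_prefix_len(ip, gateway))  # prints 25, then 26
print(pick_source(["192.168.1.14", "192.168.1.68"], gateway))  # 192.168.1.68
```

XOR-ing the two addresses leaves a 1 at every differing bit, so the position of the highest set bit gives the length of the shared prefix.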

Additional Information

Default Address Selection for Internet Protocol version 6 (IPv6)

For more information about Strong and Weak Host Models

http://technet.microsoft.com/en-us/magazine/2007.09.cableguy.aspx

The gethostbyname function has been deprecated. We recommend that you use the getaddrinfo function instead. However, we still cannot guarantee that the primary IP address will be returned first.
For more information about the gethostbyname function, visit the following Microsoft Web site:

http://msdn2.microsoft.com/en-us/library/ms738524.aspx

For more information about the getaddrinfo function, visit the following Microsoft Web site:

http://msdn2.microsoft.com/en-us/library/ms738520(VS.85).aspx

- David Pracht

Network Monitor 3.3 is available for download

Network Monitor 3.3 has released and can be downloaded now!

Download details: Microsoft Network Monitor 3.3

For information about the features of this release, check out the following post on the Netmon blog:

Network Monitor 3.3 has arrived!

- Mike Platts

DNS Round Robin and Destination IP address selection

This post is meant to discuss the issues that can occur with Destination IP address selection and its effect on the DNS Round Robin process.

What is Round Robin and Netmask Ordering

DNS Round Robin is a mechanism for choosing an IP address from the list returned by a DNS server so that all clients won't get the same IP address every time. Netmask ordering is a mechanism for further optimizing which IP address is used by attempting to determine the closest result.

842197 Description of the netmask ordering feature and the round robin feature in Windows Server 2003 DNS http://support.microsoft.com/default.aspx?scid=kb;EN-US;842197

The netmask ordering feature is used to return addresses for type A DNS queries to prioritize local resources to the client. For example, if the following conditions are true, the results of a query for a name are returned to the client based on Internet protocol (IP) address proximity:

  • You have eight type A records for the same DNS name.
  • Each of your eight type A records has a separate address.

The round robin feature is used to randomize the results of a similar type of query to provide basic load-balancing functionality. In the earlier example, eight type A records with the same name and different IP addresses cause a different answer to be prioritized to the top with each query. Because a new IP address is prioritized to the top with each query, clients are not repeatedly routed to the same server.

The key points here are that DNS Round Robin only provides a simple load-balancing system by alternating the IP at the top of the list the DNS server returns and that Netmask Ordering will return a list with the "closest" IP at the top of the list the DNS server returns. Both are server side mechanisms commonly used to provide simple load balancing functionality.
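
As a toy model of the server-side round robin behavior, the sketch below rotates the record list after every query so that a different address is at the top each time. The class name is hypothetical, and plain rotation is an assumption made for illustration; it is not a description of how the Windows DNS Server service orders records internally.

```python
from collections import deque


class RoundRobinName:
    """Toy model of a DNS name with several A records served round robin."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def answer(self):
        """Return the current ordering, then rotate so the next query differs."""
        response = list(self._records)
        self._records.rotate(-1)  # move the first address to the end
        return response
```

A client that simply takes the first address from each response is spread across all of the servers over successive queries.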

Destination Address Selection

Destination address selection is how the client decides which destination IP address is selected when it gets a list of IP addresses.

IPv4: When using IPv4 only (Windows XP, Windows Server 2003 and prior), destination address selection is fairly simple and done by selecting the IP address at the top of the list that was returned by the DNS server. This works well with DNS Round Robin as it lets the server decide what address the client will use by putting it at the top of the list.

IPv6: IPv6 introduces a change in this behavior per RFC 3484.

RFC 3484 Default Address Selection for IPv6 - http://www.ietf.org/rfc/rfc3484.txt

6. Destination Address Selection

   The destination address selection algorithm takes a list of
   destination addresses and sorts the addresses to produce a new list.
   It is specified here in terms of the pair-wise comparison of
   addresses DA and DB, where DA appears before DB in the original list.
   The algorithm sorts together both IPv6 and IPv4 addresses.
   ...
   The pair-wise comparison of destination addresses consists of ten
   rules, which should be applied in order.  If a rule determines a
   result, then the remaining rules are not relevant and should be
   ignored.  Subsequent rules act as tie-breakers for earlier rules.

There are 10 rules, but it is rule 9 that we need to consider.

Rule 9:  Use longest matching prefix.
   When DA and DB belong to the same address family (both are IPv6 or
   both are IPv4): If CommonPrefixLen(DA, Source(DA)) >
   CommonPrefixLen(DB, Source(DB)), then prefer DA.  Similarly, if
   CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
   then prefer DB.

Essentially this says that we should use the longest match and not just pull the first IP address off the list. The key point to understand is that there is a change in behavior by design when IPv6 is on the system and that when IPv6 is installed Windows does not just pull the first IP address off the list.

The effect of RFC3484 on DNS Round Robin

When Vista clients (or XP clients with IPv6 installed) query DNS and receive a list of IP addresses, a destination selection algorithm kicks in and returns the destination address that has the longest prefix match (per RFC3484). This breaks the DNS server's site load balancing as follows.

In the case of Round-Robin this means we can't count on the randomization provided by the DNS server.

Example:

A client with an IP address of 192.168.0.1 queries for Webserver.test.net and receives the following list:

Webserver.test.net A 192.168.1.10
Webserver.test.net A 192.168.5.20
Webserver.test.net A 192.168.6.30
Webserver.test.net A 192.168.0.40
Webserver.test.net A 192.168.4.50

With RFC3484 in effect, the client will always use the 192.168.0.40 address as it is the longest match, negating the effects of DNS round-Robin.

In the case of NetMask Ordering, if some server’s address is “closer” to the client address and would be preferred, it will always get that address.

Example:

A client with an IP address of 192.168.0.1 queries for Webserver.test.net and receives the following list:

Webserver.test.net A 192.168.0.100
Webserver.test.net A 192.168.0.10
Webserver.test.net A 192.168.0.11
Webserver.test.net A 192.168.0.15
Webserver.test.net A 192.168.0.20

With RFC3484 in effect, the client will always use the 192.168.0.10 address as it is the longest match, negating the effects of netmask ordering.

You can see why by looking at the 4th octet in binary. You compare bits, starting from the left, until you reach one that doesn't match. With a client IP address of 192.168.0.1, the 4th octet to compare against is 00000001.

11000000 10101000 00000000 00000001 = 192.168.0.1 = Client IP to match.
11000000 10101000 00000000 01100100 = 192.168.0.100 = (24 + 1 = 25 bits matching the client IP)
11000000 10101000 00000000 00001010 = 192.168.0.10 = (24 + 4 = 28 bits matching the client IP)
11000000 10101000 00000000 00001011 = 192.168.0.11 = (24 + 4 = 28 bits matching the client IP)
11000000 10101000 00000000 00001111 = 192.168.0.15 = (24 + 4 = 28 bits matching the client IP)
11000000 10101000 00000000 00010100 = 192.168.0.20 = (24 + 3 = 27 bits matching the client IP)

Then the first entry from the longest match is chosen. In this case, 192.168.0.10.
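
Both examples can be checked with a short Python sketch of the Rule 9 ordering. It assumes a single client address is the source for every destination, and the function names are hypothetical; the other nine rules are ignored.

```python
import ipaddress


def common_prefix_len(a, b):
    """Count the leading bits that two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def order_destinations(client_ip, answers):
    """Sort DNS answers by longest prefix match with the client address (Rule 9)."""
    # sorted() is stable, so addresses that tie keep the order the DNS server used.
    return sorted(answers, key=lambda ip: common_prefix_len(ip, client_ip), reverse=True)
```

Because the sort key depends only on the addresses, the first entry is the same no matter how the DNS server rotates the list, unless two or more addresses tie for the longest match.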

An Alternative

You can change the behavior on Windows Vista SP1 and Windows Server 2008 with a client side registry entry documented in KB 968920.

Note: Windows 7 and Windows Server 2008 R2 will change the default behavior.

968920 Windows Vista and Windows Server 2008 DNS clients do not honor DNS round robin by default
http://support.microsoft.com/default.aspx?scid=kb;EN-US;968920

Symptom

By default, Windows Vista and Windows Server 2008 follow RFC 3484 for destination IP address selection, which does not honor DNS round robin. 

Resolution

To resolve this issue, add a registry key that disables subnet prioritization.

Add a new registry value with the following settings:
Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
Value name: OverrideDefaultAddressSelection
Type: DWORD
Value data: 1
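
If the setting has to be rolled out to several machines, one option is to generate a .reg file for it. The helper below is a hypothetical sketch that only renders the file text in the standard .reg format; it does not touch the registry itself.

```python
def reg_file_for_dword(key_path, value_name, data):
    """Render a .reg file body that sets one DWORD value under key_path."""
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}]",
        f'"{value_name}"=dword:{data:08x}',
        "",
    ])


print(reg_file_for_dword(
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters",
    "OverrideDefaultAddressSelection",
    1,
))
```

The printed text can be saved with a .reg extension and merged on the client. Refer to KB 968920 for the supported procedure.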

- David Pracht

DNS Client Name Resolution behavior in Windows Vista vs. Windows XP

In Windows, the DNS Client service is the client component that resolves and caches Domain Name System (DNS) domain names. When the DNS Client service receives a request to resolve a DNS name that is not in its cache, it queries an assigned DNS server for an IP address for the name. All computers that use DNS to resolve domain names (including DNS servers and domain controllers) use the DNS Client service for this purpose.

To extend or revise the DNS search capabilities, Windows provides a DNS domain suffix search list. By adding additional suffixes to the list you can search for short, unqualified computer names in more than one specified DNS domain. If a DNS query fails, the DNS client service can use this list to append other name suffix endings to your original name and repeat DNS queries to the DNS server for these alternate FQDNs (Fully Qualified Domain Names).

When the suffix search list is empty or unspecified, the primary DNS suffix of the computer is appended to short unqualified names and a DNS query is used to resolve the resultant FQDN. If no connection-specific suffixes are configured or queries for these connection-specific FQDNs fail, the client can then begin to retry queries based on systematic reduction of the primary suffix (also known as devolution). For example, if the primary suffix were "ad.contoso.corp", the devolution process would be able to retry queries for the short name by searching for it in the "contoso.corp" domain.
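
Devolution can be pictured as dropping the leftmost label of the primary suffix one step at a time. The Python sketch below assumes that devolution stops once two labels remain (so the top-level domain alone is never tried); that stopping point and the function names are assumptions made for illustration.

```python
def devolved_suffixes(primary_suffix, min_labels=2):
    """List the suffixes tried as the primary DNS suffix is devolved."""
    labels = primary_suffix.split(".")
    # Drop the leftmost label one step at a time, stopping at min_labels labels.
    steps = max(1, len(labels) - min_labels + 1)
    return [".".join(labels[i:]) for i in range(steps)]


def devolution_queries(short_name, primary_suffix):
    """FQDNs that would be tried for short_name, in order."""
    return [f"{short_name}.{suffix}" for suffix in devolved_suffixes(primary_suffix)]
```

For a short name of server1 and a primary suffix of ad.contoso.corp, this yields server1.ad.contoso.corp followed by server1.contoso.corp.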

When the suffix search list is not empty and has at least one DNS suffix specified, attempts to qualify and resolve short DNS names are limited to searching only those FQDNs made possible by the specific suffix list.

A change with respect to the DNS queries for multi-label names has been made in the default behavior of Windows Vista as compared to that of Windows XP. The change is as follows:

Windows XP:
When a Windows XP machine attempts to resolve an unqualified multi-label name, the
DNS client will attempt to resolve the name as specified, then will append the domains
that are listed in the DNS suffix search order.

Windows Vista:
When a Windows Vista machine attempts to resolve an unqualified multi-label name, the
DNS client will attempt to resolve the name as specified. The DNS suffix search
order will NOT be used.

Example:

Suppose you have a domain structure where you have the following DNS Suffix Search List:

Ad.Contoso.corp

Contoso.corp

From a command prompt, ping the hostname of a machine using the following
unqualified multi-label syntax:

<hostname>.site1

XP will attempt to resolve:

1. <hostname>.site1
2. <hostname>.site1.ad.contoso.corp
3. <hostname>.site1.contoso.corp

When viewed in a network capture:

192.168.1.1 192.168.1.5 DNS Query for hostname.site1 of type Host Addr on class Internet
192.168.1.5 192.168.1.1 DNS Response - Name Error 
192.168.1.1 65.52.16.29 DNS Query for hostname.site1.ad.contoso.corp of type Host Addr on class Internet
192.168.1.5 192.168.1.1 DNS Response - Name Error 
192.168.1.1 65.52.16.29 DNS Query for hostname.site1.contoso.corp of type Host Addr on class Internet
192.168.1.5 192.168.1.1 DNS Response - Name Error

Vista will attempt to resolve:

1. <hostname>.site1

Vista does not attempt name resolution any further.

When viewed in network capture:

192.168.1.1 192.168.1.5 DNS Query for hostname.site1 of type Host Addr on class Internet
192.168.1.5 192.168.1.1 DNS Response - Name Error

How to control this behavior

This registry entry works for both Windows XP and Windows Vista

HKLM\Software\Policies\Microsoft\Windows NT\DNSClient\AppendToMultiLabelName
Type = DWORD

Data:

  • 0 (Do not Append Suffix)
  • 1 (Append suffix)

If the registry entry is not present, the default in Windows XP is 1, and 0 in Windows Vista.
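
The difference between the two defaults can be summed up in a few lines of Python. This is a simplified model of the behavior described above for unqualified multi-label names only; the function name is hypothetical and single-label names are deliberately left out.

```python
def names_to_query(name, suffix_search_list, append_to_multi_label):
    """FQDNs tried for an unqualified multi-label name such as 'hostname.site1'."""
    if "." not in name or name.endswith("."):
        raise ValueError("expected an unqualified multi-label name")
    attempts = [name]  # the name is always tried as specified first
    if append_to_multi_label:
        attempts += [f"{name}.{suffix.lower()}" for suffix in suffix_search_list]
    return attempts


# Defaults when AppendToMultiLabelName is absent: 1 on Windows XP, 0 on Windows Vista.
XP_DEFAULT, VISTA_DEFAULT = True, False
```

With the suffix search list from the example, the XP default produces the three queries seen in the first capture and the Vista default produces only the first.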

This registry change and its effect apply to the ping command; they do not apply to the Nslookup tool. This is because Nslookup contains its own DNS resolver and does not rely on the resolver built into the operating system (DNS Client). The DNS (multi-label) query packets sent by the Nslookup tool will append the domains listed in the suffix search order irrespective of the registry key settings mentioned here.

Group Policy location (for Windows Vista only) - (run gpedit.msc):

Computer Configuration -> Administrative Templates -> Network -> DNS Client -> “Allow DNS Suffix Appending to Unqualified Multi-Label Name Queries”

Note: As with other GPOs, if you change the registry and a GPO is also configured, the GPO will override this registry value.

- Sneh Shah with additional information from Kapil Thacker

Error citing a Remote Access Policy problem prevents connection via Terminal Services Gateway

Hello All! It’s Brett Crane from the Networking teams here at Microsoft. Today I wanted to take a few minutes to talk a little about an issue that we have found regarding Terminal Services Gateway.

The problem that we have seen may occur when a TSG client tries to connect through the TS Gateway. The client displays a popup with the following error:

Terminal Services Resource Authorization Policy (TS RAP) is preventing connection to the remote computer through TS Gateway, possibly due to one of the following reasons:

- You do not have permission to connect to this remote computer through the TS Gateway server.

- The name specified for the remote computer does not match the name in the TS RAP.

Contact your administrator for further assistance.

But if you look at your Resource Authorization Policy there doesn’t seem to be any problem. “I am trying to connect to a client that’s listed in my policy!” Well, that just may be the case…

There are a few things you should check in your environment to see if you are in a situation where your clients will notice this behavior even though you are configured properly:

1. How are you requesting your resource from your client? By this I mean... go into the RDP Client settings and look at the "General" tab. Under Logon settings, what does your client have listed as "Computer"? This will be the backend resource when going through TSG. Are you requesting the resource by NetBIOS machine name or by FQDN?

2. Do you have TSG configured to use Active Directory Security Groups?

3. What is the FQDN of the domain your clients are connecting to?

4. What is the NetBIOS name of the domain your clients are connecting to?

So now you have done the research and answered the questions listed above… what is the problem? Well, basically, when your Fully Qualified Domain Name differs from your NetBIOS domain name and you are using Existing AD security groups, we don’t complete a process properly when we access security rights for a RAP check. So when you use an FQDN when trying to request a specific back-end resource the failure happens.

“What can we do?” Well, we have found out what the problem was and have fixed the issue in future releases of the product.

“What about those of us who are in the predicament now?” We’ve released a Hotfix that will resolve this issue. You can download the hotfix by going to this link:

http://support.microsoft.com/kb/967933

Once there, click on “View and request hotfix downloads” found at the top of the page.

* When choosing the hotfix you will notice that the “Product” is listed as “Windows Vista”. This is normal. Windows Server 2008 and Windows Vista are built around the same platform.

** This hotfix will be included in Windows Server 2008 Service Pack 2.

“With all that known, is there a workaround?”  Yes! The easiest workaround to this whole issue is to not use an FQDN to reference the back-end resource in your Remote Desktop Client. Just call the resource by its NetBIOS name and everything should work fine. If you do need to use the FQDN, then configure the option in your TSG Resource Authorization Policy to use TS Gateway managed groups instead of existing Active Directory Security groups.

I hope this information has been helpful. Until next time… Safe computing!

- Brett Crane

TCP/IP Networking from the Wire Up

We often have customers who want to better understand the resolution to their networking support issue and why what we did fixed it. Depending on the issue, this explanation may be quite involved and complex. It would be like asking the pilot, on the way off the airplane, to briefly explain the aerodynamics, electronics, hydraulics, etc. that went into landing the plane and how they all work together. Since there is no brief explanation of how computer networks do all the things they do, I will be covering this topic in a series of blog posts in which I hope to give at least a basic understanding of network protocols and architecture. My intention here is to lay a foundation from which a deeper understanding of Windows networking can be gained. We will discuss IPv4 over Ethernet only, and will not be delving into IPv6 or other physical network topologies.

In order to continue we need to at least mention the 7 Layers of the International Standards Organization/Open System Interconnect (OSI) model. This is the model where we get the reference for Layer 2 and Layer 3 when referring to types of switches and routers, for example.

This model consists of the following seven layers:

  • Layer 7: Application
  • Layer 6: Presentation
  • Layer 5: Session
  • Layer 4: Transport
  • Layer 3: Network
  • Layer 2: Data Link
  • Layer 1: Physical

Layer 1 is the Physical layer and this consists of the Network Interface Card (NIC) and other components that allow a system to physically and logically connect to a network. This is as deep as we need to go into Layer 1 for this discussion.

We will focus more on Layer 2 in this blog post, specifically starting with Layer 2 routing, and then get into Layer 3 routing in the next blog entry.

For more information on the OSI model and Windows Network Architecture see the following:

Windows Network Architecture and the OSI Model
http://msdn.microsoft.com/en-us/library/aa938287.aspx

TCP/IP Architecture
http://technet.microsoft.com/en-us/library/cc751234.aspx

Ethernet

The first thing we will need in order to communicate on a computer network is some method for structuring what is being sent. This is important not only for the computer itself but also for other devices on the network, such as routers and switches, so that they can properly handle network traffic. The two standards we use to accomplish this for TCP/IP are Ethernet II and IEEE 802.3, and they are found in Layer 2 of the OSI model. These standards define what is included when data is "framed" to be sent, so data sent on a network will often be referred to as frames. For a more in-depth discussion of how this is structured and the inner workings of Windows networking, the following books are excellent references:

"Microsoft® Windows® Server 2003 TCP/IP Protocols and Services Technical Reference" http://www.microsoft.com/learning/en/us/books/5030.aspx

"Windows Server® 2008 TCP/IP Protocols and Services"
http://www.microsoft.com/learning/en/us/Books/11630.aspx
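
To make the idea of "framing" a little more concrete, here is a minimal Python sketch that splits the first 14 bytes of a frame into the destination MAC, source MAC, and the type/length field. It is only an illustration, not how Windows does it internally. The function names and the sample frame are made up for this example; the 1500/1536 cut-off between an IEEE 802.3 length and an Ethernet II EtherType is the standard convention.

```python
import struct


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes in the hyphenated form that ipconfig displays."""
    return "-".join(f"{b:02X}" for b in raw)


def parse_ethernet_header(frame: bytes) -> dict:
    """Split the 14-byte header into destination MAC, source MAC, and type/length."""
    if len(frame) < 14:
        raise ValueError("an Ethernet header is 14 bytes long")
    dst, src, type_or_length = struct.unpack("!6s6sH", frame[:14])
    # 0x0600 (1536) and above is an Ethernet II EtherType;
    # 1500 and below is an IEEE 802.3 length field.
    if type_or_length >= 0x0600:
        framing = "Ethernet II"
    elif type_or_length <= 1500:
        framing = "IEEE 802.3"
    else:
        framing = "undefined"
    return {
        "destination": format_mac(dst),
        "source": format_mac(src),
        "type_or_length": type_or_length,
        "framing": framing,
    }


# Hypothetical frame: broadcast destination, made-up source MAC,
# EtherType 0x0806 (ARP), followed by 46 bytes of padding.
sample_frame = bytes.fromhex("ffffffffffff" "001122334455" "0806") + bytes(46)
header = parse_ethernet_header(sample_frame)
print(header)
```

Switches only need the two MAC fields to forward the frame; the type field tells the receiving system which protocol the payload belongs to.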

Media Access Control (MAC)

Now that we have a standard for constructing a frame to put on the wire, we will need a way to determine where we are going to send the frame. In order to communicate with other computers on the network a system or "Node" must have a way of identifying itself and other systems within the local subnet or "Broadcast Domain", a Broadcast Domain being the network that is reachable by broadcast. This identification is done using the MAC address of the network adapter. A MAC address may also be referred to as an Ethernet Address or Physical Address. This address is assigned by the manufacturer at the time the network adapter, also known as the Network Interface Controller (NIC), is created. It is possible to find NICs that allow the MAC address to be changed manually but care should be taken in doing this as this could cause addressing problems on the local subnet.

A MAC address consists of 6 Bytes, the first 3 of which are used for the Organizationally Unique Identifier (OUI) which is unique to the manufacturer of the NIC.

You can determine the manufacturer of your NIC by running ipconfig /all from a command prompt on your system. Next, take the first 3 bytes of the Physical Address and enter them in the "Search For" box under "Search the public OUI listing" at the following link:

http://standards.ieee.org/regauth/oui/index.shtml

The last 3 bytes of the MAC are specific to the NIC. Together they provide a unique local address.
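
As a quick illustration of that split, the following Python sketch separates a MAC address into its OUI and NIC-specific halves. The function name and the sample address are made up for this example.

```python
def split_mac(mac: str) -> tuple:
    """Return (OUI, NIC-specific part) for a MAC written with '-' or ':' separators."""
    octets = mac.replace(":", "-").split("-")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError("expected 6 two-digit hexadecimal bytes")
    for octet in octets:
        int(octet, 16)  # raises ValueError if the byte is not hexadecimal
    octets = [o.upper() for o in octets]
    # First 3 bytes identify the manufacturer, last 3 identify the adapter.
    return "-".join(octets[:3]), "-".join(octets[3:])


oui, nic_part = split_mac("00-11-22-aa-bb-cc")  # made-up address
print("OUI to search for:", oui)
print("NIC-specific part:", nic_part)
```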

As I mentioned, this gives a system a way to identify itself and other systems on the local network. However, for this to be useful we need a way to discover the MAC addresses of other systems, as well as a way for our own address to be discovered by them. To this end, Address Resolution Protocol (ARP) was developed, and is described in RFC 826.

IP Address

So that sounds pretty good, you might say. I have my MAC address, so I'm all set to talk on the network. You would be mostly correct. From a Layer 2 standpoint you do have everything you need to communicate. However, most operating systems, including Windows, allow communication to take place with systems beyond the Broadcast Domain. For this reason we have Layer 3, the Network layer, where we have Internet Protocol (IP) addresses. This is significant because the system cannot always assume that traffic will stay on the local subnet. The concept I want you to grasp here is that MAC addresses are for "local" routing and IP addresses are for "global" routing. This is a bit oversimplified, but let's run with it for now; it will all get clearer as we get farther into IP routing. For now, we just need to know that each system will have both an IP address and a MAC address. In order to communicate with other systems, we need a way to match the IP address of the system we want to communicate with to its MAC address. In addition, the source system will want to share its own MAC and IP address so that the target node knows how to communicate back. To accomplish this matching of MAC to IP addresses we use Address Resolution Protocol (ARP).

Address Resolution Protocol (ARP)

You will notice as we discuss ARP that the requests are structured a certain way. This is because we are conforming to standards such as RFC 826 which defines ARP, and RFC 5227 which defines IPv4 Address Conflict Detection. ARP is used to resolve the next hop IP address of a node to its corresponding MAC address. This is significant because the next hop IP address is not necessarily the destination IP address. Remember the concept I mentioned earlier that the IP address can allow for global routing.

When looking at an ARP packet in a network capture you will see that there are four fields used to identify the source and target IP and MAC addresses.

In the ARP Request the fields are filled in as follows:

  1. Source Hardware Address (SHA) – MAC address of the requesting system.
  2. Source Protocol Address (SPA) – Protocol Address, this is the IP Address of the sending system.
  3. Target Hardware Address (THA) – MAC address of the system with the Target IP. For an ARP request this will be all zeros (00-00-00-00-00-00) since this is the address we are trying to discover.
  4. Target Protocol Address (TPA) – Protocol Address, will be the destination IP address that we are trying to discover the MAC address for.

ARP1 (2)
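
For readers who like to see the bytes, here is a minimal Python sketch that builds the 28-byte ARP request payload for IPv4 over Ethernet using the layout from RFC 826. The helper names and the sample MAC are made up for this example; the IP addresses are borrowed from the DNS scenario earlier on this page. Note that the THA is left as zeros, because that is the value we are asking the target to supply.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert a hyphenated MAC such as 00-11-22-33-44-55 to 6 raw bytes."""
    return bytes.fromhex(mac.replace("-", ""))


def build_arp_request(sha: str, spa: str, tpa: str) -> bytes:
    """Build an ARP request payload (without the Ethernet header)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                      # hardware type: Ethernet
        0x0800,                 # protocol type: IPv4
        6,                      # hardware address length in bytes
        4,                      # protocol address length in bytes
        1,                      # operation: 1 = request
        mac_to_bytes(sha),      # SHA: MAC of the requesting system
        socket.inet_aton(spa),  # SPA: IP of the requesting system
        bytes(6),               # THA: zeros, this is what we want to learn
        socket.inet_aton(tpa),  # TPA: IP we are resolving
    )


request = build_arp_request("00-11-22-33-44-55", "192.168.0.10", "192.168.0.100")
print(len(request), request.hex())
```

In the reply, the operation changes to 2 and the responding system fills in its own MAC and IP as SHA and SPA.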

When the broadcast ARP request is received on Node 2, Node 2 updates its ARP cache with the information it received in the request. In the ARP reply, notice that the SHA and SPA are updated to match the correct information for the sending system. This is the information used by Node 1 to update its ARP cache. Once this is done, both systems will have the MAC and IP information they need to communicate with each other.

Once an address is put in the ARP cache it is maintained for a set amount of time. By default, an entry has a two-minute timer that is reset every time the destination MAC is used, up to a total of 10 minutes. After 10 minutes the entry is discarded even if it is still in use, and the address must be resolved again with a new ARP request. If the destination MAC has not been used for two minutes, the entry is discarded.
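
The two timers can be a little confusing, so here is a small Python sketch of the aging rule as described above. It is a simplified model of the described behavior, not the actual Windows implementation; the class and constant names are made up, and time is passed in as plain seconds so the logic is easy to follow.

```python
IDLE_LIMIT = 2 * 60     # seconds an entry may sit unused
MAX_LIFETIME = 10 * 60  # seconds an entry may live, even if it is in use


class ArpCacheEntry:
    """Simplified model of one ARP cache entry and its two timers."""

    def __init__(self, mac: str, now: float):
        self.mac = mac
        self.created = now
        self.last_used = now

    def is_valid(self, now: float) -> bool:
        """An entry survives only while both timers are still running."""
        idle = now - self.last_used
        age = now - self.created
        return idle < IDLE_LIMIT and age < MAX_LIFETIME

    def use(self, now: float) -> bool:
        """Use the entry to send a frame; returns False if a new ARP is needed."""
        if not self.is_valid(now):
            return False
        self.last_used = now  # resets the two-minute idle timer
        return True


entry = ArpCacheEntry("00-11-22-33-44-55", now=0)
print(entry.use(100))  # used within two minutes, so the idle timer is reset
print(entry.is_valid(330))  # unused since t=200? no, since t=100: long expired
```

An entry that is used every minute still stops being valid at the 10 minute mark, which is why you will see a fresh ARP request in a capture even between two systems that are talking constantly.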

It is important to remember that in order for ARP to work, the requester must already have a destination IP address that it will request the MAC address for. The IP address for the destination may be entered manually or may be discovered through name resolution.

Note that starting with Windows Vista we no longer refer to this cache as the ARP cache. We will discuss this further later in this blog post.

Address conflict detection (Gratuitous ARP)

ARP is also used to detect IP address conflicts. Address conflict detection is used to ensure that a system that is brought up on the network, or that is assigned a new IP address, does not have an address that conflicts with a system already on the network.

In address conflict detection, we use what is known as a Gratuitous ARP. When a system is configured with an IP address, either manually or by DHCP, it will send a Gratuitous ARP to ensure that another node on the network is not already configured with this IP address. In the case of a conflict the two nodes are defined as follows. The Offending Node is the node that is sending the gratuitous ARP, and the Defending Node is a system already configured with the IP address in question. The contents of this request, and how it affects the ARP cache on other systems on the network, differ depending on the OS.

XP and 2003

In Windows XP and Windows Server 2003 the Gratuitous ARP request is sent with the Sender's MAC set to the MAC of the sending system and the Target MAC set to 0's, but the Sender's and Target IP addresses are both set to the address of the sending system. If a conflict is detected then the defending system replies with its IP and MAC address.

Example:

ARP2

The problem with this method is that all the nodes that receive this broadcast and have an ARP cache entry for this IP address will update their ARP cache with invalid data. So the defending node will now need to send its own Gratuitous ARP to correct the cache on the other systems on the network. Because of this, starting with Windows Vista the Gratuitous ARP is handled differently.

Vista and 2008

In Windows Vista and Windows Server 2008, ARP Cache is now known as Neighbor Cache. The ARP -a command will still display the legacy ARP Cache and we can still add static ARP entries.

Neighbor Cache

The contents of the neighbor cache can be displayed with the following netsh command.

netsh interface ipv4 show neighbors

When this command is run you will notice that we have different states for neighbors. The following states are possible:

  • Incomplete - Address resolution is in progress. This would indicate that an ARP request has been made but the node has not received the response yet.
  • Reachable - The ARP reply has been received.
  • Unreachable - The node did not receive a response to the address resolution request.
  • Stale - The reachable time has elapsed. This indicates that a frame has not been sent to the neighbor within the time out period. The entry will remain in this state until a frame is sent to this neighbor.
  • Probe - Reachability confirmation is in progress for a neighbor cache entry that was in a stale state.
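
If you want a quick summary of how many neighbors are in each state, the output of the netsh command can be tallied with a few lines of Python. This is only a sketch: the sample text below is made up, and its column layout (IP address, hyphenated MAC, state) is an assumption about what the command prints, so adjust the pattern if your output looks different.

```python
import re
from collections import Counter

# Made-up sample; the layout is an assumption about the netsh output.
SAMPLE_OUTPUT = """\
Interface 11: Local Area Connection

Internet Address                              Physical Address   Type
--------------------------------------------  -----------------  -----------
192.168.0.1                                   00-11-22-33-44-55  Reachable
192.168.0.10                                  00-11-22-33-44-66  Stale
192.168.0.100                                 00-11-22-33-44-77  Reachable
"""

# One row: IPv4 address, hyphenated MAC address, then the state name.
ROW = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"((?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2})\s+"
    r"(\w+)\s*$"
)


def count_neighbor_states(text: str) -> Counter:
    """Count neighbor cache entries per state, skipping headers and blank lines."""
    counts = Counter()
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            counts[match.group(3)] += 1
    return counts


print(count_neighbor_states(SAMPLE_OUTPUT))
```

On a live system you could feed the function the text captured from the netsh command instead of the sample string.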

Duplicate Address Detection

In Windows Vista and Windows Server 2008 there are some built in protections that reduce the chance of the Neighbor cache getting updated with incorrect information. This also helps keep the requesting system from incorrectly updating other systems.

ARP3

Changes to ARP cache updating

First, a Windows Vista or Windows Server 2008 system will not update its Neighbor cache when it receives a broadcast ARP unless that broadcast is an ARP request addressed to the receiver itself. This means that when a gratuitous ARP is sent on a network with Windows Vista and Windows Server 2008 systems, these systems will not update their caches with incorrect information if there is an IP address conflict.

Additionally, when a gratuitous ARP is sent by a Windows Vista or Windows Server 2008 system, the following change has been made: the SPA field in the initial request is set to 0.0.0.0. This way the ARP or neighbor caches of systems receiving this request are not updated. So, if there is a duplicate IP address, the receivers do not need to have their cache corrected.
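
To see why setting the SPA to 0.0.0.0 matters, here is a small Python sketch that compares the two styles of gratuitous ARP against a simplified receiver. The receiver below models the older behavior described for XP and 2003 (overwrite any existing cache entry for the sender's IP); it is an illustration, not the Windows code. The function names and addresses are made up, and setting the TPA to the address being checked in the Vista-style request is my assumption based on the ARP probe described in RFC 5227.

```python
ZERO_MAC = "00-00-00-00-00-00"


def gratuitous_arp(mac: str, ip: str, vista_style: bool) -> dict:
    """Build the four address fields of a gratuitous ARP request."""
    return {
        "SHA": mac,
        # XP/2003 style announces its own IP; Vista/2008 style sends 0.0.0.0.
        "SPA": "0.0.0.0" if vista_style else ip,
        "THA": ZERO_MAC,
        "TPA": ip,  # assumption: the address being checked for conflicts
    }


def receiver_cache_after(cache: dict, packet: dict) -> dict:
    """Simplified receiver: refresh an existing entry for the sender's IP."""
    updated = dict(cache)
    if packet["SPA"] in updated:
        updated[packet["SPA"]] = packet["SHA"]
    return updated


# The defending node legitimately owns 192.168.0.50.
cache = {"192.168.0.50": "00-11-22-33-44-55"}
offender_mac = "00-11-22-AA-BB-CC"

xp_result = receiver_cache_after(
    cache, gratuitous_arp(offender_mac, "192.168.0.50", vista_style=False))
vista_result = receiver_cache_after(
    cache, gratuitous_arp(offender_mac, "192.168.0.50", vista_style=True))
print("after XP-style request:   ", xp_result)
print("after Vista-style request:", vista_result)
```

With the older style the bystander's cache now points at the offending node until the defending node corrects it; with the newer style there is no entry for 0.0.0.0 to overwrite, so nothing changes.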

Proxy ARP

There will be times when a system needs to resolve the MAC address of a system that is not reachable within the Broadcast Domain. When this happens, another device on the network can answer the ARP request. This is known as Proxy ARP: the answering of ARP requests on behalf of another system. One example of this is when a remote client connects to a Windows Routing and Remote Access (RRAS) server. When the client connects to a RAS server it is assigned an IP address by the server, and the server keeps track of which client was assigned which IP address. When clients on the internal network and remote clients attempt to communicate with each other, the RAS server will use Proxy ARP to reply with its own MAC address. As far as the client sending the ARP request is concerned, it has successfully resolved the IP to the MAC of the remote client. In the example, the LAN client is sending an ARP request for the IP of the Remote Access Client. Notice that the ARP reply comes from the RAS server using its own MAC.

ARP4
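
The decision the RAS server makes in this example can be sketched in a few lines of Python. This is a simplified illustration of the idea, not the RRAS implementation; the function name, the addresses, and the set of assigned client IPs are all made up.

```python
def proxy_arp_reply(request: dict, server_mac: str, remote_client_ips: set):
    """Answer on behalf of a remote client, or return None to stay silent."""
    if request["TPA"] not in remote_client_ips:
        return None  # not an address this server handed out
    return {
        "SHA": server_mac,       # the server's own MAC, not the remote client's
        "SPA": request["TPA"],   # the IP the LAN client asked about
        "THA": request["SHA"],   # reply goes back to the requester
        "TPA": request["SPA"],
    }


# Hypothetical addresses: one remote access client behind the server.
remote_clients = {"192.168.0.200"}
lan_request = {
    "SHA": "00-11-22-33-44-55",
    "SPA": "192.168.0.10",
    "THA": "00-00-00-00-00-00",
    "TPA": "192.168.0.200",
}
reply = proxy_arp_reply(lan_request, "00-11-22-AA-BB-CC", remote_clients)
print(reply)
```

The LAN client caches the server's MAC for the remote client's IP and sends its frames to the server, which then forwards the traffic over the remote access connection.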

Review
  • As we mentioned, there must be a framework to structure data sent on the wire. By default in a Windows OS, ARP requests are sent using the Ethernet II frame format described in RFC 894.
  • MAC addresses are layer 2 addresses and ARP is used to match this address to the IP or layer 3 address of a system.
  • Gratuitous ARP is used to detect IP address conflicts on the local network.
  • Starting with Windows Vista, we changed the contents of the Gratuitous ARP so that the SPA is now set to all 0's. This prevents other nodes from incorrectly updating their cache.
  • Proxy ARP is the answering of ARP Requests on behalf of another system. There are several instances when this may be used, but one example is a Windows RAS server.

Next time we will discuss IP routing and get deeper into IP addresses.

- Clark Satter

Stopping the Windows Authenticating Firewall Service and the boot time policy

Lately, I have been seeing a number of issues/concerns from people where they manually stop the Firewall service and lose connectivity to the machine. They always seem surprised when I explain that it is by design.

A little History

In versions of Windows XP prior to Windows XP SP2, there is a window of time between when the network stack starts and when the Windows Firewall Service (ICF) starts to provide protection. The firewall driver does not start to filter TCP/IP packets until the service is loaded and the appropriate policy is applied. The firewall service depends on several functions and must wait until those functions clear before the service pushes the policy to the driver. During this window of time, a packet could be received and delivered to a service without being filtered. This could potentially leave the computer vulnerable to an attack by exposing ports that would otherwise be protected by the firewall.

Note: The length of this window depends on the speed of the computer.

In Windows XP SP2, the firewall driver has a new static policy rule called the boot-time policy. The boot-time policy performs stateful filtering and eliminates the window of vulnerability when the computer is starting. The boot-time policy enables the computer to open ports so that basic networking tasks such as Domain Name System (DNS) and Dynamic Host Configuration Protocol (DHCP) can occur. The boot-time policy also enables the computer to communicate with a domain controller to obtain appropriate policies. As soon as the firewall service is running, the run-time policy is loaded, applied, and the boot-time filters are removed. The boot-time policy cannot be configured.

There was another security feature added so that if the firewall service is stopped or crashes, the boot-time filters are again loaded to protect the computer. This would prevent an attacker from crashing the firewall service and exposing the machine.

This can cause confusion if you are not aware of it and try to simply stop the firewall service to eliminate it as a potential cause while troubleshooting a connectivity issue.

In Windows Vista the boot-time policy functions the same as it does in Windows XP SP2 except that the service is MPSSVC.

Manually stopping the Windows Authenticating Firewall Service

There are multiple ways to manually stop the Windows Firewall:

  • In the Firewall CPL in control panel
  • In the Advanced Firewall MMC
  • In the Services Manager MMC
  • Netsh Firewall set opmode disable
  • Net stop MPSSVC or Net stop SharedAccess (depending on the OS)

A common way to stop the firewall service as a test is to use Net stop MPSSVC (for Windows Vista) or Net stop SharedAccess (for Windows XP), but both of these will cause the boot-time filters to load. The proper way to completely stop the firewall is to set the service to Disabled in Services Manager and then stop the service through one of the GUIs or Netsh. This prevents the boot-time filters from loading when the firewall service is stopped.

clip_image002

Figure 1. Setting the firewall service to disabled in Services Manager.
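
The behavior described in this post boils down to a small decision table, which I have sketched here in Python. This is just a model of the description above, not something that queries or changes a real system; the function names and the string values are made up.

```python
def policy_after_stop(startup_type: str) -> str:
    """Which filtering applies once the firewall service has been stopped."""
    if startup_type.lower() == "disabled":
        return "none"       # boot-time filters are not loaded
    return "boot-time"      # filters reload to protect the machine


def active_policy(service_running: bool, startup_type: str) -> str:
    """Which policy the firewall driver is applying right now."""
    if service_running:
        return "run-time"
    return policy_after_stop(startup_type)


# "net stop" with the service still set to Automatic: boot-time filters load.
print(active_policy(service_running=False, startup_type="Automatic"))
# Service set to Disabled first, then stopped: nothing is filtered.
print(active_policy(service_running=False, startup_type="Disabled"))
```

The first case is the one that surprises people: the service is stopped, yet the machine becomes harder to reach, not easier.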

Additional considerations for Windows Vista

IPSec and Windows Vista

It is worth noting that when you stop the MPSSVC service, IPSec policies are no longer in effect.

This could be a potential issue for third-party firewall services that want to replace the Windows Firewall but don't provide IPSec functionality. The recommended way to resolve this situation is to set the firewall to allow all traffic and leave the service running. Microsoft provides an API call that third-party services can use to stop the Firewall Service. This call sets the firewall to allow all traffic while leaving the service running so IPsec can still function and is the expected method for third-parties to use.

Stopping the firewall Service should only be a test

Microsoft does not recommend you stop the firewall service (or a third-party firewall service) except for troubleshooting even if you are behind another edge/perimeter firewall. If another machine on the local subnet gets infected, a machine that is not running a host firewall is vulnerable.

- David Pracht

Having an IP Address conflict?

I just wanted to provide a quick “what to do” when having an IP address conflict. I’ve had a few of these cases this year. I’m not sure that one needs to open a support case with us to resolve this. In some instances, critical applications/resources are down due to IP address conflicts. I’ll provide a few steps one should take in troubleshooting this situation.

IP address conflicts happen within a subnet. I had a call recently in which a cluster resource wouldn’t come online. When the cluster resource attempted to come online, it issued an ARP request for its IP address. In a working scenario with no conflict, there would be no response. In this instance, a VMware server responded. The cluster resource reported the conflict and failed to initiate. The customer was unable to locate the VMware server, so they changed the IP address on the cluster resource.

Basic steps:

  1. Run IPConfig at a Command Prompt -- is the IP address in conflict assigned to the machine you're on?
  2. Ping the IP address in conflict (use "ping -a" as the name may resolve). Pinging with the –a switch causes a reverse DNS lookup. There is a chance the conflicting host registered the address in DNS.
  3. Run the "ARP -a" command and locate the address in conflict (even if ping doesn't work).
  4. Go to http://standards.ieee.org/regauth/oui/index.shtml and enter the first three bytes of the MAC address. This will identify the manufacturer of the network adapter, which helps narrow down and locate the conflicting host.

It helps to keep a record of the MAC addresses in your environment. You could keep the list in Excel, or in a database that you can query with SQL for the MAC returned by the "ARP -a" command. Monitoring software such as System Center Operations Manager will also record and store MAC addresses.
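
If you do keep such a record, matching it against the ARP output is easy to script. Here is a minimal Python sketch that pulls the MAC for the conflicting IP out of "ARP -a" style text and looks it up in an inventory. The sample output, the inventory contents, the host name, and the function names are all made up for this example, and the column layout is an assumption about how the command prints its table.

```python
import re

# Made-up sample of "ARP -a" style output.
SAMPLE_ARP_OUTPUT = """\
Interface: 192.168.0.5 --- 0xb
  Internet Address      Physical Address      Type
  192.168.0.10          00-11-22-33-44-55     dynamic
  192.168.0.20          00-11-22-aa-bb-cc     dynamic
"""

# Hypothetical inventory: MAC address -> host name.
INVENTORY = {"00-11-22-AA-BB-CC": "LAB-VM-07"}

ARP_ROW = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"((?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2})\s+(\w+)"
)


def mac_for_ip(arp_output: str, ip: str):
    """Return the MAC listed for ip, upper-cased, or None if it is not listed."""
    for line in arp_output.splitlines():
        match = ARP_ROW.match(line)
        if match and match.group(1) == ip:
            return match.group(2).upper()
    return None


def identify_host(arp_output: str, ip: str, inventory: dict):
    """Look up the host that owns the MAC currently answering for ip."""
    mac = mac_for_ip(arp_output, ip)
    if mac is None:
        return None
    return inventory.get(mac, "unknown host with MAC " + mac)


print(identify_host(SAMPLE_ARP_OUTPUT, "192.168.0.20", INVENTORY))
```

When the MAC is not in your records, the OUI lookup from step 4 is the next best clue.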

Hopefully this saves you a support call down the road. Archive it if you must. Inevitably, you’ll have an IP address conflict. This will save you some time.

- Rich Chambers
