I'm sorry for not getting these posted sooner; I have been out of the office on and off over the past several weeks. I will publish a few of these posts in a row to get caught up again. Thanks!
955860 High CPU usage occurs for six seconds on a Windows Server 2003-based computer that has multiple network adapters
953731 Device Manager displays an exclamation point next to the Generic Packet Classifier device on a Windows Server 2003-based computer that uses the iSCSI Initiator boot version
958963 Microsoft Security Advisory: Exploit code published affecting the server service
954442 Applications or services that rely on the Named Pipe File System may encounter latency or time-out issues when many connections are made to a named pipe in Windows Server 2008 or in Windows Vista SP1
954420 Servers in a Network Load Balancing (NLB) failover cluster cannot be used as print servers in Windows Server 2008
- Mike Platts
BALANCING ACT
Dual-NIC NLB Configuration with Windows Server 2008 NLB Clusters
We’ve had a few calls from customers who have run into a particular issue after deploying NLB on a Windows Server 2008 cluster. Most of them had older NLB deployments and assumed that a change in Windows Server 2008 NLB caused the problem. The affected installations have dual-NIC nodes with the default gateway on the outbound NIC. This is the reported behavior:
What does that look like?
In some cases, you might want to keep your default gateway on a 2nd NIC in order to have all inbound traffic use one interface and outbound traffic use another, as shown in the diagram below:
In Windows Server 2003, a packet from the client would route in through the inbound NIC and, because the client was not on the same subnet, the response would be sent back via the outbound NIC to the default gateway and on to the client. The problem with the above configuration on a 2008 server is that IP forwarding is disabled by default. Therefore, when the packet enters the inbound NIC, which has no default gateway, the response has no way to leave the subnet and the packet is dropped.
Does that mean it won’t work in Windows Server 2008?
A simple change gets this to work without putting the default gateway on the cluster NIC. You need to enable routing using one of the following two methods, via netsh or via the registry:
Via netsh:
Admin State    State          Type             Interface Name
-------------------------------------------------------------------------
Enabled        Connected      Dedicated        Cluster NIC
Interface Cluster NIC Parameters
----------------------------------------------
IfLuid                             : ethernet_5
IfIndex                            : 10
Compartment Id                     : 1
State                              : connected
Metric                             : 20
Link MTU                           : 1500 bytes
Reachable Time                     : 30000 ms
Base Reachable Time                : 30000 ms
Retransmission Interval            : 1000 ms
DAD Transmits                      : 3
Site Prefix Length                 : 64
Site Id                            : 1
Forwarding                         : enabled
Advertising                        : disabled
Neighbor Discovery                 : enabled
Neighbor Unreachability Detection  : enabled
Router Discovery                   : dhcp
Managed Address Configuration      : enabled
Other Stateful Configuration       : enabled
Weak Host Sends                    : disabled
Weak Host Receives                 : disabled
Use Automatic Metric               : enabled
Ignore Default routes              : disabled
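If you want to check the Forwarding setting on several nodes without reading the listing by eye, a small script can do it. The following is only a minimal sketch in Python: it assumes you have already saved the verbose interface listing as text, one parameter per line, and the function names are hypothetical helpers, not part of netsh.

```python
def parse_interface_parameters(text):
    """Parse 'Name : Value' lines from an interface listing into a dict."""
    params = {}
    for line in text.splitlines():
        # Split on the first colon only; title and dashed lines have no colon.
        name, sep, value = line.partition(":")
        if sep and name.strip():
            params[name.strip()] = value.strip()
    return params


def forwarding_enabled(text):
    """Return True when the Forwarding parameter reads 'enabled'."""
    params = parse_interface_parameters(text)
    return params.get("Forwarding", "").lower() == "enabled"


# Shortened sample in the same layout as the listing above.
SAMPLE = """\
Interface Cluster NIC Parameters
----------------------------------------------
IfLuid                             : ethernet_5
IfIndex                            : 10
State                              : connected
Forwarding                         : enabled
Weak Host Sends                    : disabled
"""

print(forwarding_enabled(SAMPLE))
```

A node whose listing shows Forwarding as disabled would return False here, which is the condition that leads to the dropped packets described earlier.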
Via the registry:
Add the following value:
Key name: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
Value Name: IpEnableRouter
Data Type: REG_DWORD
Value: 1
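If you need to apply the same value to several nodes, one option is to generate a .reg import file rather than typing the value by hand. This is a minimal sketch in Python, not a supported tool; build_reg_file is a hypothetical helper name, and the sketch only produces the text of the file, it does not touch the registry.

```python
def build_reg_file(key, value_name, dword_value):
    """Return .reg import text that sets a single REG_DWORD value."""
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{key}]\r\n"
        # REG_DWORD values are written as eight hexadecimal digits.
        f'"{value_name}"=dword:{dword_value:08x}\r\n'
    )


KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

print(build_reg_file(KEY, "IpEnableRouter", 1))
```

Save the printed text to a file with a .reg extension and review it before importing it on a node.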
Hopefully, this blog will provide you with a quick fix for your new Server 2008 NLB deployment!
- Michael Rendino and Pete Sullivan
Have you ever run into a problem where you are troubleshooting a network connectivity issue with a network capture utility and see only the three-way handshake? This can happen with Netmon 2.x, Netmon 3.x, Wireshark, Ethereal and most other network capture utilities.
It is relatively common knowledge that this will happen when TCP Chimney offload is enabled, but disabling it via the registry or netsh doesn’t always resolve the problem. TCP Chimney offload enables TCP/IP processing to be offloaded to network adapters that can handle the TCP/IP processing in hardware. The use of TCP Chimney offload causes traffic to be delivered at a lower layer of the TCP/IP stack than most network capture utilities listen on.
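To make the symptom concrete, here is a minimal sketch in Python that flags connections in a capture where nothing but the handshake was recorded. The input format, a list of (connection id, flags, payload length) tuples, is a made-up simplification for illustration; it is not the export format of Netmon or Wireshark.

```python
from collections import defaultdict

HANDSHAKE = ["SYN", "SYN-ACK", "ACK"]


def handshake_only_connections(packets):
    """Return ids of connections whose captured packets are only the handshake.

    packets: iterable of (conn_id, flags, payload_len) tuples in capture order.
    """
    by_conn = defaultdict(list)
    for conn_id, flags, payload_len in packets:
        by_conn[conn_id].append((flags, payload_len))

    suspicious = []
    for conn_id, seen in by_conn.items():
        flags = [f for f, _ in seen]
        has_payload = any(length > 0 for _, length in seen)
        if flags == HANDSHAKE and not has_payload:
            suspicious.append(conn_id)
    return suspicious


# Hypothetical capture: connection "a" stops after the handshake,
# connection "b" goes on to carry data.
capture = [
    ("a", "SYN", 0), ("a", "SYN-ACK", 0), ("a", "ACK", 0),
    ("b", "SYN", 0), ("b", "SYN-ACK", 0), ("b", "ACK", 0),
    ("b", "PSH-ACK", 512),
]

print(handshake_only_connections(capture))
```

If an application is clearly exchanging data but every connection in the trace looks like connection "a", offload is a likely suspect.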
The initial troubleshooting for this type of issue is to turn off TCP Chimney Offload via Netsh as follows. The benefit of this is that it does not require a reboot.
To turn off TCP Chimney by using the Netsh.exe tool, follow these steps:
However, if this does not change what is shown in a network capture, you should then move forward with disabling all of the features of the Scalable Networking Pack as documented in Knowledge Base article 948496, “An update to turn off default SNP features is available for Windows Server 2003-based and Small Business Server 2003-based computers”.
To manually disable RSS, NetDMA and TCP Offload, follow these steps:
Disabling Chimney with netsh and changing the registry values above will allow you to see all the traffic in most cases, but not always. You may also need to look at the features related to TCP Chimney offload available on the network card. To access these options, click the Configure button on the General tab of the adapter’s properties. This will bring up a window similar to what is displayed below. The Advanced tab is where the changes will be made.
The configurable options available vary depending on how the vendor implements the driver for Windows. Many network cards have features including Receive Side Scaling, TCP Checksum Offload and TCP Large Send Offload. Disabling the offload features of the network card will allow you to view all of the traffic in many cases where disabling the Scalable Networking Pack features in the OS doesn’t work. You should refer to the vendor’s documentation for specific steps on how to disable these features.
As a last resort you may have to disable chimney from a hardware perspective. Refer to the vendor’s documentation for specific information on how to disable offload features. Possible ways to do this vary, and may include settings on the NIC, jumpers on the motherboard, and/or configuration in System BIOS.
- Michael Vargo
When a demand dial connection is set up between two RRAS servers, each server receives an address from the pool of available addresses on the server it is connecting to. When Server 2003 is used on both ends of the demand dial connection, you can ping from each server to the IP address assigned to the other server. When Server 2008 is used on either or both ends of the demand dial connection, this ping will fail. This is because Server 2008 RRAS does not add the host route to its local routing table that Server 2003 adds.
As a best practice, a server hosting RRAS should contain two NICs and be dedicated to that role. This keeps the networking simple and, if the server is compromised, keeps it a step away from sensitive data that may exist on other servers. In reality this is not always done, and Microsoft does support most single-NIC and multiple-role servers where RRAS is involved. However, if resources located on the RRAS server must be reached over the demand dial connection, that access will fail when Server 2008 is used. This write-up supplies a workaround that makes those resources available across the demand dial connection.
To set up a Demand Dial (also known as Dial on Demand, or simply DoD) connection, the general steps below are followed: (More information can be found at http://technet.microsoft.com/en-us/network/bb545655.aspx, http://technet.microsoft.com/en-us/library/cc779726.aspx, and Knowledge Base article 159684)
Example setup for workaround:
Following are the parameters of our DoD connections.
RRAS1 – Server 2003 on domain corp1.local
RRAS2 – Server 2008 on domain corp9.local
Summary: The route that is added in step 7 is the route that 2008 RRAS leaves out by design. Steps 5e for RRAS1 and RRAS2 are used to assign a static IP address to the DoD interfaces to guarantee the route added in Step 7 will work. Steps 5e and 7 are required to make this setup work to allow you to access resources on the 2008 server from the 2003 server. Without these steps if you do a ping from the 2003 server to the 2008 server you will see a response of “Request timed out”. If you try to connect to a file share on the 2008 server the response will be a pop-up with the message of “No network provider accepted the given network path.”
- Barry McGugan
Given the following scenario:
When the Exchange Service is first failed over, the Client Access Point (IP Address and Network Name) updates DNS with the correct IP Address for the new active node. However, each subsequent failover may delay the DNS Registration update for up to 10 minutes. These delays can cause clients to lose connectivity with Exchange and we all know that means trouble.
The behavior works as follows. When you bring the Client Access Point online for the first time, it registers itself with a DNS Server listed in the local machine’s TCP/IP Properties. It also records a timestamp for a successful registration under the private properties of the Network Name.
C:\> Cluster res "Cluster Name" /prop

Resource     Name              Value
------------ ----------------- ----------------------
Cluster Name LastDNSUpdateTime 10/20/2008 11:46:39 AM
If a failover (or an offline/online cycle) occurs, the Cluster Service checks this timestamp. If it is within one hour (60 minutes) of the last registration time, the Client Access Point on the node that is becoming active waits 10 minutes before registering the new IP address in DNS. Unlike previous versions, Windows Server 2008 does not continuously write to a Cluster log file. You can generate a Cluster log with the command cluster log /gen, and one will be created on each node of the Cluster in the C:\WINDOWS\CLUSTER\REPORTS folder. In this CLUSTER.LOG, you will see an entry similar to this:
Client Access Point registers when comes online
------------------------------------------------------------------
2008/07/16-21:33:14.405 INFO [RES] Network Name <2008-Cluster>: Bringing resource online...
2008/07/16-21:33:14.405 INFO [RES] Network Name <2008-Cluster >: TimerQueueTimer rescheduled to fire after 600 secs
2008/07/16-21:33:20.254 INFO [RES] Network Name <2008-Cluster >: Re-registering DNS records time period (4 secs ) between last registration and now is greater than 86400
2008/07/16-21:33:26.593 INFO [RES] Network Name <2008-Cluster >: Network Name 2008-Cluster is now online
The above registration was successful when the resource came Online, so it updated the LastDNSUpdateTime value with the current time. If you then move the Exchange Service Application to the other node, you will see the delay occur as it will look at the LastDNSUpdateTime value and postpone registration if it is within the time period:
Client Access Point delays 10 minutes
-----------------------------------------------------
2008/07/16-21:53:19.084 INFO [RES] Network Name <2008-Cluster>: Bringing resource online...
2008/07/16-21:53:19.084 INFO [RES] Network Name <2008-Cluster >: TimerQueueTimer rescheduled to fire after 600 secs
2008/07/16-21:53:26.174 INFO [RES] Network Name <2008-Cluster >: Postponing DNS registrations to post online...
2008/07/16-21:53:32.302 INFO [RES] Network Name <2008-Cluster >: Network Name JOHNGROUP is now online
*** 10 minutes later ***
2008/07/16-22:03:19.074 INFO [RES] Network Name <2008-Cluster >: Re-registering DNS records time period (4 secs ) between last registration and now is greater than 86400
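If you are working through a long CLUSTER.LOG, a script can pull out the postponement for you. The following is a minimal sketch in Python that assumes the log has one entry per line in the layout shown above; the regular expression and the function name are our own and only cover the Network Name entries quoted in this post.

```python
import re
from datetime import datetime

# Matches entries such as:
# 2008/07/16-21:53:26.174 INFO [RES] Network Name <2008-Cluster >: message
ENTRY = re.compile(
    r"^(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+\w+\s+\[RES\]\s+"
    r"Network Name <[^>]*>:\s+(.*)$"
)


def dns_registration_delay(log_text):
    """Seconds from 'Postponing DNS registrations' to the next re-registration.

    Returns None when the log does not contain that pair of entries.
    """
    postponed_at = None
    for line in log_text.splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y/%m/%d-%H:%M:%S.%f")
        message = match.group(2)
        if message.startswith("Postponing DNS registrations"):
            postponed_at = stamp
        elif message.startswith("Re-registering DNS records") and postponed_at:
            return (stamp - postponed_at).total_seconds()
    return None


LOG = """\
2008/07/16-21:53:19.084 INFO [RES] Network Name <2008-Cluster>: Bringing resource online...
2008/07/16-21:53:26.174 INFO [RES] Network Name <2008-Cluster >: Postponing DNS registrations to post online...
2008/07/16-22:03:19.074 INFO [RES] Network Name <2008-Cluster >: Re-registering DNS records time period (4 secs ) between last registration and now is greater than 86400
"""

print(dns_registration_delay(LOG))
```

A result close to 600 seconds is the 10-minute postponement; a log in which the name registers as it comes online has no "Postponing" entry and returns None.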
In previous versions of Windows Cluster Server, a Network Name registered with DNS every time it came online. When the resource goes offline and online repeatedly, this becomes very “chatty” with the DNS Servers, and any delay in the registration process delays the Network Name from coming online. Because of this, the behavior was changed to cut down this DNS traffic and to make the name online process a little more streamlined.
This could cause quite a problem if you had issues with nodes failing over on a regular basis, but for the most part, failing over more than once an hour is not the norm.
Windows 2008 Failover Clusters allow for nodes to be placed in different subnets to better allow for multi-site configurations. With the newer “streamlined” Network name online process, this can cause a delay for clients needing to find the new IP Address. After the Client Access Point changes the IP address in DNS, client machines will not be able to find it until their DNS cache flushes the old IP address. There is a Knowledge Base article that addresses this side of the issue:
Description of what to consider when you deploy Windows Server 2008 failover cluster nodes on different, routed subnets
The server side delay of 10 minutes for re-registration is by design and not configurable. This will have to be factored in when you fail over nodes for maintenance, etc. Looking at this from a bigger picture, if your Service/Application is failing over more than once per hour, you probably have bigger issues than registration in DNS.
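When you plan maintenance failovers, it can help to estimate when the name will actually register. Here is a minimal sketch in Python of the rule as described in this post (a 60-minute window and a 10-minute postponement). The function name is hypothetical, treating exactly 60 minutes as still inside the window is our assumption, and the real delay may be shorter since the post describes it as "up to" 10 minutes.

```python
from datetime import datetime, timedelta

RECENT_WINDOW = timedelta(minutes=60)  # "within one hour" of the last registration
POSTPONE_BY = timedelta(minutes=10)    # registration waits 10 minutes


def expected_dns_registration(last_update, online_at):
    """Estimate when the name registers in DNS after coming online.

    last_update: the LastDNSUpdateTime value of the Network Name.
    online_at:   when the resource comes online on the new node.
    """
    # Assumption: the boundary case (exactly 60 minutes) counts as recent.
    if online_at - last_update <= RECENT_WINDOW:
        return online_at + POSTPONE_BY
    return online_at


# LastDNSUpdateTime taken from the sample output earlier in this post.
last = datetime(2008, 10, 20, 11, 46, 39)

# Failover 20 minutes after the last registration: postponed.
print(expected_dns_registration(last, datetime(2008, 10, 20, 12, 6, 39)))

# Failover more than an hour later: registers as the name comes online.
print(expected_dns_registration(last, datetime(2008, 10, 20, 13, 0, 0)))
```

Keep in mind that this only covers the server side; clients still have to flush the old address from their DNS cache.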
- Steven Martin