Unable to connect to Windows Server 2008 NLB Virtual IP Address from hosts in different subnets when NLB is in Multicast Mode

If you work with NLB in Multicast mode on Windows Server 2008, this is a must read for you, as you may run into this issue.

Symptoms: When NLB is running in Multicast Mode on Windows Server 2008, you cannot connect to the NLB Virtual IP Address (VIP) from machines on other subnets.  You will see “Request Timed Out” errors when you ping the VIP from a client on a different subnet.

Steps to reproduce the issue

In the following scenario, the client is on subnet 172.32.x.x/24 and the Windows Server 2008 NLB node is on subnet 172.33.x.x/24:

image

Note:  A static ARP entry that maps the Virtual IP Address to the Multicast Virtual MAC Address is in place on the router. (This is generally required, because most routers will not accept an ARP reply that maps a Unicast IP Address to a Multicast MAC Address.)

  1. Clear the ARP cache on the NLB node using the command arp -d *
  2. Try to ping the NLB VIP from the client machine; this ping will fail.
  3. Try to ping the dedicated IP address from the client machine; this works fine.
  4. Again try to ping the NLB VIP from the client machine; this ping works fine.
  5. Clear the ARP cache on the NLB node and try to ping the NLB VIP again; the ping will fail. This shows that there is an ARP issue on the NLB node.
  6. Clear the ARP cache again on the NLB node as in step 1.
  7. Start a Network Monitor network capture on the NLB node.
  8. Try to ping the NLB VIP; the ping request will fail.
  9. Stop the network capture.

Analysis of the network capture will show that the NLB Node sends an ARP Request to the IP address of the Default Gateway, but there is no response.

In the ARP request, we see the Sender IP address as 172.33.X.100 (VIP), Sender MAC address as 03-bf-ac-20-10-64 (Cluster Multicast MAC), Target IP address as 172.33.X.1 (default gateway) and Target MAC address as 00-00-00-00-00-00. See the below ARP packet:

- Arp: Request, 172.33.X.100 asks for 172.33.X.1
    HardwareType: Ethernet
    ProtocolType: Internet IP (IPv4)
    HardwareAddressLen: 6 (0x6)
    ProtocolAddressLen: 4 (0x4)
    OpCode: Request, 1(0x1)
    SendersMacAddress: 03-bf-ac-20-10-64   <-- Multicast MAC address
    SendersIp4Address: 172.33.X.100
    TargetMacAddress: 00-00-00-00-00-00
    TargetIp4Address: 172.33.X.1

The Router did not respond to the above ARP request, so the ARP resolution to Gateway IP fails, causing the ping to the Virtual IP address from the client to fail.
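As a side note on the Sender MAC in the capture: the sketch below decodes the IPv4 address embedded in an NLB multicast cluster MAC. It assumes the usual NLB convention that the multicast cluster MAC is 03-BF followed by the four octets of the cluster IP address, which this article does not spell out. The function name is made up for illustration, and because the addresses in this article are partially masked, the decoded value is only indicative.

```python
def nlb_multicast_mac_to_ip(mac: str) -> str:
    """Decode the IPv4 address embedded in an NLB multicast cluster MAC.

    Assumes the 03-BF-w-x-y-z layout, where w.x.y.z is the cluster IP.
    """
    octets = [int(part, 16) for part in mac.replace(":", "-").split("-")]
    if len(octets) != 6 or octets[:2] != [0x03, 0xBF]:
        raise ValueError(f"not an NLB multicast cluster MAC: {mac}")
    # The last four octets are the cluster IP address, one octet each.
    return ".".join(str(octet) for octet in octets[2:])


# Sender MAC taken from the ARP request shown above.
print(nlb_multicast_mac_to_ip("03-bf-ac-20-10-64"))
```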

Why does the router not respond to the ARP request?

The above ARP Request packet shows that the Sender's IP Address is Unicast while the Sender's MAC Address is Multicast (the Multicast Cluster MAC).  Most routers do not respond to ARP Requests that pair a Unicast Sender IP with a Multicast Sender MAC.  As a result, the NLB node never resolves the MAC address of the Gateway, cannot send its replies back toward the client's subnet, and the ping to the NLB Virtual IP Address fails.
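What makes a MAC address "Multicast" is the least significant bit of its first octet (the group bit). The following is a minimal sketch of that test, not something taken from the router's implementation; the function name and the second, unicast MAC are placeholders chosen for illustration.

```python
def is_multicast_mac(mac: str) -> bool:
    """Return True if the group (multicast) bit of the MAC address is set."""
    first_octet = int(mac.replace(":", "-").split("-")[0], 16)
    # The least significant bit of the first octet marks a group address.
    return bool(first_octet & 0x01)


# Sender MAC from the Windows Server 2008 capture: 0x03 has the group bit set.
print(is_multicast_mac("03-bf-ac-20-10-64"))
# Hypothetical interface MAC: 0x00 does not have the group bit set.
print(is_multicast_mac("00-1a-2b-3c-4d-5e"))
```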

If NLB is running in Unicast Mode, everything works fine.  The reason is that in Unicast NLB, the node sends ARP Requests with a Unicast Sender IP Address and a Unicast Sender MAC Address; the router will respond to ARP Requests such as these.

Allowed MAC Addresses (from Cisco’s website)

These destination MAC addresses are allowed through the transparent firewall. Any MAC address not on this list is dropped.

  • TRUE broadcast destination MAC address equal to FFFF.FFFF.FFFF
  • IPv4 multicast MAC addresses from 0100.5E00.0000 to 0100.5EFE.FFFF
  • IPv6 multicast MAC addresses from 3333.0000.0000 to 3333.FFFF.FFFF
  • BPDU multicast address equal to 0100.0CCC.CCCD
  • AppleTalk multicast MAC addresses from 0900.0700.0000 to 0900.07FF.FFFF

When a router receives the ARP Request from the NLB node, it has to send an ARP Response. The Destination MAC address of that ARP Response would be the Sender's MAC address from the request, that is, the Multicast Cluster MAC address, which is not in the above list of Allowed MAC addresses. This explains why the router drops the packet.
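To make the comparison concrete, here is a small sketch that checks a MAC address against the allowed ranges listed above. The ranges are copied from that list; the helper names are illustrative, and a real device applies this logic internally rather than through anything like this script.

```python
import string

# Allowed destination MAC ranges, copied from the list above (inclusive).
ALLOWED_RANGES = [
    ("ffff.ffff.ffff", "ffff.ffff.ffff"),  # broadcast
    ("0100.5e00.0000", "0100.5efe.ffff"),  # IPv4 multicast
    ("3333.0000.0000", "3333.ffff.ffff"),  # IPv6 multicast
    ("0100.0ccc.cccd", "0100.0ccc.cccd"),  # BPDU multicast
    ("0900.0700.0000", "0900.07ff.ffff"),  # AppleTalk multicast
]


def mac_to_int(mac: str) -> int:
    """Convert a MAC in dotted, dashed, or colon notation to an integer."""
    hex_digits = "".join(ch for ch in mac if ch in string.hexdigits)
    if len(hex_digits) != 12:
        raise ValueError(f"invalid MAC address: {mac}")
    return int(hex_digits, 16)


def is_allowed_destination(mac: str) -> bool:
    """Return True if the MAC falls inside one of the allowed ranges."""
    value = mac_to_int(mac)
    return any(mac_to_int(low) <= value <= mac_to_int(high)
               for low, high in ALLOWED_RANGES)


# The NLB Multicast Cluster MAC from the capture is outside every range.
print(is_allowed_destination("03-bf-ac-20-10-64"))
```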

Why does Multicast NLB work fine on Windows Server 2003?

In the following scenario, the client is on subnet 172.32.x.x/24 and the Windows Server 2003 NLB node is on subnet 172.33.x.x/24:

image

Steps to see the scenario succeed when Windows Server 2003 is used on the NLB node

  1. Clear the ARP cache on the NLB node using the command arp -d *
  2. Try to ping the NLB VIP from the client machine; this ping will succeed.
  3. Clear the ARP cache on the NLB node as in step 1.  This ensures that the network trace to be captured in the next steps will show ARP communication.
  4. Start a Network Monitor network capture on the NLB node.
  5. Try to ping the NLB VIP; the ping request will succeed.
  6. Stop the network capture.

Analysis of the network capture will show that the NLB Node sends an ARP Request to the IP address of the Default Gateway and the router sends the ARP response.

In the ARP request, we see the Sender IP address as 172.33.X.50 (Dedicated IP), Sender MAC Address as 01-0A-0B-0C-0D (Interface MAC), Target IP Address as 172.33.X.1 (Gateway), and the Target MAC Address as 00-00-00-00-00-00.  See the below ARP packet:

Arp: Request, 172.33.X.50 asks for 172.33.X.1 
    HardwareType: Ethernet
    ProtocolType: Internet IP (IPv4)
    HardwareAddressLen: 6 (0x6)
    ProtocolAddressLen: 4 (0x4)
    OpCode: Request, 1(0x1)
    SendersMacAddress: 01-0A-0B-0C-0D
    SendersIp4Address: 172.33.X.50
    TargetMacAddress: 00-00-00-00-00-00
    TargetIp4Address: 172.33.X.1

Since Windows Server 2003 sends the ARP Request with a Unicast Sender IP Address and Unicast Sender MAC (Interface MAC) Address, the router sends an ARP response.

What’s the difference between Windows Server 2003 and Windows Server 2008 NLB?

The functionality of NLB is the same on Windows Server 2008 as it is on Windows Server 2003.  TCP/IP functionality has been changed in Windows Server 2008.

In Windows Server 2003, assume that we have the Virtual IP address and Dedicated Primary IP on the interface.  Whenever you try to ping the Virtual IP address from a client, the Windows Server 2003 NLB node sends out an ARP request to the Default Gateway IP address.  This ARP Request always goes from the Primary IP Address, which is a dedicated IP address with a Unicast MAC (Interface MAC) Address.

On Windows Server 2008 NLB Nodes operating in Multicast Mode, the ARP request to the Default Gateway IP Address goes from the Virtual IP Address, with the Multicast MAC Address as the Sender's MAC Address.  The Router (Gateway Device) never responds to an ARP request that contains a Multicast MAC Address in the Sender's MAC Address field.

What is the workaround to resolve this issue?

We can add a static ARP table entry for the Default Gateway IP address on the NLB Node.

Command to add static ARP entry

arp -s <IP address> <MAC address>
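If you need to apply this on several NLB nodes, a small helper can validate the values before the command is run. This is only a sketch: the function name is made up, and the gateway IP and MAC in the example are placeholders that you must replace with the real values of your Default Gateway. It only builds the command line; running it is left to you.

```python
import ipaddress
import re

# MAC address in the dashed form that the Windows arp command expects.
MAC_PATTERN = re.compile(r"^[0-9A-Fa-f]{2}(-[0-9A-Fa-f]{2}){5}$")


def build_static_arp_command(gateway_ip: str, gateway_mac: str) -> list:
    """Build the 'arp -s' command for a static Default Gateway entry."""
    ipaddress.IPv4Address(gateway_ip)  # raises ValueError if malformed
    if not MAC_PATTERN.match(gateway_mac):
        raise ValueError(f"invalid MAC address: {gateway_mac}")
    if int(gateway_mac[:2], 16) & 0x01:
        # The gateway's own interface MAC should be a unicast address.
        raise ValueError("gateway MAC must be a unicast address")
    return ["arp", "-s", gateway_ip, gateway_mac]


# Placeholder values, not taken from the article.
print(" ".join(build_static_arp_command("172.33.0.1", "00-1a-2b-3c-4d-5e")))
```

The resulting list could be passed to subprocess.run from an elevated prompt on each NLB node.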

Note: Microsoft is aware of this issue and we will keep you posted regarding its status.

For more information regarding use of the ARP command in Windows, see the following:

http://technet.microsoft.com/en-us/library/bb490864.aspx

- Saravanan N

  • http://support.microsoft.com/kb/960916 doesn't apply to R2...

  • Hi Guys,

    I have the same problem. We have 2 CAS/HUB Windows 2008 servers running Exchange 2007 on a VMware infrastructure. We also had the same problem of clients not being able to ping the NLB IP or reach OWA; if we add a proxy, they are able to do so. We logged a case with MS and configured NLB to use multicast, but we are still not able to ping the NLB VIP from other VLANs. Can anybody help with a step-by-step solution to this problem?

    Savio.

  • Hi Guys,

    Many people have the same problem with Unicast mode. I had this problem for a week and then I found a solution: the automatic NLB configuration sets up the network settings of the load balancing network adapter, but leaves the gateway setting on that adapter blank. So retype the gateway on the NLB network adapter of all your servers (Windows will inform you that the multiple adapter setting is not good, etc.) and apply. Now you are able to connect to the Windows Server 2008 NLB Virtual IP Address from hosts in different subnets when NLB is in Unicast Mode.