As described in the blog on DHCP failover, there are two types of DHCP failover relationships: Load Balance, which provides an Active-Active configuration, and Hot Standby, which provides an Active-Passive configuration. This blog article elaborates on the Load Balance failover relationship.
As is evident from the name, in the load balance mode of operation both servers respond to client requests. Here’s how the servers distribute client requests between themselves:
On receiving a client request, each DHCP server calculates a hash of the MAC address in the request using the hashing algorithm specified in RFC 3074. Each server hashes any MAC address to a value between 1 and 256. If the load distribution ratio between the two servers is left at the default of 50:50, the first server responds to the client request when the hash of the MAC address falls between 1 and 128, and the other server responds when the hash falls between 129 and 256. This ensures that only one server responds to a specific client. If the admin has changed the load distribution ratio to a different value, the hash buckets are distributed in that proportion. The admin does not need to configure the MAC addresses on either server in advance.
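The bucket logic above can be sketched in a few lines of Python. This is only an illustration: the hash used here (first byte of SHA-256) is a stand-in and is not the RFC 3074 algorithm, and the function names are made up for this example. The contiguous split of buckets 1..256 follows the 1 to 128 / 129 to 256 description in the text.

```python
import hashlib

NUM_BUCKETS = 256  # the article numbers hash buckets from 1 to 256


def mac_to_bucket(mac: str) -> int:
    """Map a MAC address to a bucket in 1..256.

    Stand-in hash for illustration only; RFC 3074 defines its own algorithm.
    """
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    return hashlib.sha256(raw).digest()[0] + 1


def first_server_buckets(percent_first: int) -> range:
    """Buckets owned by the first server for a given load balance percentage."""
    cutoff = NUM_BUCKETS * percent_first // 100
    return range(1, cutoff + 1)


def responder(mac: str, percent_first: int = 50) -> str:
    """Return which server ('first' or 'second') answers this client."""
    if mac_to_bucket(mac) in first_server_buckets(percent_first):
        return "first"
    return "second"


if __name__ == "__main__":
    for mac in ("00:15:5D:01:02:03", "00:15:5D:01:02:04"):
        print(mac, "->", responder(mac))
```

Because both servers compute the same hash and know the same ratio, each one can decide independently whether a request is its own, without exchanging any per-client messages.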
Figure 1: Load Balance Ratio in a Failover Relationship
The free IP addresses of each failover scope are also distributed in the same proportion as the load balancing ratio. For example, take the failover scope 10.10.10.0/24 with an IP address range of 10.10.10.1 through 10.10.10.250. Suppose all IP addresses from 10.10.10.1 to 10.10.10.50 in this scope are leased out and all IP addresses starting from 10.10.10.51 are free. In this situation, assuming the load balance ratio is 50:50, the IP addresses 10.10.10.51 through 10.10.10.150 would be apportioned to the first server and 10.10.10.151 through 10.10.10.250 to the second server. A client requesting a new lease whose MAC address hash falls within the hash buckets of the first server would get the IP address 10.10.10.51, and so on for subsequent clients. If the client’s MAC address hash falls within the hash buckets of the second server, the client would get the IP address 10.10.10.151.
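The arithmetic in this example can be checked with a short Python sketch. It assumes the free addresses form one contiguous block that is cut at a single boundary, as in the example above, and that the ratio is strictly between 0 and 100. The function name is hypothetical.

```python
import ipaddress


def split_free_range(first_free: str, last_free: str, percent_first: int = 50):
    """Split a contiguous free range between two servers by percentage.

    Returns ((first_start, first_end), (second_start, second_end)).
    Assumes 0 < percent_first < 100.
    """
    start = int(ipaddress.IPv4Address(first_free))
    end = int(ipaddress.IPv4Address(last_free))
    total = end - start + 1
    count_first = total * percent_first // 100
    boundary = start + count_first  # first address owned by the second server

    def to_ip(n: int) -> str:
        return str(ipaddress.IPv4Address(n))

    return (to_ip(start), to_ip(boundary - 1)), (to_ip(boundary), to_ip(end))


if __name__ == "__main__":
    first, second = split_free_range("10.10.10.51", "10.10.10.250")
    print("first server :", first)
    print("second server:", second)
```

With the 200 free addresses from the example, a 50:50 ratio gives each server 100 addresses, which matches the ranges quoted above.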
As you can see from this example, when a scope is configured for failover, the two failover servers grant new IP address leases from two different portions of the IP address range of the scope. This is in contrast to a standalone server, which proceeds sequentially through the free IP address pool of a scope to give out new leases, starting with the first free IP address.
Figure 2: IP Address Lease view of a Failover Scope
As clients request new leases, the free IP address pool of one server may get depleted faster than the other's, depending on the MAC addresses of the clients. To ensure that the free IP address pool stays apportioned as per the load balancing ratio, every 5 minutes the primary server checks the distribution of the free IP pool and transfers ownership of IP addresses from itself to the partner server, or vice versa, using server-to-server failover protocol messages (binding update). This is referred to as periodic rebalancing of the free IP address pool.
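As a rough illustration of the rebalancing check, the sketch below computes how many free addresses would need to change owner to restore the configured ratio. It only models the arithmetic. The function name and the sign convention are assumptions for this example, and the actual server may use thresholds or batching that are not described here.

```python
def addresses_to_transfer(free_primary: int, free_partner: int,
                          percent_primary: int = 50) -> int:
    """Number of free addresses to move so the pools match the ratio.

    Positive result: move that many from the primary to the partner.
    Negative result: move that many from the partner to the primary.
    """
    total = free_primary + free_partner
    target_primary = total * percent_primary // 100
    return free_primary - target_primary


if __name__ == "__main__":
    # The partner has been handing out leases faster than the primary.
    print(addresses_to_transfer(free_primary=100, free_partner=40))
```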
You can get the number of free IP addresses (and the percentage of the free IP pool) on each server for a failover scope by viewing the scope statistics. The fields Addresses Available (this Server's Pool) and Addresses Available (Partner Pool) indicate the number of free IP addresses owned by each server for the specific scope. To view the scope statistics in the DHCP MMC, right-click the failover scope and click Display Statistics. You can also use the PowerShell cmdlet Get-DhcpServerv4ScopeStatistics with the -Failover switch to get the same information. The two additional fields shown in the statistics, Addresses granted (this Server's Pool) and Addresses granted (Partner Pool), show the number of IP addresses leased out by each server.
Figure 3: Statistics for a scope in Failover Relationship
When the failover relationship is in the Normal state, the hash bucket algorithm is applied to every DHCP client request. In the communication-interrupted and partner-down states (i.e. when the partner server is unreachable or has gone down), the hash bucket algorithm is not used for servicing client requests, and the server responds to all clients to ensure service continuity.
Even in the Normal state, the server responds to a client that has been retransmitting the same request for a while. The server determines that a client has been retransmitting based on the secs field in the DHCP client request. RFC 2131 defines the secs field as “seconds elapsed since client began address acquisition or renewal process”. If the secs field in the client request is greater than 6 seconds, the DHCP server responds to the client even if the hash of the client MAC address does not fall within the server's hash buckets. This approach caters to the scenario where the server that owns the hash bucket for that client is down but the relationship state is still Normal (there is a lag of 30 seconds between the network connection, or the server, going down and the partner server detecting it).
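Putting the last two paragraphs together, the decision of whether a server answers a given request can be sketched as follows. The enum, the constant, and the function name are hypothetical. Only the three states and the 6-second threshold come from the text.

```python
from enum import Enum


class FailoverState(Enum):
    NORMAL = "normal"
    COMMUNICATION_INTERRUPTED = "communication-interrupted"
    PARTNER_DOWN = "partner-down"


SECS_THRESHOLD = 6  # respond anyway once the client's secs field exceeds this


def should_respond(state: FailoverState, owns_bucket: bool, secs: int) -> bool:
    """Decide whether this server answers a client request."""
    if state is not FailoverState.NORMAL:
        # Partner unreachable or down: serve every client.
        return True
    if owns_bucket:
        # Normal state and the MAC hash falls in this server's buckets.
        return True
    # Normal state, someone else's bucket: answer only if the client has
    # been retransmitting for more than the threshold.
    return secs > SECS_THRESHOLD


if __name__ == "__main__":
    print(should_respond(FailoverState.NORMAL, owns_bucket=False, secs=0))
    print(should_respond(FailoverState.NORMAL, owns_bucket=False, secs=8))
```

The secs check covers the window in which the partner has already failed but this server has not yet left the Normal state.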
Most of the details shared in this article are not something that a DHCP administrator has to worry about. However, if you ever wondered how failover works under the hood (and most people do!), now you know!
If client1 gets an IP address of 10.2.3.4 from DHCP1 and then DHCP1 goes down but is in a load balance failover pair with DHCP2 and client1 does a renew - will client1 get the same IP address again?
In other words - Does DHCP2 have IP to MAC address mapping for both pairs in the load balance?
Joe, yes, client 1 will get the same IP address again. There are two cases possible -
1. DHCP 1 synced the IP address 10.2.3.4 to DHCP2 before it went down.
In this case, the client will be given the full lease duration which has been configured for the scope.
2. DHCP 1 went down before syncing the IP address 10.2.3.4 to DHCP2
In this case also, the client will be able to renew but the lease duration will be shorter - same as the value configured for MCLT.
There is another client behavior to be aware of here - at half the lease period, the client will attempt to renew the lease. This is just normal DHCP client behavior as per the DHCP protocol. The renew message is unicast. So in the scenario above, it will be directed to DHCP 1 which is down and so there will be no response. At 7/8th of the lease period, the client will broadcast the renew request message. This message will be seen by DHCP2 which will respond to the renew request.
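To make the timing concrete, here is a small Python sketch that computes the two renewal points mentioned above from a lease duration. The function name is made up, and the fractions are the ones quoted in this reply (half of the lease for the unicast renew, 7/8 for the broadcast).

```python
def renewal_times(lease_seconds: int) -> tuple[int, int]:
    """Return (unicast_renew_at, broadcast_renew_at) in seconds into the lease."""
    unicast_renew_at = lease_seconds // 2
    broadcast_renew_at = lease_seconds * 7 // 8
    return unicast_renew_at, broadcast_renew_at


if __name__ == "__main__":
    eight_days = 8 * 24 * 60 * 60
    unicast_at, broadcast_at = renewal_times(eight_days)
    print(f"unicast renew after   {unicast_at / 86400:.1f} days")
    print(f"broadcast renew after {broadcast_at / 86400:.1f} days")
```

For an 8-day lease, the client would try DHCP 1 directly after 4 days and fall back to a broadcast, which DHCP2 can answer, after 7 days.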
I got a question here.
If the failover pair stays in the Normal state, which lease time will the client obtain: the MCLT or the time defined in the scope configuration?
My test result is the MCLT. I think this is unreasonable, as it doubles the DHCP request traffic.
According to most failover documentation, the MCLT should be active ONLY when failover enters the COMMUNICATION-INTERRUPTED or PARTNER-DOWN state.
When a client sends a request for a new lease, it will get lease for MCLT duration. When the client attempts to renew the lease at half the lease period i.e. MCLT/2, it will be given the scope lease duration if DHCP failover is in NORMAL state.
This is as per the DHCP failover protocol. This does increase the traffic from new clients but given the scalability of Windows DHCP server, this should not pose any deployment problem.
It would be useful to show the PowerShell Cmdlets used to set this up.
Thomas, please see the blog article "DHCP Failover using PowerShell" at - blogs.technet.com/.../dhcp-failover-using-powershell.aspx
Could you please provide more explanation on MCLT? Does it have any relation to the DHCP lease period? I am confused.
Jobish, please look at the description of MCLT as well as the DHCP examples section in the "Understand and deployment guide" for DHCP failover -
Hello DHCP Team, I ran into the DHCP Snooping issue described in http://support.microsoft.com/kb/2978225. Instead of reducing the number of servers on the switches, I changed the mode from Load Balance to Hot Standby, and so far I have not detected any packets being dropped by DHCP Snooping. If this is a valid configuration that keeps both features fully functional, you could consider including it in the article as an additional solution.
If one server goes down for an extended period, is there a setting to make 100% of the scope available to the good server instead of 50% so that all 200 of our hosts get an ip address instead of only 100? Thanks.
John, when one server goes down, the second server (in the communication-interrupted state) will renew all existing clients, including clients that were earlier served by the server which went down. I think this addresses your concern?
There is a different aspect: the second server will serve new leases from only its 50% of the free IP addresses in the scope. However, after the second server moves to the partner-down state, it will take over 100% of the free IP addresses in the scope.
Assuming one DHCP server in a load-balanced pair has crashed and the second server is supporting 100% of the clients, what is the process to recover the second server and re-establish the 50%/50% split? Is this process documented somewhere on TechNet?
Hi Andrew, once the DHCP server service on the second server starts, it will automatically sync up with the first server and make its lease database up to date. After that, it will enter NORMAL failover state and start servicing clients. At that point, they will start sharing the load 50:50. There is no admin intervention required. It is expected to just work!
Thanks for the previous answer - that helps, but I do have a follow-up question.
My customer has noticed that there seems to be a master/slave (or Primary / Secondary) relationship, whereby a scope created on DHCP-Server-1 can only be updated (i.e. reconfigured) on that server (and not on DHCP-Server-2). Is that expected behaviour? And if that is the case, what happens if one server from a load-balancing pair has to be permanently removed? Is it possible to force the 'secondary' DHCP server to become the master for all defined scopes?