Although Exchange 2010 no longer deploys a cluster resource model, we still use the Windows Failover Clustering service for certain functions.
When a Windows 2008 / 2008 R2 cluster is created, the cluster core resources are grouped together in the ‘Cluster Group’. The Cluster Group is a hidden group that contains the following resources:
You can see the cluster core resources in Failover Cluster Manager by selecting the cluster name in the upper left-hand pane. In the center pane, expand the cluster core resources section.
The cluster core resource group can also be seen using cluster.exe (or, in Windows 2008 R2, the failover cluster PowerShell cmdlets).
Windows 2008 / Windows 2008 R2: Cluster.exe DAG.company.com group
cluster.exe dag.company.com group

Listing status for all available resource groups:

Group                Node            Status
-------------------- --------------- ------
Cluster Group        DAG-1           Online
Available Storage    DAG-1           Offline
Windows 2008 R2: Get-ClusterGroup -Cluster DAG.company.com
PS C:\Users\Administrator> Get-ClusterGroup -Cluster DAG.company.com

Name                 OwnerNode   State
----                 ---------   -----
Cluster Group        dag-1       Online
Available Storage    dag-1       Offline
From an Exchange 2010 perspective you do not really need to manage the cluster core resources. As members join and depart the cluster this resource group will be automatically moved to a remaining member. Each member of the DAG should have the ability to arbitrate and fully bring online the cluster core resources.
When a cluster is created in Windows 2008 or Windows 2008 R2, the cluster service enumerates all network ports found on the nodes. These network ports are then combined into cluster networks. You can view the cluster networks in Failover Cluster Manager by expanding the cluster name and then expanding Networks.
You can also view the cluster networks using cluster.exe or PowerShell.
Windows 2008 / Windows 2008 R2: cluster.exe dag.company.com network
cluster.exe dag.company.com network

Listing status for all available networks:

Network                                  Status
---------------------------------------- -----------
Cluster Network 2                        Up
Cluster Network 4                        Up
Cluster Network 1                        Up
Windows 2008 R2: Get-ClusterNetwork -Cluster DAG.company.com
Get-ClusterNetwork -Cluster DAG.company.com

Name                 State
----                 -----
Cluster Network 1    Up
Cluster Network 2    Up
Cluster Network 4    Up
A cluster network has three settings:
You can see these settings in Failover Cluster Manager by opening the properties of a cluster network.
You can also view the network role by using either cluster.exe or PowerShell.
Windows 2008 / Windows 2008 R2: cluster.exe dag.company.com network "Cluster Network 1" /prop
cluster dag.company.com network "Cluster Network 1" /prop

Listing properties for 'Cluster Network 1':

T  Network              Name                           Value
-- -------------------- ------------------------------ -----------
SR Cluster Network 1    Name                           Cluster Network 1
MR Cluster Network 1    IPv6Addresses
MR Cluster Network 1    IPv6PrefixLengths
MR Cluster Network 1    IPv4Addresses                  10.0.0.0
MR Cluster Network 1    IPv4PrefixLengths              24
SR Cluster Network 1    Address                        10.0.0.0
SR Cluster Network 1    AddressMask                    255.255.255.0
S  Cluster Network 1    Description
D  Cluster Network 1    Role                           3 (0x3)
D  Cluster Network 1    Metric                         1200 (0x4b0)
D  Cluster Network 1    AutoMetric                     1 (0x1)
Windows 2008 R2: Get-ClusterNetwork -Cluster DAG.company.com | fl name,role
Get-ClusterNetwork -Cluster DAG-1.company.com | fl name,role

Name : Cluster Network 1
Role : 3

Name : Cluster Network 2
Role : 1

Name : Cluster Network 4
Role : 1
The role of the networks can also be viewed in the registry of each node. This information is located at: HKEY_LOCAL_MACHINE\Cluster\Networks. Each cluster network is represented by a subkey which is the GUID of the network. Expanding the GUID, you will see sub-values including Name and Role.
[HKEY_LOCAL_MACHINE\Cluster\Networks\2cd2b920-0a2a-4851-bb24-de02d4a70b7e]
@="class mscs::TmNetworkInfo"
"Id"="2cd2b920-0a2a-4851-bb24-de02d4a70b7e"
"Name"="Cluster Network 2"
"Signature"="NETW"
"Description"=""
"Role"=dword:00000001
"Priority"=dword:ffffffff
"Transport"="TCP/IP"
"Ignore"=dword:00000000
"Address"="192.168.0.0"
"AddressMask"="255.255.255.0"
"IPv6Address"=""
"State"=dword:00000003
"Metric"=dword:0000044c
"AutoMetric"=dword:00000001
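If you have exported this key to a text file from several nodes, a small script can pull out the Name and Role pairs so they are easy to compare. The following is only a rough sketch in Python. It assumes a .reg-style text export in which "Name" appears before "Role" within each subkey (as in the export above); the function name parse_cluster_networks and the sample text are made up for illustration.

```python
import re

# Sample text modelled on the registry export shown above (values copied from it).
SAMPLE_EXPORT = r'''
[HKEY_LOCAL_MACHINE\Cluster\Networks\2cd2b920-0a2a-4851-bb24-de02d4a70b7e]
"Name"="Cluster Network 2"
"Role"=dword:00000001
"Metric"=dword:0000044c
'''

def parse_cluster_networks(reg_text):
    """Return {network name: role} from a .reg-style text export.

    Assumption: within each subkey, "Name" is listed before "Role".
    """
    roles = {}
    name = None
    for line in reg_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # A new subkey starts; forget the previous network name.
            name = None
            continue
        match = re.match(r'"Name"="(.*)"$', line)
        if match:
            name = match.group(1)
            continue
        match = re.match(r'"Role"=dword:([0-9a-fA-F]{8})$', line)
        if match and name is not None:
            # dword values are written in hexadecimal.
            roles[name] = int(match.group(1), 16)
    return roles

if __name__ == "__main__":
    print(parse_cluster_networks(SAMPLE_EXPORT))
```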
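The Role value can take one of three values, depending on the cluster network settings: 0 (the cluster does not use the network), 1 (cluster network communications only), and 3 (cluster network communications and client connections).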
In order for an IPv4 resource to be brought online it must be associated with a network that is configured to “Allow cluster network communications on this network” and to “Allow clients to connect through this network”. If for any reason the “Allow clients to connect through this network” option is not enabled, the IPv4 resource associated with that network will not be able to be brought online.
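To make the rule above concrete, here is a short Python sketch that treats the Role value as two bit flags (1 for cluster communications, 2 for client access, so 3 means both). The constant and function names are my own for illustration and are not part of any cluster API; the check simply mirrors the requirement described in this post.

```python
# Bit flags behind the Role value (illustrative names, not an official API).
CLUSTER_COMMUNICATION = 0x1  # "Allow cluster network communications on this network"
CLIENT_ACCESS = 0x2          # "Allow clients to connect through this network"

def describe_role(role):
    """Return a human-readable description of a cluster network Role value."""
    parts = []
    if role & CLUSTER_COMMUNICATION:
        parts.append("cluster communications")
    if role & CLIENT_ACCESS:
        parts.append("client connections")
    return " and ".join(parts) if parts else "not used by the cluster"

def ip_resource_can_come_online(role):
    """Mirror the rule in the post: both options must be enabled (Role = 3)."""
    required = CLUSTER_COMMUNICATION | CLIENT_ACCESS
    return (role & required) == required

if __name__ == "__main__":
    for role in (0, 1, 3):
        print(role, describe_role(role), ip_resource_can_come_online(role))
```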
On an Exchange 2010 DAG member, the cluster core resources may fail to come online when you attempt to move them to another DAG member. Specifically, the IPv4 resource fails to come online, which in turn causes the network name resource to fail (because it depends on the IPv4 resource).
If using Failover Cluster Manager and attempting to bring online the IPv4 resource in the cluster core resources group, the following pop up error is displayed:
A review of the system log shows event 1223:
Log Name: System
Source: Microsoft-Windows-FailoverClustering
Date: 5/10/2010 1:14:42 PM
Event ID: 1223
Task Category: IP Address Resource
Level: Error
Keywords:
User: SYSTEM
Computer: dagNode.company.com
Description:
Cluster IP address resource 'IPv4 Static Address 2 (Cluster Group)' cannot be brought online because the cluster network 'Cluster Network 2' is not configured to allow client access. Please use the Failover Cluster Manager snap-in to check the configured properties of the cluster network.
Event 1223 indicates that the effective setting for Cluster Network 2 is “Allow cluster network communications on this network” without “Allow clients to connect through this network”. However, when you review the settings for Cluster Network 2 in Failover Cluster Manager, you might see that both “Allow cluster network communications on this network” and “Allow clients to connect through this network” are enabled.
The Microsoft Exchange Replication Service helps maintain the cluster network configuration. There is an issue in the current Replication Service in which settings are not changed. This results in a difference between the setting inside the cluster and the setting displayed in the Failover Cluster Management tools.
Workaround:
A quick and easy workaround for this issue is to simply reset the state of the network. There are multiple ways to accomplish this and I will outline each below. Step zero before proceeding with any other steps is to note the cluster network that is displayed in the above event since that is the network that will need to be reset (in this example Cluster Network 2).
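If you are collecting these events from several nodes, step zero can be scripted. The Python sketch below pulls the network name out of the event 1223 description text. It assumes the description wording matches the event shown above; the function name network_from_event is hypothetical.

```python
import re

# Description text copied from the event 1223 example above.
EVENT_1223 = (
    "Cluster IP address resource 'IPv4 Static Address 2 (Cluster Group)' "
    "cannot be brought online because the cluster network 'Cluster Network 2' "
    "is not configured to allow client access."
)

def network_from_event(description):
    """Return the cluster network named in an event 1223 description, or None."""
    match = re.search(
        r"cluster network '([^']+)' is not configured to allow client access",
        description,
    )
    return match.group(1) if match else None

if __name__ == "__main__":
    print(network_from_event(EVENT_1223))
```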
Windows 2008 / Windows 2008 R2 – Using Failover Cluster Management Tool
The network state can be reset using Failover Cluster Manager
Next we need to enable the network for “Allow clients to connect through this network”.
The network has been reset and cluster core resources should successfully arbitrate to any DAG member with a network port in this network.
Windows 2008 / Windows 2008 R2: Using cluster.exe
cluster.exe dag.company.com network "Cluster Network 2" /prop role=1
Next, we need to enable the network for “Allow clients to connect through this network”.
cluster.exe dag.company.com network "Cluster Network 2" /prop role=3
The network has now been reset and cluster core resources should successfully arbitrate to any DAG member with a network port in this network.
Windows 2008 R2: Using PowerShell
Get-ClusterNetwork -Cluster DAG.company.com -Name "Cluster Network 2" | % {$_.Role=1}
Next, enable the network for “Allow clients to connect through this network”.
Get-ClusterNetwork -Cluster DAG.company.com -Name "Cluster Network 2" | % {$_.Role=3}
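If more than one network or DAG needs the same reset, the two cluster.exe command lines can be generated rather than typed by hand. This Python sketch only builds the command strings in the form shown in this post (role=1 first, then role=3); it does not run anything against a cluster. The function name reset_commands is made up for illustration.

```python
def reset_commands(cluster, network):
    """Build the two cluster.exe command lines that reset a network's role.

    The first command sets the role to 1 (cluster communications only);
    the second sets it back to 3 (cluster communications and clients).
    """
    base = f'cluster.exe {cluster} network "{network}" /prop'
    return [f"{base} role=1", f"{base} role=3"]

if __name__ == "__main__":
    # Example values taken from this post.
    for command in reset_commands("dag.company.com", "Cluster Network 2"):
        print(command)
```

Review the generated lines before running them, and run them in the order shown.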
LONG TERM FIX
This issue will be fixed in Exchange 2010 Service Pack 1. The issue will not be fixed in Exchange 2010 RTM.
==========================================
Updated – 6/2/2010
Updated to list Exchange 2010 SP1 confirmed to contain fix.
Can anyone confirm this has been 100% fixed in E2010 SP1?
This is not fixed in SP1. I have tested this myself.
@Gaz:
Can you provide some more information for me? This is not something that you should be able to reproduce. If you manually change the cluster settings you can force this issue to occur, but that is not covered under the fix described in this blog post. Are you saying that you have experienced this issue on an SP1 DAG?
TIMMCMIC
Tim,
This is still a problem with Service pack 1. Today I had to go to the failover cluster manager and remove the check, click apply, then add the check and click apply and finally I can bring the DAG online to both ping it and use Backup Exec to select the DAG.
McCue
@McCue
Thanks for posting. Can you confirm whether or not the issue was present prior to upgrading to SP1?
@Tim
Unfortunately no, I had already installed EX2010-SP1 when I installed BE2010 which is to be my last step before moving real mailboxes to the EX2010 server.
This is an issue before SP1 as well. I have Exchange 2010 and BE2010 R2, and cluster was offline after a reboot.
To set an IP address for the DAG, use the following Exchange shell command:
Set-DatabaseAvailabilityGroup -Identity DAGGroupName -DatabaseAvailabilityGroupIpAddresses 192.168.x.x
Confirm with Get-DatabaseAvailabilityGroup -Identity DAGGroupName | fl
@Joe:
This is correct but you should not have to reset the IP address to correct the issue outlined in this blog.
Is this related to BE2010? I had this same issue with Exchange 2010 (not SP1) but not until after BE2010 was installed.
Thanks for the fix !
@Phil:
I am not aware of what BE2010 is.
I'm pretty sure BE2010 is Backup Exec 2010
With Exchange 2010 SP1 UR0 this is not fixed.
If the issue existed prior to upgrading, then you will have to follow the workaround. SP1 will prevent the issue from recurring.
I have run the fix a number of times after installing SP1 for Exchange 2010, and one of my two DAG addresses is still reporting as offline. Is there any other fix available?