How to: Agile High Availability Structure for Exchange Server 2010

Reason and Goal:

The standard HA design for Exchange Server 2010 requires a minimum of four Exchange servers, because a DAG and Windows Network Load Balancing (NLB) cannot run on the same server.

For most small to medium customers or medium-sized sites, for example deployments of fewer than 1,000 users with no robust virtualization platform, investing in four sets of hardware and software is not practical, yet they still want high availability to support business continuity.

My goal, then, is to provide the basic high availability features of Exchange Server 2010 with only two physical Exchange servers.

 

Target Scenario:

Small to medium-sized Exchange deployments in which a single Client Access Server is enough to handle client access.

 

Solution:

* Use the Exchange Server 2010 DAG to provide high availability for mailbox databases.

* Use Microsoft Failover Clustering to provide high availability for the Client Access Server role. We give up the load balancing that NLB provides, since we have confirmed that one server is powerful enough to handle client access.

* The Hub Transport Server role provides built-in high availability.

 

Benefits:

Lower hardware and software cost with nearly full HA features.

 

Steps to make it:

Test Environment:

Server Role                          Server Name
Domain Controller                    DC01.rogertech.local
Exchange Mailbox/HUB/CAS server 1    EX01.rogertech.local
Exchange Mailbox/HUB/CAS server 2    EX02.rogertech.local

 

1. Complete the Exchange 2010 DAG deployment following the usual steps.

2. Open Windows Failover Cluster Manager. You can see the "PrimaryDAG" object with an empty "Services and Applications" list.

 

3. Now let's create the cluster resource for the Client Access Server array. Right-click "Services and Applications" and choose "Configure a Service or Application".

 

4. Follow the wizard prompts to continue.

 

5. In the "Select Service or Application" section, select the "Other Server" type and click Next.

 

6. Name the new cluster resource "CASArray" and assign it a valid IP address; here it is "192.168.0.110". Click Next.

 

7. Wait for the validation to complete.

 

8. Leave the shared storage section as is, since we do not have shared storage configured here.

 

9. Click Next on the confirmation page.

 

10. Wait for the configuration to complete.

 

11. When the success page appears, click "View Report" to confirm that all steps completed successfully.

 

12. Go back to Failover Cluster Manager. A new cluster resource named "CASArray" now appears. It is running on the active node, and you can configure its properties as you would for any other cluster resource.
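For those who prefer the command line, steps 3 to 12 can probably be done with the FailoverClusters PowerShell module on Windows Server 2008 R2 as well. This is only a sketch: I am assuming that Add-ClusterServerRole creates the same kind of group as the "Other Server" choice in the wizard (a client access point with no storage). The name and IP address are the ones from my test environment.

```powershell
# Load the failover clustering cmdlets (Windows Server 2008 R2)
Import-Module FailoverClusters

# Assumption: equivalent to the wizard's "Other Server" type,
# creating a network name plus IP address and no shared storage.
Add-ClusterServerRole -Name "CASArray" -StaticAddress "192.168.0.110"

# Confirm the new group exists and see which node currently owns it
Get-ClusterGroup -Name "CASArray" | Format-List Name, OwnerNode, State
```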

 

13. Open the DNS console. An entry for "CASArray" has been created automatically.
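The DNS record can also be checked from any PowerShell prompt. A minimal sketch, assuming the array name is registered under my test domain rogertech.local:

```powershell
# Resolve the CAS array name through the .NET DNS resolver.
# "CASArray.rogertech.local" is the name from my test environment;
# the address returned should match the one assigned in step 6.
[System.Net.Dns]::GetHostAddresses("CASArray.rogertech.local")
```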

 

14. The steps above only create the cluster resource object for "CASArray". Now we are going to enable this array object for the Exchange Client Access Servers.

15. Open the Exchange Management Shell and confirm that no CAS array exists yet.
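A quick way to check, run from the Exchange Management Shell:

```powershell
# Lists every Client Access array in the organization.
# No output means that no CAS array has been created yet.
Get-ClientAccessArray
```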

 

16. Create a new CAS array for the default AD site and set its name to "CASArray.<your domain name>". Make sure a successful result is returned.
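The command looks roughly like this. The FQDN uses my test domain, and "Default-First-Site-Name" is an assumption: it is the name Active Directory gives the first site by default, so replace it if your site has been renamed.

```powershell
# Create the CAS array object in Active Directory.
# -Fqdn must match the DNS name of the cluster resource from step 6.
# -Site is assumed to be the default AD site name; adjust as needed.
New-ClientAccessArray -Name "CASArray" `
    -Fqdn "CASArray.rogertech.local" `
    -Site "Default-First-Site-Name"
```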

 

17. Now we need to set all mailbox databases to use this new "CASArray" for client access.
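A minimal sketch of this step, again assuming the array FQDN from my test domain:

```powershell
# Point every mailbox database at the CAS array for RPC client access
Get-MailboxDatabase |
    Set-MailboxDatabase -RpcClientAccessServer "CASArray.rogertech.local"

# Verify the setting on each database
Get-MailboxDatabase | Format-List Name, RpcClientAccessServer
```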

 

18. Once this is done, we can access mail through "CASArray" instead of a specific CAS server.

 

19. Now let's force a CASArray failover and check whether OWA access stays available.
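The failover can be triggered from Failover Cluster Manager, or from PowerShell along these lines. This sketch assumes the group currently runs on EX01 and should move to EX02:

```powershell
Import-Module FailoverClusters

# Move the "CASArray" group to the other node (EX02 in my environment)
Move-ClusterGroup -Name "CASArray" -Node "EX02"

# Check the new owner node and that the group is back online
Get-ClusterGroup -Name "CASArray" | Format-List Name, OwnerNode, State
```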

 

20. Once the resource has switched over, we need to refresh the page and log on to OWA again because the backend node has changed, but the URL and the user experience stay the same.

 

21. Now let's test the Outlook MAPI connection activities against the CAS array.

22. Follow the usual steps to check whether Autodiscover works.

 

23. At this point, we can confirm that Outlook Autodiscover works properly with the newly created CAS array. The connection status shows that the client now connects through the CAS array instead of a specific server.

 

24. Send and receive test messages to confirm that mail flow works.

25. The configuration is complete. Now we are going to test whether Outlook really recognizes the CAS array and fails over automatically when one backend server goes down.

26. First, let's check the current service and database status on the servers:

  • User account: administrator@rogertech.com
  • Mailbox database: VIP
  • Active database copy: EX01
  • Cluster resource "CASArray" is running on node: EX01
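The same status can be read from the shell. A sketch using the database name "VIP" and the group name "CASArray" from my environment; running it again after the failover shows what has moved:

```powershell
Import-Module FailoverClusters

# Which node owns the CASArray cluster group right now?
Get-ClusterGroup -Name "CASArray" | Format-List Name, OwnerNode, State

# Status of every copy of the "VIP" database
# (the active copy reports Mounted, passive copies report Healthy)
Get-MailboxDatabaseCopyStatus -Identity "VIP" |
    Format-Table Name, Status, CopyQueueLength -AutoSize
```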

 

27. Now I will disable all the NICs on EX01 to simulate a server outage.

 

28. As expected, the cluster automatically switched "CASArray" over to EX02, and the database copy on EX02 took over as the active copy through the DAG.

 

29. What about the client? I kept Outlook open and tried to send a message during the switchover. Outlook showed "Trying to connect exchange server" for a few seconds. This message corresponds to the "CASArray" switchover. The database copy switchover is invisible to the client.

 

30. After all the resources have switched over, Outlook works without any problem and the pending message is sent successfully. In my virtual test environment, all the switchovers completed within 20 seconds.

 

Enjoy it.