Tim McMichael

Navigating the world of high availability...

Exchange 2010 SP1: StartDagServerMaintenance.ps1 fails on databases that have only two database copies.


In Exchange 2010 Service Pack 1 we introduced some new DAG management scripts. These scripts are located in the Scripts folder under the Exchange Server installation directory (usually C:\Program Files\Microsoft\Exchange Server\V14\Scripts).

 

One of the scripts introduced is the StartDagServerMaintenance.ps1 script. More information on this script can be found at:

http://technet.microsoft.com/en-us/library/ff625233.aspx

http://technet.microsoft.com/en-us/library/dd298065.aspx

 

When an administrator runs this script, the following actions are taken:

1) All active database copies on the server are moved to another server in the DAG based on the selection of the next best copy.

2) If the cluster core resources are owned on the node the resources are arbitrated to a different DAG member (thereby moving the Primary Active Manager functionality to another node).

3) The DatabaseCopyAutoActivationPolicy property of the mailbox server is set to a value of BLOCKED thereby preventing the DAG member from receiving or activating database copies.

4) The individual database copies hosted on the DAG member are activation suspended.

5) The node is paused within the cluster service preventing the cluster core resources from arbitrating to the node (and thereby preventing the node from becoming the Primary Active Manager).
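
For orientation, the five actions above correspond roughly to the manual commands below. This is only a sketch of equivalent steps, not the script's actual implementation. It assumes the Exchange Management Shell, the Windows FailoverClusters PowerShell module, and two example servers: DAG-4 (entering maintenance) and DAG-3 (another DAG member).

```powershell
# Sketch only: manual approximations of the five actions, not the script's own code.
# Assumed names: DAG-4 is entering maintenance, DAG-3 is another DAG member.
$server = 'DAG-4'
$target = 'DAG-3'

Import-Module FailoverClusters

# 1) Move every active database copy off the server.
Move-ActiveMailboxDatabase -Server $server -Confirm:$false

# 2) If the server owns the cluster core resources, move them to another member.
$core = Get-ClusterGroup -Name 'Cluster Group'
if ($core.OwnerNode.Name -eq $server) {
    Move-ClusterGroup -Name 'Cluster Group' -Node $target
}

# 3) Block the server from automatically activating database copies.
Set-MailboxServer -Identity $server -DatabaseCopyAutoActivationPolicy Blocked

# 4) Suspend activation for each database copy hosted on the server.
Get-MailboxDatabaseCopyStatus -Server $server | ForEach-Object {
    Suspend-MailboxDatabaseCopy -Identity $_.Name -ActivationOnly -Confirm:$false
}

# 5) Pause the node in the cluster.
Suspend-ClusterNode -Name $server
```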

 

When an administrator attempts to place a DAG member into maintenance mode and the DAG member hosts the ACTIVE copy of a database that has only two copies, the following occurs:

1)  The active copy is moved to the other node, which hosts the passive copy (provided that copy is healthy).

2)  The command fails with the following error after the database is moved.  (In this example the mounted copy is on server DAG-4).

 

*Pre StartDagServerMaintenance*

Name                                          Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                                              Length    Length                             State
----                                          ------          --------- ----------- --------------------   ------------
TESTSCRIPT\DAG-4                              Mounted         0         0                                  Healthy
TESTSCRIPT\DAG-3                              Healthy         0         0           7/25/2011 10:17:30 AM  Healthy

*StartDagServerMaintenance*

 

[PS] C:\Program Files\Microsoft\Exchange Server\V14\Scripts>.\StartDagServerMaintenance.ps1 DAG-4
The following objects are hosted by 'DAG-4', before attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is active'))
Write-Error : The following objects are still hosted by 'DAG-4', even after attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is critical for redundancy according to Red Alert script'))
At C:\Program Files\Microsoft\Exchange Server\V14\Scripts\StartDagServerMaintenance.ps1:216 char:16
+                 write-error <<<<  ($StartDagServerMaintenance_LocalizedStrings.res_0014 -f ( PrintCriticalMailboxResourcesOutput($criticalMailboxResources)),$shortServerName) -erroraction:stop
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Microsoft.PowerShell.Commands.WriteErrorCommand

*Post StartDagServerMaintenance*

 

Name                                          Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                                              Length    Length                             State
----                                          ------          --------- ----------- --------------------   ------------
TESTSCRIPT\DAG-3                              Mounted         0         0                                  Healthy
TESTSCRIPT\DAG-4                              Healthy         0         0           7/25/2011 10:33:57 AM  Healthy

When an administrator attempts to place a DAG member into maintenance mode and the DAG member hosts the PASSIVE copy of a database that has only two copies, the following occurs:

1) The command fails with the following error, and no database is moved. (In this example the passive copy is on server DAG-4.)

 

*Pre StartDagServerMaintenance*

 

Name                                          Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                                              Length    Length                             State
----                                          ------          --------- ----------- --------------------   ------------
TESTSCRIPT\DAG-3                              Mounted         0         0                                  Healthy
TESTSCRIPT\DAG-4                              Healthy         0         0           7/25/2011 10:33:57 AM  Healthy

 

*StartDagServerMaintenance*

 

[PS] C:\Program Files\Microsoft\Exchange Server\V14\Scripts>.\StartDagServerMaintenance.ps1 DAG-4
The following objects are hosted by 'DAG-4', before attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is active'))
Write-Error : The following objects are still hosted by 'DAG-4', even after attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is critical for redundancy according to Red Alert script'))
At C:\Program Files\Microsoft\Exchange Server\V14\Scripts\StartDagServerMaintenance.ps1:216 char:16
+                 write-error <<<<  ($StartDagServerMaintenance_LocalizedStrings.res_0014 -f ( PrintCriticalMailboxResourcesOutput($criticalMailboxResources)),$shortServerName) -erroraction:stop
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Microsoft.PowerShell.Commands.WriteErrorCommand

 

*Post StartDagServerMaintenance*

 

Name                                          Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                                              Length    Length                             State
----                                          ------          --------- ----------- --------------------   ------------
TESTSCRIPT\DAG-3                              Mounted         0         0                                  Healthy
TESTSCRIPT\DAG-4                              Healthy         0         0           7/25/2011 10:33:57 AM  Healthy
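
Before running the script, it may help to check which databases on the server have only two copies. The following is a minimal sketch, assuming the Exchange Management Shell and the example server DAG-4. It relies on the Servers property returned by Get-MailboxDatabase listing every server that hosts a copy.

```powershell
# Sketch: list databases with a copy on the server that have exactly two copies.
$server = 'DAG-4'   # assumed example server

Get-MailboxDatabase -Server $server |
    Where-Object { @($_.Servers).Count -eq 2 } |
    Select-Object Name, Servers
```

Any database this returns is a candidate for the failure shown above.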

Manual maintenance mode instructions are available in the following blog post:

http://blogs.technet.com/b/timmcmic/archive/2011/07/25/exchange-2010-sp1-startdagservermaintenance-ps1-fails-when-a-server-contains-databases-with-a-single-copy.aspx

 

After completing the manual instructions, and once maintenance mode is no longer needed, the administrator can run the StopDagServerMaintenance.ps1 script to revert the manual changes.

Comments
  • Is there a specific reason as to why the script behaves in this way? Is it related to DCs not replicating correctly or what?

  • @Amir

    The script today has protections to ensure that more than one viable copy remains after a node is put into maintenance mode.

    TIMMCMIC

  • Tim,

    Any chance the person who wrote the script will offer a modification of the script to work with a DAG with only 2 database replicas?

    FYI - we have 3 replicas with the 3rd being on redundant DAG nodes in a second datacenter for DR purposes, and the script fails. So it looks like this fails not just for DAGs with 2 database replicas, but more specifically for DAGs with only 2 local database replicas.

  • @Dan:

    I'd like to see the output of the failure to confirm.

    I do not expect the design of this script to be changed.

    TIMMCMIC

  • We will have to wait until our next maintenance window to get you the output.

    Side note - we had originally set all four of our DAG nodes in our passive datacenter to:

    DatabaseCopyAutoActivationPolicy : IntrasiteOnly

    on top of setting our DAG to DAC mode to help prevent database failovers/moves to the passive datacenter.

    We used the start and stop scripts on the nodes in our passive datacenter, which seemed to work fine (since they were a tertiary copy of the databases), but we just noticed at our last maintenance window that some databases actually migrated over to the passive datacenter, and we were stumped as to why. Apparently the StopDagServerMaintenance.ps1 script set the mailbox servers back to "Unrestricted" after our first post-DR deployment maintenance window, which allowed this to happen as a result of the Move-ActiveMailboxDatabase cmdlet in our second post-DR deployment maintenance window.

    Do you have any suggestions, other than completely abandoning the start and stop maintenance scripts, for keeping our servers in the passive datacenter from being potential targets for the Move-ActiveMailboxDatabase cmdlet? It seems as if the StopDagServerMaintenance script will always set the AutoActivationPolicy back to Unrestricted, and this is not ideal in an Active/Passive datacenter DAG deployment.

  • @Dan:

    You are correct - the stop and start DAG server scripts reset your attributes to their defaults.  For example, if you set the database copy auto activation policy to IntrasiteOnly, it will be reset to Unrestricted.  There's nothing in the script to remember what you had before.

    Customers with custom settings either wrap these scripts in another script that resets their preferred settings or, given your circumstances, abandon their use.

    TIMMCMIC
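
The wrapper approach mentioned in the reply above could look something like the following. This is a minimal sketch, assuming the Exchange Management Shell, the Exchange Scripts folder as the current directory, and DAG-4 as the example server. It is not part of the shipped scripts.

```powershell
# Sketch: preserve a custom activation policy across a maintenance cycle.
$server = 'DAG-4'   # assumed example server

# Before maintenance: remember the current policy (for example IntrasiteOnly).
$savedPolicy = [string](Get-MailboxServer -Identity $server).DatabaseCopyAutoActivationPolicy

.\StartDagServerMaintenance.ps1 $server

# ... perform the maintenance work here ...

.\StopDagServerMaintenance.ps1 $server

# After maintenance: restore the saved policy, since the stop script leaves it Unrestricted.
Set-MailboxServer -Identity $server -DatabaseCopyAutoActivationPolicy $savedPolicy
```

If the maintenance spans more than one shell session, the saved value would need to be written somewhere persistent (a file, for instance) instead of being held in a variable.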

  • Had the same scenario - 2 member DAG with 2 copies - and didn't get this error. We got the same error with a 3 member DAG where some databases had only 2 copies. Environment is at E2K10 SP1 RU4v2.

  • @adi

    You are correct.  It was pointed out to me after this that the failure occurs when a database has two copies in a DAG with more than two nodes.

    Interestingly enough this scenario is fixed in Exchange 2010 SP2 RU1.

    TIMMCMIC

  • @Tim - That's excellent news as we had almost given up and started to write our own maintenance scripts to try and automate the same steps in the current scripts but w/o the limitation discussed. Also do you still want to see the output of the script failing in our environment given the information on the change in SP2 RU1?

    Also, and I know this is always a tough question to answer, but do you have a rough time frame when RU1 *might* be out? I'm not looking for a specific date, just a rough idea in months when it might be out so we know how long we will have to limp along running all the commands by hand.

    Thanks for the follow up on this BTW.

  • Never mind on the question of when SP2 RU1 will be out - it was released today and here is the specific KB# regarding the scripting issues:

    support.microsoft.com/.../2585649

    Thanks again for staying on top of stuff like this Tim!

  • You have to use the switch "overrideMinimumTwoCopies" in order to execute this script on a DAG member server that hosts databases with only two copies. Otherwise this script will return the following error message:

    “The following objects are hosted by Mailbox Server, before attempting to move them off: `n(Database=<Mailbox Database>, Reason=’Copy is critical for redundancy according to Red Alert script’))”

  • @Peddy1st

    You are correct.  This is a switch that was added to correct this condition in SP2 RU2 (I believe).

    TechNet documentation is being updated to officially reflect this.

    TIMMCMIC
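
As an illustration of the switch discussed in the two comments above, an invocation might look like the following. The switch name comes from the comment. The -serverName parameter name, the Boolean value syntax, and the server DAG-4 are assumptions, so check the param block at the top of the script in your build before relying on it.

```powershell
# Assumption: the switch takes a Boolean value; DAG-4 is the example server.
.\StartDagServerMaintenance.ps1 -serverName DAG-4 -overrideMinimumTwoCopies:$true
```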

  • Greetings,

    Should this script work with Exchange 2010 SP3 when there are only two copies of the databases (one active, one passive)? PowerShell gives an error when running the script. Manual switchover works fine, and suspending replication on each DB works fine.

    I am sure this worked previously, but maybe I did not have a copy set up.

    Any info would be appreciated.
