December, 2012

  • Secure Channel Broken - continuation of "The trust relationship between this workstation and the primary domain failed."

    While AD replication can fail with an "access denied" error for several reasons (you may find more information in KB article 2002013 - Troubleshooting AD Replication error 5: Access is denied http://support.microsoft.com/kb/2002013/EN-US),
    here we will focus on broken secure channel issues on Domain Controllers and how to reset them.
    Then, as promised in my previous post, we will also have a look at the tools/commands you may use to verify and reset secure channels and trusts.

    A replication issue may actually be the cause of a broken secure channel between DCs in the first place.
    Other common causes are restoring a DC from backup to a previous point in time (or to a system restore point in the case of client machines) and time jumps, which don't break the secure channel directly but may cause it as collateral damage.
    (By the way, take a look at http://blogs.technet.com/b/askpfeplat/archive/2012/11/19/did-your-active-directory-domain-time-just-jump-to-the-year-2000.aspx and http://blogs.technet.com/b/askpfeplat/archive/2012/11/26/fixing-when-your-domain-traveled-back-in-time-the-great-system-time-rollback-to-the-year-2000.aspx
    which have been updated recently.)

    To reset the Secure Channel of a Domain Controller you have 2 methods: one requires a reboot; the other doesn't, but requires more tasks to be completed.

    If restarting the DC is not an issue for you in terms of business, availability or change control requirements, do the following:

    1. Set the Kerberos Key Distribution Center (KDC) service startup type to Disabled, and restart the domain controller.
    (This is particularly important if you have more than one DC in the same domain: it forces the affected DC to contact another DC for Kerberos authentication instead of using itself.)

    2. Once the DC restarts, reset the secure channel:

    netdom resetpwd /server:<AnotherDC> /userd:<Domain>\<User> /passwordd:*
    (you should get the following message: "The machine account password for the local machine has been successfully reset")

    3. Then, set the startup type for the KDC service back to Automatic and restart the DC once again.

    Note: Do not start the service manually as it will be started automatically during reboot.

    The 2nd method, which avoids restarting the DC, is:

    1. Delete any mapped drives that might exist on the affected DC, for that run:

    net use * /d

    2. Stop the KDC service.

    3. Purge all Kerberos tickets from the computer credentials cache (this is the memory space where the computer stores its tickets; it has nothing in common with cached credentials).

    Depending on the Operating System run the following:

    On Windows Server 2003:

    at <HH:MM> /interactive cmd.exe
    (for <HH:MM>, check the current system time and add a couple of extra minutes; for example, if the current time is 14:14, type 14:16)

    Note:
    Another command prompt will launch under the System account context (SVCHost.exe) on the Console session of the problem DC at the time you entered (14:16 in the example above).
    If you run the command via RDP, ensure that you are connected to session 0 (mstsc /console or /admin depending on the mstsc version) or you won't be able to see the 2nd command prompt.

    klist purge

    On Windows Server 2008 or above:

    klist -li 0x3e7 purge

    Note:

    The -li 0x3e7 gets klist into the computer credentials cache so there's no need for the "at" command.

    4. Reset the Secure Channel:

    netdom resetpwd /server:<AnotherDC> /userd:<Domain>\<User> /passwordd:*
    (You should get the following message: "The machine account password for the local machine has been successfully reset")

    net use \\<AnotherDC>\IPC$
    (The DC will request new kerberos tickets)

    5. Force the DC to replicate from/to the PDC emulator (so both ways: first inbound, then outbound):

    repadmin /replicate <destination> <source> <NC> /force

    example:
    repadmin /replicate DC1 DC2 DC=Contoso,DC=com /force
    (you should get the following message "sync from DC2 (the PDCe) to DC1 (the affected DC) completed successfully")
    repadmin /replicate DC2 DC1 DC=Contoso,DC=com /force
    (you should get the following message "sync from DC1 (the affected DC) to DC2 (the PDCe) completed successfully")

    Note:
    If the affected DC successfully replicates from the PDCe, then start the KDC service. If not, you will have to use the method that requires a restart.
    If replication to the PDCe fails, delete the old outbound connection object, trigger the KCC (repadmin /kcc) to rebuild it, and force outbound replication again. (This might also be required for connection objects (COs) with other replication partners.)

    Done.
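
    The no-restart method above can be sketched as an ordered command plan. This is only a minimal sketch in Python: it builds the command lines as strings and runs nothing. The function name and its parameters are hypothetical, "net stop kdc" / "net start kdc" are my assumption for stopping and starting the KDC service, and using the PDCe as the netdom /server target is also an assumption (the steps above only say to replicate with the PDCe).

```python
def no_restart_reset_plan(affected_dc, pdce, naming_context, domain, user,
                          windows_2003=False):
    """Return the commands of the no-restart method, in order, as strings.

    Nothing is executed here; the list is meant to be reviewed by an admin.
    All parameter names are illustrative placeholders.
    """
    # On Windows Server 2003 "klist purge" has to be typed in the SYSTEM
    # command prompt opened with the "at" command; 2008+ can target the
    # computer logon session (0x3e7) directly.
    purge = "klist purge" if windows_2003 else "klist -li 0x3e7 purge"
    return [
        "net use * /d",                      # 1. drop mapped drives
        "net stop kdc",                      # 2. stop the KDC (assumed command)
        purge,                               # 3. purge computer tickets
        f"netdom resetpwd /server:{pdce} /userd:{domain}\\{user} /passwordd:*",
        f"net use \\\\{pdce}\\IPC$",         # 4. forces new Kerberos tickets
        # 5. inbound first (PDCe -> affected DC), then outbound
        f"repadmin /replicate {affected_dc} {pdce} {naming_context} /force",
        f"repadmin /replicate {pdce} {affected_dc} {naming_context} /force",
        "net start kdc",                     # only if inbound replication worked
    ]


if __name__ == "__main__":
    for line in no_restart_reset_plan("DC1", "DC2", "DC=Contoso,DC=com",
                                      "CONTOSO", "Administrator"):
        print(line)
```

    Keeping the plan as data makes it easy to review under change control before anyone types a command on the affected DC.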

    In order to verify the health status of secure channels you may use Nltest.exe or Netdom.exe (which is also used for trusts).

    If using Nltest run:

    Nltest /sc_verify:<DomainName>
    You should get "Trust Verification Status = 0 0x0 NERR_Success". If not, you must reset the secure channel.

    Note:
    This command doesn't work on a PDCe, so if you run it there and get an error message, please don't reset the PDCe secure channel because of it.
    Don't use /sc_query to verify secure channels, because it doesn't verify the SC; it just reports information about the last established SC (so you will get a valid answer even when you cannot reach a DC).
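
    If you want to script the check, the status line quoted above can be parsed from captured Nltest output. A minimal sketch in Python, assuming the output contains the line exactly in the form shown in this post; the function name is hypothetical and the sketch only inspects text, it does not call Nltest itself.

```python
import re

# Matches e.g. "Trust Verification Status = 0 0x0 NERR_Success"
_STATUS_LINE = re.compile(
    r"Trust Verification Status\s*=\s*(\d+)\s+(0x[0-9a-fA-F]+)\s+(\S+)"
)


def secure_channel_ok(nltest_output: str) -> bool:
    """Return True only when the status line reports 0 / NERR_Success."""
    match = _STATUS_LINE.search(nltest_output)
    if match is None:
        # No status line at all: treat as "not verified".
        return False
    return int(match.group(1)) == 0 and match.group(3) == "NERR_Success"


if __name__ == "__main__":
    sample = "Trust Verification Status = 0 0x0 NERR_Success"
    print(secure_channel_ok(sample))
```

    Remember the note above: an error on a PDCe is expected, so a False result there is not a reason to reset anything.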

    If using Netdom:

    netdom verify /domain:<DomainName>

    In order to verify Trusts:
    (Trusts work in a similar way to Secure Channels: there is a TDO (Trusted Domain Object) maintained in the domain partition of each trusting and trusted domain, whose password has to be in sync; if not, the trust gets broken.)

    netdom trust <TrustingDomain> /Domain:<TrustedDomain> /verify

    Note:
    Please take into account that, depending on the security settings between domains, this command may fail. For more troubleshooting information see the article below:
    Client, service, and program issues can occur if you change security settings and user rights assignments
    http://support.microsoft.com/kb/823659

    Hope it has been useful for you. Feel free to share your thoughts.

     

  • So you wanted to deploy Domain Controllers faster...Now you can!

    A Domain Controller must have a unique name, invocation ID, and security identifier (SID) in the entire forest.
    Up to Windows Server 2008 R2, promoting "sysprepped" standalone images multiple times was the fastest you could go in order to deploy a large number of Domain Controllers.
    Sysprep was needed to ensure that the deployed images were unique. Yes, scripts and answer files could be used for unattended deployment in order to minimize administrative overhead and deployment times.

    As you could guess from my previous post, Microsoft has introduced new features in Hyper-V and Windows Server 2012 that allow you to do things with Virtualized DCs that were impossible before.
    This became possible due to the introduction of an identifier called VM-Generation ID on the Hypervisor driver and the corresponding attribute in AD (msDS-GenerationID).

    The feature I want to tell you about is VDC Cloning.

    Now a cloned Windows Server 2012 Domain Controller automatically executes Sysprep and uses the existing NTDS and SYSVOL data for promotion (similar to IFM - Install from Media). Additionally, you may create a configuration file (DCCloneConfig.xml).
    Then you just need to copy or export the source VDC and it's done.
    You get faster deployment, scalability and easier disaster recovery.

    Security wise, Domain Admins select DCs that can be cloned.
    The Hyper-V administrators can only deploy the approved cloned DCs thus ensuring that unauthorized users cannot create rogue DCs.

    The following table exemplifies how VDC Cloning is achieved:

    Decision point             Answer  Action                      Outcome
    -------------------------  ------  --------------------------  ----------------
    VM-Generation ID exists?   No
    DCCloneConfig exists?      Yes     Rename DCCloneConfig        Reboot into DSRM

    VM-Generation ID exists?   No
    DCCloneConfig exists?      No                                  Normal Reboot

    VM-Generation ID exists?   Yes
    Do IDs match?              Yes
    DCCloneConfig exists?      Yes     Rename DCCloneConfig        Normal Reboot

    VM-Generation ID exists?   Yes
    Do IDs match?              Yes
    DCCloneConfig exists?      No                                  Normal Reboot

    VM-Generation ID exists?   Yes
    Do IDs match?              No      VDC safe restore triggered
    DCCloneConfig exists?      No
    Is there a Duplicated IP?  No                                  Normal Reboot

    VM-Generation ID exists?   Yes
    Do IDs match?              No      VDC safe restore triggered
    DCCloneConfig exists?      No
    Is there a Duplicated IP?  Yes                                 Reboot into DSRM

    VM-Generation ID exists?   Yes
    Do IDs match?              No      VDC safe restore triggered
    DCCloneConfig exists?      Yes     Clone
    Clone Succeeded?           No                                  Reboot into DSRM

    VM-Generation ID exists?   Yes
    Do IDs match?              No      VDC safe restore triggered
    DCCloneConfig exists?      Yes     Clone
    Clone Succeeded?           Yes                                 Normal Reboot
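
    The decision table above can also be read as a small function. The sketch below is in Python and is only my reading of the table, not the actual boot logic of Windows Server 2012; the function and parameter names are hypothetical.

```python
def clone_boot_decision(gen_id_exists, ids_match, clone_config_exists,
                        duplicate_ip=False, clone_succeeded=False):
    """Return (actions, outcome) for one boot, following the table rows.

    ids_match is ignored when no VM-Generation ID exists; duplicate_ip and
    clone_succeeded only matter on the branches where the table asks them.
    """
    if not gen_id_exists or ids_match:
        # Nothing indicates a restore or a clone.
        if clone_config_exists:
            outcome = "Normal Reboot" if gen_id_exists else "Reboot into DSRM"
            return (["Rename DCCloneConfig"], outcome)
        return ([], "Normal Reboot")

    # The IDs differ: safe restore is triggered first.
    actions = ["VDC safe restore triggered"]
    if clone_config_exists:
        actions.append("Clone")
        outcome = "Normal Reboot" if clone_succeeded else "Reboot into DSRM"
        return (actions, outcome)
    outcome = "Reboot into DSRM" if duplicate_ip else "Normal Reboot"
    return (actions, outcome)


if __name__ == "__main__":
    # Last row of the table: IDs differ, config present, clone succeeded.
    print(clone_boot_decision(True, False, True, clone_succeeded=True))
```

    Every path that can leave a half-cloned or duplicated DC on the network ends in DSRM, which is the safety net described in the rest of this post.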

     

    So in a nutshell, when cloning succeeds the following will happen:

    Cloning starts if the two IDs don't match and a DCCloneConfig.xml file exists. (If the xml file doesn't exist, the VDC performs a snapshot safe restore - please refer to my previous blog for more information.)

    The DSRM boot flag is set (to prevent booting into production in case cloning fails).

    Then the VDC checks for the VDCisCloning DWORD under the NTDS parameters reg key. If it doesn't exist, NTDS invalidates the RID pool and resets the Invocation ID; if it exists, then cloning has failed before (and went to DSRM as a safety mechanism) and this is a re-attempt of cloning.

    The Cloned VM gets the configuration settings (from DCCloneConfig.xml) and all ADDS related services are stopped.

    Time synchronization is enforced (using DOMHIER) and Kerberos tickets are purged.

    All FRS/DFSR database files are deleted (Not the SYSVOL content which will be used later for pre-seeding in order to minimize SYSVOL replication traffic) and services set to start automatically.

    The cloned VM is renamed and DC promotion starts using the existing NTDS.DIT file as source (which will minimize replication traffic later).

    Gets a new RID pool from the RID Master.

    New Invocation ID and NTDS settings are created.

    Starts inbound replication.

    FRS/DFSR services start and non-authoritative restore of the SYSVOL is triggered.

    Enables DNS client registration.

    Runs Sysprep (DefaultDCCloneAllowList.xml determines which Sysprep modules are run)

    Once promotion completes, the DSRM boot flag is removed, the DCCloneConfig is renamed (which will prevent it from being read during reboot), the VDCisCloning DWORD is removed and "VDC Cloning Done" is set to 1.

    The value of VM-GenerationID (from HV) is copied to the VDC computer account VM-Generation ID (msDS-GenerationID attribute).

    Finally once the VDC reboots it starts advertising itself as a DC.

     

    Note: Currently, this feature only works on Windows Server 2012 Hypervisor or HV2012.

    Hope this helps on your future DC deployments and feel free to comment.

    Thanks

    Paulo

  • USN Rollback, Virtualized DCs and improvements on Windows Server 2012.

    The USN rollback issue has been causing hundreds of support calls and AD replication halts throughout the world since the introduction of AD in Windows 2000 Server and up to Windows Server 2008 R2.

    Every DC maintains a table - ReplUpToDateVector - (or Up-to-Dateness Vector) per Naming Context (NC or AD partition).
    These tables record data from the local DC and its replication partners. For a particular partition, this data includes the uuidDSA (or DSA GUID), the usnHighPropUpdate (or High Watermark) and the timeLastSyncSuccess (the time stamp of the last successful replication from that replication partner).

    When a change is made (i.e. an object is created or deleted, or an attribute of an object is modified), one or more attributes will have their Originating and Local USN incremented.

    Example:

    repadmin /showobjmeta * OU=USNROLLBACK,DC=contoso,DC=com

    Additionally, the ReplUptoDateVector table (UTDVEC table from now on) on the local DC for itself will be updated.

    screenshot of repadmin /showutdvec * dc=contoso,dc=com

     

    Normal operation. In this case there is no outstanding replication from ContosoDC1 to ContosoDC2 (the values for ContosoDC1 match on both DCs); on the other hand, ContosoDC1 will have to replicate 66 changes (220356-220290) from ContosoDC2.

     

    A replication partner then compares this with its own version of the table and requests from the source the changes that are higher than its High Watermark.

    If the USN for the DC on the replication partners is higher than the one the DC has for itself, you are dealing with a USN Rollback.

    Example:

    screenshot of repadmin /showutdvec ( relevant USNs highlighted)

     

    USN Rollback. Note that ContosoDC1 "thinks" that ContosoDC2 has a higher "High Watermark" than it has in reality. So without the USN rollback protection mechanism, the next 182 (220380-220198) changes originated on ContosoDC2 will be discarded by ContosoDC1.
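
    The arithmetic in the two examples above is simple enough to sketch. The Python below is only an illustration of the comparison described in this post, using the USN values from the screenshots; the function names are hypothetical and real replication tracks considerably more state than two integers.

```python
def pending_changes(source_usn: int, known_by_destination: int) -> int:
    """Changes the destination still has to pull from the source."""
    return max(source_usn - known_by_destination, 0)


def is_usn_rollback(own_usn: int, known_by_partner: int) -> bool:
    """A partner remembering a higher USN than the DC has for itself."""
    return known_by_partner > own_usn


def changes_at_risk(own_usn: int, known_by_partner: int) -> int:
    """Future changes the partner would silently discard after a rollback."""
    return max(known_by_partner - own_usn, 0)


if __name__ == "__main__":
    # Normal operation: ContosoDC1 is 66 changes behind ContosoDC2.
    print(pending_changes(220356, 220290))   # 66
    # Rollback: ContosoDC2 is at 220198 but ContosoDC1 remembers 220380.
    print(is_usn_rollback(220198, 220380))   # True
    print(changes_at_risk(220198, 220380))   # 182
```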

     

    When a USN rollback is detected, the originating DC logs Replication Event ID 2095 in the Directory Service log and disables inbound and outbound replication as a protection mechanism in order to avoid further damage.

    screenshot of event 2095

    Without this safety valve, further changes made on the originating DC would not be replicated. Replication would only resume once the originating DC caught up with the USN known by its replication partners, and any changes made in between would be lost forever.

    NOTE: Ensure that "Allow replication with corrupt and divergent partners" is not in use, or this protection will be ignored.
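
    To see why those changes would be lost, here is a toy model of a partner that only accepts changes above the High Watermark it remembers. It is a deliberately simplified sketch in Python (the function name, the individual USNs and the change descriptions are made up for illustration), not how the replication engine is implemented.

```python
def accepted_by_partner(changes, partner_watermark):
    """Keep only the changes whose USN is above the partner's High Watermark."""
    return [(usn, what) for usn, what in changes if usn > partner_watermark]


# The partner still remembers 220380 for the rolled-back DC, which is now
# back at 220198 and hands out USNs from 220199 upward again.
new_changes = [
    (220199, "create user A"),
    (220200, "change password of user B"),
    (220381, "create group C"),
]

replicated = accepted_by_partner(new_changes, partner_watermark=220380)
lost = [change for change in new_changes if change not in replicated]

if __name__ == "__main__":
    print("replicated:", replicated)
    print("never replicated:", lost)
```

    Only the change made after the DC has caught up with the remembered USN gets through; the earlier ones exist on a single DC and nowhere else.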
     
    In order to avoid this issue you should back up your DCs using a supported method, that is, an AD-aware backup application (NTBackup and Windows Backup are AD aware, as are some 3rd party backup applications).
    What you should NOT do as a replacement for the applications above is restore AD from unsupported backup methods like:

    Disk Mirroring
    Cloning
    VHD copies
    VM snapshots
    or any other cloning method that doesn't reset the DSA Invocation ID when an AD restore is executed.

    Note:

    The DSA invocation ID is reset once you restore AD using a supported method. Thus replication partners will update their UTDVEC tables with the new value for the restored DC. This doesn't happen when using unsupported methods.

    To fix the problem there are two supported methods:

    1. Reinstall Active Directory on the affected Domain Controller.

    Transfer any FSMO roles if needed.
    Demote the DC.
    Perform a metadata clean-up of all references to the DC.
    Re-promote the DC.

    2. Restore the System State.

    If a valid system state backup was made before the DC was restored using an unsupported method, restore the system state from the most recent such backup.

    For more information on how AD replication works and USN rollback please refer to the following articles:

    How the Active Directory Replication Model Works
    http://technet.microsoft.com/en-us/library/cc772726(WS.10).aspx

    Running Domain Controllers in Hyper-V
    http://technet.microsoft.com/en-us/library/d2cae85b-41ac-497f-8cd1-5fbaa6740ffe(v=WS.10)#usn_and_usn_rollback

    With Windows Server 2012 virtualized Domain Controllers, you can now restore snapshots without permanently damaging the domain controllers.
    While this does not prevent other issues for other technologies and applications, it does make domain controller virtualization safer.

    Now a virtualized domain controller snapshot restore resets the DC's unique Invocation ID.
    It additionally discards the local RID pool and non-authoritatively restores the SYSVOL folder.
    This means that accidentally restoring a snapshot is no longer an unsafe operation on domain controllers.

    The following process describes how Virtualized DC (VDC) Safe Restore is achieved:

    1. An existing virtual machine (VM) domain controller is restored from a snapshot on a hypervisor that supports VM-Generation ID (Windows Server 2012 Hyper-V for instance).

    This assumes that, when the snapshot was taken, the VM already had a VM-Generation ID stored on its DC computer object in the msDS-GenerationID attribute (Schema version 56).

    2. The VM then reads the VM-Generation ID provided by the Hyper-V VMGenerationCounter driver and compares the two VM-Generation ID values.

    If they do not match, it continues with snapshot restoration operations.
    Once restored, the Generation-ID set on the DC computer object (in AD) is updated to match the new ID provided by the hypervisor host.

    If the hypervisor does not provide a VM-Generation ID for comparison, the guest will operate like a Windows Server 2008 R2 or earlier virtualized domain controller.

    3. The Virtualized DC then:

    Invalidates the local RID pool.
    Sets a new DSA invocation ID.

    4. Non-authoritative inbound replication is triggered from a replication partner. The DC requests changes starting at a USN that precedes the USN at which the local directory service was restored. The UTDVEC table of the destination DC is updated appropriately.

    5. The virtualized DC synchronizes the SYSVOL:

    If using FRS, it stops the NTFRS service and sets the BURFLAGS registry value (D2).
    It then starts the NTFRS service, thus performing a non-authoritative restore of the SYSVOL.

    If using DFSR, it stops the DFSR service and deletes the DFSR database files. It then starts the DFSR service, thus performing a non-authoritative restore of the SYSVOL.

    6. The VM updates the msDS-GenerationID attribute on its own DC object to match the current Hypervisor VM-Generation ID
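
    The safe restore steps above can be summarized in a toy model. This is a minimal sketch in Python under my own assumptions: the class, its fields and the function are hypothetical stand-ins for state that really lives in AD and in the hypervisor, and the SYSVOL resync is reduced to a flag.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VirtualDC:
    stored_generation_id: str                      # stands in for msDS-GenerationID
    invocation_id: uuid.UUID = field(default_factory=uuid.uuid4)
    rid_pool_valid: bool = True
    sysvol_resync_requested: bool = False


def on_boot(dc: VirtualDC, hypervisor_generation_id: Optional[str]) -> bool:
    """Run the safe-restore checks; return True if safe restore was performed."""
    if hypervisor_generation_id is None:
        # No VM-Generation ID offered: behave like a pre-2012 virtualized DC.
        return False
    if hypervisor_generation_id == dc.stored_generation_id:
        return False                               # not a restored snapshot
    dc.rid_pool_valid = False                      # step 3: invalidate RID pool
    dc.invocation_id = uuid.uuid4()                # step 3: new invocation ID
    dc.sysvol_resync_requested = True              # step 5: non-authoritative SYSVOL
    dc.stored_generation_id = hypervisor_generation_id   # step 6
    return True


if __name__ == "__main__":
    dc = VirtualDC(stored_generation_id="gen-1")
    print(on_boot(dc, "gen-1"))   # same ID, nothing to do
    print(on_boot(dc, "gen-2"))   # snapshot restored, safe restore runs
```

    The key design point is that the new invocation ID makes replication partners treat the restored DC as a new source, which is exactly what the unsupported restore methods listed earlier fail to do.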

    If you haven't got started with Windows Server 2012, download Windows Server 2012 from:
    http://technet.microsoft.com/en-us/evalcenter/hh670538.aspx

     

    To finish this post on a personal note: of all the improvements in Windows Server 2012, this is my preferred feature. And I think that whoever imagined it deserves a big round of applause from all of us.

    Feel free to leave your comments.

    Cheers.