
Ask the Directory Services Team

  • A Treatise on Group Policy Troubleshooting–now with GPSVC Log Analysis!

    Hi all, David Ani here from Romania. This guide outlines basic steps used to troubleshoot Group Policy application errors using the Group Policy Service Debug logs (gpsvc.log). A basic understanding of the logging discussed here will save time and may prevent you from having to open a support ticket with Microsoft. Let's get started.

    The gpsvc log has evolved from the User Environment Debug Logs (userenv log) in Windows XP and Windows Server 2003 but the basics are still there and the pattern is the same. There are also changes from 2008 to 2012 in the logging itself but they are minor and will not prevent you from understanding your first steps in analyzing the debug logs.

    Overview of Group Policy Client Service (GPSVC)

    • One of the major changes that came with Windows Vista and later operating systems is the new Group Policy Client service. Earlier operating systems used the WinLogon service to apply Group Policy. However, the new Group Policy Client service improves the overall stability of the Group Policy infrastructure and the operating system by isolating it from the WinLogon process.
    • The service is responsible for applying settings configured by administrators to computers and users through the Group Policy component. If the service is stopped or disabled, the settings will not be applied, so applications and components will not be manageable through Group Policy. Please keep in mind that, to increase security, users cannot start or stop the Group Policy Client service. In the Services snap-in, the options to start, stop, pause, and resume the Group Policy client are unavailable.
    • Finally, any components or applications that depend on the Group Policy component will not be functional if the service is stopped or disabled.

    Note: The important thing to remember is that the Group Policy Client is a service running on every OS since Vista and is responsible for applying GPOs. The process itself will run under a svchost instance, which you can check by using the “tasklist /svc” command line.


    One final point: Since the startup value for the service is Automatic (Trigger Start), you may not always see it in the list of running services. It will start, perform its actions, and then stop.

    Group Policy processing overview
    Group Policy processing happens in two phases:

    • Group Policy Core Processing - where the client enumerates all Group Policies together with all settings that need to be applied. It will connect to a Domain Controller, accessing Active Directory and SYSVOL and gather all the required data in order to process the policies.
    • Group Policy CSE Processing - Client Side Extensions (CSEs) are responsible for client side policy processing. These CSEs ensure all settings configured in the GPOs will be applied to the workstation or server.

    Note: The Group Policy architecture includes both server and client-side components. The server component includes the user interface (GPEdit.msc, GPMC.msc) that an administrator can use to configure a unique policy. GPEdit.msc is always present, even on client SKUs, while GPMC.msc and GPME.msc are installed either via RSAT or when the machine is a domain controller. When Group Policy is applied to a user or computer, the client component interprets the policy and makes the appropriate changes to the environment. These client components are known as Group Policy client-side extensions.

    See the following post for a reference list for most of the CSEs: http://blogs.technet.com/b/mempson/archive/2010/12/01/group-policy-client-side-extension-list.aspx

    In troubleshooting a given extension's application of policy, the administrator can view the configuration parameters for that extension. These parameters are in the form of registry values. There are two things to keep in mind:

    • When configuring GPOs in your Domain you must make sure they have been replicated to all domain controllers, both in AD and SYSVOL. It is important to understand that AD replication is not the same as SYSVOL replication and one can be successful while the other may not. However, if you have a Windows 8 or Windows Server 2012 or later OS, this is easily verified using the Group Policy Management Console (GPMC) and the status tab for an Organizational Unit (OU).
    • At a high level, we know that the majority of your GPO settings are just registry keys that need to be delivered and set on a client under the user or machine keys.

    First troubleshooting steps

    • Start by using GPResult or the Group Policy Results wizard in GPMC and check which GPOs have been applied. What are the winning GPOs? Are there contradictory settings? Finally, be on the lookout for Loopback Policy Processing that can sometimes deliver unexpected results.

    Note: To have a better understanding of Loopback Policy Processing please review this post: http://blogs.technet.com/b/askds/archive/2013/02/08/circle-back-to-loopback.aspx

    • On the target client, you can run GPResult /v or /h and verify that the GPO is there and listed under “Applied GPOs.” Is it listed? It should look the same as the results from the Group Policy Results wizard in GPMC. If not, verify replication and check that policy has been applied recently.

    Note: You can always force a group policy update on a client with gpupdate /force. This will require admin privileges for the computer side policies. If you do not have admin rights an old fashioned reboot should force policy to apply.

    • If the Group Policy is unexpectedly listed under “Denied GPOs”, then please check the following:

    – If the reason for “Denied GPOs” is empty, then you probably have linked a User Configuration GPO to an OU with computers or the other way around. Link the GPO to the corresponding OU, the one which contains your users.

    – If the reason for “Denied GPOs” is “Access Denied (Security Filtering)”, then make sure you have the correct objects (Authenticated Users or desired Group) in “Security Filtering” in GPMC. Target objects need at least “Read” and “Apply Group Policy” permissions.

    – If the reason for “Denied GPOs” is “False WMI Filter”, then make sure you configure the WMI filter accordingly, so that the GPO works with the WMI filter for the desired user and computers.

    See the following TechNet reference for more on WMI Filters: http://technet.microsoft.com/en-us/library/cc787382(v=ws.10).aspx

    – If the Group Policy isn’t listed in gpresult.exe at all, verify the scope by ensuring that the user or computer object in Active Directory resides in the OU tree the Group Policy is linked to in GPMC.


    Start Advanced Troubleshooting

    • If the problem cannot be identified from the previous steps, then we can enable gpsvc logging. On the client where the GPO problem occurs, follow these steps to enable Group Policy Service debug logging.

    1. Click Start, click Run, type regedit, and then click OK.
    2. Locate and then click the following registry subkey: HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion
    3. On the Edit menu, point to New, and then click Key.
    4. Type Diagnostics, and then press ENTER.
    5. Right-click the Diagnostics subkey, point to New, and then click DWORD (32-bit) Value.
    6. Type GPSvcDebugLevel, and then press ENTER.
    7. Right-click GPSvcDebugLevel, and then click Modify.
    8. In the Value data box, type 30002 (Hexadecimal), and then click OK.

    9. Exit Registry Editor.
    10. View the Gpsvc.log file in the following folder: %windir%\debug\usermode

    Note: If the usermode folder does not exist under %windir%\debug, create it. Without that folder, the gpsvc.log file will not be created.
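    The registry steps above can also be captured as a .reg file and imported on the affected client. What follows is only a minimal Python sketch that generates the file text; the function name build_reg_file is hypothetical, and the sketch assumes you still create the usermode folder yourself.

```python
# Sketch only: builds the text of a .reg file that sets the value
# described in steps 2 to 8. It does not touch the registry itself.
KEY_PATH = (
    r"HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT"
    r"\CurrentVersion\Diagnostics"
)


def build_reg_file(debug_level: int = 0x30002) -> str:
    """Return .reg file text that sets GPSvcDebugLevel to debug_level."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY_PATH}]",
        # .reg files store DWORD values as eight hexadecimal digits.
        f'"GPSvcDebugLevel"=dword:{debug_level:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file())
```

    Saving the output as a .reg file and importing it with Registry Editor should have the same effect as the manual steps. To turn logging off again, delete the GPSvcDebugLevel value.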

    • Now, you can either do a “gpupdate /force” to trigger GPO processing or do a restart of the machine in order to get a clean boot application of group policy (Foreground vs Background GPO Processing).
    • After that, the log itself should be found under: C:\Windows\Debug\Usermode\gpsvc.log

    An important note for Windows 7 / Windows Server 2008 R2 or older operating systems: On multiprocessor machines, concurrent threads may write to the log at the same time. In heavy logging scenarios, one of the write attempts may fail and debug log information may be lost.
    Concurrent processing is very common with group policy troubleshooting since you usually run "gpupdate /force" without specifying user or machine processing separately. To reduce the chance of lost logging while troubleshooting, initiate machine and user policy processing separately:

    • Gpupdate /force /target:computer
    • Gpupdate /force /target:user


    Analysis - Understanding PID, TID and Dependencies
    Now let's get started with the GPSVC Log analysis! The first thing to understand is the Process Identifier (PID) and Thread Identifier (TID) of a gpsvc log. Here is an example:

    GPSVC(31c.328) 10:01:56:711 GroupPolicyClientServiceMain

    What are those? As an example I took “GPSVC(31c.328)”, where the first number is 31c, which directly relates to the PID. The second number is 328, which relates to the TID. We know that the 31c doesn’t look like a PID, but that’s because it is in Hexadecimal. By translating it into decimal, you will get the PID of the process for the SVCHOST containing the GPSVC.

    Then we have a TID, which will differ for every thread the GPClient is working on. One thing to consider: we will have two different threads for Machine and User GPO processing, so make sure you follow the correct one.

    Example:

    GPSVC(31c.328) 10:01:56:711 CGPService::Start: InstantiateGPEngine
    GPSVC(31c.328) 10:01:56:726 CGPService::InitializeRPCServer starting RPCServer

    GPSVC(31c.328) 10:01:56:741 CGPService::InitializeRPCServer finished starting RPCServer. status 0x0
    GPSVC(31c.328) 10:01:56:741 CGPService::Start: CreateGPSessions
    GPSVC(31c.328) 10:01:56:758 Updating the service status to be RUNNING.

    This shows that the GPService Engine is being started and we can see that it also checks for dependencies (RPCServer) to be started.
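    To follow a single thread through a large log, it can help to split each line into its PID, TID, timestamp, and message. The following is a minimal Python sketch under the assumption that every entry follows the “GPSVC(pid.tid) hh:mm:ss:mmm message” layout shown above; the function names are hypothetical.

```python
import re
from collections import defaultdict

# Layout assumed from the samples: GPSVC(<hex pid>.<hex tid>) <time> <message>
LINE_RE = re.compile(
    r"^\s*GPSVC\(([0-9a-fA-F]+)\.([0-9a-fA-F]+)\)\s+"
    r"(\d{2}:\d{2}:\d{2}:\d{3})\s+(.*)$"
)


def parse_line(line):
    """Split one gpsvc.log line; return None for lines that do not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    pid_hex, tid_hex, timestamp, message = match.groups()
    return {
        "pid": int(pid_hex, 16),  # hexadecimal in the log, decimal in tasklist
        "tid": int(tid_hex, 16),
        "time": timestamp,
        "message": message,
    }


def group_by_thread(lines):
    """Group parsed entries by decimal TID, keeping the original order."""
    threads = defaultdict(list)
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            threads[entry["tid"]].append(entry)
    return dict(threads)


if __name__ == "__main__":
    sample = "GPSVC(31c.328) 10:01:56:711 GroupPolicyClientServiceMain"
    print(parse_line(sample))
```

    For the sample line, 31c converts to PID 796 and 328 to TID 808. The decimal PID is the value to compare against the “tasklist /svc” output.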

    Synchronous vs Asynchronous Processing
    I will not spend a lot of time explaining this because there is a great post from the GP Team out there which explains this very well. This is important to understand because it has a big impact on how settings are applied and when. Look at:
    http://blogs.technet.com/b/grouppolicy/archive/2013/05/23/group-policy-and-logon-impact.aspx

    Synchronous vs. asynchronous processing
    Foreground processing can operate under two different modes—synchronously or asynchronously. The default foreground processing mode for Windows clients since Windows XP has been asynchronous.

    Asynchronous GP processing does not prevent the user from using their desktop while GP processing completes. For example, when the computer is starting up GP asynchronous processing starts to occur for the computer. In the meantime, the user is presented the Windows logon prompt. Likewise, for asynchronous user processing, the user logs on and is presented with their desktop while GP finishes processing. There is no delay in getting either their logon prompt or their desktop during asynchronous GP processing. When foreground processing is synchronous, the user is not presented with the logon prompt until computer GP processing has completed after a system boot. Likewise the user will not see their desktop at logon until user GP processing completes. This can have the effect of making the user feel like the system is running slow. To summarize, synchronous processing can impact startup time while asynchronous does not.

    Foreground processing will run synchronously for two reasons:

    1)      The administrator forces synchronous processing through a policy setting. This can be done by enabling the Computer Configuration\Policies\Administrative Templates\System\Logon\Always wait for the network at computer startup and logon policy setting. Enabling this setting will make all foreground processing synchronous. This is commonly used for troubleshooting problems with Group Policy processing, but doesn’t always get turned back off again.

    Note: For more information on fast logon optimization see:
    305293 Description of the Windows Fast Logon Optimization feature
    http://support.microsoft.com/kb/305293

    2)      A particular CSE requires synchronous foreground processing. There are four CSEs provided by Microsoft that currently require synchronous foreground processing: Software Installation, Folder Redirection, Microsoft Disk Quota and GP Preferences Drive Mapping. If any of these are enabled within one or more GPOs, they will trigger the next foreground processing cycle to run synchronously when they are changed.

    Action: Avoid synchronous CSEs and don’t force synchronous policy. If usage of synchronous CSEs is necessary, minimize changes to these policy settings.

    Analysis - Starting to read into the gpsvc log

    First, we identify where the machine settings are starting, because they process first:

    GPSVC(31c.37c) 10:01:57:101 CStatusMessage::UpdateWinlogonStatusMessage::++ (bMachine: 1)
    GPSVC(31c.37c) 10:01:57:101 Message Status = <Applying computer settings>
    GPSVC(31c.37c) 10:01:57:101 User SID = MACHINE SID
    GPSVC(31c.37c) 10:01:57:101 Setting GPsession state = 1
    GPSVC(31c.174) 10:01:57:101 CGroupPolicySession::ApplyGroupPolicyForPrincipal::++ (bTriggered: 0, bConsole: 0)

    The above lines are quite clear: “<Applying computer settings>” and “User SID = MACHINE SID” show that we are in machine context. The “bConsole: 0” part is a Boolean “Console” flag with a value of 0 (false), meaning there is no user involved and this is machine processing.

     

    GPSVC(31c.174) 10:01:57:101 Waiting for connectivity before applying policies
    GPSVC(31c.174) 10:01:57:116 CGPApplicationService::MachinePolicyStartedWaitingOnNetwork.
    GPSVC(31c.564) 10:01:57:804 NlaGetIntranetCapability returned Not Ready error. Consider it as NOT intranet capable.
    GPSVC(31c.564) 10:01:57:804 There is no connectivity. Waiting for connectivity again...
    GPSVC(31c.564) 10:01:59:319 There is connectivity.
    GPSVC(31c.564) 10:01:59:319 Wait For Connectivity: Succeeded
    GPSVC(31c.174) 10:01:59:319 We have network connectivity... proceeding to apply policy.

    This shows us that, at this moment in time, the machine does not have connectivity. However, it does state that it is going to wait for connectivity before applying the policies. After two seconds, we can see that it does find connectivity and moves on with GPO processing.
    It is important to understand that there is a default timeout when waiting for connectivity. The default value is 30 seconds, which is configurable.

    Connectivity
    Now let’s check a bad case scenario where there won’t be a connection available and we run into a timeout:

    GPSVC(324.148) 04:58:34:301 Waiting for connectivity before applying policies
    GPSVC(324.578) 04:59:04:301 CConnectivityWatcher::WaitForConnectivity: Failed WaitForSingleObject.
    GPSVC(324.148) 04:59:04:301 Wait for network connectivity timed out... proceeding to apply policy.
    GPSVC(324.148) 04:59:04:301 CGroupPolicySession::ApplyGroupPolicyForPrincipal::ApplyGroupPolicy (dwFlags: 7).
    GPSVC(324.148) 04:59:04:317 Application complete with bConnectivityFailure = 1.

    As we can see, the wait times out after 30 seconds and the service then proceeds to apply policies.
    Without a network connection, no policies can be retrieved from the domain and no version checks can be made between the cached copies and the domain copies.
    In such cases, you will always encounter “bConnectivityFailure = 1”. This flag is not specific to a general network outage; it is set for any connectivity problem that the machine encounters, such as a failed LDAP bind.
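    The time spent waiting for the network can be read directly from the timestamps. Below is a small Python sketch that measures it, assuming the message texts shown in the two samples above and a log that does not cross midnight; the function names are hypothetical.

```python
import re
from datetime import timedelta

STAMP_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2}):(\d{3})")


def stamp_to_delta(line):
    """Convert the hh:mm:ss:mmm stamp in a log line to a timedelta."""
    match = STAMP_RE.search(line)
    if match is None:
        raise ValueError("no timestamp found in line")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes,
                     seconds=seconds, milliseconds=millis)


def connectivity_wait(lines):
    """Return (seconds_waited, timed_out) or None if no wait was logged."""
    start = None
    for line in lines:
        if "Waiting for connectivity before applying policies" in line:
            start = stamp_to_delta(line)
        elif start is not None and (
            "We have network connectivity" in line
            or "Wait for network connectivity timed out" in line
        ):
            waited = (stamp_to_delta(line) - start).total_seconds()
            return waited, "timed out" in line
    return None
```

    Run against the first sample this gives roughly 2.2 seconds with no timeout; against the second it gives 30 seconds with the timeout flag set, which matches the default wait described above.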

    Slow Link Detection

    GPSVC(31c.174) 10:01:59:397 GetDomainControllerConnectionInfo: Enabling bandwidth estimate.
    GPSVC(31c.174) 10:01:59:397 Started bandwidth estimation successfully
    GPSVC(31c.174) 10:01:59:976 Estimated bandwidth : DestinationIP = 192.168.1.102
    GPSVC(31c.174) 10:01:59:976 Estimated bandwidth : SourceIP = 192.168.1.105
    GPSVC(31c.174) 10:02:00:007 IsSlowLink: Bandwidth Threshold (WINLOGON) = 500.
    GPSVC(31c.174) 10:02:00:007 IsSlowLink: Bandwidth Threshold (SYSTEM) = 500.
    GPSVC(31c.174) 10:02:00:007 IsSlowLink: WWAN Policy (SYSTEM) = 0.
    GPSVC(31c.174) 10:02:00:007 IsSlowLink: Current Bandwidth >= Bandwidth Threshold.

    Moving further, we can see that a bandwidth estimation is taking place. Since Vista, this is done through Network Location Awareness (NLA).

    Slow Link Detection Backgrounder from our very own "Group Policy Slow Link Detection using Windows Vista and later"

    The Group Policy service begins bandwidth estimation after it successfully locates a domain controller. Domain controller location includes the IP address of the domain controller. The first action performed during bandwidth estimation is an authenticated LDAP connect and bind to the domain controller returned during the DC Locator process.

    This connection to the domain controller is done under the user's security context and uses Kerberos for authentication. This connection does not support using NTLM. Therefore, this authentication sequence must succeed using Kerberos for Group Policy to continue to process. Once successful, the Group Policy service closes the LDAP connection. The Group Policy service makes an authenticated LDAP connection in computer context when user policy processing is configured in loopback-replace mode.

    The Group Policy service then determines the network name. The service accomplishes this by using IPHelper APIs to determine the best network interface in which to communicate with the IP address of the domain controller. Additionally, the domain controller and network name are saved in the client computer's registry for future use.

    The Group Policy service is ready to determine the status of the link between the client computer and the domain controller. The service asks NLA to report the estimated bandwidth it measured while earlier Group Policy actions occurred. The Group Policy service compares the value returned by NLA to the GroupPolicyMinTransferRate named value stored in Registry.

    The default minimum transfer rate to measure Group Policy slow link is 500 (Kbps). The link between the domain controller and the client is slow if the estimated bandwidth returned by NLA is lower than the value stored in the registry. The policy value has precedence over the preference value if both values appear in the registry. After successfully determining the link state (fast or slow—no errors), then the Group Policy service writes the slow link status into the Group Policy history, which is stored in the registry. The named value is IsSlowLink.

    If the Group Policy service encounters an error, it reads the last recorded value from the history key and uses that true or false value for the slow link status.
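    The decision described above boils down to one comparison with a fallback. Here is a minimal Python sketch of that logic; it is an illustration only, not the service's actual implementation, and the function and parameter names are hypothetical.

```python
from typing import Optional

DEFAULT_MIN_TRANSFER_RATE_KBPS = 500  # default threshold quoted above


def is_slow_link(
    estimated_kbps: Optional[float],
    threshold_kbps: float = DEFAULT_MIN_TRANSFER_RATE_KBPS,
    last_recorded: bool = False,
) -> bool:
    """Return True if the link should be treated as slow.

    estimated_kbps is None when the bandwidth estimate failed; in that
    case the previously recorded IsSlowLink value is reused.
    """
    if estimated_kbps is None:
        return last_recorded
    # The link is slow only when the estimate is below the threshold.
    return estimated_kbps < threshold_kbps


if __name__ == "__main__":
    print(is_slow_link(1200))                      # fast link
    print(is_slow_link(None, last_recorded=True))  # falls back to history
```

    An estimate exactly at the threshold counts as fast, which is consistent with the “Current Bandwidth >= Bandwidth Threshold” line in the log excerpt.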

    There is updated client-side behavior with Windows 8.1 and later:
    What's New in Group Policy in Windows Server - Policy Caching

    In Windows Server 2012 R2 and Windows 8.1, when Group Policy gets the latest version of a policy from the domain controller, it writes that policy to a local store. Then if Group Policy is running in synchronous mode the next time the computer reboots, it reads the most recently downloaded version of the policy from the local store, instead of downloading it from the network. This reduces the time it takes to process the policy. Consequently, the boot time is shorter in synchronous mode. This is especially important if you have a latent connection to the domain controller, for example, with DirectAccess or for computers that are off premises. This behavior is controllable by a new policy called Configure Group Policy Caching.

    - The updated slow link detection only takes place during synchronous policy processing. It “pings” the Domain Controller by calling DsGetDcName and measures the duration.

    - By default, the Configure Group Policy Caching group policy setting is set to Not Configured. In that state the feature is enabled, using the default values for slow link detection (500ms) and for the time-out when communicating with a Domain Controller to determine whether it is on the network (5000ms), provided the conditions below are met:

    o The Turn off background refresh of Group Policy policy setting is Not Configured or Disabled.

    o The Configure Group Policy slow link detection policy setting is Not Configured, or, when Enabled, contains a value for Connection speed (Kbps) that is not outlandish (500 is the default value).

    o The Set Group Policy refresh interval for computers is Not Configured or, when Enabled, contains values for Minutes that are not outlandish (90 and 30 at the default values).

    Order of processing settings
    Next on the agenda is retrieving GPOs from the domain. This is where Group Policy processing order and precedence come in: the Group Policy objects that apply to a user (or computer) do not all have the same precedence.
    Settings that are applied later can override settings that are applied earlier. The policies are applied in the hierarchy --> Local machine, Sites, Domains and Organizational Units (LSDOU).
    For nested organizational units, GPOs linked to parent organizational units are applied before GPOs linked to child organizational units are applied.

    Note: The order in which GPOs are processed is significant because when policy is applied, it overwrites policy that was applied earlier.
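    A simple way to picture “last writer wins” is to merge the settings of each GPO in processing order. The Python sketch below does exactly that and deliberately ignores enforcement, block inheritance, and loopback; the GPO names and setting names in the example are made up for illustration.

```python
def resolve_settings(gpos_in_processing_order):
    """Merge GPO settings so that later GPOs override earlier ones.

    gpos_in_processing_order is a list of (gpo_name, settings_dict)
    tuples, ordered Local, Site, Domain, then OUs from parent to child.
    Returns {setting: (value, winning_gpo_name)}.
    """
    effective = {}
    for gpo_name, settings in gpos_in_processing_order:
        for key, value in settings.items():
            # A later GPO simply overwrites whatever was set before.
            effective[key] = (value, gpo_name)
    return effective


if __name__ == "__main__":
    order = [
        ("Local policy", {"ScreenSaverTimeout": 900}),
        ("Default Domain Policy", {"ScreenSaverTimeout": 600, "Wallpaper": "corp.jpg"}),
        ("GPO Guide test", {"ScreenSaverTimeout": 300}),
    ]
    for setting, (value, winner) in resolve_settings(order).items():
        print(f"{setting} = {value} (from {winner})")
```

    In this example the OU-linked GPO wins for the conflicting setting because it is processed last, while the non-conflicting setting from the domain GPO is kept.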

    There are of course some exceptions to the rule:

    • A GPO link may be enforced, or disabled, or both.
    • A GPO may have its user settings disabled, its computer settings disabled, or all settings disabled.
    • An organizational unit or a domain may have Block Inheritance set.
    • Loopback may be enabled. 

    For a better understanding regarding these, please have a look in the following TechNet article: http://technet.microsoft.com/en-us/library/bb742376.aspx

    How does the order of processing look in a gpsvc log
    In the gpsvc log you will notice that the LDAP search starts at the OU level and works its way up to the site level.

    "The Group Policy service uses the distinguished name of the computer or user to determine the list of OUs and the domain it must search for group policy objects. The Group Policy service builds this list by analyzing the distinguished name from left to right. The service scans the name looking for each instance of OU= in the name. The service then copies the distinguished name to a list, which is used later. The Group Policy service continues to scan the distinguished name for OUs until it encounters the first instance of DC=. At this point, the Group Policy service has found the domain name, finally it searches for policies at site level."

    As you have probably noticed in our example, we only have two GPOs, one at the OU level and one at the Domain level.
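    The distinguished-name scan quoted above can be sketched in a few lines of Python. This is an approximation under stated assumptions: the name is split on plain commas (escaped commas inside a name are not handled), the site lookup is left out, and both the function name and the computer name PC01 are hypothetical.

```python
def build_search_list(distinguished_name):
    """Return the containers to search, from the nearest OU up to the domain."""
    parts = [part.strip() for part in distinguished_name.split(",")]
    search_list = []
    for index, part in enumerate(parts):
        prefix = part.upper()
        if prefix.startswith("OU="):
            # Each OU contributes the rest of the name, starting at that OU.
            search_list.append(",".join(parts[index:]))
        elif prefix.startswith("DC="):
            # The first DC= marks the domain; scanning stops here.
            search_list.append(",".join(parts[index:]))
            break
    return search_list


if __name__ == "__main__":
    dn = "CN=PC01,OU=Workstations,DC=contoso,DC=lab"
    for container in build_search_list(dn):
        print(container)
```

    For that name the sketch yields OU=Workstations,DC=contoso,DC=lab followed by DC=contoso,DC=lab, the same two containers that appear in the SearchDSObject lines of the log.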

    The searches are done using each policy's GUID and not its name, which is also how the policies are stored in Sysvol.
    It is a best practice to know both the name and the GUID of a policy, because this makes troubleshooting much easier.

    GPSVC(31c.174) 10:01:59:413 GetGPOInfo: Entering...
    GPSVC(31c.174) 10:01:59:413 GetMachineToken: Looping for authentication again.
    GPSVC(31c.174) 10:01:59:413 SearchDSObject: Searching <OU=Workstations,DC=contoso,DC=lab>
    GPSVC(31c.174) 10:01:59:413 SearchDSObject: Found GPO(s): <[LDAP://cn={CC02524C-727C-4816-A298-D63D12E68C0F},cn=policies,cn=system,DC=contoso,DC=lab;0]>
    GPSVC(31c.174) 10:01:59:413 ProcessGPO(Machine): ==============================
    GPSVC(31c.174) 10:01:59:413 ProcessGPO(Machine): Deferring search for LDAP://cn={CC02524C-727C-4816-A298-D63D12E68C0F},cn=policies,cn=system,DC=contoso,DC=lab
    GPSVC(31c.174) 10:01:59:413 SearchDSObject: Searching <DC=contoso,DC=lab>
    GPSVC(31c.174) 10:01:59:413 SearchDSObject: Found GPO(s): <[LDAP://CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=System,DC=contoso,DC=lab;0]>
    GPSVC(31c.174) 10:01:59:413 ProcessGPO(Machine): ==============================
    GPSVC(31c.174) 10:01:59:413 ProcessGPO(Machine): Deferring search for LDAP://CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=System,DC=contoso,DC=lab
    GPSVC(31c.174) 10:01:59:522 SearchDSObject: Searching <CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=contoso,DC=lab>
    GPSVC(31c.174) 10:01:59:522 SearchDSObject: No GPO(s) for this object.

    You can see whether the policy link is enabled, disabled, or enforced here:

    GPSVC(31c.174) 10:01:59:413 SearchDSObject: Searching <OU=Workstations,DC=contoso,DC=lab>
    GPSVC(31c.174) 10:01:59:413 SearchDSObject: Found GPO(s): <[LDAP://cn={CC02524C-727C-4816-A298-D63D12E68C0F},cn=policies,cn=system,DC=contoso,DC=lab;0]>

    Note the 0 at the end of the LDAP path; this is the default setting. If the value were 1 instead of 0, it would mean the policy is linked to that particular OU, domain, or site level but the link is disabled. If the value is set to 2, it means that the policy has been set to “Enforced.”

    A setting of “Enforced” means that if two separate GPOs have the same setting defined, but hold different values, the one that is set to “Enforced” will win and will be applied to the client. If a policy is set to “Enforced” at an OU/domain level and an OU below that is set to block inheritance, then the policy set for “Enforced” will still apply. You cannot block a policy from applying if “Enforced” has been set.

    Example of an enforced policy:

    GPSVC(328.7fc) 07:01:14:334 SearchDSObject: Searching <OU=Workstations,DC=contoso,DC=lab>
    GPSVC(328.7fc) 07:01:14:334 SearchDSObject: Found GPO(s): <[LDAP://cn={CC02524C-727C-4816-A298-D63D12E68C0F},cn=policies,cn=system,DC=contoso,DC=lab;2]>
    GPSVC(328.7fc) 07:01:14:334 AllocGpLink: GPO cn={CC02524C-727C-4816-A298-D63D12E68C0F},cn=policies,cn=system,DC=contoso,DC=lab has enforced link.
    GPSVC(328.7fc) 07:01:14:334 ProcessGPO(Machine): ==============================
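    The link option at the end of each LDAP path can be pulled out with a short script. The Python sketch below assumes the “Found GPO(s)” layout shown above. Treating the number as two bit flags (1 for disabled, 2 for enforced, so 3 for both) is an interpretation that fits the values described here, not something the log itself states. The function name is hypothetical.

```python
import re

# Matches [LDAP://<distinguished name>;<option>] inside a "Found GPO(s)" line.
GPO_LINK_RE = re.compile(r"\[LDAP://([^;\]]+);(\d+)\]", re.IGNORECASE)


def decode_links(line):
    """Return one dict per GPO link found in a SearchDSObject line."""
    links = []
    for dn, option in GPO_LINK_RE.findall(line):
        value = int(option)
        links.append({
            "dn": dn,
            "option": value,
            "disabled": bool(value & 1),  # assumed: bit 0 = link disabled
            "enforced": bool(value & 2),  # assumed: bit 1 = link enforced
        })
    return links


if __name__ == "__main__":
    sample = (
        "GPSVC(328.7fc) 07:01:14:334 SearchDSObject: Found GPO(s): "
        "<[LDAP://cn={CC02524C-727C-4816-A298-D63D12E68C0F},"
        "cn=policies,cn=system,DC=contoso,DC=lab;2]>"
    )
    print(decode_links(sample))
```

    For the enforced example this reports option 2 with enforced set and disabled cleared; a line with no linked GPOs produces an empty list.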

    Now let’s move down the log and we’ll find the next step where the policies are being processed:

    GPSVC(31c.174) 10:02:00:007 ProcessGPO(Machine): ==============================
    GPSVC(31c.174) 10:02:00:007 ProcessGPO(Machine): Searching <CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=System,DC=contoso,DC=lab>
    GPSVC(31c.174) 10:02:00:007 ProcessGPO(Machine): Machine has access to this GPO.
    GPSVC(31c.174) 10:02:00:007 ProcessGPO(Machine): Found common name of: <{31B2F340-016D-11D2-945F-00C04FB984F9}>
    GPSVC(31c.174) 10:02:00:007 ProcessGPO(Machine): GPO passes the filter check.
    GPSVC(31c.174) 10:02:00:007 ProcessGPO(Machine): Found functionality version of: 2
    GPSVC(31c.174) 10:02:00:007 ProcessGPO(Machine): Found file system path of: \\contoso.lab\sysvol\contoso.lab\Policies\{31B2F340-016D-11D2-945F-00C04FB984F9}
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found display name of: <Default Domain Policy>
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found machine version of: GPC is 17, GPT is 17
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found flags of: 0
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found extensions: [{35378EAC-683F-11D2-A89A-00C04FBBCFA2}{53D6AB1B-2488-11D1-A28C-00C04FB94F17}{53D6AB1D-2488-11D1-A28C-00C04FB94F17}][{827D319E-6EAC-11D2-A4EA-00C04F79F83A}{803E14A0-B4FB-11D0-A0D0-00A0C90F574B}][{B1BE8D72-6EAC-11D2-A4EA-00C04F79F83A}{53D6AB1B-2488-11D1-A28C-00C04FB94F17}{53D6AB1D-2488-11D1-A28C-00C04FB94F17}]
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): ==============================

     

    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): ==============================
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Searching <cn={CC02524C-727C-4816-A298-D63D12E68C0F},cn=policies,cn=system,DC=contoso,DC=lab>
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Machine has access to this GPO.
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found common name of: <{CC02524C-727C-4816-A298-D63D12E68C0F}>
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): GPO passes the filter check.
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found functionality version of: 2
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found file system path of: \\contoso.lab\SysVol\contoso.lab\Policies\{CC02524C-727C-4816-A298-D63D12E68C0F}
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found display name of: <GPO Guide test>
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found machine version of: GPC is 1, GPT is 1
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found flags of: 0
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found extensions: [{35378EAC-683F-11D2-A89A-00C04FBBCFA2}{D02B1F72-3407-48AE-BA88-E8213C6761F1}]
    GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): ==============================

    First, we find the path where the GPO is stored in AD. As you can see, the GPO is still being represented by the GPO GUID and not its name: Searching <cn={CC02524C-727C-4816-A298-D63D12E68C0F},cn=policies,cn=system,DC=contoso,DC=lab>
    After that, it checks whether the machine has access to the policy. If it does, the computer can apply the policy; if it does not, the policy cannot be applied. From our example: Machine has access to this GPO.

    Moving on, if a policy has a WMI filter being applied, it will be verified in order to see if the filter matches the current machine\user or not.
    The WMI filter can be found in AD. If you are using GPMC, then this can be found in the right hand pane at the very bottom box, after highlighting the policy. From our example: GPO passes the filter check.

    Functionality version has to be a 2 for a Windows 2003 or later OS to apply the policy. From our example: Found functionality version of: 2
    A search in Sysvol for the GPO is also being executed, as explained in the beginning, both AD and Sysvol must be aware of the GPO and its settings. From our example: Found file system path of: <\\contoso.lab\SysVol\contoso.lab\Policies\{CC02524C-727C-4816-A298-D63D12E68C0F}>

    The next part is where we check the GPC (Group Policy Container, AD) and the GPT (Group Policy Template, Sysvol) for the version numbers. We check the version numbers to determine if the policy has changed since the last time it was applied. If the version numbers are different (GPC different than GPT) then we either have an AD replication or File replication problem. From our example we can see that there’s a match between those two: Found machine version of: GPC is 1, GPT is 1
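    Comparing the two version numbers is easy to automate when scanning a long log. The Python sketch below assumes the “Found machine version of” wording from the samples; accepting “user” in the same position is an assumption about how user processing is logged. The function name is hypothetical.

```python
import re

VERSION_RE = re.compile(
    r"Found (machine|user) version of: GPC is (\d+), GPT is (\d+)"
)


def check_versions(line):
    """Return the GPC/GPT versions from a log line, or None if absent."""
    match = VERSION_RE.search(line)
    if match is None:
        return None
    gpc, gpt = int(match.group(2)), int(match.group(3))
    return {
        "scope": match.group(1),
        "gpc": gpc,              # version stored in Active Directory
        "gpt": gpt,              # version stored in Sysvol
        "in_sync": gpc == gpt,   # a mismatch points at replication
    }


if __name__ == "__main__":
    sample = (
        "GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): "
        "Found machine version of: GPC is 17, GPT is 17"
    )
    print(check_versions(sample))
```

    Any entry where in_sync is False is a candidate for checking AD and Sysvol replication for that GPO.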

    The extensions in the next line refer to the CSEs (client-side extension GUIDs) and will vary from policy to policy. As explained, they are the components that carry out our settings on the client side. From our example: GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found extensions: [{35378EAC-683F-11D2-A89A-00C04FBBCFA2}{D02B1F72-3407-48AE-BA88-E8213C6761F1}]
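    To look the GUIDs up in a CSE reference list, it helps to break the “Found extensions” value into its bracketed groups. The following Python sketch does only that. Reading the first GUID of each group as the CSE and the remaining ones as the tools used to edit the setting is a common interpretation, but treat it as an assumption here. The function name is hypothetical.

```python
import re

GROUP_RE = re.compile(r"\[([^\]]*)\]")
GUID_RE = re.compile(r"\{[0-9A-Fa-f-]{36}\}")


def parse_extensions(line):
    """Return a list of GUID lists, one per [...] group in the line."""
    # Only look at the text after the "Found extensions:" marker.
    _, _, tail = line.partition("Found extensions:")
    groups = []
    for body in GROUP_RE.findall(tail):
        guids = GUID_RE.findall(body)
        if guids:
            groups.append(guids)
    return groups


if __name__ == "__main__":
    sample = (
        "GPSVC(31c.174) 10:02:00:038 ProcessGPO(Machine): Found extensions: "
        "[{35378EAC-683F-11D2-A89A-00C04FBBCFA2}"
        "{D02B1F72-3407-48AE-BA88-E8213C6761F1}]"
    )
    for group in parse_extensions(sample):
        print(group)
```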

    Let’s have a look at an example with a WMI Filter being used, which does not suit our current system:

    GPSVC(328.7fc) 08:04:32:803 ProcessGPO(Machine): ==============================
    GPSVC(328.7fc) 08:04:32:803 ProcessGPO(Machine): Searching <cn={CC02524C-727C-4816-A298-D63D12E68C0F},cn=policies,cn=system,DC=contoso,DC=lab>
    GPSVC(328.7fc) 08:04:32:803 ProcessGPO(Machine): Machine has access to this GPO.
    GPSVC(328.7fc) 08:04:32:803 ProcessGPO(Machine): Found common name of: <{CC02524C-727C-4816-A298-D63D12E68C0F}>
    GPSVC(328.7fc) 08:04:32:803 FilterCheck: Found WMI Filter id of: <[contoso.lab;{CD718707-ACBD-4AD7-8130-05D61C897783};0]>
    GPSVC(328.7fc) 08:04:32:913 ProcessGPO(Machine): The GPO does not pass the filter check and so will not be applied.
    GPSVC(328.7fc) 08:04:32:913 ProcessGPO(Machine): Found functionality version of: 2
    GPSVC(328.7fc) 08:04:32:913 ProcessGPO(Machine): Found file system path of: \\contoso.lab\SysVol\contoso.lab\Policies\{CC02524C-727C-4816-A298-D63D12E68C0F}
    GPSVC(328.7fc) 08:04:32:928 ProcessGPO(Machine): Found display name of: <GPO Guide test>
    GPSVC(328.7fc) 08:04:32:928 ProcessGPO(Machine): Found machine version of: GPC is 1, GPT is 1
    GPSVC(328.7fc) 08:04:32:928 ProcessGPO(Machine): Found flags of: 0
    GPSVC(328.7fc) 08:04:32:928 ProcessGPO(Machine): Found extensions: [{35378EAC-683F-11D2-A89A-00C04FBBCFA2}{D02B1F72-3407-48AE-BA88-E8213C6761F1}]
    GPSVC(328.7fc) 08:04:32:928 ProcessGPO(Machine): ==============================

    In this scenario, a WMI filter was used that requires the OS to be Windows XP, so the system OS has to match the filter for the GPO to apply. As our OS is Windows Server 2012 R2, the filter does not match and the GPO will not apply.
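    If you have many GPOs, a small script can point out the ones that were skipped by a filter. The Python sketch below is only an illustration: the function name is made up, and it assumes the block layout seen in the excerpt above, where the "does not pass the filter check" line appears before the display name inside the same "=====" delimited block.

```python
import re

def gpos_failing_filter(lines):
    """Return display names of GPOs whose block contains a failed filter check."""
    failed = []
    failed_in_block = False
    for line in lines:
        if "==============================" in line:
            # A separator starts or ends a ProcessGPO block, so reset the state.
            failed_in_block = False
        elif "does not pass the filter check" in line:
            failed_in_block = True
        else:
            name = re.search(r"Found display name of: <(.*)>", line)
            if name and failed_in_block:
                failed.append(name.group(1))
    return failed

log = [
    "GPSVC(328.7fc) 08:04:32:803 ProcessGPO(Machine): ==============================",
    "GPSVC(328.7fc) 08:04:32:803 ProcessGPO(Machine): Machine has access to this GPO.",
    "GPSVC(328.7fc) 08:04:32:913 ProcessGPO(Machine): The GPO does not pass the filter check and so will not be applied.",
    "GPSVC(328.7fc) 08:04:32:928 ProcessGPO(Machine): Found display name of: <GPO Guide test>",
    "GPSVC(328.7fc) 08:04:32:928 ProcessGPO(Machine): ==============================",
]
print(gpos_failing_filter(log))
```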

    Now we come to the part where we process CSEs for particular settings, such as Folder Redirection, Disk Quota, etc. If the particular extension is not being used then you can simply ignore this section.

    GPSVC(31c.174) 10:02:00:038 ProcessGPOs(Machine): Get 2 GPOs to process.
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {35378EAC-683F-11D2-A89A-00C04FBBCFA2}
    GPSVC(31c.174) 10:02:00:038 ReadStatus: Read Extension's Previous status successfully.
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {0ACDD40C-75AC-47ab-BAA0-BF6DE7E7FE63}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {0E28E245-9368-4853-AD84-6DA3BA35BB75}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {16be69fa-4209-4250-88cb-716cf41954e0}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {17D89FEC-5C44-4972-B12D-241CAEF74509}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {1A6364EB-776B-4120-ADE1-B63A406A76B5}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {25537BA6-77A8-11D2-9B6C-0000F8080861}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {3610eda5-77ef-11d2-8dc5-00c04fa31a66}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {3A0DBA37-F8B2-4356-83DE-3E90BD5C261F}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {426031c0-0b47-4852-b0ca-ac3d37bfcb39}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {42B5FAAE-6536-11d2-AE5A-0000F87571E3}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {4bcd6cde-777b-48b6-9804-43568e23545d}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {4CFB60C1-FAA6-47f1-89AA-0B18730C9FD3}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {4D2F9B6F-1E52-4711-A382-6A8B1A003DE6}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {5794DAFD-BE60-433f-88A2-1A31939AC01F}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {6232C319-91AC-4931-9385-E70C2B099F0E}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {6A4C88C6-C502-4f74-8F60-2CB23EDC24E2}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {7150F9BF-48AD-4da4-A49C-29EF4A8369BA}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {728EE579-943C-4519-9EF7-AB56765798ED}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {74EE6C03-5363-4554-B161-627540339CAB}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {7B849a69-220F-451E-B3FE-2CB811AF94AE}
    GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {827D319E-6EAC-11D2-A4EA-00C04F79F83A}
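    Reading long runs of GUIDs gets tedious, so here is a small Python sketch that extracts the GUIDs from a log line and labels the ones it recognizes. The lookup table is deliberately tiny and hand-picked from commonly published CSE lists; it is an assumption you should verify, and anything not in the table is simply reported as unknown.

```python
import re

# Hand-picked lookup table (an assumption to verify, not a complete list).
# Keys are upper case because the log mixes upper- and lower-case GUIDs.
KNOWN_CSES = {
    "{35378EAC-683F-11D2-A89A-00C04FBBCFA2}": "Registry (Administrative Templates)",
    "{25537BA6-77A8-11D2-9B6C-0000F8080861}": "Folder Redirection",
    "{3610EDA5-77EF-11D2-8DC5-00C04FA31A66}": "Disk Quota",
    "{42B5FAAE-6536-11D2-AE5A-0000F87571E3}": "Scripts",
    "{827D319E-6EAC-11D2-A4EA-00C04F79F83A}": "Security",
}

GUID_RE = re.compile(r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}")

def name_extensions(line):
    """Return (GUID, name) pairs for every GUID found in a log line."""
    pairs = []
    for guid in GUID_RE.findall(line):
        key = guid.upper()
        pairs.append((key, KNOWN_CSES.get(key, "unknown")))
    return pairs

sample = "GPSVC(31c.174) 10:02:00:038 ReadExtStatus: Reading Previous Status for extension {3610eda5-77ef-11d2-8dc5-00c04fa31a66}"
print(name_extensions(sample))
```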


    Note:

    • You can always do a search for each of these GUIDs on MSDN and you should be able to find their proper names.
    • At the end of the machine GPO thread, we can also see the foreground processing that we talked about in the beginning. The foreground processing was synchronous, and the next one will be synchronous as well.
    • The machine GPO processing thread then ends, and we can see that it completed with bConnectivityFailure = 0.

    GPSVC(31c.174) 10:02:00:397 ProcessGPOs(Machine): SKU is SYNC: Mode: 1, Reason: 7
    GPSVC(31c.174) 10:02:00:397 gpGetFgPolicyRefreshInfo (Machine): Mode: Synchronous, Reason: 7
    GPSVC(31c.174) 10:02:00:397 gpSetFgPolicyRefreshInfo (bPrev: 1, szUserSid: Machine, info.mode: Synchronous)
    GPSVC(31c.174) 10:02:00:397 SetFgRefreshInfo: Previous Machine Fg policy Synchronous, Reason: SKU.
    GPSVC(31c.174) 10:02:00:397 gpSetFgPolicyRefreshInfo (bPrev: 0, szUserSid: Machine, info.mode: Synchronous)
    GPSVC(31c.174) 10:02:00:397 SetFgRefreshInfo: Next Machine Fg policy Synchronous, Reason: SKU.
    GPSVC(31c.174) 10:02:00:397 ProcessGPOs(Machine): Policies changed - checking if UBPM trigger events need to be fired
    GPSVC(31c.174) 10:02:00:397 CheckAndFireGPTriggerEvent: Fired Policy present UBPM trigger event for Machine.
    GPSVC(31c.174) 10:02:00:397 Application complete with bConnectivityFailure = 0.

     

    User GPO Thread

    This next part of the GPO log is dedicated to the user thread.

    While the machine thread was logged with the identifier (31c.174), the user thread uses (31c.b8), which you can see at the point where the thread actually starts. You can also see that the user SID is found.
    Also, notice that this time we have “bConsole: 1” at the end instead of the 0 we had for the machine.

    GPSVC(31c.704) 10:02:47:147 CGPEventSubSystem::GroupPolicyOnLogon::++ (SessionId: 1)
    GPSVC(31c.704) 10:02:47:147 CGPApplicationService::UserLogonEvent::++ (SessionId: 1, ServiceRestart: 0)
    GPSVC(31c.704) 10:02:47:147 CGPApplicationService::CheckAndCreateCriticalPolicySection.
    GPSVC(31c.704) 10:02:47:147 User SID = <S-1-5-21-646618010-1986442393-1057151281-1103>
    GPSVC(31c.b8) 10:02:47:147 CGroupPolicySession::ApplyGroupPolicyForPrincipal::++ (bTriggered: 0, bConsole: 1)
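    Because several threads write to the same file, it often helps to split the log by identifier before reading it. The Python sketch below does that for the line layout used in this guide. It assumes the pair in parentheses is a hexadecimal process ID followed by a thread ID, and the function and field names are illustrative only.

```python
import re
from collections import defaultdict

# Assumed layout: GPSVC(<pid>.<tid>) HH:MM:SS:mmm <message>
LINE_RE = re.compile(
    r"GPSVC\(([0-9a-f]+)\.([0-9a-f]+)\) (\d{2}):(\d{2}):(\d{2}):(\d{3}) (.*)"
)

def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    pid, tid, hours, minutes, seconds, millis, message = match.groups()
    total_ms = ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)
    return {"pid": pid, "tid": tid, "millis": total_ms, "message": message}

def group_by_thread(lines):
    """Collect messages per thread ID, preserving log order."""
    threads = defaultdict(list)
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            threads[entry["tid"]].append(entry["message"])
    return dict(threads)

log = [
    "GPSVC(31c.704) 10:02:47:147 CGPApplicationService::CheckAndCreateCriticalPolicySection.",
    "GPSVC(31c.b8) 10:02:47:147 CGroupPolicySession::ApplyGroupPolicyForPrincipal::++ (bTriggered: 0, bConsole: 1)",
]
print(group_by_thread(log))
```

    The millis field makes it easy to subtract two timestamps when you want to see how long a particular step took.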

    You can see that it does the network check again and that it is also prepared to wait for network.

    GPSVC(31c.b8) 10:02:47:147 CGPApplicationService::GetTimeToWaitOnNetwork.
    GPSVC(31c.b8) 10:02:47:147 CGPMachineStartupConnectivity::CalculateWaitTimeoutFromHistory: Average is 3334.
    GPSVC(31c.b8) 10:02:47:147 CGPMachineStartupConnectivity::CalculateWaitTimeoutFromHistory: Current is 2203.
    GPSVC(31c.b8) 10:02:47:147 CGPMachineStartupConnectivity::CalculateWaitTimeoutFromHistory: Taking min of 6668 and 120000.
    GPSVC(31c.b8) 10:02:47:147 CGPApplicationService::GetStartTimeForNetworkWait.
    GPSVC(31c.b8) 10:02:47:147 StartTime For network wait: 3750ms
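    A side note on the numbers in this excerpt: 6668 is exactly twice the logged average of 3334, and 120000 looks like an upper bound in milliseconds. The Python sketch below only reproduces that arithmetic. The factor of 2 and the ceiling are inferred from this single excerpt, not taken from any documentation, so do not read it as a description of how the service really computes the value.

```python
def wait_timeout_ms(average_ms, ceiling_ms=120000, factor=2):
    """Reproduce the 'Taking min of X and Y' line from the excerpt.

    factor and ceiling_ms are guesses inferred from one log sample.
    """
    return min(average_ms * factor, ceiling_ms)

print(wait_timeout_ms(3334))  # 6668, matching "Taking min of 6668 and 120000"
```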

    In this case it waits for the network with a timeout of 0 ms because it already has network connectivity, and so it moves on to processing GPOs.

    GPSVC(31c.b8) 10:02:47:147 UserPolicy: Waiting for machine policy wait for network event with timeout 0 ms
    GPSVC(31c.b8) 10:02:47:147 CGroupPolicySession::ApplyGroupPolicyForPrincipal::ApplyGroupPolicy (dwFlags: 38).

    The next part is the same as for the machine thread: it searches for and returns the networks found, the number of interfaces, and the bandwidth check.

    GPSVC(31c.b8) 10:02:47:147 NlaQueryNetSignatures returned 1 networks
    GPSVC(31c.b8) 10:02:47:147 NSI Information (Network GUID) : {1F777393-0B42-11E3-80AD-806E6F6E6963}
    GPSVC(31c.b8) 10:02:47:147 # of interfaces : 1
    GPSVC(31c.b8) 10:02:47:147 Interface ID: {9869CFDA-7F10-4B3F-B97A-56580E30CED7}
    GPSVC(31c.b8) 10:02:47:163 GetDomainControllerConnectionInfo: Enabling bandwidth estimate.
    GPSVC(31c.b8) 10:02:47:475 Started bandwidth estimation successfully
    GPSVC(31c.b8) 10:02:47:851 IsSlowLink: Current Bandwidth >= Bandwidth Threshold.

    The LDAP query for the GPOs is done in the same manner as for the machine thread:

    GPSVC(31c.b8) 10:02:47:490 GetGPOInfo: Entering...
    GPSVC(31c.b8) 10:02:47:490 SearchDSObject: Searching <OU=Admin Users,DC=contoso,DC=lab>
    GPSVC(31c.b8) 10:02:47:490 SearchDSObject: Found GPO(s): <[LDAP://cn={CCF581E3-E2ED-441F-B932-B78A3DFAE09B},cn=policies,cn=system,DC=contoso,DC=lab;0]>
    GPSVC(31c.b8) 10:02:47:490 ProcessGPO(User): ==============================
    GPSVC(31c.b8) 10:02:47:490 ProcessGPO(User): Deferring search for LDAP://cn={CCF581E3-E2ED-441F-B932-B78A3DFAE09B},cn=policies,cn=system,DC=contoso,DC=lab
    GPSVC(31c.b8) 10:02:47:490 SearchDSObject: Searching <DC=contoso,DC=lab>
    GPSVC(31c.b8) 10:02:47:490 SearchDSObject: Found GPO(s): <[LDAP://CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=System,DC=contoso,DC=lab;0]>
    GPSVC(31c.b8) 10:02:47:490 ProcessGPO(User): ==============================
    GPSVC(31c.b8) 10:02:47:490 ProcessGPO(User): Deferring search for LDAP://CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=System,DC=contoso,DC=lab
    GPSVC(31c.b8) 10:02:47:490 SearchDSObject: Searching <CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=contoso,DC=lab>
    GPSVC(31c.b8) 10:02:47:490 SearchDSObject: No GPO(s) for this object.
    GPSVC(31c.b8) 10:02:47:490 EvaluateDeferredGPOs: Searching for GPOs in cn=policies,cn=system,DC=contoso,DC=lab
    GPSVC(31c.b8) 10:02:47:490 EvaluateDeferredGPOs: Adding filters (&(!(flags:1.2.840.113556.1.4.803:=1))(gPCUserExtensionNames=[*])((|(distinguishedName=CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=System,DC=contoso,DC=lab)(distinguishedName=cn={CCF581E3-E2ED-441F-B932-B78A3DFAE09B},cn=policies,cn=system,DC=contoso,DC=lab))))
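    The filter in the last line is easier to read once you take it apart. The OID 1.2.840.113556.1.4.803 is the Active Directory bitwise AND matching rule, so the first clause excludes GPOs whose flags attribute has bit 1 set, the second requires user extensions to be present, and the last is an OR over the deferred GPO distinguished names. The Python sketch below rebuilds the same string; the function name is made up, and it does no escaping of special characters in the DNs.

```python
# OID of the Active Directory bitwise AND matching rule.
LDAP_MATCHING_RULE_BIT_AND = "1.2.840.113556.1.4.803"

def build_deferred_gpo_filter(gpo_dns, extension_attribute="gPCUserExtensionNames"):
    """Rebuild the filter string logged by EvaluateDeferredGPOs for the user thread."""
    # One (distinguishedName=...) term per deferred GPO, OR-ed together.
    dn_terms = "".join("(distinguishedName={})".format(dn) for dn in gpo_dns)
    return "(&(!(flags:{}:=1))({}=[*])((|{})))".format(
        LDAP_MATCHING_RULE_BIT_AND, extension_attribute, dn_terms
    )

dns = [
    "CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=System,DC=contoso,DC=lab",
    "cn={CCF581E3-E2ED-441F-B932-B78A3DFAE09B},cn=policies,cn=system,DC=contoso,DC=lab",
]
print(build_deferred_gpo_filter(dns))
```

    Pasting the resulting string into an LDAP browser pointed at cn=policies,cn=system of your domain is a quick way to see what the client would get back.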

    We can see the GPOs are processed exactly as explained in the machine part; the difference is that the GPO has to be available to the user this time and not to the machine. The important thing in the following example is that the Default Domain Policy (we know it is the Default Domain Policy because it has a hardcoded GUID, {31B2F340-016D-11D2-945F-00C04FB984F9}, which is the same in every domain) contains no extensions for the user side, so it is reported to us as “has no extensions”:

    GPSVC(31c.b8) 10:02:47:851 EvalList: Object <CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=System,DC=contoso,DC=lab> cannot be accessed/is disabled/or has no extensions
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): ==============================
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): Searching <cn={CCF581E3-E2ED-441F-B932-B78A3DFAE09B},cn=policies,cn=system,DC=contoso,DC=lab>
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): User has access to this GPO.
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): Found common name of: <{CCF581E3-E2ED-441F-B932-B78A3DFAE09B}>
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): GPO passes the filter check.
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): Found functionality version of: 2
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): Found file system path of: \\contoso.lab\SysVol\contoso.lab\Policies\{CCF581E3-E2ED-441F-B932-B78A3DFAE09B}
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): Found display name of: <GPO Guide Test Admin Users>
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): Found user version of: GPC is 3, GPT is 3
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): Found flags of: 0
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): Found extensions: [{35378EAC-683F-11D2-A89A-00C04FBBCFA2}{D02B1F73-3407-48AE-BA88-E8213C6761F1}]
    GPSVC(31c.b8) 10:02:47:851 ProcessGPO(User): ==============================

    After that, our policy settings are processed directly into the registry by the CSE:

    GPSVC(318.7ac) 02:02:02:187 SetRegistryValue: NoWindowsMarketplace => 1 [OK]
    GPSVC(318.7ac) 02:02:02:187 SetRegistryValue: ScreenSaveActive => 0 [OK]

    It then moves on to process CSEs for particular settings, such as Folder Redirection, Disk Quota, etc., exactly as it was done for the machine thread.

    As with the machine thread, the user thread also finishes with bConnectivityFailure = 0, and everything was applied as expected.

    GPSVC(31c.b8) 10:02:47:912 User logged in on active session
    GPSVC(31c.b8) 10:02:47:912 ApplyGroupPolicy: Getting ready to create background thread GPOThread.
    GPSVC(31c.b8) 10:02:47:912 CGroupPolicySession::ApplyGroupPolicyForPrincipal Setting m_pPolicyInfoReadyEvent
    GPSVC(31c.b8) 10:02:47:912 Application complete with bConnectivityFailure = 0.
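    To check quickly how each thread finished, you can collect the completion lines. The Python sketch below maps each identifier to the bConnectivityFailure value it logged. The function name is hypothetical, and the pattern is based only on the two completion lines shown in this guide, both of which report 0.

```python
import re

# Assumed layout of the completion line, taken from the examples in this guide.
COMPLETE_RE = re.compile(
    r"GPSVC\(([0-9a-f]+\.[0-9a-f]+)\) \S+ Application complete with bConnectivityFailure = (\d+)"
)

def completion_status(lines):
    """Map each thread identifier to the bConnectivityFailure value it logged."""
    status = {}
    for line in lines:
        match = COMPLETE_RE.search(line)
        if match:
            status[match.group(1)] = int(match.group(2))
    return status

log = [
    "GPSVC(31c.174) 10:02:00:397 Application complete with bConnectivityFailure = 0.",
    "GPSVC(31c.b8) 10:02:47:912 ApplyGroupPolicy: Getting ready to create background thread GPOThread.",
    "GPSVC(31c.b8) 10:02:47:912 Application complete with bConnectivityFailure = 0.",
]
print(completion_status(log))
```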

    In the gpsvc log, you will always find confirmation of whether the “problematic” GPO was processed; this lets you make sure that the GPO was read and applied from the domain. The registry values that the GPO contains should be applied on the client side by the CSEs, so if you see a GPO in gpsvc getting applied but the desired setting isn’t applied on the client side, it is a good idea to check the registry values yourself by using “regedit” in order to ensure they have been properly set.

    If these registry values are getting changed after they have been applied, a good tool provided by Microsoft to troubleshoot this further is Process Monitor, which can be used to watch those registry settings and see which process is changing them.

    There are definitely all sorts of problem scenarios that I haven’t covered in this guide. It is meant as a starter guide to give you an idea of how to follow up if your domain GPOs aren’t getting applied and you want to use the gpsvc log to troubleshoot this.

    Finally, as Client Side Extensions (CSE) play a major role for GPO settings distribution, here is a list for those of you that want to go deeper with CSE Logging, which you can enable in order to gather more information about the CSE state:

    Scripts and Administrative Templates CSE Debug Logging (gptext.dll)
    HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon

    ValueName: GPTextDebugLevel
    ValueType: REG_DWORD
    Value Data: 0x00010002
    Options: 0x00000001 = DL_Normal
    0x00000002 = DL_Verbose
    0x00010000 = DL_Logfile
    0x00020000 = DL_Debugger

    Log File: C:\WINNT\debug\usermode\gptext.log
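    The value data 0x00010002 is simply two of the listed options added together. The short Python sketch below decodes such a value, assuming the options combine by bitwise OR, which is consistent with how 0x00010002 decomposes. The function name is illustrative.

```python
# Option values as listed above for GPTextDebugLevel.
DEBUG_FLAGS = {
    0x00000001: "DL_Normal",
    0x00000002: "DL_Verbose",
    0x00010000: "DL_Logfile",
    0x00020000: "DL_Debugger",
}

def decode_debug_level(value):
    """Return the names of all options whose bit is set in value."""
    return [name for bit, name in sorted(DEBUG_FLAGS.items()) if value & bit]

print(decode_debug_level(0x00010002))  # ['DL_Verbose', 'DL_Logfile']
```

    The same decomposition applies to the GPEditDebugLevel value 0x10002 listed further down, provided that setting uses the same option values.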

    Security CSE WINLOGON Debug Logging (scecli.dll)
    KB article: 245422 How to Enable Logging for Security Configuration Client Processing in Windows 2000

    HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\GPExtensions\{827D319E-6EAC-11D2-A4EA-00C04F79F83A}

    ValueName: ExtensionDebugLevel
    ValueType: REG_DWORD
    Value Data: 2
    Options: 0 = Log Nothing
    1 = Log only errors
    2 = Log all transactions

    Log File: C:\WINNT\security\logs\winlogon.log

    Folder Redirection CSE Debug Logging (fdeploy.dll)
    HKLM\Software\Microsoft\Windows NT\CurrentVersion\Diagnostics

    ValueName: fdeployDebugLevel
    ValueType: REG_DWORD
    Value Data: 0x0f

    Log File: C:\WINNT\debug\usermode\fdeploy.log

    Offline Files CSE Debug Logging (cscui.dll)
    KB article: 225516 How to Enable the Offline Files Notifications Window in Windows 2000

    Software Installation CSE Verbose logging (appmgmts.dll)
    KB article: 246509 Troubleshooting Program Deployment by Using Verbose Logging
    HKLM\Software\Microsoft\Windows NT\CurrentVersion\Diagnostics

    ValueName: AppmgmtDebugLevel
    ValueType: REG_DWORD
    Value Data: 0x9B or 0x4B

    Log File: C:\WINNT\debug\usermode\appmgmt.log

    Software Installation CSE Windows Installer Verbose logging
    KB article: 314852 How to enable Windows Installer logging

    HKLM\Software\Policies\Microsoft\Windows\Installer

    ValueName: Logging
    Value Type: REG_SZ
    Value Data: voicewarmup

    Log File: C:\WINNT\temp\MSI*.log

    Desktop Standard CSE Debug Logging
    KB article: 931066 How to enable tracing for client-side extensions in PolicyMaker

    GPEDIT - Group Policy Editor Console Debug Logging
    TechNet article: Enabling Logging for Group Policy Editor
    HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon

    Value Name: GPEditDebugLevel
    Value Type: REG_DWORD
    Value Data: 0x10002

    Log File: %windir%\debug\usermode\gpedit.log

    GPMC - Group Policy Management Console Debug Logging
    TechNet article: Enable Logging for Group Policy Management Console
    HKLM\Software\Microsoft\Windows NT\CurrentVersion\Diagnostics

    Value Name: GPMgmtTraceLevel
    Value Type: REG_DWORD
    Value Data: 2

    HKLM\Software\Microsoft\Windows NT\CurrentVersion\Diagnostics

    Value Name: GPMgmtLogFileOnly
    Value Type: REG_DWORD
    Value Data: 1

    Log File: C:\Documents and Settings\<user>\Local Settings\Temp\gpmgmt.log

     

    RSOP - Resultant Set of Policies Debug Logging
    Debug Logging for RSoP Procedures:
    HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Winlogon

    Value Name: RsopDebugLevel
    Value Type: REG_DWORD
    Value Data: 0x00010004


    Log File: %windir%\system32\debug\USERMODE\GPDAS.LOG

    WMI Debug Logging
    ASKPERF blog post: WMI Debug Logging

    I hope this was interesting and shed some light on how to start analyzing the gpsvc log.

    Thank you,

    David Ani

  • Migrating your Certification Authority Hashing Algorithm from SHA1 to SHA2

     

    Hey all, Rob Greene here again. Well it’s been a very long while since I have written anything for the AskDS blog. I’ve been heads down supporting all the new cool technology from Microsoft.

    I wanted to see if I could head off some cases coming our way with regard to the whole SHA1 deprecation that seems to be getting talked about on all kinds of PKI related websites. I am not discussing anything new about Microsoft SHA1 deprecation plans. If you want information on this topic please look at the following link: SHA1 Deprecation Policy - http://blogs.technet.com/b/pki/archive/2013/11/12/sha1-deprecation-policy.aspx

    It does appear that some web browsers are on a faster timeline to disallow SHA1 certificates, as Google Chrome has outlined in this blog: http://blog.chromium.org/2014/09/gradually-sunsetting-sha-1.html

    So as you would suspect, we are starting to get a few calls from customers wanting to know how to migrate their current Microsoft PKI hierarchy to support SHA2 algorithms. We actually do have a TechNet article explaining the process.

    Before you go through this process of updating your current PKI hierarchy, I have one question for you. Are you sure that all operating systems, devices, and applications that currently use internal certificates in your enterprise actually support SHA2 algorithms?

    How about that ancient Java based application running on the 20 year old IBM AS400 that basically runs the backbone of your corporate data? Does the AS400 / Java version running on it support SHA2 certificates so that it can do LDAPS calls to the domain controller for user authentication?

    What about the old version of Apache or Tomcat web servers you have running? Do they support SHA2 certificates for the websites they host?

    You are basically going to have to test every application within your environment to make sure that they will be able to do certificate chaining and revocation checking against certificates and CRLs that have been signed using one of the SHA2 algorithms. Heck, you might remember we have the following hotfixes so that Windows XP SP3 and Windows Server 2003 SP2 can properly chain a certificate that contains certification authorities that were signed using SHA2 algorithms.

    Windows Server 2003 and Windows XP clients cannot obtain certificates from a Windows Server 2008-based certification authority (CA) if the CA is configured to use SHA2 256 or higher encryption

    http://support.microsoft.com/kb/968730/EN-US

    Applications that use the Cryptography API cannot validate an X.509 certificate in Windows Server 2003

    http://support.microsoft.com/kb/938397/EN-US

    Inevitably we get the question “What would you recommend, Microsoft?” Well, that is really a loaded question since we have no idea what is in your vast enterprise environment outside of Microsoft operating systems and applications. When this question comes up, the only thing that we can say is that any currently supported Microsoft operating system or application should have no problems supporting a certificate chain or CRL signed using SHA2 algorithms. So if that is the only thing in your environment, you could easily follow the migration steps and be done. However, if you are using a Microsoft operating system outside of mainstream support, it most likely does not support SHA2 algorithms. I actually had a customer ask if Windows CE supported SHA2, and I had to tell him it does not. (Who knew you guys still ran those things in your environments!)

    If you have any third-party applications or operating systems, then I would suggest you look on the vendor’s website or contact their technical support to get a definitive answer about support for SHA2 algorithms. If you are using a product that has no support then you might need to stand up a SHA2 certificate chain in a lab environment and test the product. Once a problem has been identified you can work with that vendor to find out if they have a new version of the application and/or operating system that supports SHA2 or find out when they plan on supporting it.

    If you do end up needing to support some applications that currently do not support SHA2 algorithms, I would suggest that you look into bringing up a new PKI hierarchy alongside your current SHA1 PKI hierarchy. Slowly begin migrating applications and operating systems that support SHA2 over to the new hierarchy, and keep only the applications and operating systems that are limited to SHA1 on the existing PKI hierarchy.

    Nah, I want to do the migration!

    So if you made it down to this part of the blog you either actually want to do the migration or curiosity has definitely got the better of you, so let’s get to it. The TechNet article below discusses how to migrate your private key from using a Cryptographic Service Provider (CSP) which only supports SHA1 to a Key Storage Provider (KSP) that supports SHA2 algorithms:

    Migrating a Certification Authority Key from a Cryptographic Service Provider (CSP) to a Key Storage Provider (KSP) - http://technet.microsoft.com/en-us/library/dn771627.aspx

    In addition to this process, I would first recommend that you export all the private and public key pairs that your Certification Authority has before going through with the steps outlined in the above TechNet article. The article seems to assume you have already taken good backups of the Certification Authority's private keys and public certificates.

    Keep in mind that if your Certification Authority has been in production for any length of time you have more than likely renewed the Certification Authority certificate at least once in its lifetime. You can quickly find out by looking at the properties of the CA on the general tab.

    When you change the hashing algorithm over to a SHA2 algorithm, you are going to have to migrate all CA certificates to use the newer Key Storage Providers if you are currently using Cryptographic Service Providers. If you are NOT using the Microsoft providers, please consult your third-party vendor to find out their recommended way to migrate from CSPs to KSPs. This also includes those certification authorities that use Hardware Security Modules (HSMs).

    Steps 1-9 in the article further explain backing up the CA configuration and then changing from CSPs over to KSPs. This is required, as I mentioned earlier, since SHA2 algorithms are only supported by Key Storage Providers (KSPs), which were not available to Certification Authorities prior to Windows Server 2008. If you previously migrated your Windows Server 2003 CA to one of the newer operating systems, you were kind of stuck using CSPs.

    Step 10 is all about switching over to use SHA2 algorithms, and then starting the Certification Authority back up.

    So there you go. You have your existing Certification Authority issuing SHA2 algorithm certificates and CRLs. This does not mean that you will start seeing SHA256 RSA for the signature algorithm or SHA256 for the signature hash algorithm on the certification authority’s own certificates. For that to happen you would need to do the following:

    • Update the configuration on the CA that issued its certificate and then renew with a new key.

    • If it is a Root CA then you also need to renew with a new key.

    Once the certification authority has been configured to use SHA2 hashing algorithms, not only will newly issued certificates be signed using the new hashing algorithm, all the certification authority's CRLs will also be signed using the new hashing algorithm.

    Run certutil -CRL on the certification authority, which causes the CA to generate new CRLs. Once this is done, double-click one of the CRLs and you will see the new signature algorithm.

    As you can tell, not only do newly issued end entity certificates get signed using the SHA2 algorithm, so do all existing CRLs that the CA needs to publish. This is why you have to update not only the current CA certificate to use KSPs but also the existing CA certificates, as long as they are still issuing new CRLs. Existing CA certificates issue new CRLs until they expire; once a CA certificate has expired, it will no longer issue CRLs.

    As you can see, that simple question of “can I migrate my current certification authority from SHA1 to SHA2” is really not such an easy question for us here at Microsoft to answer. I would suspect that most of you are like me and would like to err on the side of caution in this regard. If this was my environment I would stand up a new PKI hierarchy that is built using SHA2 algorithms from the start. Once that has been accomplished, I would test each application in the environment that leverages certificates. When I run into an application that does not support SHA2 I would contact the vendor and get on record when they are going to start supporting SHA2, or ask the application owner when they are planning to stop using the application. Once all this is documented I would revisit these end dates to see if the vendor has updated support or find out if the application owner has replaced the application with something that does support SHA2 algorithms.

    Rob “Pass the Hashbrowns” Greene

  • DFSR: Limiting the Number of Imported Replicated Folders when using DB cloning

    Hello! Warren here to talk about a very specific scenario involving the new DB cloning feature added to DFSR in Windows Server 2012 R2. The topic is how to limit or control which RFs you import on the import server in a DB cloning scenario.

    Ned Pyle has already covered in detail the topic of DB cloning over at the Filecab blog, which you can find here. If you do not know anything about DB cloning in DFSR, you should read Ned's blog post first before reading this blog post. Otherwise, this topic will not make much sense to you. Every DFSR admin should know about DB cloning; it is a very significant feature in DFSR on Windows Server 2012 R2. Trust me, you will want to use it.

    Why Would I Want to Limit the Number of Replicated Folders on the Import Server?

    To understand why you may need to limit the number of RFs on an import server you need to understand these two facts:

    • When you export a DFSR DB the export will include every Replicated Folder (RF) on the volume.
    • The import server must have a preseeded copy of each RF on the import volume or the import will fail.

    What this means is that if you want to set up or recover a DFSR server using DB cloning, the import and export servers must replicate a common set of RFs. If the export server has 5 RFs on the volume, the import server must have the same 5 RFs preseeded on the import volume. This makes DB cloning unusable in some situations such as hub and spoke deployments. Luckily, there is a workaround.

    What is the Workaround?

    Fortunately, a workaround is available to limit the number of RFs on the import server. Before we discuss the guts of the workaround we will briefly discuss the DB cloning process. Please see Ned's blog post for in-depth details.

    1. Have at least one established RF or create a new one with one primary member

    2. Export the DB from the source server with the cmdlet "Export-DfsrClone". This exports both the DB and a config.xml file.

    3. Preseed the RF data on the import server.

    4. Copy the exported DB and config.xml file to the import server

    5. Import the DB on the import server with the cmdlet "Import-DfsrClone"

    6. Add the import server to the Replication Group

    7. Configure connections

    8. Configure membership in the RFs

    The workaround consists of editing the config.xml file created in step 2 above to remove the RFs that you will not host on the server. Simple enough! How do I edit the config.xml file?

    How to Edit the Config.xml file

    Editing the config.xml file will be much easier using Visual Studio or some other XML editing tool. The steps below use Visual Studio 2012 for Windows Desktop, which you can use free of charge.

    Note:

    Whatever tool you use must not re-encode the file when you save it. XML Notepad, for example, cannot be used as it enforces saving in the UTF-8 format. I recommend using Visual Studio as it "just works" for this task and is free.

    1. Make a backup copy of the config.xml file just in case you make a mistake.
    2. Download Visual Studio Express 2012 for Windows Desktop from here and install it.
    3. Register for a free PID here
    4. Open Visual Studio and enter your PID when prompted.
    5. Click File\Open File and select the config.xml file from the exported DB.
    6. Select Edit\Advanced\Format Document. This will switch the format to a hierarchical format, which is much easier to edit.
    7. Locate the RFs you want to remove from the config.xml file. (I bet you are asking, "Exactly how do I locate the RFs I want to remove in the config.xml?")
    How to locate your RFs in the config.xml:

    Each RF in the config.xml is enclosed by the tags <DfsrReplicatedFolder> and </DfsrReplicatedFolder>. Each RF name is enclosed by the tags <DfsrReplicatedFolderName> and </DfsrReplicatedFolderName>. In the example below I have identified the RF "RF01" and circled the beginning and ending tags.

    [Image: config.xml with the beginning and ending tags of RF "RF01" circled]

    8. Locate the RF or RFs that you do not want to import. Next to the <DfsrReplicatedFolder> tag click the minus sign to collapse the node. You should now see the collapsed RF as a single line in the config.xml file. An example of a collapsed RF is shown below circled in red.

    [Image: collapsed RF shown as a single line in config.xml, circled in red]

    9. Highlight the collapsed RF, right-click and select Cut. This will remove the RF you do not want to import from the config.xml. Repeat for each RF you want to remove.
    10. Save the edited config.xml file. The Import-DfsrClone cmdlet is hardcoded to use the filename config.xml. Do not try to use another file name.

    Now that you have your edited config.xml you can finish your DB clone as you normally would.
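    Before importing, it can be reassuring to double-check which RFs are still listed in the edited file. The Python sketch below only reads text and lists the RF names it finds between the tags described above; it never writes the file, so it cannot change the encoding. The sample fragment is made up for illustration (the real config.xml contains much more), and the sketch assumes the tags carry no attributes.

```python
import re

# Tag names are the ones described in this post; everything else is assumed.
RF_BLOCK_RE = re.compile(r"<DfsrReplicatedFolder>.*?</DfsrReplicatedFolder>", re.DOTALL)
RF_NAME_RE = re.compile(
    r"<DfsrReplicatedFolderName>(.*?)</DfsrReplicatedFolderName>", re.DOTALL
)

def list_replicated_folders(xml_text):
    """Return the RF names found in the text of a config.xml, in file order."""
    names = []
    for block in RF_BLOCK_RE.findall(xml_text):
        match = RF_NAME_RE.search(block)
        names.append(match.group(1).strip() if match else None)
    return names

# Hypothetical, heavily simplified fragment used only to exercise the function.
sample = (
    "<Root>"
    "<DfsrReplicatedFolder><DfsrReplicatedFolderName>RF01</DfsrReplicatedFolderName></DfsrReplicatedFolder>"
    "<DfsrReplicatedFolder><DfsrReplicatedFolderName>RF02</DfsrReplicatedFolderName></DfsrReplicatedFolder>"
    "</Root>"
)
print(list_replicated_folders(sample))
```

    To run it against a real export you would read the file into a string first, using whatever encoding the exported file actually has.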

    Warren

  • Understanding ATQ performance counters, yet another twist in the world of TLAs

    Hello again, this is guest author Herbert from Germany.

    If you have worked on an Active Directory performance issue, you might have noticed a number of AD performance counters for the NTDS and “Directory Services” objects, including some ATQ related counters.

    In this post, I provide a brief overview of ATQ performance counters, how to use them and discuss several scenarios we've seen.

    What are all these ATQ thread counters there for anyway?

    “ATQ” stands for “Asynchronous Thread Queue”.
    LSASS adopted its threading library from IIS to handle Windows socket communication and uses a thread queue to handle requests from Kerberos and LDAP.
    English versions of ATQ counters are named per component so you can group them together when viewing a performance log. Here is the list, with a short explanation of each ATQ counter:

    Counter | Explanation
    ATQ Estimated Queue Delay | How long a request has to wait in the queue
    ATQ Outstanding Queued Requests | Current number of requests in the queue
    ATQ Request Latency | Time it takes to process a request
    ATQ Threads LDAP | The number of threads used by the LDAP server as determined by LDAP policy
    ATQ Threads Other | Threads used by other components, in this case the KDC
    ATQ Threads Total | All threads currently allocated

    More details on the counters
    ATQ Threads Total
    This counter tracks the total number of threads from the ATQ Threads LDAP and ATQ Threads Other counters. The maximum number of threads that a given DC can apply to incoming workloads can be found by multiplying MaxPoolThreads by the number of logical CPU cores. MaxPoolThreads defaults to a value of 4 in LDAP Policy and should not be modified without understanding the implications.

    When viewing performance logs from a performance challenged DC:

    • Compare the “ATQ Threads Total” counter with the other two “ATQ Threads…” counters. If the “ATQ Threads LDAP” counter equals “ATQ Threads Total” then all of the LDAP listen threads are stuck processing LDAP requests currently. If the “ATQ Threads Other” counter equals “ATQ Threads Total”, then all of the LDAP listen threads are busy responding to Kerberos related traffic.
    • Similarly, note how close the current value for ATQ Threads Total is to the max value recorded in the trace and whether both values are using the maximum number of threads supported by the DC being monitored.

    Note that the value for the current number of ATQ Threads Total does not have to match the maximum value as the thread count will increase and decrease based on load. Pay attention when the current value for this counter matches the total # of threads supported by the DC being monitored.
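    As a quick worked example of the thread ceiling described above, here is a minimal Python sketch. It only encodes the arithmetic from this section (MaxPoolThreads times logical CPU cores) and the comparison between the three "ATQ Threads…" counters. The function names and the returned labels are made up for illustration; they are not part of any Microsoft tool.

```python
DEFAULT_MAX_POOL_THREADS = 4  # LDAP policy default mentioned in this post


def max_atq_threads(logical_cpus, max_pool_threads=DEFAULT_MAX_POOL_THREADS):
    """Upper bound on ATQ threads: MaxPoolThreads x logical CPU cores."""
    if logical_cpus < 1 or max_pool_threads < 1:
        raise ValueError("both values must be positive")
    return max_pool_threads * logical_cpus


def describe_sample(threads_total, threads_ldap, threads_other, logical_cpus,
                    max_pool_threads=DEFAULT_MAX_POOL_THREADS):
    """Label one performance-log sample (labels are hypothetical)."""
    ceiling = max_atq_threads(logical_cpus, max_pool_threads)
    if threads_total < ceiling:
        return "below ceiling"
    if threads_ldap >= threads_total:
        return "pool exhausted by LDAP"
    if threads_other >= threads_total:
        return "pool exhausted by other (KDC)"
    return "pool exhausted by mixed load"


if __name__ == "__main__":
    # An 8-core DC with the default policy tops out at 4 * 8 threads.
    print(max_atq_threads(8))
    print(describe_sample(32, 32, 0, logical_cpus=8))
```

    A sample labeled "below ceiling" is normal, since the thread count rises and falls with load. The samples worth a closer look are the ones that sit at the ceiling.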

    ATQ Threads LDAP
    This is the number of threads currently servicing LDAP requests. If there are a significant number of concurrent LDAP queries being processed, check for:

    • Expensive or inefficient LDAP queries
    • Excessive numbers of LDAP queries
    • An insufficient number of DCs to service the workload (or existing DCs are undersized)
    • Memory, CPU or disk bottlenecks on the DC

    Large values for this counter are common but the thread count should remain less than the total # of threads supported by your DC. The ATQ Threads LDAP and other ATQ counters are captured by the built-in AD Diagnostic Data Collector Set documented in this blog entry.

    Follow these guides if applications are generating expensive queries:

    The ATQ Threads LDAP counter could also run “hot” for reasons that are initially triggered by LDAP but are ultimately affected by external reasons:

    External factors

    Scenario 1: DC locator traffic (LDAP ping) from clients whose IP address doesn't map to an AD site

    Symptom and cause:

    • The LDAP server performs an exhaustive address lookup to discover additional client IP addresses so that it may find a site to map to the client.
    • LDAP, Kerberos and DC locator responses are slow or time out.
    • Netlogon event 5807 may be logged within a four hour window.
    • While it waits for the name resolution response or time-out, the related LDAP ping locks one of the threads of the limited Asynchronous Thread Queue (ATQ) pool. Many of these LDAP pings over a longer time may constantly exhaust the ATQ pool. Because the same pool is required for regular LDAP and Kerberos requests, the domain controller may become unresponsive or unavailable to users and applications.

    Resolution: The problem is described in KB article 2668820. Install the corrective fixes and policy documented in KB 2922852.

    Scenario 2: DC supports LDAP over SSL/TLS

    Symptom and cause:

    • A user sends a certificate on a session. The server needs to check for certificate revocation, which may take some time.
    • This becomes problematic if network communication is restricted and the DC cannot reach the Certificate Distribution Point (CDP) for a certificate.
    • To determine if your clients are using secure LDAP (LDAPs), check the counter "LDAP New SSL Connections/sec".
    • If there are a significant number of sessions, you might want to look at CAPI-Logging.

    Resolution: See the details below.

    For scenario 2: Depending on the details, there are a few approaches to remove the bottleneck:

    1. In certificate manager, locate the certificate used for LDAPs for the account in question and in the general pane, select the item “Enable only the following purposes” and uncheck the CLIENT_AUTHENTICATION purpose. The Internet Proxy and Universal Access Gateway team sees this more often in reverse proxy scenarios. This guide describes the Windows Server 2003 UI: http://technet.microsoft.com/en-us/library/cc514301.aspx
    2. Use different certificates that can be checked in the internal network, or remove the CLIENT_AUTHENTICATION purpose on new certificates.
    3. Allow the DC to access the real CDP, maybe allow it to traverse the proxy to the Internet. It’s quite possible that your security department gets a bit frantic at the idea.
    4. Shorten the time-out for CRL checks so the DC gives up faster, see ChainUrlRetrievalTimeoutMilliseconds and ChainRevAccumulativeUrlRetrievalTimeoutMilliseconds on TechNet. This does not avoid the problem, but reduces the performance impact.
    5. You can suppress the “invitation” to send certificates by not sending a list of trusted roots in the local store by using SendTrustedIssuerList=0. This does not help if the client is coded to always include a certificate if a suitable certificate is present. The Microsoft LDAP client defaults to doing this, thus:
    6. Change the client application to not include the user certificate. This requires setting an LDAP session option before starting the actual connection. In LDAP API set the option:

    LDAP_OPT_SSPI_FLAGS (0x92): Sets or retrieves a ULONG value giving the flags to pass to the SSPI InitializeSecurityContext function.

    In System.DirectoryServices.Protocols:

    SspiFlag: The SspiFlag property specifies the flags to pass to the Security Support Provider Interface (SSPI) InitializeSecurityContext function. For more information about the InitializeSecurityContext function, see the InitializeSecurityContext function topic in the MSDN library.

    From InitializeSecurityContext:

    ISC_REQ_USE_SUPPLIED_CREDS: Schannel must not attempt to supply credentials for the client automatically.

    ATQ Threads Other
    You can also have external dependencies generating requests that hit the Kerberos Key Distribution Center (KDC).
    One common operation is getting the list of global and universal groups from a DC that is not a Global Catalog (GC).
    A second external and potentially intermittent root cause occurs when the Kerberos Forest Search Order (KFSO) feature has been enabled on Windows Server 2008 R2 and later KDCs to search trusted forests for SPNs that cannot be located in the local forest.
    The worst case scenario occurs when the KDC searches both local and trusted forests for an SPN that can’t be found either because the SPN does not exist or because the search focused on an incorrect SPN.
    Memory dumps from in-state KDCs will reveal a number of threads working on Kerberos Service Ticket Requests along with pending RPC calls to remote domain controllers.
    Procdump triggered by performance counters could also be used to identify the condition if the spikes last long enough to start and capture the related traffic.

    More information on KFSO can be found on TechNet including performance counters to monitor when using this feature.

    ATQ Queues and ATQ Request Latency
    The ATQ Queue and latency counters provide statistics as to how requests are being processed. Since the type of requests can differ, the average processing time is typically not significant. An expensive LDAP query that takes minutes to execute can be masked by hundreds of fast LDAP queries or KDC requests.

    The main use of these counters is to monitor the wait time in queue and the number of requests in the queue. Any non-zero values indicate that the DC has run out of threads.

    Note that a performance monitor counter only reflects the value at the moment it is sampled. This is quite a problem when you have a long sample interval: a counter for current queue length such as “ATQ Outstanding Queued Requests” may not reliably show the actual degree of server overload.

    To work around this sampling problem, take other counters into consideration to validate the value. If a sample shows an actual wait time, there must have been requests sitting in the queue at some point in the last sample interval, even if the queue length counter reads zero. The load and processing delay was just not bad enough to have at least one request in the queue at the sample time-stamp.
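    The cross-check between the queue length and the wait time counters can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not a monitoring tool: the AtqSample type, its field names and the returned labels are assumptions, and you would fill the samples from your own performance log export.

```python
from dataclasses import dataclass


@dataclass
class AtqSample:
    """One sample from a performance log (field names are hypothetical)."""
    timestamp: str
    estimated_queue_delay: float      # "ATQ Estimated Queue Delay"
    outstanding_queued_requests: int  # "ATQ Outstanding Queued Requests"


def classify(sample):
    """Decide what a single sample tells us about queueing."""
    if sample.outstanding_queued_requests > 0:
        # Requests were waiting at the exact moment of sampling.
        return "queued at sample time"
    if sample.estimated_queue_delay > 0:
        # Queue length reads zero, but a wait time was recorded, so
        # requests must have queued at some point during the interval.
        return "queued between samples"
    return "no queueing observed"


if __name__ == "__main__":
    samples = [
        AtqSample("10:00:00", 0.0, 0),
        AtqSample("10:00:15", 12.0, 0),
        AtqSample("10:00:30", 40.0, 7),
    ]
    for s in samples:
        print(s.timestamp, classify(s))
```

    The middle case is the one that a queue length counter alone would miss.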

    What about other thread pools?

    LSASS has a number of other worker threads, e.g. to process IPSec-handshakes. Then of course there is the land of RPC server threads for the various RPC servers. Describing all the RPC servers would take up a number of additional blog entries. You can see them listed as “load generators” in the data collector set results.

    A lot of details on LSASS ATQ performance counters, I know. But, geeks love the details.

    Cheers,

    Herbert

  • Remove Lingering Objects that cause AD Replication error 8606 and friends

    Introducing the Lingering Object Liquidator

    Hi all, Justin Turner here. It's been a while since my last update. The goal of this post is to discuss what causes lingering objects and show you how to download, and then use the new GUI-based Lingering Object Liquidator (LOL) tool to remove them. This is a beta version of the tool, and it is currently not yet optimized for use in large Active Directory environments.

    This is a long article with lots of background and screen shots, so plug-in or connect to a fast connection when viewing the full entry. The bottom of this post contains a link to my AD replication troubleshooting TechNet lab for those that want to get their hands dirty with the joy that comes with finding and fixing AD replication errors.  I’ve also updated the post with a link to my Lingering Objects hands-on lab from TechEd Europe.

    Overview of Lingering Objects

    Lingering objects are objects in AD that have been created, replicated, deleted, and then garbage collected on at least the DC that originated the deletion but still exist as live objects on one or more DCs in the same forest. Lingering object removal has traditionally required lengthy cleanup sessions using tools like LDP or repadmin /removelingeringobjects. The removal story improved significantly with the release of repldiag.exe. We now have another tool for our tool belt: Lingering Object Liquidator. There are related topics such as “lingering links” which will not be covered in this post.

    Lingering Objects Drilldown

    The dominant causes of lingering objects are

    1. Long-term replication failures
    While knowledge of creates and modifies is persisted in Active Directory forever, replication partners must inbound replicate knowledge of deleted objects within a rolling Tombstone Lifetime (TSL) # of days (default 60 or 180 days depending on what OS version created your AD forest). For this reason, it is important to keep your DCs online and replicating all partitions between all partners within a rolling TSL # of days. Tools like REPADMIN /SHOWREPL * /CSV, REPADMIN /REPLSUM and AD Replication Status should be used to continually identify and resolve replication errors in your AD forest.

    2. Time jumps
    System time jumps of more than TSL # of days into the past or future can cause deleted objects to be prematurely garbage collected before all DCs have inbound replicated knowledge of all deletes. The protection against this is to ensure that:

      1. Your forest root PDC is continually configured with a reference time source (including following FSMO transfers)
      2. All other DCs in the forest are configured to use the NT5DS hierarchy
      3. Time rollback and roll-forward protection has been enabled via the maxnegphasecorrection and maxposphasecorrection registry settings or their policy-based equivalents.

    The importance of configuring safeguards can't be stressed enough. Look at this post to see what happens when time gets out of whack.

    3. USN Rollbacks

    USN rollbacks are caused when the contents of an Active Directory database move back in time via an unsupported restore. Root causes for USN Rollbacks include:

    • Manually copying previous version of the database into place when the DC is offline
    • P2V conversions in multi-domain forests
    • Snapshot restores of physical and especially virtual DCs. For virtual environments, both the virtual host environment AND the underlying guest DCs should be Virtual Machine Generation ID capable (Windows Server 2012 or later for the guest DCs). Both Microsoft and VMware make VM-Generation ID aware hypervisor hosts.
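    For the first cause in the list above (long-term replication failures), the output of REPADMIN /SHOWREPL * /CSV lends itself to scripting. The Python sketch below flags replication links that report failures or whose last success is older than a chosen fraction of the tombstone lifetime. Treat it as a sketch under assumptions: the column names ("Source DSA", "Destination DSA", "Naming Context", "Number of Failures", "Last Success Time") and the time-stamp format should be checked against the CSV your own DCs produce, and the warn_fraction parameter is made up for this example.

```python
import csv
import io
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed time-stamp format in the CSV


def stale_links(csv_text, now, tsl_days=180, warn_fraction=0.5):
    """Return links with failures or with an old last-success time.

    Each finding is (source, destination, naming context, failures, age in
    days). Age is None when the last-success value cannot be parsed.
    """
    threshold = timedelta(days=tsl_days * warn_fraction)
    findings = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        failures = int(row["Number of Failures"] or 0)
        try:
            last_success = datetime.strptime(row["Last Success Time"], TIME_FORMAT)
            age = now - last_success
        except ValueError:
            age = None  # unparseable time stamp: report it rather than guess
        if failures > 0 or age is None or age > threshold:
            findings.append((
                row["Source DSA"],
                row["Destination DSA"],
                row["Naming Context"],
                failures,
                None if age is None else age.days,
            ))
    return findings
```

    You could feed it the text captured from the repadmin command and review anything it returns well before the age approaches TSL.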

    Events, errors and symptoms that indicate you have lingering objects
    Active Directory logs an array of events and replication status codes when lingering objects are detected. It is important to note that while errors appear on the destination DC, it is the source DC being replicated from that contains the lingering object that is blocking replication. A summary of events and replication status codes is listed in the table below:

    Event or error status | Event or error text | Implication
    AD Replication status 8606 | "Insufficient attributes were given to create an object. This object may not exist because it may have been deleted." | Lingering objects are present on the source DC (destination DC is operating in Strict Replication Consistency mode)
    AD Replication status 8614 | The directory service cannot replicate with this server because the time since the last replication with this server has exceeded the tombstone lifetime. | Lingering objects likely exist in the environment
    AD Replication status 8240 | There is no such object on the server | Lingering object may exist on the source DC
    Directory Service event ID 1988 | Active Directory Domain Services Replication encountered the existence of objects in the following partition that have been deleted from the local domain controllers (DCs) Active Directory Domain Services database. | Lingering objects exist on the source DC specified in the event (destination DC is running with Strict Replication Consistency)
    Directory Service event ID 1388 | This destination system received an update for an object that should have been present locally but was not. | Lingering objects were reanimated on the DC logging the event (destination DC is running with Loose Replication Consistency)
    Directory Service event ID 2042 | It has been too long since this server last replicated with the named source server. | Lingering object may exist on the source DC

    A comparison of Tools to remove Lingering Objects

    The table below compares the Lingering Object Liquidator with currently available tools that can remove lingering objects

    Each removal method is listed with its object / partition and removal capabilities, followed by details.

    Lingering Object Liquidator

    Capabilities: Per-object and per-partition removal. Leverages:

    • RemoveLingeringObjects LDAP rootDSE modification
    • DRSReplicaVerifyObjects method

    Details:

    • GUI-based.
    • Quickly displays all lingering objects in the forest to which the executing computer is joined.
    • Built-in discovery via DRSReplicaVerifyObjects method
    • Automated method to remove lingering objects from all partitions
    • Removes lingering objects from all DCs (including RODCs) but not lingering links.
    • Windows Server 2008 and later DCs (will not work against Windows Server 2003 DCs)

    Repldiag /removelingeringobjects

    Capabilities: Per-partition removal. Leverages:

    • DRSReplicaVerifyObjects method

    Details:

    • Command line only
    • Automated method to remove lingering objects from all partitions
    • Built-in discovery via DRSReplicaVerifyObjects
    • Displays discovered objects in events on DCs
    • Does not remove lingering links. Does not remove lingering objects from RODCs (yet)

    LDAP RemoveLingeringObjects rootDSE primitive (most commonly executed using LDP.EXE or an LDIFDE import script)

    Capabilities: Per-object removal

    Details:

    • Requires a separate discovery method
    • Removes a single object per execution unless scripted.

    Repadmin /removelingeringobjects

    Capabilities: Per-partition removal. Leverages:

    • DRSReplicaVerifyObjects method

    Details:

    • Command line only
    • Built-in discovery via DRSReplicaVerifyObjects
    • Displays discovered objects in events on DCs
    • Requires many executions if a comprehensive (n * (n-1)) pairwise cleanup is required. Note: repldiag and the Lingering Object Liquidator tool automate this task.

    The Repldiag and Lingering Object Liquidator tools are preferred for lingering object removal because of their ease of use and holistic approach to lingering object removal.

    Why you should care about lingering object removal

    Lingering objects are widely known as the gift that keeps on giving. It is important to remove them for the following reasons:

    • Lingering objects can result in a long term divergence for objects and attributes residing on different DCs in your Active Directory forest
    • The presence of lingering objects prevents the replication of newer creates, deletes and modifications to destination DCs configured to use strict replication consistency. These un-replicated changes may apply to objects or attributes on users, computers, groups, group membership or ACLs.
    • Objects intentionally deleted by admins or applications continue to exist as live objects on DCs that have yet to inbound replicate knowledge of the deletes.

    Once present, lingering objects rarely go away until you implement a comprehensive removal solution. Lingering objects are the unwanted houseguests in AD that you just can't get rid of.

    Mother in law jokes… a timeless classic.

    We commonly find these little buggers to be the root cause of an array of symptoms ranging from logon failures to Exchange, Lync and AD DS service outages. Some outages are resolved after some lengthy troubleshooting only to find the issue return weeks later.
    In the remainder of this post, we will give you everything needed to eradicate lingering objects from your environment using the Lingering Object Liquidator.

    Repldiag.exe is another tool that will automate lingering object removal. It is good for most environments, but it does not provide an interface to see the objects, clean up RODCs (yet) or remove abandoned objects.

    Introducing Lingering Object Liquidator

     More:

    Lingering Object Liquidator automates the discovery and removal of lingering objects by using the DRSReplicaVerifyObjects method used by repadmin /removelingeringobjects and repldiag combined with the removeLingeringObject rootDSE primitive used by LDP.EXE. Tool features include:

    • Combines both discovery and removal of lingering objects in one interface
    • Is available via the Microsoft Connect site
    • The version of the tool at the Microsoft Connect site is an early beta build and does not have the fit and finish of a finished product
    • Feature improvements beyond what you see in this version are under consideration

    How to obtain Lingering Object Liquidator

    1. Log on to the Microsoft Connect site (using the Sign in link) with a Microsoft account:

    http://connect.microsoft.com

    Note: You may have to create a profile on the site if you have never participated in Connect.

    2. Open the Non-feedback Product Directory:

    https://connect.microsoft.com/directory/non-feedback

    3. Join the following program:

    AD Health

    Product Azure Active Directory Connection Join link

    4. Click the Downloads link to see a list of downloads or this link to go directly to the Lingering Objects Liquidator download. (Note: the direct link may become invalid as the tool gets updated.)

    5. Download all associated files

    6. Double click on the downloaded executable to open the tool.

    Tool Requirements

    1. Install Lingering Object Liquidator on a DC or member computer in the forest you want to remove lingering objects from.

    2. .NET 4.5 must be installed on the computer that is executing the tool.

    3. Permissions: The user account running the tool must have Domain Admin credentials for each domain in the forest that the executing computer resides in. Members of the Enterprise Admins group have domain admin credentials in all domains within a forest by default. Domain Admin credentials are sufficient in a single domain or single domain forest.

    4. The admin workstation must have connectivity over the same ports and protocols required of a domain-joined member computer or domain controller against any DC in the forest. Protocols of interest include DNS, Kerberos, RPC, LDAP and the ephemeral port range used by the targeted DC. See TechNet for more detail. Of specific concern: Pre-W2K8 DCs communicate over the “low” ephemeral port range between 1024 and 5000 while post-W2K3 DCs use the “high” ephemeral port range between 49152 and 65535. Environments containing both OS version families will need to enable connectivity over both port ranges.

    5. You must enable the Remote Event Log Management (RPC) firewall rule on any DC that needs scanning. Otherwise, the tool displays a window stating, "Exception: The RPC server is unavailable"

    6. The liquidation of lingering objects in AD Lightweight Directory Services (AD LDS / ADAM) environments is not supported.

    7. You cannot use the tool to clean up lingering objects on DCs running Windows Server 2003. The tool leverages the event subscriptions feature which wasn’t added until Windows Server 2008.
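    To sanity-check the connectivity requirement in item 4 before launching the tool, a small script can help. The Python sketch below is an assumption-laden illustration: it only tests TCP reachability of the well-known DNS (53), Kerberos (88), RPC endpoint mapper (135) and LDAP (389) ports, and it lists which ephemeral ranges your firewall rules need to cover. It does not prove that RPC calls will succeed, and the host name in the usage example is hypothetical.

```python
import socket

WELL_KNOWN_TCP_PORTS = {"DNS": 53, "Kerberos": 88, "RPC endpoint mapper": 135, "LDAP": 389}
LOW_EPHEMERAL = (1024, 5000)     # pre-Windows Server 2008 DCs
HIGH_EPHEMERAL = (49152, 65535)  # Windows Server 2008 and later DCs


def ephemeral_ranges(has_pre_2008_dcs, has_2008_or_later_dcs):
    """Ephemeral port ranges that firewall rules must allow."""
    ranges = []
    if has_pre_2008_dcs:
        ranges.append(LOW_EPHEMERAL)
    if has_2008_or_later_dcs:
        ranges.append(HIGH_EPHEMERAL)
    return ranges


def port_reachable(host, port, timeout=2.0):
    """True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_dc(host):
    """Map each well-known service name to its TCP reachability."""
    return {name: port_reachable(host, port)
            for name, port in WELL_KNOWN_TCP_PORTS.items()}


if __name__ == "__main__":
    # "dc1.root.contoso.com" is a placeholder; use one of your own DCs.
    for service, ok in check_dc("dc1.root.contoso.com").items():
        print(service, "reachable" if ok else "NOT reachable")
```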

    Lingering Object Discovery

    To see all lingering objects in the forest:

    1. Launch Lingering Objects.exe.

    2. Take a quick walk through the UI:

    Naming Context:

    Reference DC: the DC you will compare to the target DC. The reference DC hosts a writeable copy of the partition.

    Note: ChildDC2 should not be listed here since it is an RODC, and RODCs are not valid reference DCs for lingering object removal.

     More:

    The version of the tool is still in development and does not represent the finished product. In other words, expect crashes, quirks and everything else normally encountered with beta software.

    Target DC: the DC that lingering objects are to be removed from

    3. In smaller AD environments, you can leave all fields blank to have the entire environment scanned, and then click Detect. The tool does a comparison amongst all DCs for all partitions in a pairwise fashion when all fields are left blank. In a large environment, this comparison will take a great deal of time as the operation targets (n * (n-1)) number of DCs in the forest for all locally held partitions. For shorter, targeted operations, select a naming context, reference DC and target DC. The reference DC must hold a writable copy of the selected naming context.
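    To get a feel for why a full scan takes so long in a large environment, the pairwise comparison can be sketched in Python. This is a rough model, not the tool's actual scheduling logic: it assumes every writable DC acts as a reference for every other DC in a partition, and that RODCs are only ever targets. The function name is made up for this example.

```python
def detection_pairs(writable_dcs, read_only_dcs=()):
    """All (reference DC, target DC) pairs for one partition.

    Reference DCs must hold a writable copy, so RODCs only appear as targets.
    """
    targets = list(writable_dcs) + list(read_only_dcs)
    return [(ref, target)
            for ref in writable_dcs
            for target in targets
            if ref != target]


if __name__ == "__main__":
    # Three writable DCs give 3 * 2 comparisons for a single partition.
    print(len(detection_pairs(["DC1", "DC2", "DC3"])))
    # Fifty writable DCs already need 50 * 49 comparisons per partition.
    print(len(detection_pairs([f"DC{i}" for i in range(50)])))
```

    Because the count grows with the square of the number of DCs, and is repeated for each partition, selecting a specific naming context, reference DC and target DC keeps the operation short.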

    During the scan, several buttons are disabled. The current count of lingering objects is displayed in the status bar at the bottom of the screen along with the current tool status. During this execution phase, the tool is running in an advisory mode and reading the event log data reported on each target DC.

    Note: The Directory Service event log may completely fill up if the environment contains large numbers of lingering objects and the Directory Services event log is using its default maximum log size. The tool leverages the same lingering object discovery method as repadmin and repldiag, logging one event per lingering object found.

    When the scan is complete, the status bar updates, buttons are re-enabled and total count of lingering objects is displayed. The log pane at the bottom of the window updates with any errors encountered during the scan.
    Error 1396 is logged if the tool incorrectly uses an RODC as a reference DC.
    Error 8440 is logged when the targeted reference DC doesn't host a writable copy of the partition.

     Note:

    Lingering Object Liquidator discovery method

    • Leverages DRSReplicaVerifyObjects method in Advisory Mode
    • Runs for all DCs and all Partitions
    • Collects lingering object event ID 1946s and displays objects in main content pane
    • List can be exported to CSV for offline analysis (or modification for import)
    • Supports import and removal of objects from CSV import (leverage for objects not discoverable using DRSReplicaVerifyObjects)
    • Supports removal of objects by DRSReplicaVerifyObjects and LDAP rootDSE removeLingeringobjects modification

    The tool leverages the Advisory Mode method exposed by DRSReplicaVerifyObjects that both repadmin /removelingeringobjects /Advisory_Mode and repldiag /removelingeringobjects /advisorymode use. In addition to the normal Advisory Mode related events logged on each DC, it displays each of the lingering objects within the main content pane.

    Details of the scan operation are logged in the linger.log.txt file in the same directory as the tool's executable.

    The Export button allows you to export a list of all lingering objects listed in the main pane into a CSV file. View the file in Excel, modify if necessary and use the Import button later to view the objects without having to do a new scan. The Import feature is also useful if you discover abandoned objects (not discoverable with DRSReplicaVerifyObjects) that you need to remove. We briefly discuss abandoned objects later in this post.

    Removal of individual objects

    The tool allows you to remove objects a handful at a time, if desired, using the Remove button:

    1. Here I select three objects (hold down the Ctrl key to select multiple objects, or the SHIFT key to select a range of objects) and then select Remove.

    The status bar updates with the new count of lingering objects and the status of the removal operation:

    Logging for removed objects

    The tool dumps a list of attributes for each object before removal, and logs this along with the results of the object removal in the removedLingeringObjects.log.txt log file. This log file is in the same location as the tool's executable.

    C:\tools\LingeringObjects\removedLingeringObjects.log.txt

    the obj DN: <GUID=0bb376aa1c82a348997e5187ff012f4a>;<SID=010500000000000515000000609701d7b0ce8f6a3e529d669f040000>;CN=Dick Schenk,OU=R&D,DC=root,DC=contoso,DC=com

    objectClass:top, person, organizationalPerson, user;
    sn:Schenk ;
    whenCreated:20121126224220.0Z;
    name:Dick Schenk;
    objectSid:S-1-5-21-3607205728-1787809456-1721586238-1183;primaryGroupID:513;
    sAMAccountType:805306368;
    uSNChanged:32958;
    objectCategory:<GUID=11ba1167b1b0af429187547c7d089c61>;CN=Person,CN=Schema,CN=Configuration,DC=root,DC=contoso,DC=com;
    whenChanged:20121126224322.0Z;
    cn:Dick Schenk;
    uSNCreated:32958;
    l:Boulder;
    distinguishedName:<GUID=0bb376aa1c82a348997e5187ff012f4a>;<SID=010500000000000515000000609701d7b0ce8f6a3e529d669f040000>;CN=Dick Schenk,OU=R&D,DC=root,DC=contoso,DC=com;
    displayName:Dick Schenk ;
    st:Colorado;
    dSCorePropagationData:16010101000000.0Z;
    userPrincipalName:Dick@root.contoso.com;
    givenName:Dick;
    instanceType:0;
    sAMAccountName:Dick;
    userAccountControl:650;
    objectGUID:aa76b30b-821c-48a3-997e-5187ff012f4a;
    value is :<GUID=70ff33ce-2f41-4bf4-b7ca-7fa71d4ca13e>:<GUID=aa76b30b-821c-48a3-997e-5187ff012f4a>
    Lingering Obj CN=Dick Schenk,OU=R&D,DC=root,DC=contoso,DC=com is removed from the directory, mod response result code = Success
    ----------------------------------------------
    RemoveLingeringObject returned Success

    Removal of all objects

    The Remove All button removes all lingering objects from all DCs in the environment.

    To remove all lingering objects from the environment:

    1. Click the Remove All button. The status bar updates with the count of lingering objects removed. (The count may differ from the discovered amount due to a bug in the tool. This is a display issue only; the objects are actually removed.)

    2. Close the tool and reopen it so that the main content pane clears.

    3. Click the Detect button and verify no lingering objects are found.

    Abandoned object removal using the new tool

    None of the currently available lingering object removal tools will identify a special sub-class of lingering objects referred to internally as "abandoned objects".

    An abandoned object is an object created on one DC that never got replicated to other DCs hosting a writable copy of the NC but does get replicated to DCs/GCs hosting a read-only copy of the NC. The originating DC goes offline prior to replicating the originating write to other DCs that contain a writable copy of the partition.

    The lingering object liquidator tool does not currently discover abandoned objects automatically so a manual method is required.

    1. Identify abandoned objects based on Oabvalidate and replication metadata output.

    Abandoned objects can be removed with the LDAP RemoveLingeringObject rootDSE modify procedure, and so Lingering Objects Liquidator is able to remove these objects.

    2. Build a CSV file for import into the tool. Once they are visible in the tool, simply click the Remove button to get rid of them.

    a. To create a Lingering Objects Liquidator tool importable CSV file:

    Collect the data in a comma separated value (CSV) with the following data:

    FQDN of RWDC

    CNAME of RWDC

    FQDN of DC to remove object from

    DN of the object

    Object GUID of the object

    DN of the object's partition

    3. Once you have the file, open the Lingering Objects tool and select the Import button, browse to the file and choose Open.

    4. Select all objects and then choose Remove.

    Review replication metadata to verify the objects were removed.
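    The CSV file from step 2 can be generated with a short Python script. This is a sketch under assumptions: the six fields are written in the order listed above, but whether the tool expects a header row, or this exact column order, is not confirmed here. The safest check is to compare the result with a file produced by the tool's Export button. The DC names in the usage example are hypothetical.

```python
import csv
import io

# Field order follows the list in step 2; treat it as an assumption.
FIELDS = [
    "FQDN of RWDC",
    "CNAME of RWDC",
    "FQDN of DC to remove object from",
    "DN of the object",
    "Object GUID of the object",
    "DN of the object's partition",
]


def build_import_csv(rows, include_header=False):
    """Return CSV text with one line per abandoned object."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if include_header:
        writer.writerow(FIELDS)
    for row in rows:
        if len(row) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} values, got {len(row)}")
        # csv.writer quotes values containing commas, such as DNs.
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [(
        "dc1.root.contoso.com",                     # hypothetical RWDC
        "<dsa-guid>._msdcs.root.contoso.com",       # placeholder CNAME
        "childdc2.child.root.contoso.com",          # hypothetical target DC
        "CN=Dick Schenk,OU=R&D,DC=root,DC=contoso,DC=com",
        "aa76b30b-821c-48a3-997e-5187ff012f4a",
        "DC=root,DC=contoso,DC=com",
    )]
    print(build_import_csv(rows), end="")
```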

    Resources

    For those that want even more detail on lingering object troubleshooting, check out the following:

    To prevent lingering objects:

    • Actively monitor for AD replication failures using a tool like the AD Replication Status tool.
    • Resolve AD replication errors within tombstone lifetime number of days.
    • Ensure your DCs are operating in Strict Replication Consistency mode
    • Protect against large jumps in system time
    • Use only supported methods or procedures to restore DCs. Do not:
      • Restore backups older than TSL
      • Perform snapshot restores on pre Windows Server 2012 virtualized DCs on any virtualization platform
      • Perform snapshot restores on a Windows Server 2012 or later virtualized DC on a virtualization host that doesn't support VMGenerationID

    If you want hands-on practice troubleshooting AD replication errors, check out my lab on TechNet Virtual labs. Alternatively, come to an instructor-led lab at TechEd Europe 2014. "EM-IL307 Troubleshooting Active Directory Replication Errors"

    For hands-on practice troubleshooting AD lingering objects, check out my lab from TechEd Europe 2014: "EM-IL400 Troubleshooting Active Directory Lingering Objects"

    Finally, if you would like access to a hands-on lab for in-depth lingering object troubleshooting, let us know in the comments.

    Thank you,

    Justin Turner and A. Conner

    Update 2014/11/20 – Added link to TechEd Lingering objects hands-on lab
    Update 2014/12/17 – Added text to indicate the lack of support in LOL for cleanup of Windows Server 2003 DCs

  • Managing the Store app pin to the Taskbar added in the Windows 8.1 Update

    Update 9/9/2014

    Warren here yet again to update this blog and tell you that the GP to control the Store icon pin has shipped in the August 2014 update: http://support.microsoft.com/kb/2975719/. If you want to control the Store icon pinned to the taskbar, be sure to install the August 2014 update on all the targeted machines.

    You can now have the Store disabled and the Store Icon removed via GP, or leave the Store enabled but remove the Store Icon pinned to the taskbar if that is what you need. The previous behavior of preventing the Store icon from being pinned during installation of Update 1 if the Store is disabled via GP remains unchanged.

    The new GP is named: “Do not allow pinning Store app to the Taskbar”  

    The full path to the new GP is: “User Configuration\Administrative Templates\Start Menu and Taskbar\Do not allow pinning Store app to the Taskbar”

    Explain text for this GP:

    This policy setting allows you to control pinning the Store app to the Taskbar

    If you enable this policy setting, users cannot pin the Store app to the Taskbar. If the Store app is already pinned to the Taskbar, it will be removed from the Taskbar on next login

    If you disable or do not configure this policy setting, users can pin the Store app to the Taskbar

    Thanks to everyone for their feedback on this issue and their patience while we developed and shipped the fix.

    ===========================================================================================================

    Update 7/14/2014

    Warren here with an update on the Store icon issue. Good News! Your feedback has been heard, understood and acted upon. A fix is in the works that will address the scenarios below:

    Scenario 1 - You want to block the Store but have enabled the GP to block the Store after applying Windows 8.1 Update.  A fix will be made to the GP, such that it will remove the Store Icon pin if the “disable Store” GP is already set.

    Scenario 2 - You want to provide access to the Store but want to remove the Store icon pin from the taskbar. A GP will be provided that can manage the Store icon pin.

    Thanks for all of your feedback on this issue!

    Warren

    ===========================================================================================================

    4/9/2014

    Warren here, posting with more news regarding the Windows 8.1 Update. Among the many features added by Windows 8.1 Update is that the Store icon will be pinned to the user’s taskbar when users first log on after updating their PC with Windows 8.1 Update.

    Some companies will not want the Store icon pinned to the taskbar on company-owned devices.  There are currently two Group Policy options to control the Store tile pin: one that you can use before deploying the update to prevent the Store app from being pinned to the Taskbar, and another that you can use after the update has been deployed and the Store app has been pinned to the Taskbar.

    Option 1:  Turn off the Store application before installing the Windows 8.1 Update

    Use the Group Policy “Turn off the Store application”

    As mentioned earlier, the Store icon is pinned to the Taskbar at first logon after Windows 8.1 Update is applied. The Store application will not be pinned to the taskbar if the Group Policy “Turn off the Store application” is applied to the computer. This option is not retroactive: the Group Policy must be applied to the workstation before the update is applied. The full path to this Group Policy is:

    Computer Configuration\Administrative Templates\Windows Components\Store\Turn off the Store application

    Or

    User Configuration\Administrative Templates\Windows Components\Store\Turn off the Store application

    You can use either Group Policy. As the name of the policy indicates, this will completely disable the Store. If you want to allow access to the Store but do not want the Store tile pinned to the Taskbar, see Option 2.

    Important note: By default the Group Policy setting “Turn off the Store application” will not show up in GPEDIT.MSC or GPMC.MSC if you run the tools on a Windows Server. You have two options: install the Remote Server Administration Tools (RSAT) on a Windows 8.1 client and edit the Group Policy from that machine, or install the Desktop Experience feature on the server used for editing Group Policy. The preferred method is to install RSAT on a workstation. You can download RSAT for Windows 8.1 here: http://www.microsoft.com/en-us/download/details.aspx?id=39296

    Option 2:  Use Group Policy to remove Pinned applications from the Taskbar after Installing the Update

    Use the Group Policy “Remove pinned programs from the Taskbar”

    This GP is a big hammer in that it will remove all pinned tiles from the taskbar, and users subject to the policy will not be able to pin any applications or tiles to the Taskbar. This accomplishes the goal of not pinning the Store tile to the taskbar and leaves the Store accessible from Start.

    User Configuration\Administrative Templates\Start Menu and Taskbar\Remove pinned programs from the Taskbar

    Other Options

    The last available option at this time is to have users unpin the Store app on their systems. Programmatically changing the Taskbar pins is neither supported nor encouraged by Microsoft. See http://msdn.microsoft.com/en-us/library/dd378460(VS.85).aspx

  • Hate to see you go, but it’s time to move on to greener pastures. A farewell to Authorization Manager aka AzMan

    Hi all, Jason here. Long time reader, first time blogger. AzMan is Microsoft’s tool to manage authorization to applications based on a user’s role. AzMan has been around since 2003 and has had a good run. Now it’s time to send it out to pasture. If you haven’t seen this article, AzMan has been added to the list of technologies that will eventually be removed from the OS. As of Server 2012 and 2012 R2, AzMan has been marked as deprecated, which is a term we use to let our customers know that the specific technology in question will be removed in a subsequent release of the OS. It has recently been announced that AzMan will no longer be in future releases of the Windows Server OS (after 2012 R2).

    What does this mean to you? If you are on a newer OS and use AzMan, not much (right now). If you use AzMan on, for example, Server 2003, you need to either get AzMan prepped and ready on a newer OS or find a suitable replacement for role-based authorization. Keep in mind that each OS has its own life cycle, so AzMan isn’t immediately going away. We have well into 2023 before we see the last of AzMan. AzMan will continue to work on whichever OS you are currently using it on; just be aware of the OS life cycle to make sure that your OS, and with it your implementation of AzMan, is still supported. The obvious question here is, where do we go?

    The best answer would be making your application claims aware. Claims allow you to make authorization decisions based on data sent within the claim token. Want access based on user group in AD? Sounds like you want claims. Want authorization to a specific site based on who your manager is? Claims can do that. I don’t want to make it sound like this is an immediate “click here and it fixes everything for you”. You will have to recode your application to be able to consume claims sent by a claims provider, and that isn’t going to be flowers and unicorns. There will be some hard work to move it over; however, the gains will be huge, as there has been a large surge in claims-based applications and services in the last few years (O365 included). Windows already has a claims provider you can use to build claims tokens and send them to your application (this is ADFS, and I’d be surprised if you haven’t heard of it), and it’s either already in the OS or a download away (depending on which OS you are running). If you are using AzMan and looking for the push to get you into the claims game, this is the nudge you’ve been looking for.

    A few things to keep in mind if you are intending to use ADFS for your claims provider:

    · ADFS is provided in 2003 R2; however, this is 1.x and does not have some of the features that 2.x and later have. Also, some of the terminology is different and could be confusing to start your claims experience with, not to mention that 2003 is close to end of life.

    · ADFS is a separate download for 2008 and 2008 R2. It is provided in the OS as a role, but this is 1.1. You definitely want the downloaded version. (Make sure to get rollup 3 KB2790338, update KB2843638 and update KB2896713.)

    · ADFS is provided in the OS on 2012 (ADFS 2.1) and 2012 R2 (ADFS 3.0)

    A few helpful links to get you started with using claims based authentication/authorization:

    Claims-Aware Applications

    http://msdn.microsoft.com/en-us/library/windows/desktop/bb736227(v=vs.85).aspx

    Building My First Claims-Aware ASP.NET Web Application

    http://msdn.microsoft.com/en-us/library/hh545401(v=vs.110).aspx

    Hopefully these can give you enough of a starter to build a proof of concept and get your team ready to dive into the claims game.

  • It turns out that weird things can happen when you mix Windows Server 2003 and Windows Server 2012 R2 domain controllers

    UPDATE:  The hotfix is now available for this issue!  Get it at http://support.microsoft.com/kb/2989971

    This hotfix applies to Windows Server 2012 R2 domain controllers and should prevent the specific problem discussed below from occurring.

    It’s important to note that the symptoms of users and computers not being able to log on can happen for a number of different reasons.  Many of the folks in the comments have posted that they have these sorts of issues but don’t have Windows Server 2003 domain controllers, for example.  If you’re still having problems after you have applied the hotfix, please call in a support case so that we can help you get those fixed!

    =====================================================

    We have been getting quite a few calls lately where Kerberos authentication fails intermittently and users are unable to log on.  By itself, that’s a type of call that we’re used to and we help our customers with all the time.  Most experienced AD admins know that this can happen because of broken AD replication, unreachable DCs on your network, or a variety of other environmental issues that all of you likely work hard to avoid as much as possible - because let’s face it, the last thing any admin wants is to have users unable to log in – especially intermittently.

    Anyway, we’ve been getting more calls than normal about this lately, and that led us to take a closer look at what was going on.  What we found is that there’s a problem that can manifest when you have Windows Server 2003 and Windows Server 2012 R2 domain controllers serving the same domain.  Since many of you are trying very hard to get rid of your last Windows Server 2003 domain controllers, you might be running into this.  In the case of the customers that called us, the login issues were actually preventing them from being able to complete their migration to Windows Server 2012 R2.

    We want all of our customers to be running their Active Directory on the latest supported OS version, which is frankly a lot more scalable, robust, and powerful than Windows Server 2003.  We realize that upgrading an enterprise environment is not easy, and much less so when your users start to have problems during your upgrade.  So we’re just going to come out and say it right up front:

    We are working on a hotfix for this issue, but it’s going to take us some time to get it out to you. In the meantime, here are some details about the problem and what you can do right now.

    Symptoms include:

    1. When any domain user tries to log on to their computer, the logon may fail with “unknown username or bad password”. Only local logons are successful.

    If you look in the system event log, you may notice Kerberos Event ID 4 errors that look like this:

    Event ID: 4
    Source: Kerberos
    Type: Error
    "The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server host/myserver.domain.com.  This indicates that the password used to encrypt the Kerberos service ticket is different than that on the target server. Commonly, this is due to identically named machine accounts in the target realm (domain.com), and the client realm.   Please contact your system administrator."

    2. Operating Systems on which the issue has been seen: Windows 7, WS2008 R2, WS2012 R2

    3. This can affect clients and servers (including domain controllers)

    4. This problem specifically occurs after the affected machine has changed its password. It can take anywhere from a few minutes to a few hours after the change before the symptoms manifest.

    So, if you suspect you have a machine with this issue, check when it last changed its password and whether this was around the time when the issue started.

    This can be done using the repadmin /showobjmeta command.

    Example:

    repadmin /showobjmeta * "CN=mem01,OU=Workstations,DC=contoso,DC=com"

    This command will get the object metadata for the mem01 server from all DCs.

    In the output, check the pwdLastSet attribute and see if the timestamp is around the time you started to see the problem on this machine.

    Why this happens:

    The Kerberos client depends on a “salt” from the KDC in order to create the AES keys on the client side. These AES keys are used to hash the password that the user enters on the client, and protect it in transit over the wire so that it can’t be intercepted and decrypted. The “salt” refers to information that is fed into the algorithm used to generate the keys, so that the KDC is able to verify the password hash and issue tickets to the user.
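    To make the role of the salt concrete, here is a minimal Python sketch of the first stage of AES key derivation as described in RFC 3962, which runs PBKDF2-HMAC-SHA1 over the password and the salt (4096 iterations is the RFC's default). This is only that first stage, not the complete Kerberos string-to-key function, and the password and both salt strings are hypothetical values chosen for illustration. The point is simply that the same password with two different salts yields two different keys.

```python
import hashlib

def pbkdf2_stage(password: str, salt: str, iterations: int = 4096) -> bytes:
    """PBKDF2-HMAC-SHA1 over password and salt, producing 32 bytes (AES-256 size).

    This is only the first stage of the RFC 3962 string-to-key function;
    real Kerberos applies a further key-derivation step to this output.
    """
    return hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt.encode("utf-8"), iterations, 32
    )

# Hypothetical values, for illustration only.
password = "Example-Machine-Passw0rd"
salt_seen_by_client = "CONTOSO.COMhostmem01.contoso.com"
salt_used_by_kdc = "CONTOSO.COMmem01"  # deliberately different

client_key = pbkdf2_stage(password, salt_seen_by_client)
kdc_key = pbkdf2_stage(password, salt_used_by_kdc)

print(client_key == kdc_key)                                     # False
print(client_key == pbkdf2_stage(password, salt_seen_by_client))  # True
```

    If the client and the KDC disagree on the salt, they end up with different keys even though the password is identical, and anything encrypted with one key cannot be decrypted with the other.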

    When a Windows Server 2012 R2 DC is promoted in an environment where Windows Server 2003 DCs are present, there is a mismatch in the encryption types that are supported on the KDCs and used for salting. Windows Server 2003 DCs do not support AES and Windows Server 2012 R2 DCs don’t support DES for salting.

    You might be wondering why these encryption types matter.  As computer hardware gets more powerful, older encryption methods become easier and easier to break.  Thus, we are constantly incorporating newer, more powerful encryption into Windows and Kerberos in order to help protect your user passwords (and your data and your network).

    Workaround:

    If users are having the problem:

    Restart the computer that is experiencing the issue. This recreates the AES key as the client machine or member server reaches out to the KDC for the salt. Usually, this will fix the issue temporarily (at least until the next password change).

    To prevent this from happening, please apply the hotfix to all Windows Server 2012 R2 domain controllers in the environment.

    How to prevent this from happening:

    Option 1: Query against Active Directory the list of computers which are about to change their machine account password and proactively reset their password against a Windows Server 2012 R2 DC and follow that by a reboot.

    There’s an advantage to doing it this way: since you are not disabling any encryption type and are keeping things set at the default, you shouldn’t run into any other authentication-related issues as long as the machine account password is reset successfully.

    Unfortunately, doing this will mean a reboot of machines that are about to change their passwords, so plan on doing this during non-business hours when you can safely reboot workstations.

    We’ve created a quick PowerShell script that you can run to do this.

    Sample PS script:

    > Import-Module ActiveDirectory

    > Get-ADComputer -Filter * -Properties PasswordLastSet | Select-Object Name, PasswordLastSet | Export-Csv machines.csv -NoTypeInformation

    This will get you the list of machines and the dates they last set their password.  By default, machines reset their password every 30 days.  Open the created CSV file in Excel and identify the machines that last set their password 28 or 29 days ago. (If you see a lot of machines with dates well beyond 30 days, those machines are likely no longer active.)
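    If you would rather not eyeball the dates in a spreadsheet, the selection rule can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes you have already loaded each machine name and its PasswordLastSet value as a datetime (date formats in the exported CSV depend on locale, so parsing is left out), and the machine names are hypothetical.

```python
from datetime import datetime, timedelta

def due_for_password_change(machines, now, min_age_days=28, max_age_days=29):
    """Return names whose password age, in whole days, is between the two bounds."""
    due = []
    for name, password_last_set in machines:
        age_days = (now - password_last_set).days
        if min_age_days <= age_days <= max_age_days:
            due.append(name)
    return due

# Hypothetical inventory, for illustration only.
now = datetime(2014, 9, 1, 12, 0)
machines = [
    ("MEM01", now - timedelta(days=28, hours=3)),   # due soon
    ("MEM02", now - timedelta(days=5)),             # changed recently
    ("MEM03", now - timedelta(days=200)),           # probably inactive
    ("MEM04", now - timedelta(days=29, hours=23)),  # due very soon
]

print(due_for_password_change(machines, now))  # ['MEM01', 'MEM04']
```

    Machines far older than 30 days fall outside the window on purpose, matching the note above that they are probably no longer active.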

    Reset Password:

    Once you have identified the machines that are most likely to hit the issue in the next couple of days, proactively reset their password by running the command below on those machines.  You can use tools such as PsExec, System Center, or other utilities that allow you to execute the command remotely instead of logging on interactively to each machine.

    nltest /SC_CHANGE_PWD:<DomainName> /SERVER:<Target Machine>

    Then reboot.
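    For a longer list of machines, the command lines can be generated rather than typed. The Python helper below is a small sketch that only fills in the nltest template shown above; the function name, domain, and machine names are hypothetical, and how you deliver each command to its target machine is up to your remote execution tool of choice.

```python
def reset_commands(domain, machines):
    """Fill in the nltest template from the post for each target machine."""
    return [f"nltest /SC_CHANGE_PWD:{domain} /SERVER:{machine}" for machine in machines]

# Hypothetical domain and machine names, for illustration only.
for command in reset_commands("contoso.com", ["MEM01", "MEM04"]):
    print(command)
```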

    Option 2: Disable machine password change or increase duration to 120 days.

    You should not run into this issue at all if password change is disabled. Normally we don’t recommend doing this since machine account passwords are a core part of your network security and should be changed regularly. However because it’s an easy workaround, the best mitigation right now is to set it to 120 days. That way you buy time while you wait for the hotfix.

    If you go with this approach, make sure you set your machine account password duration back to normal after you’ve applied the hotfix that we’re working on.

    Here are the relevant Group Policy settings to use for this option:

    Computer Configuration\Windows Settings\Security Settings\Local Policies\Security Options

    Domain Member:  Maximum machine account password age:

    Domain Member: Disable machine account password changes:

    Option 3: Disable AES in the environment by modifying Supported Encryption Types for Kerberos using Group Policy. This tells your domain controllers to use RC4-HMAC as the encryption algorithm, which is supported in Windows Server 2003, Windows Server 2012, and Windows Server 2012 R2.

    You may have heard that we had a security advisory recently to disable RC4 in TLS. Such attacks don’t apply to Kerberos authentication, but there is ongoing research in RC4 which is why new features such as Protected Users do not support RC4. Deploying this option on a domain computer will make it impossible for Protected Users to sign on, so be sure to remove the Group Policy once the Windows Server 2003 DCs are retired.

    The advantage to doing this is that once the policy is applied consistently, you don’t need to chase individual workstations. However, you’ll still have to reset machine account passwords and reboot computers to make sure they have new RC4-HMAC keys stored in Active Directory.

    You should also make sure that the hotfix https://support.microsoft.com/kb/2768494  is in place on all of your Windows 7 clients and Windows Server 2008 R2 member servers, otherwise they may have other issues.

    Remember, if you take this option, then after the hotfix for this particular issue is released and applied on Windows Server 2012 R2 KDCs, you will need to modify the policy again in order to re-enable AES in the domain, and all the machines will require a reboot.

    Here are the relevant group policy settings for this option:

    Computer Configuration\Windows Settings\Security Settings\Local Policies\Security Options

    Network Security:  Configure encryption types allowed for Kerberos:

    Be sure to check:  RC4_HMAC_MD5

    If you have UNIX/Linux clients that use keytab files that were configured with DES, also enable:  DES_CBC_CRC, DES_CBC_MD5

    Make sure that AES128_HMAC_SHA1 and AES256_HMAC_SHA1 are NOT checked
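    Supported encryption types are commonly expressed as a bitmask, so it can help to see what the checkbox combinations above add up to. The Python sketch below uses the bit values documented for the msDS-SupportedEncryptionTypes attribute; that the policy setting stores its selection with the same bit values is my understanding rather than something stated in this post, and the helper names are hypothetical.

```python
# Bit values as documented for msDS-SupportedEncryptionTypes.
ENCRYPTION_TYPE_BITS = {
    "DES_CBC_CRC": 0x1,
    "DES_CBC_MD5": 0x2,
    "RC4_HMAC_MD5": 0x4,
    "AES128_HMAC_SHA1": 0x8,
    "AES256_HMAC_SHA1": 0x10,
}

def supported_encryption_types(checked):
    """Combine the checked encryption type names into a single bitmask."""
    value = 0
    for name in checked:
        value |= ENCRYPTION_TYPE_BITS[name]
    return value

def describe(value):
    """List the encryption type names whose bits are set in the bitmask."""
    return [name for name, bit in ENCRYPTION_TYPE_BITS.items() if value & bit]

rc4_only = supported_encryption_types(["RC4_HMAC_MD5"])
rc4_and_des = supported_encryption_types(["RC4_HMAC_MD5", "DES_CBC_CRC", "DES_CBC_MD5"])

print(rc4_only, rc4_and_des)  # 4 7
print(describe(rc4_and_des))  # ['DES_CBC_CRC', 'DES_CBC_MD5', 'RC4_HMAC_MD5']
```

    Decoding a value the same way is a quick check that neither AES bit (0x8 or 0x10) is set on a machine where you expect this workaround to be in effect.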

    Finally, if you are experiencing this issue please revisit this blog regularly for updates on the fix.

    - The Directory Services Team