March, 2009

  • Understanding DFSR Debug Logging (Part 1: Logging Levels, Log Format, GUID’s)

    Ned here again. Today begins a 21-part series on using the DFSR debug logs to further your understanding of Distributed File System Replication. While there are specific troubleshooting scenarios that will be covered, the most important part of understanding any product's logging is making sure you are comfortable with it before you have errors. That way you have some point of reference if things go wrong.

    As you can probably guess, these posts were a long time in development. They are based on an internal DFSR whitepaper I have worked on for six months, and which went through review by a number of excellent folks here in Support, Field Engineering, and the Product Group itself. Except for the removal of all private source code references, this series is otherwise unchanged.

    I'll start with a couple posts on the logs themselves, how they are formatted, how they can be controlled, etc. Then I'll dig into scenarios in detail, for both Windows Server 2003 R2 and Windows Server 2008. Don't feel like you have to read and memorize everything – this series is a reference guide as well.

    And yes, there will be a complete downloadable copy of this series in a few common file formats when the series is done.

    Logging levels

    DFSR writes circular log files to %systemroot%\debug and automatically compresses them in the GZ archive format. The verbosity of the debug logs is configurable, so you can control how much or how little data is written. It is also possible to control how many logs to maintain before overwriting the oldest ones, how many entries to store in each log, where the logs are stored, and whether or not logging should run. Under default log settings they should never occupy more than ~50MB of space on Windows Server 2003 R2 servers.

    The following controls the log settings and describes the defaults:

    SETTING: Debug Log Severity
    Default: 4
    Range: 1-5
    WMIC syntax:

    wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set debuglogseverity=5

    SETTING: Debug Log Messages
    Default: 200000
    Range: 1000 to 4294967295 (FFFFFFFF)
    WMIC syntax:

    wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set maxdebuglogmessages=500000

    SETTING: Debug Log Files
    Default: 100
    Range: 1 to 10000
    WMIC syntax:

    wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set maxdebuglogfiles=200

    SETTING: Debug Log File Path
    Default: %windir%\debug
    WMIC syntax:

    wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set debuglogfilepath="d:\dfsrlogs"

    NOTE: The path must be created manually; if not, at service restart, the default value %windir%\debug will be used.

    SETTING: Enable Debug Logging (NOTE: Debug logging is enabled by default)
    Default: TRUE
    Range: TRUE or FALSE
    WMIC syntax:

    wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set enabledebuglog=true

    The WMIC.EXE commands above actually modify the DfsrMachineConfig.XML file stored in %systemdrive%\system volume information\dfsr\config, populating the DfsrDebug tags. When running with defaults, these tags are not populated. In the example below, debug log severity is now at 5:

    <DfsrDebug>
    <EnableDebugLog>true</EnableDebugLog>
    <DebugLogFilePath>C:\WINDOWS\debug</DebugLogFilePath>
    <MaxDebugLogFiles>100</MaxDebugLogFiles>
    <DebugLogSeverity>5</DebugLogSeverity>
    <MaxDebugLogMessages>200000</MaxDebugLogMessages>
    </DfsrDebug>
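
    If you want to check these values from a script rather than by eye, the fragment can be read with any XML parser. Here is a minimal Python sketch. It assumes the DfsrDebug element has already been copied out of DfsrMachineConfig.XML into a string, and the function name read_dfsr_debug is made up for illustration:

```python
import xml.etree.ElementTree as ET

# Sample fragment copied from the example above. Assumption: in the real file
# this element sits inside the larger DfsrMachineConfig.XML document.
SAMPLE = """<DfsrDebug>
<EnableDebugLog>true</EnableDebugLog>
<DebugLogFilePath>C:\\WINDOWS\\debug</DebugLogFilePath>
<MaxDebugLogFiles>100</MaxDebugLogFiles>
<DebugLogSeverity>5</DebugLogSeverity>
<MaxDebugLogMessages>200000</MaxDebugLogMessages>
</DfsrDebug>"""


def read_dfsr_debug(xml_text):
    """Return the child elements of a DfsrDebug fragment as a dict of strings."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


settings = read_dfsr_debug(SAMPLE)
print(settings["DebugLogSeverity"])
```

    This only reads the values. Changing them is still best done through the WMIC commands shown earlier.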

    When setting Debug Log Severity you are influencing how verbose the logs are – i.e. what do we consider important enough to write. Here is a brief table:

    Level  Setting in DFSR  Flag Logged  Explanation
    0      LogLevelNone     N/A          Write nothing
    1      LogLevelAlways   N/A          Write log header information only
    2      LogLevelError    [ERROR]      Write error events and all others above
    3      LogLevelWarn     [WARN]       Write warning events and all others above
    4      LogLevelInfo     N/A          Write informational events and all others above
    5      LogLevelTrace    N/A          Write special tracing events and all others above

    Since the default is 4, DFSR will log everything that occurs except for tracing details. Tracing details are called out further in this guide, and are only necessary to activate under very specific troubleshooting scenarios.

    The debug log format

    The DFSR debug logs use a consistent, predictable format that consists of:

    Header – written at the top of each log file and contains (for example):

    * FRS Log Sequence:1 Index:1 Computer:2003MEM20 TimeZone:Eastern Standard Time (GMT-05:00) Build:[Feb 16 2007 20:14:20 built by: srv03_sp2_rtm] Enterprise=1
    * Configuration logLevel:4 maxEntryCount:200000 maxFileCount:100 logPath:\\.\C:\WINDOWS\debug\

    Field                     Description
    FRS Log Sequence & Index  Describe which logs these are relative to the circular wrapping
    Computer                  Describes the server where this log was written
    TimeZone                  Describes the local time zone of the server and its relation to GMT
    Build                     Describes what OS is being used and if it is Enterprise edition or higher
    Configuration loglevel    Describes the current log verbosity settings
    Maxentrycount             Describes the number of lines that can be written to the debug log before it starts a new one
    Maxfilecount              Describes the total number of circular logs maintained at any one time
    Logpath                   Describes the output folder of the logs

    Header lines always start with an asterisk (*). The header information is always written and cannot be turned off without disabling logging altogether.
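
    Because the Configuration header line is a run of name:value pairs, it is easy to pull apart in a script. The following Python sketch is one way to do it. It assumes the line looks exactly like the sample above, and parse_config_header is a made-up helper name:

```python
import re

# The Configuration header line from the sample above.
CONFIG_LINE = (
    "* Configuration logLevel:4 maxEntryCount:200000 "
    "maxFileCount:100 logPath:\\\\.\\C:\\WINDOWS\\debug\\"
)


def parse_config_header(line):
    """Split a '* Configuration ...' header line into a dict of strings."""
    if not line.startswith("* Configuration"):
        raise ValueError("not a Configuration header line")
    # Each setting is written as name:value with no spaces inside the value.
    return dict(re.findall(r"(\w+):(\S+)", line))


config = parse_config_header(CONFIG_LINE)
print(config["logLevel"], config["maxFileCount"])
```

    A log path containing spaces would break this simple pattern, so treat it as a starting point only.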

    Single-line messages – written throughout the logs and always map back to one discrete operation in DFSR. So for example:

    20080111 15:12:30.996 3876 JOIN 1171 Join::SubmitUpdate Sent: uid:{AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v33846 gvsn:{AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v33846 name:USRSTAT.EXE connId:{CC694D38-7E97-467C-A963-B3D9B6E308B9} csId:{1697E5EB-BBD0-45B7-AC2F-11EBE7B3FD47} csName:dfsrprestaged

    Field      Description                                         Example from above
    Date-Time  Stamps local time YYYYMMDD HH:MM:SS.mmm             20080111 15:12:30.996
    Thread     The thread executing within DFSR.EXE                3876
    Module ID  The sub-component of DFSR                           JOIN
    Line       The line in source code                             1171
    Class      The class being executed                            Join
    Method     The method (function) being executed by the class   SubmitUpdate
    Data       All the information being described by the logging  Sent: uid:{AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v33846 gvsn:{AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v33846 name:USRSTAT.EXE connId:{CC694D38-7E97-467C-A963-B3D9B6E308B9} csId:{1697E5EB-BBD0-45B7-AC2F-11EBE7B3FD47} csName:dfsrprestaged

    Single-line messages always start with a date-time stamp entry. The above sample line is wrapped for readability.
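
    The fixed field order makes single-line messages straightforward to split with a regular expression. Below is a rough Python sketch based only on the sample line above. The pattern and the parse_message name are my own assumptions, not anything DFSR ships, and entries without a Class:: prefix simply come back with an empty class:

```python
import re
from datetime import datetime

# Date-time, thread, module ID, source line, Class::Method, then free-form data.
LINE_RE = re.compile(
    r"^(?P<stamp>\d{8} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<thread>\d+)\s+(?P<module>\S+)\s+(?P<line>\d+)\s+"
    r"(?P<function>\S+)\s+(?P<data>.*)$"
)


def parse_message(text):
    """Return the fields of a single-line message, or None if it does not match."""
    match = LINE_RE.match(text)
    if match is None:
        return None
    class_name, _, method = match["function"].rpartition("::")
    return {
        "timestamp": datetime.strptime(match["stamp"], "%Y%m%d %H:%M:%S.%f"),
        "thread": int(match["thread"]),
        "module": match["module"],
        "line": int(match["line"]),
        "class": class_name,
        "method": method,
        "data": match["data"],
    }


SAMPLE = (
    "20080111 15:12:30.996 3876 JOIN 1171 Join::SubmitUpdate Sent: "
    "uid:{AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v33846 "
    "gvsn:{AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v33846 name:USRSTAT.EXE "
    "connId:{CC694D38-7E97-467C-A963-B3D9B6E308B9} "
    "csId:{1697E5EB-BBD0-45B7-AC2F-11EBE7B3FD47} csName:dfsrprestaged"
)
record = parse_message(SAMPLE)
print(record["module"], record["class"], record["method"])
```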

    Nested messages – written throughout the logs and always map back to one discrete operation in DFSR that generates a multi-line response for better readability. So for example:

    20080111 11:44:28.873 1640 INCO 4378 InConnection::UpdateProcessed Received Update. updatesLeft:237 processed:1171 sessionId:1 open:1 updateType:0 processStatus:0 connId:{D0BF5598-9457-4C32-8C50-7558BCD76610} csId:{1697E5EB-BBD0-45B7-AC2F-11EBE7B3FD47} csName:dfsrprestaged update:
    +    present 1
    +    nameConflict 0
    +    attributes 0x10
    +    gvsn {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v29102
    +    uid {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v29102
    +    parent {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v28024
    +    fence 16010101 00:00:00.000
    +    clock 20080110 19:05:43.167
    +    createTime 20080110 19:05:43.157 GMT
    +    csId {1697E5EB-BBD0-45B7-AC2F-11EBE7B3FD47}
    +    hash 37123A73-30C1AFF0-B4EE5252-46212327
    +    similarity 00000000-00000000-00000000-00000000
    +    name acctsid

    Nested messages follow single line messages that are ended with a colon. The nested messages always start with a plus sign (+). The nested lines can change depending on the class and method/function being executed so they are described in their own section below for 'File and Folder Field Information'.
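
    The plus-sign prefix makes the nested block easy to collect in a script as well. Here is a small Python sketch that turns the nested lines from the sample above into a dictionary. It assumes each nested line is a field name, whitespace, then the value, and parse_nested is a hypothetical helper name:

```python
# Nested lines copied from the sample above.
NESTED = [
    "+    present 1",
    "+    nameConflict 0",
    "+    attributes 0x10",
    "+    gvsn {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v29102",
    "+    uid {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v29102",
    "+    parent {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v28024",
    "+    fence 16010101 00:00:00.000",
    "+    clock 20080110 19:05:43.167",
    "+    createTime 20080110 19:05:43.157 GMT",
    "+    csId {1697E5EB-BBD0-45B7-AC2F-11EBE7B3FD47}",
    "+    hash 37123A73-30C1AFF0-B4EE5252-46212327",
    "+    similarity 00000000-00000000-00000000-00000000",
    "+    name acctsid",
]


def parse_nested(lines):
    """Collect consecutive '+' lines into a dict, stopping at the first other line."""
    fields = {}
    for line in lines:
        if not line.startswith("+"):
            break
        # Drop the plus sign, then split the field name from the rest of the line.
        name, _, value = line[1:].strip().partition(" ")
        fields[name] = value.strip()
    return fields


update = parse_nested(NESTED)
print(update["name"], update["attributes"])
```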

    The common GUID fields

    Globally Unique Identifiers (GUID's) are used throughout the DFSR system to map the friendly names of the topology to unique entries used by the DFSR service. This can make reading the DFSR debug logs very challenging, as not all GUID's in the environment are defined in the logs. When examining the DFSR debug logs it is important to understand how to map GUID's to real objects for troubleshooting purposes. Sample log entry:

    20080403 11:19:54.349 1660 SRTR 329 SERVER_EstablishConnection Succeeded on connId:{097BFFAA-99FB-4A4D-9590-C102985A55C6} replicaSetId:{D3558FFB-1E46-483F-AE89-E840E4A6A97B} partnerAddress:2003MEM21.contoso.com
    20080403 11:19:55.710 3360 JOIN 1171 Join::SubmitUpdate Sent: uid:{AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v137449 gvsn:{AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v137449 name:samplefile.txt connId:{097BFFAA-99FB-4A4D-9590-C102985A55C6} csId:{B269F903-539D-42F2-9D33-935590097578} csName:ihaterobocopy
    20080403 11:19:55.891 572 OUTC 588 OutConnection::OpenFile Received request for update:
    +    present 1
    +    nameConflict 0
    +    attributes 0x20
    +    gvsn {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v137449
    +    uid {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v137449
    +    parent {B269F903-539D-42F2-9D33-935590097578}-v1
    +    fence 16010101 00:00:00.000
    +    clock 20080403 15:17:17.233
    +    createTime 20080403 15:17:17.193 GMT
    +    csId {B269F903-539D-42F2-9D33-935590097578}
    +    hash 00000000-00000000-00000000-00000000
    +    similarity 00000000-00000000-00000000-00000000
    +    name samplefile.txt

    Field         Description              Example from above
    ReplicaSetId  Replication Group GUID   {D3558FFB-1E46-483F-AE89-E840E4A6A97B}
    CSID          Replicated Folder GUID   {B269F903-539D-42F2-9D33-935590097578}
    ConnID        Connection GUID          {097BFFAA-99FB-4A4D-9590-C102985A55C6}
    Parent        Folder holding the file  {B269F903-539D-42F2-9D33-935590097578}-v1
    UID           Original file record     {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v137449
    GVSN          Modified file record     {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v137449

    There are a few tools that can be used to map the GUID's:

    DFSRADMIN.EXE and DFSRDIAG.EXE - You can use the DFSRADMIN and DFSRDIAG tools included with DFSR to enumerate the topology and determine the GUID's. Below is a sample of doing this through a CMD prompt with the data provided by the above logs:

    Dfsradmin.exe rg list /attr:rgname,rgguid

    RgName RgGuid
    SteveLovesFRS d3558ffb-1e46-483f-ae89-e840e4a6a97b

    The above command is used to return the Replication Group name, which you will see below is necessary to complete a number of further lookups. This maps to REPLICASETID.

    Dfsradmin.exe rf list /rgname:SteveLovesFRS /attr:rfname,rfguid

    RfName RfGuid
    ihaterobocopy b269f903-539d-42f2-9d33-935590097578

    The above command is used to get the GUID of the Replicated Folder so that CSID is known.

    Dfsradmin.exe conn list /rgname:SteveLovesFRS /attr:sendmem,recvmem,connguid

    SendMem RecvMem ConnGuid
    2003MEM20 2003MEM21 097bffaa-99fb-4a4d-9590-c102985a55c6
    2003MEM21 2003MEM20 d2e396a5-837b-4103-b8a2-b8fc2c71d388

    The above command is used to return the Connection GUID's that can be mapped to CONNID.

    Dfsrdiag.exe guid2name /guid:AC759213-00AF-4578-9C6E-EA0764FDC9AC /rgname:stevelovesfrs

    Object Type : DfsrVolumeInfo
    Computer : 2003MEM20.contoso.com
    Volume Guid : B8B42506-BF98-11DC-B176-0003FF3813C5
    Volume Path : E:
    Volume SN : 108172604
    DB Guid : AC759213-00AF-4578-9C6E-EA0764FDC9AC

    Finally, the above command is used to retrieve the GUID of the actual DFSR database and therefore the server it is running on. When files and folders are created or modified, the originating server is used to form the GUID portion of the name, and then the current version vector from that server is appended to complete the unique file mapping in the database. These are used for UID and GVSN.

    So having retrieved all the GUID's, we can now see that our debug log entry actually means:

    20080403 11:19:54.349 1660 SRTR 329 SERVER_EstablishConnection Succeeded on connId:2003MEM20 replicaSetId:SteveLovesFRS partnerAddress:2003MEM21.contoso.com
    20080403 11:19:55.710 3360 JOIN 1171 Join::SubmitUpdate Sent: uid:2003MEM20-v137449 gvsn:2003MEM20-v137449 name:samplefile.txt connId:2003MEM20 csId:ihaterobocopy csName:ihaterobocopy
    20080403 11:19:55.891 572 OUTC 588 OutConnection::OpenFile Received request for update:
    +    present 1
    +    nameConflict 0
    +    attributes 0x20
    +    gvsn 2003MEM20-v137449
    +    uid 2003MEM20-v137449
    +    parent ihaterobocopy-v1
    +    fence 16010101 00:00:00.000
    +    clock 20080403 15:17:17.233
    +    createTime 20080403 15:17:17.193 GMT
    +    csId ihaterobocopy
    +    hash 00000000-00000000-00000000-00000000
    +    similarity 00000000-00000000-00000000-00000000
    +    name samplefile.txt
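
    Doing this substitution by hand gets old quickly. Once you have collected the GUIDs with DFSRADMIN and DFSRDIAG, a few lines of script can apply them to a whole log. The Python sketch below uses the GUIDs gathered above. The lookup table is typed in by hand, the connection label is my own choice of wording, and friendly is a made-up function name:

```python
import re

# GUID-to-name table built by hand from the DFSRADMIN and DFSRDIAG output above.
GUID_NAMES = {
    "D3558FFB-1E46-483F-AE89-E840E4A6A97B": "SteveLovesFRS",         # replication group
    "B269F903-539D-42F2-9D33-935590097578": "ihaterobocopy",         # replicated folder
    "097BFFAA-99FB-4A4D-9590-C102985A55C6": "2003MEM20->2003MEM21",  # connection
    "AC759213-00AF-4578-9C6E-EA0764FDC9AC": "2003MEM20",             # database
}

GUID_RE = re.compile(r"\{([0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})\}")


def friendly(line, names=GUID_NAMES):
    """Replace every known {GUID} in a log line with its friendly name."""
    def swap(match):
        # Unknown GUIDs are left exactly as they were logged.
        return names.get(match.group(1).upper(), match.group(0))
    return GUID_RE.sub(swap, line)


print(friendly("+    gvsn {AC759213-00AF-4578-9C6E-EA0764FDC9AC}-v137449"))
```

    GUIDs that are not in the table pass through untouched, so you can run it over a log and see at a glance which ones you still need to look up.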

    Next up – nested debug log fields and module ID's.

    Understanding DFSR debug logging (Part 1: Logging Levels, Log Format, GUID’s)
    Understanding DFSR debug logging (Part 2: Nested Fields, Module ID's)
    Understanding DFSR debug logging (Part 3: The Log Scenario Format, File Added to Replicated Folder on Windows Server 2008)
    Understanding DFSR debug logging (Part 4: A Very Small File Added to Replicated Folder on Windows Server 2008)
    Understanding DFSR debug logging (Part 5: File Modified on Windows Server 2003 R2)
    Understanding DFSR debug logging (Part 6: Microsoft Office Word 97-2003 File Modified on Windows Server 2008)
    Understanding DFSR debug logging (Part 7: Microsoft Office Word 2007 File Modified on Windows Server 2008)
    Understanding DFSR debug logging (Part 8: File Deleted from Windows Server 2003 R2)
    Understanding DFSR debug logging (Part 9: File is Renamed on Windows Server 2003 R2)
    Understanding DFSR debug logging (Part 10: File Conflicted between two Windows Server 2008)
    Understanding DFSR debug logging (Part 11: Directory created on Windows Server 2003 R2)
    Understanding DFSR debug logging (Part 12: Domain Controller Bind and Config Polling on Windows Server 2008)
    Understanding DFSR debug logging (Part 13: A New Replication Group and Replicated Folder between two Windows Server 2008 members)
    Understanding DFSR debug logging (Part 14: A sharing violation due to a file locked upstream between two Windows Server 2008)
    Understanding DFSR debug logging (Part 15: Pre-Seeded Data Usage during Initial Sync)
    Understanding DFSR debug logging (Part 16: File modification with RDC in very granular detail (uses debug severity 5))
    Understanding DFSR debug logging (Part 17: Replication failing because of blocked RPC ports (uses debug severity 5))
    Understanding DFSR debug logging (Part 18: LDAP queries failing due to network (uses debug severity 5))
    Understanding DFSR debug logging (Part 19: File Blocked Inbound by a File Screen Filter Driver (uses debug severity 5))
    Understanding DFSR debug logging (Part 20: Skipped temporary and filtered files (uses debug severity 5))
    Understanding DFSR debug logging (Part 21: File replication performance from throttling (uses debug severity 5))

    - Ned 'Cure for Insomnia' Pyle

  • Userenv 1054 events as a result of time-stamp counter drift on Windows Server 2003 guests running in Hyper-V

    Hello everyone, Brian here.

    I was working with a colleague on an issue in which the customer was receiving the following Userenv event in their application event logs:

    (Event ID 1054)
    Windows cannot obtain the domain controller name for your computer network. (An unexpected network error occurred.). Group Policy processing aborted.

    The customer explained to us that if he removes one of the Hyper-V virtual processors from his Windows Server 2003 Guest, the issue goes away. Based on this statement we asked the customer to gather a userenv log while forcing a group policy refresh with the additional virtual processor enabled, and this is what we saw during the initial ping test before we process group policy:

    USERENV(15c.858) 15:55:09:080 PingComputer: First time: 2069
    USERENV(15c.858) 15:55:09:080 PingComputer: Second time: 2069
    USERENV(15c.858) 15:55:09:080 PingComputer: First and second times match.
    USERENV(15c.858) 15:55:09:080 PingComputer: First time: 2069
    USERENV(15c.858) 15:55:09:080 PingComputer: Second time: -2069
    USERENV(15c.858) 15:55:09:080 PingComputer: First time: -2069
    USERENV(15c.858) 15:55:09:080 PingComputer: Second time: 0

    We have a knowledgebase article that pertains to this issue on servers that use dual-core or multiprocessor AMD Opteron processors:

    938448 A Windows Server 2003-based server may experience time-stamp counter drift if the server uses dual-core AMD Opteron processors or multiprocessor AMD Opteron processors

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;938448

    Now in the case of our customer they were not running an AMD Processor server so they felt this resolution did not apply to them. Even though the article did not apply to the type of processor in their servers, the behavior was identical so we applied the resolution outlined in the knowledgebase article and this resolved the customer’s issue. I am in the process of having a knowledgebase article created to specifically address this issue with Windows Server 2003 virtual machines running in Hyper-V.

    So we did a little digging and found the following blog post from Tony Voellm, who is a Principal Software Test Engineer in the Windows Kernel development team:

    Negative ping times in Windows VM's - whats up?

    http://blogs.msdn.com/tvoellm/archive/2008/06/05/negative-ping-times-in-windows-vm-s-whats-up.aspx

    The following is from the above blog post:

    If you see negative ping times in multiprocessor W2k3 guest OSes you might consider setting the /usepmtimer in the boot.ini file.

    The root issue comes about from the Win32 QueryPerformanceCounter function.  By default it uses a time source called the TSC.  This is a CPU time source that essentially counts CPU cycles.  The TSC for each (virtual) processor can be different so there is no guarantee that reading TSC on one processor has anything to do with reading TSC on another processor.  This means back to back reads of TSC on different VP's can actually go backwards. Hyper-V guarantees that TSC will not go backwards on a single VP.

    So here the problem with negative ping times is the time source is using QueryPerformanceCounter which is using TSC.  By using the /usepmtimer boot.ini flag you change the time source for QueryPerformanceCounter from TSC to the PM timer which is a global time source.

    I wanted to bring these together as one may read Tony’s blog post and not understand how this would relate to what they see with regards to group policy application and the errors that may occur as a result of this behavior with Hyper-V.

    -Brian ‘Fast and Furious’ Singleton

  • Reminder: End of Life for Service Pack 1 in Windows Server 2003 Coming Soon

    Ned here again with a quick reminder:

    Support for computers running Windows Server 2003 Service Pack 1 ends on April 14th, 2009. Yes, that's just under a month from now. After that point there will be no hotfixes, security updates, or support for computers that do not have SP2 installed. If you don't have SP2 on your deployment radar, you are rapidly running out of time.

    http://support.microsoft.com/lifecycle/?p1=3198

    Support for Windows Server 2003 with SP2 will continue until July 14th, 2015, so no rush there. Although you're missing out on all the Win2008 goodness.

    - Ned 'Patch That Sucker!' Pyle

  • How do I find out what changes are going on in my Active Directory?

    Herbert here. Here are some common questions asked by AD Administrators:

    - Why has my AD database size increased by 500MB in the last three weeks?
    - I see lots of AD replication in Domain Controller monitoring. What are all these changes?

    Both symptoms can be severe enough to impair the operations of your AD forest. Here are examples of past occurrences that we tracked down:

    312403  Distributed Link Tracking on Windows-based domain controllers
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;312403

    318774  Removing duplicate and unwanted proxy addresses in Exchange
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;318774

    940262  The Active Directory database size increases unexpectedly because a Windows Server 2003-based DNS server inappropriately creates several SerialNo objects
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;940262

    In order to find the cause for the problems, you should find what has changed in the AD database recently. Active Directory assigns an "Update Sequence Number" (USN) to each change. These USNs are 64-bit integers and are specific to a Domain Controller. The DC GUID and USN together uniquely identify a database change. A USN is assigned to both originating changes and replicated changes, so even for read-only GC content you see local USNs being written.

    You can use these USNs to identify recent changes in the database of each DC. Each AD Server (includes AD/AM and LDS) has an attribute named “highestCommittedUSN” on its RootDSE object. Here’s an example output from LDP:

    ...
    12> supportedLDAPPolicies: MaxPoolThreads; MaxDatagramRecv; MaxReceiveBuffer; InitRecvTimeout; MaxConnections; MaxConnIdleTime; MaxPageSize; MaxQueryDuration; MaxTempTableSize; MaxResultSetSize; MaxNotificationPerConn; MaxValRange;
    1> highestCommittedUSN: 175389104;
    4> supportedSASLMechanisms: GSSAPI; GSS-SPNEGO; EXTERNAL; DIGEST-MD5;
    ...

    Based on this number, you can query for the most recently changed Objects using an LDAP query. As an example, I’m using LDIFDE and I’m subtracting 10000 from the “highestCommittedUSN” value seen on RootDSE:

    Ldifde /d dc=contoso,dc=com /s contoso-DC1 /r "(usnchanged>=175379104)" /f domain-NC-last-10000-080919.txt
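
    The filter value is just the RootDSE number minus the size of the window you want to look at. If you repeat this often, a tiny helper saves the mental arithmetic. This Python sketch only builds the filter string for the /r switch, and usn_filter is a made-up name:

```python
def usn_filter(highest_committed_usn, window=10000):
    """Build an LDAP filter matching the last `window` USNs on one DC."""
    start = max(highest_committed_usn - window, 0)
    return f"(usnchanged>={start})"


# highestCommittedUSN value taken from the LDP output above.
print(usn_filter(175389104))  # (usnchanged>=175379104)
```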

    This file now contains the names of the objects that were changed or created recently. The object names give you a hint as to what area of AD you need to look at, but it may not be enough of a clue yet. For users, you can use attributes that tell you about recent changes:

    pwdLastSet
    lastLogonTimestamp

    e.g. pwdLastSet: 129333360374989750

    These attributes use a 64-bit time format: the number of 100-nanosecond intervals since January 1, 1601 (UTC). You can convert them to readable timestamps using:

    C:\> w32tm /ntte 129333360374989750
    149691 09:20:37.4989750 - 04.11.2010 10:20:37
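
    If you need to convert many values, for example while post-processing an LDIF export, the same conversion is easy to script. Here is a minimal Python sketch. It returns UTC, so it corresponds to the 09:20:37 part of the w32tm output rather than the local time at the end, and filetime_to_utc is a made-up name:

```python
from datetime import datetime, timedelta, timezone

# The 64-bit values count 100-nanosecond intervals from this point in time.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(value):
    """Convert a 64-bit AD time stamp (such as pwdLastSet) to a UTC datetime."""
    # datetime resolves microseconds, so the final 100 ns digit is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


print(filetime_to_utc(129333360374989750).isoformat())
```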

    Knowing how the time stamp is built, you can then grep the LDIF export for a value prefix, e.g.:

    C:\>w32tm /ntte 129333000000000000
    149690 23:20:00.0000000 - 04.11.2010 00:20:00

    Command would be something like:

    findstr /i /c:"pwdlastset: 129333" domain-NC-last-10000-080919.txt | wc

    This would print the list of users and computers who changed their password in the last ten hours. In most cases, these should all be considered false positives.

    If they are not all new objects (very recent whenCreated attribute), you may want to look at what attributes have been changed. Also, you want to know from which DC the object change is originating.

    Maybe the DC that writes all the changes is the primary DC your provisioning system is working against, or it’s a DC you don’t expect to see. To get this information, retrieve the object meta-data using:

    repadmin /showobjmeta <DC name> <Object-DN>

    The output looks like this:

    Loc.USN    Originating DC   Org.USN    Org.Time/Date        Ver  Attribute
    =======    ===============  =========  =============
    ...
    175389437  HQ\contoso-DC1   175389437  2008-09-16 18:12:46  2    name
    ...

    The leftmost column is the local USN; the more interesting fields are to the right, where you see the originating DC information and change time-stamp, attribute version and name. If the version is really high, it could mean excessive updates to this attribute, which deserves more investigation.

    You should also look out for changes seen for linked attributes (Windows Server 2003 Forest Mode and higher):

    Type     Attribute  Last Mod Time        Originating DC  Loc.USN    Org.USN    Ver  Distinguished Name
    ============================================================================
    ABSENT   member     2008-09-19 15:14:01  HQ\contoso-DC1  175384020  175384020  2    CN=test-user1,OU=Test-OU,DC=contoso,DC=com
    PRESENT  member     2008-09-16 18:22:29  HQ\contoso-DC1  175379684  175379684  1    CN=test-user2,OU=Test-OU,DC=contoso,DC=com

    Note: High values for USNs will distort the table view.

    Many “ABSENT” and high version numbers indicate high activity with linked values. “ABSENT” indicates a deleted link, so you can think of it as a value tombstone. It’s treated just like an object tombstone in the database. During replication it means that the value is deleted from the attribute, in this case a group membership.

    Attributes that can contain lots of data deserve special attention. This often applies to attributes containing binary values, including the security descriptor for AD or Exchange, or attributes containing certificates. Note that by default, LDIFDE does not dump “ntSecurityDescriptor”. If any of these attributes show high version numbers or a recent update time stamp on many objects, you should investigate further. How you investigate the changes depends on the attribute; for example, for “ntSecurityDescriptor” you can dump it using DSACLS and check for any excess Access Control Entries.

    Excessive changes to “ntSecurityDescriptor” are not so much a problem regarding database size because there is single instance storage for these since Windows Server 2003. But they can take lots of replication bandwidth.

    The information on objects, attributes and originating DC you collected so far should give you good hints regarding the originator of the changes. If it’s not clear yet, you can enable auditing on successful changes to these attributes to find out the process that is making these changes. It may be necessary to make the attribute viewable in ACL Editor so you can define auditing for it. See the guide in:

    296490  How to modify the filtered properties of an object
    http://support.microsoft.com/default.aspx?scid=kb;EN-US;296490

    But what if there is no pattern evolving while you get the data?

    One approach is to repeat the LDIFDE export and reduce the window until you see a pattern. Maybe the problematic changes only happen at certain times of the day, so it also matters when you create the export. Or the changes happen in a branch office that only replicates at a certain time of day.

    But there are also more naming contexts that may have excessive changes, such as Configuration or the DNS partitions ForestDnsZones and DomainDnsZones, and on GCs. Hopefully, the admins of the other domains are already aware of the excessive changes. This is how you search the whole of the GC data:

    Ldifde /d "" /s contoso-DC1 /t 3268 /r "(usnchanged>=175379104)" /f GC-last-10000-080919.txt

    Hint: Keep in mind that this query only shows changes for attributes that are present in the GC.

    And finally, the problem may not be with existing objects that are changed, but with objects that are deleted and re-created all the time. Deleted objects still take database space for the tombstone, and the new objects cause replication traffic. LDIFDE can include deleted objects in the query when you pass the “/x” option:

    Ldifde /d dc=contoso,dc=com /s contoso-DC1 /x /r "(usnchanged>=175379104)" /f domain-NC-last-10000-deleted-080919.txt

    If the combined size of the tombstones is a problem, you have to wait until the garbage collection is done before you can reduce the size of the database file using an offline defragmentation. We advise against shortening Tombstone Lifetime for the sole purpose of kicking out these objects earlier. When you have strict replication enabled and replication quarantine is enforced, shortening TSL to a few days can have a drastic impact on the availability of your Active Directory.

    I hope you’re having fun investigating all your ongoing AD changes. I think you’re in for a few interesting findings.

    - Herbert Mauerer

  • New Directory Services KB Articles 3/7-3/14

    New KB articles related to Directory Services for the week of 3/7-3/14.

    959517  Windows Server 2008 Key Distribution Center (KDC) rejects a TGS request after the TGT is renewed

    968596  Files and folders that are discarded during conflict resolution are intermittently not moved to the "Conflict and Deleted" folder on a server that is running Windows Server 2003 R2

    968867  You cannot configure the Negotiate or NTLM protocols for Windows Integrated Authentication in the IIS Manager for Internet Information Services (IIS) 7.0

    969006  The Home Folder may be mapping incorrectly when logging on to a Windows XP-based computer

    968920  Windows Vista and Windows Server 2008 DNS clients do not honor DNS round robin by default

    968791  Vista Sync Center may corrupt offline folder cache

    967729  Only part of the network bandwidth is used when you transfer multiple large files at the same time through a high-bandwidth network connection on a Windows Vista-based computer or on a Windows Server 2008-based computer

  • How to Properly Disable Offline Files in Windows Vista

    Hi, it's Adam Conkle again from the Directory Services team. Today's posting covers how to correctly disable Offline Files in Windows Vista. I recently had a case where the customer was experiencing undesirable behavior with a file share only when the file server was accessed from their Windows Vista machines.

    The symptoms were:

    1. Files saved to the file share from the Vista machines will not update the file on the file server, yet the changes appear when accessed on the Vista machine.

    2. Files and folders that have been deleted from the file server continue to appear when the file share is accessed from the Vista machines. The files and folders cannot be deleted from the Vista machines because the files do not exist.

    The customer advised that they had chosen to disable Offline Files on all of their Vista machines. I looked at Control Panel and Offline Files was set to Disabled. I then looked at Services.msc and the Offline Files service was disabled. Looks good, right?

    Next, I decided it would be a good idea to get a Process Monitor capture while we reproduce the above symptoms. Analysis of the capture showed me that Explorer.exe was not accessing the UNC path as expected, but rather was accessing C:\Windows\CSC\<namespace_path_for_share>. That's odd. You're probably thinking: "Why is Explorer.exe trying to access the Client Side Caching database when Offline Files has been disabled?"

    That is a great question, and here is the answer:

    Client Side Caching has a kernel driver that is loaded at startup when Offline Files is enabled. You can check the status of the driver by opening System Information (Start > Run > msinfo32 > System Summary > Software Environment > System Drivers). In the name column you will find "csc" with a description of "Offline Files Driver". When Offline Files is truly disabled, the Start Mode column should be set to "Disabled"; otherwise it will be set to "System" which means it is loaded at system startup.

    image

    In my case, the customer's driver was set to "System" even though the Offline Files GUI and the Offline Files service were both disabled. Offline Files on her Vista machines was in a halfway enabled state. No synchronization with the file server was taking place, but Explorer.exe was still directed to look at the CSC database because the kernel driver was loaded.

    When the customer disabled Offline Files on the Vista machines, she manually disabled the Offline Files service on each machine, and she never made any change to the Offline Files GUI in Control Panel. I found that when you only disable the service, the Offline Files GUI does get updated and shows "Disabled", but it leaves the kernel driver startup mode set to "System". If you disable Offline Files via the Control Panel GUI, the service is disabled and the kernel driver is disabled as well. When Offline Files is disabled via the Control Panel GUI, you will be prompted to reboot the machine for this to take effect (so that the kernel driver can be unloaded).
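
    If you have a lot of machines to check, the comparison described above can be scripted. What follows is a minimal Python sketch under stated assumptions, not a supported tool: it translates the numeric Start value that Windows stores for each driver and service into the Start Mode names shown in System Information, and flags the halfway enabled state. The registry key names CSC (driver) and CscService (service) are assumptions, and every function name is made up for illustration; verify both on a test machine first.

```python
# Numeric Start values for drivers and services, with the names WMI reports.
START_MODES = {0: "Boot", 1: "System", 2: "Auto", 3: "Manual", 4: "Disabled"}


def start_mode(start_value):
    """Translate a numeric Start value into its Start Mode name."""
    return START_MODES.get(start_value, "Unknown")


def is_half_enabled(driver_start, service_start):
    """True when the service is disabled but the csc driver would still load."""
    service_off = start_mode(service_start) == "Disabled"
    driver_off = start_mode(driver_start) == "Disabled"
    return service_off and not driver_off


def read_start_value(service_name):
    """Read the Start value of a driver or service from the registry (Windows only)."""
    import winreg  # imported lazily so the helpers above work on any platform

    key_path = "SYSTEM\\CurrentControlSet\\Services\\" + service_name
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value, _value_type = winreg.QueryValueEx(key, "Start")
    return value


# Hypothetical usage on a Vista machine (key names are assumptions):
# is_half_enabled(read_start_value("CSC"), read_start_value("CscService"))
```

    A driver value of 1 ("System") combined with a service value of 4 ("Disabled") matches the state the customer's machines were in.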

    The supported way to disable Offline Files is to use the Offline Files GUI in Control Panel, Group Policy, or WMIC.

    To enable/disable Offline Files via Group Policy:

    This is a Computer Configuration setting under Administrative Templates\Network\Offline Files named "Allow or Disallow use of the Offline Files feature". Once the GPO applies, a reboot is required for the setting to take effect.

    image

    To enable/disable Offline Files via WMIC:

    We use the Enable method of Win32_OfflineFilesCache to enable or disable Offline Files. Simply pass the Enable method a Boolean value: true to enable, false to disable.

    Example command:

    image

    As noted in the output, a reboot is required for the new setting to take effect.
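
    For scripting the same call, here is a minimal Python sketch. It assumes the usual WMIC form of `wmic path <class> call <method> <arguments>`, which may not match the screenshot character for character, and the function names are hypothetical. Run it from an elevated prompt on a test machine before trusting it.

```python
import subprocess


def offline_files_command(enable):
    """Build the WMIC argument list that passes a Boolean to the Enable method."""
    flag = "true" if enable else "false"
    return ["wmic", "path", "Win32_OfflineFilesCache", "call", "Enable", flag]


def set_offline_files(enable):
    """Run the command on the local machine (Windows only) and return WMIC's output."""
    result = subprocess.run(
        offline_files_command(enable), capture_output=True, text=True
    )
    return result.stdout


# Hypothetical usage: print(set_offline_files(False)), then reboot.
```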

    If you have gotten your Vista machines into the state described above, enable Offline Files via one of the supported methods described above, reboot, disable Offline Files using a supported method, and then reboot once again.

    Thanks,

    Adam ‘Axe Body Wash’ Conkle

  • DS Restore Mode Password Maintenance

    Ned here again. There comes a day in nearly every administrator’s life where they will need to boot a domain controller into DS Restore Mode. Whether it’s to perform an authoritative restore or fix database issues, you will need the local administrator password. Too often, we work with customers that have not been maintaining this password and do not have a way to get in to their DC’s. Sometimes it’s because the password was set by another admin who is not available or no longer works at the company. Other times it’s because the password was set at DC promotion years ago and no one remembers it anymore. Since the password can only be set when the DC is running in normal mode, you get a nasty chicken and the egg situation if the DC cannot start Directory Services correctly. Today we’ll talk about some methods to get this password under control.

    The old fashioned way

    All domain controllers have a hard-coded local Administrator account stored in their SAM database. This account and local database are not used or generally available when the DC is running normally. You probably remember setting this password during DCPROMO:

    image

    In addition, the password can be reset later while the DC is running by using the NTDSUTIL tool:

    image

    If you know the password, you can restart the domain controller into DS Restore mode using F8 or via the boot options exposed in MSCONFIG and away you go.

    But what are the options to maintain this password, especially when your environment is in the hundreds or thousands of DC’s? Let’s dig in.

    NTDSUTIL Password Pull

    Beginning with hotfix KB961320 on Windows Server 2008, you now have the option to synchronize the DSRM password on a DC with a specific domain account. You must do this every time the password is changed; it does not create an automatic sync partnership.

    1. Create a standard domain user account and set it with a complex password. It does not need to be a member of any special groups or the Domain Admins group.

    image

    2. Install the hotfix on your DC and restart.

    3. Logon to the DC normally.

    4. In an elevated CMD prompt where you have logged on as a Domain Admin, run:

    NTDSUTIL
    SET DSRM PASSWORD
    SYNC FROM DOMAIN ACCOUNT <your user here>
    Q
    Q

    So for example (using NTDSUTIL’s ability to pass in all parameters on a single command-line):

    image

    Note how there is no need to provide the actual password being used, or provide the old password. This feature will also be included in Service Pack 2 for Win2008.
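
    If you would rather drive this from a script than type it, the single command-line form can be assembled as below. This is a minimal Python sketch, assuming ntdsutil.exe is on the path and the script runs elevated as a Domain Admin on the DC. The function names are hypothetical, and rejecting account names that contain spaces or quotes is a simplification of my own, not an NTDSUTIL rule.

```python
import subprocess


def dsrm_sync_command(account):
    """Build the one-line NTDSUTIL invocation that syncs the DSRM password."""
    if not account or any(ch in account for ch in '" '):
        raise ValueError("account must be a non-empty name without spaces or quotes")
    return [
        "ntdsutil",
        "set dsrm password",
        "sync from domain account " + account,
        "q",
        "q",
    ]


def sync_dsrm_password(account):
    """Run the sync on the local DC and return whatever NTDSUTIL printed."""
    result = subprocess.run(dsrm_sync_command(account), capture_output=True, text=True)
    return result.stdout


# Hypothetical usage on a DC: print(sync_dsrm_password("DsrmUser"))
```

    Passing the arguments as a list lets Python add the quotation marks around the multi-word NTDSUTIL commands for you.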

    Group Policy Preference Automation of NTDSUTIL

    So what if we want to automate this NTDSUTIL command so that it is run via a scheduled task? This is easily done using Group Policy Preferences.

    Note: Before you get too excited that I’ve missed something – no, GPP Local User password does not work with the DSRM passwords on domain controllers. You cannot use it to push a new password to the local administrator on DC’s; it works only on member computers. Trust me, I’ve tried.

    The beauty of this solution is that there is no password stored anywhere except in Active Directory itself and the system effectively self maintains – the only administrator intervention needed is to periodically change the special user’s password, and to make sure the scheduled task is working on the DC’s. The same way you should be checking to make sure those backups will actually function for a restore if you ever need to use this password.

    So let’s set this up:

    1. Start GPMC on a Windows Server 2008 or Windows Vista computer running RSAT.

    2. Create and link a new policy on the Domain Controllers OU (you are doing all this in a test domain first, right?).

    image

    3. Create the GPP Scheduled task settings.

    image

    Note here that I have set:

    A) Action of ‘Update’ (this will create the task if it does not exist).

    B) A Run command that uses the built-in GPP variable %SystemDir% to specify the System32 directory, followed by ntdsutil.exe.

    C) The command line exactly as it would be done by hand with ntdsutil, including the quotation marks:

    "SET DSRM PASSWORD" "SYNC FROM DOMAIN ACCOUNT DsrmUser" Q Q

    D) The task is Enabled with a checkbox so that it will run, not just be created.

    image

    E) Then I have set this to run as a daily task at 9AM (it’s fairly likely that the DC will be running at that point). I could also run this hourly, weekly, etc – whatever I want.

    4. After having created the policy and letting it apply to DC’s, I now see it is working by examining the scheduled tasks on one of my domain controllers. There it is (as well as another one I added to run every night too – can’t be too careful):

    image

    5. Once the right time has come and gone, I boot a DC into DS Restore Mode and check – sure enough, my new password has taken effect automagically.

    And there you go.

    - Ned ‘That’s the same combination I have on my luggage!’ Pyle

    Reviewer’s note: We made Ned change his smart card PIN to something other than ‘12345’ after he let it slip in this blog post.

  • New Directory Services KB Articles 3/1-3/7

    New KB articles related to Directory Services for the week of 3/1-3/7.

    965493

    You may encounter the several smart card failure problems after you raise the domain functional level to Windows Server 2008 R2 domain function level in a network environment

    968703

    Microsoft Support Policy for NIC Teaming with Hyper-V

    967326

    Data loss occurs after you use the Dfsrmig.exe tool to migrate the SYSVOL share from the FRS to the DFSR service in a Windows Server 2008-based domain

    967592

    Windows Server 2008 Terminal Services licensing service leaks memory

    966329

    Windows Server 2008 Certificate Services (ADCS) does not start, and error code 0x80070057 is generated when ADCS is reinstalled by using the "use existing keys" option in Windows Server 2008

    967507

    You experience issues with DFS replication in a Windows Server 2003 R2-based AD environment

    968706

    Importing a signing certificate fails with an error

    963046

    When you use the Encrypting File System (EFS) to encrypt files, some files are not fully encrypted in Windows Vista or in Windows Server 2008

    967666

    After you sync a file in Windows Vista or in Windows Server 2008, the Date Modified time for the destination file may be updated to the current time

    961529

    The first offline file sync operation fails when you pin an entry and then work offline on a client computer that is running Windows Vista or Windows Server 2008

    963051

    A roaming profile user experiences a long delay during the logon or logoff process of a Windows Vista-based or a Windows Server 2008-based client computer in a WAN environment

    961120

    When you enable the "Encrypt the Offline Files cache" policy setting, multiple EFS certificates may be generated when you log on to the domain from multiple Windows Vista-based or Windows Server 2008-based client computers

    968264

    Error message when you try to map to a network drive of a DFS share by using a different user name than the one that you used to log on to Windows Vista: "A specified logon session does not exist. It may already have been terminated."