A Premier Field Engineer in Denmark

  • Getting started with Storage Replica in Windows Server Technical Preview

    Storage Replica (SR) is a new feature that enables storage-agnostic, block-level, synchronous replication between servers for disaster recovery, as well as stretching of a failover cluster for high availability. Synchronous replication enables mirroring of data in physical sites with crash-consistent volumes ensuring zero data loss at the file system level. Asynchronous replication allows site extension beyond metropolitan ranges with the possibility of data loss.

    Ned Pyle, the Product Manager for Storage Replica, has written a great “getting started” guide here:

    http://social.technet.microsoft.com/Forums/windowsserver/en-US/f843291f-6dd8-4a78-be17-ef92262c158d/getting-started-with-windows-volume-replication?forum=WinServerPreview

    I got mine going after adding the Windows Storage Replication feature in Server Manager:

    image

    It’s configured in Failover Clustering:

    image

    I’m working with a customer who is really excited that in-box volume replication has come to Windows Server. It’s going to be interesting to discover best practices and ideal use cases for Storage Replica as we get closer to the final release.
  • Migrating DFS Namespace from Windows 2000 Server mode to Windows Server 2008 mode

    Hi,
     
    I recently helped a customer with this tricky little exercise.
    The idea was to do the upgrade during office hours with as little downtime as possible and to run it remotely from one server.
     
    It follows the basic guide here: http://technet.microsoft.com/en-us/library/cc753875.aspx
     
    But this formal guide isn't very "real world": it forgets that there are clients out there with cached referrals that take a long time to realise the namespace links have changed.
     
    In my example, I use these names which you’ll need to change:

    • The existing 2003 DFS server is FILE1. All the commands below are run on this server in C:\temp.

    • The first new 2012 R2 DFS server is FILE2.

    • The domain is child.corp.contoso.com.

    • The DFS Namespace in the domain is called "Testing", so the path is \\child.corp.contoso.com\Testing.

    • There are 3 target servers that all DFS links point to: TARGET1, TARGET2, TARGET3.

     
    First, set the existing 2003 DFS servers to issue FQDNs in their DFS referrals. This was a customer requirement: they wanted other forests to access this namespace, and wanted referrals to be as efficient as possible.
    Note this requires restarting DFS-N:
     

    dfsutil server registry DfsDnsConfig set \\FILE1
    sc \\FILE1 stop DFS
    REM the ping below is a crude ~5 second pause
    ping -n 5 127.0.0.1 > NUL
    sc \\FILE1 start DFS

    Run this for each of the 3 existing DFS servers.
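The per-server loop can be sketched in Python. Only FILE1 is named in this example, so OLDFILE2 and OLDFILE3 below are placeholders for the two unnamed existing servers; the commands are printed, not executed:

```python
# Build (not execute) the FQDN-referral commands for each existing
# DFS namespace server. FILE1 is from the example above; OLDFILE2 and
# OLDFILE3 are hypothetical stand-ins for the other two servers.
def fqdn_referral_commands(server):
    """Commands to enable FQDN referrals and bounce the DFS service."""
    return [
        rf"dfsutil server registry DfsDnsConfig set \\{server}",
        rf"sc \\{server} stop DFS",
        "ping -n 5 127.0.0.1 > NUL",  # crude ~5 second pause
        rf"sc \\{server} start DFS",
    ]

for srv in ["FILE1", "OLDFILE2", "OLDFILE3"]:
    print("\n".join(fqdn_referral_commands(srv)))
```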

     
    Next we copy the DFS root folder and share (including all security) to the new DFS servers:
     

    robocopy C:\DFSRoots\ \\FILE2\c$\DFSRoots\ Testing /COPYALL /E /XJ
    reg copy \\FILE1\HKLM\System\CurrentControlSet\Services\LanManServer\Shares \\FILE2\HKLM\System\CurrentControlSet\Services\LanManServer\Shares /s /f

     
    Run these commands for each of the new DFS-N servers.
     
    We then set the new DFS servers to use FQDNs in their referrals:
     

    dfsutil server registry DfsDnsConfig set \\FILE2

     
    We need to restart the Server service so it picks up the shares we just copied. This also restarts the DFS-N service for us, so the FQDN change takes effect:
     

    echo net stop LanManServer /yes > \\FILE2\c$\temp\restartsvc.bat
    echo net start LanManServer >> \\FILE2\c$\temp\restartsvc.bat
    echo net start DFS >> \\FILE2\c$\temp\restartsvc.bat
    psexec \\FILE2 -d -accepteula c:\temp\restartsvc.bat
    echo Waiting 30 seconds for the Server service on FILE2 to restart
    ping -n 30 127.0.0.1 > NUL

     
    PSExec is needed to run the .bat file because once the Server service is stopped, it can't be started remotely: the Server service itself is what accepts the remote commands.
     
    Next, we need to export the existing DFS-N configuration and change all the short-name paths to FQDNs:
     

    dfsutil root export \\child.corp.contoso.com\Testing C:\temp\export.xml
    REM Get FNR.exe from here: http://findandreplace.codeplex.com/
    fnr.exe --cl --dir "C:\temp" --fileMask "export.xml" --find "\\\\TARGET1\\" --replace "\\\\TARGET1.child.corp.contoso.com\\" --silent
    fnr.exe --cl --dir "C:\temp" --fileMask "export.xml" --find "\\\\TARGET2\\" --replace "\\\\TARGET2.child.corp.contoso.com\\" --silent
    fnr.exe --cl --dir "C:\temp" --fileMask "export.xml" --find "\\\\TARGET3\\" --replace "\\\\TARGET3.child.corp.contoso.com\\" --silent
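If fnr.exe isn't available, the same short-name-to-FQDN rewrite can be done with a few lines of Python (target names and the domain come from the example above; `fqdn_rewrite` is my name for the helper):

```python
# Rewrite short server names in the exported namespace XML to FQDNs,
# equivalent to the three fnr.exe replacements above.
DOMAIN = "child.corp.contoso.com"
TARGETS = ["TARGET1", "TARGET2", "TARGET3"]

def fqdn_rewrite(xml_text, targets=TARGETS, domain=DOMAIN):
    for t in targets:
        # \\TARGET1\ -> \\TARGET1.child.corp.contoso.com\
        xml_text = xml_text.replace(f"\\\\{t}\\", f"\\\\{t}.{domain}\\")
    return xml_text

# path = r"C:\temp\export.xml"
# with open(path) as f: xml = f.read()
# with open(path, "w") as f: f.write(fqdn_rewrite(xml))
```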

     
    We add the new DFS-N servers to the existing namespace:
     

    dfsutil target add \\FILE2.child.corp.contoso.com\Testing

     
    Repeat this for each of the 3 new namespace servers.
     
    Then we find the PDCe for the domain. We need to restart the DFS-N service on the PDCe, as it doesn't seem to pick up newly added namespace servers otherwise:
     

    nltest /dnsgetdc:child.corp.contoso.com /PDC | find ".">PDC.txt
    FOR /F "delims=. " %i IN (PDC.txt) DO sc \\%i stop DFS
    ping -n 5 127.0.0.1>NUL
    FOR /F "delims=. " %i IN (PDC.txt) DO sc \\%i start DFS
    ping -n 5 127.0.0.1>NUL
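The nltest/FOR dance above simply extracts the PDCe's short host name. For reference, the same extraction in Python (the `DC: \\name` line format is the usual nltest output; `pdc_shortname` is my name for it):

```python
# Pull the PDCe's short host name from 'nltest /dnsgetdc ... /PDC' output,
# like the FOR /F "delims=. " loop above does.
def pdc_shortname(nltest_output):
    for line in nltest_output.splitlines():
        if "\\\\" in line and "." in line:
            fqdn = line.split("\\\\")[-1].strip()
            return fqdn.split(".")[0]
    return None

sample = r"DC: \\dc1.child.corp.contoso.com"
print(pdc_shortname(sample))  # dc1
```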

     
    You now need to wait a full DAY for the new namespace servers to actually take effect on the SMB clients out there on the network.
    If a session has an open write handle to any file on the DFS namespace path, it will NOT update its target list.
    You can verify this by logging on to the client computer and running:
     

    dfsutil.exe cache referral

    If it shows something like this…:

    Entry: \child.corp.contoso.com\Testing
    ShortEntry: \child\Testing
    Expires in 276 seconds
    UseCount: 0 Type:0x81 ( REFERRAL_SVC DFS )
      0:[\OLD-SERVER-1\Testing] AccessStatus: 0 ( ACTIVE TARGETSET )
       1:[\OLD-SERVER-3\Testing]
       2:[\OLD-SERVER-2\Testing]

     
    …and doesn’t list the new server names, then do NOT proceed to the next step. You can try flushing the DFSN cache, but if there are open files, this won’t have any effect. You will need to reboot this server before it learns of the new DFSN servers.
     

    dfsutil.exe cache referral flush

    You are looking for output which looks like this:

    Entry: \child.corp.contoso.com\Testing
    ShortEntry: \child.corp.contoso.com\Testing
    Expires in 276 seconds
    UseCount: 0 Type:0x81 ( REFERRAL_SVC DFS )
      0:[\OLD-SERVER-1\Testing] AccessStatus: 0 ( ACTIVE TARGETSET )
       1:[\OLD-SERVER-3\Testing]
       2:[\NEW-SERVER-3.child.corp.contoso.com\Testing]
       3:[\NEW-SERVER-2.child.corp.contoso.com\Testing]
       4:[\OLD-SERVER-2\Testing]
       5:[\NEW-SERVER-1.child.corp.contoso.com\Testing]

     
    If you are sure that all the important SMB clients have updated their lists of possible namespace servers to include the 3 old servers and the 3 new servers (so 6 in all), then you can do the next step.
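If you have many clients to check, the referral output can be parsed rather than eyeballed. A rough sketch (the bracketed-path format is taken from the output above; the old/new split relies on old servers appearing as short names and new ones as FQDNs):

```python
import re

# Parse 'dfsutil cache referral' output and separate old (short-name)
# targets from new (FQDN) targets.
def referral_targets(cache_output):
    # lines look like:  0:[\SERVER\Testing] AccessStatus: ...
    return re.findall(r"\d+:\[\\([^\\\]]+)\\", cache_output)

sample = r"""
Entry: \child.corp.contoso.com\Testing
  0:[\OLD-SERVER-1\Testing] AccessStatus: 0 ( ACTIVE TARGETSET )
   1:[\NEW-SERVER-2.child.corp.contoso.com\Testing]
"""
names = referral_targets(sample)
new = [n for n in names if "." in n]      # FQDN entries: new servers
old = [n for n in names if "." not in n]  # short names: old servers
print(old, new)
```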
    The next step can only be done in the GUI, because dfsutil cannot disable namespace servers. Open the DFS console, select the 3 old namespace servers and choose "Disable Namespace Server" for each of them:
     
    This will make sure that anyone who is still using the old Namespace servers can continue to do so, but that anyone who asks for a new referral will NOT receive the old servers.
     
    WAIT ANOTHER DAY.
    We need to make sure that all SMB clients end their sessions (by closing all their open write handles) and create new referral caches containing only the new DFS-N servers.
    Log on to one of the big SMB clients on the network and verify that the OLD namespace servers do NOT appear in the referral cache:
     

    dfsutil.exe cache referral

    This should now look like this:

    Entry: \child.corp.contoso.com\Testing
    ShortEntry: \child.corp.contoso.com\Testing
    Expires in 276 seconds
    UseCount: 0 Type:0x81 ( REFERRAL_SVC DFS )
       0:[\NEW-SERVER-1.child.corp.contoso.com\Testing] AccessStatus: 0 ( ACTIVE TARGETSET )
       1:[\NEW-SERVER-3.child.corp.contoso.com\Testing]
       2:[\NEW-SERVER-2.child.corp.contoso.com\Testing]

     
    If old server names appear in the list, do NOT continue to the next step. Again, you can try to flush the DFS referral cache, but this is unlikely to work. The SMB client will likely need to be restarted (again).
     
    Remove the old namespace servers. Don't use the GUI for this, as it will also attempt to remove the share, which may cause extra difficulty if you need to roll back this step:
     

    dfsutil target remove \\FILE1\Testing

     
    Repeat this command for each of the 3 old namespace servers.
     
    We will now delete the existing DFS-N namespace and create a new one which is in Windows Server 2008 mode (aka v2). Existing sessions will be unaffected as they work from their referral cache. New sessions may fail if they are created in the next few seconds. Retrying will succeed without any intervention.
    Run these commands on one of the new DFS-N servers. Copy the export.xml file from the old DFS-N server where you have been running the previous commands from.
    This needs to be run locally due to the time it takes to re-create the links: around 15 seconds when run locally versus about 1-2 minutes if run remotely:
     

    dfsutil root import set C:\temp\export.xml \\child.corp.contoso.com\Testing NoBackup

     
    In this example, FILE3 and FILE4 would be the additional new DFS-N servers.
     
    That will do it. You are now running the exact same namespace setup with the same permissions, but on new computers, using FQDN referrals and a v2 namespace.
  • Office 2013 Security Baselines for SCM are live

    Hi,

    Pat Fetty recently blogged about the new SCM baselines for Office 2013 going live.

    I opened up my local copy of SCM and imported the content:

    Prompt to import Office 2013 baselines

    .cab and att files

    The .cab file contains the security settings. The “att” file contains the attachments which are Word documents describing the security baseline settings.

    You may get prompted at this point to accept the security details of the package. Inspect the certificates to make sure they are issued by Microsoft and are trusted by your computer.

    User and Computer product-specific settings

    There are user and computer settings, separated by individual Office programs or core Office settings.

    Done!

    Done!

     

    Browsing these new settings looks like this:

    SCM displaying Office 2013 settings

    Once you export these settings to a GPO backup and import them into an existing blank GPO in your domain, you'll want the ADMX/ADML files which relate to the Office 2013 settings. And you'll probably want to save them to your PolicyDefinitions folder in SYSVOL:

    \\your.domain.name\SYSVOL\your.domain.name\Policies\PolicyDefinitions

    Get them here:

    http://www.microsoft.com/en-us/download/details.aspx?id=35554

    Office 2013 ADMX/ADML file download 

  • A backup server flooded by DPCs

    Hi,

    I’ve just finished working on a case with a customer that was so interesting that it deserved a blog post to round it off.

    These were the symptoms:

    Often, while logged in to the server, things would appear to freeze: no screen updates and little mouse responsiveness. If you could start a program (perfmon, Task Manager, Notepad etc.), you wouldn't be able to type into it, and if you did, it would crash.

    This Windows Server 2008 R2 server runs TSM backup software with thousands of servers on the network sending their backup jobs to it. At any one time there could be hundreds of backup jobs running. The load was lower during the day, but it was always working hard dealing with constant backups of database snapshots from servers. The backup clients are Windows, UNIX, Solaris, you name it…

    When the server froze, you’d see 4 of the 24 logical CPUs lock at 100% and the other 20 CPUs would saw-tooth from locking at 100% to using 20-30%. The freeze would happen for minutes at a time.

    CPUs 0,2,4,6 locked at 100%, others saw-tooth

    There are 2 Intel 10Gb NICs in a team using Intel's teaming software. The team and the switches are set up with LACP to enable inbound load balancing and failover.

    By running perfmon remotely before the freeze happens we could see that the 4 CPUs that are locked at 100% are locked by DPCs. We used the counter “Processor Information\% DPC Time”.

    A DPC is best defined in Windows Internals 6th Ed. (Book 1, Chapter 3):

    A DPC is a function that performs a system task—a task that is less time-critical than the current one. The functions are called deferred because they might not execute immediately. DPCs provide the operating system with the capability to generate an interrupt and execute a system function in kernel mode. The kernel uses DPCs to process timer expiration (and release threads waiting for the timers) and to reschedule the processor after a thread’s quantum expires. Device drivers use DPCs to process interrupts.

    Because this is a backup server, we’re expecting that the bulk of our hardware DPCs will be generated by incoming network packets and raised by the NICs. Though they could have been coming from the tape library or the storage arrays.

    To look into what exactly is generating DPCs and how long they last, we need to run the Windows Performance Toolkit, specifically WPR.exe (Windows Performance Recorder). We have to do this carefully. We don't want to add load by capturing the Network and CPU activity of a server which already has high CPU and network activity and a history of freezing, but we do want to run the capture while the server is in its frozen state. A tricky thing. So we ran this batch file:

    Start /HIGH /NODE 1 wpr.exe -start CPU -start Network -filemode -recordtempto S:\temp

    ping -n 20 127.0.0.1 > nul

    Start /HIGH /NODE 1 wpr.exe -stop S:\temp\profile_is_CPU_Network.etl

    If the server you are profiling has a lot of RAM (24GB or more), you'll want to stop your non-paged pool from growing and harming your server. To do that you should review this blog and add this switch to the start command: -start "C:\Program Files (x86)\Windows Kits\8.0\Windows Performance Toolkit\SampleGeneralProfileForLargeServers.wprp"

    We're starting on NUMA node 1 because the NICs are bound to NUMA node 0, and the "Processor Information" perfmon trace we took earlier showed that the CPUs on NUMA node 0 were the locked ones. We're starting the recorder with HIGH priority so that we can be sure it gets the CPU time it needs. We're not recording to RAM but to disk, in the hope that if the trace crashes we'll at least have a partial trace to use. We made sure that S: in this example was a SAN disk, to ensure it had the speed required to keep up with the large volume of data we're expecting. We're pinging 20 times to make our trace 20 seconds long. And finally, we're starting a trace of the CPU and Network profiles.

    Note that to gather stacks we first had to stop the Kernel (aka the Executive) from paging its own memory out of RAM to the pagefile, where we cannot analyze it. To do this, run wpr -disablepagingexecutive on and then reboot.

    We retrieved 3 traces in all:

      1. The first trace, to diagnose the problem
      2. The second trace, after 2 changes that removed roughly half of the problem
      3. The final trace, after the final change that removed the remaining half

        Diagnosis

        So this blog now becomes a short tutorial on how you can use WPA (Windows Performance Analyzer) to locate the source of DPC issues. WPA is a VERY powerful tool, and diagnosing problems is part science, part art: no two diagnoses are ever done in exactly the same way. This is just how I used WPA in this case. For this analysis, you'll need the debugging tools installed and symbols configured and loaded.

        CPU Usage (Sampled)\Utilization By CPU

        First I want to see which CPUs are pegged. For that we use “CPU Usage (Sampled)\Utilization By CPU”, then select a time range by right-clicking:

        Choose a round number (10 seconds in my example) as it makes it easier to quickly calculate how many things happened per minute when comparing to the graphs for the later scenarios:

        Select Time Range

        I chose 20 seconds to 30 seconds, as it is a 10-second window with heavy load and no blips from tracing starting or stopping. Then "Zoom" by right-clicking again.

        Now all your graphs will be focused on that time range.

        Then shift-select the CPUs which are pegged. In this case it is CPUs 0, 2, 4 and 6. This is because the cores are Hyperthreaded and the NICs cannot interrupt a logical CPU which is the result of Hyperthreading (CPUs 1, 3, 5, 7 etc.). And they are low-numbered CPUs because they are located on NUMA node 0.

        Once they are selected, right-click and choose “Filter to Selection”:

        Filter to Selection

        Next we want to add a column for DPCs so we can see how much of the CPUs' time was spent processing DPCs. To add columns, right-click the column title bar in the centre of the right-hand pane (in the screenshot above it reads "Line # | CPU || Count | Weight (in view) | Timestamp") and select the columns you want to display. Once the DPC/ISR column has been added, drag it to the left side of the yellow bar, next to the CPU column:

        Choose columns

        Expanding out the CPU items, we see that DPCs account for almost all of the CPU activity on these CPUs: each CPU shows 10 seconds of CPU time, and the DPC time beneath it is over 9 of those seconds.

        DPC duration by Module, Function

        The next WPA graph we need is the one which can show how long the DPCs last for. We drag in the first graph under “DPC/ISR” called “DPC duration by Module, Function”:

        DPC duration by Module, Function

        On the far right column ("Duration"), we can see how long each module spends in DPCs. It says that 36.8 seconds were spent on DPCs for NDIS.SYS alone. How can it be 36.8 seconds if the sample window is 10 seconds? These are CPU-seconds, and with 24 CPUs we could potentially have 240 CPU-seconds in all.

        The next biggest waiter for DPCs is storport.sys. But at 1 second, it’s not even close.

        The column with the blue text is called “Duration (Fragmented) (ms) Avg” and is the average time a DPC lasts for during this sample window. The NDIS.SYS DPCs last around 0.22 milliseconds, or 220 microseconds. The count of DPCs for NDIS and storport are comparatively similar (163,000 and 123,000 respectively), but because NDIS took so long on each DPC on average, it ended up locking the CPU for longer than storport did.
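To spell out the arithmetic: the 10-second window on 24 CPUs gives a 240 CPU-second budget, and dividing NDIS's DPC time by its DPC count reproduces the per-DPC average (the count is approximate, so this lands near the ~0.22 ms column value):

```python
# The Duration column counts CPU-seconds, not wall-clock seconds.
window_s = 10                # zoomed sample window
cpus = 24                    # logical CPUs in the server
budget = window_s * cpus     # CPU-seconds available in the window

ndis_dpc_s = 36.8            # CPU-seconds in NDIS.SYS DPCs (from WPA)
ndis_count = 163_000         # approximate NDIS DPC count
avg_ms = ndis_dpc_s / ndis_count * 1000
print(budget, round(avg_ms, 2))  # 240 CPU-seconds; about 0.23 ms per DPC
```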

        So let's add the CPU column and move it to the left side of the yellow line, making it the first column to pivot on:

        Filter to busy CPUs

        We can see that our targeted CPUs, 0, 2, 4 and 6, have very high DPC wait durations (using the last column, "Duration", again), with no other CPU spending much time in a DPC wait state. So we select these CPUs and filter.

        Expanding out the CPUs, we see that there are many different sources of DPCs, but that NDIS is really the biggest source of DPC waits. So we will now move the “Module” column to be the left-most column and remove the CPU column from view. We then right click on NDIS.SYS and “Filter to Selection” again as we only want to focus on DPCs from NDIS on CPUs 0, 2, 4, 6:

        Filter to NDIS

        One function, ndisInterruptDPC, is causing our DPC waits. This is the one we'll focus on. If we expand it, it will list every single DPC and how long each wait is. Select every one of these rows by scrolling to the very bottom of the table (in this example there are 163,230 individual DPCs):

        Copy Column Selection

        Right click on the column called “Duration” and choose “Copy Other” and then “Copy Column Selection”. This will copy only the values in the “Duration” column. We can paste this into Excel and create a graph which shows the duration of the DPCs as a function of the number of DPCs present:

        Taken from Excel

        I have added a red line at 0.1 milliseconds because, according to the hardware development kit for driver manufacturers, a DPC should not last longer than 100 microseconds. DPCs above the red line are misbehaving, and they account for the bulk of our time spent waiting on DPCs.
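Given the Duration values pasted into Excel, the share of DPCs breaking the 100 microsecond guideline can be computed directly. A sketch with made-up durations standing in for the 163,230 exported values:

```python
LIMIT_US = 100  # a DPC should complete within 100 microseconds

def misbehaving_share(durations_us):
    """Return (fraction of DPCs over the limit, fraction of DPC time over the limit)."""
    over = [d for d in durations_us if d > LIMIT_US]
    return len(over) / len(durations_us), sum(over) / sum(durations_us)

# made-up microsecond durations, standing in for the WPA export
sample = [20, 50, 80, 120, 250, 400]
count_share, time_share = misbehaving_share(sample)
print(count_share, round(time_share, 2))  # 0.5 0.84
```

Here half the DPCs are over the limit but they account for ~84% of the DPC time, which is the shape of the problem this server had.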

        So, we have established that we have slow DPCs on NDIS, and lots of them, and that they are locking our 4 CPUs. Our NICs aren’t able to spread their DPCs to any other CPUs and Hyperthreading isn’t really helping our specific issue. But what is causing the networking stack to generate so many slow DPC locks?

        DPC/ISR Usage by Module, Stack

        The final graph in WPA will show us this. From the category “CPU Usage (Sampled)”, drag in a graph called “DPC/ISR Usage by Module, Stack”. Filter to DPC (which will exclude ISRs) and our top candidates are:

        DPC/ISR Usage by Module, Stack

        1. ntoskrnl.exe (the Windows Kernel)
        2. NETIO.SYS (Network IO operations)
        3. tcpip.sys (TCP/IP)
        4. NDIS.SYS (Network layer standard interface between OS and NIC drivers)
        5. IDSvia64.sys (Symantec Intrusion Detection System)
        6. ixn62x64.sys (Intel NIC driver for NDIS 6.2, x64)
        7. iansw60e.sys (Intel NIC teaming software driver for NDIS 6.0)

        To see what these are doing we simply expand the stack columns by clicking the triangle of the row with the highest count, looking for informative driver names and a large drop in the number of counts present, indicating that this particular function is causing a consumption of CPU time.

        NTOSKRNL is running high because we are capturing. The kernel is spending time gathering ETL data. This can be ignored.

        NETIO is redirecting network packets to/from tcpip.sys for a function called InetInspectRecieve:

        NETIO.sys stack expansion

        TCP/IP is dealing with the NETIO commands above to do this “Receive Inspection”:

        TCPIP.SYS stack expansion

        NDIS.SYS is dealing with 2 main functions in tcpip.sys: TcpTcbFastDatagram and InetInspectRecieve again:

        NDIS.SYS stack expansion

        Other than ntoskrnl, these 3 Windows networking drivers all have entries for the drivers listed as 5, 6 and 7 above in their stacks.

        Diagnosis Summary

        Lots of DPCs are caused by 3 probable sources:

        1. Incoming packet inspection by the Symantec IDS system.
           The IDS system has to take every packet, compare it to a signature definition and, if clean, allow it to pass. This action is causing slow DPCs.
        2. The NIC driver could be stale/buggy and generating slow DPCs.
           There is no evidence for this, but it's usually a good place to start. There could be TCP offloading or acceleration features in the NIC and/or driver which haven't been enabled but may improve network performance.
        3. And finally, the NIC teaming software is getting in between the NICs and the CPUs.
           That is, after all, the job of the NIC teaming software: to trick Windows into thinking that the incoming packets from 2 distinct hardware devices are actually coming from 1 device. The problem here, however, is that this insertion into the networking stack is pure software and is likely causing very slow DPCs.

        Action Plan

        Our actions were to make changes over 2 separate outage windows:

        1. Update the NIC driver and enable Intel I/OAT in the BIOS of the server.
           I/OAT is described in the spec sheet for the NIC like this: "When enabled within multi-core environments, the Intel Ethernet Server Adapter X520-T2 offers advanced networking features. Intel I/O Acceleration Technology (Intel I/OAT), for efficient distribution of Ethernet workloads across CPU cores. Load balancing of interrupts using MSI-X enables more efficient response times and application performance. CPU utilization can be lowered further through stateless offloads such as TCP segmentation offload, header replications/splitting and Direct Cache Access (DCA)."
        2. Uninstall the NIC teaming software.
           3rd-party NIC teaming software inhibits many TCP offloading features, and in this case it generated large numbers of slow DPCs.
        3. On the second outage, uninstall the IDS system.
           IDS was not configured on this (or any other) server. But because the software had the potential to become enabled, it was grabbing every incoming packet for inspection, despite not being configured to inspect the packets or act on violations in any way. Stopping the service is insufficient: the filter driver stays loaded in the hidden, non-Plug and Play section of Device Manager. Manually removing the driver there isn't sufficient either, as the software will reinstall it at next boot. Only a full uninstall will do.

        After dissolving the NIC Team

        Here is what the picture looked like after we dissolved the NIC team, updated the NIC driver and enabled Intel I/OAT in the BIOS.

        DPC duration - No teaming, I/OAT enabled

        In this 10 second sample we can see that the 4 CPU cores are still effectively locked, as the CPU time due to NDIS DPCs is 37.7 seconds (out of a possible maximum of 40 seconds). The number of DPCs has decreased by more than half, to 55,000, meaning that the average DPC duration has become very long at 682 microseconds: triple the average time from before we removed the NIC team and enabled I/OAT.
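As a sanity check on that average (the counts are approximate, so this lands close to, rather than exactly on, the 682 microseconds read from the trace):

```python
dpc_cpu_s = 37.7        # CPU-seconds of NDIS DPC time in the 10 s window
dpc_count = 55_000      # approximate DPC count after the changes
avg_us = dpc_cpu_s / dpc_count * 1e6
print(round(avg_us))    # ~685 us: fewer DPCs, each lasting far longer
```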

        Taken from Excel

        The blue area of the graph above is the picture we had from before changes were made. The pink/orange area is the picture of DPC durations after removing NIC teaming and enabling I/OAT.

        So why did the average duration of DPCs get longer?

        It could be that the IDS software now does not need to relinquish its DPCs to make room on the same CPU cores as the DPCs for the NIC teaming driver. These 2 drivers must be locked to the same CPUs. With no need to relinquish a DPC due to another DPC of equal priority, the IDS DPCs are free to use the CPU for longer periods of time before being forced off.

        At any rate, it certainly isn’t fixed yet.

        After uninstalling Symantec IDS

        And finally here’s what the picture looked like after we uninstalled the IDS portion of the Symantec package. Remember, this service was not configured to be enabled in any way.

        DPC duration - no IDS

        You can see that the average time has dropped from 220 microseconds to 90 microseconds – below the 100 microsecond threshold required by the Driver Development Kit.

        In this 10 second sample there were 127,000 DPCs from NDIS on the 4 heavily used CPUs, but the CPU time they consumed was 11 seconds, a reduction from 36.8 seconds.
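Putting the three traces side by side makes the progression clear (all figures as quoted in this post; averages derived as time divided by count):

```python
# (stage, NDIS DPC CPU-seconds, approximate DPC count) per 10 s sample
stages = [
    ("baseline",          36.8, 163_000),
    ("no team + I/OAT",   37.7,  55_000),
    ("IDS also removed",  11.0, 127_000),
]
for name, cpu_s, count in stages:
    avg_us = cpu_s / count * 1e6
    print(f"{name:18s} {cpu_s:5.1f} CPU-s {count:8,d} DPCs  avg {avg_us:4.0f} us")
```

Only the last stage brings the average under the 100 microsecond guideline.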

        Taken from Excel

        The blue area of the graph above is the picture we had from before changes were made. The pink/orange area is the picture of DPC durations after removing NIC teaming and enabling I/OAT. And the green area is the picture after IDS is removed.

        This is a dramatic improvement. Nearly all DPCs are below the 100 microsecond limit. The system is able to process the incoming load without locking up for high priority, long lasting DPCs.

        What about RSS?

        We're not quite done, though. 4 of our CPUs are still working very hard, often pegged at 100%. But why only 4? This is a 2-socket system with 6 cores on each socket, giving us 12 cores where we can run DPCs. DPCs from one NIC are bound to one NUMA node, and since we already dissolved our NIC team, we only have 1 NIC in action, limiting us to 6 cores. RSS can spread DPCs over a power-of-2 number of CPUs (1, 2, 4, 8, 16, 32), so we can use at most 4 of those 6 CPUs for the one NIC.

        To scale out we would need to add more NICs and limit RSS on each of those NICs to 2 cores. We'd need to bind 3 NICs to NUMA node 0 and 3 to NUMA node 1, and set the starting CPUs for those NICs to cores 0, 2, 4, 6, 8 and 10. That way we can saturate every possible core.
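That layout can be sketched as follows, assuming the topology described here (2 NUMA nodes, 6 cores each, 6 NICs at 2 RSS cores apiece). The function names and the layout table are mine, and the exact base-processor numbering in practice depends on how logical processors are indexed on the box:

```python
# How many CPUs RSS can give a single NIC (power-of-2 constraint),
# and a possible NIC-to-node/core layout that uses every core.
def rss_cpus_for_single_nic(cores_per_node):
    n = 1
    while n * 2 <= cores_per_node:
        n *= 2
    return n

def nic_layout(nics=6, cores_per_node=6, rss_cores_per_nic=2):
    layout = []
    nics_per_node = cores_per_node // rss_cores_per_nic
    for i in range(nics):
        layout.append({
            "nic": i,
            "numa_node": i // nics_per_node,
            "base_core": (i % nics_per_node) * rss_cores_per_nic,
        })
    return layout

print(rss_cpus_for_single_nic(6))  # 4: why one NIC can't use all 6 cores
for row in nic_layout():
    print(row)
```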

        But to do this, we'd need multiple NICs without the teaming software, which means assigning each NIC a unique IP address. To do that we need to make sure that the TSM clients can deal with targeting a server name that has multiple IP addresses in DNS, and that if connectivity to the first IP address is lost, TSM can fail over to one of the other addresses. We'll test TSM and get back with our results later.

        But we need one more fundamental check before doing that: We need to make sure that the incoming packet, hitting a specific NUMA node and core is going to end up hitting the right thread of the TSM server where that packet is going to be dealt with and backed up. If we can’t align a backup client to the incoming NIC and align that NIC to the backup software thread that should process it, then we’ll be causing intra-CPU interrupts, or worse yet, cross NUMA interrupts. This would make the entire system much less scalable.

        image

        So this is how it would all look. The registry value to bind a NIC to a NUMA node is "*NumaNodeId" (including the * at the start). To set the base CPU, use "*RssBaseProcNumber". To set the maximum number of processors to use, set "*MaxRssProcessors".

        These keys are explained here: http://msdn.microsoft.com/en-us/library/windows/hardware/ff570864(v=vs.85).aspx

        and here: Performance Tuning Guidelines for Windows Server 2008 R2

        And more general information on how RSS works in Windows Server 2008 is here: Scalable Networking- Eliminating the Receive Processing Bottleneck—Introducing RSS

        Our problem in the above picture, however, is that our process doesn’t know to run its threads on the NUMA node and cores where the incoming packets are arriving. Had this been SQL server, we could have run separate instances configured to start using specific CPUs. Hopefully, one day, TSM will operate like this and become NUMA-node aware.

        I know this has been a long post, but for those who have read down to here, I do hope this has helped you with your troubleshooting using WPT.

      • Low throughput when copying files

        Hi,

        I have been helping a customer with a tricky issue recently regarding slow network performance for SMB file copies over their network.

        It came about after they took the settings defined in Security Compliance Manager for their member servers and deployed them as a Group Policy to their server OU. After doing this, they saw an 80% reduction in the performance of SMB file copies. But when we used Ntttcp.exe to test the network throughput with a test data stream, throughput was unaffected. Only SMB was affected.

        They had Windows Server 2008 R2 SP1 VMs on ESX, with 1 virtual 10Gb NIC patched to a team of 2 physical 10Gb NICs. When 2 servers copied a set of large test files without the SCM security settings applied, they could reach around 400Mbps. When we applied the settings, that dropped to around 80Mbps.

        In the SCM security definitions, there are 234 settings defined. We had to find out which one of these settings caused their issue.
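One standard way to isolate a single bad setting out of 234 is to bisect: apply half the settings, re-run the file-copy benchmark, and keep whichever half reproduces the slowdown — about 8 tests in all (log2 234 ≈ 7.9). A generic sketch (the `is_slow` probe and the setting names are made up; in our case the WPR trace below pointed at the culprit faster):

```python
import math

# Binary-search the settings list for the one that triggers the slowdown.
def bisect_settings(settings, is_slow):
    candidates = list(settings)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # apply only `half`, run the benchmark, keep the guilty half
        candidates = half if is_slow(half) else candidates[len(half):]
    return candidates[0]

settings = [f"setting-{i}" for i in range(234)]
culprit = bisect_settings(settings, lambda applied: "setting-200" in applied)
print(culprit, math.ceil(math.log2(234)))  # setting-200 8
```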

        image

        We could see that the CPUs of the VM were going nuts with a wild saw-tooth pattern of all CPUs. We tried adding more CPUs and the saw-tooth pattern simply spread without making any major change in achievable throughput.

        The process consuming the CPU time in Task Manager was ‘System’.

        So, to break into ‘System’ a little more, we ran Windows Performance Recorder (WPR) to get a trace of CPU activity, like this:

        image

        And in the trace, we expanded out “CPU Usage (Sampled)”, and added the graph for “DPC and ISR by Module, Stack”:

        DPC and ISR by Module, Stack

        This showed us that all our CPU time was spent processing DPCs generated by a driver called cng.sys.

        image

        This is “Kernel Cryptography, Next Generation”, which relates to the server’s or client’s ability to perform cryptographic calculations in the kernel when sending or receiving encrypted or signed information. Signing in this case means creating a signature hash for chunks of transmitted data to prove they haven’t been modified while on the wire.
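To get a feel for the extra per-message work signing adds, here is a toy sketch, not the actual SMB code path: SMB2 signs each message with HMAC-SHA256, but the 64KB chunk size and the key below are just assumptions for illustration.

```python
import hashlib
import hmac

def sign_chunks(data: bytes, key: bytes, chunk_size: int = 64 * 1024):
    """Compute an HMAC-SHA256 signature per chunk -- roughly the kind of
    per-message work signing adds on top of a plain, unsigned copy."""
    signatures = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        signatures.append(hmac.new(key, chunk, hashlib.sha256).digest())
    return signatures

payload = b"\x00" * (1024 * 1024)            # 1 MB of test data
sigs = sign_chunks(payload, key=b"session-key")
print(len(sigs))                              # one signature per 64KB chunk
```

Every chunk of every file copied pays this CPU cost, which is why the saw-tooth CPU pattern tracked the transfer so closely.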

        This, combined with the fact that only SMB was affected, led us to think SMB signing was our issue.

        SMBv2 uses these 2 GPO settings to define SMB signing:

        image

        1. Microsoft Network Server: Digitally sign communications (always)
        2. Microsoft Network Client: Digitally sign communications (always)

        The settings relate to SMBv2. Note that they change the default, in-box setting from “Disabled” to the Microsoft recommended SCM setting of “Enabled”.

        For SMBv1 on Windows 2003 and older, the GPO settings are:

        1. Microsoft Network Server: Digitally sign communications (if client agrees)
        2. Microsoft Network Client: Digitally sign communications (if server agrees)

        Once we removed the “always” settings, the transfer speed returned to the higher 400Mbps we expected.

        We discussed the usefulness of this setting. In their network, it would be best to keep the “server” side setting enabled on DCs only. This ensures that the GPO files clients download from the DCs during a Group Policy refresh have not been altered; these are security-sensitive files, but they are usually very small, so slightly slower transfer speeds for them don’t matter.

         

        Here are some additional resources we used when investigating SMB signing:

        http://blogs.technet.com/b/josebda/archive/2010/12/01/the-basics-of-smb-signing-covering-both-smb1-and-smb2.aspx

        http://msdn.microsoft.com/en-us/library/a64e55aa-1152-48e4-8206-edd96444e7f7#id218

        http://blogs.msdn.com/b/openspecification/archive/2009/07/06/negtokeninit2.aspx?Redirected=true

        http://blogs.msdn.com/b/openspecification/archive/2009/04/10/smb-maximum-transmit-buffer-size-and-performance-tuning.aspx

        http://blogs.technet.com/b/filecab/archive/2012/05/03/smb-3-security-enhancements-in-windows-server-2012.aspx

        http://support.microsoft.com/kb/320829

        http://blogs.technet.com/b/neilcar/archive/2004/10/26/247903.aspx

        http://gallery.technet.microsoft.com/NTttcp-Version-528-Now-f8b12769

      • Removing permission for users to upload their image to AD

        Hi,

         

        I recently had the pleasure to help one of our Premier customers with a query they have regarding saving images in Active Directory.

        Default Permission in AD

        By default, users have permission to save a JPEG or BMP file to their own AD user account. This file can be up to 100KB in size. In a large AD with hundreds of thousands of users, this could quickly increase the size of the AD database; the growth increases backup times and backup sizes and slows down restores.
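A quick back-of-the-envelope calculation shows why this matters; the user count below is a made-up example, and 100KB is the per-user maximum from above.

```python
users = 200_000                      # hypothetical number of user objects
photo_kb = 100                       # maximum photo size per user (KB)

# Worst case: every user uploads a maximum-size photo.
total_gb = users * photo_kb / (1024 * 1024)
print(f"Potential extra database size: {total_gb:.1f} GB")
```

Even at half that uptake, the database (and every backup of it) grows by many gigabytes of picture data.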

        This permission is granted via the well-known security principal “SELF”.

        “SELF” is given permission to a set of attributes, not to the individual attributes themselves. By combining common attributes into groups, the size of the ACL entry is reduced. These groups are called property sets. The attributes which relate to images are:

        • Picture (aka thumbnailPhoto)
        • jpegPhoto

        The attribute Picture is in a property set called Personal-Information. You can see the permission applied to all users like this:

        Personal-Information

        Control the Permission

        They wanted to take away SELF’s permission to write to the Picture attribute, but without a high-level deny for Everyone on this attribute: some users, somewhere, at some time may legitimately need to write to it.

        What I suggested was to de-couple the Picture attribute from the property set called Personal-Information, and then apply an explicit permission at the root of the domain granting a group write access to this attribute instead.

        Unlink from a Property-Set

        But how do you link (and therefore unlink) an attribute to a Property-Set?

        The property sets are not found in the schema; they are found in the Configuration partition, under Extended-Rights.

        Each property set is identified by an attribute called rightsGuid. An attribute is pulled into the property set when its own attributeSecurityGUID attribute contains that same GUID. If the two GUIDs match, the attribute is a member of the property set. By removing the attributeSecurityGUID value on the Picture attribute, it is no longer a member of the Personal-Information property set, and SELF loses permission to write to it.

        While this sounds very complicated, here’s a simple picture to explain it all:

        Personal-Information

        The object on the left, “CN=Personal-Information”, is the property set. The object on the right, “CN=Picture”, is the attribute in the schema; its lDAPDisplayName is thumbnailPhoto. The rightsGuid and attributeSecurityGUID attributes of these objects hold the same value, a matching GUID.
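The membership rule boils down to a single comparison, which can be sketched like this (the GUID value is generated here purely for illustration):

```python
import uuid
from typing import Optional

def is_member_of_property_set(rights_guid: uuid.UUID,
                              attribute_security_guid: Optional[uuid.UUID]) -> bool:
    """An attribute belongs to a property set exactly when its
    attributeSecurityGUID matches the set's rightsGuid; clearing the
    value (None) removes it from the set."""
    return attribute_security_guid == rights_guid

personal_information = uuid.uuid4()   # stand-in for the real rightsGuid
print(is_member_of_property_set(personal_information, personal_information))  # member
print(is_member_of_property_set(personal_information, None))                  # cleared, not a member
```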

        Remove the GUID

        To remove the attributeSecurityGUID, open the attribute and click the “Clear” button at the bottom left, as shown below:

        Personal-Information

        Notice also that the text in the attribute editor isn’t the same as the text you see in the window behind: the characters appear as pairs, and the pairs within the blocks have been switched around.
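That swapping is just the GUID’s mixed-endian binary encoding: the first three fields of a GUID are stored little-endian, so their byte pairs appear reversed in the octet-string view. A quick sketch with Python’s uuid module shows the effect; I’m using the Personal-Information rightsGuid as the sample value, but any GUID behaves the same.

```python
import uuid

g = uuid.UUID("77b5b886-944a-11d1-aebd-0000f80367c1")

# Display (big-endian) form vs. stored (mixed-endian) octet string:
print(g.bytes.hex())     # bytes in the order you read the GUID
print(g.bytes_le.hex())  # first three fields byte-swapped, as in the editor
```

Comparing the two hex strings shows exactly the “pairs switched around” pattern from the screenshot.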

        Undoing the Change

        To restore the GUID if you change your mind, copy the same form of the GUID from another attribute. I chose Post-Code, as it is also in the Personal-Information property set.

         

        I hope this helps someone else to delegate their Active Directory if needed.

         

        Craig

      • Installing DHCP on Windows Server 2012 did not create the local groups

        Hi again,

         

        Another interesting case with a nice, easy solution.

        While working with a Premier customer recently, we found that the two local groups relating to DHCP, “DHCP Administrators” and “DHCP Users”, didn’t get created on their new DHCP servers.

        Only the role installation steps can create these groups correctly, as they also grant the groups the rights required to manage or view the service.

        What to do?

        We couldn’t just remove and reinstall the role – there was too much configuration already done.

        We couldn’t ignore it, as we were installing IPAM, which needs to place the IPAM server’s computer account into the “DHCP Users” group on the servers. It does this by nesting itself into a new universal group in the domain, IPAMUG; this group is what actually becomes a member of “DHCP Users”.

        The role was installed by a “next-next” manual installation using Server Manager, so it wasn’t as if some PowerShell or DISM.exe switch was accidentally left off. And if we repeated the manual installation, we would likely just end up where we started.

        What went wrong?

        At the end of the Server Manager wizard, you get this completion message (the big red arrow is my addition).

        Add Roles and Features Wizard

        Inside there is a link called “Complete DHCP configuration”, which launches a wizard that configures the DHCP server. This wizard does two things:

        1. It creates the 2 groups we’re after: “DHCP Administrators” and “DHCP Users”
        2. It authorizes the DHCP server in Active Directory if the DHCP server is joined to a domain

        DHCP Post-Install configuration wizard

         

        The authorization part is pretty nifty. Usually you do this by right-clicking the server in the DHCP MMC console and selecting “Authorize”. This creates an object for the DHCP server in the Configuration partition of the Active Directory forest, under Services / NetServices. Only members of Administrators in the forest root domain or members of Enterprise Admins can create objects here. The new wizard lets you type alternate credentials to do this job:

        DHCP Post-Install configuration wizard

        My customer had authorized their DHCP servers the old way, in the DHCP MMC console, using an account with permission to do so.

        They hadn’t noticed that small blue link from the image above. There is also an outstanding notification within any Server Manager console which connects to one of these DHCP servers (or on the local host itself). But that was also quite subtle, and requires that you click on it to see the same blue link:

        Server Manager Notification

        In fact, we hadn’t noticed any of this by the time I’d found an alternative way of creating these groups on their DHCP servers using netsh.exe:

        netsh.exe dhcp add securitygroups


        Had we run the wizard through to its completion, we would have got a success message like this, stating that the local groups were successfully created:

        DHCP Post-Install configuration wizard

         

        I hope this helps someone avoid some troubleshooting time when deploying DHCP on Windows Server 2012.

      • MBAM 2.0 gets released along with Service Packs to most MDOP apps

        Hi,

         

        Just a quick note to publicise that MBAM 2.0 is now out, and AGPM 4.0, DaRT 8.0, App-V 5.0 and UE-V 1.0 have each received their own Service Pack 1 updates. They are bundled in the new MDOP 2013.

        Read more about it here at the new home for the MDOP team: http://blogs.windows.com/windows/b/business/archive/2013/04/10/making-windows-8-even-more-manageable-with-mdop-2013.aspx

      • Using SONOS as a “Play To” destination from within Windows RT

        Hi,

         

        I recently became the proud owner of the fantastic Sonos PLAYBAR. And while the Sonos team is considering creating a Windows 8 App to control their devices, I found a neat little hack to get the DLNA portion of the Sonos to become a “Play To” device from within Windows 8 music apps.

        See the blog post here:

        http://digitalmediaphile.com/index.php/2013/03/30/using-uncertified-play-to-devices-on-surface-rt-w8-apps/

        Here are the registry keys I created for the PLAYBAR:

        Regedit 

      • Troubleshooting Windows Performance Issues: Lots of RAM but no Available Memory

        Hi,

        One of my recent posts was polished up enough to appear on the MSPFE blog:

        http://blogs.technet.com/b/mspfe/archive/2012/12/06/lots-of-ram-but-no-available-memory.aspx

        That blog roll is a new initiative within the Premier Field Engineer community to “put our best foot forward”.

        Posts appear covering all the Microsoft technologies supported by PFEs like me, who work every day with our customers to help them resolve their technical issues. I hope it’s useful to you.

      • Messages cannot be sent when Exchange Hub Transport Service runs as NetworkService

        Hi again,

        I think the post title is pretty self-explanatory.

        Just to clarify a little, the customer who hit this problem found that:

        1. A work-around was to run the service as LocalSystem.
        2. Mails between mailboxes on the same server would not be delivered while running the Hub Transport service as NetworkService.
        3. Those messages would move to the Sent Items folder in Outlook running in cached mode and would sit in Drafts and be shown in italics in OWA and Outlook in online mode.
        4. Messages from the internet coming in would always be delivered just fine.

        OK, so what could be causing the problem?

        Well, let’s first define what the 3 built-in security contexts for running services are and how they differ from each other:

         

        Account          Local permissions   Can act on the network
        ---------------  ------------------  ----------------------
        LocalService     Limited             No
        NetworkService   Limited             Yes
        LocalSystem      Full                Yes

         

        So any service running as LocalService cannot use the computer’s identity on the network: it cannot authenticate to domain-joined resources, and networked computers cannot authenticate to it. A service running as NetworkService can do this authentication with remote resources. Both of these accounts have very limited permissions to access files and registry keys on the local system.

        LocalSystem has no restrictions on the local computer and also has the ability to authenticate on the network and have networked computers authenticate with it. This is a bad context to use for the Hub Transport service as it has too much access on the local server.

        So my first thought was that because sending messages works as LocalSystem but not as NetworkService, and both of these accounts support authenticating on the network, the problem wouldn’t be network authentication; it would be a local permission problem. Process Monitor from Sysinternals is a great tool for highlighting missing permissions on local resources.

        We shut down all but one Hub Transport server, and on the remaining one we started the Hub Transport service as NetworkService. We then started Process Monitor with a filter to show only events where Result is ACCESS DENIED, like this:

        image

        But when sending an email from one mailbox to another, it didn’t record anything interesting at all.

        So, back to the drawing board. What about the scenario we excluded at the start: network authentication? What happens during authentication is that the Mailbox server tries to authenticate to the Hub Transport server to let it know there are new messages to process.

        To get started, we needed to know how it authenticates and whether there were any problems during authentication. We looked at the Security event logs on both the Mailbox server and the Hub Transport server, focusing on the time the test message was sent. What we saw were “Audit Failure” events with Event ID 4265. The interesting parts of the event were that Kerberos was attempted, the authenticating SID was NULL and the error was “invalid key”.

        We needed to know which Kerberos tickets were in use for the LocalSystem logon session on the Mailbox server (we know that the Information Store service starts as LocalSystem from here). We ran LogonSessions.exe from Sysinternals and got output like this:

        C:\>logonsessions.exe

        Logonsessions v1.21
        Copyright (C) 2004-2010 Bryce Cogswell and Mark Russinovich
        Sysinternals - www.sysinternals.com


        [0] Logon session 00000000:000003e7:
            User name:    CONTOSO\SERVER-1$
            Auth package: Negotiate
            Logon type:   (none)
            Session:      0
            Sid:          S-1-5-18
            Logon time:   10/10/2012 12:04:25
            Logon server:
            DNS Domain:   contoso.com
            UPN:          SERVER-1$@contoso.com

        [1] Logon session 00000000:0000ae9f:
            User name:
            Auth package: NTLM
            Logon type:   (none)
            Session:      0
            Sid:          (none)
            Logon time:   10/10/2012 12:04:25
            Logon server:
            DNS Domain:
            UPN:

        [2] Logon session 00000000:000003e4:
            User name:    CONTOSO\SERVER-1$
            Auth package: Negotiate
            Logon type:   Service
            Session:      0
            Sid:          S-1-5-20
            Logon time:   10/10/2012 12:04:26
            Logon server:
            DNS Domain:   contoso.com
            UPN:          CONTOS-1$@contoso.com


        [3] Logon session 00000000:000003e5:
            User name:    NT AUTHORITY\LOCAL SERVICE
            Auth package: Negotiate
            Logon type:   Service
            Session:      0
            Sid:          S-1-5-19
            Logon time:   10/10/2012 12:04:26
            Logon server:
            DNS Domain:
            UPN:

        The first entry, [0], is LocalSystem. Next, [1], is an NTLM authentication session. Then [2] is NetworkService, and lastly [3] is LocalService, which has no ability to authenticate on the network. So the logon session ID we want to target is 0x3e7, shown in entry [0] above.

        We then ran klist tickets –li 0x3e7 on the Mailbox server to view the Kerberos service tickets held by the LocalSystem logon session. This service needs a Kerberos ticket which is valid on the Hub Transport server. There was indeed such a service ticket: the encryption type (AES256) was appropriate, as all Exchange servers were running on Windows Server 2008 SP2; the valid date range was correct; and the clocks were in sync. So everything looked OK, and the Mailbox server should have been able to authenticate with the NetworkService logon session on the Hub Transport server. But it couldn’t. Why? Because the key which NetworkService on the Hub Transport server should have used to decrypt the incoming authentication message (the Kerberos service ticket) from the Mailbox server was broken for NetworkService, as explained here:

        http://support.microsoft.com/kb/2566059

        The domain functional level was Windows Server 2003, and there was one Windows Server 2003 DC remaining in the domain, meaning that the pre-authentication key was encrypted using RC4, as the newer AES128 and AES256 types are not understood by 2003 DCs. The first Windows Server 2008 member servers added to the domain were these Exchange 2007 servers. The 2003 DCs started logging errors each time one of these 2008 clients requested a TGT or a service ticket, because the request would ask for AES256, which the 2003 OS didn’t understand. The client would then negotiate down to RC4 and just work; in the meantime, the 2003 DCs logged an error in the System event log about not understanding AES256.

        As a workaround to stop the errors from filling the event logs on the 2003 DCs (and the monitoring application window), they had implemented a registry value on the Exchange servers to force them to always request RC4-encrypted tickets. They found this hint on a third-party user forum.

        We removed this key on the Mailbox servers:

        HKLM\System\CurrentControlSet\Control\Lsa\Kerberos\Parameters\DefaultEncryptionType

        We then removed all the Kerberos tickets which were cached on the Mailbox server using this command:

        klist purge –li 0x3e7

        We then verified that this hotfix was installed on the remaining 2003 DCs, so that they wouldn’t log the errors which had flooded the event viewer and prompted the registry value we removed:

        http://support.microsoft.com/kb/948963

        We couldn’t install the hotfix mentioned in KB2566059 on the Mailbox servers: they were running Windows Server 2008 SP2, and the hotfix was only built and released for Windows Server 2008 R2 and was never back-ported to Windows Server 2008 SP2.

         

         

        As a final note, why did internet messages work? Those messages come from unauthenticated senders on the internet, so no Kerberos authentication is involved in accepting them. Messages travelling from one mailbox to another come from one authenticated user to another, so they depend on authentication working; the unauthenticated messages from the internet simply bypassed the broken step.

         

        I hope this helps someone else in their troubleshooting in the future.

      • SharePoint and SID History not playing well together

        Hi,

        I struck a problem at a customer, and the impact, while it seemed minor on the surface, was actually a big deal for their migration project. In fact, the large team they had assembled to migrate users from one forest to a new forest had stopped while this issue was investigated.

        It relates to SID History and the way Windows queries for and caches name-to-SID and SID-to-name lookups from AD. This cache was causing SharePoint to think that a user logging on was actually a user from the wrong domain, and to create a new identity for that person within SharePoint.

        The scenario is actually very close to this one:

        http://blogs.technet.com/b/rgullick/archive/2010/05/15/sharepoint-people-picker.aspx

        But the workaround that we found would resolve the problem while they were migrating was pretty cool, so I thought I’d save it for all eternity here as a blog.

        It boils down to this:

        The LsaCache stores previously looked-up domain user names and their SIDs. When you ask a DC which holds users carrying both the new SID and the migrated SID at the same time, the DC always links the migrated SID to the new user name, not the old one. If we can artificially fill the LsaCache on our servers with mappings of OLD USERNAME = OLD SID, then we can act as though no resources have been migrated yet.
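The warming idea can be modelled with a tiny cache sketch. This is a toy, not the real LSA cache: the names and SID are invented, and the “directory” collapses SID History into a single lookup table where both the old and migrated names resolve to the old SID.

```python
class LsaCacheModel:
    """A toy name/SID lookup cache: whichever name-to-SID lookup lands
    in the cache first is what later SID-to-name lookups will return."""
    def __init__(self):
        self._by_sid = {}

    def lookup_name(self, name, directory):
        # Name-to-SID query; also caches the reverse SID-to-name mapping.
        sid = directory[name]
        self._by_sid.setdefault(sid, name)   # first writer wins
        return sid

    def lookup_sid(self, sid):
        return self._by_sid.get(sid)

old_sid = "S-1-5-21-1111-2222-3333-1010"
# With SID History, both the old and the migrated account resolve the old SID.
directory = {"CHILD1\\bob": old_sid, "DOMAINB\\bob": old_sid}

cache = LsaCacheModel()
cache.lookup_name("CHILD1\\bob", directory)   # warm the cache with the OLD name first
print(cache.lookup_sid(old_sid))              # returns CHILD1\bob, not DOMAINB\bob
```

Because the warmed entry is never replaced, later lookups of the old SID keep returning the old user name, which is exactly what SharePoint needs during the migration window.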

        Here’s the scenario where users were migrated with SID History from child1.domainA.com to domainB.com

        image 

        1. CHILD1\bob logs onto a workstation in CHILD1 and opens the SPS site in DOMAINB (intranet.domainB.com)
        2. SPS asks IIS, which asks Windows for a local DC to resolve a remote SID: S-1-5-21-[SID_for_CHILD1]-1010
        3. The local DC finds the SID assigned to the migrated user in the global catalog
        4. The local DC returns the account name of the migrated user, DOMAINB\bob
        5. The SPS server adds the result to its LsaCache as a mapping for this SID to the DOMAINB account

        So we can see from the picture above that the LsaCache (the table in the bottom right of the drawing) has a mapping for NEW USERNAME = OLD SID but we want OLD USERNAME = OLD SID

        So, let’s warm up the LsaCache so it looks the way we’d like it to:

        image

        1. SPS constantly runs a script to query for the name CHILD1\bob
        2. The local DC queries its Global Catalog and does NOT have a record for this username
        3. The local DC must do its own LSA query to a DC in the domain CHILD1 for this name
        4. The remote DC in CHILD1 finds the user and replies with the SID: S-1-5-21-[SID_for_CHILD1]-1010
        5. The CHILD1 DC returns this to the DOMAINB DC (the DOMAINB DC caches this result in its own LsaCache)
        6. The local DC returns this result to the SPS server
        7. The SPS server adds this entry to its LsaCache

        Ah ha! Now our cache looks the way we’d like it, where OLD USERNAME = OLD SID. This way when a query for OLD SID is made, the result from cache will return OLD USERNAME.

        image 

        1. CHILD1\bob logs onto a workstation in CHILD1 and opens the SPS site in DOMAINB (intranet.domainB.com)
        2. SPS does NOT ask the local DC for the remote SID, it uses its LsaCache
        3. The LsaCache on SPS replies back with the username which relates to the SID: S-1-5-21-[SID_for_CHILD1]-1010 is CHILD1\bob

        The important step here is the red X where there IS NO STEP: the SharePoint server never talks to the DC to resolve the OLD SID, meaning we rely entirely on the warmed-up cache on the SPS server.

        This relies on the LsaCache on the SPS server ALWAYS having the entry for the SID from the CHILD1 domain matching the CHILD1 username, and never matching the DOMAINB username. The only way to ensure this is:

        1. Constantly query from the SPS server for the name CHILD1\username for every user in DOMAINB which has been migrated from CHILD1 and has its SIDHistory migrated with it. Use a tool which invokes LookupAccountName() to locate the SID for the username: CHILD1\username. LookupAccountName is explained here: http://msdn.microsoft.com/en-us/library/aa379159(v=vs.85). I had access to a private tool which would do these queries for us. I suspect that PsGetSid from Sysinternals would be able to help out here too, but we never tried it.
        2. The LsaCache on SPS must be large enough to ensure that the entries queried are never overwritten by entries from DOMAINB. Set the registry value HKLM\System\CurrentControlSet\Control\Lsa\LsaLookupCacheMaxSize = (DWORD) 0x2000 (8192 decimal). If this value does not exist, the system uses a default cache size of 128 entries, which is overwritten too quickly on the busy SPS servers. 8192 entries on a pair of load-balanced servers should be able to hold all SIDs for all users accessing the SPS site in the two forests (if your forest has more users, you’ll need to increase this).
        3. This is a workaround. The real fix is for users migrated from CHILD1.domainA.com to domainB.com with SIDHistory to use their migrated accounts immediately. After the migration, their CHILD1 accounts should be disabled or deleted, and SIDHistory should be removed from the DOMAINB accounts. Operationally this is a very difficult action, as it does not allow an easy testing or roll-back path.

        To view the actions as they are performed by LSA Lookups, add these 2 DWORDs to the registry under HKLM\System\CurrentControlSet\Control\Lsa\:

        • LspDbgTraceOptions = 0x1 (1 means “log to a file”, the file is C:\Windows\Debug\Lsp.log)
        • LspDbgInfoLevel = 0x88888888 (all 8‘s in hex means “log as verbose as possible”)
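For convenience, the two tracing values above can be set from a .reg file; this is a sketch of the same settings just described, not an additional configuration.

```reg
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa]
"LspDbgTraceOptions"=dword:00000001
"LspDbgInfoLevel"=dword:88888888
```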

        These keys are explained here:

        http://technet.microsoft.com/en-us/library/ff428139(v=ws.10).aspx

        So, all in all, a little complicated, but the workaround of increasing LsaLookupCacheMaxSize and constantly running a script on the SPS server to query the SIDs for usernames in CHILD1 (filtered to only the users which had been migrated to domainB) worked well for the customer.

      • Upgrading the ADMX Central Store files from Windows 7/2008R2 to Windows 8/2012

        ##############################

        ###   UPDATE (22 March 2013)   ###

        The ADMX and ADML files for Windows 8 and Windows Server 2012 are now available as a separate download. This includes 185 ADMX files, and is the complete set of all ADMX files for these OSes. Please use this download instead of the instructions in this post to create your super-set of updated ADMX/ADML files.

        http://www.microsoft.com/en-us/download/details.aspx?id=36991

        ##############################

         

        Hi,

         

        A while back I posted something similar regarding upgrading the PolicyDefinitions folder in SYSVOL from Windows Vista and Windows Server 2008 set of ADMX/ADML files to their newer versions in Windows 7 and Windows Server 2008 R2. That post is here.

         

        Well, it’s now time to move that on as Windows 8 and Windows Server 2012 are now out.

        First off, all ADMX/ADML files have had their dates updated. I didn’t check whether the contents of every file changed, but it’s probably best to assume they all have and update all of them.

        One of them (“InputPersonalization.admx”) has been removed since Windows 7. It controlled one setting, which has been moved into the larger ControlPanel.admx, meaning this admx/adml pair can be deleted once the newer ControlPanel.admx file is copied to the PolicyDefinitions folder.

        Windows 8 and Windows Server 2012 offer a range of new features (he says, putting it mildly), and there are new admx/adml files for these. So make sure you include them in your update.

        ADMX/ADML files new in Windows 8 and Windows Server 2012

         

        AppxPackageManager.admx
        AppXRuntime.admx
        DeviceCompat.admx
        DeviceSetup.admx
        EAIME.admx
        EdgeUI.admx
        EncryptFilesonMove.admx
        FileServerVSSAgent.admx
        FileServerVSSProvider.admx
        hotspotauth.admx
        LocationProviderAdm.admx
        msched.admx
        NCSI.admx
        NetworkIsolation.admx
        Printing2.admx
        Servicing.admx
        SettingSync.admx
        srm-fci.admx
        StartMenu.admx
        WCM.admx
        WinStoreUI.admx
        wlansvc.admx
        WPN.admx
        wwansvc.admx

        As with the previous operating systems, there are some admx/adml files which exist on the server SKU which do not also exist on the client SKU, and vice versa:

        ADMX/ADML files which exist on Windows Server 2012 but do NOT exist on Windows 8

        adfs.admx
        FileServerVSSAgent.admx
        GroupPolicy-Server.admx
        MMCSnapIns2.admx
        NAPXPQec.admx
        PswdSync.admx
        Snis.admx
        TerminalServer-Server.admx
        WindowsServer.admx

         

        ADMX/ADML files which exist on Windows 8 but do NOT exist on Windows Server 2012

        DeviceRedirection.admx
        sdiagschd.admx

        And the easy way to get all the possible ADMX/ADML files for a particular OS, without installing all the roles and features, is simply to copy them out of the winsxs directory (replace en-US in the commands below if your OS is installed in a language other than English). Here is a sample set of commands which can do this for you; you’d need to run them on both a Windows 8 and a Windows Server 2012 computer to capture all possible admx/adml files. Note that they are written for an interactive command prompt; inside a .cmd script, double the percent signs (%%i).

        cd /d %windir%\winsxs
        dir *.admx /s /b > %USERPROFILE%\Desktop\admx.txt
        dir *.adml /s /b | find /i "en-us" > %USERPROFILE%\Desktop\adml_en-us.txt


        mkdir %USERPROFILE%\Desktop\PolicyDefinitions
        mkdir %USERPROFILE%\Desktop\PolicyDefinitions\en-US
        FOR /F %i IN (%USERPROFILE%\Desktop\admx.txt) DO copy %i %USERPROFILE%\Desktop\PolicyDefinitions\
        FOR /F %i IN (%USERPROFILE%\Desktop\adml_en-us.txt) DO copy %i %USERPROFILE%\Desktop\PolicyDefinitions\en-US\

        I hope that helps you with your admx/adml upgrade.

         

        Craig

      • Using Delegation in Scheduled Tasks

        This blog is about the ability in Windows 7 and Windows Server 2008 R2 to apply a SID to every scheduled task and use that SID to apply permissions elsewhere in the Operating System.

        Services have had this feature since Vista. The idea is the same: take the simple name of the service (or, in the case of scheduled tasks in 7/R2, the path to the scheduled task) and compute a predictable SID based on that name. Have a look at the permissions applied to C:\Windows\System32\LogFiles\Firewall to see this in action. On the permissions of this folder, there is an ACE for a “group” called MpsSvc, which is the short name for the Windows Firewall service. In this way, even though the service is set to start as “Local Service”, other services running as that same account cannot see into the firewall logs; only the firewall service itself has access.

        So every scheduled task can have a SID computed for it – this new feature is described here:

        http://msdn.microsoft.com/en-us/library/ee695875(v=vs.85).aspx

        And the way to locate the predictable SID for a given task name is to run:

        schtasks /showsid /TN "TaskName"
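For service SIDs (S-1-5-80), the derivation is documented: SHA-1 of the upper-cased UTF-16LE name, split into five 32-bit sub-authorities. As a sketch, and assuming the NT TASK SIDs under S-1-5-87 follow the same scheme over the task path (an assumption worth checking against schtasks /showsid), the calculation looks like this in Python:

```python
import hashlib
import struct

def virtual_account_sid(name, base="S-1-5-80"):
    """Derive a deterministic SID from a name: SHA-1 of the upper-cased
    UTF-16LE string, unpacked into five little-endian 32-bit sub-authorities
    appended to the base SID. This is the documented NT service SID scheme;
    applying it to task paths under S-1-5-87 is an assumption."""
    digest = hashlib.sha1(name.upper().encode("utf-16-le")).digest()
    subauthorities = struct.unpack("<5I", digest)
    return base + "".join("-%d" % part for part in subauthorities)

# The Windows Firewall service short name from the example above:
print(virtual_account_sid("MpsSvc"))
# Hypothetical task path, using the NT TASK base SID:
print(virtual_account_sid(r"\MyFolder\MyTask", base="S-1-5-87"))
```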

        With this SID, you can now assign permissions to resources. For example, you could use icacls to apply permissions to a folder; below we are granting an NT TASK (the SID starts with S-1-5-87) modify permission on the folder C:\SomeFolder:

        icacls C:\SomeFolder /grant *S-1-5-87-xxxx-yyyy-zzzz:(M)

        Now, if you go and set up your task in the GUI and run these commands, you will see icacls report back:

        No mapping between account names and security IDs was done.

        What went wrong?

        First you need to make sure that the scheduled task is configured for Windows 7 or Windows Server 2008 R2 and is using either “Network Service” or “Local Service”:

        image

        Then you need to make the task use the Unified Scheduling Engine so that it registers the SID with the list of “well known SIDs” for the system. But there is no check-box for this setting, and it is disabled by default. What to do?

        Export your task as an XML file, locate the line which reads:

        <UseUnifiedSchedulingEngine>False</UseUnifiedSchedulingEngine>

        And change that “False” to “True”:

        <UseUnifiedSchedulingEngine>True</UseUnifiedSchedulingEngine>

        With that changed, remove your task and import the XML file you modified above.

        Because the SID is a predictable calculation based on the path to the task, so long as you recreate the task with the same name and in the same folder, the SID will remain the same, your icacls command will now work as expected, and only that scheduled task will have access to the file or folder you specify.

        The Unified Scheduling Engine leverages the Unified Background Process Manager (UBPM), which is described further here:

        http://blogs.technet.com/b/askperf/archive/2009/10/04/windows-7-windows-server-2008-r2-unified-background-process-manager-ubpm.aspx

      • Getting error 0xC004F074 when activating against KMS server

        Hi

        This error code is a very generic output for a KMS client having problems activating. To view your output, run slmgr.vbs -ato

        When troubleshooting this problem, we checked the following details – if any were a problem, they would generate this error code:

        • DNS record not published at _VLMCS._tcp.client.domain. This must be an SRV record, which the KMS server will automatically register. If you have disabled this, you will need to use a GPO on the clients to point them to the correct KMS server to use
        • No network connectivity from the KMS client to the KMS server on the KMS port (tcp/1688 by default). Install telnet on the client and run telnet KMS.Server.Name 1688 and make sure the screen goes blank
        • No more than 4 hours of time difference between the KMS server and the client. Check that your time zones are correct. If using Server Core, run timedate.cpl
        • No major hardware changes to the KMS server. If the KMS server is a VM and you have added a number of new devices, CPUs, memory etc, or P2V’ed it, you will need to reactivate your KMS license
        • The KMS service must have the keys to issue for the KMS client requesting a license. For example, if the KMS server is Windows Server 2008 and you are trying to activate Windows 7, you will need an update installed on the KMS server AND the correct KMS key for Windows 7.
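The connectivity check in the second bullet can be scripted rather than done with telnet. A minimal Python sketch (the host name below is a placeholder for your own KMS server):

```python
import socket

def kms_port_open(host, port=1688, timeout=5):
    """Return True if a TCP connection to the KMS port succeeds -- the same
    reachability test as 'telnet host 1688' making the screen go blank."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example usage (hypothetical server name):
# kms_port_open("KMS.Server.Name")
```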

        And the last one was the problem we hit: our KMS server was Windows Server 2008 R2, the same as the KMS clients. We’d crossed the activation threshold of 5 servers, but they still would not activate. The problem was that there are different license “channels” for Windows Server. They are described here: http://technet.microsoft.com/en-us/library/ff793411.aspx

        Our servers which were having problems activating were all Windows Server 2008 R2 Datacenter Edition, and we had a “B Channel” KMS license installed on the KMS server.

        We followed these steps on the KMS server to install the correct channel license:

        1. slmgr -upk
        2. slmgr -cpky
        3. slmgr -ipk <KMS Host Product Key - channel C>
        4. slmgr -ato

        After doing the above, we ran slmgr -ato on the Windows Server 2008 R2 Datacenter Edition servers. Note that “Channel C” is able to activate all lower-level channels.

      • Cannot bring Cluster Name resource online

        Hi,

        Another quick post with a not-very-obvious solution, this time on a new Windows Server 2008 R2 cluster.

        The case went like this:

        • The OSes of the nodes were built according to the security requirements of the customer
        • We added the Failover Clustering feature and attempted to create a new cluster while running the wizard as a member of Domain Admins who has Administrator permissions on all the nodes
        • The computer account in the domain was created for the Cluster Name Object (CNO), the account ‘SELF’ had full control
        • The wizard completed fine and the summary report showed no problems
        • The Cluster Name resource couldn’t come online
        • On the nodes the event ID 1206 was logged, which said:
          • Cluster network name resource 'Cluster Name' cannot be brought online. The computer object associated with the resource could not be updated in domain 'domain.name'. The error code was 'Unable to find computer account on DC where it was created'. The cluster identity 'CLUSTER01$' may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain
        • More confusing still, in the security log of the DC, there were “Kerberos pre-authentication failed” errors for the CNO’s computer account, indicating that the wrong password was being used

        The problem turned out to be that the built-in group “Authenticated Users” had been removed from the built-in group “Users” on the OS of each of the nodes. The customer didn’t want to add “Authenticated Users” back into this group as that would have granted too many accounts too many rights. The work-around we put in was to create a domain group and nest the newly created CNO into this group. This group was placed into the “Users” built in group on all the cluster nodes. In this way, the CNO now has membership in the built-in group “Users” on each of the nodes.

        We needed to reboot all of the nodes before this change would take effect.

         

        I hope this helps someone out there.

      • Delegating access in AD to BitLocker recovery information

        Normally in AD, all attributes are readable by “Authenticated Users”. Some attributes should inherit permissions but should not be readable by “just anyone”. To protect attributes like this, they can be marked as “confidential”.

        There are 3 attributes relating to BitLocker which are marked in the schema as “confidential”.

        This is done by marking the searchFlags attribute as enabled for bit 7 (128 decimal) in the schema where the attribute is defined. See here for more information on searchFlags: http://support.microsoft.com/kb/922836
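When inspecting an attribute’s schema definition, the confidentiality test on searchFlags is just a bitwise AND against that bit. A trivial sketch:

```python
CONFIDENTIAL_BIT = 1 << 7  # bit 7 = 128 decimal, the confidentiality flag

def is_confidential(search_flags):
    """True if the searchFlags value read from a schema attribute
    definition has the confidentiality bit (128) set."""
    return bool(search_flags & CONFIDENTIAL_BIT)

print(is_confidential(128))  # prints True
print(is_confidential(19))   # prints False -- bit 7 is clear
```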

        These attributes are:

        Attribute                  Applies to Object              Used for
        msTPM-OwnerInformation     computer                       Contains the owner information of a computer’s TPM.
        msFVE-KeyPackage           msFVE-RecoveryInformation      Contains a volume’s BitLocker encryption key, secured by the corresponding recovery password.
        msFVE-RecoveryPassword     msFVE-RecoveryInformation      Contains a password that can recover a BitLocker-encrypted volume.

        An object of type “msFVE-RecoveryInformation” is created for every encrypted volume and is stored as a child object of the computer object where the volume was encrypted.

        Simply granting “read” access to these attributes will not allow a user to read the information in these attributes. A user who wants to read the attribute must also have an Access Mask for “Control_Access”. This is a special type of ACE (Access Control Entry). See here for more information on Access Masks: http://msdn.microsoft.com/en-us/library/aa374896(v=vs.85).aspx

        GUI Tool to Manage Permissions

        The only GUI tool which can set and view these special Control_Access ACEs is LDP.exe (using the version from Windows Server 2003 R2 ADAM or newer). This is shown below:

        The "Control_Access" flag is needed in ADDITION to the normal "Read Propery" right. The "Control_Access" flag gets you past the confidentiality bit. You still need to be able to read the contents of the attribute.

        Apply the permission once at the top of EACH DOMAIN where you need to delegate access to the recovery information of BitLocker volumes. Usually this does not include forest root domains or resource forests. Ensure the “inheritance” box is checked on each ACE so that it propagates to every msFVE-RecoveryInformation or Computer object and only to its relevant attributes.

        (Note from Ryan's comment below: you can apply this permission anywhere in the OU structure if you'd like to split the delegation between groups, e.g. Help Desk users can access the keys for standard workstations and the Server Admins can access the keys for servers etc. You could apply the "Read Property" ACE at the top of the domain to a super-group for everyone who is allowed to access the keys, and then have different groups able to use the "Control_Access" flag for their particular OUs. This will help limit ACE bloat in the lsass.exe working set while still locking down the keys in the way you'd expect.)

        Here are sample scripts to add the "Control_Access" flag to the top of the domain:

        DelegateBitLocker.vbs

        Taken from: http://technet.microsoft.com/en-us/library/cc771778(WS.10).aspx

        'To refer to other groups, change the group name (ex: change to "DOMAIN\Help Desk Staff")

        strGroupName = "BitLocker Recoverers"

        ' -----------------------------------------------------------

        ' Access Control Entry (ACE) constants

        ' -----------------------------------------------------------

        '- From the ADS_ACETYPE_ENUM enumeration

        Const ADS_ACETYPE_ACCESS_ALLOWED_OBJECT = &H5 'Allows an object to do something

        '- From the ADS_ACEFLAG_ENUM enumeration

        Const ADS_ACEFLAG_INHERIT_ACE = &H2 'ACE applies to target and inherited child objects

        Const ADS_ACEFLAG_INHERIT_ONLY_ACE = &H8 'ACE does NOT apply to target (parent) object

        '- From the ADS_RIGHTS_ENUM enumeration

        Const ADS_RIGHT_DS_CONTROL_ACCESS = &H100 'The right to view confidential attributes

        Const ADS_RIGHT_DS_READ_PROP = &H10 ' The right to read attribute values

        '- From the ADS_FLAGTYPE_ENUM enumeration

        Const ADS_FLAG_OBJECT_TYPE_PRESENT = &H1 'Target object type is present in the ACE

        Const ADS_FLAG_INHERITED_OBJECT_TYPE_PRESENT = &H2 'Target inherited object type is present in the ACE

        ' -----------------------------------------------------------

        ' BitLocker schema object GUID's

        ' -----------------------------------------------------------

        '- ms-FVE-RecoveryInformation object:

        ' includes the BitLocker recovery password and key package attributes

        SCHEMA_GUID_MS_FVE_RECOVERYINFORMATION = "{EA715D30-8F53-40D0-BD1E-6109186D782C}"

        '- ms-FVE-RecoveryPassword attribute: 48-digit numerical password

        SCHEMA_GUID_MS_FVE_RECOVERYPASSWORD = "{43061AC1-C8AD-4CCC-B785-2BFAC20FC60A}"

        '- ms-FVE-KeyPackage attribute: binary package for repairing damages

        SCHEMA_GUID_MS_FVE_KEYPACKAGE = "{1FD55EA8-88A7-47DC-8129-0DAA97186A54}"

        '- Computer object

        SCHEMA_GUID_COMPUTER = "{BF967A86-0DE6-11D0-A285-00AA003049E2}"

        'Reference: "Platform SDK: Active Directory Schema"

        ' -----------------------------------------------------------

        ' Set up the ACE to allow reading of all BitLocker recovery information properties

        ' -----------------------------------------------------------

        Set objAce1 = createObject("AccessControlEntry")

        objAce1.AceFlags = ADS_ACEFLAG_INHERIT_ACE + ADS_ACEFLAG_INHERIT_ONLY_ACE

        objAce1.AceType = ADS_ACETYPE_ACCESS_ALLOWED_OBJECT

        objAce1.Flags = ADS_FLAG_INHERITED_OBJECT_TYPE_PRESENT

        objAce1.Trustee = strGroupName

        objAce1.AccessMask = ADS_RIGHT_DS_CONTROL_ACCESS + ADS_RIGHT_DS_READ_PROP

        objAce1.InheritedObjectType = SCHEMA_GUID_MS_FVE_RECOVERYINFORMATION

        ' Note: ObjectType is left blank above to allow reading of all properties

        ' -----------------------------------------------------------

        ' Connect to Discretional ACL (DACL) for domain object

        ' -----------------------------------------------------------

        Set objRootLDAP = GetObject("LDAP://rootDSE")

        strPathToDomain = "LDAP://" & objRootLDAP.Get("defaultNamingContext") ' e.g. string dc=fabrikam,dc=com

        Set objDomain = GetObject(strPathToDomain)

        WScript.Echo "Accessing object: " + objDomain.Get("distinguishedName")

        Set objDescriptor = objDomain.Get("ntSecurityDescriptor")

        Set objDacl = objDescriptor.DiscretionaryAcl

        ' -----------------------------------------------------------

        ' Add the ACEs to the Discretionary ACL (DACL) and set the DACL

        ' -----------------------------------------------------------

        objDacl.AddAce objAce1

        objDescriptor.DiscretionaryAcl = objDacl

        objDomain.Put "ntSecurityDescriptor", Array(objDescriptor)

        objDomain.SetInfo

        WScript.Echo "SUCCESS!"

         

        DelegateTPMOwners.vbs

        Taken from: http://technet.microsoft.com/en-us/library/cc771778(WS.10).aspx

        'To refer to other groups, change the group name (ex: change to "DOMAIN\TPM Owners")

        strGroupName = "TPM Owners"

        ' ------------------------------------------------------------

        ' Access Control Entry (ACE) constants

        ' ------------------------------------------------------------

        '- From the ADS_ACETYPE_ENUM enumeration

        Const ADS_ACETYPE_ACCESS_ALLOWED_OBJECT = &H5 'Allows an object to do something

        '- From the ADS_ACEFLAG_ENUM enumeration

        Const ADS_ACEFLAG_INHERIT_ACE = &H2 'ACE applies to target and inherited child objects

        Const ADS_ACEFLAG_INHERIT_ONLY_ACE = &H8 'ACE does NOT apply to target (parent) object

        '- From the ADS_RIGHTS_ENUM enumeration

        Const ADS_RIGHT_DS_CONTROL_ACCESS = &H100 'The right to view confidential attributes

        Const ADS_RIGHT_DS_READ_PROP = &H10 ' The right to read attribute values

        '- From the ADS_FLAGTYPE_ENUM enumeration

        Const ADS_FLAG_OBJECT_TYPE_PRESENT = &H1 'Target object type is present in the ACE

        Const ADS_FLAG_INHERITED_OBJECT_TYPE_PRESENT = &H2 'Target inherited object type is present in the ACE

        ' ------------------------------------------------------------

        ' TPM and FVE schema object GUID's

        ' ------------------------------------------------------------

        '- ms-TPM-OwnerInformation attribute: SHA-1 hash of the TPM owner password

        SCHEMA_GUID_MS_TPM_OWNERINFORMATION = "{AA4E1A6D-550D-4E05-8C35-4AFCB917A9FE}"

        '- Computer object

        SCHEMA_GUID_COMPUTER = "{BF967A86-0DE6-11D0-A285-00AA003049E2}"

        'Reference: "Platform SDK: Active Directory Schema"

        ' ------------------------------------------------------------

        ' Set up the ACE to allow reading of TPM owner information

        ' ------------------------------------------------------------

        Set objAce1 = createObject("AccessControlEntry")

        objAce1.AceFlags = ADS_ACEFLAG_INHERIT_ACE + ADS_ACEFLAG_INHERIT_ONLY_ACE

        objAce1.AceType = ADS_ACETYPE_ACCESS_ALLOWED_OBJECT

        objAce1.Flags = ADS_FLAG_OBJECT_TYPE_PRESENT + ADS_FLAG_INHERITED_OBJECT_TYPE_PRESENT

        objAce1.Trustee = strGroupName

        objAce1.AccessMask = ADS_RIGHT_DS_CONTROL_ACCESS + ADS_RIGHT_DS_READ_PROP

        objAce1.ObjectType = SCHEMA_GUID_MS_TPM_OWNERINFORMATION

        objAce1.InheritedObjectType = SCHEMA_GUID_COMPUTER

        ' ------------------------------------------------------------

        ' Connect to Discretional ACL (DACL) for domain object

        ' ------------------------------------------------------------

        Set objRootLDAP = GetObject("LDAP://rootDSE")

        strPathToDomain = "LDAP://" & objRootLDAP.Get("defaultNamingContext") ' e.g. string dc=fabrikam,dc=com

        Set objDomain = GetObject(strPathToDomain)

        WScript.Echo "Accessing object: " + objDomain.Get("distinguishedName")

        Set objDescriptor = objDomain.Get("ntSecurityDescriptor")

        Set objDacl = objDescriptor.DiscretionaryAcl

        ' ------------------------------------------------------------

        ' Add the ACEs to the Discretionary ACL (DACL) and set the DACL

        ' ------------------------------------------------------------

        objDacl.AddAce objAce1

        objDescriptor.DiscretionaryAcl = objDacl

        objDomain.Put "ntSecurityDescriptor", Array(objDescriptor)

        objDomain.SetInfo

        WScript.Echo "SUCCESS!"

        And this script can help pull the assigned ACEs out to show you who has been delegated access: http://gallery.technet.microsoft.com/ScriptCenter/0bd4af9e-968a-4ae6-9950-2b2450afda37/

      • Getting prompted for credentials when accessing read-only Office files on SharePoint 2003 from Windows Vista or Windows 7 with Office 2007 or Office 2010

        I faced this problem recently at a customer.

        They had pure Windows XP with Office 2003 deployed to their clients. These clients were accessing a SharePoint 2003 site. When they started deploying new Windows 7 clients with Office 2007, they found that when the users clicked on links to Office files to which they had read-only permissions, they would get prompted to enter credentials, but entering credentials didn’t work. If they hit Cancel or Escape, the prompt would disappear and the file would open as expected.

        Being a good PFE, the first place I started was with Network Monitor traces. I was looking for any strange “access denied” messages, authentication attempts with mismatched methods, bad HTTP redirections, DNS problems, that sort of thing.

        Here’s what I found:

        • WebDAV is the agent which is trying to open the file, and is failing
        • By hitting cancel or escape you are telling the WebClient service to give up and fall back to HTTP
        • The WebDAV client which came with Office 2003 is not available on Office 2007 on Vista and newer – the OS has a WebDAV client called WebClient
          • This built-in WebDAV client doesn’t behave the same way as the Office 2003 extension does, and is too difficult to change
        • Stopping the WebClient service would avoid the prompt, as WebDAV is no longer being used. But now editing files on SharePoint 2003 is no longer possible
        • This is not a problem on versions of SharePoint which are newer than SPS 2003
        • The site’s short name and FQDN were in the list of Trusted Sites in Internet Explorer
        • The security settings for IE allow automatic logon to sites in Trusted Sites
          • This means that Protected mode is avoided as IE can pass through authorisation to Office. Intranet Sites cannot do this apparently
        • They have a valid proxy server configured, but the short name for the SharePoint site and the FQDN are added to the proxy exception list
        • Using the registry key for AuthForwardServerList discussed here and here didn’t help

        So what is going on?

        WebClient is trying to take a write lock on the file. But the file is read-only to the user, so this fails. We see 4 requests to GET the file, each one has a reply which says “unauthorized”:

        WedDAVFailures

        Then I found this article:

        http://support.microsoft.com/kb/955375

        This says that by setting the registry value UseWinINETCache = 1 you instruct Office to always open web-based files as read-only. Files on SharePoint sites which you need to edit will also be opened read-only, so editing will fail. To work around this limitation you must do one of the following when editing a file:

        • Before opening the file, use the “Check Out” feature of SharePoint
        • Use the drop-down list for a file and choose “Edit in Microsoft Office Word/Excel”
        • Save the file as a different name

        Note this limit applies to ALL web-based files opened by Office, even those on SharePoint 2007 and 2010, which do not experience this problem. Therefore, this is only a work-around until you are able to upgrade your SharePoint 2003 sites to 2007 or 2010. Note that Internet Explorer 8 is NOT a supported browser when accessing SharePoint 2003, for this reason and others.

      • Preventing accidental removal from the domain

        Hi, another juicy customer question with a cool solution. The problem is this: on all workstations, the built-in Administrator account is disabled. Restricted Groups are used to populate the group “Built-in\Administrators” with domain groups. No “back-door” local administrator accounts exist. So, when the desktop support team is troubleshooting connectivity problems with machines, they may remove the computer from the domain. Once they do this, there is no way to log back on. Things were complicated even further as the workstations involved had their install partition encrypted with BitLocker, meaning any data on the workstation is also lost.

        So, without exposing a back-door account, we needed a way to prevent the workstation administrator from removing a workstation from the domain, but not in a “permanent” way – slow them down in such a way that undoing the prevention should remind them to add a temporary backdoor account before removing from the domain. And here’s what we came up with:

        We ran Sysinternals Process Monitor while removing the computer from the domain to see what changes are made to the system. If we can set a “deny” to the first action, then all the other actions will also be prevented from happening.

        The first change that happens to the system during a removal from the domain is to set the “Netlogon” service to start up as “manual” instead of “automatic”. So by setting a deny in the registry for the user “SYSTEM”, we prevent the first action. But the remove-from-the-domain process is clever: if it detects that it cannot change the value, it will attempt to take ownership of the registry key. So we also need to prevent modifying the owner. All of that looks like this:

        RegDeny-DomainDisjoin

        So when the workstation administrator attempts to remove the computer from the domain, they hit an immediate “access denied” and all further processing stops. Now the administrator must remove the explicit denies and that will hopefully be enough of a reminder to add a local administrator before rebooting the machine after removing it from the domain.

        Please note that this is not a supported or recommended method to perform this job, but it did fit well for the customer who was in that tricky situation.

      • Backing up and Restoring Domain-Based DFS Namespaces

        I had a question from a customer recently which needed some investigation, as the seemingly “easy steps” to export and import DFSN configurations didn’t do what either of us expected.

        KB969382 lists the actions to take in the event of your DFS Namespace going west. Option 2 was the one we were looking at as we wanted to create regular DFS-N backups to be used in any DFS-N related emergency.

        It seemed simple enough, run this command to backup your configuration:

        dfsutil root export \\domain.name\DFSN DFSN-root.txt

        And when disaster strikes, just run this command to put it all back again:

        dfsutil root import set DFSN-Root.txt \\domain.name\DFSN

        However, no matter what DFS-N emergency we created in the lab, the import would always fail, citing “element not found”.

        The problem was that we were breaking the DFS-N root (on purpose), but the export/import scenario requires you to have a working DFS-N root. To get that, you’d need good system-state backups of both a DC and a DFS Namespace server, which isn’t going to provide a fast, efficient restore scenario in a large organisation.

        So I started experimenting, and it seems that the objects in AD are easily copied and imported again using ldifde – there is no attachment to the object GUIDs (like there is, say, in a failover cluster). And once all the objects are back in AD, all the links and targets start working again as expected.

        The same applies to the share and DFSN root information in the registry – a simple ‘reg save’ followed by a ‘reg restore’ will get that information back with the registry ACLs intact.

        So, I wrote 2 scripts (each fires off a second script to run directly on the DFS Namespace servers):

        1. The scheduled backup script deployed as a scheduled task to a central management server
        2. A restore script to be used in the case of any DFS-N emergency

        Now, while the restore could be more targeted to allow you to choose the scenario to recover from (e.g. restore ONLY the objects in AD, or the DFS-N registry information on just 1 DFS-N server, or only one DFS-N root), I’ll leave it to you, good reader, to add that intelligence. This restore script restores the entire DFS-N configuration for all roots and to all DFS-N servers.

        This will back up and restore both Windows 2000/2003 roots and Windows Server 2008 roots. It uses psexec from Sysinternals, available here. The reason it does this is to use reg save/reg restore, which capture the ACLs on the registry keys and restore exactly the configuration which was backed up, rather than merging the configuration. While my testing shows that these reg keys do not have explicit permissions defined, it’s better safe than sorry.

        Make sure to change any instances of “dc=domain,DC=name” and “\\domain.name” to the domain name in your environment.

        Backup

        Main Job

        rem Setup input file

        if not exist .\backup-files mkdir .\backup-files
        if exist root-servers.txt del root-servers.txt

        setlocal

        if exist servers.txt del servers.txt
        dsquery * "CN=DFS-Configuration,CN=System,DC=domain,dc=name" -filter "(|(objectClass=fTDfs)(objectClass=msDFS-Namespacev2))" -attr name > allRoots.txt
        for /F "tokens=1-3 skip=1 delims= " %%i IN (allRoots.txt) DO (
                dfsutil root \\domain.name\%%i | find /i "target" | find /i "%%i" >> %%i-serversRAW.txt
                for /F "tokens=2 delims=\" %%u IN (%%i-serversRAW.txt) do  echo %%i;%%u >> root-servers.txt
                del %%i-serversRAW.txt
                )

        for /F "tokens=1,2 delims=; " %%i IN (root-servers.txt) DO echo %%j>> serversRAW.txt
        sort serversRAW.txt /O serversSORTED.txt
        for /F "Tokens=*" %%s in ('type serversSORTED.txt') do set record=%%s&call :output

        del serversRAW.txt
        del serversSORTED.txt

        endlocal
         

        rem Backup

        for /F %%i IN (servers.txt) DO (
            if not exist \\%%i\c$\temp mkdir \\%%i\c$\temp
            copy .\NSserverBackup.bat \\%%i\c$\temp /Y
            psexec \\%%i C:\temp\NSserverBackup.bat
            copy \\%%i\c$\TEMP\%%i-dfsroots.hiv .\backup-files\%%i-dfsroots.hiv /Y
            copy \\%%i\c$\TEMP\%%i-CCS-shares.hiv .\backup-files\%%i-CCS-shares.hiv /Y
            copy \\%%i\c$\TEMP\%%i-CS1-shares.hiv .\backup-files\%%i-CS1-shares.hiv /Y
            )

        ldifde -f .\backup-files\dfs-export.ldf -v -d "CN=Dfs-Configuration,CN=System,DC=domain,dc=name" -l objectClass,remoteServerName,pKTGuid,pKT,msDFS-SchemaMajorVersion,msDFS-SchemaMinorVersion,msDFS-GenerationGUIDv2,msDFS-NamespaceIdentityGUIDv2,msDFS-LastModifiedv2,msDFS-Propertiesv2,msDFS-TargetListv2,msDFS-Ttlv2,msDFS-LinkPathv2,msDFS-LinkSecurityDescriptorv2,msDFS-Ttlv2,msDFS-Commentv2,msDFS-ShortNameLinkPathv2,msDFS-LinkIdentityGUIDv2 > .\backup-files\ldf-export.log

        goto :EOF

        :output
        if not defined previous_record goto write
        if "%record%" EQU "%previous_record%" goto :EOF

        :write
        @echo %record%>>servers.txt
        set previous_record=%record%
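The SORT plus the :output/:write subroutine at the end of the main job are simply building a sorted, de-duplicated server list. For reference, the same operation sketched in Python:

```python
def unique_servers(lines):
    """Return the sorted, de-duplicated server list -- what the batch job
    writes to servers.txt via SORT and the :output subroutine."""
    return sorted({line.strip() for line in lines if line.strip()})

print(unique_servers(["FILE2", "FILE1", "FILE2", ""]))  # ['FILE1', 'FILE2']
```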

        NSserverBackup.bat

        C:
        cd \
        cd temp

        reg save HKLM\Software\Microsoft\Windows\DFS\Roots C:\TEMP\%COMPUTERNAME%-dfsroots.hiv /y
        reg save HKLM\System\CurrentControlSet\Services\lanmanserver\shares C:\temp\%COMPUTERNAME%-CCS-shares.hiv /y
        reg save HKLM\System\ControlSet001\Services\lanmanserver\shares C:\temp\%COMPUTERNAME%-CS1-shares.hiv /y

        The main backup job copies NSserverBackup.bat to each Namespace server and runs it from there.

         

        Restore

        Main Job

        rem Check input files

        if not exist allRoots.txt goto :EOF
        if not exist servers.txt goto :EOF


        rem clean up before restore

        dsquery * "CN=DFS-Configuration,CN=System,DC=DC=domain,dc=name" -filter "(|(objectClass=fTDfs)(objectClass=msDFS-NamespaceAnchor))" | dsrm -q -subtree -noprompt
        for /F %%i IN (servers.txt) DO (
                       reg delete \\%%i\HKLM\Software\Microsoft\Windows\DFS\Roots /f
                       reg delete \\%%i\HKLM\System\CurrentControlSet\Services\lanmanserver\shares /f
                       reg delete \\%%i\HKLM\System\ControlSet001\Services\lanmanserver\shares /f
                       reg add \\%%i\HKLM\Software\Microsoft\Windows\DFS\Roots /f
                       reg add \\%%i\HKLM\System\CurrentControlSet\Services\lanmanserver\shares /f
                       reg add \\%%i\HKLM\System\ControlSet001\Services\lanmanserver\shares /f
                       )

        rem restore

        ldifde -i -f .\backup-files\dfs-export.ldf -k -v > .\backup-files\dfs-import.log
        for /F %%i IN (servers.txt) DO (
                copy .\backup-files\%%i-dfsroots.hiv \\%%i\c$\temp\%%i-dfsroots.hiv /Y
                copy .\backup-files\%%i-CCS-shares.hiv \\%%i\c$\temp\%%i-CCS-shares.hiv /Y
                copy .\backup-files\%%i-CS1-shares.hiv \\%%i\c$\temp\%%i-CS1-shares.hiv /Y
            copy .\NSserverRestore.bat \\%%i\c$\temp\NSserverRestore.bat /Y
            copy .\allRoots.txt \\%%i\c$\temp\allRoots.txt /Y
            )
        psexec @servers.txt C:\temp\NSserverRestore.bat

        NSserverRestore.bat

        reg restore HKLM\Software\Microsoft\Windows\DFS\Roots C:\temp\%COMPUTERNAME%-dfsroots.hiv
        reg restore HKLM\System\CurrentControlSet\Services\lanmanserver\shares C:\temp\%COMPUTERNAME%-CCS-shares.hiv
        reg restore HKLM\System\ControlSet001\Services\lanmanserver\shares C:\temp\%COMPUTERNAME%-CS1-shares.hiv
        for /F "tokens=1-3 skip=1 delims= " %%i IN (allRoots.txt) DO dfsutil root forcesync \\domain.name\%%i
        net stop dfs && net start dfs

        The main restore job copies NSserverRestore.bat to each Namespace server and runs it from there.

      • How to set a static IP address and rename a NIC based on a known MAC address

        I had a question that I thought I would share the answer for.

        A customer was deploying multiple identical servers with multiple NICs into a testing lab as virtual machines. They needed a way to beat the plug-and-play detection of NIC cards so that they could set the correct static IP for the NIC which is “patched” to a virtual NIC port. The only static information they could use was to give all the identical, isolated VMs the same MAC addresses from within Hyper-V.

        In each identical VM (VM Guest 1, 2, 3 in the picture below), there are 4 NICs. 1 NIC is enabled for DHCP with a Hyper-V dynamic MAC address. The other 3 NICs each have 1 of 3 known MAC addresses. The 3 NICs with known static MAC addresses all need static IP addresses. All the servers which share static MAC addresses must also share static IP addresses. And the name of each NIC must be changed to make it clear to the VM's installation of RRAS (and to the administrators) which NIC is patched to which Hyper-V network. In this way the servers have identical, non-overlapping networks for administrators to test on – and one additional network where all the VMs can contact each other for sharing files.

        Lab Hyper-V NIC Setup

        The routine was this:

        1. Set the Static IP address based on the known MAC address
        2. Rename the NIC based on the known MAC address
        3. Rename the DHCP NIC based on the fact it is NOT one of the known MAC addresses

        Step 1

        wmic nicconfig where MACAddress="00:12:34:56:78:9A" call EnableStatic ("1.2.3.4"), ("255.255.255.0")

        Step 2

        wmic /output:NICNameUNICODE.txt nic where MACAddress="00:12:34:56:78:9A" get NetConnectionID /FORMAT:LIST

        type NICNameUNICODE.txt > NICName.txt

        for /F "skip=2 tokens=1,2 delims==" %%i IN (NICName.txt) do netsh interface set interface name="%%j" newname="Some Name"

        We have to send WMIC output to a text file instead of piping it, because piped WMIC output ends each line with a <CR> instead of a <CRLF>, which breaks the coming FOR /F command.

        But WMIC saves the resulting file in Unicode format, which FOR /F cannot read, so we run it through TYPE to re-write the output as ANSI text.
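For illustration, the parsing job that TYPE and FOR /F perform here can be sketched in Python (a hypothetical helper, not part of the batch routine):

```python
def parse_net_connection_ids(wmic_list_output: str) -> list[str]:
    """Extract NetConnectionID values from WMIC /FORMAT:LIST output.

    LIST output contains blank lines and Name=Value pairs; we only
    want the value after "NetConnectionID=".
    """
    ids = []
    for line in wmic_list_output.splitlines():
        line = line.strip()
        if line.startswith("NetConnectionID="):
            ids.append(line.split("=", 1)[1])
    return ids


sample = "\n\nNetConnectionID=Local Area Connection 3\n\n"
print(parse_net_connection_ids(sample))  # ['Local Area Connection 3']
```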

        The resulting NICName.txt looks like this:

        image

        There was also 1 additional NIC installed which did not have a static MAC address assigned by Hyper-V, and was enabled for DHCP. This NIC also needed renaming:

        Step 3

        wmic /output:DHCPNameUNICODE.txt nic where "MACAddress!='00:12:34:56:78:9A' AND MACAddress!='00:12:34:56:78:9B' AND MACAddress!='00:12:34:56:78:9C' AND AdapterType='Ethernet 802.3'" get NetConnectionID /FORMAT:LIST

        type DHCPNameUNICODE.txt > DHCPName.txt

        for /F "skip=2 tokens=1,2 delims==" %%i IN (DHCPName.txt) do netsh interface set interface name="%%j" newname="DHCP LAN"

         

        I hope this helps someone one day with their deployments.

      • Tuning Free System Page Table Entries when using /3GB and /USERVA=wxyz

         

        If you enable /3GB in the boot.ini of a Windows Server 2003 x86 server, you risk running out of address space for the kernel.

        You can tweak this by adding the switch /USERVA=wxyz where wxyz is the number of megabytes that should be allocated to the user mode processes. This will give more address space back to the kernel.

        But how should you choose the correct value for /USERVA?

        Here's the easiest way. It doesn't involve pool monitoring applications, debugging tools, or enabling the Free System Page Table Entries (FSPTEs) tracking registry value (TrackPtes).

        1. Log on directly to the server. This caches your credentials on the server in case the next step causes it to fail
        2. Add the switch /3GB to boot.ini
        3. Add an additional entry in boot.ini which is the same as the default entry but without the /3GB switch (this is useful if your system reboots into an unusable state after adding /3GB). Your boot.ini should look like this:

           

        [Boot Loader]

        Timeout=5

        Default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS

        [Operating Systems]

        multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows Server 2003, Enterprise" /fastdetect /NoExecute=OptOut /3GB

        multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows Server 2003, Enterprise without 3GB" /fastdetect /NoExecute=OptOut

         

        4. Reboot the server
        5. Log in and open perfmon.msc. Add the counter Memory\Free System Page Table Entries and record the number. This should be a fairly static number.
          • If the number is less than 10,000 and the server is behaving erratically, remove the boot.ini line with /3GB to return the server to normal behaviour
        6. If the FSPTE count is greater than 10,000, then the server is ready to go back into production
        7. If the FSPTE count is less than 10,000, then we need to add the /USERVA=wxyz switch to boot.ini
        8. Calculate the number using the method below, reboot and log in again, and check that the FSPTE count is indeed at the level you expected
        9. If user load lowers the value below what you expected, recalculate your value for USERVA using the method below and reboot

         

        How to calculate the right value for /USERVA

         

        Take your value for FSPTEs you got when you rebooted with only /3GB defined in the boot.ini. In our example we'll use 6,400 for the number of FSPTEs.

        6,400 is a count of the free 4KB memory "blocks" (pages) that are available to the kernel. So the kernel has 6,400 * 4KB = 25,600 KB = 25MB of free address space to use.

        (We need to reboot the system because we don't know how many PTEs the system will consume loading all the required kernel resources once /3GB is specified, as /3GB changes the sizes of non-paged pool and paged pool memory.)

        FSPTEs must be greater than 10,000 at all times. So let's just make that 15,000 to ensure that future changes to required kernel resources (e.g. new video or network drivers) will not fail.

        15,000 * 4KB = 60,000 KB ≈ 59MB

         

        In this example we have 25MB of free kernel memory and need 59MB, so we must add 59-25 = 34MB

         

        So we need to set USERVA to be 3072MB – 34MB = 3038

        So we add the switch /USERVA=3038 to boot.ini

         

        This will give every user-mode application 3038MB of address space, and give the kernel the 59MB of free address space it needs to be happy.

         

        This means that USERVA and FSPTEs are directly related: every 1MB removed from USERVA yields 1,024 KB / 4 KB = 256 extra FSPTEs.
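The calculation above can be captured in a small script (Python; the function and constant names are mine, the 3,072 MB ceiling comes from /3GB, and the 15,000 target comes from the rule of thumb above):

```python
PAGE_KB = 4           # one PTE maps a 4 KB page
USERVA_MAX_MB = 3072  # user address space under /3GB
PTES_PER_MB = 1024 // PAGE_KB  # 256 PTEs gained per MB taken from USERVA


def userva_value(current_fsptes: int, target_fsptes: int = 15000) -> int:
    """Return the /USERVA value that raises FSPTEs to the target."""
    shortfall = target_fsptes - current_fsptes
    if shortfall <= 0:
        return USERVA_MAX_MB  # already enough free PTEs; /USERVA not needed
    extra_mb = -(-shortfall // PTES_PER_MB)  # round up to whole MB
    return USERVA_MAX_MB - extra_mb


print(userva_value(6400))  # the worked example above: 3038
```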

         

        But will there be any other negative impact to any other critical kernel resource (like Pool of Non-Paged memory or Pool of Paged Memory)?

        In my tests, there was no change in these resources, and this whitepaper confirms that adding or tuning /USERVA only has an impact on the count of FSPTEs:

         

        http://www.microsoft.com/downloads/details.aspx?FamilyID=ed0e8084-abf7-4c00-ba6a-7d658cdb052a&DisplayLang=en

         

        With the "/USERVA" boot.ini switch, you can customize how the memory is allocated when you use the /3GB switch. The number following /Userva= is the amount of virtual memory address space in megabytes (MB) that will be allocated to each user process. If you set /3gb /Userva=3030 in Boot.ini, 3,030 MB of memory is reserved to the process space, as compared to 3,072 MB when you use the /3GB switch alone. The 42 MB that is saved when you set /Userva=3030 is used to increase the kernel memory space and free system page table entries (PTEs). The PTE memory pool is increased by the difference between 3 GB (specified by the /3GB switch) and the value that is assigned to the /Userva switch. There is no reduction in any other kernel resource as a result of this switch.

         

        Here are the results I found when changing USERVA on a Windows Server 2003 SP2 x86 server with 3582MB RAM (4GB inserted, but the video card claimed 400MB at power-on):

        /3GB State   /USERVA Value   Free System Page Table Entries   Free Kernel Memory
        OFF          OFF             175,000                          684MB
        ON           OFF             19,900 (1)                       78MB
        ON           3030            30,700                           120MB
        ON           2900            64,000                           250MB
        ON           2800            89,800 (2)                       351MB
        ON           2650            128,000                          500MB
        ON           2500            166,500                          650MB
        ON           2466            175,000                          684MB

        (1) = This is the lowest value for FSPTEs on this system, which is higher than 15,000, so no action is needed for this server.

        (2) = Exchange mailbox servers should never have a /USERVA lower than 2800 as this will cause problems for store.exe

         Below: Vertical Axis = Free System Page Table Entries, Horizontal Axis = Value for USERVA

      • Upgrading the ADMX Central Store files from Vista to Windows 7

        I had a question from a customer and thought I’d share the answer with everyone. They asked “I want to upgrade our Central Store of ADMX/ADML files for Group Policy from Windows Vista SP2/Windows Server 2008 SP2 to Windows 7/Windows Server 2008 R2. What do we need to worry about?”. So I redirected them to this blog:

        http://blogs.technet.com/b/askds/archive/2009/12/09/windows-7-windows-server-2008-r2-and-the-group-policy-central-store.aspx

        But we found that there were differences between the ADMX files available in C:\Windows\PolicyDefinitions on Windows 7 and Windows Server 2008 R2. One such difference is highlighted here:

        http://blogs.technet.com/b/askds/archive/2008/07/18/enabling-group-policy-preferences-debug-logging-using-the-rsat.aspx

        I wondered if there were more differences, so I went through all of the ADMX files of:

        • a Windows Server 2008 R2 server with no roles or features installed
        • a Windows Server 2008 R2 server with EVERY role and feature installed
        • a Windows 7 RTM client
        • all of the above with Windows 7 / Windows Server 2008 R2 SP1 installed

        Here are the results:

        • The only ADMX/ADML files modified by SP1 were for TerminalServer.admx, which picked up changes relating to RemoteFX (codename Calista). No other ADMX/ADML files were changed by SP1.
        • Applications like AGPM and Office can add their own ADMX files to the local PolicyDefinitions folder on the server or workstation they are installed on. Make a note to add ALL the ADMX/ADML files you need to \\FQDN\SYSVOL\FQDN\policies
        • Installing Windows Search on the server adds the ADMX/ADML files for it; adding any other role/feature does NOT add ADMX/ADML files to the server's local PolicyDefinitions folder
        • Get your ADMX/ADML files from a server with all the roles and features installed and a Windows 7 client with all the features installed, and create a “super-set” of all ADMX/ADML files in your Central Store
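One way to build that super-set is to merge the PolicyDefinitions folders into the Central Store, keeping the newest copy when a file appears in more than one source. A rough sketch in Python (the paths, function name, and newest-wins rule are my assumptions, not part of the official guidance):

```python
import shutil
from pathlib import Path


def merge_policy_definitions(sources, central_store):
    """Copy ADMX/ADML files from each source folder into the Central
    Store; when a file exists in several sources, keep the newest copy."""
    central = Path(central_store)
    central.mkdir(parents=True, exist_ok=True)
    for src in map(Path, sources):
        for f in src.rglob("*"):
            if f.suffix.lower() not in (".admx", ".adml"):
                continue  # skip anything that isn't a policy template
            dest = central / f.relative_to(src)  # keeps en-US etc. subfolders
            dest.parent.mkdir(parents=True, exist_ok=True)
            if not dest.exists() or f.stat().st_mtime > dest.stat().st_mtime:
                shutil.copy2(f, dest)


# Example call (illustrative paths only):
# merge_policy_definitions(
#     [r"\\server2008r2\c$\Windows\PolicyDefinitions",
#      r"\\win7client\c$\Windows\PolicyDefinitions"],
#     r"\\contoso.com\SYSVOL\contoso.com\policies\PolicyDefinitions")
```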

         

        ADMX/ADML files in Windows Server 2008 R2, which are missing in Windows 7

        • adfs
        • GroupPolicy-Server
        • GroupPolicyPreferences
        • kdc
        • mmcsnapins2
        • NAPXPQec
        • PowerShellExecutionPolicy
        • PswdSync
        • SearchOCR (if Handwriting Recognition is installed)
        • ServerManager
        • Snis
        • TerminalServer-Server
        • WindowsServer

         

        ADMX/ADML files in Windows 7, which are missing in Windows Server 2008 R2

        • DeviceRedirection
        • sdiagschd
        • Search (if not installed on the server)
        • ShapeCollector

        Here is a list of all files in the PolicyDefinitions folder, collected from both Windows 7 and a Windows Server 2008 R2 server (with every role and feature installed), with their dates and sizes:

        10-06-2009  23:04             4,717 ActiveXInstallService.admx

        10-06-2009  22:53             4,714 AddRemovePrograms.admx

        10-06-2009  22:49             1,249 adfs.admx

        10-06-2009  22:30             5,393 AppCompat.admx

        10-06-2009  22:36             5,965 AttachmentManager.admx

        10-06-2009  22:53             3,391 AutoPlay.admx

        10-06-2009  22:52             2,968 Biometrics.admx

        10-06-2009  22:53            49,181 Bits.admx

        10-06-2009  23:01             1,749 CEIPEnable.admx

        10-06-2009  22:53             1,361 CipherSuiteOrder.admx

        10-06-2009  22:43             1,329 COM.admx

        10-06-2009  22:42            13,967 Conf.admx

        10-06-2009  22:53             2,600 ControlPanel.admx

        10-06-2009  22:53            10,099 ControlPanelDisplay.admx

        10-06-2009  22:53             1,293 Cpls.admx

        10-06-2009  22:53             1,933 CredentialProviders.admx

        10-06-2009  23:00            10,779 CredSsp.admx

        10-06-2009  22:53             1,746 CredUI.admx

        10-06-2009  23:04             2,141 CtrlAltDel.admx

        10-06-2009  22:43             2,437 DCOM.admx

        10-06-2009  22:53            13,576 Desktop.admx

        10-06-2009  23:07            18,551 DeviceInstallation.admx

        10/06/2009  22:50             2,391 DeviceRedirection.admx

        10-06-2009  22:59             1,093 DFS.admx

        10-06-2009  22:37             1,992 DigitalLocker.admx

        10-06-2009  22:52             3,034 DiskDiagnostic.admx

        10-06-2009  23:08             2,758 DiskNVCache.admx

        10-06-2009  22:38             6,123 DiskQuota.admx

        10-06-2009  22:54               989 DistributedLinkTracking.admx

        10-06-2009  22:30            10,290 DnsClient.admx

        10-06-2009  23:01             7,656 DWM.admx

        10-06-2009  22:53               962 EncryptFilesonMove.admx

        10-06-2009  22:40             5,097 EnhancedStorage.admx

        10-06-2009  23:01            21,737 ErrorReporting.admx

        10-06-2009  22:56             1,996 EventForwarding.admx

        10-06-2009  22:56            12,429 EventLog.admx

        10-06-2009  22:58             2,528 EventViewer.admx

        10-06-2009  22:53             3,836 Explorer.admx

        10-06-2009  22:51             2,141 FileRecovery.admx

        10-06-2009  22:38             6,172 FileSys.admx

        10-06-2009  22:45             2,342 FolderRedirection.admx

        10-06-2009  22:53             1,517 FramePanes.admx

        10-06-2009  22:52             2,229 fthsvc.admx

        10-06-2009  22:38             2,256 GameExplorer.admx

        10-06-2009  23:10            26,800 Globalization.admx

        10-06-2009  22:42             1,485 GroupPolicy-Server.admx

        10-06-2009  22:42            23,507 GroupPolicy.admx

        10-06-2009  22:42           100,025 GroupPolicyPreferences.admx

        10-06-2009  22:40             2,647 Help.admx

        10-06-2009  22:40             2,830 HelpAndSupport.admx

        10-06-2009  22:37             1,701 HotStart.admx

        10-06-2009  22:44            32,865 ICM.admx

        10-06-2009  22:43             1,243 IIS.admx

        10-06-2009  22:48         3,076,705 inetres.admx

        10-06-2009  23:08             1,787 InkWatson.admx

        10-06-2009  23:08             3,327 InputPersonalization.admx

        10-06-2009  22:41             6,868 iSCSI.admx

        10-06-2009  23:01             1,980 kdc.admx

        10-06-2009  23:01             3,709 Kerberos.admx

        10-06-2009  23:02             1,912 LanmanServer.admx

        10-06-2009  22:52             2,205 LeakDiagnostic.admx

        10-06-2009  22:39             3,681 LinkLayerTopologyDiscovery.admx

        10-06-2009  22:44             7,130 Logon.admx

        10-06-2009  23:01             1,786 MediaCenter.admx

        10-06-2009  22:31             3,580 MMC.admx

        10-06-2009  22:42            56,928 MMCSnapins.admx

        10-06-2009  22:42             6,994 MMCSnapIns2.admx

        10-06-2009  22:37             1,890 MobilePCMobilityCenter.admx

        10-06-2009  22:37             1,986 MobilePCPresentationSettings.admx

        10-06-2009  22:49             3,626 MSDT.admx

        10-06-2009  22:52             2,147 Msi-FileRecovery.admx

        10-06-2009  22:40            16,466 MSI.admx

        10-06-2009  22:58             1,298 NAPXPQec.admx

        10-06-2009  22:34             3,615 NCSI.admx

        10-06-2009  22:47            17,738 Netlogon.admx

        10-06-2009  22:31            17,024 NetworkConnections.admx

        10-06-2009  22:52             2,443 NetworkProjection.admx

        10-06-2009  23:01            25,505 OfflineFiles.admx

        10-06-2009  22:54             8,498 P2P-pnrp.admx

        10-06-2009  22:44             1,381 ParentalControls.admx

        10-06-2009  22:46             9,071 pca.admx

        10-06-2009  22:56             3,648 PeerToPeerCaching.admx

        10-06-2009  23:08             1,773 PenTraining.admx

        10-06-2009  22:33             2,292 PerfCenterCPL.admx

        10-06-2009  23:07             7,555 PerformanceDiagnostics.admx

        10-06-2009  23:07             1,939 PerformancePerftrack.admx

        10-06-2009  23:08            35,966 Power.admx

        10-06-2009  22:41             2,029 PowerShellExecutionPolicy.admx

        10-06-2009  22:44             6,901 PreviousVersions.admx

        10-06-2009  23:01            30,822 Printing.admx

        10-06-2009  22:53             3,239 Programs.admx

        10-06-2009  23:08             3,344 PswdSync.admx

        10-06-2009  22:50            13,257 QOS.admx

        10-06-2009  23:08             1,273 RacWmiProv.admx

        10-06-2009  22:52             1,972 Radar.admx

        10-06-2009  22:52             1,236 ReAgent.admx

        10-06-2009  22:57             3,722 Reliability.admx

        10-06-2009  22:51             7,150 RemoteAssistance.admx

        10-06-2009  23:07            23,268 RemovableStorage.admx

        10-06-2009  22:53             6,292 RPC.admx

        10-06-2009  22:42             6,991 Scripts.admx

        10-06-2009  22:48             2,519 sdiageng.admx

        10/06/2009  22:49             2,027 sdiagschd.admx

        10-06-2009  22:34            43,882 Search.admx

        10-06-2009  23:08            11,602 SearchOCR.admx

        10-06-2009  23:01             1,370 Securitycenter.admx

        10-06-2009  22:34             3,888 Sensors.admx

        10-06-2009  22:48             3,334 ServerManager.admx

        10-06-2009  23:04             1,588 Setup.admx

        10/06/2009  23:08             1,187 ShapeCollector.admx

        10-06-2009  22:54             1,634 SharedFolders.admx

        10-06-2009  22:53             1,985 Sharing.admx

        10-06-2009  22:53             3,466 Shell-CommandPrompt-RegEditTools.admx

        10-06-2009  22:53             1,157 ShellWelcomeCenter.admx

        10-06-2009  22:58             5,039 Sidebar.admx

        10-06-2009  22:31             7,397 Sideshow.admx

        10-06-2009  23:03             9,691 Smartcard.admx

        10-06-2009  23:08             2,057 Snis.admx

        10-06-2009  23:00             2,307 Snmp.admx

        10-06-2009  23:01             1,943 SoundRec.admx

        10-06-2009  22:53            25,663 StartMenu.admx

        10-06-2009  23:01             2,833 SystemResourceManager.admx

        10-06-2009  23:08             1,716 SystemRestore.admx

        10-06-2009  22:46            12,737 TabletPCInputPanel.admx

        10-06-2009  23:08            12,313 TabletShell.admx

        10-06-2009  22:53             9,365 Taskbar.admx

        10-06-2009  22:58             5,520 TaskScheduler.admx

        10-06-2009  22:49            10,059 tcpip.admx

        10-06-2009  22:39            17,774 TerminalServer-Server.admx

        04/11/2010  17:56            83,116 TerminalServer.admx

        10-06-2009  22:53             2,352 Thumbnails.admx

        10-06-2009  23:05             2,726 TouchInput.admx

        10-06-2009  23:04             3,409 TPM.admx

        10-06-2009  23:08             8,101 UserDataBackup.admx

        10-06-2009  22:56            15,021 UserProfiles.admx

        10-06-2009  23:04            40,554 VolumeEncryption.admx

        10-06-2009  23:04             6,277 W32Time.admx

        10-06-2009  22:49             2,512 WDI.admx

        10-06-2009  22:52             1,768 WinCal.admx

        10-06-2009  22:42            14,532 Windows.admx

        10-06-2009  22:53             1,265 WindowsAnytimeUpgrade.admx

        10-06-2009  23:08             3,702 WindowsBackup.admx

        10-06-2009  22:45             2,024 WindowsColorSystem.admx

        10-06-2009  22:39             4,085 WindowsConnectNow.admx

        10-06-2009  23:04             5,115 WindowsDefender.admx

        10-06-2009  22:53            35,942 WindowsExplorer.admx

        10-06-2009  23:08             3,000 WindowsFileProtection.admx

        10-06-2009  22:45            27,019 WindowsFirewall.admx

        10-06-2009  22:46             2,767 WindowsMail.admx

        10-06-2009  23:01             1,254 WindowsMediaDRM.admx

        10-06-2009  23:01            22,974 WindowsMediaPlayer.admx

        10-06-2009  22:44             2,903 WindowsMessenger.admx

        10-06-2009  22:42             7,203 WindowsProducts.admx

        10-06-2009  23:00             9,878 WindowsRemoteManagement.admx

        10-06-2009  23:00             4,338 WindowsRemoteShell.admx

        10-06-2009  22:42             1,314 WindowsServer.admx

        10-06-2009  22:59            19,272 WindowsUpdate.admx

        10-06-2009  23:04             1,955 WinInit.admx

        10-06-2009  23:04             5,237 WinLogon.admx

        10-06-2009  22:42             1,342 Winsrv.admx

        10-06-2009  22:53             1,406 WordWheel.admx

                     160 Files

      • Wildcard DNS Entries

        I was working on a case with a customer for something that was too weird to ignore.

        We wanted to use DNS Suffix Search Orders on the clients so that clients could query using short names for servers in DNS domains which weren’t their own.

        e.g. A PC in the domain child-dom-1.corp.contoso.com wanted to ping the short name “serverX”.

        ServerX had registered its name in the DNS zone matching its primary DNS Suffix: child-dom-2.corp.contoso.com

        So the answer is to set the DNS Suffix Search Order list. Prior to this, the customer had configured the DNS zone child-dom-2.corp.contoso.com to use WINS forwarders pointing to a WINS server which serverX was also using. But WINS was on the way out (see the previous post for details on how to decommission WINS).

        DNS Suffix Search Order is configured on the properties of the NIC or in Group Policies (for all NICs):

        image

        We tried both methods, but the client was not able to resolve names in any domain except the Primary DNS Suffix domain.

        Once a DNS Suffix Search Order list is defined, Windows must use that list instead of the single Primary DNS Suffix. So what was going on?

        When we ran nslookup with set d2 (verbose debugging), we could see that queries for a non-existent host (e.g. mickeymouse) got back a SUCCESS message for the A record, but with no IP address in the answer.

        The solution:

        In the zone child-dom-1.corp.contoso.com there was a record called * with a type of MX. This record makes queries for ALL other record types (A, AAAA, CNAME etc.) return success, just with no answer data. And because the DNS client was getting successes back, it never needed to try the alternate DNS suffixes.

        The wildcard MX record was, of course, deleted, and everything now works as expected.

        But why have wildcard MX records?

        Wildcard MX records are good for when you have a large number of hosts which are not directly Internet-connected (for example, behind a firewall) and for administrative or political reasons it is too difficult to have individual MX records for every host, or to force all e-mail addresses to be "hidden" behind one or more domain names. In that case, you must divide your DNS into two parts, an internal DNS, and an external DNS.  The external DNS will have only a few hosts and explicit MX records, and one or more wildcard MXs for each internal domain.  Internally the DNS will be complete, with all explicit MX records and no wildcards.

      • Decommissioning WINS

        I’ve been working on helping remove WINS from a customer’s network. One of the big problems was identifying the remaining clients still using WINS, and just what they were using it for.

        We used Network Monitor to capture WINS name resolution queries on the WINS server, to see which clients were querying for which server names.

        What we found was quite interesting.

        When a client is configured with a WINS server (via DHCP or statically), it will always attempt to resolve queries for SHORT names (i.e. names without dots in them) via both WINS and DNS at the same time. When it formulates the first DNS query to send out, it uses this logic:

        • If DNS Suffix Search Order list is empty, then use the primary DNS Suffix (typically the DNS name of the domain the client is joined to).
        • If there is a DNS Suffix Search Order list, then use the first entry.

        It sends out BOTH a WINS query and a FQDN query to DNS at the same time because it doesn’t know which service can resolve the name, and rather than prefer one over the other and incur the delay, it just blasts both out at the exact same time.

        If both replies result in an answer (i.e. an IP address) then the client will use the result from the service which happens to reply back the fastest.

        If neither query comes back with a successful result, the DNS client takes over. It will either try DNS devolution on the primary DNS suffix (enabled by default), or will start walking down the DNS suffix search order, if that is configured. DNS devolution is the process of shortening the primary DNS suffix by dropping the left most parts of the DNS suffix until there is only 1 dot left in the DNS suffix.

        An example of DNS devolution:

        The primary DNS Suffix of the client is child.corp.contoso.com. The client is looking for the server called someserver.contoso.com by asking for it by the short name: someserver.

        1. someserver.child.corp.contoso.com [fails to resolve]
        2. someserver.corp.contoso.com [fails to resolve]
        3. someserver.contoso.com [success!]
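The devolution walk above can be sketched like this (Python; the helper name is mine):

```python
def devolved_candidates(short_name: str, primary_suffix: str) -> list[str]:
    """Return the FQDNs tried when devolving the primary DNS suffix.

    The left-most label is dropped each round until only one dot
    (two labels) remains in the suffix.
    """
    labels = primary_suffix.split(".")
    candidates = []
    while len(labels) >= 2:
        candidates.append(short_name + "." + ".".join(labels))
        labels = labels[1:]
    return candidates


print(devolved_candidates("someserver", "child.corp.contoso.com"))
# ['someserver.child.corp.contoso.com',
#  'someserver.corp.contoso.com',
#  'someserver.contoso.com']
```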

        (Note that DNS wildcard records can mess this logic up – but that’s the topic of my next blog.)

        What does this matter for removing WINS?

        Well, in our case we started looking at all the WINS queries hitting the server before we started. And there were lots of them. This confused us a bit as all the clients should be Windows XP or newer, they should all be domain joined and should all use DNS. We were seeing the WINS queries because of the method described above where the client will send out BOTH WINS and DNS at the same time when querying for a short name.

        Step 1 in removing WINS from our clients was to export the static WINS entries and create static DNS records for them instead. This removed the clients’ reliance on WINS. There are still other devices (notably printers) which register in WINS and need WINS so the print operators can locate new print devices appearing on the network. The DNS zones only allow secure updates, so without some other method WINS will still be needed for these devices. Altering the process for deploying print servers, by identifying them before they hit the field, will solve that.

        Once that was done we installed Network Monitor 3.3 on the WINS server, and used this capture filter to show the successful answers the WINS server is giving back to the WINS clients:

        NbtNs.Flag.R == 0x1

        AND NbtNs.Flag.AA == 0x1

        AND NbtNs.AnswerCount > 0x0

        AND (IPv4.DestinationAddress < 10.1.0.0 OR IPv4.DestinationAddress > 10.1.255.255)

        AND (IPv4.DestinationAddress < 169.254.0.0 OR IPv4.DestinationAddress > 169.254.255.255)

        AND NbtNs.AnswerRecord.RRName.Name != "*<00><00><00><00><00><00><00><00><00><00><00><00><00><00><00>"

        Line by line this says: show all responses which are answers, where there is more than 0 answers, where I am not replying to a client in the server subnet (10.1.0.0/16), nor replying to APIPA-assigned addresses (169.254.0.0/16), and the answer is not a response to a master browser announcement. While WINS uses port 42, that port is for WINS server replication; WINS queries happen on 137/UDP.
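The address-range logic in that filter is just two interval checks. Here is the same test sketched in Python (subnet values copied from the filter above; the helper name is mine):

```python
import ipaddress

SERVER_SUBNET = ipaddress.ip_network("10.1.0.0/16")   # server range in the filter
APIPA_RANGE = ipaddress.ip_network("169.254.0.0/16")  # auto-configured addresses


def keep_reply(destination_ip: str) -> bool:
    """True when a WINS reply's destination survives the capture filter."""
    ip = ipaddress.ip_address(destination_ip)
    return ip not in SERVER_SUBNET and ip not in APIPA_RANGE


print(keep_reply("10.2.33.7"))  # True  - a client outside the server subnet
print(keep_reply("10.1.4.10"))  # False - inside the server subnet
```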

        We went through the results looking for names which weren’t in DNS. Which is like trying to find a straw in a great big stack of needles.

        Then we disabled the WINS entries in the DHCP scopes for the clients.

        Now we can see which clients are statically configured to use WINS. We’ll locate them first and correct them. Finding out exactly which host names they are relying on WINS for is still tricky, especially as the clients send out WINS and DNS queries simultaneously. But we’re on the right track.

        We can then focus the filter on the server subnets to locate servers which are configured to register records in WINS:

        (IPv4.SourceAddress > 10.1.0.0 AND IPv4.SourceAddress < 10.1.255.255)

        AND NbtNs.Flag.OPCode == 0x8

        AND NbtNs.NbtNsQuestionSectionData.QuestionName.Name != "CORP.CONTOSO.COM  "

        AND NbtNs.NbtNsQuestionSectionData.QuestionName.Name != "<01><02>__MSBROWSE__<02><01>"

        Which says: limit the traffic to source IP addresses within the server range (10.1.0.0 – 10.1.255.255) which are WINS name registration requests, but exclude domain browser election requests for the domain corp.contoso.com (the 2 spaces at the end are important), and also exclude master browser announcements. What remains are the servers still registering their names in WINS.

        I hope this helps you in your project to decommission WINS.