• How to set a static IP address and rename a NIC based on a known MAC address

    I had a question that I thought I would share the answer for.

    A customer was deploying multiple identical servers, each with multiple NICs, into a testing lab as virtual machines. They needed a way to beat plug and play detection of the NICs so that they could set the correct static IP address for the NIC which is “patched” to each virtual NIC port. The only static information they could rely on was the MAC address, so they gave all of the identical, isolated VMs the same MAC addresses from within Hyper-V.

    In each identical VM (VM Guest 1, 2, 3 in the picture below), there are 4 NICs. 1 NIC is enabled for DHCP with a Hyper-V dynamic MAC address. The other 3 NICs each have 1 of 3 known MAC addresses. The 3 NICs with known static MAC addresses all need static IP addresses. All the servers which share static MAC addresses must also share static IP addresses. And the name of each NIC must be changed to make it clear in the VM's installation of RRAS (and to the administrators) which NIC is patched to which Hyper-V network. In this way the servers have identical, non-overlapping networks for administrators to test on, plus one additional network where all the VMs can contact each other for sharing files.

    Lab Hyper-V NIC Setup

    The routine was this:

    1. Set the Static IP address based on the known MAC address
    2. Rename the NIC based on the known MAC address
    3. Rename the DHCP NIC based on the fact it is NOT one of the known MAC addresses

    Step 1

    wmic nicconfig where MACAddress="00:12:34:56:78:9A" call EnableStatic ("1.2.3.4"), ("255.255.255.0")

    Step 2

    wmic /output:NICNameUNICODE.txt nic where MACAddress="00:12:34:56:78:9A" get NetConnectionID /FORMAT:LIST

    type NICNameUNICODE.txt > NICName.txt

    for /F "skip=2 tokens=1,2 delims==" %%i IN (NICName.txt) do netsh interface set interface name="%%j" newname="Some Name"

    We have to output WMIC to the text file instead of piping as it pipes with a <CR> at the end of each line, instead of a <CRLF>, which breaks the coming FOR /F command.

    But WMIC saves the resulting file in Unicode format, which FOR /F cannot read, so we run it through TYPE and redirect the output to get a file formatted as ANSI.

    The resulting NICName.txt looks like this:

    image
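For anyone who would rather post-process the WMIC output file outside of a batch file, here is a minimal Python sketch of the same idea: decode the Unicode (UTF-16) file and pull out the Key=Value pairs. The function name and the sample connection name are my own assumptions for illustration, not part of the customer's solution.

```python
def parse_wmic_list(raw: bytes) -> dict:
    """Decode WMIC /FORMAT:LIST output and return its Key=Value pairs."""
    # The "utf-16" codec honours the byte-order mark at the start of the data,
    # which is the part that FOR /F cannot cope with.
    text = raw.decode("utf-16")
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line:
            key, _, value = line.partition("=")
            pairs[key] = value
    return pairs


# Simulated file content: blank lines around a single Key=Value line.
sample = "\r\n\r\nNetConnectionID=Local Area Connection 2\r\n\r\n".encode("utf-16")
print(parse_wmic_list(sample))
```

The blank lines are skipped by the "=" test rather than by counting lines, so the sketch does not depend on exactly how many blank lines WMIC writes.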

    There was also 1 additional NIC installed which did not have a static MAC address assigned by Hyper-V, and was enabled for DHCP. This NIC also needed renaming:

    Step 3

    wmic /output:DHCPNameUNICODE.txt nic where "MACAddress!='00:12:34:56:78:9A' AND MACAddress!='00:12:34:56:78:9B' AND MACAddress!='00:12:34:56:78:9C' AND AdapterType='Ethernet 802.3'" get NetConnectionID /FORMAT:LIST

    type DHCPNameUNICODE.txt > DHCPName.txt

    for /F "skip=2 tokens=1,2 delims==" %%i IN (DHCPName.txt) do netsh interface set interface name="%%j" newname="DHCP LAN"
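The decision logic behind the three steps can be sketched in a few lines of Python: a known MAC address gets its static IP and name, and anything else is treated as the DHCP NIC. Only the first table row uses values from this post. The other two rows, the function name and the example dynamic MAC address are hypothetical placeholders.

```python
# MAC -> (static IP, subnet mask, new NIC name).
# Only the first row comes from the post; the other two are made-up examples.
KNOWN_NICS = {
    "00:12:34:56:78:9A": ("1.2.3.4", "255.255.255.0", "Some Name"),
    "00:12:34:56:78:9B": ("1.2.4.4", "255.255.255.0", "Lab Network B"),
    "00:12:34:56:78:9C": ("1.2.5.4", "255.255.255.0", "Lab Network C"),
}
DHCP_NAME = "DHCP LAN"


def plan_for(mac):
    """Return (new_name, static_ip, mask); IP and mask are None for the DHCP NIC."""
    entry = KNOWN_NICS.get(mac.upper())
    if entry is None:
        # Not one of the known MAC addresses, so this is the DHCP NIC (step 3).
        return DHCP_NAME, None, None
    ip, mask, name = entry
    return name, ip, mask


for mac in ("00:12:34:56:78:9a", "00:15:5D:01:02:03"):
    print(mac, "->", plan_for(mac))
```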

     

    I hope this helps someone one day with their deployments.

  • Backing up and Restoring Domain-Based DFS Namespaces

    I had a question for a customer recently which needed some investigation, as the seemingly “easy steps” to export and import DFSN configurations didn’t do what either of us expected.

    KB969382 lists the actions to take in the event of your DFS Namespace going west. Option 2 was the one we were looking at as we wanted to create regular DFS-N backups to be used in any DFS-N related emergency.

    It seemed simple enough. Run this command to back up your configuration:

    dfsutil root export \\domain.name\DFSN DFSN-root.txt

    And when disaster strikes, just run this command to put it all back again:

    dfsutil root import set DFSN-Root.txt \\domain.name\DFSN

    However, no matter what the DFS-N emergency we created in the lab, the import would always fail citing “element not found”.

    The problem was that we were breaking the DFS-N root (on purpose), but the export/import scenario requires you to have a working DFS-N root. And to get that, you’d need good system-state backups of both a DC and a DFS Namespace server. Which isn’t going to provide for a fast, efficient restore scenario in a large organisation.

    So I started experimenting, and it seems that the objects in AD are easily copied and imported again using ldifde – there is no attachment to the object GUIDs (like there is say in a failover cluster). And once all the objects are back in AD, all the links and targets start working again as expected.

    The same applies to the share and DFSN root information in the registry: a simple ‘reg save’ followed by a ‘reg restore’ will get that information back with the registry ACLs intact.

    So, I wrote 2 scripts (each fires off a second script to run directly on the DFS Namespace servers):

    1. The scheduled backup script deployed as a scheduled task to a central management server
    2. A restore script to be used in the case of any DFS-N emergency

    Now, while the restore could be more targeted to allow you to choose the scenario to recover from (e.g. restore ONLY the objects in AD or DFS-N registry information on just 1 DFS-N server, or only one DFS-N root), I’ll leave it to you, good reader, to add that intelligence. This restore script restores the entire DFS-N configuration for all roots and to all DFS-N servers.

    This will back up and restore both Windows 2000/2003 roots and Windows Server 2008 roots. It uses psexec from Sysinternals, available here. The reason it does this is to use reg save/reg restore, which capture the ACLs on the registry keys and restore exactly the configuration which was backed up, rather than merging the configuration. While my testing shows that these reg keys do not have explicit permissions defined, you’re better safe than sorry.

    Make sure to change any instances of “dc=domain,DC=name” and “\\domain.name” to the domain name in your environment.
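If you are unsure what the distinguished name form of your own domain looks like, this small Python sketch shows the mapping from a DNS domain name to the DN used in the scripts. The helper names and the example domain are assumptions for illustration only.

```python
def domain_to_dn(dns_name):
    """Turn a DNS domain name such as corp.contoso.com into DC=corp,DC=contoso,DC=com."""
    labels = [label for label in dns_name.strip(".").split(".") if label]
    return ",".join("DC=" + label for label in labels)


def dfs_config_dn(dns_name):
    """Build the DN of the DFS configuration container for a domain."""
    return "CN=Dfs-Configuration,CN=System," + domain_to_dn(dns_name)


print(dfs_config_dn("corp.contoso.com"))
```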

    Backup

    Main Job

    rem Setup input file

    if not exist .\backup-files mkdir .\backup-files
    if exist root-servers.txt del root-servers.txt

    setlocal

    if exist servers.txt del servers.txt
    dsquery * "CN=DFS-Configuration,CN=System,DC=domain,dc=name" -filter "(|(objectClass=fTDfs)(objectClass=msDFS-Namespacev2))" -attr name > allRoots.txt
    for /F "tokens=1-3 skip=1 delims= " %%i IN (allRoots.txt) DO (
            dfsutil root \\domain.name\%%i | find /i "target" | find /i "%%i" >> %%i-serversRAW.txt
            for /F "tokens=2 delims=\" %%u IN (%%i-serversRAW.txt) do  echo %%i;%%u >> root-servers.txt
            del %%i-serversRAW.txt
            )

    for /F "tokens=1,2 delims=; " %%i IN (root-servers.txt) DO echo %%j>> serversRAW.txt
    sort serversRAW.txt /O serversSORTED.txt
    for /F "Tokens=*" %%s in ('type serversSORTED.txt') do set record=%%s&call :output

    del serversRAW.txt
    del serversSORTED.txt

    endlocal
     

    rem Backup

    for /F %%i IN (servers.txt) DO (
        if not exist \\%%i\c$\temp mkdir \\%%i\c$\temp
        copy .\NSserverBackup.bat \\%%i\c$\temp /Y
        psexec \\%%i C:\temp\NSserverBackup.bat
        copy \\%%i\c$\TEMP\%%i-dfsroots.hiv .\backup-files\%%i-dfsroots.hiv /Y
        copy \\%%i\c$\TEMP\%%i-CCS-shares.hiv .\backup-files\%%i-CCS-shares.hiv /Y
        copy \\%%i\c$\TEMP\%%i-CS1-shares.hiv .\backup-files\%%i-CS1-shares.hiv /Y
        )

    ldifde -f .\backup-files\dfs-export.ldf -v -d "CN=Dfs-Configuration,CN=System,DC=domain,dc=name" -l objectClass,remoteServerName,pKTGuid,pKT,msDFS-SchemaMajorVersion,msDFS-SchemaMinorVersion,msDFS-GenerationGUIDv2,msDFS-NamespaceIdentityGUIDv2,msDFS-LastModifiedv2,msDFS-Propertiesv2,msDFS-TargetListv2,msDFS-Ttlv2,msDFS-LinkPathv2,msDFS-LinkSecurityDescriptorv2,msDFS-Ttlv2,msDFS-Commentv2,msDFS-ShortNameLinkPathv2,msDFS-LinkIdentityGUIDv2 > .\backup-files\ldf-export.log

    goto :EOF

    :output
    if not defined previous_record goto write
    if "%record%" EQU "%previous_record%" goto :EOF

    :write
    @echo %record%>>servers.txt
    set previous_record=%record%

    NSserverBackup.bat

    C:
    cd \
    cd temp

    reg save HKLM\Software\Microsoft\DFS\Roots C:\TEMP\%COMPUTERNAME%-dfsroots.hiv /y
    reg save HKLM\System\CurrentControlSet\Services\lanmanserver\shares C:\temp\%COMPUTERNAME%-CCS-shares.hiv /y
    reg save HKLM\System\ControlSet001\Services\lanmanserver\shares C:\temp\%COMPUTERNAME%-CS1-shares.hiv /y

    The main backup job copies NSserverBackup.bat to each Namespace server and runs it from there.
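The sort followed by the :output label in the main job is just a way to get each namespace server listed once in servers.txt. A rough Python equivalent, assuming input lines in the root;server form written to root-servers.txt, looks like this. The root and server names are hypothetical, and Python's sort is case-sensitive where the batch sort command is not.

```python
def unique_servers(root_server_lines):
    """Return each namespace server once, in sorted order."""
    names = sorted(
        line.split(";", 1)[1].strip()
        for line in root_server_lines
        if ";" in line
    )
    servers = []
    previous = None
    for name in names:
        # Same test as the :output label: skip a record equal to the previous one.
        if name != previous:
            servers.append(name)
        previous = name
    return servers


lines = ["DFSN;NS01 ", "DFSN;NS02 ", "Apps;NS01 "]
print(unique_servers(lines))
```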

     

    Restore

    Main Job

    rem Check input files

    if not exist allRoots.txt goto :EOF
    if not exist servers.txt goto :EOF


    rem clean up before restore

    dsquery * "CN=DFS-Configuration,CN=System,DC=domain,dc=name" -filter "(|(objectClass=fTDfs)(objectClass=msDFS-NamespaceAnchor))" | dsrm -q -subtree -noprompt
    for /F %%i IN (servers.txt) DO (
        reg delete \\%%i\HKLM\Software\Microsoft\DFS\Roots /f
        reg delete \\%%i\HKLM\System\CurrentControlSet\Services\lanmanserver\shares /f
        reg delete \\%%i\HKLM\System\ControlSet001\Services\lanmanserver\shares /f
        reg add \\%%i\HKLM\Software\Microsoft\DFS\Roots /f
        reg add \\%%i\HKLM\System\CurrentControlSet\Services\lanmanserver\shares /f
        reg add \\%%i\HKLM\System\ControlSet001\Services\lanmanserver\shares /f
        )

    rem restore

    ldifde -I -f .\backup-files\dfs-export.ldf -k -v > .\backup-files\dfs-import.log
    for /F %%i IN (servers.txt) DO (
        copy .\backup-files\%%i-dfsroots.hiv \\%%i\c$\temp\%%i-dfsroots.hiv /Y
        copy .\backup-files\%%i-CCS-shares.hiv \\%%i\c$\temp\%%i-CCS-shares.hiv /Y
        copy .\backup-files\%%i-CS1-shares.hiv \\%%i\c$\temp\%%i-CS1-shares.hiv /Y
        copy .\NSserverRestore.bat \\%%i\c$\temp\NSserverRestore.bat /Y
        copy .\allRoots.txt \\%%i\c$\temp\allRoots.txt /Y
        )
    psexec @servers.txt C:\temp\NSserverRestore.bat

    NSserverRestore.bat

    reg restore HKLM\Software\Microsoft\DFS\Roots C:\temp\%COMPUTERNAME%-dfsroots.hiv
    reg restore HKLM\System\CurrentControlSet\Services\lanmanserver\shares C:\temp\%COMPUTERNAME%-CCS-shares.hiv
    reg restore HKLM\System\ControlSet001\Services\lanmanserver\shares C:\temp\%COMPUTERNAME%-CS1-shares.hiv
    for /F "tokens=1-3 skip=1 delims= " %%i IN (C:\temp\allRoots.txt) DO dfsutil root forcesync \\domain.name\%%i
    net stop dfs && net start dfs

    The main restore job copies NSserverRestore.bat to each Namespace server and runs it from there.

  • Troubleshooting Windows Performance Issues: Lots of RAM but no Available Memory

    Hi,

    One of my posts was recently polished up enough to appear on the MSPFE blog:

    http://blogs.technet.com/b/mspfe/archive/2012/12/06/lots-of-ram-but-no-available-memory.aspx

    That blog roll is a new initiative within the Premier Field Engineer community to “put our best foot forward”.

    Posts on all the Microsoft technologies we support appear there, written by PFEs like me who are working every day with our customers to help them resolve their technical issues. I hope it’s useful to you.

  • A backup server flooded by DPCs

    Hi,

    I’ve just finished working on a case with a customer that was so interesting that it deserved a blog post to round it off.

    These were the symptoms:

    Often while logged in to the server, things would appear to freeze: no screen updates and little mouse responsiveness. If you could start a program (perfmon, Task Manager, Notepad etc.), you wouldn’t be able to type into it, and if you did, it would crash.

    This Windows Server 2008 R2 server runs TSM backup software with thousands of servers on the network sending their backup jobs to it. At any one time there could be hundreds of backup jobs running. The load was lower during the day, but it was always working hard dealing with constant backups of database snapshots from servers. The backup clients are Windows, UNIX, Solaris, you name it…

    When the server froze, you’d see 4 of the 24 logical CPUs lock at 100% and the other 20 CPUs would saw-tooth from locking at 100% to using 20-30%. The freeze would happen for minutes at a time.

    CPUs 0,2,4,6 locked at 100%, others saw-tooth

    There are 2 Intel 10Gb NICs in a team using Intel's teaming software. The team and the switches are set up with LACP to enable inbound load balancing and failover.

    By running perfmon remotely before the freeze happens we could see that the 4 CPUs that are locked at 100% are locked by DPCs. We used the counter “Processor Information\% DPC Time”.

    A DPC is best defined in Windows Internals 6th Ed. (Book 1, Chapter 3):

    A DPC is a function that performs a system task—a task that is less time-critical than the current one. The functions are called deferred because they might not execute immediately. DPCs provide the operating system with the capability to generate an interrupt and execute a system function in kernel mode. The kernel uses DPCs to process timer expiration (and release threads waiting for the timers) and to reschedule the processor after a thread’s quantum expires. Device drivers use DPCs to process interrupts.

    Because this is a backup server, we’re expecting that the bulk of our hardware DPCs will be generated by incoming network packets and raised by the NICs. Though they could have been coming from the tape library or the storage arrays.

    To look into what exactly is generating DPCs and how long the DPCs last for, we need to run Windows Performance Toolkit, specifically WPR.exe (Windows Performance Recorder). We have to do this carefully. We don’t want to increase the load of the server by capturing the Network and CPU activity of a server which already has high activity on the CPU and Network, and has shown a past history of crashing. But we want to run the capture while the server is in a frozen state. A tricky thing. So we ran this batch file:

    Start /HIGH /NODE 1 wpr.exe -start CPU -start Network -filemode -recordtempto S:\temp

    ping -n 20 127.0.0.1 > nul

    Start /HIGH /NODE 1 wpr.exe -stop S:\temp\profile_is_CPU_Network.etl

    If the server you are profiling has a lot of RAM (24GB or more), you’ll want to protect your non-paged pool from increasing and harming your server. To do that you should review this blog and add this switch to the start command: -start "C:\Program Files (x86)\Windows Kits\8.0\Windows Performance Toolkit\SampleGeneralProfileForLargeServers.wprp"

    We’re starting on NUMA node 1 as the NICs were bound to NUMA node 0 and the “Processor Information” perfmon trace we took earlier showed that the CPUs on NUMA node 0 were locked. We’re starting the recorder with a “high” prioritization so that we can be sure it gets the CPU time it needs to work. We’re not writing to RAM, we’re recording to disk in the hopes that if the trace crashes we’ll at least have a partial trace to use. We made sure that S: in this example was a SAN disk to ensure it had the required speed to keep up with the huge data we’re expecting. We’re pinging 20 times to make sure our trace is 20 seconds long. And finally we’re starting a trace of CPU and Network profiles.

    Note that to gather stacks we first had to disable the ability for the Kernel (aka the Executive) to send its own pages of memory out from RAM to the pagefile, where we cannot analyze them. To do this run wpr -disablepagingexecutive on and then reboot.

    We retrieved 3 traces in all:

      1. The first trace to diagnose the problem
      2. The second trace, taken after we made 2 changes to address the causes of about 50% of our problem
      3. The final trace, taken after we made the final change to address the cause of the other 50% of the problem

        Diagnosis

        So this blog now becomes a short tutorial on how you can use WPA (Windows Performance Analyzer) to locate the source of DPC issues. WPA is a VERY powerful tool and diagnosing problems is part science, part art, meaning that no two diagnoses are ever done in the same way. This is just how I used WPA in this case. For this analysis, you’ll need the debugging tools installed and symbols configured and loaded.

        CPU Usage (Sampled)\Utilization By CPU

        First I want to see which CPUs are pegged. For that we use “CPU Usage (Sampled)\Utilization By CPU”, then select a time range by right-clicking:

        Choose a round number (10 seconds in my example) as it makes it easier to quickly calculate how many things happened per minute when comparing to the graphs for the later scenarios:

        Select Time Range

        I chose 20 seconds to 30 seconds as it is a 10 second window where there was heavy load and not blips due to tracing starting or stopping. Then “Zoom” by right clicking again.

        Now all your graphs will be focused on that time range.

        Then shift-select the CPUs which are pegged. In this case it is CPUs 0, 2, 4 and 6. This is because the cores are Hyperthreaded and the NICs cannot interrupt a logical CPU which is the result of Hyperthreading (CPUs 1, 3, 5, 7 etc.). And they are low-numbered CPUs because they are located on NUMA node 0.

        Once they are selected, right-click and choose “Filter to Selection”:

        Filter to Selection

        Next we want to add a column for DPCs so we can see how much of the CPUs' time was spent locked processing DPCs. To add columns, just right click on the column title bar (in the screen above this has “Line # | CPU || Count | Weight (in view) | Timestamp”) in the centre of the right-hand pane and select the columns you want to display. Once the DPC/ISR column has been added, drag it to the left side of the yellow bar, next to the CPU column:

        Choose columns

        Expanding out the CPU items, we see that DPCs account for almost all of the CPU activity on these CPUs (the count figure for each CPU's activity is 10 seconds of CPU time, and the count of CPU time for DPCs under it is over 9 seconds).

        DPC duration by Module, Function

        The next WPA graph we need is the one which can show how long the DPCs last for. We drag in the first graph under “DPC/ISR” called “DPC duration by Module, Function”:

        DPC duration by Module, Function

        In the far-right column (“Duration”), we can see how long each module spends waiting with a DPC. This says that 36.8 seconds were spent on DPCs for NDIS.SYS alone. How can it be 36.8 seconds if the sample window is 10 seconds? Well, it is CPU seconds, and we have 24 CPUs, so we could potentially have 240 CPU seconds in all.

        The next biggest waiter for DPCs is storport.sys. But at 1 second, it’s not even close.

        The column with the blue text is called “Duration (Fragmented) (ms) Avg” and is the average time a DPC lasts for during this sample window. The NDIS.SYS DPCs last around 0.22 milliseconds, or 220 microseconds. The count of DPCs for NDIS and storport are comparatively similar (163,000 and 123,000 respectively), but because NDIS took so long on each DPC on average, it ended up locking the CPU for longer than storport did.
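The CPU-seconds arithmetic is easy to check with a few lines of Python. This sketch uses the figures quoted above and assumes, as a simplification, that essentially all of the NDIS DPC time landed on the 4 pegged CPUs.

```python
WINDOW_SECONDS = 10        # length of the selected time range
LOGICAL_CPUS = 24          # logical CPUs in the server
BUSY_CPUS = 4              # CPUs 0, 2, 4 and 6
NDIS_DPC_SECONDS = 36.8    # "Duration" total for NDIS.SYS in the window

# Maximum CPU time available in the window, for the whole box and for the busy CPUs.
total_capacity = WINDOW_SECONDS * LOGICAL_CPUS
busy_capacity = WINDOW_SECONDS * BUSY_CPUS

# Fraction of the busy CPUs' time consumed by NDIS DPCs (assumption: all of it ran there).
share_of_busy = NDIS_DPC_SECONDS / busy_capacity

print(f"capacity of all CPUs: {total_capacity} CPU seconds")
print(f"capacity of the 4 busy CPUs: {busy_capacity} CPU seconds")
print(f"NDIS DPC share of the busy CPUs: {share_of_busy:.0%}")
```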

        So let’s add the CPU column, move it to the left side of the yellow line with it as the first column to pivot on:

        Filter to busy CPUs

        We can see that our targeted CPUs, 0, 2, 4, 6 have very high durations of DPC waits (using the last column for “Duration”, again) with no other CPU spending very much time in a DPC wait state. So we select these CPUs and filter.

        Expanding out the CPUs, we see that there are many different sources of DPCs, but that NDIS is really the biggest source of DPC waits. So we will now move the “Module” column to be the left-most column and remove the CPU column from view. We then right click on NDIS.SYS and “Filter to Selection” again as we only want to focus on DPCs from NDIS on CPUs 0, 2, 4, 6:

        Filter to NDIS

        One function, ndisInterruptDPC is causing our DPC waits. This is the one we’ll focus on. If we expand this, it will list every single DPC and how long that wait is. Select every single one of these rows by scrolling to the very bottom of the table (in this example there are 163,230 individual DPCs):

        Copy Column Selection

        Right click on the column called “Duration” and choose “Copy Other” and then “Copy Column Selection”. This will copy only the values in the “Duration” column. We can paste this into Excel and create a graph which shows the duration of the DPCs as a function of the number of DPCs present:

        Taken from Excel

        I have added a red line at 0.1 milliseconds because, according to the hardware development kit for driver manufacturers, a DPC should not last longer than 100 microseconds. This means that DPCs above the red line are misbehaving, and that they account for the bulk of our time spent waiting on DPCs.
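If you prefer numbers to a chart, the pasted "Duration" column can be summarised with a short Python sketch like the one below. It assumes the durations are in milliseconds, as in the WPA column, and the sample values are made up for illustration rather than taken from the trace.

```python
THRESHOLD_MS = 0.1  # the 100 microsecond guideline, expressed in milliseconds


def summarize(durations_ms, threshold_ms=THRESHOLD_MS):
    """Count the DPCs above the threshold and the share of total time they take."""
    if not durations_ms:
        return {"count": 0, "slow_count": 0,
                "slow_share_of_count": 0.0, "slow_share_of_time": 0.0}
    slow = [d for d in durations_ms if d > threshold_ms]
    total = sum(durations_ms)
    return {
        "count": len(durations_ms),
        "slow_count": len(slow),
        "slow_share_of_count": len(slow) / len(durations_ms),
        "slow_share_of_time": sum(slow) / total if total else 0.0,
    }


sample = [0.02, 0.05, 0.08, 0.15, 0.30, 0.60]  # made-up durations in ms
print(summarize(sample))
```

In this made-up sample only half of the DPCs are over the limit, yet they take most of the total time, which is the same pattern the chart shows.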

        So, we have established that we have slow DPCs on NDIS, and lots of them, and that they are locking our 4 CPUs. Our NICs aren’t able to spread their DPCs to any other CPUs and Hyperthreading isn’t really helping our specific issue. But what is causing the networking stack to generate so many slow DPC locks?

        DPC/ISR Usage by Module, Stack

        The final graph in WPA will show us this. From the category “CPU Usage (Sampled)”, drag in a graph called “DPC/ISR Usage by Module, Stack”. Filter to DPC (which will exclude ISRs) and our top candidates are:

        DPC/ISR Usage by Module, Stack

        1. ntoskrnl.exe (the Windows Kernel)
        2. NETIO.SYS (Network IO operations)
        3. tcpip.sys (TCP/IP)
        4. NDIS.SYS (Network layer standard interface between OS and NIC drivers)
        5. IDSvia64.sys (Symantec Intrusion Detection System)
        6. ixn62x64.sys (Intel NIC driver for NDIS 6.2, x64)
        7. iansw60e.sys (Intel NIC teaming software driver for NDIS 6.0)

        To see what these are doing we simply expand the stack columns by clicking the triangle of the row with the highest count, looking for informative driver names and a large drop in the number of counts present, indicating that this particular function is causing a consumption of CPU time.

        NTOSKRNL is running high because we are capturing. The kernel is spending time gathering ETL data. This can be ignored.

        NETIO is redirecting network packets to/from tcpip.sys for a function called InetInspectReceive:

        NETIO.sys stack expansion

        TCP/IP is dealing with the NETIO commands above to do this “Receive Inspection”:

        TCPIP.SYS stack expansion

        NDIS.SYS is dealing with 2 main functions in tcpip.sys: TcpTcbFastDatagram and InetInspectReceive again:

        NDIS.SYS stack expansion

        Other than ntoskrnl, these 3 Windows networking drivers all have entries for the drivers listed as 5, 6 and 7 above in their stacks.

        Diagnosis Summary

        Lots of DPCs are caused by 3 probable sources:

        1. Incoming packet inspection by the Symantec IDS system.
           The IDS system has to take every packet, compare it to a signature definition, and, if clean, allow it to pass. This action is causing slow DPCs.
        2. The NIC driver could be stale/buggy and generating slow DPCs.
           There is no evidence for this, but it’s usually a good place to start. There could be TCP offloading or acceleration features in the NIC and/or driver which haven’t been enabled but may improve network performance.
        3. And finally the NIC teaming software is getting in between the NICs and the CPUs.
           That is, after all, the job of the NIC teaming software: to trick Windows into thinking that the incoming packets from 2 distinct hardware devices are actually coming from 1 device. The problem here, however, is that this insertion into the networking stack is pure software, and is likely causing very slow DPCs.

        Action Plan

        Our actions were to make changes over 2 separate outage windows:

        1. Update the NIC driver and enable Intel I/OAT in the BIOS of the server.
           I/OAT is described in the spec sheet for the NIC like this: “When enabled within multi-core environments, the Intel Ethernet Server Adapter X520-T2 offers advanced networking features. Intel I/O Acceleration Technology (Intel I/OAT), for efficient distribution of Ethernet workloads across CPU cores. Load balancing of interrupts using MSI-X enables more efficient response times and application performance. CPU utilization can be lowered further through stateless offloads such as TCP segmentation offload, header replications/splitting and Direct Cache Access (DCA).”
        2. Uninstall the NIC teaming software.
           3rd party NIC teaming software inhibits many TCP offloading features, and in this case generates large numbers of slow DPCs.
        3. On the second outage we uninstalled the IDS system.
           IDS was not configured on this (or any other) server. But as the software had the potential to become enabled, it was grabbing every incoming packet for inspection, despite the fact that it wasn’t configured to inspect the packet or act on violations in any way. Stopping the service is insufficient; the driver must be removed from the hidden, non-plug and play section of Device Manager. Manually removing the driver isn’t sufficient either, as the software will reinstall it at next boot. Only a full uninstall will do.

        After dissolving the NIC Team

        Here is what the picture looked like after we dissolved the NIC team, updated the NIC driver and enabled Intel I/OAT in the BIOS.

        DPC duration - No teaming, I/OAT enabled

        In this 10 second sample we can see that the 4 CPU cores are still effectively locked, as the CPU time due to NDIS DPCs is 37.7 seconds (out of a possible maximum of 40 seconds). The number of DPCs has decreased by more than half to 55,000, meaning that the average duration of DPCs has become very long at 682 microseconds, triple the average time from before we removed the NIC team and enabled I/OAT.

        Taken from Excel

        The blue area of the graph above is the picture we had from before changes were made. The pink/orange area is the picture of DPC durations after removing NIC teaming and enabling I/OAT.

        So why did the average duration of DPCs get longer?

        It could be that the IDS software now does not need to relinquish its DPCs to make room on the same CPU cores as the DPCs for the NIC teaming driver. These 2 drivers must be locked to the same CPUs. With no need to relinquish a DPC due to another DPC of equal priority, the IDS DPCs are free to use the CPU for longer periods of time before being forced off.

        At any rate, it certainly isn’t fixed yet.

        After uninstalling Symantec IDS

        And finally here’s what the picture looked like after we uninstalled the IDS portion of the Symantec package. Remember, this service was not configured to be enabled in any way.

        DPC duration - no IDS

        You can see that the average time has dropped from 220 microseconds to 90 microseconds – below the 100 microsecond threshold required by the Driver Development Kit.

        In this 10 second sample there were 127,000 DPCs from NDIS on the 4 heavily used CPUs, but the CPU time they consumed was 11 seconds, a reduction from 36.8 seconds.
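To compare the three traces side by side, the average DPC duration is simply the total DPC time divided by the DPC count. Here is a small Python sketch using the rounded figures quoted in this post. Because the inputs are rounded, the results land close to, but not exactly on, the averages WPA reported.

```python
# (label, total NDIS DPC time in CPU seconds, number of NDIS DPCs) per 10 second sample.
TRACES = [
    ("before changes", 36.8, 163_000),
    ("no teaming, I/OAT enabled", 37.7, 55_000),
    ("IDS uninstalled", 11.0, 127_000),
]


def average_us(seconds, count):
    """Average DPC duration in microseconds."""
    return seconds / count * 1_000_000


for label, seconds, count in TRACES:
    print(f"{label}: {average_us(seconds, count):.0f} microseconds per DPC")
```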

        Taken from Excel

        The blue area of the graph above is the picture we had from before changes were made. The pink/orange area is the picture of DPC durations after removing NIC teaming and enabling I/OAT. And the green area is the picture after IDS is removed.

        This is a dramatic improvement. Nearly all DPCs are below the 100 microsecond limit. The system is able to process the incoming load without locking up for high priority, long lasting DPCs.

        What about RSS?

        We’re not quite done though. 4 of our CPUs are still working very hard, often pegged at 100%. But why only 4? This is a 2-socket system with 6 cores on each socket. That gives us 12 CPUs where we can run DPCs. DPCs from one NIC are bound to one NUMA node. We already dissolved our NIC team, so we only have 1 NIC in action, so we are limited to 6 cores. RSS can spread DPCs over CPUs in powers of 2, meaning 1, 2, 4, 8, 16, 32 cores. Meaning we can at most use 4 CPUs per NIC.

        To scale out we would need to add more NICs and limit RSS on each of those NICs to 2 cores. We’d need to bind 3 NICs to NUMA node 0 and 3 to NUMA node 1. We’d also need to set the starting CPUs for those NICs to be cores 0, 2, 4, 6, 8 and 10. In that way we can saturate every possible core.
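The reasoning above can be sketched in Python: first the power-of-2 limit that leaves us with 4 of the 6 cores on one NUMA node, then the 6-NIC layout with 2 cores per NIC. The function and field names are my own for illustration and are not registry value names.

```python
def largest_power_of_two(n):
    """Largest power of 2 that is less than or equal to n (n >= 1)."""
    power = 1
    while power * 2 <= n:
        power *= 2
    return power


def nic_plan(nics=6, cores_per_nic=2, nics_per_node=3):
    """Lay out NICs so that each one gets its own pair of cores on one NUMA node."""
    plan = []
    for i in range(nics):
        plan.append({
            "nic": i,
            "numa_node": i // nics_per_node,   # NICs 0-2 on node 0, NICs 3-5 on node 1
            "base_core": i * cores_per_nic,    # 0, 2, 4, 6, 8, 10
            "max_cores": cores_per_nic,
        })
    return plan


print("cores usable by 1 NIC on a 6 core node:", largest_power_of_two(6))
for entry in nic_plan():
    print(entry)
```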

        But to do this, we’d need to ensure that we can have multiple NICs, without using the teaming software. Which means we’d need to assign each NIC a unique IP address. To do that we need to make sure that the TSM clients can deal with targeting a server name with multiple IP addresses in DNS for that name. And if connectivity to the first IP address is lost, that TSM can fail over to one of the other IP addresses. We’ll test TSM and get back with our results later.

        But we need one more fundamental check before doing that: We need to make sure that the incoming packet, hitting a specific NUMA node and core is going to end up hitting the right thread of the TSM server where that packet is going to be dealt with and backed up. If we can’t align a backup client to the incoming NIC and align that NIC to the backup software thread that should process it, then we’ll be causing intra-CPU interrupts, or worse yet, cross NUMA interrupts. This would make the entire system much less scalable.

        image

        So this is how this would all look. The registry key to set the NUMA node to bind a NIC to is “*NumaNodeId” (including the * at the start). To set the base CPU, use “*RssBaseProcNumber”. To set the maximum number of processors to use, set “*MaxRssProcessors”.

        These keys are explained here: http://msdn.microsoft.com/en-us/library/windows/hardware/ff570864(v=vs.85).aspx

        and here: Performance Tuning Guidelines for Windows Server 2008 R2

        And more general information on how RSS works in Windows Server 2008 are here: Scalable Networking- Eliminating the Receive Processing Bottleneck—Introducing RSS

        Our problem in the above picture, however, is that our process doesn’t know to run its threads on the NUMA node and cores where the incoming packets are arriving. Had this been SQL server, we could have run separate instances configured to start using specific CPUs. Hopefully, one day, TSM will operate like this and become NUMA-node aware.

        I know this has been a long post, but for those who have read down to here, I do hope this has helped you with your troubleshooting using WPT.

      • Wildcard DNS Entries

        I was working on a case with a customer for something that was too weird to ignore.

        We wanted to use DNS Suffix Search Orders on the clients so that clients could query using short names for servers in DNS domains which weren’t their own.

        e.g. A PC in the domain child-dom-1.corp.contoso.com wanted to ping the short name “serverX”.

        ServerX had registered its name in the DNS zone matching its primary DNS Suffix: child-dom-2.corp.contoso.com

        So the answer is to set the DNS Suffix Search Order list. Prior to this the customer had configured the DNS zone child-dom-2.corp.contoso.com to use WINS forwarders, pointing to a WINS server which serverX was also using. But WINS was on the way out (see the previous blog for details on how to decommission WINS).

        DNS Suffix Search Order is configured on the properties of the NIC or in Group Policies (for all NICs):

        image

        We tried both methods, but the client was still not able to resolve names in any domain except the Primary DNS Suffix domain.

        Once there is a DNS Suffix Search Order list defined, Windows must use that list over the single, Primary DNS Suffix. So what was going on?

        When we ran nslookup and set debug=2 we could see that queries for a non-existent host (e.g. mickeymouse) would reply back with a SUCCESS message for the A record, but no IP address in the answer.

        The solution:

        In the zone child-dom-1.corp.contoso.com there was a record called * with a type of MX. This record makes requests for ALL types of other records (A, AAAA, CNAME etc) succeed. And because the DNS client was getting back successes, it didn’t need to try alternate DNS Suffixes.
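To make the mechanism concrete, here is a toy Python model of a suffix search list. It is not how the Windows DNS client is implemented. It only mimics the behaviour we observed: a "name does not exist" answer moves on to the next suffix, while a success with no data stops the search. The IP address and the mail host name are hypothetical.

```python
NXDOMAIN, NODATA, NOERROR = "NXDOMAIN", "NODATA", "NOERROR"


def make_zones(with_wildcard):
    """Two toy zones; the wildcard MX record lives in child-dom-1."""
    zones = {
        "child-dom-1.corp.contoso.com": {},
        "child-dom-2.corp.contoso.com": {"serverx": {"A": "10.0.0.5"}},
    }
    if with_wildcard:
        zones["child-dom-1.corp.contoso.com"]["*"] = {"MX": "mail.corp.contoso.com"}
    return zones


def query(zones, fqdn, rtype):
    """Answer one query: an exact match first, then the wildcard, else NXDOMAIN."""
    name, _, zone = fqdn.partition(".")
    records = zones.get(zone, {})
    rrset = records.get(name, records.get("*"))
    if rrset is None:
        return NXDOMAIN, None
    if rtype in rrset:
        return NOERROR, rrset[rtype]
    return NODATA, None  # the name "exists", but not with this record type


def resolve_short_name(zones, short, suffixes, rtype="A"):
    """Try each suffix in order; only a name that does not exist moves to the next one."""
    for suffix in suffixes:
        status, data = query(zones, f"{short}.{suffix}", rtype)
        if status != NXDOMAIN:
            return suffix, status, data
    return None, NXDOMAIN, None


SUFFIXES = ["child-dom-1.corp.contoso.com", "child-dom-2.corp.contoso.com"]
print(resolve_short_name(make_zones(True), "serverx", SUFFIXES))
print(resolve_short_name(make_zones(False), "serverx", SUFFIXES))
```

With the wildcard in place the search stops at the first suffix with an empty answer. Without it, the first suffix returns NXDOMAIN and the second suffix supplies the A record.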

        The wildcard MX record was, of course, deleted, and everything then worked as expected.

        But why have wildcard MX records?

        Wildcard MX records are good for when you have a large number of hosts which are not directly Internet-connected (for example, behind a firewall) and for administrative or political reasons it is too difficult to have individual MX records for every host, or to force all e-mail addresses to be "hidden" behind one or more domain names. In that case, you must divide your DNS into two parts, an internal DNS, and an external DNS.  The external DNS will have only a few hosts and explicit MX records, and one or more wildcard MXs for each internal domain.  Internally the DNS will be complete, with all explicit MX records and no wildcards.