December, 2012

  • Use Caution When Implementing IPC for Performance Counters

    Recently I was working with a developer who had created performance counters that work in Performance Monitor but are never collected in a user-defined data collector set.  The customer explained that their counters update named shared memory ...
  • Where has all my Physical RAM gone?

    Hello AskPerf! Ranajoy here from the Windows Performance Team. One of our biggest call generators here in support is low Available Memory shown in Windows Task Manager. Today we are going to take a brief look at this value and where this “Missing Memory” may be hiding.

    Picture the following:

    • Windows 2008 R2 Server with 64GB RAM
    • Windows Task Manager shows Available Memory at 400MB
    • Cache shows 60GB used

    Is such a high value in System Cache detrimental to system performance?

    It’s typically not something to be concerned about, because these pages sit on the standby list (discussed shortly). To be sure, we can use tools like Perfmon or Sysinternals RAMMap to determine how this memory is divided up.

    Collecting Perfmon and RAMMap output will show us something similar to the values below:

    PERFMON Capture

    Memory             Minimum     Maximum     Average
    ==================================================
    Available Bytes    56,495MB    60,304MB    58,349MB    (Total Available Memory in the System)

     

    As you can see, Available Bytes (Available Physical Memory) is around 58GB.

    A RAMMap capture shows the following:

    [Screenshot: RAMMap capture]

    It would appear that most of our RAM is in Mapped File Usage. So, what do these columns mean?

    Mapped File: Also known as section objects. A mapped “view” of a file exists when the contents of that file are mapped to virtual addresses in memory. This can be a process mapping views of files into its own memory (for reading or writing) or the system file cache.

    Active: Pages of physical RAM in active use by the specified category (usually a process working set or the system working set).

    Standby: Pages of physical RAM not actively being used. They are left in physical RAM, but the memory manager will repurpose them first (either returning them to the active list or zeroing them out for reuse) if something needs physical RAM for active pages. Standby pages are essentially cache: it’s better to keep infrequently used data in RAM “just in case” than to push it out to disk when the memory isn’t needed for anything else.

    As we see here, the Cache is composed mainly of memory pages in the standby list.

    In this scenario, the OS is keeping memory pages in Standby so that they can be given immediately to a process or service requesting RAM. As you can see, Mapped File Total equals the amount of Available Bytes we saw in our Perfmon log.
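    The relationship between the standby list and Available Bytes can be sketched in a few lines of Python. Windows reports Available Bytes as the sum of the standby, free, and zeroed page lists. The per-list figures below are hypothetical, chosen only so that they add up to the 58,349MB Perfmon average above; they are not taken from the RAMMap capture.

```python
from dataclasses import dataclass


@dataclass
class PageLists:
    """Sizes of the page lists that make up available memory, in MB."""
    standby_mb: int
    free_mb: int
    zeroed_mb: int

    @property
    def available_mb(self) -> int:
        # Available memory = standby + free + zeroed pages.
        return self.standby_mb + self.free_mb + self.zeroed_mb

    @property
    def standby_share(self) -> float:
        # Fraction of available memory that is still holding cached data.
        return self.standby_mb / self.available_mb


# Hypothetical breakdown, for illustration only.
lists = PageLists(standby_mb=57_949, free_mb=250, zeroed_mb=150)
print(f"Available: {lists.available_mb:,} MB")
print(f"Standby share of available: {lists.standby_share:.1%}")
```

    With numbers like these, nearly all of the "available" memory is standby cache, which is why a small free figure on its own says little about memory pressure.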

    Another tool that we can use to review Memory Usage is Process Explorer.

    Note:

    The Process Explorer screenshots are taken from a different machine and under different circumstances.

    [Screenshot: Process Explorer]

     

    Click View | System Information | Memory tab – the following screen appears:

    [Screenshot: Process Explorer System Information, Memory tab]

    The values listed under Physical Memory and Page list(K) give you deeper memory details than Windows Task Manager shows, and they will match the values you see in RAMMap.

    -Until next time!

  • Determining the source of Bug Check 0x133 (DPC_WATCHDOG_VIOLATION) errors on Windows Server 2012

    What is a bug check 0x133? Starting in Windows Server 2012, a DPC watchdog timer is enabled that will bug check a system if too much time is spent in DPC routines. This bug check was added to help identify drivers that are deadlocked or misbehaving.  ...
  • Cluster-Aware Updating (CAU) interaction with Proxy Servers

    Welcome back to the CORE Team blog. Cluster-Aware Updating (CAU) is an automated feature that allows you to update clustered servers with little or no loss in availability during the update process. Cluster updates are obtained using one of three methods:

    1. Connecting to the internet and downloading patches from Windows Update\Microsoft Update
    2. Connecting to an internal WSUS server and downloading approved updates
    3. Downloading hotfixes from the internet, placing those fixes on an internal file server share, and then using the CAU hotfix-plugin to patch a cluster

    During an Updating Run, CAU transparently performs the following tasks:

    • Place the node being updated into maintenance mode
    • Move all cluster roles off the node being updated (Virtual Machine roles are live migrated)
    • Install updates and all dependent updates
    • Restart the node if necessary during the patching process
    • Bring the updated node out of maintenance mode
    • Restore the cluster roles to the updated node
    • Continue updating the remaining nodes in the cluster using the same steps

    For more information on Cluster-Aware Updating (CAU), review the following TechNet article: http://technet.microsoft.com/en-us/library/hh831694.aspx
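
    The sequence of tasks in an Updating Run can be sketched as a simple loop. This is only a toy simulation of the steps listed above, not how CAU is implemented: the function name, the node and role names, and the round-robin choice of where to move roles are all made up for illustration.

```python
def run_updating_pass(nodes, roles, install_updates):
    """Simulate one Updating Run (illustrative only, not CAU's real logic).

    nodes: list of node names.
    roles: dict mapping role name -> current owner node (modified in place).
    install_updates: callable(node) -> True if the node needs a restart.
    Returns the ordered list of actions taken.
    """
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to move roles between")
    actions = []
    for node in nodes:
        actions.append(f"{node}: enter maintenance mode")
        others = [n for n in nodes if n != node]
        # Drain every role this node currently owns onto the other nodes.
        moved = [role for role, owner in sorted(roles.items()) if owner == node]
        for i, role in enumerate(moved):
            roles[role] = others[i % len(others)]
            actions.append(f"{node}: move {role} to {roles[role]}")
        actions.append(f"{node}: install updates")
        if install_updates(node):
            actions.append(f"{node}: restart")
        actions.append(f"{node}: exit maintenance mode")
        # Bring the drained roles back to the freshly updated node.
        for role in moved:
            roles[role] = node
            actions.append(f"{node}: restore {role}")
    return actions


# Hypothetical two-node cluster with two roles.
roles = {"File Server": "N1", "VM1": "N2"}
for line in run_updating_pass(["N1", "N2"], roles, lambda node: True):
    print(line)
```

    Only one node is out of service at a time, and each role ends the run on the node that owned it at the start.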

    If a proxy server is required to gain access to the internet, then CAU must be configured to use it. CAU is a system-level process and does not use a user-mode proxy server configuration. A user-mode configuration is the kind configured in Internet Explorer, either set manually by the user or applied via a Group Policy Object (GPO) in Active Directory. To configure a system-level proxy server, use the netsh command line.

    netsh winhttp set proxy myproxy.fabrikam.com:80 "<local>"

    The above command configures a system-level proxy server using port 80 and sets the 'minimal' exceptions for local addresses. While this would appear to be sufficient, there is an unfortunate side effect of this configuration if the Failover Cluster is supporting highly available File Servers. With a configuration similar to the above, a user will not be able to add a file share to the HA File Server role. The process will appear to start normally, but it will terminate unexpectedly.

    [Screenshot: share creation process terminating]

    There will be no errors registered by the cluster service in the system event log. However, there will be several errors registered in the Windows Remote Management log. The first error is:

    Event ID: 137
    Source: Windows Remote Management
    Network layer returned ERROR_WINHTTP_NAME_NOT_RESOLVED - The server name cannot be resolved. Aborting the operation.

    This is followed by another error -

    Event ID: 49
    Source: Windows Remote Management
    The WinRM protocol operation failed due to the following error: The WinRM client cannot process the request because the server name cannot be resolved.

    The final error recorded -

    Event ID: 142
    Source: Windows Remote Management
    WSMan operation Enumeration failed, error code 2150859193

    Decoding the error code -

    [Screenshot: decoded error code]
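
    The event reports the code in decimal, while most lookup tools expect hexadecimal. A minimal sketch of the conversion follows, assuming the value uses the standard HRESULT bit layout (failure bit, facility, code). The helper name is made up, and the sketch does not look up the message text.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT-style value into its standard fields."""
    value &= 0xFFFFFFFF
    return {
        "hex": f"0x{value:08X}",
        # The same bits interpreted as a signed 32-bit integer.
        "signed": value - (1 << 32) if value & 0x80000000 else value,
        "failure": bool(value >> 31),        # top bit set means failure
        "facility": (value >> 16) & 0x1FFF,  # 13-bit facility number
        "code": value & 0xFFFF,              # low 16 bits
    }


fields = decode_hresult(2150859193)  # error code from Event ID 142
for name, val in fields.items():
    print(f"{name}: {val}")
```

    For the code above, the hexadecimal form is 0x803381B9.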

    The 'server name' in question is the name of the Client Access Point (CAP) in the File Server group where the share is being created (in my test, the CAP NetBIOS name was Test-FS). Looking in the cluster log, we see -

    000010c8.00001028::2012/11/19-17:24:39.348 INFO [RES] Network Name <Test-FS>: Netbios: Slow Operation, FinishWithReply: 0

    000010c8.00001028::2012/11/19-17:24:39.348 INFO [RES] Network Name: [NN] got sync reply: 0

    000010c8.00001028::2012/11/19-17:24:39.348 INFO [RES] Network Name <Test-FS>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle

    000010c8.00001028::2012/11/19-17:24:39.348 INFO [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:d524be11-4b9a-4e1e-855d-9227ea61988d:Netbios

    000010c8.00000380::2012/11/19-17:24:39.348 INFO [RES] Network Name <Test-FS>: Netbios: Slow Operation, FinishWithReply: 0

    000010c8.00000380::2012/11/19-17:24:39.348 INFO [RES] Network Name: [NN] got sync reply: 0

    000010c8.00000380::2012/11/19-17:24:39.348 INFO [RES] Network Name <Test-FS>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle

    000010c8.00000380::2012/11/19-17:24:44.348 INFO [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:d524be11-4b9a-4e1e-855d-9227ea61988d:Netbios

    000010c8.00001028::2012/11/19-17:24:44.348 INFO [RES] Network Name <Test-FS>: Netbios: Slow Operation, FinishWithReply: 0

    000010c8.00001028::2012/11/19-17:24:44.348 INFO [RES] Network Name: [NN] got sync reply: 0

    000010c8.00001028::2012/11/19-17:24:44.348 INFO [RES] Network Name <Test-FS>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle
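
    When scanning a longer cluster log for entries like these, it can help to split each line into fields. The following is a rough sketch based only on the layout of the lines shown above (hexadecimal process and thread IDs, timestamp, level, component, message); the function name and regular expression are my own and may not cover every line format that appears in a cluster log.

```python
import re
from datetime import datetime

# Layout inferred from the sample lines: pid.tid::timestamp LEVEL [COMPONENT] message
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<component>[A-Z]+)\]\s+(?P<message>.*)$"
)


def parse_cluster_log_line(line: str):
    """Return a dict of fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"], 16)
    fields["tid"] = int(fields["tid"], 16)
    fields["ts"] = datetime.strptime(fields["ts"], "%Y/%m/%d-%H:%M:%S.%f")
    return fields


sample = (
    "000010c8.00000380::2012/11/19-17:24:44.348 INFO [RES] Network Name: "
    "Agent: Sending request Netname/RecheckConfig to "
    "NN:d524be11-4b9a-4e1e-855d-9227ea61988d:Netbios"
)
entry = parse_cluster_log_line(sample)
print(entry["ts"], entry["level"], entry["component"], entry["message"])
```

    Filtering the parsed entries on the level field (for example, ERR) is then a one-line list comprehension.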

    The solution is to modify the local address exceptions in the proxy server configuration as shown in this example -

    netsh winhttp set proxy myproxy.fabrikam.com:80 "<local>;*.fabrikam.com"

    We added the wildcard exception for the local domain (*.fabrikam.com). With this updated configuration in place, the share (test2) creation process completes normally with no errors registering in the Windows Remote Management log. Looking at the cluster log -

    000010c8.00000200::2012/11/19-17:29:24.346 INFO [RES] Network Name <Test-FS>: Netbios: Slow Operation, FinishWithReply: 0

    000010c8.00000200::2012/11/19-17:29:24.346 INFO [RES] Network Name: [NN] got sync reply: 0

    000010c8.00000200::2012/11/19-17:29:24.346 INFO [RES] Network Name <Test-FS>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle

    00000920.0000072c::2012/11/19-17:29:24.479 INFO [NM] Received request from client address FABRIKAM-N21.

    000010e8.00000f10::2012/11/19-17:29:24.481 INFO [RES] Physical Disk <Cluster Disk 1>: Path Y:\Shares\test2 can be on the disk

    000010c8.00001028::2012/11/19-17:29:24.483 INFO [RES] Network Name <Test-FS>: Getting Read/Write private properties

    00000920.0000072c::2012/11/19-17:29:24.486 INFO [GEM] Sending 1 messages as a batched GEM message

    00000920.0000072c::2012/11/19-17:29:24.486 INFO [GUM] Node 2: Processing RequestLock 2:149

    00000920.000013a0::2012/11/19-17:29:24.487 INFO [GUM] Node 2: Processing GrantLock to 2 (sent by 3 gumid: 584)

    00000920.0000072c::2012/11/19-17:29:24.487 INFO [GEM] Sending 1 messages as a batched GEM message

    00000920.0000072c::2012/11/19-17:29:24.489 ERR [DM] Dm::DmBaseKey::SetValue: ERROR_ACCESS_DENIED(5)' because of 'status'(test2)

    00000920.0000072c::2012/11/19-17:29:24.489 INFO [GEM] Sending 1 messages as a batched GEM message

    000010c8.00000600::2012/11/19-17:29:24.490 INFO [RES] File Server <File Server (\\Test-FS)>: Created share test2

    00000920.00000f74::2012/11/19-17:29:24.491 INFO [GEM] Sending 1 messages as a batched GEM message

    000010c8.00000200::2012/11/19-17:29:24.527 INFO [RES] Network Name <Test-FS>: Getting Read/Write private properties

    00000920.00000b34::2012/11/19-17:29:24.552 INFO [GEM] Sending 1 messages as a batched GEM message

    000010c8.00000600::2012/11/19-17:29:24.553 INFO [RES] File Server <File Server (\\Test-FS)>: Updated share test2
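
    To see why the wildcard exception makes a difference, here is a simplified model of how a bypass list is evaluated. It is a sketch, not WinHTTP's actual implementation: it assumes that "<local>" matches only host names that contain no period, and that the other entries are simple wildcard patterns. The fully qualified name test-fs.fabrikam.com is a hypothetical FQDN for the Test-FS CAP.

```python
from fnmatch import fnmatchcase


def bypasses_proxy(host: str, bypass_list: str) -> bool:
    """Return True if host would skip the proxy under this simplified model."""
    host = host.lower()
    for entry in bypass_list.split(";"):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry == "<local>":
            # Assumption: "<local>" covers only names without a period.
            if "." not in host:
                return True
        elif fnmatchcase(host, entry):
            return True
    return False


original = "<local>"
updated = "<local>;*.fabrikam.com"

for name in ("test-fs", "test-fs.fabrikam.com", "www.microsoft.com"):
    print(name, bypasses_proxy(name, original), bypasses_proxy(name, updated))
```

    Under this model, a request addressed to the fully qualified name goes to the proxy with the original list and bypasses it only once *.fabrikam.com is added, which would be consistent with the name resolution errors seen in the Windows Remote Management log.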

    Hope you found this information helpful.

    Thanks, and come back again soon.

    Chuck Timon
    Senior Support Escalation Engineer
    Microsoft Enterprise Platforms Support
    High Availability\Virtualization Team