PRF: Perceived System Sluggishness



Description:  Several factors can cause a system to be sluggish, or to appear sluggish.

 

Scoping the Issue:  Performance Monitor (Perfmon) logs can be gathered on any Windows system to help troubleshoot perceived system sluggishness.  If you suspect a performance-related issue, capturing Perfmon logs while the problem is occurring may help determine the cause.  Pool Monitor (Poolmon) logs are also valuable for analyzing potentially high Paged Pool and NonPaged Pool usage.

 

Data Gathering:  In all instances, collect either MPS Reports with the General, Internet and Networking, Business Networks and Server Components diagnostics, or a Performance-oriented MSDT manifest.  Additional data required may include the following:

  • Performance Monitor logs that include the timeframe when the Working Set Trimming occurred.  The length of time it takes the server to go from a normal state to a memory leak state determines the Perfmon capture interval.  Use the table below to set the capture interval.  You can create the log parameters manually, or by using the Performance Monitor Wizard.  Required counters include:
    • Cache / All Counters / All Instances
    • Memory / All Counters / All Instances
    • Process / All Counters / All Instances
    • Processor / All Counters / All Instances
    • Physical Disk / All Counters / All Instances

If the average time to issue is:    The capture interval should be:
Weekly                              14 minutes
Daily                               120 seconds
Hourly                              5 seconds
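
As one way to create the log parameters manually, the counters and intervals above could be turned into a logman command line. The sketch below only builds the command; the collector name, the output path, and the helper names are hypothetical, and the intervals are simply the table values rewritten in logman's hh:mm:ss form.

```python
# Sketch: build a "logman create counter" command for the counters listed above.
# The collector name and output path are placeholders, not values from this article.

# Capture intervals from the table, written as hh:mm:ss for logman's -si switch.
SAMPLE_INTERVALS = {
    "weekly": "00:14:00",  # 14 minutes
    "daily": "00:02:00",   # 120 seconds
    "hourly": "00:00:05",  # 5 seconds
}

# "All Counters / All Instances" for each required object.
COUNTER_PATHS = [
    r"\Cache\*",
    r"\Memory\*",
    r"\Process(*)\*",
    r"\Processor(*)\*",
    r"\PhysicalDisk(*)\*",
]


def build_logman_command(time_to_issue,
                         name="SluggishnessLog",
                         output=r"C:\PerfLogs\SluggishnessLog"):
    """Return the argument list for a counter log matching the table above."""
    interval = SAMPLE_INTERVALS[time_to_issue.lower()]
    return ["logman", "create", "counter", name,
            "-c", *COUNTER_PATHS,
            "-si", interval,
            "-o", output,
            "-f", "bin"]


if __name__ == "__main__":
    # Print the arguments for a problem that shows up about once a day.
    print(build_logman_command("Daily"))
```

After the collector has been created, it would still need to be started (for example with "logman start SluggishnessLog") before the problem period begins.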

  • Pool Monitor (Poolmon) logs that include the timeframe when the memory leak is occurring.  As with Perfmon, the Poolmon capture interval is set based on the frequency of the symptoms.  The table below provides some guidelines for setting the interval.  We strongly recommend capturing Perfmon and Poolmon data simultaneously so we can correlate the events.

 

If the average time to issue is:    The capture interval should be:
Weekly                              1 hour
Daily                               15 minutes
Hourly                              60 seconds
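
A periodic Poolmon capture along these lines could be scripted as follows. This is a minimal sketch that assumes poolmon.exe is on the PATH and that its -n switch writes a snapshot to the named file and then exits (with -b sorting by bytes); the file prefix and helper names are hypothetical.

```python
import subprocess
import time

# Capture intervals from the table above, in seconds.
POOLMON_INTERVALS_SECONDS = {
    "weekly": 3600,  # 1 hour
    "daily": 900,    # 15 minutes
    "hourly": 60,    # 60 seconds
}


def poolmon_command(snapshot_path):
    """Arguments for one snapshot (assumes -b sorts by bytes, -n writes a file)."""
    return ["poolmon.exe", "-b", "-n", snapshot_path]


def snapshot_name(prefix, sequence):
    """Numbered file name so that snapshots sort in capture order."""
    return f"{prefix}_{sequence:04d}.log"


def capture_snapshots(time_to_issue, count, prefix="poolsnap"):
    """Take `count` snapshots, waiting the table's interval between them."""
    interval = POOLMON_INTERVALS_SECONDS[time_to_issue.lower()]
    for sequence in range(count):
        subprocess.run(poolmon_command(snapshot_name(prefix, sequence)), check=True)
        if sequence < count - 1:
            time.sleep(interval)
```

Calling capture_snapshots("daily", 96) would, under those assumptions, cover roughly one day at 15-minute intervals. Starting it at the same time as the Perfmon log makes the two data sets easier to line up.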

  • It may also be necessary to capture a complete memory dump of the server while it is in the problem state.  In most cases this will be a Ctrl-ScrollLock memory dump.  However, if your system has a “Lights Out Management” system (such as iLO), you will most likely want to capture an NMI dump instead.  In either case, ensure that you have a pagefile on the root drive that is equal to the amount of RAM on the system plus about 100 MB.
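
The pagefile guideline is simple arithmetic, sketched here for convenience. The function names are hypothetical, and the 100 MB figure is the approximate overhead quoted above rather than an exact requirement.

```python
def minimum_pagefile_mb(ram_mb, overhead_mb=100):
    """Pagefile size suggested above: installed RAM plus about 100 MB."""
    return ram_mb + overhead_mb


def pagefile_is_large_enough(pagefile_mb, ram_mb):
    """True if the root-drive pagefile meets the guideline for a complete dump."""
    return pagefile_mb >= minimum_pagefile_mb(ram_mb)


if __name__ == "__main__":
    ram_mb = 16 * 1024  # a server with 16 GB of RAM
    print(minimum_pagefile_mb(ram_mb))              # 16484
    print(pagefile_is_large_enough(16384, ram_mb))  # False: 100 MB short
```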

 

Troubleshooting / Resolution:  After you have gathered this data, review the following:

  • MPS Reports
    • Check the System Event Log for any Event ID 9, 11 or 15 messages.  The presence of these messages indicates a possible disk issue, more specifically a hardware issue.  If you are seeing these events, you should engage your hardware vendor.
    • Check the System Event Log for Event ID 333 messages.  These indicate a potential resource issue.  We have additional information in a previous blog post on Event ID 333, and also the Event ID 333 Support Center topic.
  • Performance Monitor Logs
    • Check for disk-related performance issues.  Some common resolutions include adding more spindles, adjusting SAN configuration settings and addressing potential hardware issues.
    • Check for high CPU issues for a specific process.  The information in our blog post on Processor Bottlenecks may be useful when troubleshooting high CPU issues, as well as the information in our Support Center topic on High CPU.
    • Check for low System PTEs or memory pressure caused by a specific process.  Our blog post on Troubleshooting Memory Issues can help get you started.
    • Check for processes that are using an inordinate number of handles.  Our post on Troubleshooting Server Hangs goes into more detail.
  • Pool Monitor Logs
    • Determine if there are any high NonPaged or Paged Pool issues.  In the event that Pool Memory has been depleted, an Event ID 2019 (NonPaged) or 2020 (Paged) error will be logged.
  • Process Explorer
    • In the event that you do have a high CPU issue, you can use Process Explorer to identify the process by sorting the CPU column.  Bring up the Properties of the process and click the Threads tab. You can sort by CPU and see which threads are taking up most of the CPU time.  If symbols are configured, you may be able to get more details on the thread, such as Stack information.
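
The Event IDs mentioned in the checks above can be gathered into a single System log query. The sketch below builds an XPath filter and a matching wevtutil command line; it assumes a Windows version that ships wevtutil (Windows Vista / Server 2008 or later), and the helper names and the default of 50 events are hypothetical.

```python
# Event IDs named in the checks above, with the meaning given for each.
EVENT_HINTS = {
    9: "possible disk issue (hardware)",
    11: "possible disk issue (hardware)",
    15: "possible disk issue (hardware)",
    333: "potential resource issue",
    2019: "NonPaged Pool depleted",
    2020: "Paged Pool depleted",
}


def build_event_query(event_ids):
    """XPath filter matching any of the given Event IDs."""
    clause = " or ".join(f"EventID={event_id}" for event_id in event_ids)
    return f"*[System[({clause})]]"


def build_wevtutil_command(event_ids, max_events=50):
    """Arguments to list the newest matching System log events as text."""
    return ["wevtutil", "qe", "System",
            f"/q:{build_event_query(event_ids)}",
            "/f:text", f"/c:{max_events}", "/rd:true"]


if __name__ == "__main__":
    print(build_event_query([9, 11, 15]))
    # *[System[(EventID=9 or EventID=11 or EventID=15)]]
    print(build_wevtutil_command(sorted(EVENT_HINTS)))
```

On older systems without wevtutil, the same Event IDs can be filtered in the Event Viewer or in the event logs collected by MPS Reports.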

 

Additional Resources: 

Comments:
  • I need help clearing out the System Volume Information folder. There is currently 277 GB of data in that folder alone and I am running out of space on this drive. Any suggestions on how to clear that folder?