This guide was written based on the 6.0.6958.0 version of the Windows Server Operating System Management Pack.
Changes to built-in monitors and rules:
Many rules and monitors had their default settings changed to provide a better out-of-the-box experience. If you have overrides against any of these, give them a fresh look:
· “Avg Disk Seconds per Write/Read/Transfer” monitors changed from Average Threshold monitortype to Consecutive Samples Threshold monitortype.
o This is VERY good – this stops all the noise for the default enabled Sec/Transfer monitor, caused by momentary perf spikes.
o The default threshold is set to “0.04” (the counter is in seconds), which is 40 ms of latency. This is a good generic rule of thumb for the typical server.
o The default sample rate is once per minute, for 15 consecutive samples.
o Note – make sure you implement or at least evaluate hotfixes 2470949 or 2495300 for 2008R2 and 2008 Operating systems, which affect these disk counters.
o Review any overrides you previously set on these monitors to see whether they are still needed.
· Disabled “Percentage Committed Memory in Use” monitor
o This monitor used to change state when more than 80% of memory was utilized. This created unnecessary noise because more and more server roles (SQL, Exchange) use all available memory, so the monitor was not always actionable.
· Disabled “Total Percentage Interrupt Time” and “Total DPC Time Percentage”.
o These monitors would often generate alert and state noise in heavily virtualized environments, especially when the CPUs are oversubscribed or temporarily under heavy load. They were turned off by default because the hypervisor host exposes better performance counters for tracking this condition than these OS-level counters.
· Added “Free System Page Table Entries” and “Memory Pages per Second” monitors. These are both enabled out of the box to track excessive paging conditions. Also added MANY perf collection rules targeting memory counters, some disabled by default, some enabled.
· “Total CPU Utilization Percentage” monitor: the number of samples was increased from 3 to 5, and the timeout was shortened from 120 to 100 seconds so that it is less than the 120-second interval.
· Disabled the following perf counter collection rules by default:
o Avg Disk Sec/Write
o Avg Disk Sec/Read
o Disk Writes Per Second
o Disk Reads Per Second
o Disk Bytes Per Second
o Disk Read Bytes Per Second
o Disk Write Bytes Per Second
o Average Disk Read Queue Length
o Average Disk Write Queue Length
o Average Disk Queue length
o Logical Disk Split I/O per second
o Memory Commit Limit
o Memory Committed Bytes
o Memory % Committed Bytes in use
o Memory Page Reads per Second
o Memory Page writes per second
o Page File % use
o Pages Input per second
o Pages output per second
o System Cache Resident Bytes
o System Context Switches per second
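
To make the change to the “Avg Disk Seconds per Write/Read/Transfer” monitors more concrete, here is a rough Python sketch of why a consecutive-samples check is quieter than an average check. This is only a simplified model, not the management pack's actual implementation: the function names and the sample data are made up for illustration. The 0.04 threshold and the 15-sample count are the defaults described above.

```python
from typing import Sequence


def average_threshold_breached(samples: Sequence[float], threshold: float) -> bool:
    """Hypothetical model of an average check: trips when the mean of the window exceeds the threshold."""
    return sum(samples) / len(samples) > threshold


def consecutive_samples_breached(samples: Sequence[float], threshold: float, required: int) -> bool:
    """Hypothetical model of a consecutive-samples check: trips only when
    `required` samples in a row are all above the threshold."""
    run = 0
    for value in samples:
        # Any sample at or below the threshold resets the streak.
        run = run + 1 if value > threshold else 0
        if run >= required:
            return True
    return False


THRESHOLD = 0.04   # seconds (40 ms), the default described above
REQUIRED = 15      # consecutive one-minute samples, the default described above

# Made-up data: 14 healthy samples (5 ms) and one momentary 2-second spike.
spike = [0.005] * 14 + [2.0]
# Made-up data: 15 minutes of sustained 50 ms latency.
sustained = [0.05] * 15

print(average_threshold_breached(spike, THRESHOLD))                 # True: one spike drags the mean to 0.138
print(consecutive_samples_breached(spike, THRESHOLD, REQUIRED))     # False: the streak never reaches 15
print(consecutive_samples_breached(sustained, THRESHOLD, REQUIRED)) # True: all 15 samples are above 40 ms
```

In this model, a single momentary spike is enough to trip the average check but not the consecutive-samples check, which only fires when the latency stays high for the whole 15-minute window.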