Time for a Friday night post. Microsoft PFE raise the absence of system performance baselines pretty much in every Risk Assessment Program (RAP) or Health Check. Customers do not generally create performance baselines.

So, what are the typical uses for a performance baseline? Troubleshooting performance-related issues and facilitating capacity management comes to mind. “Hey, doesn’t Opsmgr have that information by default?” I hear you say. Opsmgr does indeed have some of it, but the performance information, for SQL Server at least, is not complete and only available for 7 days in the OperationsManager database and 10 days in the OperationsManagerDW database (with a default data retention configuration). We need to track performance baselines for systems over a much longer period.

Okay, so what is the plan (and make that step-by-step please)?

  1. Create a super-thorough WinWord template for a SQL Server (+ underlying Windows OS) Performance Baseline, so we know what performance counters, etc will be required – see attached docx.
  2. List all current Opsmgr performance counter collections. The following PowerShell script can be used to do this:
    Get-PerformanceCounter | Export-Csv $env:HOMEDRIVE\All_Hosts_All_PerfCounters.csv -NoTypeInformation
  3. Compare the template created in 1 above with the list created in 2 above. Identify performance collections not done by Opsmgr at present. Note: v6.1.314.35 (July 2010) of the SQL Management Pack and v6.0.6667.0 (September 2009) of the Base OS Management Pack are installed in my lab (yep, my OS MP upgrade release schedule for my personal lab is shocking, in terms of current priorities, it comes right after watching Outrageous Fortune, the V TV Series, analysing the skies for noticeable changes to the ozone layer and about a thousand other things). In my lab, this is the list of performance counters that are not collected:

    Object Name

    Counter Name

    Memory

    Pages Input/Sec

    Memory

    Free System Page Table Entries

    Paging file

    %Usage Peak

    Process (sqlservr)

    %Processor Time

    Process (msmdsrv)

    %Processor Time

    SQLServer:Access Methods

    Forwarded Records/sec

    SQLServer:Access Methods

    Full Scans / sec

    SQLServer:Access Methods

    Index Searches/sec

    SQLServer:Access Methods

    Page Splits/sec

    SQLServer:Access Methods

    Workfiles Created/sec

    SQLServer:Access Methods

    Worktables From Cache Ratio

    SQLServer:Access Methods

    Table Lock Escalations/sec

    SQLServer:Transactions

    Longest Running Transaction Time

    SQLServer:Memory Manager

    Granted Workspace Memory (KB)

    SQLServer:Memory Manager

    Maximum Workspace Memory (KB)

    SQLServer:Memory Manager

    Memory Grants Outstanding

    SQLServer:Memory Manager

    Memory Grants Pending

    SQLServer:Memory Manager

    Total Server Memory (KB)

    SQLServer:Memory Manager

    Target Server Memory (KB)

    SQLServer:Databases

    Data File(s) Size (KB)

    SQLServer:Databases

    Log Bytes Flushed/sec

    SQLServer:Databases

    Log File(s) Size (KB)

    SQLServer:Databases

    Log File(s) Used Size (KB)

    SQLServer:Databases

    Log Flush Wait Time

    SQLServer:Databases

    Log Flush Waits/sec

    SQLServer:Databases

    Log Flushes/sec

    SQLServer:Databases

    Log Growths

    SQLServer:Databases

    Log Shrinks

    SQLServer:Databases

    Log Truncations

    SQLServer:Databases

    Percent Log Used

    SQLServer:Buffer Manager

    Free List Stalls/sec

    SQLServer:Buffer Manager

    Lazy Writes/Sec

    SQLServer:Buffer Manager

    Checkpoint Pages/sec

    SQLServer:Buffer Manager

    Page Life Expectancy

    SQLServer:Buffer Manager

    Page Lookups/sec

    SQLServer:Buffer Manager

    Page Reads/sec

    SQLServer:Buffer Manager

    Page Writes/sec

    SQLServer:Buffer Manager

    Readahead/sec

    SQLServer:Buffer Manager

    Database Pages

    SQLServer:Buffer Manager

    Procedure Cache Pages

    SQL Server:Buffer Manager

    Target Pages

    SQL Server:Buffer Manager

    Free Pages

    SQL Server:Buffer Manager

    Stolen Pages/sec

    SQL Server:SQL Statistics

    Batch requests/sec

    SQL Server:SQL Statistics

    SQL Attention Rate/sec

    SQL Server:Cursor Manager by Type

    Active Cursors

    SQL Server:SQL Errors

    Errors/sec

    SQL Server:Deprecated Features

    Usage

    SQL Server:General Statistics

    Logouts/sec

    SQL Server:Latches

    Latch Waits/sec

    SQL Server:Latches

    Avg Latch Wait Time (ms)

    SQL Server:Latches

    Total Latch Wait Time (ms)

    SQL Server:Locks

    Lock Wait Time (ms)

    SQL Server:Locks

    Avg Wait Time (ms)

  4. Create custom windows performance collection rules in Opsmgr for the missing performance counters (i.e. those above). Note: There are obvious best practices that should be followed to make sure that you don’t blow-up, destroy or cripple your Opsmgr or SQL Server systems (or at least create a semblance of following good practice)…e.g. deploy in your test environment first (yep, soooo many of us have them for Opsmgr, right?), choose a sensible performance collection sample frequency (tip, every 1 sec is not it), calculate the additional storage requirements in the OperationsManager and OperationsManagerDW databases, enable the new rules for a limited number of target instances to start with, etc.
  5. Okay, now use PowerShell to export raw data for all of the performance counters in the attached file. The script below will export ‘SQL Re-Compilations/sec’ counter data for all ‘Microsoft.SQLServer.2008.DBEngine’ discovered instances.
    Get-MonitoringClass -Name "Microsoft.SQLServer.2008.DBEngine" | Get-MonitoringObject | % `
    {$aaa=$_.Id;$bbb=$_.PathName;Get-PerformanceCounter | ? {$_.MonitoringObjectId -eq $aaa -and $_.ObjectName -eq "SQLSERVER:SQL Statistics" -and $_.CounterName -eq "SQL Re-Compilations/sec"} | Get-PerformanceCounterValue -StartTime ((get-date).adddays(-7)) -EndTime (get-date) | Select TimeSampled,SampleValue | Export-Csv -NoTypeInformation $env:HOMEDRIVE\$bbb.csv};
  6. Manually edit the script above for all desired counters…Smile…I know, I know…it is real easy to automate that <sigh>…The PowerShell script below will get the raw performance data only for relevant Objects and Counters specified in a CSV file:
    Import-Csv c:\JVTools\SQL.csv | % `
    {$Counter=$_.Counter;$Object=$_.Object;Get-MonitoringClass -Name "Microsoft.SQLServer.2008.DBEngine" | Get-MonitoringObject | % `
    {$aaa=$_.Id;$bbb=$_.PathName;$Object;$Counter;$count++;$CounterStr=$Counter-replace("/"," per ");$ObjectStr=$Object-replace(":","-");Get-PerformanceCounter | ? {$_.MonitoringObjectId -eq $aaa -and $_.ObjectName -eq $Object -and $_.CounterName -eq $Counter} | Get-PerformanceCounterValue -StartTime ((get-date).adddays(-7)) -EndTime (get-date) | Select TimeSampled,SampleValue | Export-Csv -NoTypeInformation $env:HOMEDRIVE\$Count$ObjectStr$CounterStr$bbb.csv}};
    Note: The CSV format is simply: Object,Counter. Use at least two different CSV's for the OS and SQL Server and replace "Microsoft.SQLServer.2008.DBEngine" with "Microsoft.Windows.OperatingSystem" (or a class more relevant to your scenario like "Microsoft.Windows.LogicalDisk") and the pointer to the CSV in the above script for the OS performance data collection.
  7. Finally, manually crunch the data in Excel and complete the attached WinWord template…<sigh> or use the Measure-Object (not Measure-Command) in PowerShell to create the 7 day averages where required. Now at least manually fill out the WinWord template and create the baseline graphs-over-time…<sigh> or use PowerShell to do that as well Smile. A post for another day perhaps?
  8. Repeat this baseline creation every 3 or so months (depending on how dynamic and/or important your system is), before and after major changes like service packs, etc. Use it for troubleshooting, load balancing, fighting off business and application owners that want to overload your system and to help in getting that next salary increase when your boss asks you for a list of proactive things you have done lately.

I know that this data can also be retrieved from the OperationsManagerDW database in SQL (the OperationsManager database is not a good idea from an Opsmgr performance perspective...as the SQL guy, you may have the pesky Opsmgr Admin on your case with his own pesky Opsmgr performance baseline indicating that system stats degraded since you started fiddling) and I also know that a custom Opsmgr report to replace the WinWord template is very possible. That is a post for another time, this time the focus was on using PowerShell. I am also aware that this view is largely server-centric and that a more holistic end-to-end service performance baseline would be a useful thing...hmm Opsmgr does include features for creating Distributed Applications...wonder if I can target that with the MonitoringObject?

Have fun!

Johan