Kevin Holman's System Center Blog

Posts in this blog are provided "AS IS" with no warranties, and confer no rights. Use of included script samples is subject to the terms specified in the Terms of Use.

OpsMgr – Authoring performance collection rules for cluster disks – the *right* way.


 

I recently wrote about the Windows Server Operating System Management Pack version 6.0.6989.0 HERE.  In the “known issues” section of that article, I discussed that the MP is solid, except that there are no performance collection rules for cluster disks (latency, free space, and so on) to leverage in your reports.  We now have three different types of common disks in Windows servers: Logical Disks, Cluster Disks, and Cluster Shared Volumes.  This means we need three different sets of rules, one targeting each class, for collecting performance data.  This article is a walkthrough of how to correctly create performance collection rules to collect performance data from Cluster Disks.  It can also be used as an example of how to collect performance data from any multi-instance object.

I previously wrote on a similar topic here:  http://blogs.technet.com/b/kevinholman/archive/2009/11/24/writing-monitors-to-target-logical-or-physical-disks.aspx   That article covered correctly using MONITORS to examine disks (or any multi-instance object) and the caveats around monitors.  This article is different, in that there are specific rules that must be obeyed for collection rules in order to have a positive reporting experience.

 

The wrong way:

 

Essentially, to collect performance data on these cluster disks, I could simply write a rule, target the “Windows Server Operating System” class, and then collect something like this from perfmon:

image

That would work fine in the OpsDB, and you would see the data in your views in the console.  HOWEVER, this breaks in the warehouse.  In the warehouse, all the raw data is aggregated NOT based on the counter, but on the RULE that collected it.  This means that for an hourly aggregation, a rule like the one above would aggregate ALL the collected data together, and the result would be completely meaningless, because it would average the data from all instances of disks on a Windows Server.  You can NOT use “include all instances” when collecting data.  This is a no-no based on how our warehouse is designed.
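
To make the problem concrete, here is a minimal Python sketch of the behavior described above.  It is only a simplified model, not the actual warehouse code: the rule name, the sample values, and the `aggregate` helper are all made up for illustration.  It averages the same raw samples two ways, once keyed only by the rule and once keyed by the rule plus the disk instance.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw samples: (rule name, perfmon instance, % free space).
# One rule with "include all instances" collected both disks.
samples = [
    ("Collect.AllDisks.FreeSpace", "K:", 90.0),
    ("Collect.AllDisks.FreeSpace", "K:", 90.0),
    ("Collect.AllDisks.FreeSpace", "L:", 10.0),
    ("Collect.AllDisks.FreeSpace", "L:", 10.0),
]


def aggregate(samples, key):
    """Average the sample values within each bucket chosen by `key`."""
    buckets = defaultdict(list)
    for sample in samples:
        buckets[key(sample)].append(sample[2])
    return {bucket: mean(values) for bucket, values in buckets.items()}


# Simplified model of aggregating on the rule alone (assumption).
by_rule = aggregate(samples, key=lambda s: s[0])

# What a report actually needs: one series per disk instance.
by_rule_and_instance = aggregate(samples, key=lambda s: (s[0], s[1]))

print(by_rule)
print(by_rule_and_instance)
```

With one disk at 90% free and the other at 10% free, the rule-only average comes out at 50%, which describes neither disk.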

 

So what can I do?

 

Well I am so glad you asked!  Read on.

For this exercise – we will be using the SCOM 2007 Authoring Console – which you can download from the resource kit.  http://www.microsoft.com/en-us/download/details.aspx?id=18222

You can use this to create MPs for OpsMgr 2007 or 2012.

Open the authoring console > File > New > Empty Management Pack.  Give the MP an ID/Filename.

 

image

Give the MP a display name and an optional description:

 

image

 

File > Management Pack Properties > References.  Add a reference to the downloaded Microsoft.Windows.Server.ClusterSharedVolumeMonitoring.mp file.  This will allow us to use the Cluster Disks as a target class when we author a rule.

 

 

image

 

image

 

At this point, save your MP locally.  Then close and reopen the Authoring Console, and open your XML management pack.  You may be prompted to find a few more MPs.  These will generally be located in the installation directory of one of your RMS/management servers on SCOM 2007, or on your installation media for SCOM 2012 in the \ManagementPacks directory.

 

Then, create a new rule:

Rules > New > Collection > Performance Based > Windows Performance Collection

 

 image

 

Give your new rule a proper Rule ID by removing the “NewElement” text and putting in a word/words that describe a little about what the rule is for.

Then give the collection rule a display name that makes good sense, and choose the correct target class.  (If the Authoring Console crashes here, you are missing dependency MPs that need to be resolved, which is why we closed and reopened the Authoring Console above.)

 

image

 

Next, browse and choose the LogicalDisk object in perfmon and the % Free Space counter.  Then we have to do something special for the instance.

Since we are targeting the rule at EACH instance of the Cluster Disk class, we need to ensure that we ONLY collect that instance's perf data per targeted disk.

Whatever we use for the “Instance” for perf collection in SCOM – it MUST match the instance in perfmon – perfmon is shown below:

image

 

So we need to select some *discovered property* of the Cluster Disks class that equals “K:” or “L:”, etc.

Looking at discovered inventory – we see:

image

Partition name will be perfect!

So we use the “flyout” on the right of instance and choose Partition Name.  The flyout shows all properties of the targeted class:

 

image

 

 

 

image

 

Done!  Finish to create the rule!

Now let's repeat this for Free Megabytes and Avg. Disk sec/Transfer, like so:

 

image

 

image

 

Checkpoint:

 

This way, we can collect the performance data for a multi-instance object like Logical Disks or Cluster Disks.  We just need a discovered property on the class to match up with perfmon: something that resolves to the exact instance data shown in perfmon.  If a class has multiple instances and does not have a discovered property we can use for perfmon, you are pretty much screwed.  That is why you should always plan for this when writing custom classes that also have performance statistics on an individual level.  Most of the time you will only encounter this for Microsoft systems, when you need to collect additional perf counters for multi-instance items like disks, processors, SQL instances, SQL databases, etc.
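
As a rough illustration of that matching requirement, the Python sketch below checks which discovered properties could serve as the perfmon instance.  The disk objects and every property name other than Partition Name are hypothetical, and the helper is not part of any SCOM tooling.  It simply keeps the properties whose value equals a perfmon instance name on every discovered object.

```python
# Instance names as they would appear in perfmon (example values).
perfmon_instances = {"K:", "L:", "_Total"}

# Hypothetical discovered inventory for two cluster disks.
# Only "Partition Name" is taken from the article; the rest is made up.
discovered_disks = [
    {"Display Name": "Cluster Disk 1", "Partition Name": "K:", "Volume Label": "Data"},
    {"Display Name": "Cluster Disk 2", "Partition Name": "L:", "Volume Label": "Logs"},
]


def usable_instance_properties(objects, instances):
    """Return properties whose value matches a perfmon instance on every object."""
    if not objects:
        return []
    candidates = set(objects[0])
    for obj in objects:
        matching = {name for name, value in obj.items() if value in instances}
        candidates &= matching
    return sorted(candidates)


usable = usable_instance_properties(discovered_disks, perfmon_instances)
print(usable)  # ['Partition Name']
```

If the list comes back empty, no single property lines up with perfmon, which is the "pretty much screwed" case described above.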

 

For the community – I have attached my Addendum management pack below, so you can use it as a guide or optionally modify it/seal it and use it directly.

Attachment: Microsoft.Windows.Server.ClusterSharedVolumeMonitoring.Addendum.zip
Comments
  • Good One.

  • Can you try and get this into the next release of the official OS MP so we don't have to do our own custom workarounds for a known issue?

  • I have raised it as a recommendation.  :-)

    When I say "known issue" I mean it is known by me.

  • But look at all the cool stuff you just learned?  :-)

  • Kevin, brilliant stuff. However, I have a question:

    We urgently need to get some IO monitoring in place for our Hyper-v 2012 hosts. When doing a "discovered inventory" of the class "Cluster Disk" the only thing I get is the quorum disk for my Hyper-V cluster. The data disks show up as "Cluster Shared Volume". In Perfmon, I can't find any instance name to line up with any of the fields in Perfmon (the perfmon ones are typically just numbers from 1 up to the number of CSVs the OS can see). However, I do have some "Cluster CSV*" counters, that have instance names like "Volume2", "Volume3" and so on. The CSV counters don't have the exact counters I want though, so I'd really like to use the PhysicalDisk ones.

    The sealed management pack seems to include counters for free space /MB, free space / %, and total size /MB for CSV disks. How can I extend this to include other performance metrics?

  • @Trond -

    GREAT question - I don't know - but I will look into this.

  • Hi Kevin

    Again a great post from you! Thank you!

    I followed your article and started collecting a lot of performance data from the cluster disks, but I have one problem: how do I collect performance data from the mount points we are using in a cluster?

    We have a lot of big SQL clusters where each cluster resource/SQL instance is built up of an “anchor disk” with a drive letter and then a lot of mount points on this disk. In this situation I only get performance data from the “anchor disk” that has a drive letter and not from all the mount points. Do you have an idea how to solve this?

  • @RHC - Have you enabled our mount point discovery?

    We will discover and monitor mount points via that method.  However, I do know that our typical perfmon targets don't work, and this shows up in events on the SCOM agent because the workflow is trying to map the discover ID to something in perfmon, and these don't match.  Perfmon shows these as "HardDiskVolume1", etc., so I'd argue that we need to change the mount point discovery to allow logical disk counters in perfmon to map correctly to these, or find some other way to monitor free space and collect perf data for mount points.  This sounds like a good bug to raise in support and request a hotfix for the MP.  Raising a bug and escalating for a fix is the BEST way to see changes made to any MP.  Without that request being made by the customer, it won't be seen as an impacting issue.

  • Hi Kevin

    Yes, we have enabled mount point discovery.

    I will raise it with MS support. Thank you a lot for your answer.

  • Hello,

    Does a list of all counters already available exist?

    Should I add all counters in the custom MP attached?

    - Avg. Disk sec/Read

    - Avg. Disk sec/Write

    - I/O Database Reads/Sec

    - I/O Database Reads Average Latency

    - I/O Database Writes/Sec

    - I/O Database Writes Latency

    for SQL databases (SQL MP), I saw these counters in the Exchange MP...

    Thanks,

    Dom
