~ Claudio Broglia
Hello, my name is Claudio Broglia, and I’m a Dataplatform Consultant in Microsoft Services Italy. Many customers ask me for a monitor on a group of servers that provides some sort of overall health indicator for the collection as a whole. Usually that means an indicator that changes state as the percentage of unhealthy servers grows.
This request is typically made when designing some kind of monitoring dashboard, because they want a health state index for the farm.
As a first answer, I try to explain to them that what they need to worry about is not the number of unhealthy servers but the type of problem that affects them. In fact, a single alert on a single server (e.g. an expired certificate) can cause downtime or a service disruption. That said, a request for a health state indicator like this often comes not from the technical folks but from management, who want to know the overall state of their infrastructure at a glance.
Unfortunately, a monitor of this kind is not possible in Operations Manager with the standard tools. The only monitor that aggregates health states and rolls up based on a percentage of objects is the dependency rollup monitor, and it simply rolls up the worst state found among the specified percentage of members in the best health state. In other words, the members are ranked from healthiest to least healthy, and the monitor takes the state of the worst member within the specified top percentage. The description is harder to read than the behavior is to understand. It is best explained with an image, and the best one is the diagram on the Health Rollup tab when creating a new monitor.
To meet the goals of the original request above, we need something more flexible for our health indicator. To get it, we will use groups to hold our server collection, with monitors targeted at the groups that roll up the state of their members. The idea is to have several groups, each with the same members, where each group has a monitor that rolls up the health state with a different percentage. In our “service dashboard” we can then link each group to one of a set of shapes that together resemble a health indicator. Finally, we will apply a custom Data Graphic to these shapes so that the color changes based on the number of objects that go unhealthy.
To start, let’s say that we want our “health index” to have three states: all objects healthy, > 33% of objects unhealthy, and > 66% of objects unhealthy.
To do that, we first create two groups, one for the 33% indicator and one for the 66% indicator. Below is an example based on the default class Windows Server.
1. Create a new group, and host it in your custom management pack, and name it with some mnemonic. For example, for the 33% index, add “ – 33%” as the name suffix.
2. Next, add your objects, either by Explicit or Dynamic Membership and complete the wizard.
3. Now create another group for the 66% index state, naming it accordingly.
4. Go to the Authoring -> Monitors section, find each group, right-click on its Entity Health and create a new Dependency Rollup Monitor.
Follow the wizard, giving a name to the monitor and hosting it in your custom management pack. Then, in the Monitor Dependency section, choose “Entity Health” in the “Object (Contains Entities)” section. This allows you to make the monitor state depend on the Entity Health of the Group members.
In the Health Rollup Policy section, select “Worst state of the specified percentage of members in good health state” and set the percentage according to the group. For the 33% index you should specify 67%, because the monitor works on the percentage of objects in the best health state. So, for the monitor to change state when 33% or more of the objects are unhealthy, it has to consider the healthiest 67% of members: only then does the state of one of the unhealthy objects get included in the rollup. Accordingly, for the group for the 66% index you would specify 34%.
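The percentage logic above can be sketched as a small Python model. This is only an illustration of the policy as described in this post, not Operations Manager code: the function name `rollup_state`, the numeric state values, and the round-up rule for the member count are all assumptions.

```python
# Hypothetical model of the "worst state of the specified percentage of
# members in good health state" policy. Not Operations Manager code.
HEALTHY, WARNING, CRITICAL = 0, 1, 2


def rollup_state(member_states, good_percentage):
    """Return the worst state among the healthiest good_percentage % of members."""
    if not member_states:
        raise ValueError("the group has no members")
    ranked = sorted(member_states)  # healthiest members first
    # Assumption: the member count is rounded up (integer ceiling, at least 1).
    count = max(1, -(-len(ranked) * good_percentage // 100))
    return ranked[count - 1]


# Ten servers, four of them (40%) unhealthy.
servers = [HEALTHY] * 6 + [CRITICAL] * 4
print(rollup_state(servers, 67))  # monitor on the group for the 33% index
print(rollup_state(servers, 34))  # monitor on the group for the 66% index
```

With 40% of the servers unhealthy, the 67% monitor in this model picks up an unhealthy member while the 34% monitor still sees only healthy ones, which is the staggered behavior the two groups are meant to provide. The exact boundary in a real environment depends on how Operations Manager rounds the member count, so treat the edge cases as approximate.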
Now you have two groups with the same members that become unhealthy in two different steps: one when at least 33% of the objects are unhealthy, and one when at least 66% are.
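If you want more than two steps, the same arithmetic applies to each one: the percentage to enter in the rollup policy is 100 minus the unhealthy threshold. A minimal sketch for planning the groups follows. The helper names and the "Windows Servers" base name are hypothetical, and the script only prints a plan; it does not create anything in Operations Manager.

```python
def rollup_percentage(unhealthy_threshold):
    """Percentage to enter in the Health Rollup Policy for a given index."""
    if not 0 < unhealthy_threshold < 100:
        raise ValueError("threshold must be between 1 and 99")
    return 100 - unhealthy_threshold


def plan_groups(base_name, thresholds):
    """Return (group name, rollup percentage) pairs, one per index."""
    return [
        (f"{base_name} - {t}%", rollup_percentage(t))
        for t in sorted(thresholds)
    ]


# "Windows Servers" is a placeholder base name for the groups.
for name, pct in plan_groups("Windows Servers", [33, 66]):
    print(f"{name}: roll up the worst state of {pct}% of members")
```

For the two indexes used in this post, the plan gives 67% for the 33% group and 34% for the 66% group, matching the values configured above.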
Next we need to switch to Visio. You need to have the Visio 2010 Extensions for System Center 2012 installed to perform the next steps.
Create a new Visio document and choose your favorite shape. Add three of them to the workspace. You can color the first one green, as it represents the (hopefully) default state of “everything ok”.
Then, select the middle shape, go to the Operations Manager tab, and select Link Shape.
Specify All Management Packs, clear the Show only commonly-used classes check box, and find your 33% group. Link it to the shape.
We now need to apply a custom Data Graphic to this shape. We want to leave it “off” when the state of the 33% group is healthy and switch it on otherwise. To do that, go to Data Graphics and choose Create New Data Graphic.
Choose New Item.
In the Data field specify “Health State”, and in the Displayed as field choose “Color by Value”. Next, fill in the possible Health State values and assign a neutral color to all of them except the Warning and Error states. Color both of these yellow.
To make it easier to recognize, rename the Data Graphic with something meaningful.
Repeat the process for the third shape, linking it to the 66% group, but for this one choose red as the color in the Data Graphics section.
All done. Now we have a graphical indicator of the number of unhealthy objects (servers, in our example) that changes color dynamically as the number increases. You can repeat the process to have as many indicators as needed.
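To check the expected result before publishing the diagram, the Color by Value rules can be modeled in a few lines of Python. This is a sketch of the intended behavior only. The state names other than Warning and Error, the neutral color, and the function names are assumptions, and nothing here talks to Visio or Operations Manager.

```python
NEUTRAL = "white"  # assumed "off" color for a shape


def shape_color(health_state, alert_color):
    """Color by Value rule: alert color for Warning or Error, neutral otherwise."""
    return alert_color if health_state in ("Warning", "Error") else NEUTRAL


def indicator(state_33, state_66):
    """Colors of the three shapes, given the states of the 33% and 66% groups."""
    return [
        "green",                          # first shape is always green
        shape_color(state_33, "yellow"),  # middle shape, linked to the 33% group
        shape_color(state_66, "red"),     # last shape, linked to the 66% group
    ]


print(indicator("Healthy", "Healthy"))
print(indicator("Error", "Healthy"))
print(indicator("Error", "Error"))
```

The three calls correspond to the three steps of the health index: only the green shape lit, green plus yellow, and all three shapes lit.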
Hope this helps!
Claudio Broglia | Dataplatform Consultant | Microsoft
Interesting, so you would then hand off the Visio to management to get them out of your hair? I like the idea behind this. I would like to have a Visio of the datacenter floor, basically showing the racks from above. When X number of servers become unhealthy, the top of the rack changes color. Then it would be cool to click on that rack to get a view of all the servers inside, and the individual health of each, to find the one or more problem children. Great article, it gave me the basic how I needed to set this
Any chance monitoring a non-domain / workgroup server in Operations Manager is going to get any easier? I see it's not impossible (http://blogs.technet.com/b/stefan_stranger/archive/2012/04/17/monitoring-non-domain-members-with-om-2012.aspx), but it appears to be some form of punishment for wanting to do so. Some of the other products, Virtual Machine Manager (for Windows Server-based hosts in perimeter networks..., http://technet.microsoft.com/en-us/library/cc764275.aspx) and Data Protection Manager (http://technet.microsoft.com/en-us/library/hh758063.aspx), both allow you to do so with relative ease.