Written by Kip Ng, Principal Microsoft Premier Field Engineer.
One of the key factors for successful operations is monitoring. The most common questions I get when helping customers plan their monitoring needs are: "How do I know what to monitor?" and "How do I know my system is healthy?"
Yes, these are two very simple questions, but they are not always easy to answer, and if you can't answer them, you really have no idea what to monitor or how to monitor it. Some monitoring solutions, like Microsoft System Center Operations Manager (OpsMgr), come with management packs, but I am sure most of you already know that it takes more than installing the software and slapping in a management pack to make the monitoring solution suit your needs and deliver its full value.
There are obviously many ways to approach this, and I find one of the easiest ways is to create dependency tree views of your service and the components that you would like to monitor. We can call this a technical service map. Yes, it will take some time but I assure you that the time taken will be well worth it. In addition to that, it can make a useful reference document for troubleshooting in the future.
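A technical service map of this kind can also be kept as plain data alongside the diagram. The following is only a minimal sketch in Python, not a description of any particular tool: the nested-dictionary layout, the `SERVICE_MAP` and `render` names, and the "External DNS record" entry are assumptions made for illustration.

```python
# Illustrative service map: each key is a component and each value is a
# dictionary of the components it depends on.
# "External DNS record" is a hypothetical placeholder, not taken from the
# diagrams in this post.
SERVICE_MAP = {
    "Outlook Anywhere": {
        "External DNS record": {},
        "Exchange Client Access Server (CAS)": {
            "Operating System": {},
        },
    },
}


def render(tree, depth=0):
    """Return the map as a list of indented lines, one per component."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * depth + name)
        lines.extend(render(children, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(SERVICE_MAP)))
```

Keeping the map as data means the same structure can feed both documentation and a checklist of what to monitor.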
For example, I spent a few minutes creating the following tree to document the components required for Outlook Anywhere to work:
From the above, you can easily tell that in order for Outlook Anywhere (assuming this is from the Internet) to work:
A very simple tree like the one above takes no more than 10 minutes to create, and it helps you understand how Outlook Anywhere works and what kind of monitoring is required if you want to do end-to-end client monitoring. Of course, there may be more dependencies than this quickly drawn tree shows, but I am sure you get the point. For example, you can expand it further by noting that Outlook Anywhere also depends on the health of the Exchange Client Access Server (CAS). So, you should create another tree for the health of the Exchange CAS, such as the following:
From the image above, you can easily see what kind of monitoring is required to ensure that this service functions well. With this, if someone asks you the two questions I started this blog post with, you can easily tell them that for the Exchange CAS to run, all the components it depends on must work first, and hence we need to monitor each of these components. You can further break the above down into more granular units. For example, for the Operating System, we can create the following:
So, spend some time, dig into Visio (I used the Visio Brainstorming Diagram to create the service maps above) and start creating your own IT technical service maps.
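The idea that a service is healthy only when everything it depends on is healthy can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how OpsMgr or any management pack works: the `Component` class, its `healthy` flag, and the hard-coded check result are all hypothetical, and a real setup would feed in results from actual monitors.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """One node in a technical service map (hypothetical structure)."""
    name: str
    healthy: bool = True  # result of this component's own check
    depends_on: List["Component"] = field(default_factory=list)

    def is_healthy(self) -> bool:
        """Healthy only if its own check passes and every dependency is healthy."""
        return self.healthy and all(d.is_healthy() for d in self.depends_on)

    def failing(self) -> List[str]:
        """Names of components in this subtree whose own check failed."""
        found = [] if self.healthy else [self.name]
        for d in self.depends_on:
            found.extend(d.failing())
        return found


# Assumed example state: the operating system check is failing.
os_ = Component("Operating System", healthy=False)
cas = Component("Exchange Client Access Server (CAS)", depends_on=[os_])
outlook_anywhere = Component("Outlook Anywhere", depends_on=[cas])

if __name__ == "__main__":
    print(outlook_anywhere.is_healthy())
    print(outlook_anywhere.failing())
```

Walking the tree this way answers both opening questions at once: the leaves tell you what to monitor, and the roll-up tells you whether the service as a whole is healthy.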