Lync Server 2013 Monitoring Packs for System Center Operations Manager lets you see at a glance how your Lync infrastructure is doing, so you can focus on maintaining and improving the solution, rather than manually monitoring it. 

Series Note

In the Notes from the Field series, we bring you a variety of knowledgeable insider views into the special considerations inherent in planning, deploying, managing, and troubleshooting Lync Server as an integral part of a unified communications strategy. Be sure to look for other articles in this series.

Author: Ståle Hansen, Lync MVP

Published: May 17, 2013

Product version: Lync Server 2013

When deploying Lync Server 2013 as your real-time communications platform, active monitoring becomes important. Being aware of issues with your deployment before users become aware of them is key to designing a quality experience. Lync Server 2013 Monitoring Packs for System Center Operations Manager lets you see at a glance how your Lync infrastructure is doing, so you can focus on maintaining and improving the solution, rather than manually monitoring it. This is about the administrator’s Quality of Life (yes, I just invented that), and I think this integration is what sets Lync apart from other real-time communications solutions.

If you downloaded the Lync Server 2013 Management Pack, you know that the deployment documentation is excellent. Be sure to read it. My goal in this article is to give you an overview of the possibilities and what to expect when deploying the management pack.

Why Is Active Monitoring So Important?

When Lync Server is deployed as your primary phone system, your primary video multipoint control unit (MCU), and your collaboration platform, uptime is important. This is real-time communications, and users notice instantly if the solution becomes unavailable. If Lync Server becomes unavailable for any reason, it affects:

  • Meetings, which may need to be cancelled and rescheduled.
  • Onsite workers, who may become less efficient.
  • Remote workers, who may need to travel to the office to participate in meetings.

I have experienced Lync uptime being more important than email uptime, because it affects so many different work scenarios.

These are some of the reasons you need active monitoring, so that you know before users do that something is not right in the solution, which gives you time to troubleshoot and fix the issue. Because Lync is heavily dependent on your infrastructure, it may be a complex task to troubleshoot the environment. The management pack helps by:

  • Narrowing the scope from a global issue to a specific area of your infrastructure.
  • Presenting rich error messages from Quality of Experience (QoE) and Call Detail Records (CDR) reports, as well as synthetic transactions (ST) (much like the diagnostics header, which is a great starting point in a troubleshooting process).
  • Letting you spend less time finding the problem and more time solving it.

Monitoring Scenarios

Here are some key scenarios that are monitored through Operations Manager:

  • Call reliability alerts
    • From the monitoring database
  • Media quality alerts
    • From the monitoring database
  • Component health alerts
    • Through Event logs and performance counters
  • Dependency health monitoring
    • Outside factors that can make Lync fail, such as IIS, CPU, and disk metrics
  • Synthetic Transactions
    • Actual tests that run periodically and simulate activity in your deployment with actual users

Without the management packs, you must manually monitor these scenarios or write your own routines to collect the information. With the management packs, the information is preconfigured and collected in one place. Reducing the daily routine from manually going through logs and running tests to checking the dashboard in Operations Manager is a huge cost and time saver. It may even justify the cost of deploying Operations Manager for the sole purpose of monitoring Lync Server.

The Lync Server Management Pack supports only deployment with the Operations Manager agent files installed locally on the Lync servers you want to monitor. After the agent files are installed, you must configure the computer to act as a System Center proxy. That enables the computer to self-report its roles and set up role-based monitors and rules. This is a faster process than it was in Lync Server 2010, where the management pack would discover the entire topology by itself, which sometimes required significant time.

Monitors define the health states of objects. An object can have one of three health states: green (successful), yellow (warning), or red (critical). For example, a monitor for disk drive capacity might define green as less than 85 percent full, yellow as between 85 and 90 percent full, and red as over 90 percent full. A monitor can be configured to generate an alert when a state change occurs.
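
As a rough illustration of the disk-capacity example, the following Python sketch maps a percent-full reading to one of the three health states and reports a change only when the state differs from the previous one. The function names, the treatment of the exact boundary values, and the sample readings are assumptions made for illustration; none of this is taken from the management pack itself.

```python
from typing import Iterable, List, Tuple

# Thresholds from the disk-capacity example; how the exact boundary
# values (85 and 90) are treated is an assumption of this sketch.
WARNING_PERCENT = 85.0
CRITICAL_PERCENT = 90.0


def health_state(percent_full: float) -> str:
    """Map a disk usage reading to green, yellow, or red."""
    if percent_full > CRITICAL_PERCENT:
        return "red"
    if percent_full > WARNING_PERCENT:
        return "yellow"
    return "green"


def state_changes(readings: Iterable[float]) -> List[Tuple[float, str, str]]:
    """Return (reading, old_state, new_state) for each state change."""
    changes = []
    previous = "green"  # assumption: the object starts out healthy
    for reading in readings:
        current = health_state(reading)
        if current != previous:
            changes.append((reading, previous, current))
            previous = current
    return changes


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    for reading, old, new in state_changes([70.0, 86.5, 88.0, 93.2, 60.0]):
        print(f"{reading:.1f}% full: {old} -> {new}")
```

Note that the reading of 88.0 produces no entry, because the state stays yellow; only transitions would raise an alert in this model.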

What Does it Look Like?

The alarms, triggered by the metrics defined in rules or monitors, are put in three prioritization categories:

  • High priority alerts
    • Indicate conditions that cause service outages
  • Medium priority alerts
    • Indicate conditions that affect a subset of users or indicate issues in call quality
  • Other alerts
    • All other non-critical alerts that affect single users or warnings

Figure 1 and Figure 2 below show the view you get from Operations Manager. If you want to see that all is okay and get a quick overview of the state of just a subset of components, these views provide the basis for active monitoring.

Figure 1. Operations Manager Monitoring view

As you see in the screens, there is a lot you can investigate, but the important thing is that all alerts are included in the three prioritization categories, so in essence you just need to follow those.

Figure 2. Alert priorities

When you receive an alert, you can see in the alert details whether the alert was generated by a rule or a monitor. If the alert was generated by a monitor, as a best practice, you should allow the monitor to auto-resolve the alert when the health state returns to healthy. If you close the alert while the object is in a warning or unhealthy state, the problem remains unresolved but no further alerts will be generated.

Watcher Nodes

Watcher nodes are computers that periodically run Lync Server synthetic transactions against your environment. A watcher node is a server that has a basic Windows operating system deployed, with the Lync Core components and WatcherNode.msi installed. Each watcher node is configured through the New-CsWatcherNodeConfiguration cmdlet. This is a new feature in Lync Server 2013 where you can define:

  • Which test cmdlets the watcher node can run
  • Which users to use with the tests. The test users must be:
    • Real users in Active Directory
    • Enabled for Lync
    • Enabled for Enterprise Voice with real phone numbers if you are testing PSTN connectivity
  • Whether the node is active

Watcher Node Placement

Watcher nodes can be placed anywhere in your network, as shown in Figure 3 below. If you have one central site with users connected through a WAN link, it may be a good idea to set up nodes in sites where there are users. You can even have non-domain joined watcher nodes that test your solution from the Internet, perimeter network (screened subnet), and other often difficult to monitor network locations. This way, you can quickly discover problems over WAN links as well.

Figure 3. Watcher Node placement in a network

Watcher Node Administration

When using the management packs for Lync Server 2010, I felt that I did not get full control over the Synthetic Transaction node. But now with watcher nodes, you can tune the cmdlets being run and see when they are run.

Here is what I experienced:

  • The tests are run every 15 minutes.
  • If you restart the health service on the watcher node, it will start running a test pass immediately.
  • Watch for three different event IDs in Event Viewer to see where the watcher node is in the process:
    • 334 – States what individual test is being run.
    • 335 – States if the test pass is finished.
    • 331 – States that the results are sent to the management server.
  • After the reports are sent, you can go back to your management server and view the result.
  • The results contain a full error report, and not just that the transaction has failed (which was the behavior in previous versions). This helps you:
    • Troubleshoot why the transactions are not working as expected.
    • Quickly narrow the scope of why the test failed, such as a user sign-in error or a service that is not working.

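
To make the event sequence above concrete, here is a small Python sketch that takes the event IDs seen so far on a watcher node, in the order they were logged, and reports the stage of the current test pass. The stage descriptions paraphrase the list above; the function name and the sample inputs are hypothetical, and the sketch does not read the Windows event log itself.

```python
from typing import Dict, Sequence

# Event IDs and meanings as described in the list above.
WATCHER_EVENTS: Dict[int, str] = {
    334: "running an individual test",
    335: "test pass finished",
    331: "results sent to the management server",
}


def current_stage(event_ids: Sequence[int]) -> str:
    """Describe the stage implied by the most recent known event ID."""
    for event_id in reversed(event_ids):
        if event_id in WATCHER_EVENTS:
            return WATCHER_EVENTS[event_id]
    return "no test pass activity seen yet"


if __name__ == "__main__":
    # Hypothetical sequences, oldest event first.
    print(current_stage([334, 334, 334]))
    print(current_stage([334, 334, 335, 331]))
```

Unrelated event IDs are skipped, so the function can be fed a raw list of IDs without filtering it first.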
To see a full list of what test cmdlets the watcher node can run, check out this article in the Lync Technical Library: Installing and Configuring Watcher Nodes.

Deploy for Success

When deploying Lync Server 2013 with System Center Operations Manager, here are some key factors to help you succeed with the implementation:

  • Deploy only one management pack at a time.
  • Get an Operations Manager consultant and a Lync consultant to collaborate on resolving all the alarms that appear.
    • Some alarms may not apply to your deployment, while others must be resolved.
    • Make sure all alarms are green over time—indicating good operational status—before going into production.
  • Create a management pack for overrides and customization.
  • Tune your synthetic transactions with the correct users against correct pools in the correct sites.

A correctly tuned and maintained Operations Manager solution with the Lync Management Pack is a great tool for administrators to maintain the quality of a Lync Server deployment and ensure a quality experience for end users.

Live Demo

I presented on this topic at the Nordic Infrastructure Conference in January 2013, where I shared a live demo.

About the Author

 

Ståle Hansen is a Lync Technical Evangelist at Atea, Norway. He has worked with Lync since Live Communications Server 2005. He spends most of his time with Proof of Concept deployments, talking with customers, and helping them adopt and implement the latest technology. Ståle is an avid speaker at both internal and public seminars. He drives the msunified.net blog, contributes to The UC Architects Podcast, and co-authors a Lync Master Class.

Additional Resources

  • Download Center: Lync Server 2013 Management Pack

Lync Server Resources

Keywords: Lync Server 2013, Active Monitoring, System Center Operations Manager 2012, Management Pack, Watcher Node