This post is authored by Henrik Jørgensen from Microsoft Services in Denmark.
The following is based on real-life experience with the CQM framework from a Microsoft Consulting Services project.
One of our customers reported a poor experience with Lync. Specifically, the customer reported that their end users complained about problems with audio and video. The reported problems could be divided into two scenarios:
The customer is a major global player in their specific area. They host several Lync pools globally and are represented in countries around the globe.
One approach to analyzing the above problems is to use the Call Quality Methodology (CQM) framework as introduced by the Microsoft Lync product group.
The approach was to establish a baseline, both to understand the extent of the problems and to have a benchmark to compare against after changes were implemented in the Lync environment and related IT infrastructure components.
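The baseline idea can be illustrated with a small sketch. It assumes the quality of a period is summarized as the share of poor streams; the `PeriodStats` structure, the function name, and all numbers are hypothetical and are not taken from the project or from the CQM queries.

```python
from dataclasses import dataclass


@dataclass
class PeriodStats:
    """Stream counts for one measurement period (hypothetical structure)."""
    total_streams: int
    poor_streams: int

    @property
    def poor_ratio(self) -> float:
        # Guard against an empty period to avoid division by zero.
        if self.total_streams == 0:
            return 0.0
        return self.poor_streams / self.total_streams


def compare_to_baseline(baseline: PeriodStats, current: PeriodStats) -> float:
    """Return the change in poor-stream ratio in percentage points.

    A negative value means the current period is better than the baseline.
    """
    return (current.poor_ratio - baseline.poor_ratio) * 100


# Illustrative numbers only, not figures from the project.
baseline = PeriodStats(total_streams=10_000, poor_streams=800)
after_changes = PeriodStats(total_streams=12_000, poor_streams=480)
delta = compare_to_baseline(baseline, after_changes)
print(f"Change in poor-stream ratio: {delta:+.1f} percentage points")
```

Comparing ratios rather than absolute counts keeps the benchmark meaningful when call volume changes between the two periods.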
We divided the work into the following areas:
The work involved several IT teams at the customer. A key learning was that operating a complex IT infrastructure such as Lync calls for cooperation and communication between the IT teams involved in operating and maintaining it.
The analysis work revealed several findings:
The KHI, CQM and Network Analysis revealed other findings. These are presented in more detail in the following.
Server Key Health Indicators
We used the KHI collection PowerShell script from the networking guide. We collected data for 5 working days. Afterwards, the data was imported into Microsoft Excel. Among other critical findings, we observed packet loss on some of the front-end servers. This called for further analysis of the server problems. A firmware upgrade was part of the solution.
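As an alternative to reviewing the collected data by hand in Excel, the same kind of check can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the CSV layout (`server`, `counter`, `value`), the server names, and the threshold are hypothetical and do not describe the actual output of the KHI collection script.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export layout: one row per sample with server, counter, value.
SAMPLE_CSV = """server,counter,value
FE01,Packets Received Discarded,0
FE01,Packets Received Discarded,12
FE02,Packets Received Discarded,0
FE02,Packets Received Discarded,0
"""


def servers_over_threshold(csv_text, counter_name, threshold=0.0):
    """Return {server: peak value} for servers whose counter exceeded threshold."""
    peaks = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["counter"] == counter_name:
            # Track the highest sample seen per server over the collection period.
            peaks[row["server"]] = max(peaks[row["server"]], float(row["value"]))
    return {server: peak for server, peak in peaks.items() if peak > threshold}


# The threshold of 0 is an assumption; use the target defined for each indicator.
flagged = servers_over_threshold(SAMPLE_CSV, "Packets Received Discarded")
print(flagged)
```

Servers that appear in the result are candidates for the kind of follow-up analysis described above.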
CQM Analysis
The CQM SQL queries are divided into three areas
We used the queries in the networking guide. A summary of the findings is provided below:
Endpoints
The CQM queries revealed that a majority of end users at given locations did not use Lync-certified devices. The customer initiated a process to
Server traffic
The CQM queries documented packet loss between the AVMCU and the Mediation Server and from the Mediation Server to the gateway at some sites. Further analysis looked at
Network traffic
We identified several issues
All of the above findings called for additional analysis and work to solve the problems.
Network Analysis
Together with the customer, we defined three persona profiles. These were defined in the bandwidth calculator. The customer's HR department provided us with the number of users for each of the persona profiles at the specific locations where the customer is represented.
The calculations in the bandwidth calculator revealed:
Furthermore, the customer initiated a network assessment of the WAN. The assessment documented the predictions from the bandwidth calculator.
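The persona-based estimate described above can be sketched as a simple calculation. This is a heavily simplified illustration and not how the bandwidth calculator works internally: the persona names, per-user bandwidth figures, concurrency factor, site names, and WAN capacities are all placeholder assumptions.

```python
# All figures below are placeholders, not values from the bandwidth calculator.
PERSONA_KBPS = {"light": 30, "medium": 60, "heavy": 120}  # assumed peak kbps per active user


def site_peak_kbps(headcount_by_persona, concurrency=0.2):
    """Estimate peak Lync bandwidth for one site.

    headcount_by_persona maps a persona name to the number of users at the site.
    concurrency is the assumed fraction of users in a call at peak time.
    """
    total = 0.0
    for persona, users in headcount_by_persona.items():
        total += users * concurrency * PERSONA_KBPS[persona]
    return total


def sites_over_capacity(sites, wan_kbps):
    """Return the sorted names of sites whose estimate exceeds the WAN capacity."""
    return sorted(
        name for name, heads in sites.items()
        if site_peak_kbps(heads) > wan_kbps[name]
    )


# Hypothetical headcounts per persona, as HR might supply them per location.
sites = {
    "SiteA": {"light": 100, "medium": 50, "heavy": 10},
    "SiteB": {"light": 20, "medium": 5, "heavy": 0},
}
wan_kbps = {"SiteA": 1000, "SiteB": 2000}  # assumed capacity available for Lync

print(sites_over_capacity(sites, wan_kbps))
```

A network assessment of the WAN then serves as the check on whether estimates of this kind match observed traffic.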
Customer initiated actions to improve the Lync experience
The customer initiated several actions to improve the Lync experience of their end users. In summary, these are
Key learnings
With the CQM approach, we not only helped our customer troubleshoot and fix problems with their Lync infrastructure. We also established a methodology that is used proactively in their environment to prevent problems in Lync communications, internally as well as with external parties.
A key learning is that CQM is a very good framework, but its value can be very limited if the processes at a customer are not aligned with CQM and a proper Lync service mapping is not in place. Furthermore, the different IT teams at the customer need to communicate closely about the operation and maintenance of the IT infrastructure.