Incident SLA Management in Service Manager

This blog post describes how to build a custom SLA management solution in SCSM. If you are looking for more of a plug-and-play solution, see the Service Level Management solution provided by our partner Cased Dimensions, and check out the Cased Dimensions demo video.

A question that comes up often on the forums about Service Manager 2010 is how to take action on incidents that breach their Service Level Agreement (SLA). In this post, Patrik Sundqvist and I will show you one way to do it. This blog post has three goals:

  1. Explain how to configure incident SLAs in Service Manager 2010.
  2. Explain how to use the plug and play solution that we built for managing SLAs.
  3. Explain how we built the solution and in particular how to create custom Windows Workflow Foundation activities that use the Service Manager SDK.

How to Configure Incident SLAs in Service Manager

When an incident is registered within Service Manager, it gets a priority based on a priority calculation driven by the urgency and impact of the incident. The priority and target resolution time are also recalculated each time the impact and/or urgency changes. The calculation is based on a matrix which can be configured directly in the console at "Administration" – "Settings" – "Incident Settings".

image

 

In the same place where you configure the priority matrix you're able to define target resolution times per priority level.

image

 

As mentioned above, when an incident is registered in Service Manager it receives a priority based on the matrix. At the same time as it receives the priority, it also gets a "Target Resolution Time", which is based on the priority and the resolution time configuration.

image

 

Notice here how the Priority is set to 1 because the Impact and Urgency are both High. The priority is determined by the Urgency/Impact matrix shown above.

image

 

Here, notice how the Resolve By (also called 'Target Resolution Time') is set to the time the incident was created plus 30 minutes per the configuration shown above.
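The calculation described above can be sketched in a few lines of C#. This is only an illustration of the idea, not Service Manager's actual implementation: the names `Level`, `IncidentSlaSketch`, `GetPriority`, and `GetResolveBy` are hypothetical, and the matrix and resolution-time values are placeholders, except for the two facts shown in this post (High impact plus High urgency gives priority 1, and priority 1 resolves within 30 minutes).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical impact/urgency levels used only for this sketch.
public enum Level { Low = 0, Medium = 1, High = 2 }

public static class IncidentSlaSketch
{
    // Hypothetical priority matrix indexed as [impact, urgency].
    // Only the High/High -> 1 cell comes from the post; the other cells are placeholders.
    private static readonly int[,] PriorityMatrix =
    {
        // urgency: Low, Medium, High
        { 9, 8, 7 },   // impact: Low
        { 6, 5, 4 },   // impact: Medium
        { 3, 2, 1 }    // impact: High
    };

    // Hypothetical target resolution times per priority.
    // Only the 30 minutes for priority 1 comes from the post.
    private static readonly Dictionary<int, TimeSpan> TargetResolutionTimes =
        new Dictionary<int, TimeSpan>
        {
            { 1, TimeSpan.FromMinutes(30) },
            { 2, TimeSpan.FromHours(1) },
            { 3, TimeSpan.FromHours(2) }
        };

    // Looks up the priority for a given impact and urgency.
    public static int GetPriority(Level impact, Level urgency)
    {
        return PriorityMatrix[(int)impact, (int)urgency];
    }

    // Returns created time plus the configured target resolution time,
    // or null when no resolution time is configured for that priority.
    public static DateTime? GetResolveBy(DateTime createdUtc, int priority)
    {
        TimeSpan target;
        if (TargetResolutionTimes.TryGetValue(priority, out target))
        {
            return createdUtc + target;
        }
        return null;
    }
}
```

For example, calling `GetPriority(Level.High, Level.High)` and passing the result to `GetResolveBy` along with the creation time yields a Resolve By value 30 minutes after creation, matching the incident shown above.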

Out of the box you can manage incidents which are still active past their target resolution times by using the 'Overdue Incidents' view.

image

 

This is a pretty passive approach though and requires someone to be continually hitting refresh on the view instead of managing things by exception. You can also run an Incident KPI Trend report to see the number of incidents that didn't meet their SLA:

image

 

Wow! The Contoso service desk team is really doing a bad job of meeting their SLAs! :)

You can also run the Incident Resolution report:

image

 

You can slice either of these reports by queue, source, time range, etc. Our upcoming dashboard release for Service Manager will also have some interesting views on this data.

But again, these are also pretty passive approaches to managing incidents.

What we hear a lot from customers is that they want to take a more proactive approach to managing incident SLAs. After all, some people's jobs depend on having good incident SLA numbers!

Here are a couple of things that people want to do which we don't provide for out of the box but with a little customization can be configured:

  • Have a view of incidents which are within X minutes of breaching the SLA – see this blog post but instead of doing it for Last Modified do it for Target Resolution Time is Less Than [Now] + 30m (or whatever your desired warning threshold is).
  • Send a notification to the assigned analyst when the incident is X minutes away from breaching SLA.
  • Send a notification to a manager when the incident is X minutes away from breaching SLA. Send another one when it has breached SLA.
  • Escalate/route an incident automatically when the incident is X minutes away from breaching SLA or when it has breached SLA.

To detect and act upon incidents that are about to breach, or have already breached, their SLA (their Target Resolution Time), you can use the built-in workflow engine of Service Manager 2010. Here is how to use the solution we provide in this blog post.

Deploying the Solution

  1. First, download the solution here
  2. Copy the following DLLs to the C:\Program Files\Microsoft System Center\Service Manager 2010 directory:


    image

The Microsoft.ServiceManager.WorkflowAuthoring.* DLLs come from the Service Manager Authoring Tool Beta 2. Be careful about replacing the ones you already have, or replacing these in the future with new ones. Always create backups before you replace them!

3.  Import the management pack Microsoft.Demo.IncidentSLAManagement.xml into Service Manager. Note: you can optionally configure how frequently the workflow that checks service levels runs. By default it runs every 15 minutes. Decide how often you want it to run before you import, and don't run it too frequently! Search for 'Minutes' in the XML and you'll see where it is set to 15. Change it to another number, if you want, before you import.

4.  Go to the Administration/Settings view in the console. Double click on Incident SLA Management Settings and configure the warning threshold. This is the threshold at which the incident's SLA status changes to Warning. By default it is zero, meaning there is no warning interval.

image

     

Note: this solution will start running immediately after import. If you don't want it to run immediately on import you can change the Rule Enabled attribute to "false" in the XML prior to importing and then enable it in the Administration/Workflows/Configuration view.

Now, what you will see is that any incidents which are still active past their target resolution time will be marked as Incident SLA Status = "Breached" and any incidents which are within X minutes (as defined by the Warning Threshold) of Target Resolution Time will be marked as SLA Status = "Warning". You can see this on the incident form in the Extensions tab.

image

 

To make it easy to see the incidents that are in a Warning or Breached state we have provided a couple of new views in the management pack:

 

 image

Now you can use this property as part of notification subscriptions or incident event workflows to escalate or do other classification/routing things.

  1. First go to the Library/Templates view and create a new incident template that will route/classify your incidents the way you want. For example, if you want to change the support group to 'Escalation Team' when incidents change to SLA Status = Breached, then set Support Group = 'Escalation Team' in the new incident template.
  2. Navigate to the Administration/Workflows/Configuration view.
  3. Select the Incident Event Workflow Configuration row and click Properties.
  4. In the workflow dialog that comes up click Add.
  5. Click Next on the welcome page of the wizard (if it comes up)
  6. Provide a name for the workflow like 'Escalate SLA Breaching incidents to the Escalation Team Support Group'.
  7. Select 'When an incident is updated'.
  8. Select the Incident SLA Management MP.

image

     

9.  Click Next.

10.  On the criteria page, set it up so that the workflow is triggered "when the SLA Status changes to Breached", like this:

image

     

11.  Click Next.

12.  On the template screen, select the incident template you created in step #1. Click Next.

13.  Optionally choose to notify people related to the incident. Click Next. Note: We have provided a couple of "out of the box" notification templates – one for 'Incident SLA Status – Warning' and one for 'Incident SLA Status – Breached'.

14.  Click Create.

15.  Click Close.

You can also set up notifications to other people like team leads, managers, etc. by following the same subscription logic by creating new notification subscriptions in the Administration/Notifications/Subscriptions view.

Now that you know how to use the solution, let's take a look at how we built it.

How We Built the Solution

Note: This part is intended more for developers!

The solution is comprised of the following parts:

  • Incident class extension to add a new enum property for SLA Status
  • Enum values for 'Breached' and 'Warning'
  • New class for capturing the Warning Threshold administration setting
  • Custom form for displaying the Warning Threshold
  • Custom task to display the Warning Threshold settings form when the user clicks 'Properties' in the Administration/Settings view
  • 2 notification templates – one for breached and one for warning
  • 2 views – one for breached and one for warning and a new folder to put them in
  • New custom Windows Workflow Foundation activity that queries the database looking for objects which are in a warning state or breached state and marks them accordingly
  • Rule that runs on a schedule that runs the custom Windows Workflow Foundation activity

Let's take these one at a time. Most of these concepts have already been described previously so I'll just link to them here:

Extending classes is described here.

Creating enumerations is described here.

Creating a new administration setting with form and custom task is described here.

Creating notification templates is described here.

Creating views is described here.

Creating custom Windows Workflow Foundation activities hasn't been described before so we'll do that in this blog post…

First, start by creating a new Solution using the Workflow Activity Library project template:

image

 

Next change your class name to something meaningful by selecting the activity in the designer and changing the name in the Properties panel:

image

 

 

And rename the files:

image

 

Next add some references and using statements in the .cs file (not the designer.cs file):

C:\Program Files\Microsoft System Center\Service Manager 2010\SDK Binaries\Microsoft.EnterpriseManagement.Core.dll (on management server)

C:\Program Files (x86)\Microsoft System Center\Service Manager 2010 Authoring\PackagesToLoad\Microsoft.ServiceManager.WorkflowAuthoring.ActivityLibrary.dll (on computer where Authoring Tool is installed)

using Microsoft.EnterpriseManagement;
using Microsoft.EnterpriseManagement.Configuration;
using Microsoft.EnterpriseManagement.Common;
using Microsoft.EnterpriseManagement.Workflow.Common;
using System.Collections.Generic;
using System.Threading;

Now you need to make your custom Windows Workflow Foundation activity derive from a special base class we provide. This will allow your Windows Workflow Foundation activity to use the special property binding dialog in the Service Manager Authoring Tool that allows you to bind to trigger class properties.

Change your class declaration like this:

image

 

Now you can declare some input/output parameters. Here is an example:

// Backing dependency property so the value can be bound in the workflow designer.
public static DependencyProperty WarningThresholdProperty =
    DependencyProperty.Register("WarningThreshold", typeof(TimeSpan), typeof(GetSLABreachingIncidents));

[DescriptionAttribute("Number of minutes prior to breach when incidents should be marked as Warning. If not specified (00:00:00), value from database will be used.")]
[CategoryAttribute("Search Configuration")]
[BrowsableAttribute(true)]
[DesignerSerializationVisibilityAttribute(DesignerSerializationVisibility.Visible)]
public TimeSpan WarningThreshold
{
    get
    {
        return ((TimeSpan)(base.GetValue(GetSLABreachingIncidents.WarningThresholdProperty)));
    }
    set
    {
        base.SetValue(GetSLABreachingIncidents.WarningThresholdProperty, value);
    }
}

Then implement an Execute() method:

protected override ActivityExecutionStatus Execute(ActivityExecutionContext executionContext)
{
    return base.Execute(executionContext);
}

This is where the code goes that you want to execute. For example, one of the first things you'll want to do is create a connection to the management server:

EnterpriseManagementGroup emg = new EnterpriseManagementGroup("localhost");

In this particular solution we are basically making three queries each time this activity runs.

  • The first one gets incidents which are currently breaching SLA and which have not already been marked as breaching.
  • The second one gets incidents which are within the Warning Threshold of breaching SLA and have not already been marked as warning.
  • The last one gets incidents which have been marked as Warning, but because the target resolution time has since been adjusted (due to the incident urgency/impact changing) are no longer in a warning state.

Then for those incidents which match the first query it marks them as SLA Status = breached, those meeting the second query as SLA Status = Warning, and those meeting the last query as SLA Status = <blank>.
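The decision logic behind these three queries can be sketched in plain C# without the Service Manager SDK. This is a simplified in-memory illustration, not the code in the solution: `SlaStatus`, `IncidentSnapshot`, and `SlaStatusPlanner` are hypothetical names, the real activity queries the database rather than looping over objects, and treating an incident as breached exactly at its target resolution time is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SLA Status enum property (None means blank).
public enum SlaStatus { None, Warning, Breached }

// Hypothetical in-memory view of an active incident.
public sealed class IncidentSnapshot
{
    public string Id { get; set; }
    public DateTime TargetResolutionTimeUtc { get; set; }
    public SlaStatus CurrentStatus { get; set; }
}

public static class SlaStatusPlanner
{
    // Returns the incidents whose SLA status should change, keyed by incident id.
    public static Dictionary<string, SlaStatus> PlanUpdates(
        IEnumerable<IncidentSnapshot> activeIncidents,
        DateTime nowUtc,
        TimeSpan warningThreshold)
    {
        var updates = new Dictionary<string, SlaStatus>();
        DateTime warningCutoff = nowUtc + warningThreshold;

        foreach (IncidentSnapshot incident in activeIncidents)
        {
            // Assumption: an incident counts as breached at or after its target time.
            bool breached = incident.TargetResolutionTimeUtc <= nowUtc;
            bool inWarningWindow = !breached
                && incident.TargetResolutionTimeUtc <= warningCutoff;

            if (breached && incident.CurrentStatus != SlaStatus.Breached)
            {
                // First query: breaching and not yet marked as breached.
                updates[incident.Id] = SlaStatus.Breached;
            }
            else if (inWarningWindow && incident.CurrentStatus != SlaStatus.Warning)
            {
                // Second query: inside the warning window and not yet marked as warning.
                updates[incident.Id] = SlaStatus.Warning;
            }
            else if (!breached && !inWarningWindow
                && incident.CurrentStatus == SlaStatus.Warning)
            {
                // Third query: marked as warning but no longer inside the window.
                updates[incident.Id] = SlaStatus.None;
            }
        }

        return updates;
    }
}
```

With a warning threshold of zero the warning window is empty, so no incident is ever marked as Warning, which matches the default setting described earlier. Incidents that already carry the right status are left out of the result, so repeated runs do not rewrite them.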

Reminder on how to debug workflows: http://blogs.technet.com/servicemanager/archive/2010/01/19/debugging-custom-forms-console-task-handlers-and-workflows.aspx Use the Thread.Sleep trick!
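As a rough sketch of that trick, assuming it means pausing at the start of the activity so there is time to attach a debugger to the process hosting the workflow, a small helper could look like this. `DebugDelay` and `WaitForDebugger` are hypothetical names, not part of the Service Manager SDK.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class DebugDelay
{
    // Sleeps in short intervals until a debugger attaches or the timeout elapses.
    // Returns true if a debugger is attached when the wait ends.
    public static bool WaitForDebugger(TimeSpan timeout)
    {
        Stopwatch elapsed = Stopwatch.StartNew();

        while (!Debugger.IsAttached && elapsed.Elapsed < timeout)
        {
            Thread.Sleep(250);
        }

        return Debugger.IsAttached;
    }
}
```

You might call `DebugDelay.WaitForDebugger(TimeSpan.FromSeconds(30))` as the first line of Execute() while developing, and remove the call (or wrap it in `#if DEBUG`) before deploying, since it delays every run of the workflow.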

Now you can build your workflow activity.

Using Custom Workflow Activities in the Authoring Console

To use this new custom workflow activity:

  1. Copy the .dll from the bin\debug or bin\release folder of your project to C:\Program Files (x86)\Microsoft System Center\Service Manager 2010 Authoring\Workflow Activity Library
  2. Start the Authoring Tool
  3. Create a new Management Pack (or open an existing one)
  4. Create a new workflow by right clicking on the Workflows node, choosing New, and going through the wizard
  5. When the activity toolbox comes up, create a new "group" in the tree to organize your custom activities by right clicking the top level 'Activity Groups' node in the tree and choosing 'Create Group'.
  6. Right click on that new group and click on 'Choose Activities...'
  7. In the dialog that comes up, click 'Add Custom Activities...'.

     

     image

  8. Then select your assembly .dll and click Open

     image

  9. Then select your activity in the list and click OK

     

     image

  10. Now your activity will show up in the Activity Toolbox and you can drag it into the workflow designer.

     

     image

Conclusion

This solution is available for testing now in an alpha version. A new CodePlex project has been started for any developers that would like to contribute.

You can get the installable download, source code, or start contributing by going to the project site on CodePlex here:

http://scsmincidentsla.codeplex.com/

 Lastly, I want to give a HUGE "Thank you!!" to Patrik for his contribution to this project!

Comments
  • Thanks for letting me know.

  • Hi

    I have copied the DLLs to the right place and imported the MP with no problem, but it doesn't work.

    I can see the SLA Status field in the Extensions tab (it's blank) in the incident form.

    The folder and SLA views are created but empty, even though I have incidents that have gone over the threshold for both warning and breached.

    If I look in Workflows/Status and open Incident SLA Management, the list of workflow instances that need attention is empty.

    What can be the problem? Thanks in advance.

    Regards

    Petter

  • @Petter

    Do you have the RTM version of Service Manager installed?

    Can you please check the event log to see if there is anything there?

  • Hi Travis

    Yes, it is the RTM version of SCSM that is installed.

    I have checked the event log but I cannot see any problem there.

    Can this have anything to do with the date and time format? I use the following format: yyyy-MM-dd and HH:mm.

  • @Petter - yes, this is likely the same date/time formatting issue that others have described in the comments here.  I'll try to get this fixed in the next week or so and post a new release of the Incident SLA solution on CodePlex.

  • Hi Travis,

    I would love to use your extension too, but I have the same problem with the date/time bug, so the workflow won't run. Are you still working on this extension? Any chance of fixing this issue?

    Thanks

  • @Steffen -

    I looked into that bug about a month ago but couldn't figure out why it was happening.  My hunch is that it happens when your SQL Server is installed with a different locale than your management server, or something like that.  Could that be the case in your environment?

  • @ Travis

    Hmm my first reply to your response is missing - strange. Well thanks for your answer, I will have to check that and will report back :)

  • @ Travis

    The locale seems to be the same on both servers - English. But I noticed that SP1 for the SQL Server is not installed. Is that one needed?

  • Cased Dimensions have also written an SLA Management Pack for Service Manager. The product works inside Service Manager as a Management Pack. A self-install wizard unpacks and installs the functionality. Functions include additional features such as calendaring, Change and Problem SLAs, alerting should SLAs be exceeded, and much more. For information, please visit www.caseddimensions.com/service_manager_SLA_management

    This Management Pack is easy to deploy and wizard driven for configuration.

  • Hi Travis,

    I know this is an old post but if you're still watching it, I've followed your instructions to try and make a new activity (to perform a different function) to use in the Authoring Tool. However, after I add my custom activity assembly, my activity does not show up in the activities list.

    Any quick ideas of why?

    Thanks,

    Rob

  • Did you derive your workflow activity from our base activity class (WorkflowActivityBase)?

  • @Travis

    Yes, Base Class is set to "Microsoft.EnterpriseManagement.Workflow.Common.WorkflowActivityBase"

    and the declaration is:

    public partial class IncidentProcessingActivity : WorkflowActivityBase

  • Hi Travis,

    We have followed your instructions and were able to import the management pack.

    When we open the breached incidents view, it shows all of our incidents as breached. We have checked the Resolve By date on the majority of these incidents and they are not breached.

    We would highly appreciate your assistance here please.

  • I am using SCSM 2010 SP1. I copied the DLL files and imported the MP without error. When I go to Settings/Incident SLA Management Settings, I get the following error in the pop-up window:

    This task will run the following unverifiable code:

    Assembly:

    Microsoft.Demo.IncidentSLAManagement.SettingsForm, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null

    Handler: SettingsConsoleCommand

    Would you like to continue?
