Managing Incident Resolution SLAs in Service Manager

This blog post covers how you can set up incident resolution service level agreements (SLAs) in Service Manager 2010.  There are some things that we still need to add support for in this area, but this post explains what you can do with SCSM today.  I’ll also point out the gaps we still need to fill, and later we’ll announce when and how we are going to fill them.

To begin with, let’s talk about what you can do in SCSM 2010.  Service Manager out of the box has support for SLAs based on the target resolution time.  Another common SLA metric is target response (“acknowledgement”) time.  We don’t have support for that out of the box right now.

Target resolution time is determined by the incident priority.  If you go to Administration\Settings\Incident Settings you will find this dialog:

image

For each priority level (1-9) you can define a different Target Resolution Time.  The time by which an incident must be resolved is its Time Created plus the Target Resolution Time configured for the incident’s current priority.  For example, if I created an incident with priority = 3 at 8:00 in the morning, and priority 3 is configured for 4 hours, I would have until 12:00 noon to resolve that incident.  If the incident status changes to Resolved prior to 12:00 then I have met the SLA.
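The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not SCSM code: the names `TARGET_RESOLUTION_TIME`, `resolve_by` and `met_sla` are made up for this example, the 4-hour value for priority 3 is taken from the example above, and treating a resolution exactly at the deadline as a miss is an assumption.

```python
from datetime import datetime, timedelta

# Illustrative setting only: priority 3 -> 4 hours, as in the example above.
# Real values come from Administration\Settings\Incident Settings.
TARGET_RESOLUTION_TIME = {3: timedelta(hours=4)}


def resolve_by(time_created, priority):
    """Time Created + Target Resolution Time for the current priority."""
    return time_created + TARGET_RESOLUTION_TIME[priority]


def met_sla(time_created, priority, time_resolved):
    """True if the incident was resolved prior to its resolve-by time.

    Assumption: resolving exactly at the deadline does not count as met.
    """
    return time_resolved < resolve_by(time_created, priority)


created = datetime(2010, 9, 21, 8, 0)
deadline = resolve_by(created, 3)
print(deadline)
print(met_sla(created, 3, datetime(2010, 9, 21, 11, 30)))
```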

The Target Resolution Time is always displayed at the top of the incident form as the “Resolve By:” field.

image

image

This incident is Priority = 4 and per the matrix above has a target resolution time of Time Created (9/21/2010 12:59 PM) + 4 hours = 9/21/2010 4:59 PM.

The priority value is not directly settable in the UI because it is a function of the Impact and Urgency values.  In the example above, the combination of Impact = Low and Urgency = Medium is configured to have a priority of 4.

image

You can also add additional Impact and Urgency items in the Library\Lists view and then you can work with a larger matrix:

image

Note: changes to either these settings or the target resolution time settings above will not take effect until you close and restart the console, and they are not applied retroactively to existing incidents.  If an incident’s urgency or impact value later changes, the workflow re-evaluates its target resolution time using the updated settings.
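To make the "not retroactive" behavior concrete, here is a small Python model of it. It is a sketch of the behavior described in the note, not of how SCSM is implemented: the `Incident` class and the `evaluate` function are hypothetical stand-ins for the incident and for the workflow run that an Urgency or Impact change triggers, and the 4-hour and 8-hour values are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class Incident:
    created: datetime
    priority: int
    resolve_by: Optional[datetime] = None


def evaluate(incident: Incident, settings: Dict[int, timedelta]) -> None:
    """Hypothetical stand-in for the workflow run on an Urgency/Impact change."""
    incident.resolve_by = incident.created + settings[incident.priority]


# Illustrative setting: priority 4 -> 4 hours.
settings = {4: timedelta(hours=4)}
incident = Incident(created=datetime(2010, 9, 21, 12, 59), priority=4)
evaluate(incident, settings)
before = incident.resolve_by

# An administrator changes the setting; the existing incident is untouched.
settings[4] = timedelta(hours=8)
unchanged = incident.resolve_by

# Later someone edits Urgency or Impact, so the evaluation runs again
# and picks up the updated setting.
evaluate(incident, settings)
after = incident.resolve_by

print(before, unchanged, after)
```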

Assuming we go with the default configuration of Urgency: High, Medium, Low and Impact: High, Medium, Low at this point we have established the following pattern:

Urgency   Impact     Priority   Target Resolution Time
High    + High     = 1      --> 30 minutes
Medium  + High     = 2      --> 2 hours
High    + Medium   = 3      --> 4 hours
Low     + High     = 4      --> 1 day
Medium  + Medium   = 5      --> 2 days
High    + Low      = 6      --> 7 days
Low     + Medium   = 7      --> 2 weeks
Medium  + Low      = 8      --> 4 weeks
Low     + Low      = 9      --> 52 weeks
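If it helps to see the pattern as data, the table above can be written as two lookups: one from the (Urgency, Impact) pair to Priority, and one from Priority to Target Resolution Time. This is a sketch in Python using the values from the table; the dictionary and function names are invented for the example and do not correspond to anything in SCSM.

```python
from datetime import timedelta

# (Urgency, Impact) -> Priority, copied from the table above.
PRIORITY_BY_URGENCY_IMPACT = {
    ("High", "High"): 1,
    ("Medium", "High"): 2,
    ("High", "Medium"): 3,
    ("Low", "High"): 4,
    ("Medium", "Medium"): 5,
    ("High", "Low"): 6,
    ("Low", "Medium"): 7,
    ("Medium", "Low"): 8,
    ("Low", "Low"): 9,
}

# Priority -> Target Resolution Time, copied from the table above.
TARGET_BY_PRIORITY = {
    1: timedelta(minutes=30),
    2: timedelta(hours=2),
    3: timedelta(hours=4),
    4: timedelta(days=1),
    5: timedelta(days=2),
    6: timedelta(days=7),
    7: timedelta(weeks=2),
    8: timedelta(weeks=4),
    9: timedelta(weeks=52),
}


def target_for(urgency, impact):
    """Return (priority, target resolution time) for an Urgency/Impact pair."""
    priority = PRIORITY_BY_URGENCY_IMPACT[(urgency, impact)]
    return priority, TARGET_BY_PRIORITY[priority]


print(target_for("High", "Medium"))
```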

That alone might be good enough for some customers, but a lot of people want to map different SLAs to different customers, different classifications of incidents, different services, different affected configuration items, etc.  First, let’s work this out on “paper” for a situation where we want to have different SLAs depending on how important a user is in an organization (from an IT guy’s perspective :) ).

Scenario Urgency Impact Priority Target Resolution Time
Affected User’s title is ‘CEO’ High + High = 1 --> 30 minutes
Affected User’s title contains ‘IT’ Medium + High = 2 --> 2 hours
Affected User’s title contains ‘Manager’ High + Medium = 3 --> 4 hours
Affected User’s title contains ‘HR’ Low + High = 4 --> 1 day
Affected User’s title contains ‘Engineer’ Medium + Medium = 5 --> 2 days
Affected User’s title contains ‘Senior’ High + Low = 6 --> 7 days
Affected User’s title is ‘Janitor’ Low + Medium = 7 --> 2 weeks
Affected User’s title contains ‘Marketing’ Medium + Low = 8 --> 4 weeks
Affected User’s title is ‘the guy with the stapler in the basement’ :) Low + Low = 9 --> 52 weeks

You can do the same kind of thing for other types of schemes including mixing and matching criteria using OR/AND statements:

Scenario Urgency Impact Priority Target Resolution Time
Incident Classification = ‘Network Outage’ High + High = 1 --> 30 minutes
Incident Classification = ‘HR App Down’ AND Affected User Title contains ‘Manager’ Medium + High = 2 --> 2 hours
Incident Classification = ‘Finance App Down’ High + Medium = 3 --> 4 hours
Incident Classification = ‘Printer Down’ OR ‘Printer Out of Paper’ OR ‘Network Slow’ Low + High = 4 --> 1 day
Incident Classification = ‘Disk Space Low’ Medium + Medium = 5 --> 2 days
Incident Classification = ‘Disk Space Low’ AND Support Group = ‘Test Environment Support Team’ High + Low = 6 --> 7 days
Incident Classification = ‘Other’ Low + Medium = 7 --> 2 weeks
Incident Classification = ‘Maintenance’ Medium + Low = 8 --> 4 weeks
Incident Classification = ‘Games’ Low + Low = 9 --> 52 weeks

It’s really up to you how you want to classify these things, but in the end (at least for SCSM 2010) you have to map these all down to a certain pair of Urgency and Impact values which in turn drives Priority which in turn drives the Target Resolution Time.
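One way to reason about a "paper" map like the ones above is as an ordered list of rules, where each rule pairs a condition with the Urgency and Impact to set. The Python sketch below models a few rows of the classification table. Everything in it is hypothetical: the incident is a plain dictionary, the field names are invented, and "first matching rule wins" is a design choice of this example rather than something SCSM does. With that choice, the more specific ‘Disk Space Low’ rule has to be listed before the general one.

```python
# Each rule: (condition on the incident, Urgency to set, Impact to set).
# Field names such as "classification" are invented for this sketch.
RULES = [
    (lambda i: i.get("classification") == "Network Outage", "High", "High"),
    (lambda i: i.get("classification") == "HR App Down"
        and "Manager" in i.get("affected_user_title", ""), "Medium", "High"),
    (lambda i: i.get("classification")
        in {"Printer Down", "Printer Out of Paper", "Network Slow"}, "Low", "High"),
    # Specific rule first, so the general 'Disk Space Low' rule does not shadow it.
    (lambda i: i.get("classification") == "Disk Space Low"
        and i.get("support_group") == "Test Environment Support Team", "High", "Low"),
    (lambda i: i.get("classification") == "Disk Space Low", "Medium", "Medium"),
]


def classify(incident):
    """Return (urgency, impact) from the first matching rule, or None."""
    for condition, urgency, impact in RULES:
        if condition(incident):
            return urgency, impact
    return None


print(classify({"classification": "Printer Down"}))
print(classify({"classification": "Disk Space Low",
                "support_group": "Test Environment Support Team"}))
```

The resulting Urgency and Impact pair is what then drives Priority and the Target Resolution Time.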

Now the question is “How do you implement this map that you have created on paper?”  There are a few different ways:

  1. When an incident is created by one of the connectors, apply a template which is appropriate for the type of issue.  The template should be configured to set the Impact and Urgency appropriately along with any other additional properties and relationships that are appropriate to route and classify the incident.  Examples where you can do this are:
    • SCOM Alert –> Incident scenario
    • SCCM Desired Configuration Management –> Incident scenario
    • Upcoming Exchange connector
  2. Use the Incident Event Workflow to apply a template to update the Urgency and Impact appropriately as incidents are created or updated.  For example, if an incident changes from Classification = ‘Network Slow’ to ‘Network Down’ you would want to apply a template which sets the Urgency = High and the Impact = High.  Another example could be that whenever a new incident is created, regardless of source and regardless of what the initial Urgency and Impact values are, if the Affected User is the CEO then apply a template which changes the Impact and Urgency to High.
  3. When analysts create a new incident in the console, have them apply a template to populate the incident which sets multiple properties and relationships, such as the affected service, classification and Impact/Urgency values, appropriately at the same time.

A couple of important notes:

  • There is a workflow in SCSM out of the box that changes the Priority and Target Resolution Time property value each time there is a change in either Urgency or Impact.  So – even if the incident is updated via a template being applied in a workflow the Priority and Target Resolution Time will be updated within a few seconds to match the new Urgency and Impact values.
  • The data is sent to the data warehouse on a schedule.  Changes to Urgency, Impact, Priority, and Target Resolution time will not be reflected in reports until the data has had a chance to go through the Extract, Transform, and Load process which takes about an hour or so on average.

Now, let’s point out some of the limitations currently in SCSM 2010:

  • Target resolution times assume a service desk that is operating 24x7.  There is no way to override this out of the box, but Patrik and I may provide a customized solution to this at some point.
  • There is no way to add additional priorities beyond 1-9.
  • There is currently no way to subscribe for incidents in the Incident Event Workflow with criteria that traverses relationships where the max cardinality is > 1.  This includes relationship types such as affected Configuration Items and affected Services.
  • There is no out-of-the-box way to send a notification or apply a template when an incident has exceeded its target resolution time or is about to exceed it.  You can use the Incident SLA Management CodePlex solution to do this though.
  • There isn’t a Response Time SLA capability out of the box although it is possible to add support for that as described in this blog post.
  • There is no way to capture the SLA document itself in SCSM.  It would be really easy to create a new class called ‘SLA Agreement’ though.  You could define some simple properties and relationships on this class and attach documents to it through the Related Items tab.

We are working on addressing these issues as soon as possible, but in the meantime you can start to map your SLAs to SCSM using the approach described above for target resolution times.

Another thing you can look into is a solution provided by our partner Cased Dimensions that provides Service Level Management.  Check out the Cased Dimensions demo video.

Hope that helps clear things up!

Please leave any helpful comments or suggestions in the comments below so we can factor those in for future improvements.

  • "There is a workflow in SCSM out of the box that changes the Priority and Target Resolution Time property value each time there is a change in either Urgency or Impact.  So – even if the incident is updated via a template being applied in a workflow the Priority and Target Resolution Time will be updated within a few seconds to match the new Urgency and Impact values."

    I am creating incidents via the SDK and I see the Priority get set by a system workflow, but I don't ever see the Target Resolution set until someone opens an incident and saves it.  

  • @Brian  

    Are you certain of this?  I just did some testing and it works fine for me.  I even checked in the MP to make sure that there is a workflow in there that is triggered on create.   It has the same configuration as the update workflow.

  • what is the Target Resolution Time variable? I cannot find it in the incident class... and i want to insert the resolution time in a notification template. It is possible to retrieve this variable in the incident class?

    tnx

  • ok :) i've found it

    $Context/Property[Type='CoreIncident!System.WorkItem.Incident']/TargetResolutionTime$ aka "Resolved by"

  • Hi Travis

    In this post you show that you have changed the Impact list to show 5 categories.  Adding is easy, but how have you edited the names of the default categories (Low, Medium and High)?  I can't do this in my console; the defaults are greyed out.

  • Hello Travis,

    Is it possible to allow users to choose the Impact when creating a ticket via the web portal? I realize that they can determine the urgency of a given ticket, but having them provide the Impact would be extremely helpful in our situation.

  • @Aly

    Sorry - It's not possible to allow users to choose the impact level at this time.  

  • Travis, any updates to this?

    Mostly hoping for the customized solution for operating hours. We only operate 8 hours a day...

    Also, is there any way to implement a "on hold" functionality. For example, using the Exchange connector and the request information from affected user scenario, I would like the target resolution clock to stop ticking until the affected user replies to the email.

    Another scenario might be such basic things as laptop users leaving mid day taking their laptops with them, I'd like the Analyst to be able to manually put the incident on hold in those kind of scenarios.

    Keep up the good work!

  • Trana010 - Some of those features, especially business hours, will probably be coming in SCSM 2012, due out around the end of this year.  Given that we are so close to that release, that there are a number of other areas not covered by SCSM 2012 where I could invest my development time, and that there is already a partner solution from Cased Dimensions for this, I don't think I'll be building a workaround for business hours.

  • Thanks Travis, makes perfect sense, can't wait for 2012! :)

  • Hi Travis,

    My customer wants to change the incident status when they check the "set first response" check box. Is it possible? And what is the "set first response" attribute in the class?

  • @Firat Yasar -

    What you need to do is subscribe to the first response date changing from Null to Is Not Null and then apply a template which changes the status value.  Unfortunately, the console doesn't allow you to do that.  You could do it in MP XML.

    The checkbox that you see on the 'Set First Response or Comment' dialog is not bound to a property itself.  If the checkbox is checked and the incident is saved then the first response date property is set to Now.

  • Hi Travis,

    Just want to check I'm not missing something here! :)

    I am using SCSM 2010 and we require some way of measuring first response times.  I can see how comparing the created date to the assigned-to date can work, but I was wondering about this "set first response" checkbox: is that only in SCSM 2012?

    Also when I click link to blog post near bottom of article:-

    "•There isn’t a Response Time SLA capability out of the box although it is possible to add support for that as described in this blog post." I get redirected to the same page, is this correct?

    Thanks

    Phil

  • @Phil -

    Yes, first response date (including the checkbox to set it) is a feature of SCSM 2012+.