We get asked this one a lot: What happens when two people edit the same object (like an incident record for example) at the same time?
In a distributed application where many people each have their own interface, it’s inevitable that two people will try to update the same thing at roughly the same time. So – how do we deal with this in SCSM?
Let’s take the most common scenario and the one that people seem the most concerned with – incident management…
Here’s a scenario for us to work with in this timeline:
What happens to Bob here? The answer is that it depends on what update Bob is trying to make. We try to be “optimistic” (geeky developer term) in allowing updates to happen. For example, let’s say that Tom changes the incident urgency from High to Medium. Bob is trying to change it from High to Low. We don’t want to let Bob change it from High to Low without knowing that Tom has changed it to Medium. So in that scenario we reject Bob’s attempt to update an incident and show him an error message that looks like this:
He has to close the incident form and reopen it to get the latest data. That way he is acting in an informed way with the most up-to-date information.
So – that scenario was obvious since they were both trying to change the same property value. Ideally, we would “optimistically” allow different property values to be submitted. For example, let’s say that the only thing that Bob changed was the title of the incident and the only thing that Tom changed was the urgency. Since those don’t really conflict with each other we could allow them both to update the incident. Well, that’s an ideal world! :) The reality is that if anything has changed on the object itself the later update is rejected. So – in this case Bob would be rejected again. Poor Bob! :)
Here’s another case – Bob changes the Urgency from High to Medium and the Title from ‘Foo’ to ‘Bar’. Tom changes the Urgency from High to Medium. Even though Bob and Tom agreed on the Urgency, because Bob updates last and some property on the incident object itself has been updated, Bob’s update will be rejected. Again. Ouch!
There is one case where we do allow “optimistic” updates – relationships! Keep in mind that an “incident” really consists of an incident object and its relationships to other objects. The incident object has properties like Title, Urgency, Impact, Priority, Created Time, etc. It also has relationships to other objects like the Affected User relationship which relates an incident to a user. The affected user is not a property of the incident object even though it kind of seems that way on the form because it looks like any other control on the form.
Let’s say Bob adds a computer named ‘bobscomputer’ to the affected configuration item list of the incident. Let’s say Tom adds a computer named ‘tomscomputer’ to the affected CI list. In this case, Bob’s update will be allowed! This is true of relationships with cardinality greater than 1 (affected CIs for example) and of relationships with cardinality equal to 1 (affected user, assigned to user, etc.).
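To make the property/relationship split a bit more concrete, here is a minimal Python sketch. It is only an illustration, not SCSM code: `IncidentRelationships`, `add_affected_ci` and `set_affected_user` are made-up names, and the behavior shown for the cardinality 1 case (the later write simply replaces the earlier one) is an assumption, since all we know from the description above is that the later update is allowed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IncidentRelationships:
    # Hypothetical model: relationships are kept beside the incident object,
    # not as properties on it.
    affected_cis: set = field(default_factory=set)  # cardinality greater than 1
    affected_user: Optional[str] = None             # cardinality of 1


def add_affected_ci(rels: IncidentRelationships, ci_name: str) -> None:
    # Adding a relationship does not overwrite what another user added.
    rels.affected_cis.add(ci_name)


def set_affected_user(rels: IncidentRelationships, user: str) -> None:
    # Assumption: for a cardinality 1 relationship the later write replaces
    # the earlier one.
    rels.affected_user = user


rels = IncidentRelationships()
add_affected_ci(rels, "tomscomputer")   # Tom saves first
add_affected_ci(rels, "bobscomputer")   # Bob saves second and is not rejected
```

After both saves the affected CI list contains both computers, which is the outcome Bob and Tom would both expect.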
So – how does this all work? Let’s say that the last modified time of the incident in this scenario prior to Bob or Tom doing anything is 8:00. When Bob opens the incident record he gets a timestamp (8:01). Tom also gets a timestamp (8:03) when he opens the form. When Tom updates the incident at 8:08 his timestamp (8:03) is greater than the last modified time (8:00) and therefore his update is allowed. The last modified time on the incident is updated to 8:08. When Bob tries to submit his incident his timestamp (8:01) is less than the current last modified time (8:08) and therefore his update is rejected.
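The timestamp comparison described above can be sketched in a few lines of Python. This is a rough model rather than the actual SCSM implementation: `IncidentRecord`, `open_record`, `submit`, `UpdateRejected` and the `at` helper are hypothetical names, the date is arbitrary, and the 8:10 submit time for Bob is invented for the example. Allowing the update when the two timestamps are exactly equal is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


class UpdateRejected(Exception):
    """The object was modified after the caller opened it."""


@dataclass
class IncidentRecord:
    # Hypothetical stand-in for an incident object; not an SCSM type.
    title: str
    urgency: str
    last_modified: datetime


def open_record(now: datetime) -> datetime:
    """Opening the form hands the caller a timestamp of the current time."""
    return now


def submit(record: IncidentRecord, changes: dict,
           opened_at: datetime, now: datetime) -> None:
    """Apply changes only if nothing was written since the record was opened."""
    # Assumption: equal timestamps are allowed; only an older one is rejected.
    if opened_at < record.last_modified:
        raise UpdateRejected("record changed since it was opened; reopen the form")
    for name, value in changes.items():
        setattr(record, name, value)
    record.last_modified = now


def at(hour: int, minute: int) -> datetime:
    # Arbitrary date; only the time of day matters for the example.
    return datetime(2024, 1, 1, hour, minute)


incident = IncidentRecord(title="Foo", urgency="High", last_modified=at(8, 0))
bob_opened = open_record(at(8, 1))
tom_opened = open_record(at(8, 3))

# Tom submits at 8:08: 8:03 is later than 8:00, so the update goes through.
submit(incident, {"urgency": "Medium"}, tom_opened, at(8, 8))

# Bob submits afterwards: 8:01 is earlier than 8:08, so he is rejected.
try:
    submit(incident, {"title": "Bar"}, bob_opened, at(8, 10))
    bob_result = "accepted"
except UpdateRejected:
    bob_result = "rejected"
```

Note that the check never looks at which properties changed, which is why Bob is rejected even when his edit does not touch anything Tom changed.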
This is also true for system updates from a connector or a workflow. In the scenario below Bob’s update will be rejected.
Hope that helps explain how we handle two users editing the same object at the same time!
This all seems rather crazy and unhelpful to me since there are so many circumstances where one user is likely to receive an error and maybe lose a large edit. Not only can this be frustrating but it may be a negative experience for a customer. Rather than code what must be a complex optimistic mechanism why not implement the tried and trusted Word method where the first one wins and second can open in read only mode? This seems a far more sensible approach to me.
@GaryJ - There are a few problems with the "Word" approach:
1) It assumes that everyone that opens an object to view it also wants to edit it. In many cases a user may just want to look at the current state of the object and is fine with other users editing the object in the background. You would actually have many more "collisions" this way and it likely would result in a different (and possibly more severe) type of user frustration because Bob might open an incident and go out to lunch. Meanwhile Tom can't update the incident! SLAs could be missed because the object is locked like this. You could develop a two-state system where the first user can choose to open the incident in view mode or edit mode (like Office 2010) but this puts additional cognitive load on the user, takes longer to develop/test, and it can still often result in objects being completely locked.
2) It would lock out automatic updates from connectors and workflows which could break the automated portions of the process or prevent the most up to date information from being available to the user to make decisions from.
3) It's actually more difficult to implement the "Word" approach, especially if you wanted to do the edit/view approach, notify users when the object is available for editing, etc. FWIW - the current implementation is actually very simple in terms of coding which means that it is very efficient and not as prone to bugs and such. Basically all that is happening behind the scenes is that the database returns a datetime property of the current time with the object when the object is requested by the console. When the object comes back to the database on an update the database does a check to see if the datetime property is older or newer than the last modified time.
I do agree with you that there are optimizations that we could make to the current design though and hopefully we'll get to those soon enough. Here are some examples:
1) Allow non-conflicting property level updates (as described in the blog post).
2) Notify a user on open if another user has recently opened an object.
3) Notify a user while editing that another user has updated. This could prevent the user entering further information that will be lost.
Each of these improvements has tradeoffs though. We could spend time developing/testing those instead of developing other features. There could also be performance implications to them and the user experience could become more complicated.
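As a rough illustration of the first idea in that list (non-conflicting property level updates), here is one way it could look in Python. This is not something the product does today, and `merge_property_changes` is a hypothetical function. It assumes we keep the property values the user originally opened (`base`) so they can be compared with what is stored now (`current`) and what the user is trying to save (`submitted`).

```python
def merge_property_changes(base: dict, current: dict, submitted: dict) -> dict:
    """Merge a late update into the stored values unless both sides changed
    the same property to different values."""
    # Properties this user changed relative to what they opened.
    mine = {k: v for k, v in submitted.items() if base.get(k) != v}
    # Properties somebody else changed in the meantime.
    theirs = {k: v for k, v in current.items() if base.get(k) != v}
    conflicts = sorted(k for k in mine if k in theirs and mine[k] != theirs[k])
    if conflicts:
        raise ValueError("conflicting properties: " + ", ".join(conflicts))
    # Keep the other user's changes and layer this user's changes on top.
    return {**current, **mine}


base = {"title": "Foo", "urgency": "High"}
after_tom = {"title": "Foo", "urgency": "Medium"}  # Tom changed only the urgency

# Bob changed only the title, so both edits survive.
merged = merge_property_changes(base, after_tom, {"title": "Bar", "urgency": "High"})
```

With this approach Bob would only be rejected when he and Tom changed the same property to different values, for example if Bob submitted an urgency of Low after Tom had already saved Medium.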
It will be interesting to see how many conflicts there are in practice with the current design. So far I haven't heard any complaints about the current design, but if there are feel free to let us know!
These kinds of errors do appear in practice. They occur very rarely in Problem and Incident, but quite frequently in Change. The reason for that is the 'Change Request Status Changed' workflow that runs in the background for the CRs. One example of this would be when our analysts enter some of the information in the CR form, then hit apply. Then, without refreshing or re-opening the form, they enter more information and try to save it again. Which won't work of course, since the workflow has changed the status from 'New' to 'In Progress'.
There is an easy answer to the example above: "Don't hit apply"! Unfortunately, it ain't easy to get our 50 analysts to understand that...
However, I do like all of your suggestions to fix or warn about this, Travis!
@AndersAsp - That's a great scenario. We'll do some thinking about how we could improve that. Thanks!
Good engineering, not magic.
Travis, in follow-up to AndersAsp's example, we are currently running into that scenario very frequently. I wanted to see if you had any information on anything that might be in the pipeline to address this major problem? Perhaps in R2? Even if the apply button locked the CR with a green in-progress bar while the workflows ran, that would be an improvement as it would prevent users from making changes until the workflows were done. Or, as another option, if the CR Status field didn't lock the entire CR. Thx Travis!
Apply should just invoke an automatic refresh with a progress bar? As it currently works, CRs are a pain. No other "service Manager" software I have used (and I have used a few) has wasted so much entered data
We run into this some 10-20 times per day with our 70 console users, who are more than mildly frustrated. Mostly it seems workflow related, but there really must be a fix under way for this, isn't there?
Something that allows the work items to be updated on a property level, or flagged for changed properties when trying to apply (in a SharePoint-like manner).