My colleague Cedric Naudy recently analyzed an interesting problem: variation hierarchy creation failed for all site collections in a specific content database, while hierarchy creation in other web applications worked fine.
The hierarchy creation jobs were marked as succeeded, but no new hierarchies were created.
While analyzing the issue, Cedric noticed that the ScheduledWorkItems table contained work items for the CreateVariationHierarchiesJobDefinition timer job that belonged to a site collection which did not exist in that content database.
Further analysis revealed that the site collection had been deleted on the same day it was created. Before the deletion, the site collection administrator had started to create the variation hierarchy and only later decided to discontinue the site collection.
As the CreateVariationHierarchiesJobDefinition timer job runs only once a day by default, its scheduled work item was still in the ScheduledWorkItems table after the site collection was deleted.
The timer job then tried to create the variation hierarchy but failed to locate the site collection in the content database, which led to an unexpected System.ArgumentNullException when it tried to remove the problematic work item from the ScheduledWorkItems table. This exception causes the timer job to stop without processing the remaining scheduled work items for other variation hierarchies in other site collections in the same content database.
As the problematic work item remained in the ScheduledWorkItems table, the next run of the CreateVariationHierarchiesJobDefinition timer job tried to process it again, ran into the same exception, and again stopped without processing any further work items.
Each time the "Create Hierarchies" button is clicked in any site collection that resides in the same content database, a new work item is added to the ScheduledWorkItems table. But these work items are never processed, because the timer job already fails on the older problematic work item that references the deleted site collection.
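The blocking behavior described above can be illustrated with a small toy model. This is not SharePoint code: WorkItem, delete_work_item and run_timer_job are hypothetical names made up for this sketch, and the only assumption is the one described in the text, namely that one failing work item ends the whole run while the run is still reported as succeeded.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    item_id: int
    site_id: str  # site collection the work item belongs to (hypothetical field)


def delete_work_item(item, existing_sites):
    # Toy stand-in for the cleanup step: it needs the parent site collection
    # and raises when that site collection no longer exists.
    if item.site_id not in existing_sites:
        raise LookupError(f"site collection {item.site_id} not found")


def run_timer_job(queue, existing_sites):
    """Process work items oldest first; one failure ends the whole run."""
    processed = []
    try:
        for item in list(queue):
            delete_work_item(item, existing_sites)
            queue.remove(item)
            processed.append(item.item_id)
    except LookupError:
        pass  # the run stops here but is still reported as succeeded
    return processed


sites = {"Pub2"}  # "Pub1-deleted" stands for the site collection that was removed
queue = [WorkItem(1, "Pub1-deleted"), WorkItem(2, "Pub2")]

first_run = run_timer_job(queue, sites)
queue.append(WorkItem(3, "Pub2"))  # someone clicks "Create Hierarchies" again
second_run = run_timer_job(queue, sites)

print(first_run, second_run, [w.item_id for w in queue])
# prints: [] [] [1, 2, 3]
```

In this model nothing is ever processed while the orphaned item 1 sits at the head of the queue; once it is removed, the next run handles items 2 and 3 normally.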
If you are interested in reproducing this problem, you can use the following steps:
The hierarchies should be created for the Pub2 site collection.
The "Variations Create Hierarchies Job Definition" is shown as succeeded in the job history (it took 0 seconds), but no hierarchy was created for the new label.
On the site, we could see that the hierarchies were not created.
If you suspect that you have run into this issue, you should enable verbose logging for the "SharePoint Foundation" - "General" category. Afterwards you will find the following entry in the ULS log whenever such a problematic work item is being processed:
OWSTIMER.EXE (0x1EF4) 0x1C8C SharePoint Foundation General 8nc8 Verbose TimerJob WorkItem Processing exception: System.ArgumentNullException: Value cannot be null. Parameter name: workItemId at Microsoft.SharePoint.SPWorkItemCollection.DeleteWorkItem(Guid workItemId) at Microsoft.Office.Server.Utilities.TimerJobUtility.<>c__DisplayClass1.<ProcessWorkItem>b__0() at Microsoft.Office.Server.Utilities.MonitoredScopeWrapper.RunWithMonitoredScope(Action code) at Microsoft.Office.Server.Utilities.TimerJobUtility.ProcessWorkItem(SPWorkItemCollection workItems, SPWorkItem wi, WorkItemTimerJobState timerJobState, ProcessWorkItemWithState processor) at Microsoft.SharePoint.Publishing.Internal.VariationsSpawnJobDefinitionBase.ProcessWorkItem(SPContentDatabase contentDatabase, SPWorkItemCollection workItems, SPWorkItem workItem, SPJobState jobState) at Microsoft.SharePoint.Administration.SPWorkItemJobDefinition.ProcessWorkItems(SPContentDatabase contentDatabase, SPWorkItemCollection workItems, SPJobState jobState) at Microsoft.SharePoint.Administration.SPWorkItemJobDefinition.HandleOneContentDatabase(SPContentDatabase db, SPJobState jobState)
This error message is a bit misleading: it complains about a null value, although workItemId is a GUID. In fact, the call fails because the ParentID field in the "ScheduledWorkItems" table references the GUID of the site collection that was deleted.
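To check a folder of ULS logs for this entry, a small script along the following lines can help. It is only a sketch: the function name, the "*.log" file pattern and the UTF-8 encoding are assumptions that may need adjusting for your farm, and the two sample lines are shortened, made-up demo data rather than real log output. The only things taken from the entry above are the event ID 8nc8 and the message text.

```python
import tempfile
from pathlib import Path

EVENT_ID = "8nc8"  # event ID of the ULS entry shown above
MARKER = "TimerJob WorkItem Processing exception"


def find_stuck_work_item_entries(log_dir, pattern="*.log"):
    """Return (file name, line number) for every matching ULS log line."""
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # Assumption: the logs are readable as UTF-8; change the encoding if not.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line_no, line in enumerate(handle, start=1):
                if EVENT_ID in line and MARKER in line:
                    hits.append((path.name, line_no))
    return hits


# Demo against a temporary folder containing two shortened, made-up lines.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.log"
    sample.write_text(
        "OWSTIMER.EXE\tSharePoint Foundation\tGeneral\t8nc8\tVerbose\t"
        "TimerJob WorkItem Processing exception: System.ArgumentNullException\n"
        "OWSTIMER.EXE\tSharePoint Foundation\tGeneral\tabcd\tVerbose\tunrelated\n",
        encoding="utf-8",
    )
    demo_hits = find_stuck_work_item_entries(tmp)

print(demo_hits)
# prints: [('sample.log', 1)]
```

Keep in mind that the entry is only written while verbose logging is enabled for the category mentioned above.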
The DeleteWorkItem method (of the SPWorkItemCollection object) needs to have an existing site to remove the WorkItem from the table (see the community content in this link):
As you can see in the call stack in the ULS logs, the public API to delete a work item has a limitation that prevents us from removing the problematic work item: it first has to retrieve the SPSite object the work item belongs to. As this site collection no longer exists, the SPSite object cannot be retrieved, which means that we cannot remove the work item using this API. On the other hand, we have to remove the work item in order to get hierarchy creation working again for the site collections in this content database.
To solve the issue, the internal Delete method of the affected work item itself has to be called.
As this is not a public API, Reflection is required to get access to this method, as outlined in the following code sample:
To pass the SPWorkItem object as a parameter to this function, standard SharePoint Object Model calls can be used to list the work items and identify the one we want to delete.
To identify the hierarchy creation work items, we can, for example, filter by their well-known GUID.
This issue is not very likely to occur. It is quite rare for someone to request the creation of a hierarchy and, on the same day, delete the site collection hosting it.
If you run into such an issue, we recommend opening a support case with Microsoft to get it fixed.
A short comment: a user sent me information on how to do the same using direct SQL commands.
Never ever perform direct modifications to the SQL database!
That is unsupported and you would have to delete the database and start from scratch to get back to a supported state!