In the last article, I showed how you can run a few SQL queries to find some detailed information about IPs, runbooks and activities and their relationships in the database. In this article, I’ll focus more on running runbooks (Jobs). To start with, here’s a diagram of the relationships of the tables related to to running a runbook:

image.

Knowing these relationships, you can now use this info to to build queries to get interesting info about jobs.

List all actively-running runbooks across all Runbook Servers

SELECT *
  FROM [Orchestrator].[dbo].[POLICY_PUBLISH_QUEUE] ppq
  LEFT JOIN Orchestrator.dbo.POLICYINSTANCES pin on pin.PolicyID = ppq.PolicyID
  Where AssignedActionServer IS NOT NULL
  and TimeEnded IS NULL

List all running Jobs with activities that started more than 5 mins ago and have not finished

SELECT pin.PolicyID
    , pin.State
    , pin.Status
    , oi.ObjectID
    , oi.ObjectStatus
    , oi.StartTime
    , oi.EndTime
  FROM [Orchestrator].[dbo].[POLICYINSTANCES] pin
  LEFT JOIN Orchestrator.dbo.[OBJECTS] obj on obj.ParentID = pin.PolicyID
  LEFT JOIN Orchestrator.dbo.OBJECTINSTANCES oi on oi.ObjectID = obj.UniqueID
Where TimeEnded IS NULL
And Status IS NULL
and oi.EndTime IS NULL
and OI.StartTime IS NOT NULL
and DATEDIFF(MINUTE,oi.StartTime,getdate()) > 5

This could help you spot jobs that have stalled at at activity and might need manual intervention. Similar to this would be those jobs who have not sent a heartbeat in a while.

List of all currently-running Jobs that have not had their heartbeat update in the last 5 minutes

SELECT *
  FROM [Orchestrator].[dbo].[POLICY_PUBLISH_QUEUE] ppq
  LEFT JOIN Orchestrator.dbo.POLICYINSTANCES pin on pin.PolicyID = ppq.PolicyID
  Where AssignedActionServer IS NOT NULL
  and TimeEnded IS NULL
  and DATEDIFF(MINUTE, Heartbeat, GetDate()) > 5

Maybe you want to get some statistical info about how many jobs you’re running to make sure you’re utilizing your infrastructure appropriately. Here’s a good one:

Find out the highest number of runbooks that were run per hour at any given time in the last 30 days:

SELECT MAX(jobs.MaxInstances) FROM
(SELECT [ActionServer]
      , CONVERT(VARCHAR(19), dateadd(hour,datediff(hour,0,TimeStarted),0), 120) as Hourly
      , COUNT(*) as MaxInstances
  FROM [Opalis].[dbo].[POLICYINSTANCES]
  WHERE (DateDiff(M, TimeStarted,GETDATE()) < 1 )
  GROUP BY ActionServer, CONVERT(VARCHAR(19), dateadd(hour,datediff(hour,0,TimeStarted),0), 120)) as jobs

You can take these queries and build on them to find out even more details about what’s going on in your Orchestrator environment. I’ll provide some more examples involving cleaning up the database in the next article.