Command Shell Reference
SQL Queries
Return discovered inventory

Just playing around today and thought I’d make this little post on how to return discovered inventory in Command Shell.  This one in particular returns all discovered disk sizes, converted to GB and rounded to the nearest whole number.

get-MonitoringClass | where {$_.Name -like "Microsoft.Windows.Server.200?.LogicalDisk"} | get-MonitoringObject | get-MonitoringObjectProperty | Where {$_.Name -eq "SizeNumeric"} | Select @{Name="GB";Expression={[Math]::Round($_.Value / 1024, 0)}}

 

This is what I call fun!
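For anyone who wants to check the arithmetic outside the Command Shell, here is a minimal Python sketch of the same conversion.  It assumes SizeNumeric is reported in megabytes, which the division by 1024 in the one-liner implies, and the sample sizes are made up for illustration.

```python
# Hypothetical SizeNumeric values in MB; real values come from the discovered LogicalDisk objects.
sample_sizes_mb = [40954, 139897, 476937]


def mb_to_gb(size_mb):
    """Convert MB to GB and round to a whole number, like [Math]::Round(x / 1024, 0)."""
    # Python's round(), like .NET's default Math.Round, rounds exact halves to the nearest even number.
    return round(size_mb / 1024)


for size in sample_sizes_mb:
    print(size, "MB is about", mb_to_gb(size), "GB")
```

Both Python and the default .NET rounding send an exact half to the nearest even number, so a 1536 MB disk reports as 2 GB and a 2560 MB disk also reports as 2 GB.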

 


How and when is data written (or synchronized) to the Data Warehouse?

I’ve been asked this question a couple times and have seen these types of questions in the forums.  It’s one of those questions that eludes SCOM administrators, because back in the MOM days the data was synchronized to the Data Warehouse by means of DTS jobs.  Although the MOM DTS jobs may be complex to someone who is not a SQL DBA, the concept was quite simple; data would always be synchronized from the operational database to the Data Warehouse database on a schedule.

Using the scheduled DTS method of synchronizing data means the Data Warehouse is never really in sync with the operational database.  Synchronization was re-engineered and greatly improved in SCOM, and now synchronization is controlled via workflows from within the product rather than relying on scheduled jobs in SQL Server.

If you think the DTS jobs in MOM were confusing or difficult to manage and troubleshoot, you’re in for an even bigger learning curve in SCOM.  There are several synchronization workflows scattered throughout the core management packs, and they can be difficult to decipher.  If you’re sharp with XML and SQL, you can reverse engineer these workflows to gain a better understanding.

However, not all data is synchronized between the operational and Data Warehouse databases.  In fact, only Alert and Discovery data are synchronized.  All other data types are written in parallel to both the operational and Data Warehouse databases.  This is usually what confuses so many SCOM engineers.  This is what confused me for the longest time, and I just couldn’t wrap my head around it.  So I had a short conversation with Vitaly from the product team, and he was able to demystify synchronization with one sentence.

To paraphrase that conversation…

Tip: As a general rule, if data cannot change once it is created, it is written in parallel to both the operational and Data Warehouse databases.  Otherwise it is synchronized.

So, because Discovery and Alert data can change after it is written to the database, these types of data need to be synchronized from the operational database to the Data Warehouse.

Conversely, Event (including state change event) and Performance data will never change (or be updated or modified) once it is written to the database, so these types of data will be written to both databases in parallel.
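The rule is simple enough to express as a lookup.  The following Python sketch is only an illustration of the rule as stated above; the dictionary and function names are made up and are not part of Operations Manager.

```python
# Data types named in the post, flagged by whether a record can change after it is created.
# The dictionary and function names are illustrative, not part of Operations Manager.
CAN_CHANGE_AFTER_CREATION = {
    "Alert": True,
    "Discovery": True,
    "Event": False,
    "StateChangeEvent": False,
    "Performance": False,
}


def warehouse_path(data_type):
    """Apply the general rule: mutable data is synchronized, immutable data is written in parallel."""
    if CAN_CHANGE_AFTER_CREATION[data_type]:
        return "synchronized"
    return "parallel write"


for name in CAN_CHANGE_AFTER_CREATION:
    print(name, "->", warehouse_path(name))
```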

Note: Not all event and performance data collection workflows need to write in parallel to both databases.  When a rule is created, specific write actions are automatically configured for the workflow, and these define that the data is written to the operational database and Data Warehouse in parallel.  If we really do not want this data written to the Data Warehouse, perhaps because it is intended for an operational view only, we can configure the workflow to write to the operational database only.

Now that we know which types of data are synchronized and which are written in parallel, let’s go back to the synchronization part.  How and when does synchronization of Discovery and Alert data actually happen?

As stated earlier, synchronization is controlled from within SCOM by some internal workflows, and breaking apart the XML and some SQL programmability will reveal most of these secrets.  We’ll see that Alert synchronization is scheduled to run every 3 minutes.  This workflow is configurable, and exposes an override parameter for the interval.  I’ve never tinkered with changing these workflows, and cannot think of any reason why anyone would ever want to, so I would just as soon leave it as is.  With that in mind, we know there could be a delta of up to 3 minutes between the operational database and Data Warehouse for Alert data.

I’ve spent some time looking at some of the Discovery synchronization workflows, but this seems to be a little more complex, and I haven't completely figured that one out yet.  Someday I will dive deeper into Discovery data synchronization and I’ll be sure to share that knowledge with everyone when I do.

If anyone else has some knowledge to share about the synchronization workflows, feel free to comment here for others to read.

Health Service Heartbeat Failure, Diagnostics and Recoveries

I’ve seen plenty of questions come up in the forums and from customers regarding the Health Service Heartbeat Failure monitor, and its associated diagnostics and recoveries.  I spent a little time digging further into these workflows and thought I’d share what I found here.  Hope this helps those curious about what’s happening under the hood.

Communication Channel Basics

After an Operations Manager Agent is installed on a Windows computer, and after it is approved to establish a communication channel with an Operations Manager 2007 management group, the communication channel is maintained by the Health Service.  If this communication channel is interrupted or dropped between the Agent and its primary Management Server (MS) for any reason, the Agent will make three attempts to re-establish communication with its primary MS, by default.

If the Agent is not able to re-establish the channel to its primary MS, it fails over to the next available MS.  Failover configuration and the order of failover is another topic, and will not be covered here.

While the Agent is failed over to a secondary MS, it will attempt to re-establish communication with its primary MS every 60 seconds, by default.  As soon as the Agent can establish communication with its primary MS again, it will disconnect from the secondary MS and fail back to its primary MS.

Health Service Heartbeat Failure Monitor

To briefly summarize the Heartbeat process, there are two configurable settings that control Heartbeat behavior: the Heartbeat interval and the number of missed Heartbeats allowed.  If the MS fails to receive a Heartbeat from an Agent computer for more than the specified number of intervals, the Health Service Heartbeat Failure monitor will change to a critical state and generate an alert.

Read more about Heartbeat and configuration here.
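To make the interaction between the two settings concrete, here is a rough Python sketch of the check as described above.  It is a model of the idea only, not the product's implementation.  The function name is hypothetical, and the 60-second interval and 3 missed Heartbeats used as defaults are assumptions for the example, so check the values configured in your own management group.

```python
from datetime import datetime, timedelta


def heartbeat_failed(last_heartbeat, now, interval_seconds=60, missed_allowed=3):
    """Return True once more Heartbeats have been missed than the configured number allows."""
    elapsed = (now - last_heartbeat).total_seconds()
    # Each full interval that passes without a Heartbeat counts as one missed Heartbeat.
    missed = int(elapsed // interval_seconds)
    return missed > missed_allowed


last_seen = datetime(2010, 1, 1, 12, 0, 0)
for seconds in (60, 180, 240):
    check_time = last_seen + timedelta(seconds=seconds)
    print(seconds, "seconds of silence, failed:", heartbeat_failed(last_seen, check_time))
```

With these example values the monitor would stay healthy through three silent intervals and turn critical on the fourth.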

Diagnostic and Recovery Tasks

There are two diagnostic tasks that run when the Health Service Heartbeat Failure monitor changes to a critical state: Ping Computer on Heartbeat Failure and Check If Health Service Is Running.

Ping Computer on Heartbeat Failure

This diagnostic is defined in the Operations Manager 2007 Agent Management Library and is enabled by default. This workflow uses the Automatic Agent Management Account, which will run under the context of the Management Server Action Account by default, to execute a probe action which is defined in the Microsoft System Center Library named WmiProbe.

This probe is initiated on the Health Service Watcher. Since the Health Service Watcher is a perspective class hosted by the Root Management Server, this is where the WMI query is executed when the Health Service Heartbeat Failure monitor changes to a critical state. Even though the agent may be reporting to another MS, it is the RMS that sends the ICMP packet to the agent.

Unlike the traditional Ping.exe program we are all accustomed to, which sends four ICMP packets to the target host by default, the WMI query is executed only once and sends a single ICMP packet, so there is no calculation of percentage of lost packets one would expect to see with Ping.exe.

Following is the WMI query executed on the RMS.

SELECT * FROM Win32_PingStatus WHERE Address = '$Config/NetworkTargetToPing$'

To verify the number of ICMP packets sent, I ran a traditional Ping.exe test and the WMI query used in this workflow and traced these using Netmon.  The first two entries in the image below were captured from the WMI query, and the last eight entries captured were from a Ping.exe test using default parameters (four packets).

WMI query vs. Ping.exe
image

The WMI query results are passed to a condition detection module, which filters on StatusCode and executes the appropriate write action. If StatusCode <> 0, the write action ComputerDown will set state to reflect that the computer is down. If StatusCode = 0, the write action ComputerUp will set state to reflect that the computer is up.

The condition detection modules that filter StatusCode are actually the recovery tasks shown in the Health Service Heartbeat Failure monitor. These are the reserved recoveries, Reserved (Computer Not Reachable - Critical) and Reserved (Computer Not Reachable - Success), respectively.

Under the covers, these reserved recoveries are actually setting state of the Computer Not Reachable monitor, which is defined in the System Center Core Monitoring MP. Ultimately, if StatusCode <> 0, the Computer Not Reachable monitor will change to a critical state and generate the Failed to Connect to Computer alert.

Since this is a diagnostic task which runs during a degraded state change event, the Agent will only be pinged once, when the Health Service Heartbeat Failure monitor changes to a critical state. If there are any network related problems after this monitor has changed to critical and the diagnostic task has run, there will be no further monitoring of the ping status of this Agent and no “Failed to Connect to Computer” alert will be generated.

We can understand the root cause better based on whether the Health Service Heartbeat Failure alert was generated along with the Failed to Connect to Computer alert. If the Health Service Heartbeat Failure alert was generated without the Failed to Connect to Computer alert, logic tells us the Agent answered the ping, so the issue is not a loss of network connectivity, and the server has not shut down or become unresponsive. Both alerts together generally indicate the server is completely unreachable due to a network outage, or the server is down or unresponsive.
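The decision logic in this section can be summarized in a few lines.  The Python sketch below is only a model of what is described above; the function names are hypothetical, and the return strings are paraphrases rather than product text.

```python
def ping_write_action(status_code):
    """Mirror the condition detection: a StatusCode of 0 means the ping reply arrived."""
    return "ComputerUp" if status_code == 0 else "ComputerDown"


def likely_cause(heartbeat_alert, failed_to_connect_alert):
    """Interpret which of the two alerts were generated together."""
    if heartbeat_alert and failed_to_connect_alert:
        return "server unreachable: network outage, or server down or unresponsive"
    if heartbeat_alert:
        return "server answered the ping: not a network outage or a down server"
    return "no heartbeat failure detected"


print(ping_write_action(0))
print(ping_write_action(11010))  # 11010 is the Win32_PingStatus code for a request that timed out
print(likely_cause(True, False))
```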

Check if Health Service is Running

This diagnostic is defined in the Operations Manager 2007 Agent Management Library and is enabled by default.  This workflow uses the Automatic Agent Management Account, which will run under the context of the Management Server Action Account by default, to initiate a probe action which is defined in the Operations Manager 2007 Agent Management Library named QueryRemoteHS.

Specifically, this probe is initiated on the Health Service Watcher, which is the MS, and queries Health Service state and configuration on the Agent, when the Health Service Heartbeat Failure monitor changes to a critical state.  This probe module type is further defined in the Windows Core Library.  It takes computer name and service name as configuration, and passes the query results through an expression filter and returns the startup type and current state of the Health Service.

If the service doesn't exist or the computer cannot be contacted, state will reflect this.  Depending on output of the diagnostic task, optional recovery workflows may be initialized (i.e., reinstall agent, enable and start Health Service, and continue Health Service if paused), but these recoveries are not enabled by default.

Why are there no alerts coming in?

Every now and then someone will ping me and ask why they are not seeing any new alerts.  My first question is, do you expect an alert for some reason?  Sometimes there may be an issue with SCOM.  But every now and then we find that SCOM is working just fine, and that their entire environment is seemingly healthy for a period of time.

One thing I ask customers to do is create a rule to capture a synthetic transaction, which can be initiated from any agent to test communications and monitoring workflow.  With this simple rule, if they ever suspect a problem with an agent not working correctly for some reason, or perhaps that SCOM is not generating the volume of alerts it most commonly does, we can generate a very simple synthetic transaction and validate that monitoring data is making its way to the console.

Here’s how.

Create the rule

Create a new rule as shown, saving to your extended monitoring MP.
image

Complete the general screen as shown.
image

Paste the following text into the rule description, as it will come in handy later.

EventCreate /T ERROR /ID 101 /L APPLICATION /SO TEST /D "This is a synthetic transaction test only.  Disregard this event."

Event log type is Application.
image

Build the expression as shown.
image

Configure alerts as shown, and click create.
image 

Now when you want to create a synthetic transaction to test whether alerting is working, you can copy what you had pasted into the rule description earlier and paste that into a command prompt on any Windows Server (2003 or later version) with an agent installed.

image

If alerting data is flowing as it should, you’ll see a new alert in the console.

image

Group members (DW)

/*
Return Health Service instances hosting one or more instances contained in system.group
*/
USE OperationsManagerDW
SELECT vManagedEntity.DisplayName AS Computer
FROM  vManagedEntity INNER JOIN
               vManagedEntityType ON vManagedEntity.ManagedEntityTypeRowId = vManagedEntityType.ManagedEntityTypeRowId
WHERE (vManagedEntity.TopLevelHostManagedEntityRowId IN
                   (SELECT DISTINCT ME1.TopLevelHostManagedEntityRowId
                    FROM   vManagedEntity AS ME2 INNER JOIN
                                   vRelationship ON ME2.ManagedEntityRowId = vRelationship.SourceManagedEntityRowId INNER JOIN
                                   vManagedEntity AS ME1 ON vRelationship.TargetManagedEntityRowId = ME1.ManagedEntityRowId
                    WHERE (ME2.DisplayName = 'group'))) AND (vManagedEntityType.ManagedEntityTypeSystemName = 'Microsoft.SystemCenter.HealthService')
ORDER BY Computer

 

Back to SQL queries main menu

All groups (DW)

Thanks to Daniel Savage for helping with this one.

/*
Get all groups from Data Warehouse
*/
USE OperationsManagerDW
SELECT DISTINCT DisplayName
FROM  vManagedEntity
WHERE (ManagedEntityTypeRowId IN
                   (SELECT ManagedEntityTypeRowId
                    FROM   dbo.ManagedEntityDerivedTypeHierarchy
                                       ((SELECT ManagedEntityTypeRowId
                                         FROM   vManagedEntityType
                                         WHERE (ManagedEntityTypeSystemName = 'system.group')), 0)))

 


Agents: remove configured management group or uninstall agent using command line

I had a question today about uninstalling an agent using the command line.  The options referenced here are great for installing, configuring and modifying an agent.  Here are some additional notes that may come in handy, which supplement Rob’s excellent command line post.

 

Remove a specific MG configuration from the agent

MsiExec.exe /i \\path\MOMAgent.msi  /norestart /qn /l*v %temp%\RemoveMG.log MANAGEMENT_GROUP=<OldManagementGroup> MANAGEMENT_GROUP_OPERATION=RemoveConfigGroup REINSTALL=ALL

Remove all configured MG's

You'll need to run this as many times as there are MG's.  Only one MG at a time can be removed using this method.  I’m not 100% certain, but I believe the program simply selects the first MG found in the registry.

MsiExec.exe /i \\path\MOMAgent.msi /norestart /qn /l*v %temp%\RemoveMG.log MANAGEMENT_GROUP_OPERATION=RemoveConfigGroup

If AD Integration is enabled on the agent, the Health Service will remain running.  If AD Integration configuration is discovered in the future, this agent will pick up that configuration and start reporting to the specified MG.

If AD Integration is not enabled on the agent, the Health Service will stop but the agent will remain installed.  The Health Service will continue to start automatically during server startup, but will again be stopped once it discovers AD Integration is disabled and there are no MG’s configured.

Uninstall R2 Agent

Removes Agent binaries and all configured MG’s

msiexec /x {25097770-2B1F-49F6-AB9D-1C708B96262A} /qn /norestart /l*vx %temp%\RemoveAgent.log

Uninstall SP1 Agent (slipstream only, not RTM upgrade)

Removes Agent binaries and all configured MG’s

msiexec /x {E7600A9C-6782-4221-984E-AB89C780DC2D} /qn /norestart /l*vx %temp%\RemoveAgent.log

Monitor the operations database grooming procedure (v2)

Okay, so it’s only been two days since I posted the original monitor.  But, I’ve been thinking about this one after the original post and now have what I think is a better solution to monitoring the operations groom procedure.  It’s still worthwhile to read the first couple paragraphs in the original post, as I talk about why we should monitor the groom procedure and common causes of failures.

What I wanted to avoid with the initial post was a dependency on another MP and an assumption that customers are running the SQL MP, so I did not specifically target the SQL DB Engine, which might have been a better choice.  But, still, it’s not exactly a precise target class, because we still need to create the monitor in a disabled state and override-enable for the instance.  Not to mention, things get a little more complicated if the database server is clustered.

With all this in mind, I decided to write up this new post with instructions to create this workflow to run on the Root Management Server.  This also gives us an opportunity to change the monitor implementation to include three states.  The reason is that grooming will sometimes fail for other reasons relating to database performance or availability, and then succeed the next day.

Rather than creating a critical state and generating an alert in this single failure scenario, I thought it would be better to only change state to warning at the first failure.  Then if the groom process fails two or more times, change state to critical and generate a critical alert.  I think this makes more sense, as we certainly don’t need to see any premature alerts for a condition that may not be directly related to grooming.
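As a quick illustration of the three-state idea, here is a small Python sketch of the mapping.  It assumes, as the script at the end of this post implies, that StatusCode is 1 for a successful groom job and 0 for a failed one, and that the sum covers yesterday and today.  The function name is made up for the example.

```python
def groom_state(status_sum):
    """Map the two-day sum of groom StatusCode values to a monitor state."""
    if status_sum >= 2:
        return "Healthy"   # both groom runs succeeded
    if status_sum == 1:
        return "Warning"   # one failure in the last two days
    return "Critical"      # no successful run in the last two days


for total in (2, 1, 0):
    print(total, "->", groom_state(total))
```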

So here we go!

 

Create the Monitor

Timed Script Three State Monitor
image

Configure general properties as shown.
image

Configure schedule as shown.
image 

Configure script information as shown, then click Parameters. (script is at end of article)
image

Configure script parameters as shown, with a space between the two parameters, entering your database server or cluster virtual name.  If you gave the database a custom name, enter that database name instead of OperationsManager.
image

Configure unhealthy expression as shown.
image

Configure degraded expression as shown.
image

Configure healthy expression as shown.
image

Configure health mapping as shown.
image

Configure alert settings as shown.
image

 

State change and alert flow

Warning state change event for one failed groom interval
Capture

Critical state change event for 2 or more failed groom intervals
Capture

Critical alert for two or more failed groom intervals
Capture

Upgrade state when groom succeeds once after critical
Capture

Returns to healthy for two or more successful groom intervals
Capture

 

The script

'OperationalDatabaseGroomingProcedureMonitor3State.vbs

Option Explicit
'Declarations
Dim objCN,objRS,strQuery,strStatusSum
Dim oArgs,oAPI,oBag
Dim strDBServer,strDatabase

'Define local event constants
Const EVENT_TYPE_ERROR = 1
Const EVENT_TYPE_WARNING = 2
Const EVENT_TYPE_INFORMATION = 4

'Create objects
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oArgs = WScript.Arguments
Set oBag = oAPI.CreatePropertyBag()

'Define parameters
strDBServer = oArgs(0)
strDatabase = oArgs(1)

'Set DB connection
Set objCN = CreateObject("ADODB.Connection")
objCN.Open "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=" & _
strDatabase & ";Data Source=" & strDBServer & ""

strQuery = "SELECT SUM(StatusCode) AS StatusSum " & _
"FROM  InternalJobHistory " & _
"WHERE CONVERT(varchar, TimeStarted, 101) IN " & _
"(CONVERT(varchar, DATEADD(day, - 1, GETUTCDATE()), 101), " & _
"CONVERT(varchar, GETUTCDATE(), 101)) " & _
"AND Command = 'Exec dbo.p_GroomPartitionedObjects and dbo.p_Grooming'"

'Query DB
Set objRS = objCN.Execute(strQuery)

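'Set variables
'SUM returns Null when no groom job was logged in the window; count that as zero successes
strStatusSum = objRS("StatusSum")
If IsNull(strStatusSum) Then strStatusSum = 0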

'Submit Property Bag
Call oBag.AddValue("DBServer",strDBServer)
Call oBag.AddValue("Database",strDatabase)
Call oBag.AddValue("StatusSum",strStatusSum)

'Healthy state
If strStatusSum >= 2 Then
    Call oBag.AddValue("State","Healthy")
    'Log event to Operations Manager log.  For testing only.
    Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor3State.vbs",100,EVENT_TYPE_INFORMATION,"Script executed " & _
    "with StatusSum " & strStatusSum)

'Warning state
ElseIf strStatusSum = 1 Then
    Call oBag.AddValue("State","Warning")
    Call oBag.AddValue("Details","Operational database grooming has failed one time in the last two days.")
    'Log event to Operations Manager log.  For testing only.
    Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor3State.vbs",100,EVENT_TYPE_WARNING,"Script executed " & _
    "with StatusSum " & strStatusSum)

'Critical state
ElseIf strStatusSum = 0 Then
    Call oBag.AddValue("State","Critical")
    Call oBag.AddValue("Details","Operational database grooming has failed two or more times. " & _
    "Check grooming by running the following SQL query against the operations database: " & VBCRLF & VBCRLF & _
    "SELECT * FROM InternalJobHistory ORDER BY InternalJobHistoryId DESC")
    'Log event to Operations Manager log.  For testing only.
    Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor3State.vbs",100,EVENT_TYPE_ERROR,"Script executed " & _
    "with StatusSum " & strStatusSum)

End If

'Return property values
Call oAPI.Return(oBag)

Monitor the operations database grooming procedure

Oftentimes we’re not aware of, or concerned about, grooming in Operations Manager.  After all, it’s an internal process that is managed by the RMS and we might expect an alert if anything goes awry with the operations database.  Right?

The System Center Core Monitoring management pack does have the Operational Database Space Free (%) monitor, which will generate an alert if the database is running low on space; 40%=Warning and 20%=Critical, by default.  This is our first indicator that something is not right, because we should size the operations database accordingly to accommodate a steady operational state.  This is the main reason we recommend not enabling auto growth on the operations database.

Is it good enough to know when the operations database is running out of space?  Not for me.  Because running out of disk space is usually a belated indication and symptom of another issue.

 

Common causes of database growth

The only time the operations database will grow beyond what is expected during steady operational state, is when we introduce some noise.  The issue is usually one or more of the following:

* Poorly written monitor was introduced
* Too much data collected by new rule
* New management pack(s) imported
* Adding agents to management group 
* Transaction Log is not sized accordingly

At the end of the day it’s the grooming procedure that isn’t running successfully.  Unless there was some issue not directly related to the grooming process (performance or availability problem), if the grooming procedure fails one time it is likely that it will continue failing until an administrator intervenes.

The last bullet above, Transaction Log is not sized accordingly, is the reason grooming will continue to fail after the first failure.  Because when grooming fails, it’s almost always because the transaction log ran out of space.

If the transaction log runs out of free space during the first failed groom interval, it certainly will not have enough space during subsequent intervals to successfully groom the data that created the large drop in the first place.

 

So why not monitor the grooming procedure?

Out of the box, there is no monitoring of the grooming procedures that keep the operational database clean and performing well.  I hope this is included in core monitoring at some point, but for now we’ll need to create a simple workflow to monitor grooming.

I could create some sort of extended management pack, which would properly discover the operations database server and include the associated monitor, but I’d rather show you how to do it so you can create this in your own custom management pack.  Most customers already have some sort of extended core monitoring management pack, and who needs another MP with a single workflow?

Before going further, please see the improved monitor instructions here.

Create a unit monitor

Timed Script Two State Monitor
image

Configure general properties as shown, and uncheck Monitor is enabled.
image

Configure schedule as shown.
image

Configure as shown, then click Parameters. (see script at end of post)
image

Configure parameters as shown.  If you have a custom name for your database, replace the parameter with your custom database name.
image

Create Unhealthy Expression filter as shown.
image

Create Healthy Expression filter as shown.
image

Configure Health Mapping as shown.
image

Configure Alert Settings as shown.
image

 

Enable the new monitor

Scope the Authoring space to Windows Server Operating System and find your new monitor.

Override for a specific object of class: Windows Server Operating System
image

Find the server hosting the operations database and select it.
image

Set the Enabled parameter to True.
image

 

The Results

Grooming failed

alert
image

state
image

state change event
image 

Grooming succeeded

alert auto-resolved
image

state
image

state change event
image

 

The Script

'OperationalDatabaseGroomingProcedureMonitor.vbs

Option Explicit
'Declarations
Dim objCN,objRS,strQuery,strInternalJobHistoryId,strTimeStarted,strStatusCode,strCommand
Dim oArgs,oAPI,oBag
Dim strDatabase,strDBServer

'Define local event constants
Const EVENT_TYPE_ERROR = 1
Const EVENT_TYPE_WARNING = 2
Const EVENT_TYPE_INFORMATION = 4

'Create objects
Set oAPI = CreateObject("MOM.ScriptAPI")
Set oArgs = WScript.Arguments
Set oBag = oAPI.CreatePropertyBag()

'Define parameters
strDBServer = "."
strDatabase = oArgs(0)

'Set DB connection
Set objCN = CreateObject("ADODB.Connection")
objCN.Open "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=" & strDatabase & ";Data Source=" & strDBServer & ""

strQuery = "SELECT InternalJobHistoryId, TimeStarted, StatusCode, Command FROM  InternalJobHistory " & _
"WHERE (CONVERT(varchar, TimeStarted, 101) = CONVERT(varchar, GETDATE(), 101)) and Command = 'Exec dbo.p_GroomPartitionedObjects and dbo.p_Grooming' " & _
"ORDER BY InternalJobHistoryId DESC"

'Query DB
Set objRS = objCN.Execute(strQuery)

'Set variables
strInternalJobHistoryId = objRS ("InternalJobHistoryId")
strTimeStarted = objRS ("TimeStarted")
strStatusCode = objRS ("StatusCode")
strCommand = objRS ("Command")

'Submit Property Bag
Call oBag.AddValue("DBServer",strDBServer)
Call oBag.AddValue("Database",strDatabase)
Call oBag.AddValue("InternalJobHistoryId",strInternalJobHistoryId)
Call oBag.AddValue("TimeStarted",strTimeStarted)
Call oBag.AddValue("StatusCode",strStatusCode)
Call oBag.AddValue("Command",strCommand)

'Healthy state
If strStatusCode = 1 Then
    Call oBag.AddValue("State","Healthy")
    'Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor.vbs",100,EVENT_TYPE_INFORMATION,"Healthy")

'Critical state
Else
    Call oBag.AddValue("State","Critical")
    Call oBag.AddValue("Details","Operational database grooming return status code " & strStatusCode & " for procedure " & strCommand & ", " & _
    "which was started at " & strTimeStarted & ". Check Internal Job History Id " & strInternalJobHistoryId & ".")
    'Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor.vbs",100,EVENT_TYPE_INFORMATION,"Critical")
End If

'Return property values
Call oAPI.Return(oBag)

'Log event to Operations Manager log.  For testing only.
'Call oAPI.LogScriptEvent("OperationalDatabaseGroomingProcedureMonitor.vbs",100,EVENT_TYPE_INFORMATION,"Operational database grooming return status code " & strStatusCode & " for procedure " & strCommand & ", " & _
'    "which was started at " & strTimeStarted & ". Check Internal Job History Id " & strInternalJobHistoryId & ".")

 

Table size and record count

This was initially published by Kevin, and Steve added row count.  I just changed the column names and posted it here for easy access to customers I work with.  Anytime I engage in a tuning effort or a case where there are database issues, I first ask customers to run all queries on the main page that have a red asterisk (*).

/*Table size and record count*/
USE OperationsManager
SELECT so.name AS 'Table Name', si.rowcnt as 'Row Count',
8 * Sum(CASE WHEN si.indid IN (0, 1) THEN si.reserved END) AS 'Data(kb)',
Coalesce(8 * Sum(CASE WHEN si.indid NOT IN (0, 1, 255) THEN si.reserved END), 0) AS 'Index(kb)'
FROM dbo.sysobjects AS so JOIN dbo.sysindexes AS si ON (si.id = so.id)
WHERE 'U' = so.type
GROUP BY so.name, si.rowcnt
ORDER BY 'Data(kb)' DESC

 


Command Shell Tab Expansion Delay

A few months ago, after re-installing my workstation operating system and getting all my applications and tools loaded, I fired up Operations Manager Command Shell.  Since I use the tab expansion function frequently, I noticed immediately that for some odd reason the Command Shell would pause for ~30 seconds every time I used tab completion.  I had to remind myself never to use it or I’d be twiddling my thumbs for a while.  Sometimes it was quicker to just close Command Shell and launch it again, but usually there was history saved which I didn’t want to lose.  So I waited…and waited.  Very frustrating.

Lincoln Atkinson created a workaround for this, which I stumbled across on the Technet forum.  This was such a wonderful find, I feel like I should spread the news.  Run the following script, or add it to your $profile, and tab away!

$tabExpand = (get-item function:\tabexpansion).Definition
if($tabExpand -match 'try {Resolve-Path.{49}(?=;)')
{
   $tabExpand = $tabExpand.Replace($matches[0], "if((get-location).Provider.Name -ne 'OperationsManagerMonitoring'){ $($matches[0]) }" )
   invoke-expression "function TabExpansion{$tabExpand}"
}

Thanks Lincoln!


Note

Some say this tab expansion delay issue is only observed with Powershell v2, but I was running Powershell v1 at the time and adding this script to my profile resolved the tab expansion delay.

Event data, views and grooming

Every day I have a number of fleeting thoughts that I hope every Operations Manager administrator knows.  With all the emerging blogs about Operations Manager over the past year offering up a wealth of knowledge, sometimes I assume the majority of these fleeting thoughts are common knowledge.  I’m going to stop assuming, and start writing.

So here’s a little bit of “common knowledge”.

When is the last time you were in the console looking at an event view?  If you recall, was it a good experience and did it help you resolve some issue?  Would you have considered it a good use of your time?  If so, did you need to see event data from 6 days ago…or even yesterday?  Or were you interested in current event data flowing into Operations Manager?

Personally, I have yet to use an event view for anything more than viewing current events flowing into Operations Manager.  Even still, it is a rare case I will use any event view, and I wouldn’t miss them if they disappeared.  If I need a history of events logged on a particular computer, or a history of a particular event across the Management Group, I’ll run a report.  Reports offer great value for event analysis.  And that’s the real value of event collection rules.

The point I’m trying to get across here is, consider adjusting the operational event grooming cycle down.  By default, Operations Manager retains event data in the operations database for 7 days.  Depending on your environment and how many event collection rules you’ve got running, this can amount to unnecessary expense.  Not just disk space.  There’s a penalty for performance as well.

It’s not uncommon for a single event collection rule to collect up to 1 million events in a week in some environments.  Imagine an operator opening that event view and the impact this would cause just serving up that view.  What could we possibly ascertain by looking at such a view, other than there are a load of events being collected?  This is what I call system overload.

Hypothetically speaking, if I were an Operations Manager administrator at some company, I would adjust the event grooming interval down to 1 day.  This is just one of many things I could do to help Operations Manager perform better without really losing anything.  I’d like to hear if there is a downside to this.
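To put rough numbers on that, here is a back-of-the-envelope Python sketch.  It assumes a single rule collecting at a steady rate of 1 million events per week, the figure mentioned above, and ignores everything else in the database.  The function name is made up for the example.

```python
def rows_retained(events_per_week, retention_days):
    """Estimate rows kept in the operations database for one rule at a steady collection rate."""
    events_per_day = events_per_week / 7
    return round(events_per_day * retention_days)


for days in (7, 1):
    print(days, "day retention:", rows_retained(1_000_000, days), "rows")
```

Under those assumptions, dropping retention from 7 days to 1 day cuts the rows an event view has to serve for that rule from about 1,000,000 to about 143,000.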

Groups and related information

Return state of a group

foreach($a in get-monitoringobjectgroup){if($a.DisplayName -eq "group name"){$a.HealthState}}

Return contained instances of a group and state of each instance

foreach ($group in get-monitoringobjectGroup) {if($group.DisplayName -eq "group name") {$group.GetRelatedMonitoringObjects() | ft DisplayName,HealthState}}

Return management pack a group is stored in

ForEach ($Group in get-MonitoringObjectGroup) {If ($Group.DisplayName -eq "group name") {get-MonitoringClass | where {$_.Id -eq $Group.Id.ToString()} | Foreach-Object {$_.getManagementPack()} | Select @{Name="Group Name";Expression={$Group.DisplayName}},@{Name="MP Name";Expression={$_.Name}},@{Name="MP DisplayName";Expression={$_.DisplayName}} | fl}}

 


Number of alerts raised per day for last 28 days

/*Number of alerts raised per day for last 28 days.*/
USE OperationsManagerDW
SELECT CONVERT(VARCHAR(10), DBCreatedDateTime, 101) AS Date, COUNT(*) AS Alerts
FROM  Alert.vAlert
WHERE (DBCreatedDateTime BETWEEN DATEADD(day, -27, GETDATE()) AND GETDATE())
GROUP BY CONVERT(VARCHAR(10), DBCreatedDateTime, 101)
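-- Sort on the underlying datetime; the mm/dd/yyyy string would mis-sort across a year boundary
ORDER BY MAX(DBCreatedDateTime) DESC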


Number of events collected per day for last 28 days

/*Number of events collected per day for last 28 days.*/
USE OperationsManagerDW
SELECT CONVERT(VARCHAR(10), DateTime, 101) AS Date, COUNT(*) AS Events
FROM  Event.vEvent
WHERE (DateTime BETWEEN DATEADD(day, - 27, GETDATE()) AND GETDATE())
GROUP BY CONVERT(VARCHAR(10), DateTime, 101)
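-- Sort on the underlying datetime; the mm/dd/yyyy string would mis-sort across a year boundary
ORDER BY MAX(DateTime) DESC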
