Silect MP Author is a simple tool for authoring SCOM Management Packs. Silect shipped Service Pack 2 today:
http://www.silect.com/mp-author
Download here: http://www.silect.com/content/mp-author-free-download-form
Questions? info@silect.com
KB Article for OpsMgr: http://support.microsoft.com/kb/2965445
KB Article for all System Center components: http://support.microsoft.com/kb/2965090
Download catalog site: http://catalog.update.microsoft.com/v7/site/Search.aspx?q=2965445
Key fixes:
[Error] :DataProviderCommandMethod.Invoke{dataprovidercommandmethod_cs370}( 000000000371AA78 )
An unknown exception was caught during invocation and will be re-wrapped in a DataAccessException. System.TimeoutException: The operation has timed out. at Microsoft.EnterpriseManagement.Monitoring.DataProviders.RetryCommandExecutionStrategy.Invoke(IDataProviderCommandMethodInvoker invoker) at Microsoft.EnterpriseManagement.Presentation.DataAccess.DataProviderCommandMethod.Invoke(CoreDataGateway gateWay, DataCommand command)
Microsoft.SystemCenter.CrossPlatform.ClientLibrary.Common.SDKAbstraction.ManagedObjectNotFoundException
Let's get started.
From reading the KB article, the order of operations is:
1. Update the management servers (and gateways, web console servers, and consoles)
2. Apply the SQL scripts
3. Manually import the management packs
4. Update agents
Now, we need to add another step: if we are using Xplat monitoring, we need to update the Linux/Unix MPs and agents.
5. Update Unix/Linux MPs and agents
1. Management Servers
Since there is no RMS anymore, it doesn't matter which management server I start with; there is no need to begin with the server holding the RMSe role. I simply make sure I patch only one management server at a time, to allow for agent failover without overloading any single management server.
I can apply this update manually via the MSP files, or I can use Windows Update. I have 3 management servers, so I will demonstrate both. I will do the first management server manually. This management server holds 3 roles, and each must be patched: Management Server, Web Console, and Console.
The first thing I do when I download the updates from the catalog, is copy the cab files for my language to a single location:
Then extract the contents:
Once I have the MSP files, I am ready to start applying the update to each server by role.
***Note: You MUST log on to each server as a Local Administrator and SCOM Admin, AND your account must also have the System Administrator (SA) role on the database instances that host your OpsMgr databases.
My first server is a management server, and the web console, and has the OpsMgr console installed, so I copy those update files locally, and execute them per the KB, from an elevated command prompt:
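The commands look like the following sketch. The MSP file names here are hypothetical examples; use the actual file names you extracted from the downloaded cab files, run from an elevated prompt:

```powershell
# File names below are EXAMPLES ONLY -- substitute the actual MSP names
# extracted from the UR3 cab files. msiexec's /update switch applies a patch.
msiexec.exe /update "C:\UR3\KB2965445-AMD64-Server.msp"
msiexec.exe /update "C:\UR3\KB2965445-AMD64-WebConsole.msp"
msiexec.exe /update "C:\UR3\KB2965445-AMD64-Console.msp"
```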
This launches a quick UI which applies the update. It will bounce the SCOM services as well. The update does not provide any feedback on success or failure, so check the Application event log for the MsiInstaller events to confirm:
Log Name: Application
Source: MsiInstaller
Date: 8/6/2014 3:00:46 PM
Event ID: 1022
Task Category: None
Level: Information
Keywords: Classic
User: OPSMGR\kevinhol
Computer: SCOM01.opsmgr.net
Description:
Product: System Center Operations Manager 2012 Server - Update 'System Center 2012 R2 Operations Manager UR3 Update Patch' installed successfully.
You can also spot check a couple DLL files for the file version attribute.
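A quick way to spot-check versions is PowerShell. The path below assumes a default R2 install location; adjust to your environment:

```powershell
# List file versions of the server DLLs after the update.
# Path is the default SCOM 2012 R2 install directory -- adjust as needed.
Get-Item "C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\*.dll" |
    Select-Object Name, @{n='FileVersion'; e={$_.VersionInfo.FileVersion}} |
    Sort-Object Name
```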
Next up – run the Web Console update:
This runs much faster. A quick file spot check:
Lastly – install the console update (make sure your console is closed):
A quick file spot check:
Secondary Management Servers:
I now move on to my secondary management servers, applying the server update, then the console update.
On this next management server, I will use the example of Windows Update as opposed to manually installing the MSP files. I check online, and make sure that I have configured Windows Update to give me updates for additional products:
This shows me two applicable updates for this server:
I apply these updates (along with some additional Windows Server updates I was missing), and reboot each management server, until all management servers are updated.
Updating Gateways:
I can use Windows Update or manual installation.
The update launches a UI and quickly finishes.
Then I will spot check the DLLs:
I can also spot-check the \AgentManagement folder, and make sure my agent update files are dropped here correctly:
2. Apply the SQL Scripts
In the path on your management servers, where you installed/extracted the update, there are two SQL script files:
%SystemDrive%\Program Files\System Center 2012\Operations Manager\Server\SQL Script for Update Rollups
First – let’s run the script to update the OperationsManager database. Open a SQL management studio query window, connect it to your Operations Manager database, and then open the script file. Make sure it is pointing to your OperationsManager database, then execute the script.
Click the “Execute” button in SQL mgmt. studio. The execution could take a considerable amount of time and you might see a spike in processor utilization on your SQL database server during this operation.
You will see the following (or similar) output:
or
IF YOU GET AN ERROR – STOP! Do not continue. Try re-running the script several times until it completes without errors. In a large environment, you might have to run this several times, or even potentially shut down the services on your management servers, to break their connection to the databases, to get a successful run.
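To break the management servers' database connections while re-running the script, you can stop the SCOM services from PowerShell. This is a sketch using the standard SCOM 2012 service names; verify them with Get-Service in your environment:

```powershell
# Standard SCOM 2012 R2 service names: SDK (Data Access), Config, and Health Service.
$scomServices = 'OMSDK', 'cshost', 'healthservice'

# Stop them on this management server to release DB connections...
Get-Service $scomServices | Stop-Service -Force

# ...re-run the operational database SQL script in SQL Management Studio now...

# ...then restart the services once the script completes cleanly.
Get-Service $scomServices | Start-Service
```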
Technical tidbit: This script has been updated in UR3. Even if you previously ran this script in UR1 or UR2, you must run this again.
Next, we have a script in UR3 to run against the warehouse DB. Do not skip this step under any circumstances. From:
%SystemDrive%\Program Files\System Center 2012\Operations Manager\Server\SQL Script for Update Rollups
Open a SQL management studio query window, connect it to your OperationsManagerDW database, and then open the script file UR_Datawarehouse.sql. Make sure it is pointing to your OperationsManagerDW database, then execute the script.
If you see a warning about line endings, choose Yes to continue.
Click the “Execute” button in SQL mgmt. studio. The execution could take a considerable amount of time and you might see a spike in processor utilization on your SQL database server during this operation.
You will see the following (or similar) output:
3. Manually import the management packs?
We have 6 updated MPs to import (MAYBE!).
The TFS MP bundles are only used for specific scenarios, such as DevOps scenarios where you have integrated APM with TFS. If you are not currently using these MPs, there is no need to import or update them; I'd skip this MP import unless these MPs are already present in your environment.
The Advisor MPs are only needed if you are using System Center Advisor services.
However, the Image and Visualization libraries deal with dashboard updates, and these need to be updated.
I import all of these without issue.
4. Update Agents
Agents should be placed into pending actions by this update (mine worked great) for any agent that was not manually installed (remotely manageable = yes):
If your agents are not placed into pending management, this is generally caused by not running the update from an elevated command prompt, or by having manually installed agents, which will not be placed into pending management.
You can approve these – which will result in a success message once complete:
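You can also approve the pending agent updates in bulk from PowerShell. A minimal sketch using the OperationsManager module on a management server (note this approves ALL pending actions; filter first if you have other pending items):

```powershell
# Approve all agents currently in Pending Management.
# Run on a management server where the OperationsManager module is available.
Import-Module OperationsManager

Get-SCOMPendingManagement | Approve-SCOMPendingManagement
```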
Soon you should start to see PatchList getting filled in from the Agents By Version view under Operations Manager monitoring folder in the console:
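The same PatchList check can be done from PowerShell. Property names below come from the AgentManagedComputer class; verify them in your environment:

```powershell
# Show agent version and PatchList to confirm the UR has been applied.
Import-Module OperationsManager

Get-SCOMAgent |
    Sort-Object DisplayName |
    Format-Table DisplayName, Version, PatchList -AutoSize
```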
5. Update Unix/Linux MPs and Agents
Next up – I download and extract the updated Linux MPs for SCOM 2012 R2 UR3:
http://www.microsoft.com/en-us/download/details.aspx?id=29696
7.5.1025.0 is current at this time for SCOM 2012 R2 UR3.
****Note – take GREAT care when downloading to select the correct download for R2. You must scroll down in the list and select the MSI for 2012 R2:
Download the MSI and run it. It will extract the MP’s to C:\Program Files (x86)\System Center Management Packs\System Center 2012 R2 Management Packs for Unix and Linux\
Update any MPs you are already using. These are mine for RHEL, SUSE, and the universal Linux libraries:
You will likely observe VERY high CPU utilization of your management servers and database server during and immediately following these MP imports. Give it plenty of time to complete the process of the import and MPB deployments.
Next up – you would upgrade your agents on the Unix/Linux monitored agents. You can now do this straight from the console:
You can input credentials or use existing RunAs accounts if those have enough rights to perform this action.
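The upgrade can also be driven from PowerShell with the SCX cmdlets that ship with SCOM 2012 R2. This is only a sketch; the exact parameter sets vary, so check Get-Help Get-SCXAgent and Get-Help Update-SCXAgent before running it:

```powershell
# Sketch: upgrade discovered Linux/UNIX agents from PowerShell.
# Verify parameter names with Get-Help in your environment first.
Import-Module OperationsManager

$cred = Get-Credential   # a privileged account allowed to perform the upgrade

Get-SCXAgent | Update-SCXAgent -SshCredential $cred
```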
6. Update the remaining deployed consoles
This is an important step. I have consoles deployed around my infrastructure – on my Orchestrator server, SCVMM server, on my personal workstation, on all the other SCOM admins on my team, on a Terminal Server we use as a tools machine, etc. These should all get the UR3 update.
Review:
Now at this point, we would check the OpsMgr event logs on our management servers, check for any new or strange alerts coming in, and ensure that there are no issues after the update.
Known issues:
See the existing list of known issues documented in the KB article.
1. Many people are reporting that the SQL script is failing to complete when executed. You should attempt to run this multiple times until it completes without error. You might need to stop the Exchange correlation engine, stop the services on the management servers, or bounce the SQL server services in order to get a successful completion in a busy management group. The errors reported appear as below:
------------------------------------------------------
(1 row(s) affected)
(1 row(s) affected)
Msg 1205, Level 13, State 56, Line 1
Transaction (Process ID 152) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Msg 3727, Level 16, State 0, Line 1
Could not drop constraint. See previous errors.
--------------------------------------------------------
I didn’t see any announcements on this – but several customers have been asking.
From the SQL Requirements for System Center 2012 R2, which looks like it was updated on July 9th:
http://technet.microsoft.com/library/dn281933.aspx
| System Center 2012 R2 component | SQL Server 2008 R2 SP1 Standard, Datacenter | SQL Server 2008 R2 SP2 Standard, Datacenter | SQL Server 2012 Enterprise, Standard (64-bit) | SQL Server 2012 SP1 Enterprise, Standard (64-bit) | SQL Server 2012 SP2 |
| --- | --- | --- | --- | --- | --- |
| App Controller Server | ● | ● | ● | | |
| Data Protection Manager (DPM) Database Server | ● | ● | ● | ● | |
| Operations Manager Data Warehouse | ● | ● | ● | ● | ● |
| Operations Manager Operational Database | ● | ● | ● | ● | ● |
| Operations Manager Reporting Server | ● | ● | ● | ● | ● |
| Orchestrator Management Server | ● | ● | ● | ● | |
| Service Manager Database or Data Warehouse Database | ● | ● | ● | ● | |
| Service Provider Foundation | ● | | | | |
| Virtual Machine Manager Database Server | ● | ● | ● | | |
SCOM has many different ways to monitor for a file size. Here are some simple examples using script and WMI monitor types.
In this specific example – this will be a monitor to look for Windows Server Registry Bloat. The monitor will inspect the registry hives for the registry file size, and alarm when the size is over a set threshold.
In the console, under Authoring, create a new Unit Monitor. Choose a Timed Script Two State Monitor and choose an appropriate management pack.

Provide a display name for the monitor, and choose "Windows Server Operating System", as that is the best generic targeting class. I will place the monitor under "Availability", as that is most applicable to what I am trying to reflect: if the registry file grows too large, the availability of the server might be impacted.

Set a schedule that makes sense for your monitor. Remember, script-based monitors consume the most resources (especially depending on the complexity of the script), so don't run it too frequently.

Next, give your script a name (this is the name it will be compiled with in the management pack XML), and paste in the body of your script. My script is below. It accepts two parameters: the full path to the file we wish to monitor, and the size threshold in bytes.
Option Explicit
Dim oAPI, oBag, objFSO, objFile, varSize, oArgs, filepath, threshold

'Read the two script parameters: full path to the file, and the size threshold in bytes
Set oArgs = Wscript.Arguments
filepath = oArgs(0)
threshold = Int(oArgs(1))

Set oAPI = CreateObject("MOM.ScriptAPI")
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.GetFile(filepath)
varSize = objFile.Size

If varSize > threshold Then
    'Over threshold: return a "Bad" property bag and log an informational event
    Set oBag = oAPI.CreatePropertyBag()
    Call oBag.AddValue("Status","Bad")
    Call oBag.AddValue("Size", varSize)
    Call oBag.AddValue("Threshold", threshold)
    Call oAPI.Return(oBag)
    Call oAPI.LogScriptEvent("regfilesize.vbs", 160, 0, "The registry file size of HKLM\SOFTWARE is greater than the threshold of " & threshold & " bytes. The current size is: " & varSize & " bytes")
Else
    'Under threshold: return an "Ok" property bag
    Set oBag = oAPI.CreatePropertyBag()
    Call oBag.AddValue("Status","Ok")
    Call oBag.AddValue("Size", varSize)
    Call oBag.AddValue("Threshold", threshold)
    Call oAPI.Return(oBag)
    Call oAPI.LogScriptEvent("regfilesize.vbs", 160, 0, "The registry file size of HKLM\SOFTWARE is less than the threshold of " & threshold & " bytes. The current size is: " & varSize & " bytes")
End If
Then select the “parameters” button, and provide the params:

Next – we must provide the "Unhealthy" expression. The script returns a property bag with "Status" set to either "Bad" or "Ok". The parameter name here is in the format: Property[@Name='Status']

Repeat for Healthy expression:

Configure the health status you are looking to drive:

And alerting. Note: to make the value of the alert higher, you can include data from the propertybags returned in the script, into the alert context. See the examples below for Size and Threshold, along with the computer name:

Here is the finished result of the alert:

And Health Explorer output is also very useful:

If you need to tune the monitor for specific systems – the script arguments are automatically exposed in Overrides:

Additional reading and examples on using script based monitors:
http://technet.microsoft.com/en-us/library/ff629453.aspx
http://contoso.se/blog/?p=1367
You can make this even more elegant by creating a composite data source for the script, then a MonitorType that calls the data source, and then monitors that pass in the necessary data. You can also create a script-based performance collection rule that uses the same data source.
Ok, that’s pretty cool. But – what about another way?
SCOM also has a built in WMI based monitor, which will accept WMI queries to which you can map as performance type data with thresholds. I previously wrote examples of this:
Let's create another new Unit Monitor: WMI Performance Counters, Static Threshold, Simple Threshold:

Give it a name, choose Windows Server Operating System as the preferred generic target, and choose Availability.

We will connect to root\cimv2. The query we will use is:
select filesize from cim_datafile where name='c:\\windows\\system32\\config\\software'
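Before wiring this into SCOM, it's worth testing the query outside of it. A quick check with PowerShell (note CIM_DataFile requires double backslashes in the Name value):

```powershell
# Test the WMI query against root\cimv2 before using it in the monitor.
Get-CimInstance -Namespace root\cimv2 `
    -Query "select FileSize from CIM_DataFile where Name='c:\\windows\\system32\\config\\software'" |
    Select-Object Name, FileSize
```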

The Performance Mapper screen might be the most confusing. We simply need to define how we'd like the data to be inserted into SCOM.

I used "FileSize" for the counter, since that is what I am querying from WMI. Then I need to make sure that Value references the WMI property name from my query, in the format: $Data/Property[@Name='FileSize']$
Next I set my threshold value:

Configure health according to what you desire:

Configure alerting:

The subsequent alert:

And Health Explorer:

Now, we can also create a rule to collect this value, so we can report on which servers have the largest registries:
Create a new rule, collection, performance based, WMI:

Provide a name and target:

Provide the same query, and set a frequency that you need for reporting on changes.

Fill out the performance mapper just as we did above:

Now – create a performance view to examine the data:



And even a cool dashboard to show off all of it:

For additional reading on using WMI counters in SCOM:
This article is not just a warning about the Dell (Detailed) MP, but about the danger of importing ANY management pack into your environment without fully understanding its intended scope, scalability, and any known/common issues.
I recently worked with a customer who had an interesting issue. They had a very large agent based monitoring environment (greater than 10,000 agents). While performing a supportability review, we noticed that Config generation was failing. This was evidenced by the Config monitors showing red on the console, alerts generated, events logged in the Management Server SCOM event logs, and most notably by the fact that agents were not getting updated config in a timely fashion.
Events were similar to:
Log Name: Operations Manager
Source: OpsMgr Management Configuration
Event ID: 29181
Computer: managementserver.domain.com
Description:
OpsMgr Management Configuration Service failed to execute 'SnapshotSynchronization' engine work item due to the following exceptionMicrosoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessException: Data access operation failed
at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessOperation.ExecuteSynchronously(Int32 timeoutSeconds, WaitHandle stopWaitHandle)
at Microsoft.EnterpriseManagement.ManagementConfiguration.SqlConfigurationStore.ConfigurationStore.ExecuteOperationSynchronously(IDataAccessConnectedOperation operation, String operationName)
at Microsoft.EnterpriseManagement.ManagementConfiguration.SqlConfigurationStore.ConfigurationStore.EndSnapshot(String deltaWatermark)
at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.EndSnapshot(String deltaWatermark)
at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.SnapshotSynchronizationWorkItem.ExecuteSharedWorkItem()
at Microsoft.EnterpriseManagement.ManagementConfiguration.Interop.SharedWorkItem.ExecuteWorkItem()
at Microsoft.EnterpriseManagement.ManagementConfiguration.Interop.ConfigServiceEngineWorkItem.Execute()
-----------------------------------
System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.SqlCommand.InternalEndExecuteReader(IAsyncResult asyncResult, String endMethod)
at System.Data.SqlClient.SqlCommand.EndExecuteReaderInternal(IAsyncResult asyncResult)
at System.Data.SqlClient.SqlCommand.EndExecuteReader(IAsyncResult asyncResult)
at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.ReaderSqlCommandOperation.SqlCommandCompleted(IAsyncResult asyncResult)
ClientConnectionId:724196c1-d9ec-4f29-8807-b16cab05fcc6
Our initial issue was due to the fact that the management servers were running Windows 2012 RTM with .NET 4.5. There is a known issue here, and we needed to install .NET 4.5.1 to resolve these timeouts. This got us past the initial Snapshot Config failures.
Next – we saw that Delta Config started failing:
Log Name: Operations Manager
Source: OpsMgr Management Configuration
Event ID: 29181
Computer: managementserver.domain.com
Description:
OpsMgr Management Configuration Service failed to execute 'DeltaSynchronization' engine work item due to the following exceptionMicrosoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessException: Data access operation failed
at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessOperation.ExecuteSynchronously(Int32 timeoutSeconds, WaitHandle stopWaitHandle)
at Microsoft.EnterpriseManagement.ManagementConfiguration.CmdbOperations.CmdbDataProvider.GetConfigurationDelta(String watermark)
at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.TracingConfigurationDataProvider.GetConfigurationDelta(String watermark)
at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.DeltaSynchronizationWorkItem.TransferData(String watermark)
at Microsoft.EnterpriseManagement.ManagementConfiguration.Engine.DeltaSynchronizationWorkItem.ExecuteSharedWorkItem()
at Microsoft.EnterpriseManagement.ManagementConfiguration.Interop.SharedWorkItem.ExecuteWorkItem()
at Microsoft.EnterpriseManagement.ManagementConfiguration.Interop.ConfigServiceEngineWorkItem.Execute()
-----------------------------------
System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.SqlDataReader.TryReadInternal(Boolean setTimeout, Boolean& more)
at System.Data.SqlClient.SqlDataReader.Read()
at Microsoft.EnterpriseManagement.ManagementConfiguration.CmdbOperations.EntityChangeDeltaReadOperation.ReadManagedEntitiesProperties(SqlDataReader reader)
at Microsoft.EnterpriseManagement.ManagementConfiguration.CmdbOperations.EntityChangeDeltaReadOperation.ReadData(SqlDataReader reader)
at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.ReaderSqlCommandOperation.SqlCommandCompleted(IAsyncResult asyncResult)
ClientConnectionId:9d9ec759-e9bf-4c1e-a958-581377c630b3
We run a snapshot config every 24 hours by default. We run a delta config every 30 seconds by default. These are controlled via the ConfigService.config file located in the \Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\ directory. Delta config timing out was odd. There can be many reasons for this, so the next step was to take a SQL trace and see what expensive queries were running.
If you want to see these in more clarity – the Config service logs these jobs to the CS.WorkItem table:
SELECT * FROM cs.workitem
ORDER BY WorkItemRowId DESC
You can filter these by Delta Sync or the daily Snapshot sync as well:
SELECT * FROM cs.workitem
WHERE WorkItemName LIKE '%delta%'
ORDER BY WorkItemRowId DESC

SELECT * FROM cs.workitem
WHERE WorkItemName LIKE '%snap%'
ORDER BY WorkItemRowId DESC
WorkItemStateId is the value indicating success or failure for the job. It is normal to see some failures; for instance, when multiple management servers try to execute the same job, some of those will fail by design.
1 Running
10 Failed
12 Abandoned
15 Timed out
20 Succeeded
What we found was that one of the MPs – the Dell Hardware MP – was consuming a large amount of SQL Server CPU time just to query some standard Managed Type views in the database, with many of these queries lasting over 10 minutes.
When we researched further, we found that the "Dell Windows Server (Detailed Edition)" management pack had been imported, and in the documentation there was no mention of scalability limitations. However, we found that in a much older (4.x) version of the documentation, Dell specifically states that the Detailed MP is recommended only for small environments where the monitored server count is less than 300 agents! We had already discovered and were monitoring over 5,000 Dell servers.
This massive discovery data influx was also causing config churn, and the binding showed up as 2115 events for discovery data:
Log Name: Operations Manager
Source: HealthService
Event ID: 2115
Computer: managementserver.domain.com
Description:
A Bind Data Source in Management Group Production has posted items to the workflow, but has not received a response in 1510 seconds. This indicates a performance or functional problem with the workflow.
Workflow Id : Microsoft.SystemCenter.CollectDiscoveryData
Instance : managementserver.domain.com
Instance Id : {B3FA7F2F-3D4A-236D-D3FD-119B3E01C3E3}
So, just delete the MP, right?
Well, let's talk about what must happen when we delete an MP. When you right-click an MP in the console to delete it, we must first delete any discovered instances of any classes defined in that MP (such as an instance of "Dell Server BIOS"). In order to delete an instance of a class, we must first also delete ALL monitoring data associated with that instance. And I don't mean simply marking it as "deleted" in the database – it must actually be deleted transactionally from the tables: all alerts, all monitor-based state changes, all events, all performance data, etc. This can be MASSIVE overhead.
What we actually experienced was the console locking up. We could track the SQL statements trying to delete the management pack and all the instance data; however, this would eventually time out and never return anything to the console. The operation would just go away, while our MP still existed.
So what can we do?
Well, we do have a possible solution: the Remove-SCOMDisabledClassInstance PowerShell cmdlet. This cmdlet allows us to delete the discovered instance data methodically and slowly. What it does is delete any discovered instances in the management group whose discovery is explicitly disabled via override.
So – we find all the discoveries in the Dell Detailed MP, and we create a new override MP to store a disable override for each discovery. Then we run Remove-SCOMDisabledClassInstance. This will run and run, seemingly forever, until it returns with no errors. In many cases, even this cmdlet will time out or crash with an exception, which can be normal when deleting a massive amount of data.
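The steps above can be sketched in PowerShell. Cmdlet parameter names should be verified with Get-Help, and the MP display names here are examples specific to this scenario:

```powershell
# Disable every discovery in the Dell Detailed MP via overrides stored in
# a dedicated override MP, then purge the now-disabled instances.
Import-Module OperationsManager

# Display names are examples -- match them to your imported MPs.
$dellMp     = Get-SCOMManagementPack -DisplayName 'Dell Windows Server (Detailed Edition)'
$overrideMp = Get-SCOMManagementPack -DisplayName 'Dell.Detailed.Overrides'

# Create a disable override for each discovery in the Dell MP.
foreach ($discovery in (Get-SCOMDiscovery -ManagementPack $dellMp)) {
    Disable-SCOMDiscovery -Discovery $discovery -ManagementPack $overrideMp -Enforce
}

# Delete instances whose discoveries are disabled. Re-run this repeatedly
# until it completes without a timeout or exception.
Remove-SCOMDisabledClassInstance
```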
One trick to help with this process is to set your state, performance, and event retention in the OpsDB to ONE day, then run grooming. This greatly reduces the amount of data we must delete transactionally.
Then – just keep running Remove-SCOMDisabledClassInstance. In this specific case, because the amount of data was so large, it took over a day and probably over 100 executions before the instances were all removed. You can track the instances being removed by creating a query that counts the records in the Managed Type tables you are deleting from. Here is part of the one I crafted for this MP:
select sum(TCount) As TotalCount
from
(
select count (*) as Tcount
from MT_Dell$WindowsServer$Server
union all
select count (*) as Tcount
from MT_Dell$WindowsServer$BIOS
union all
select count (*) as Tcount
from MT_Dell$WindowsServer$Detailed$MemoryUnit
union all
select count (*) as Tcount
from MT_Dell$WindowsServer$Detailed$ProcUnit
union all
select count (*) as Tcount
from MT_Dell$WindowsServer$Detailed$PSUnit
union all
select count (*) as Tcount
from MT_Dell$WindowsServer$EnclosurePhysicalDisk
union all
select count (*) as Tcount
from MT_Dell$WindowsServer$ControllerConnector
) as T
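To watch the instance count drain while the cmdlet runs, the count query can be re-run on a loop. A sketch using Invoke-Sqlcmd (requires the SQLPS/SqlServer module; server and database names are placeholders):

```powershell
# Poll one of the Managed Type tables every 5 minutes and print the count.
# Server\instance and database names below are placeholders.
while ($true) {
    $result = Invoke-Sqlcmd -ServerInstance 'SQLSERVER\INSTANCE' -Database 'OperationsManager' `
        -Query 'select count(*) as Tcount from MT_Dell$WindowsServer$Server'
    "{0:u}  remaining instances: {1}" -f (Get-Date), $result.Tcount
    Start-Sleep -Seconds 300
}
```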
As you run the Remove-SCOMDisabledClassInstance command, you will see these instance counts slowly eroding. You just have to keep running it until it completes without a timeout or an exception.
Once the instance count gets to zero…. you can delete the MP. We found this time the MP deleted in seconds!
Now that this MP was gone, the expensive queries stopped, and the binding on discovery data went back to a more reasonable occurrence count and time value.
The lesson to learn here is: be careful when importing MPs. A badly written MP, or an MP designed for small environments, might wreak havoc in larger ones, and recovery can be long and quite painful. An MP that tests out fine in your dev SCOM environment might have issues that won't be seen until it moves into production. You should always monitor a production SCOM deployment for changes after a new MP is brought in, to ensure that you don't see a negative impact. Check the management server event logs, management server CPU performance, database size, and disk/CPU performance for big changes from your established baselines.
If you are designing a large agent deployment that nears our maximum scalability (currently 15,000 agents) great consideration must go into the management packs in scope. If you require management packs that discover a large instance space per agent, and/or have a large number of workflows, you might find that you cannot achieve the maximum scale.