This blog post is based on questions from people who attended our MMS 2010 session, BB23 – Operations Manager 2007 SQL Server Configuration for Operations Manager 2007 Administrators. As I was typing the name of the session, I realized it is way too wordy. In retrospect, I could have simply named it “Optimizing SQL Server for Operations Manager 2007”. Chris Cubley and I delivered this at our internal TechReady conference in June with some spit and polish applied, and I did not think of it then. I digress… okay, moving on.

In our session we covered optimizations that are specific to Operations Manager 2007 R2. They apply to a management group supporting an enterprise scenario (1,000 – 6,000+ agents). There is no performance benefit to be gained if you apply these settings to a management group that is managing a medium or small scenario.

The following are specific settings with recommended values from the product group based on their performance and scalability tests that can reduce resource utilization on the SQL Servers hosting the Operations Manager databases, and the management servers/Root Management Server:

1. To reduce the resource utilization impact of the OpsMgr queues on the Root Management Server and management servers, make these changes on the RMS and every management server in the management group:

| Registry Path | Registry Value (DWORD) |
| --- | --- |
| HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\Persistence Cache Maximum | 102400 |
| HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\Persistence Version Store Maximum | 10240 |
| HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\State Queue Items (See note below) | 20480 |
| HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\Persistence Checkpoint Depth Maximum | 104857600 |

Note: This key does not exist by default and must be created manually.
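If you have more than a handful of management servers to touch, it can help to generate a .reg file once and review it before importing. Here is a minimal Python sketch that renders the step 1 values as .reg text. It assumes the last segment of each path in the table is the value name and everything before it is the registry key; the names `SETTINGS` and `render_reg` are made up for this illustration and are not part of any Operations Manager tooling.

```python
# Sketch: render the step 1 registry values as .reg file text.
# Assumption: the last path segment is the value name, the rest is the key.

SETTINGS = {
    r"HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\Persistence Cache Maximum": 102400,
    r"HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\Persistence Version Store Maximum": 10240,
    r"HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\State Queue Items": 20480,
    r"HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters\Persistence Checkpoint Depth Maximum": 104857600,
}

# .reg files spell out the hive name in full.
HIVES = {"HKLM": "HKEY_LOCAL_MACHINE", "HKCU": "HKEY_CURRENT_USER"}


def render_reg(settings):
    """Return .reg file text that sets each path to its DWORD value."""
    by_key = {}
    for path, value in settings.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{path}: {value} does not fit in a DWORD")
        key, name = path.rsplit("\\", 1)      # split off the value name
        hive, rest = key.split("\\", 1)       # split off the hive abbreviation
        by_key.setdefault(HIVES[hive] + "\\" + rest, []).append((name, value))

    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, values in by_key.items():
        lines.append(f"[{key}]")
        for name, value in values:
            # DWORD values are written as eight lowercase hex digits.
            lines.append(f'"{name}"=dword:{value:08x}')
        lines.append("")
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(render_reg(SETTINGS))
```

Save the output to a file, read it over, and import it on a test management server first. Values that do not exist yet, such as State Queue Items, are created by the import.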

2. For reduced resource utilization on the RMS and SQL Server(s) hosting the OperationsManager and OperationsManagerDW databases caused by group calculation, Configuration Service polling, and data warehouse state change insertion, perform these changes on the Root Management Server in the management group:

| Registry Path | Registry Value (DWORD) |
| --- | --- |
| HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Config Service\Polling Interval Seconds | 120 |
| HKLM\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\GroupCalcPollingIntervalMilliseconds | 900000 |

Note: These Registry keys do not exist by default and must be created manually.

Before changing the group calculation interval, I should point out a few things to help you make a well-informed decision. By default, group calculation is performed on the RMS every 30 seconds. In a management group supporting the enterprise scenario, you will typically see many custom groups defined for targeting overrides, scoping user roles, and controlling the behavior of notification subscriptions (at a minimum). Group calculation discovery rules can impact the performance of the OperationsManager database because they run queries against the instance space in the form of multiple read operations. If you have a lot of groups and their membership criteria are complex, group calculation will take a big toll on database performance. Other operations in the management group could be affected as well, such as slower discovery insertion, degraded console performance, and slower replication of configuration changes to agents. Precisely how much degradation you’ll see in these other areas depends on how overloaded group calculation is.

Changing the calculation interval to a greater value could affect any overrides that target a group, because an object that meets a group’s criteria is not related to that group, and does not receive the override, until the next group calculation runs. If you can tolerate the latency in group membership discovery, you can make the calculation less frequent, for example every four or eight hours.
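Since GroupCalcPollingIntervalMilliseconds is expressed in milliseconds, it is easy to slip a zero when converting from hours. The following Python sketch does the conversion and checks that the result fits in a DWORD. The function name `group_calc_interval_ms` is hypothetical and exists only for this example.

```python
from datetime import timedelta

DWORD_MAX = 0xFFFFFFFF  # largest value a registry DWORD can hold


def group_calc_interval_ms(interval: timedelta) -> int:
    """Convert a desired group calculation interval to milliseconds."""
    ms = int(interval.total_seconds() * 1000)
    if not 0 < ms <= DWORD_MAX:
        raise ValueError(f"{ms} ms is outside the range of a DWORD")
    return ms


if __name__ == "__main__":
    # The recommended value from the table, then the two longer examples.
    print(group_calc_interval_ms(timedelta(minutes=15)))  # 900000
    print(group_calc_interval_ms(timedelta(hours=4)))     # 14400000
    print(group_calc_interval_ms(timedelta(hours=8)))     # 28800000
```

The 900000 in the table works out to 15 minutes, so the four and eight hour examples are 16 and 32 times the recommended value.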

3. For reduced resource utilization impact on the OperationsManager databases caused by DW synchronization rules running on the RMS, create overrides in the Operations Manager console for the following rules to increase the interval and batch size of those operations:

| Class | Rule/Monitor Name | Override Parameter | Override Value |
| --- | --- | --- | --- |
| Data Warehouse Synchronization Server | Data Warehouse monitor initial state synchronization rule | Batch Generation Frequency Seconds | 300 |
| | Data Warehouse monitor initial state synchronization rule | Batch Size | 1000 |
| | Data Warehouse object synchronization rule | Batch Generation Frequency Seconds | 300 |
| | Data Warehouse object synchronization rule | Batch Size | 1000 |
| | Data Warehouse report deployment rule | *Management Pack List Frequency Seconds | 600 |
| | Data Warehouse report deployment rule | *Management Pack List Frequency Seconds | 550 |
| | Data Warehouse report deployment rule | *Management Pack List Frequency Seconds | 500 |
| | Data Warehouse managed object type synchronization rule | Batch Generation Frequency Seconds | 300 |
| | Data Warehouse managed object type synchronization rule | Batch Size | 1000 |
| | Data Warehouse relationship synchronization rule | Batch Generation Frequency Seconds | 300 |
| | Data Warehouse relationship synchronization rule | Batch Size | 1000 |

*Note: This override parameter actually affects three data sources referenced in this rule.

4. The Operations Manager console refreshes every 15 seconds by default. With multiple consoles open in an enterprise scenario, this can negatively impact performance. Turning off polling or increasing the interval can help. Make this change on any Windows computer that has the console installed:

| Registry Path | Registry Value (DWORD) |
| --- | --- |
| HKCU\Software\Microsoft\Microsoft Operations Manager\3.0\console\CacheParameters\PollingInterval | 0 – 10 |

A value of 0 turns off automatic refresh, so the console must be refreshed manually with F5. Values 1 through 10 set the refresh interval in 15-second increments, so the maximum value of 10 gives a refresh interval of 2 minutes 30 seconds.
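To make the PollingInterval mapping concrete, here is a small Python sketch of the arithmetic described above. The function name `console_refresh_seconds` is an invention for this example; the console does not expose anything by that name.

```python
from typing import Optional


def console_refresh_seconds(polling_interval: int) -> Optional[int]:
    """Map a PollingInterval value (0 to 10) to a refresh interval in seconds.

    Returns None for 0, which means automatic refresh is off.
    """
    if not 0 <= polling_interval <= 10:
        raise ValueError("PollingInterval must be between 0 and 10")
    if polling_interval == 0:
        return None  # manual refresh with F5 only
    return polling_interval * 15  # each step adds 15 seconds


if __name__ == "__main__":
    for value in range(0, 11):
        print(value, console_refresh_seconds(value))
```

For example, a value of 4 gives a one minute refresh, and 10 gives the 150 second maximum.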

Before making any changes, always test first and evaluate the results before implementing them in production.  If you make them in production due to constraints in being able to appropriately test/validate in your test lab, first establish a performance baseline before making any of the proposed changes stated here.  After each change, perform another performance measurement and compare it to the initial baseline statistics to determine if the results are above or below the baseline.
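One simple way to do that comparison is to record the same counters before and after each change and look at the percent difference. The Python sketch below assumes you have already collected averaged counter values by whatever means you prefer. The metric names in the usage example are placeholders, not a recommended counter list, and the numbers are invented sample values rather than measurements.

```python
def compare_to_baseline(baseline, after):
    """Return the percent change per metric relative to the baseline.

    Negative numbers mean the metric dropped after the change.
    Metrics missing from `after`, or with a zero baseline, are skipped.
    """
    changes = {}
    for name, base in baseline.items():
        if name not in after or base == 0:
            continue
        changes[name] = (after[name] - base) * 100.0 / base
    return changes


if __name__ == "__main__":
    # Placeholder metric names and invented sample values.
    baseline = {"sql_cpu_percent": 50.0, "rms_cpu_percent": 40.0}
    after = {"sql_cpu_percent": 40.0, "rms_cpu_percent": 42.0}
    for metric, change in compare_to_baseline(baseline, after).items():
        print(f"{metric}: {change:+.1f}%")
```

Measuring after each individual change, rather than after applying everything at once, makes it possible to tell which setting actually moved the numbers.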