Neil Harrison's System Center Blog

This blog is dedicated to the System Center 2012 suite of products and how to use those products more effectively

  • Operations Manager 2012 Integration Pack Overview for Orchestrator 2012

For those who may not be aware, System Center 2012 Orchestrator is a workflow automation tool that allows integration between heterogeneous environments.  As an example, many organizations have multiple monitoring tools and need to transfer data (such as alerts) between them.  Orchestrator, with the use of Integration Packs, provides the ability to facilitate this integration.

    In this post I’m going to provide a general overview of the System Center 2012 Operations Manager Integration Pack for System Center 2012 Orchestrator.

The first thing we need to do is download the Integration Pack from the Microsoft Download Center.  All of the Integration Packs for System Center 2012 and earlier products come as a single download, which is currently found here.

Once we have the Integration Packs downloaded and extracted to the file system, we can use Deployment Manager to register and deploy the IP.  I won’t review this procedure in this post, as there is a fairly straightforward procedure for it on TechNet which can be found here.

Before we can dive into the activities that this IP provides for us, we’ll need to perform a few steps.

1.  We need to register at least one Operations Manager 2012 connection.  We can do this in the Runbook Designer by going to the ‘Options’ menu and selecting ‘SC 2012 Operations Manager’.  Fill in the connection information as appropriate for your environment, similar to what is shown below.

     

[screenshot]

2. We need to make sure we install the OpsMgr console on every server that hosts a Runbook Server or the Runbook Designer.  If you don’t do this, you will get an error similar to the following when you try to run certain activities such as ‘Create Alert’:

    Could not load file or assembly 'Microsoft.EnterpriseManagement.OperationsManager, Version=7.0.5000.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified

Now let’s have a look at what this Integration Pack provides for us.  I can see that there is now an ‘SC 2012 Operations Manager’ category under my Activities pane, and within it are eight pre-defined activities.

[screenshot]

Here is a brief overview of what each one does:

[screenshot]

    Create Alert

This activity does exactly as its name describes.  Its properties allow us to set several alert properties such as Name, Description, Priority, Severity, and Owner, as well as any of the custom fields.

Here is a screenshot of the alert that was generated by using this activity:

[screenshot]

Notice that the alert-generating rule is

    Microsoft.SystemCenter.Orchestrator.Integration.Library.AlertOnEventForComputer.

This is a rule that exists in the ‘Microsoft System Center Orchestrator Integration Library’ MP, which you likely didn’t import yourself.  The first time an activity that needs this MP runs, it will be imported into your OpsMgr environment for you.

    Also note my artistic blacking out of sensitive information such as my server hostname.  I have yet to master the art of the straight line within the Snipping Tool.

[screenshot]

    Get Alert

This activity also does exactly as its name describes.  It retrieves alert data that matches the criteria you select.  The criteria available for retrieving alerts are actually quite extensive, and you can combine multiple criteria, as shown in the screenshot below.

[screenshot]

You would probably not use this activity to proactively look for new alerts in a time-sensitive manner; the ‘Monitor Alert’ activity would be a better choice for that.  A better use of this activity would be in a regularly scheduled Runbook that retrieves alert data and takes some sort of maintenance action upon it.

For instance, you could use this activity in a Runbook that retrieves all warning alerts that are older than ‘x’ number of days and increases their severity to critical using the ‘Update Alert’ activity.
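The logic of that Runbook is easy to sketch outside Orchestrator too.  Here is a minimal Python sketch of the same idea, using made-up alert records rather than the real ‘Get Alert’/‘Update Alert’ activities; the field names (severity, time_raised) are my own stand-ins:

```python
from datetime import datetime, timedelta

def escalate_stale_warnings(alerts, max_age_days):
    """Bump warning alerts older than max_age_days to critical,
    mirroring a 'Get Alert' -> 'Update Alert' Runbook.  Returns the
    IDs of the alerts that were escalated."""
    cutoff = datetime.now() - timedelta(days=max_age_days)
    escalated = []
    for alert in alerts:
        if alert["severity"] == "Warning" and alert["time_raised"] < cutoff:
            alert["severity"] = "Critical"  # what 'Update Alert' would do
            escalated.append(alert["id"])
    return escalated
```

In the real Runbook the filtering would happen inside ‘Get Alert’; the sketch just shows the age/severity decision in one place.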

[screenshot]

    Get Monitor

    This activity is used to retrieve information from a specific monitor or set of monitors that match specified filter criteria.  This information can be used in many ways such as input to a corrective action or as input to a problem incident in another tool.

[screenshot]

[screenshot]

    Monitor Alert

Despite its name, this activity really has nothing to do with monitors in Operations Manager terminology.  It is actually used to detect new or updated alerts so that they can be used as input for some other action when they occur.  The detected alert can be generated by a rule or a monitor.

It is similar to ‘Get Alert’, but it is a better candidate for proactively taking action when an alert is generated.  For instance, you could use this to trigger some sort of corrective action that should occur when that alert occurs.  The criteria for which alerts are detected are quite extensive and are similar to those for the ‘Get Alert’ activity.

[screenshot]

    Monitor State

    This activity is used to retrieve information about an Operations Manager object when the Health State of that object enters a specific state.  For example, a Logical Disk object in Operations Manager that changes to a Critical state.

    This information can then be passed to additional automation such as a procedure for grooming log files.

    This activity accepts the name of the Object and health state to be monitored as parameters.
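To make the trigger behavior concrete, here is a tiny Python sketch of the pattern (not the actual activity; the parameter names and the grooming callback are hypothetical):

```python
def on_state_change(object_name, new_state, watched_object, watched_state, action):
    """Fire the downstream automation only when the watched object
    enters the watched health state, like 'Monitor State' does."""
    if object_name == watched_object and new_state == watched_state:
        return action(object_name)
    return None  # a state change we were not asked to watch
```

In a real Runbook, `action` would be the rest of the workflow, such as the log-grooming procedure mentioned above.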

[screenshot]

    Start Maintenance Mode

This activity does exactly as its name describes and shouldn’t need further explanation about when it would be used.  I will, however, point out one interesting thing about the interface and documentation that I don’t necessarily agree with.  As shown below, the interface asks you to select a ‘Monitor’ to place into Maintenance Mode.  This is not accurate, as we don’t actually place monitors into Maintenance Mode within Operations Manager.  Instead, we place monitored objects into Maintenance Mode.

When you click the ellipsis, it actually gives you a list of monitored objects and not monitors.

[screenshot]

[screenshot]

    Stop Maintenance Mode

Much like the ‘Start Maintenance Mode’ activity, this one is self-explanatory.  Again though, ‘Monitor’ here should read ‘Monitored Object’.

[screenshot]

    Update Alert

This activity also does exactly as its name describes.  You use the Alert ID to determine which Alert to update, and you can choose to update the Owner, ResolutionState, TicketId, or any of the Custom Fields in the Alert properties.  This is most useful when preceded by another activity such as ‘Get Alert’ to first retrieve an alert.  This way there is an Alert ID in the published data for you to use as criteria for this activity.
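That published-data hand-off between activities can be modeled as a simple pipeline.  The sketch below is a toy Python model, not the Orchestrator data bus; `get_alert` and `update_alert` are hypothetical stand-ins for the real activities:

```python
def run_pipeline(activities, published=None):
    """Toy model of a Runbook: each activity receives the published
    data from the previous activity and returns an updated copy."""
    data = dict(published or {})
    for activity in activities:
        data = activity(data)
    return data

def get_alert(data):
    # Publishes the Alert ID for downstream activities to consume.
    data["AlertId"] = "hypothetical-guid-1234"
    return data

def update_alert(data):
    # Uses the published AlertId as its criteria, then updates a property.
    assert "AlertId" in data
    data["ResolutionState"] = "Closed"
    return data
```

The point is simply that ‘Update Alert’ has nothing to key off unless something earlier in the chain published an Alert ID.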

     

    This has been a very high-level overview of the System Center 2012 Operations Manager Integration Pack for System Center 2012 Orchestrator.   For more detailed information about these activities you can refer to the official TechNet documentation found here.

    In an upcoming post, I will use some of these activities in some real-world scenarios to help illustrate their potential use further.

  • Rules and Monitors in OpsMgr 2007

    One of the topics that is difficult for many new OpsMgr admins or MP authors to wrap their heads around is the difference between rules and monitors in Operations Manager 2007.   In addition, there often seems to be confusion about which to use in certain situations.

First let’s start with the fundamental difference between a rule and a monitor.   A monitor affects the health state of a managed entity in OpsMgr, whereas a rule does not.  When we look at a state view in the console or we look at Health Explorer, we are ONLY seeing the results of monitors.   A rule does not have the ability to make something go from Green to Red in OpsMgr.

    That is the BIGGEST difference between a rule and a monitor although there are certainly other differences in the technical implementation of each.  Once we understand this it becomes a lot easier to understand when to use a monitor and a rule.

    When to use a monitor

    Use a monitor in almost any situation where you are checking for the health of an object.  

    When to use a rule

There are two common scenarios where you might want to use a rule:

    1.  When you are collecting data for the purposes of reporting (storing in the Data Warehouse) or for displaying data in views within the OpsMgr Console (storing in the Operational Database).   You may have noticed that there are “Write Actions” associated with rules.   These can be thought of as instructions for where to store the data once it’s collected.   You will often see rules with two Write Actions.  One for storing data in the OpsMgr operational database and one for storing the data in the Data Warehouse database.

    2.  If for some reason you want to generate an alert for a condition but DO NOT want the health of an object affected.

    Dealing with Alerts from Monitors and Rules

    When talking about the differences between monitors and rules it is also very important to understand the differences in how you should deal with the alerts that are generated from each.  

    Since there is no underlying health state for a rule you can simply go in and close any alerts that are generated by the rule.

DO NOT JUST CLOSE ALERTS FROM MONITORS!!   I cannot stress this enough, and I often see this happening in various OpsMgr environments.   Always check to see if the health of the underlying managed object needs to be reset back to a healthy state as well.   If you close an alert that is generated from a monitor, you DO NOT reset the health state of the underlying object automatically.  Think about the impact of this: since alerts are generated based upon state changes, you will not receive additional notifications alerting you of future problems until that health state is reset.

If you’ve ever opened up Health Explorer and seen a bunch of Critical and Warning states but couldn’t figure out why there were no corresponding alerts, then it is almost guaranteed that this has happened in your OpsMgr environment.

    If you are unsure as to whether the alert was generated by a rule or monitor then one easy way to tell is to look at the alert properties.  There will be a field called either Alert Rule or Alert Monitor which you can use to tell.
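If you were automating alert cleanup (say, from an Orchestrator Runbook), that same check is the branch point.  Here is a minimal Python sketch, with the alert modeled as a dict whose alert_rule/alert_monitor keys mirror that field in the alert properties; the return strings are just labels for the two actions:

```python
def triage_alert(alert):
    """Decide how to clear an alert based on what generated it."""
    if alert.get("alert_monitor"):
        # Monitor-generated: the underlying health state must be
        # reset too, or future state changes will never re-alert.
        return "reset health state"
    # Rule-generated: no health state behind it, safe to simply close.
    return "close alert"
```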

     

[screenshot]

    Although there are other smaller technical differences between the two, if you can remember that a monitor is the only one that can alter the health state of a managed object then you should be able to make the right choice when choosing between a rule and a monitor.

  • ACS Forwarders and High Availability - Part 3

In Part 1 of this series, I discussed the scenario where we have only deployed one ACS Collector along with one ACS Database.  No additional steps were taken to ensure that the ACS Forwarder was able to continue to forward Security events in the event that the ACS Collector becomes unavailable.  I would highly recommend that you read at least the beginning of Part 1 for some background information if you have found yourself directly on this post.

    In Part 2 of this series, I discussed the scenario where we have deployed multiple ACS Collector and ACS Database pairs and we have relied on one of these to take over in the case of an ACS Collector failure.

    So now the question becomes what if you need High Availability but you don’t want or need to have multiple ACS Databases.  That’s where this scenario comes in.

    Scenario #3: One Database / Two Collectors in an Active/Passive Mode

Since Operations Manager SP1 was released, we have had the ability to have two ACS Collectors point to one ACS Database, as long as only one of them is active at a time.  What that means is that we can have a warm standby ACS Collector ready to go whenever the primary fails.

    Setup

    1. Install the primary ACS Collector as you normally would but make sure to use SQL Authentication and not Windows authentication.

    If you use Windows Authentication you will be denied access when you attempt to bring up your standby ACS Collector.  You will notice that when you select Windows Authentication it doesn’t ask you what account you want to use.  That’s because it assumes you will use the computer account of the ACS Collector to connect.  Obviously this would break once the standby ACS Collector comes online.

    See below for a screenshot of the AdtServer user that was created when I chose Windows Authentication. OMMS02$ is the computer account for my ACS Collector.

[screenshot]

    2. Once the primary Collector server has been successfully installed you will need to stop the “Operations Manager Audit Collection Service” so we can install the secondary Collector.

    3. Install the secondary collector while specifying an existing database (the one created in Step 1) and again choosing SQL Authentication.

    4. Stop the “Operations Manager Audit Collection Service” on the secondary Collector and start it again on the Primary Collector.  You may even want to set it to Manual on the secondary Collector just in case it tries to start again on a reboot.

5. This particular step is optional but highly recommended.  In order to minimize the amount of duplicate events that occur once the ACS Forwarders fail over to the secondary Collector, we need a way to automate the process of transferring the ACSConfig.xml from the primary server to the secondary server.  Remember from Part 1 that ACSConfig.xml contains the recent sequence number, which tells the Collector / Forwarder where things left off when inserting data into the ACS Database.

A couple of things to note about transferring this file:

         a. Try to transfer this file every 5 minutes as this is how often the file will be updated on the Primary Collector.

         b. Choose a method that does not lock the file as you do not want the file locked when the AdtServer service tries to overwrite the file.
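One way to satisfy both points is to copy through a temporary file and atomically rename it into place, so the destination is never held open or half-written.  Here is a Python sketch of that approach (the every-5-minutes scheduling would come from Task Scheduler or similar; this only shows the copy):

```python
import os
import shutil
import tempfile

def transfer_config(src, dest):
    """Copy ACSConfig.xml without leaving the destination locked or
    half-written: stage to a temp file in the destination directory,
    then atomically rename over the target."""
    dest_dir = os.path.dirname(dest) or "."
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    os.close(fd)                 # we only needed the unique path
    shutil.copy2(src, tmp_path)  # read the source, then release it
    os.replace(tmp_path, dest)   # atomic rename on the same volume
```

Note that on Windows the final rename will still fail if something else has the destination open, which is exactly point (b) above: make sure nothing locks the file on the standby side.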

    6. Enable the ACS Forwarders by using the “Enable Audit Collection” task in the OpsMgr console.  This task requires an override configured that specifies the Collector for the Forwarder to communicate with.  Here you will enter a comma-separated list of Collector servers.  Notice that the secondary server appears first in the list and the primary server appears last.

[screenshot]

    Failing Over

    If your Primary ACS Collector were to become unavailable you would now just need to start the “Operations Manager Audit Collection Service” on the secondary Collector server.  The AdtServer service will start up and read the ACSConfig.xml file we have been transferring over and take over the role of collecting Security Events from the ACS Forwarders.

Note: There will likely be some small amount of duplicate data, due to events that were inserted into the database between the last time the ACSConfig.xml file was updated and the time the primary Collector crashed.

    Failing Back

    Once the primary Collector has been brought back online the process of failing back is quite easy.  First you’ll want to copy the ACSConfig.xml from the secondary Collector back to the primary Collector to minimize the duplication that will occur once we fail back.  Once that’s done you just need to stop the “Operations Manager Audit Collection Service” on the secondary Collector and the Forwarders will automatically attempt reconnection back to the Primary.

    Pros

    · Only one ACS Database is necessary

    · Duplication of data can be minimized by synchronizing the ACSConfig.xml file between the Primary Collector and the Secondary Collector

    Cons

    · Some duplication of data may still occur

    · Failover is not automatic and will require intervention (but could be scripted)

    Additional Information

    Check out Part 2 for information on how to check the configuration of an ACS Forwarder as well as what events to look for in the Event Log.

  • ACS Forwarders and High Availability - Part 2

In Part 1 of this series, I discussed the scenario where we have only deployed one ACS Collector along with one ACS Database.  No additional steps were taken to ensure that the ACS Forwarder was able to continue to forward Security events in the event that the ACS Collector becomes unavailable.  I would highly recommend that you read at least the beginning of Part 1 for some background information if you have found yourself directly on this post.

    Scenario #2: Multiple Collector / ACS Database pairs

In this scenario, we have deployed multiple ACS Collector and ACS Database pairs, most likely due to performance or scalability concerns.  Because there are multiple “active” Collectors at one time, our ACS Forwarder can fail over automatically to a secondary Collector with no intervention required.  The term “active” is important here; remember from Part 1 that an ACS Collector has a “one-to-one” relationship with an ACS Database.

    In order for this to work we need to specify a comma-separated list of Collector servers when we enable our ACS Forwarders.  This is done through the use of an override when we run the “Enable Audit Collection” task in the OpsMgr console.

[screenshot]

Notice that our Primary Collector appears last in the list!
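Conceptually, the Forwarder walks that list in order and settles on the first Collector it can reach, which is why the order matters.  A rough Python illustration of the idea follows (51909 is the default ACS port; this is my sketch, not the actual AdtAgent logic):

```python
import socket

def first_reachable_collector(collectors, port=51909, timeout=2.0):
    """Walk a comma-separated collector list in order and return the
    first host that accepts a TCP connection, or None if none do."""
    for host in (c.strip() for c in collectors.split(",")):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return host
        except OSError:
            continue  # unreachable or refused; try the next one
    return None
```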

    That’s all we need to do in this scenario to configure automatic failover but is it the perfect solution?  As with everything it seems, the answer is “It Depends.”  Read on for a list of reasons why this scenario may or may not be the best solution for your environment.

     

    PROS

    · Automatic failover! This is the only scenario where automatic failover occurs as part of the built-in functionality.

    · No vulnerability to malicious intruders who manipulate the Security event log.

     

    CONS

    · Data for Forwarders ends up being spread across ACS databases and may become hard to query.  The impact of this could be negligible if you are using some other tool for long-term ACS data archival.

· Data will be duplicated when the new Collector takes over, as its ACSConfig.xml will not have a record of the Forwarder’s sequence number and it will therefore retrieve the last 72 hours’ worth of events. Remember from Part 1 that the ACSConfig.xml helps to keep track of the last events recorded from the ACS Forwarder.

    Additional Information:

    If you want to see whether or not a Forwarder has more than one server defined then you can view this in the registry on the ACS Forwarder itself within the following key:

    HKLM\Software\Policies\Microsoft\AdtAgent\Parameters\AdtServers

    If you have specified multiple ACS Collectors then you will find that your ACS Forwarder will fail over automatically when the original Collector is unavailable.  You can see this by looking for events like the following in the OperationsManager Event log on the Forwarder:

    Log Name: Operations Manager
    Source: AdtAgent
    Date: 3/22/2011 2:13:54 PM
    Event ID: 4368
    Task Category: None
    Level: Information
    Keywords: Classic
    User: NETWORK SERVICE
    Computer: MYSERVER.MYDOMAIN
    Description:
    Forwarder successfully connected to the following collector:
    COLLECTOR:51909, status: 0x0 (success), source: registry
    addresses tried:
    xxx.xxx.xxx.xxx:51909

    In Part 3 of this series, I will discuss how to create a warm standby ACS Collector so that multiple Database installations are not necessary. This is a new feature for Operations Manager SP1.

  • ACS Forwarders and High Availability - Part 1

    If your organization has implemented Audit Collection Services (ACS) then you will already know that there is a “one to one” relationship between an ACS Collector and an ACS Database.  If your organization has requirements for High Availability or you are forced to maintain strict auditing of Security events then you may be asking the question about what happens to your ACS Forwarders when their respective ACS Collector becomes unavailable.

There are three possible scenarios that could occur when your ACS Collector becomes unavailable, so this will be a three-part series detailing what happens within each scenario.

Before I dive into the details of Scenario 1, a little bit of background information is needed:

There is a file called ACSConfig.xml, which is located at %systemroot%\system32\security\AdtServer. This file contains some useful information that is needed by the Collector to perform its functions, including an entry for each Forwarder that has communicated with the Collector along with a sequence number. This sequence number corresponds to the EventRecordID of the event on the Forwarder, and this is how ACS keeps track of which events have been collected.

    This file is updated every 5 minutes by the AdtServer service with information that it has stored in memory.
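The bookkeeping is simple to picture: only events whose EventRecordID is above the persisted sequence number still need collecting.  Here is a hypothetical Python sketch of that filter (the data shapes are mine, not the real wire format):

```python
def events_to_collect(forwarder_events, last_sequence):
    """Return the Security events the Collector has not yet recorded,
    judged by the sequence number persisted in ACSConfig.xml."""
    return [e for e in forwarder_events if e["EventRecordID"] > last_sequence]
```

Because the file is only flushed every 5 minutes, a Collector crash can leave the persisted sequence number up to 5 minutes stale, which is where the small amount of duplication after a failure comes from.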

    Scenario #1: One ACS Collector / ACS Database pair

    In this scenario, your organization has deployed one ACS Collector along with one ACS Database as that is all that is needed to satisfy the performance and scalability requirements for ACS.  There have also been no additional steps taken to maintain ACS Forwarder availability.

    If the ACS Collector were to become unavailable then the ACS Forwarder no longer has anywhere to forward events and the Security Event Log on the Forwarder is basically acting as the queue for the backlog of events.

    If you bring the same ACS Collector back online (the previous ACSConfig.xml file is present) AND you are still within your Event Retention Period (default 72 hours) then you will not miss any events.

If you bring a new ACS Collector online and reconfigure the ACS Forwarder, or you bring up the old ACS Collector but are no longer within the Event Retention Period, then the ACS Collector will collect only the last 72 hours’ worth of events and carry on from there.
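The arithmetic behind those two outcomes can be sketched like so, assuming the default 72-hour retention (this is just my illustration of the rule above, not anything ACS exposes):

```python
from datetime import timedelta

def backlog_outcome(outage, retention=timedelta(hours=72)):
    """Split a Collector outage into the portion of the event backlog
    that is still recoverable and the portion that is lost."""
    recovered = min(outage, retention)
    lost = max(timedelta(0), outage - retention)
    return recovered, lost
```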

    Eric Fitzgerald has written a blog that describes the Event Retention Period with ACS here.

    PROS

    · No failover configuration is needed as no failover actually occurs

    · As long as the original ACS Collector is brought online (or the ACSConfig.xml file is restored) then there is the potential for no loss of data within the Event Retention Period.

    · There will be no or minimal duplication of data as long as the original ACSConfig.xml is present on the ACS Collector

    CONS

    · If the original ACSConfig.xml file is not present then you will get duplication of data that is equivalent to the Event Retention Period.

· Data will be lost if the ACS Collector is unavailable for longer than the Event Retention Period

    · Your organization becomes vulnerable to a malicious intruder that could potentially manipulate the Security Event Log during the time in which the ACS Collector is unavailable.

    In Part 2 of this series, I will discuss the behavior that occurs when you have deployed multiple collectors and database pairings and then rely on the ACS Forwarder to fail over to one of the other available Collector/Database pairings.

    In Part 3 of this series, I will discuss how to create a warm standby ACS Collector so that multiple Database installations are not necessary. This is a new feature for Operations Manager SP1.