Kevin Holman's System Center Blog

Posts in this blog are provided "AS IS" with no warranties, and confer no rights. Use of included script samples is subject to the terms specified in the Terms of Use.

Deploying Unix/Linux Agents using OpsMgr 2012

Microsoft started including Unix and Linux monitoring in OpsMgr directly in OpsMgr 2007 R2, which shipped in 2009.  Some significant updates have been made to this for OpsMgr 2012.  Primarily these updates are around:

  • Highly available Monitoring via Resource Pools
  • Sudo elevation support for using a low priv account with elevation rights for specific workflows.
  • ssh key authentication
  • New wizards for discovery, agent upgrade, and agent uninstallation
  • Additional PowerShell cmdlets
  • Performance and scalability improvements
  • New monitoring templates for common monitoring tasks

 

This article will cover the discovery, agent deployment, and monitoring configuration of a Linux server in OpsMgr 2012.  I am going to run through this as a typical user would – and show some of the pitfalls if you don’t follow the exact order of configuration required.

 

So what would anyone do first?  They’d naturally run a discovery, just like they do for Windows agents.  However – this will likely end up in frustration.  There are several steps that you need to configure FIRST, before deploying Unix/Linux agents.

 

High Level Overview:

 

The high level process is as follows:

  • Import Management Packs
  • Create a resource pool for monitoring Unix/Linux servers
  • Configure the Xplat certificates (export/import) for each management server in the pool.
  • Create and Configure Run As accounts for Unix/Linux.
  • Discover and deploy the agents

 

 

Import Management Packs:

 

The core Unix/Linux libraries are already imported when you install OpsMgr 2012, but not the detailed MPs for each OS version.  These are on the installation media, in the \ManagementPacks directory.  Import the specific ones for the Unix or Linux operating systems that you plan to monitor.

 

 

Create a resource pool for monitoring Unix/Linux servers

The first configuration step after importing the management packs is to create a Unix/Linux monitoring resource pool.  In larger environments this pool contains management servers dedicated to monitoring Unix/Linux systems; in smaller environments it may include existing management servers that also manage Windows agents or Gateways.  Either way, it is a best practice to create a new resource pool for this purpose, because it eases administration and makes it simpler to scale out in the future.

Under Administration, find Resource Pools in the console:

image

 

OpsMgr ships 3 resource pools by default:

image

 

Let’s create a new one by selecting “Create Resource Pool” from the task pane on the right, and call it “Unix Linux Monitoring Resource Pool”

 

image

 

Click Add and then click Search to display all management servers.  Select the Management servers that you want to perform Unix and Linux Monitoring.  If you only have 1 MS, this will be easy.  For high availability – you need at least two management servers in the pool.

 

Add your management servers and create the pool.  In the actions pane – select “View Resource Pool Members” to verify membership.

 

image

 

 

Configure the Xplat certificates (export/import) for each management server in the pool

This process is documented here:  http://technet.microsoft.com/en-us/library/hh287152.aspx

Operations Manager uses certificates to authenticate access to the computers it is managing. When the Discovery Wizard deploys an agent, it retrieves the certificate from the agent, signs the certificate, deploys the certificate back to the agent, and then restarts the agent.

To configure high availability, each management server in the resource pool must have all the root certificates that are used to sign the certificates that are deployed to the agents on the UNIX and Linux computers. Otherwise, if a management server becomes unavailable, the other management servers would not be able to trust the certificates that were signed by the server that failed.

We provide a tool to handle the certificates, named scxcertconfig.exe.  Essentially, you must log on to EACH management server that will be part of a Unix/Linux monitoring resource pool and export its SCX (cross-platform) certificate to a file share.  Then import each of the other servers' certificates on every management server, so that they all trust one another.

If you only have a SINGLE management server, or a single management server in your pool, you can skip this step, then perform it later if you ever add Management Servers to the Unix/Linux Monitoring resource pool.

 

In this example – I have two management servers in my Unix/Linux resource pool, MS1 and MS2.  Open a command prompt on each MS, and export the cert:

On MS1:

C:\Program Files\System Center 2012\Operations Manager\Server>scxcertconfig.exe -export \\servername\sharename\MS1.cer

On MS2:

C:\Program Files\System Center 2012\Operations Manager\Server>scxcertconfig.exe -export \\servername\sharename\MS2.cer

Once all certs are exported, you must IMPORT the other management server’s certificate:

On MS1:

C:\Program Files\System Center 2012\Operations Manager\Server>scxcertconfig.exe -import \\servername\sharename\MS2.cer

On MS2:

C:\Program Files\System Center 2012\Operations Manager\Server>scxcertconfig.exe -import \\servername\sharename\MS1.cer

If you fail to perform the above steps – you will get errors when running the Linux agent deployment wizard later.
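
With more than two management servers in the pool, the number of import commands grows quickly, since every server must import every other server's certificate.  The following is a minimal Python sketch that only builds the list of scxcertconfig.exe commands in the order described above (all exports first, then the imports); it does not run anything.  The function name cert_exchange_steps, the share path, and the server names are placeholders for illustration.

```python
def cert_exchange_steps(servers, share=r"\\servername\sharename"):
    """Return ordered (server, command) pairs for exchanging SCX certificates.

    Every server first exports its own certificate to the share. Only after
    all exports exist does each server import the other servers' files.
    """
    steps = []
    # Phase 1: each management server exports its own certificate.
    for ms in servers:
        steps.append((ms, f"scxcertconfig.exe -export {share}\\{ms}.cer"))
    # Phase 2: each management server imports every other server's certificate.
    for ms in servers:
        for other in servers:
            if other != ms:
                steps.append((ms, f"scxcertconfig.exe -import {share}\\{other}.cer"))
    return steps


if __name__ == "__main__":
    for server, command in cert_exchange_steps(["MS1", "MS2"]):
        print(f"On {server}: {command}")
```

For a pool of N servers this yields N exports and N × (N − 1) imports, so a three-server pool needs nine commands in total.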

 

 

Create and Configure Run As accounts for Unix/Linux

 

Next up we need to create our run-as accounts for Linux monitoring.   This is documented here:  http://technet.microsoft.com/en-us/library/hh212926.aspx

 

We need to select “UNIX/Linux Accounts” under administration, then “Create Run As Account” from the task pane.  This kicks off a special wizard for creating these accounts.

image

 

image

 

Let's create the Monitoring account first.  Give the monitoring account a display name, and click Next.

 

image

 

On the next screen, type in the credentials that you want to use for monitoring the Linux system(s).

 

image

 

On the above screen – you have two choices.  You can provide a privileged account for handling monitoring, or you can use an existing account on the Linux system(s) that is not privileged.  Then – you can specify whether or not you want this account to be able to leverage sudo elevation.  Since I am providing a privileged account in this case – I will tell it to not use elevation.

On the next screen, always choose More Secure:

image

 

Now – since we chose More Secure – we must choose the distribution of the Run As account.  Find your “Linux Monitoring Account” under the UNIX/Linux Accounts screen, and open the properties.  On the Distribution Security screen, click Add, then select “Search by resource pool name” and click Search.  Find your Unix/Linux monitoring resource pool, highlight it, and click Add, then OK.  This will distribute this account credential to all Management servers in our pool:

 

image

 

We would repeat the above process, as many times as necessary for the number of different accounts we need.  If all our Linux systems use the same credentials, then we need at a minimum, ONE monitoring account that is privileged, and it can be associated to the three Run As Profiles (covered in next section).

However, what would be more typical, if all our systems had the same credentials and passwords, is to use THREE Run As accounts:

  • One for Unprivileged (do not use elevation) monitoring
  • One for Privileged monitoring using EITHER a priv account (do not use elevation), OR an unpriv account using sudo (use elevation)
  • One for Agent Maintenance using EITHER a priv account (do not use elevation), OR an unpriv account using sudo (use elevation)

For the purposes of this demo, I am just going to create a SINGLE priv Run As account (root) that I will use for all three scenarios.

 

Next up – we must configure the Run As profiles.  This is covered here:  http://technet.microsoft.com/en-us/library/hh212926.aspx

 

There are three profiles for Unix/Linux accounts:

 

image

 

The agent maintenance account is strictly for agent updates, uninstalls, and anything else that requires SSH.  This will always be associated with a privileged account that has access via SSH, and is created using the same Run As account wizard as above, but selecting “Agent Maintenance Account” as the account type.  We won't go into details on that here.

The other two Profiles are used for Monitoring workflows.  These are:

Unix/Linux Privileged account

Unix/Linux Action Account

The Privileged Account Profile will always be associated with a Run As account like we created above, that is Privileged (root or similar) OR an unprivileged account that has been configured with elevation via sudo.  This is what any workflows that typically require elevated rights will execute as.

The Action account is what all your basic monitoring workflows will run as.  This will generally be associated with a Run As account, like we created above, but would be used with a non-privileged user account on the Linux systems.

***A note on sudo elevated accounts:

  • sudo elevation must be passwordless.
  • requiretty must be disabled for the user.
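
As a rough way to sanity-check those two sudo requirements before running discovery, here is a small Python sketch that scans the text of a sudoers file.  It is deliberately simplified: it ignores aliases, groups, and included files, and the account name scomuser in the usage example is hypothetical.  Treat it as a first-pass check, not a full sudoers parser.

```python
import re


def sudo_issues(sudoers_text, user):
    """Return a list of likely sudo problems for `user`.

    Simplified check: looks only at plain Defaults lines and at rules that
    start with the user name. Aliases, groups, and includes are ignored.
    """
    lines = [ln.strip() for ln in sudoers_text.splitlines()
             if ln.strip() and not ln.strip().startswith("#")]
    issues = []

    # requiretty set globally must be switched off again for this user.
    global_tty = any(
        re.match(r"Defaults\s+.*(?<!!)requiretty\b", ln) for ln in lines)
    user_no_tty = any(
        re.match(rf"Defaults:{re.escape(user)}\s+.*!requiretty\b", ln)
        for ln in lines)
    if global_tty and not user_no_tty:
        issues.append("requiretty is enabled for this user")

    # The user's rule needs the NOPASSWD tag so sudo never prompts.
    user_rules = [ln for ln in lines if re.match(rf"{re.escape(user)}\s", ln)]
    if not any("NOPASSWD:" in ln for ln in user_rules):
        issues.append("no NOPASSWD rule found for this user")
    return issues


if __name__ == "__main__":
    sample = (
        "Defaults requiretty\n"
        "Defaults:scomuser !requiretty\n"
        "scomuser ALL=(root) NOPASSWD: ALL\n"
    )
    print(sudo_issues(sample, "scomuser"))
```

An empty list means neither of the two problems was spotted.  A sudo error such as "no tty present and no askpass program specified" during discovery usually points to one of these two settings.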

 

For my example – I am keeping it very simple.  I created a single Run As account, of the Monitoring type, which is the privileged root account and password credential.  I will associate this Run As account to BOTH the Privileged and Action account.  This will make all my workflows (both normal monitoring and elevated monitoring) run under this credential.  This is not recommended as the “lowest priv” design, but is used in this example just to keep things simple.  Once we validate it is working, we can go back and change this configuration and experiment using low priv and sudo enabled elevation accounts, and associate them independently.

For more information on configuring sudo elevation for OpsMgr monitoring accounts, including some sample configurations for your sudoers files for each OS version:  http://social.technet.microsoft.com/wiki/contents/articles/7375.configuring-sudo-elevation-for-unix-and-linux-monitoring-with-system-center-2012-operations-manager.aspx

 

I will start with the Unix/Linux Action Account profile.  Right click it – choose properties, and on the Run As Accounts screen, click Add, then select our “Linux Monitoring Account”.  Leave the default of “All Targeted Objects” and click OK, then save.

Repeat this same process for the Unix/Linux Privileged Account profile.

Repeat this same process for the Unix/Linux Agent Maintenance Account profile.
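
It can help to write the intended associations down before clicking through the console, because a profile with no account causes the workflows that use it to be unloaded.  The sketch below is only a planning aid in Python and does not talk to OpsMgr; the profile names follow this article, and the account name is the one created in this example.

```python
# The three Unix/Linux Run As profiles discussed in this article.
PROFILES = (
    "Unix/Linux Action Account",
    "Unix/Linux Privileged Account",
    "Unix/Linux Agent Maintenance Account",
)


def unassociated_profiles(associations):
    """Return the profiles that have no Run As account in the plan."""
    return [name for name in PROFILES if not associations.get(name)]


if __name__ == "__main__":
    # The simple design used in this walkthrough: one account everywhere.
    plan = {name: "Linux Monitoring Account" for name in PROFILES}
    print(unassociated_profiles(plan))
```

A lowest-privilege design would map each profile to a different account in the same dictionary.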

 

 

Discover and deploy the agents

 

Run the discovery wizard.

 

image

 

Click “Add”:

image

 

Here you will type in the FQDN of the Linux/Unix agent and its SSH port, and then choose All Computers as the discovery type.  (There is another discovery type for the case where you manually installed the Unix/Linux agent, which is really just a simple provider, and then use a signed certificate to authenticate.)

 

Now – hit “Set Credentials”.  If we do not want to provide a root account here, and wanted to use SSH key authentication, we support that on this screen now.  For this example – I will simply type in my root account in order to use SSH to discover and deploy the Linux agent.

 

image

 

Notice above that you can tell the wizard if the account is privileged or not.  Here is an explanation:

  • A privileged account is a user account that has root-level access, including access to security logs and read, write, and execute permissions for the directories in which the Operations Manager agent is installed.
  • An unprivileged account is a normal user account that does not have root-level access or special permissions. However, an unprivileged account allows monitoring of system processes and of performance data.

If you have to discover only UNIX and Linux computers that already have an agent installed, rather than installing an agent, you can use an unprivileged user account on the UNIX or Linux computer. If you have to install an agent, you must use a privileged account. If you do not have a privileged account, you can elevate an unprivileged account to a privileged account provided that the su or sudo elevation program has been configured on the UNIX or Linux computer for the user account.

 

So – if we had pre-installed the agent already – we could simply use an unprivileged account to authenticate and discover the system, bringing it into OpsMgr.

Or – we could provide an unprivileged account that was allowed elevation via a pre-existing sudo configuration on the Linux server.

 

image

 

Click save.  On the next screen – select a resource pool.  We will choose the resource pool that we already created.

 

image

 

Click Discover, and the results will be displayed:

 

image

 

Check the box next to your discovered system – and deploy the agent.

 

image

 

This will take some time to complete.  The agent is checked for the correct FQDN and SSL certificate, the management servers are inspected to ensure they all have trusted SCX certificates (that we exported/imported above), the connection is made over SSH, the package is copied down and installed, and the final certificate signing occurs.  If all of these checks pass, we get a success!

 

There are several things that can fail at this point.  See the troubleshooting section at the end of this article.

 

 

Monitoring Linux servers:

 

Assuming we got all the way to this point with a successful discovery and agent installation, we need to verify that monitoring is working.  After an agent is deployed, the Run As accounts will start being used to run discoveries, and start monitoring.  Once enough time has passed for these, check in the Administration pane, under Unix/Linux Computers, and verify that the systems are not listed as “Unknown” but discovered as a specific version of the OS:

 

image

 

Next, go to the Monitoring pane and select the “Unix/Linux Computers” view at the top.  Check that your systems are present and that there is a green healthy check mark next to them:

 

image

 

Next – expand the Unix/Linux Computers folder in the left tree (near the bottom) and make sure we have discovered the individual objects, like Linux Server State, Linux Disk State, and Network Adapter state:

 

image

 

Run Health explorer on one of the discovered disks.  Remove the filter at the top to see all the monitors for the disk:

 

image

 

Close health explorer. 

Select the Operating System Performance view.   Review the performance counters we collect out of the box for each monitored OS.

 

image

 

Out of the box – we discover and apply a default monitoring template to the following objects:

  • Operating System
  • Logical disk
  • Network Adapters

Optionally, you can enable discoveries for:

  • Individual Logical Processors
  • Physical Disks

I don’t recommend enabling additional discoveries unless you are sure that your monitoring requirements cannot be met without discovering these additional objects, as they will reduce the scalability of your environment.

 

Out of the box – for an OS like RedHat Enterprise Linux 5 – here is a list of the monitors in place, and the object they target:

 

image

 

There are also 50 rules enabled out of the box.  46 are performance collection rules for reporting, and 4 rules are event based, dealing with security.  Two are informational letting you know whenever a direct login is made using root credentials via SSH, and when su elevation occurs by a user session.  The other two deal with failed attempts for SSH or SU.

 

To get more out of your monitoring – you might have other services, processes, or log files that you need to monitor.  For that, we provide Authoring Templates with wizards to help you add additional monitoring, in the Authoring pane of the console under Management Pack templates:

 

image

 

In the reporting pane – we also offer a large number of reports you can leverage, or you can always create your own using our generic report templates, or custom ones designed in Visual Studio for SQL reporting services.

 

image

 

As you can see, this is a fairly well-rounded solution for bringing Unix and Linux monitoring into the same single pane of glass as your other systems, from the hardware, to the operating system, to the network layer, to the applications.

Partners and 3rd party vendors also supply additional management packs which extend our Unix and Linux monitoring, to discover and provide detailed monitoring on non-Microsoft applications that run on these Unix and Linux systems.

 

 

Troubleshooting:

 

The majority of troubleshooting comes in the form of failed discovery/agent deployments.

 

Microsoft has written a wiki on this topic, which covers the majority of these, and how to resolve:

http://social.technet.microsoft.com/wiki/contents/articles/4966.aspx

 

  • For instance, if the DNS name that you provided does not match the DNS hostname on the Linux server or the name in its SSL certificate, or if you failed to export/import the SCX certificates for multiple management servers in the pool, you might see:

 

image

 

Agent verification failed. Error detail: The server certificate on the destination computer (rh5501.opsmgr.net:1270) has the following errors:
The SSL certificate could not be checked for revocation. The server used to check for revocation might be unreachable.

The SSL certificate is signed by an unknown certificate authority.
It is possible that:
1. The destination certificate is signed by another certificate authority not trusted by the management server.
2. The destination has an invalid certificate, e.g., its common name (CN) does not match the fully qualified domain name (FQDN) used for the connection. The FQDN used for the connection is: rh5501.opsmgr.net.
3. The servers in the resource pool have not been configured to trust certificates signed by other servers in the pool.

The server certificate on the destination computer (rh5501.opsmgr.net:1270) has the following errors:
The SSL certificate could not be checked for revocation. The server used to check for revocation might be unreachable.
The SSL certificate is signed by an unknown certificate authority.
It is possible that:
1. The destination certificate is signed by another certificate authority not trusted by the management server.
2. The destination has an invalid certificate, e.g., its common name (CN) does not match the fully qualified domain name (FQDN) used for the connection. The FQDN used for the connection is: rh5501.opsmgr.net.
3. The servers in the resource pool have not been configured to trust certificates signed by other servers in the pool.

 

The solution to these common issues is covered in the Wiki with links to the product documentation.
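
For cause number 2 in the error text, the comparison at stake is between the common name (CN) in the agent certificate and the FQDN typed into the discovery wizard.  The Python sketch below illustrates that comparison only; how the product itself normalises names is not documented here, so the case-insensitive match and the trailing-dot handling are assumptions of this example.

```python
def cn_matches_fqdn(cert_cn, discovery_fqdn):
    """Return True when the certificate CN equals the name used for discovery.

    Assumption of this sketch: names are compared case-insensitively and a
    trailing dot is ignored, as is usual for DNS names.
    """
    def normalise(name):
        return name.strip().rstrip(".").lower()

    return normalise(cert_cn) == normalise(discovery_fqdn)


if __name__ == "__main__":
    # A certificate issued for the short hostname does not match the FQDN.
    print(cn_matches_fqdn("rh5501", "rh5501.opsmgr.net"))
    print(cn_matches_fqdn("rh5501.opsmgr.net", "rh5501.opsmgr.net"))
```

A certificate generated with only the short hostname is a typical way to end up with this mismatch.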

 

  • Perhaps – you failed to properly configure your Run As accounts and profiles.  You might see the following show as “Unknown” under administration:

 

image

 

Or you might see alerts in the console:

 

Alert:  UNIX/Linux Run As profile association error event detected

The account for the UNIX/Linux Action Run As profile associated with the workflow "Microsoft.Unix.AgentVersion.Discovery", running for instance "rh5501.opsmgr.net" with ID {9ADCED3D-B44B-3A82-769D-B0653BFE54F9} is not defined. The workflow has been unloaded. Please associate an account with the profile.

This condition may have occurred because no UNIX/Linux Accounts have been configured for the Run As profile. The UNIX/Linux Run As profile used by this workflow must be configured to associate a Run As account with the target.

Either you failed to configure the Run As accounts, or failed to distribute them, or you chose a low priv account that is not properly configured for sudo on the Linux system.  Go back and double-check your work there.

 

If you want to check whether the agent was deployed to a RedHat system, you can run the following command in a shell session:

image
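
One way to script this check is to query the RPM database for the agent package.  The Python sketch below assumes the agent package is named scx; that name is an assumption of this example (it matches the /etc/opt/microsoft/scx path seen in agent error messages) and is not taken from the screenshot above.  It relies on rpm -q returning a non-zero exit code when a package is not installed.

```python
import subprocess


def scx_agent_installed(run=subprocess.run):
    """Return True if `rpm -q scx` reports the package as installed.

    `run` is injectable so the function can be exercised without rpm.
    The package name `scx` is an assumption of this sketch.
    """
    result = run(["rpm", "-q", "scx"], capture_output=True, text=True)
    return result.returncode == 0


if __name__ == "__main__":
    try:
        print("agent installed:", scx_agent_installed())
    except FileNotFoundError:
        # rpm is not available on this system (not an RPM-based distribution).
        print("rpm command not found")
```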

Comments
  • Thanks for the great post.

    It is valuable and amazing as always.

  • Agreed this document helped out so much. I appreciate the share.

  • Excellent document again....

    I do not see any Resouce Pools in my environment even I have about 100 Cross Platform agent running monitored and alerting properly, is it normal?

    Is it only SCOM 2012?

    As I have VMs managed in SCOM through nWorks from Veeam/VMware using the vCenter what is the main difference in the informtion provided by the two ways to managed the VM/Linux machine?

    Thanks,

    DOm

  • we are getting below error when using a privileged account

    Failed during SSH discovery. Exit code: 1

    Standard Output: Sudo path: /usr/bin/

    Standard Error: sudo: no tty present and no askpass program specified

    Exception Message:

  • Hello

    Thanks for the info. I succesfully configured a number of linux using the same credentials for monitoring. But I'm trying to add now another group of linux boxes with a different set of credentials (as I dont want to share a privileged account between all my servers) and when trying to add the credentials to the profile, the "All targeted objects" option is not available anymore.

    Is it possible to monitor several Linux machines using different SUDOer accounts for each one?

    Thanks a lot

    Fran

  • Oh, nevermind my last post. The solution is already out there :)

    social.technet.microsoft.com/.../27d1983a-96d2-4900-8730-0a9522d870b4

    Again, thanks for the great post :)

  • HI Kevin,

    Could you please help me in including the SCOM R2 agent in our Server Template.

    Is it there any power shell script to install manually.

    We are planning to automate the SCOM agent installation by adding the agent to Template. But we have different SCOM gateway server for different domain.

    P Lease help if there any solution for this.

    Thanks,

    Raksha

  • hi kevin

    i followed your instuctions as detailed above but i seem to be getting the following error i dont know if i missed something out what do you think

    Failed to sign kit. Exit code: 1

    Standard Output: Failed to start child process '/sbin/init.d/scx-cimd' errno=13

    RETURN CODE: 1

    Standard Error: cp: cannot create /etc/opt/microsoft/scx/ssl/scx.pem: Permission denied

    Exception Message:

  • Hi Kevin,

    Do you have the Solution to fix the below error:

    Failed to sign kit. Exit code: 1

    Standard Output: Failed to start child process '/sbin/init.d/scx-cimd' errno=13

    RETURN CODE: 1

  • I know how to get the .cert to the unix system (aix) but dont they need an agent on their side and where do you get it from?

  • i am trying to discover and install agent into Linux machines throught SCOM 2012 . Below is the error i am facing ...please someone help em Failed to sign kit. Exit code: 1 Standard Output: Failed to start child process '/etc/init.d/scx-cimd' errno=13 RETURN CODE: 1 Standard Error: cp: cannot create regular file `/etc/opt/microsoft/scx/ssl/scx.pem': Permission denied Exception Message:

  • Is there a way to configure the run as accounts to install/monitor some Linux servers with a privileged and other servers with an unprivileged account? It seems the configuration only one way or the other for all the servers.
