Stuff from stuf

Bringing sexy back to management...

May, 2009

  • Rethinking the guest?

    I love working in the IT industry.  Apart from being involved in an industry that is constantly changing and evolving, there are also lots of smart people out there who are doing & saying stuff that is interesting & challenging.  I read a lot of blogs, including many by people who work with competitors’ technology or do things that aren’t in my area of focus.  I read an interesting post recently on the vinternals.com site called Rethinking the guest.  I was going to comment on the post there, but thought I would blog my response instead.

    Stu, who posts there, is a VMware guy and has a lot of interesting things to say.  This post touches my area of work (Systems Management) and challenged me to think about how we do stuff, and how our world might evolve – thanks Stu!

    Stu’s theory is that agent-based management of guests needs to change, and he posted four key areas where things could be done better.  I find myself half agreeing and half disagreeing with him.

    His first point is that managing patching with an agent is probably not efficient, and a sub-point was that an enterprise software management system will likely do other stuff like hardware & software inventory, but we should disable hardware inventory because vCenter captures that.  I think that misses the important point that a lot of the systems that capture inventory then pass that information up to other systems (like your CMDB), and it’s nice to have a consistent place to capture it.  I know that with System Center Configuration Manager I can grab all the information about my Windows inventory (software & hardware, physical or virtual) from a single place, and with partners like Quest I can grab information about my non-Windows environment as well from the same location.  Do we need to complicate our environment by splitting our virtualised hardware inventory from our physical hardware inventory, or from our desktop inventory?  And what if we want to do other things with our inventory system like baselining our desired configurations?
    And agentless patch management is not without its problems.  What about when the machine is turned off or unreachable (maybe the Windows Firewall is switched on)?  What about when we want more control over when things happen?  Agentless patching can be good, but I think we give up a lot of control.

    His second point is that agentless monitoring is also possible with the new Windows eventing subsystem.  Again, that only gives us a subset of the information that we might care about.  If all I care about is what events are being logged in the event log then sure, maybe that’s a potential solution.  But what if I want deeper information?  What if I want to be alerted when my disk space is low?  What if I want deeper information about what an application is doing?  If I really want to understand what my application is doing, merely looking at Windows events simply isn’t enough.  I need to look at more metrics than events expose.  That’s where System Center Operations Manager excels – and then using the PRO functionality inside Virtual Machine Manager we can use that contextual information to make smart decisions about remediating problems.  That might mean live migrating/VMotioning a machine to another node, or it might mean provisioning a new VM to take the extra load because we’ve hit an OS limit that providing extra resources can’t solve.
    And don’t get me started on SNMP.  SNMP is an overcomplicated, insecure (at least until SNMPv3) mess that is great for monitoring simple network devices, but it really shows its age.  And you simply don’t get the depth of information about Windows devices with SNMP.

    His third point – backup.  All I can say is, good to see VMware catching up on the great backup tools we have available with System Center Data Protection Manager which provides the same functionality but across physical and virtual environments. :)

    I’ll skip VMsafe for now, as I don’t know enough about it to comment – but I guess virtualisation will require security models to evolve, and VMsafe looks like one step in the process.

    But his overall point is that agent-based management will hinder the move to the cloud.  This is where I start to agree.  I think it probably will, and it’s one of the things people will have to consider when they look at their cloud strategy.  So he’s right in that things have to change, but I’m not sure that the alternatives he’s proposed are good ones yet.  It’s going to be interesting to see how the management tools industry does evolve to take into account the cloud.  I’m just glad I’m here to see what happens!

  • Monitoring service levels in Ops Mgr R2

    I’ve just been playing with the new Service Level Dashboard for System Center Operations Manager 2007 R2.  It’s a great improvement on the first version, and it’s pretty straightforward to set up.  I thought I’d document the steps I took to configure it, although if you just follow through the documentation you’ll get there pretty easily.

    First the prerequisites:

    1. Install Operations Manager 2007 R2

    2. Install Windows SharePoint Services 3.0 (You need to install against SQL, not against the internal Windows database)

    3. You’ll need to import the Service Level Dashboard management pack into Ops Mgr

    4. Install the Service Level Dashboard following the wizard – it’s pretty straightforward.

    5. In the web.config file for the Service Level Dashboard application that gets created (by default it’s stored in C:\inetpub\wwwroot\wss\VirtualDirectories\51918), change the line that reads:
    <identity impersonate="false"> to:
    <identity impersonate="true">

    You’ll need to run iisreset after you’ve done this.  If you don’t make this change, you’ll get a “Cannot complete this action” error when you navigate to the home page.

    6. Make sure to create a firewall exception for the port that you’ve configured for the Service Level Dashboard.
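    The web.config edit in step 5 can also be scripted.  The following is only a minimal sketch: it assumes the <identity> element sits directly under <system.web> and that the file has no XML namespace, neither of which the steps above confirm.  The function name and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def enable_impersonation(config_xml: str) -> str:
    """Return the config text with <identity impersonate="true">.

    Assumes the layout <configuration>/<system.web>/<identity>, which is
    an assumption about the file, not something taken from the post.
    """
    root = ET.fromstring(config_xml)
    identity = root.find("./system.web/identity")
    if identity is None:
        raise ValueError("no <identity> element found under <system.web>")
    identity.set("impersonate", "true")
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed stand-in for the real web.config.
SAMPLE = """<configuration>
  <system.web>
    <identity impersonate="false" />
  </system.web>
</configuration>"""

if __name__ == "__main__":
    print(enable_impersonation(SAMPLE))
```

    ElementTree drops comments and the XML declaration when it rewrites a document, so it’s safer to work on a backup copy of the real file.  You still need to run iisreset afterwards.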

    Now you should be good to go.  First thing you need to do is define your service level in the Operations Manager console.  Open the console and go to the Authoring pane.  Choose the “Service Level Tracking” option. 

    1. SL definition

    In the right hand pane, click Create. This brings up the Service Level Tracking wizard.  Give it a name.

    2. SL definition

    Now we need to define what we’re interested in tracking for service levels.

    3. SL definition

    The first thing we need to do is select the class that we’re monitoring.  Click the top Select button and choose the class that matches what you’re interested in.  By default it’s scoped to Distributed Application, but you can change this to Group or All.  In this case I’m interested in tracking one of my pre-created distributed applications, so I choose “Distributed Application” from the list.

    4. SL definition

    Now we’ve selected our class, we need to target to a specific object – in this case I’m looking for my distributed application called “OpsMgr Website”.  This is a distributed application I created based on the “Line of business web application template” – it monitors the OpsMgr web console & backend database of the OpsMgr server itself.

    5. SL definition

    Also make sure you choose a management pack to store this new service level in.

    6. SL definition

    Now we need to define what our service level objective actually is.

    7. SL definition

    Click the Add button and we can define what we want our service level objective to be.  I want my app to be available at least 95% of the time and I can specify which states count as downtime.

    8. SL definition
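    To see why the choice of downtime states matters, here is a rough sketch of the arithmetic behind an availability objective.  It is not how Operations Manager calculates it internally; the state names and the hours are hypothetical, and only the 95% goal comes from the setup above.

```python
def availability(intervals, downtime_states):
    """Percentage of recorded time spent outside the downtime states.

    intervals: list of (state, hours) pairs.
    downtime_states: set of state names that count as downtime.
    """
    total = sum(hours for _, hours in intervals)
    if total == 0:
        raise ValueError("no time recorded")
    down = sum(hours for state, hours in intervals if state in downtime_states)
    return 100.0 * (total - down) / total


def meets_objective(intervals, downtime_states, goal_percent=95.0):
    """True when availability reaches the goal (95% in this walkthrough)."""
    return availability(intervals, downtime_states) >= goal_percent


# Hypothetical 24-hour period; state names are illustrative only.
day = [("healthy", 22.0), ("critical", 1.0), ("planned_maintenance", 1.0)]

strict = {"critical", "planned_maintenance"}   # maintenance counts as downtime
lenient = {"critical"}                         # maintenance is excused

if __name__ == "__main__":
    for name, states in (("strict", strict), ("lenient", lenient)):
        print(name, round(availability(day, states), 2), meets_objective(day, states))
```

    With these made-up numbers the same day passes or fails the 95% goal depending only on whether planned maintenance is counted, so it’s worth deciding that deliberately in the wizard.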

    Once I’ve done this, I can now create the service level tracking.

    9. SL definition

    Now that I’ve defined what I need to track, I can then go to the Service Level Dashboard and set up the web page that lets me view this.  Navigate to the website you set up when installing the Service Level Dashboard (by default it’s http://localhost:51918).  It’s also blank by default.  Make sure you log on as an Administrator of the site so that you can make changes.

    1. SLD Config

    Go to the Site Actions button, and choose Edit Page.  This will bring up the screen below and, all things going well, you should have your service level that you’ve defined available to select.  Choose the one you want to monitor, and choose the refresh rate and over what period you want to report.  Once you’re done, exit edit mode.

    2. SLD Config

    You should now have a nice view of how you’re tracking against your service levels – you can see that I’m not meeting my 95% objective (that’s what happens when you stop the web console web site for a few hours…)

    3. SLD Config

    As you can see, the new version of the Service Level Dashboard is easy to set up and provides a great view of how you’re tracking.

  • Save money managing your virtual hosts

    As I said in my post below, I read a lot of blogs.  I’ve been meaning to write a response to a post I read on Vcritical.com for a while, and haven’t got around to it.  Vcritical is written by Eric Gray, who is a VMware employee, so clearly he has a biased view, much as I do given my employer :).  However, it’s always fun to debate the issues, and we shouldn’t shy away from respectful disagreement.

    Eric wrote a post a while back, “Save $14970 on VMware ESX Management”, where he said that to save money managing your VMware hosts you shouldn’t use Virtual Machine Manager, and then pointed out that the cost of the Server Management Suite Enterprise license is $1,497 per physical host.

    I think Eric is understating the value that the SMSE license brings.  SMSE doesn’t just give you the Operations Manager management license for the host, but also for all the guests running on that host, along with the management agents for Configuration Manager, Data Protection Manager & Virtual Machine Manager.  So now we get not only detailed information about what’s going on at the host level, but also detailed application-level information about what’s going on in all the guests, plus backup, plus inventory, plus patching, plus software distribution, plus desired configuration management, plus self-service VM provisioning, plus a whole bunch of other capability.
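    A quick back-of-the-envelope sketch of that argument: the $1,497 per-host price is the figure quoted above, and the headline $14,970 appears to correspond to ten hosts at that price.  The guests-per-host number is purely hypothetical, and the function names are made up for illustration.

```python
SMSE_PRICE_PER_HOST = 1497  # USD per physical host, as quoted in the post


def total_cost(hosts: int) -> int:
    """Licence cost for a number of physical hosts."""
    return hosts * SMSE_PRICE_PER_HOST


def cost_per_managed_os(guests_per_host: int) -> float:
    """Spread one host licence over the host plus the guests it covers."""
    return SMSE_PRICE_PER_HOST / (guests_per_host + 1)


if __name__ == "__main__":
    print(total_cost(10))            # ten hosts at the quoted price
    print(cost_per_managed_os(9))    # hypothetical: nine guests on one host
```

    The more guests you run per host, the lower the effective cost per managed operating system, which is the value I think the headline number leaves out.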

    For me the deep application information is really important.  Sure you could take a black box view of the VM and just treat it as a CPU & memory & disk IO consuming thing, but just because it’s virtualised doesn’t mean you don’t want to know what’s going on in that VM.  If that VM is running BizTalk you’d want to know that the BizTalk services are behaving the way they should, if the VM is running Exchange you want to know that mail is flowing, if it’s running SQL you want to know that the SQL databases aren’t running out of space.  The black box view is great if all you want to do is play musical chairs with your VMs, but if you really want to know that they’re doing what they’re supposed to, you need deep application knowledge.  And that’s where Operations Manager excels.

    And never mind that Eric tells you PRO is too hard to configure – it’s not.

     

    Just to be explicit on my comment policy: comments are moderated on this blog, but I’ll publish every comment that isn’t offensive, spam or defamatory.  It just takes me a while sometimes.