• To hotfix or not to hotfix, that is the question

    Let me start this blog post by making it clear that this is my opinion, not an official stance.  Everyone is entitled to their opinion and this is just mine.  Feel free to ignore it if you like.

    I often have discussions with customers about applying hotfixes and, more recently, cumulative updates.  The discussion revolves around whether these should be applied proactively.  Many folks in the support organizations here at Microsoft want to look after our customers and thus recommend applying all the latest and greatest fixes.  The thought is that doing so avoids fighting issues that are already known and fixed.  This can save a lot of time and frustration compared with battling for days to get something to work, only to find out there was already a fix for the issue.

    Change is risk.  This is why most medium and large companies with mature IT processes have some form of change control in place.  Any change means a deviation from the status quo, and while that may improve things, it has the potential to cause problems as well.  The standard way to mitigate that risk is to test changes before committing to them in production.  This is a good practice for IT groups, and it is practiced internally at Microsoft as well.  Everything that Microsoft releases is tested before being let out “into the wild”.  As some of you may have noticed, that testing has occasionally missed a scenario or two, and a release has had unforeseen side effects.  Fortunately I think such cases are becoming fewer as process and diligence improve.  Unfortunately, we aren't at a zero occurrence rate just yet.

    I was a test lead/manager for several years in the SCCM development group and was part of many hard discussions on how much testing is the right amount for a given problem.  With only a few quick tests, the possibility of a regression or unforeseen problem caused by the fix was high.  On the other end of the spectrum, I could sit down and dream up an infinite number of potential tests, meaning that the product could never be released.

    To give you an off-the-wall example of how far these tests can go, think of a screwdriver.  It is a fairly simple and straightforward thing that most of us use without really thinking about it.  Various tests are: can it turn to the right, turn to the left, fit screw size A, fit screw size B, be comfortable in my hand, be comfortable in my kid's hand, not break under normal use, not break in a deadly way under more extreme use, not melt in my hand, not melt out in sunlight, not melt in my garage with the heater left on during a 100 degree day, go to space, not wear down too quickly, look good on a store shelf to sell better, etc.

    So, how much testing is enough testing?  Well, the balance point changes.  The more critical and time sensitive something is, the more risk we take by doing less testing.  Hotfixes are generally at the higher end of the risk scale.  We test them in the lab, maybe with some internal folks, and usually with at least a few customers before they become available for everyone to download.  The testing is limited because we want to be able to get them out fairly quickly.  Cumulative updates (CUs) get a little more testing, especially of the interaction between the multiple fixes.  The full cycle of development, then testing, then release happens in only a few months (in addition to the testing that some included hotfixes may have already been through) and with a limited number of people.  So while this is more coverage than a typical hotfix gets, it isn’t really what I personally would consider “low risk” just yet (although it is, arguably, getting close).  SCCM CUs hold less risk than a typical standalone hotfix, and typically only include items with a clearly understood risk that was covered by internal testing.  Service packs get much more rigorous testing across many different in-house and external scenarios.  The chances of a problem arising from a service pack are very low (or at least well understood and documented), and thus on a personal level I consider it a “low risk” type of deployment.

    So, why do I write all this up?  The advice I have always heard is “if it isn’t broke, don’t fix it” and in general I think that applies to software patching.  Don’t apply a hotfix or a CU unless you are experiencing the symptoms that it is meant to address.  Yes, you might waste a weekend battling a problem only to find out that a fix already existed.  Compared to wasting a weekend applying the fix and then battling an issue caused by it, I think not applying a fix that wasn’t truly needed is a good trade-off.  There is one caveat I make to this statement, however, and that is for “invisible” problems.  These are problems that you may actually be having, but not know about.  A good example is a memory leak.  Sure, you might have a leak in your admin console (as an example), but if you close it at least once per day then you never notice it.  The fact that it is sluggish every Monday after you left it open all weekend, until you restart it, has just become a habit that you have never bothered to investigate.  A fix that solves admin UI memory leaks might help, or might be completely unrelated and do nothing for you, but it is worth considering applying proactively.

    So now I shall get down off my soap box.  There are many smart people whom I respect who disagree with me on this stance, and in the end what works best for one company may not work best for all companies.  Make the choice you deem appropriate for your company and your role.  I hope that it works out well for you in any case.

     

    8/15 - Minor updates to clarify CU

  • How to manually clean your SCCM server roles off the box

    Occasionally it becomes necessary to manually clean SCCM components off a server.  When that happens I usually tell customers to just flatten and rebuild the box, but that is not always an option.  In those cases I have a mental list of things I go through to remove all the traces that SCCM/SMS could leave on the machine.  Not every machine will have all of these locations populated.  Every scenario has its nuances, so don’t blindly follow this if you want a clean box; consider whether each item is relevant to what you are trying to accomplish.  If you think I missed anything, comment below and I’ll update as appropriate.

    • File System
      • \Program Files\Microsoft Configuration Manager
      • \Program Files\SMS_CCM
      • \sms
      • \windows\ccm
      • \windows\ccmsetup
      • \windows\ccmcache
    • Registry
      • HKLM\software\Microsoft\sms
      • HKLM\software\Microsoft\ccm
      • HKLM\software\Wow6432Node\Microsoft\sms
      • HKLM\software\Wow6432Node\Microsoft\CCM
    • Services
      • SMS_executive
      • SMS_Site_Component_Manager
      • SMS_Site_Backup
      • SMS_Site_SQL_Backup
      • SMS_SITE_VSS_Writer
      • SMS Agent Host
    • WMI
      • root\sms
      • root\ccm
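
    The file system part of this checklist can be sketched as a small script.  This is a minimal sketch in Python, not an official tool: the function name find_leftover_dirs and the idea of passing in the drive root are illustrative assumptions, and it only reports what it finds rather than deleting anything.  Registry, service, and WMI checks are left out.

```python
import os

# Relative locations taken from the file system list above.
CANDIDATE_DIRS = [
    os.path.join("Program Files", "Microsoft Configuration Manager"),
    os.path.join("Program Files", "SMS_CCM"),
    "sms",
    os.path.join("windows", "ccm"),
    os.path.join("windows", "ccmsetup"),
    os.path.join("windows", "ccmcache"),
]


def find_leftover_dirs(root, candidates=CANDIDATE_DIRS):
    """Return the candidate directories that exist under root (report only)."""
    found = []
    for relative in candidates:
        full_path = os.path.join(root, relative)
        if os.path.isdir(full_path):
            found.append(full_path)
    return found


if __name__ == "__main__":
    # "C:\\" is an assumed install drive; SCCM may live on another volume.
    for path in find_leftover_dirs("C:\\"):
        print("Still present:", path)
```

    Reporting first and deleting by hand keeps the "don't blindly follow this" advice intact: you still decide which leftovers matter for your scenario.
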
  • How to get clients to avoid one of your management points

    The other week I had a customer asking me how they could keep clients from using a Management Point (MP), yet still have it installed and functional to interact with some 3rd party software they wanted to use.  That question didn’t have a simple answer.  By default an SCCM 2012 client will randomly choose from any available MP in a site.  The key thing that controls the choice of one MP over another is whether an HTTPS MP is specifically required.  Clients also prefer to use an MP in the same forest they are in, if several MP are available.  For my customer, all the MP were HTTP and all in the same forest, so each of their 3 MP had the same chance of being chosen.

    I tried an idea that turned out to work, which is to "hide" one of the MP.  By “hide” I mean that it is still in AD and seen in an MPList call, but will not be returned to clients that call their current MP and request other MP to communicate with.  This means that normal client processes would randomize between 2 of the MP, but the third MP would be used only when specified or hard-coded, such as during client installation.  That third MP is still there and running as normal, but it takes something like 3rd party software, boot media, or a client command line parameter for it to be used.  Un-publishing the MP means that it won’t be listed in AD and normal location requests will not return it as an option.  Screenshots showing where this un-publishing can be done are below.  The change can be seen by watching the clientlocation.log on the clients and looking for lines similar to the following, which should never show a change to the "hidden" MP:


    Assigned MP changed from <MP1> to <MP2>.
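
    If you have a lot of clients to check, that log review can be scripted.  The following is a minimal sketch in Python, assuming the message text matches the sample line above; the function names and the sample host names are made up for illustration, and real log lines carry extra wrapping (timestamps, component names) that this simply ignores.

```python
import re

# Assumption: the message reads "Assigned MP changed from X to Y", with the
# MP names optionally wrapped in angle brackets as in the sample line.
MP_CHANGE = re.compile(
    r"Assigned MP changed from <?([^\s<>\]]+)>? to <?([^\s<>\]]+)>?"
)


def mp_changes(lines):
    """Yield (old_mp, new_mp) for each 'Assigned MP changed' line."""
    for line in lines:
        match = MP_CHANGE.search(line)
        if match:
            # Strip the sentence-ending period if it was captured with the name.
            yield match.group(1).rstrip("."), match.group(2).rstrip(".")


def switched_to(lines, hidden_mp):
    """Return True if any change line shows the client moving to hidden_mp."""
    return any(new.lower() == hidden_mp.lower() for _old, new in mp_changes(lines))


if __name__ == "__main__":
    # Hypothetical log excerpt; the host names are placeholders.
    sample = [
        "Assigned MP changed from <mp1.contoso.com> to <mp2.contoso.com>.",
        "Assigned MP changed from <mp2.contoso.com> to <mp1.contoso.com>.",
    ]
    print(switched_to(sample, "mp3.contoso.com"))
```

    A result of True for the "hidden" MP would mean the un-publishing did not take effect for that client.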

     

    There is a desire in the SCCM community to allow clients to have an affinity to a specific MP, similar to the use of boundaries and Distribution Points (DP).  To be clear this will not provide that affinity.  It simply removes one or more MP from normal client use processes.  It cannot be used to selectively make one MP serve a subset of clients.  If you were to  set a client to use this “hidden” MP it would, for a time.  Various processes in the SCCM client would eventually ask for a list of available MP, and the results returned would be the other MP, and thus the clients would switch away from the use of this “hidden” MP.  The MP would serve a limited purpose, until such a switch occurred.
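
    The behavior described above can be illustrated with a toy model.  This is a sketch in Python under simplifying assumptions, not the SCCM client's real logic: it assumes every routine location request picks uniformly at random from the MP that are not hidden, and the names next_mp and simulate are invented for the example.

```python
import random


def next_mp(all_mps, hidden_mps, rng=random):
    """Pick an MP for a routine location request; hidden MP are never offered."""
    offered = [mp for mp in all_mps if mp not in hidden_mps]
    return rng.choice(offered)


def simulate(start_mp, all_mps, hidden_mps, requests, seed=0):
    """Return the MP in use after each location request, starting from start_mp."""
    rng = random.Random(seed)
    history = [start_mp]
    for _ in range(requests):
        history.append(next_mp(all_mps, hidden_mps, rng))
    return history


if __name__ == "__main__":
    # A client hard-coded to the hidden MP3 at install time.
    print(simulate("MP3", ["MP1", "MP2", "MP3"], {"MP3"}, requests=5))
```

    The first entry in the history is the hard-coded "hidden" MP; every entry after it is one of the other MP, which is why this approach cannot pin a subset of clients to a particular MP.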

    [Screenshot: where the MP can be un-published]

    Thanks go to Jason Sandys and Adam Meltzer for helping me provide clarity on this post.

    6/27/2014 UPDATE - To provide better clarity I changed the post to reflect that MPList will still show the "hidden" MP and its object will still be in AD

  • Getting started with RBA in System Center Configuration Manager 2012

    RBA, or Role Based Administration, is new with the SCCM 2012 product.  Many customers I encounter are excited about the separation and flexibility it can provide, but daunted by the work of getting it set up initially.  There are a few handy tricks I share time and again with customers at this stage, and I thought it might be nice to share them with everyone.

    1 - Create a security role template

    If you get into RBA you will find that you can copy an existing role, then modify it for your needs.  However, this means that you have to go and remove all the perms that came over with the copy, and you have to do this for each role you make.  Save yourself some time: the very first time, make a copy of the Remote Tools Operator role.  It has the fewest existing perm lines that need to be cleared.  I like to change all the existing lines to ‘no’, then add one single perm at the very top, under Alert Subscription.  Save this as “_Template” and from then on you can just copy it as your starting point, making things a little easier.


     

    2 - Think ahead about your potential use of scopes

    If today you manage desktops only, and that is all you will ever have in SCCM, then stick to the default scope.  If, however, you think that some day you may want to add servers and have management of them separated out, then the time to create the “desktop” and “server” scopes is right after SCCM finishes installation.  Sure, you can do it any time, but the problem is that any object you create going forward is tagged with the scopes you are in.  That means new objects can be tagged as part of the “desktop” scope from the start, or they can be tagged as part of the “default” scope and you can go change them all to “desktop” later (not simple, but possible).

    3 - Use RBAViewer

    This tool is free as part of the SCCM 2012 R2 toolkit.  It makes it easier to visualize and explore the different perms needed to reach your end goals.  There is also a helpful spreadsheet made by Brent Dunsire that many people find useful.

     

    As a little side note, I was asked to figure out the minimum permission necessary to allow an admin to block/deny devices from communicating with SCCM.  That requires the “Read Resource” perm under collections.  Of course, if you can’t see the collections or connect to the admin console you will probably need an additional perm of some kind.  “Read” under collections was the easy one I used.

  • Basic content deployment troubleshooting in SCCM 2012

    I was recently asked to pull together a general starting point guideline on how to troubleshoot content deployment issues with SCCM 2012.  There is a lot that could go wrong, and the specifics for each scenario are different, but here is what I pulled together as a general starting point.

     

    • Check Status in the UI

    • Head to the logs

      • If you missed the time or no problems are obvious, look "downstream" at pkgxfermgr.log
      • If you should have info but do not, look "upstream" at the parent site communication components (sender, scheduler)
      • Watch for lines with “STATMSG” in them where you see Sev=W or Sev=E
        • These are the error and warning status messages and help focus on problem points
        • Look just above them to see the real issue
      • Use CMTrace.exe and look for lines colored red
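
    The "watch for STATMSG" step can also be scripted when a log is too long to scroll through.  This is a minimal sketch in Python, assuming a warning or error status message has “STATMSG” and “Sev=W” or “Sev=E” on the same line (matched in either letter case); the function name, the log path, and the number of context lines are illustrative choices, not anything built into SCCM.

```python
import re

# Assumption: warning/error status messages contain "STATMSG" and "Sev=W" or
# "Sev=E" on the same line, in either letter case.
STATMSG = re.compile(r"STATMSG.*\bSEV=([WE])\b", re.IGNORECASE)


def find_problem_points(lines, context=3):
    """Return (line_number, severity, preceding_lines) for each W/E status message.

    lines must be a list of strings; preceding_lines holds up to `context`
    lines just above the hit, where the real issue usually shows up.
    """
    hits = []
    for index, line in enumerate(lines):
        match = STATMSG.search(line)
        if match:
            start = max(0, index - context)
            hits.append((index + 1, match.group(1).upper(), lines[start:index]))
    return hits


if __name__ == "__main__":
    # Hypothetical path; point this at whichever log you are reviewing.
    with open("distmgr.log", errors="replace") as handle:
        log_lines = handle.read().splitlines()
    for number, severity, above in find_problem_points(log_lines):
        print(f"Line {number}: severity {severity}")
        for text in above:
            print("    ", text)
```

    Printing the lines just above each hit mirrors the manual advice: the status message marks the spot, and the preceding lines explain what went wrong.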

    Additional info: