Keith Combs' Blahg

Ramblings from another nerd on the grid

Windows Genuine Advantage (WGA) - weekend issue explained

This weekend a number of users got caught up in an issue with our Windows Genuine Advantage (WGA) program.  I'm reposting the explanation from http://blogs.msdn.com/wga/archive/2007/08/27/update-on-validation-issues.aspx.

"We're continuing to investigate what happened but here's a quick update to yesterday's post. The issue with processing validations began Friday afternoon at about 3:30 PM Pacific time and through a combination of posts to our forum and customer support the issue was discovered by evening. By about 11:15 AM Pacific on Saturday morning the issue affecting the validation service had been analyzed and resolved such that validations were again being processed properly. Our data shows that fewer than 12,000 systems were affected worldwide and that many of those have already revalidated and are fixed. This is encouraging news but we want to emphasize that one bad customer experience is one too many and that we're committed to learning from this experience and working to prevent this type of event from occurring again.

We're also looking into the reports of comments made about the expected length of the issue and how support inquiries were handled overall during this time.  I heard a report that one of our support folks indicated that the issues would not be fixed until Tuesday, that was incorrect. We'll be looking closely at how/why that statement was made.

Let me call out here that we take issues like this very seriously and when anything like this happens it receives our full attention. I heard directly from a couple of users yesterday who experienced this issue. They confirmed to me that they were able to re-validate their systems successfully and had no other issues. As I mentioned yesterday the fix for anyone affected by this is to revalidate their system at our site. This can be done by visiting our site (www.microsoft.com/genuine) and clicking the 'Validate Windows' button in the upper right area of the homepage. For customers who need additional support Microsoft already offers free support for WGA issues starting with diagnostics and other tools and information. In North America support for WGA issues is available at 1-866-530-6599. Internationally it varies somewhat so to find out more about our support options go to http://www.microsoft.com/genuinesupport (there's a link to technical support at the bottom of the page).

This validation failure did not result in the 30-day grace period starting and no one went into reduced functionality mode as a result.  The experience of a system that failed validation in this instance was that some features intended for use only on genuine systems were temporarily unavailable.  Those features were Windows Aero, ReadyBoost, Windows Defender (which still scanned and identified all threats, but cleaned only the severe ones), and Windows Update (only optional updates were unavailable; security and other critical updates remained available).  Also, the desktop message about failed validation appeared.  And as I indicated, these features return to normal and the desktop message disappears when an affected system is revalidated at our site.

As always, we welcome feedback about the program so please feel free to contact me here through the blog or post a comment.

Published Monday, August 27, 2007 2:04 PM by alexkoc"

As Alex indicates, we are far from being done looking at this incident.  I too will be looking closely at what happened, perceptions of our customers, and what we can do moving forward to earn your trust.  Since I am moving into a different role starting October 1st, I'll be looking at all of this from a different view.  I'll let you know more about that as we get closer to that date.  In the meantime, I sincerely hope everyone is back up and running, and having fun.

[UPDATE] See http://blogs.msdn.com/wga/archive/2007/08/28/so-what-happened.aspx for the gory details.

Comments
  • Different role?  Hmmm... Interesting

  • Yeah, you think I have a great job now?  It just got better...

  • Not much of an explanation as to what happened.

  • Brad, it is what it is at this point.  Rest assured there are many tough questions being asked.  If you've ever been responsible for a data center service that has an incident, then you know what type of inquiries are taking place.  Data center operators know what I mean.

    This brought back vivid memories from early in my IT career.  I was working in a large data center.  We were running a batch job that took all weekend to complete.  One aspect of running the job included loading data via removable disk packs.  Think data center mainframe DASD in removable form.

    The operator manually loading the disk packs was apparently bored and not paying attention.  He was eating some peanut M&M’s and one fell down into the open disk machine.  He dropped the new disk pack in place, locked it with the handle, and the drive spun the candy up into the disk pack, trashing the disk and the batch job's processing time.  This meant the job did not complete on time and the CICS region was not up on time Monday morning.  A new policy was implemented following that: no more food and drink on the data center floor.  Harsh.

    Regarding the software side of this, again, lots of tough questions are being asked inside and outside Microsoft.  I must admit, however, that I am a believer in the objective of WGA and similar technologies.  Regardless of whether I’m running Linux, UNIX, OS X, or Windows, I want to know the binaries are from the OS vendor and are safe.  The validation needs to be as painless and resilient as possible.

    I have no idea where this will lead but don’t think for a second there isn’t laser focus on this right now.  There is.

  • Shoot, I've been there many times.  You have a problem... You pretty much know what caused it, but on the off chance there's something else involved you don't want to be left looking like you don't own the solution.  In fact, at one place I worked (which I won't name) we had a name for that process...  We all got together after the fact in a formal meeting informally called a Blame-Storming Session.  The purpose was to come up with an explanation that we could present to an executive.  They weren't usually much fun. :-)