Microsoft Enterprise Platforms Support: Windows Server Core Team
I know that one of the workarounds everyone is using is editing the pending.xml file and rebooting the machine. This seems to work for people in cases where deleting the SetupExecute registry value does not. I wanted to give a brief rundown of why you should only do this as a last resort.
During service pack installation, we populate the pending.xml file with all of the files and registry values needed to install a particular update. Service packs are special in that they are broken into critical and non-critical transactions to allow us to recover more quickly and reduce the no-boot window that could occur during installation.
During system shutdown, we process all of the critical transactions first and then the non-critical transactions. If we fail processing the critical transactions, the service pack will just fail and roll back. If the critical operations succeed but the non-critical operations fail, we attempt to process them on reboot using Session Manager (smss) and the SetupExecute registry value. When the system reboots and reads the SetupExecute value, it retries the installation first, and if that fails it rolls back the service pack installation. Deleting the registry value tells smss not to run poqexec. The installation should then be reattempted during startup processing or fail outright. So, effectively, deleting the registry value breaks you out of the install-fail-reboot loop that the machine ends up in.
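To make the order of events easier to follow, here is a minimal Python sketch of the decision flow described above. It is only a toy model: the function name, its parameters, and the `Outcome` values are hypothetical and do not correspond to anything in the actual servicing stack.

```python
from enum import Enum


class Outcome(Enum):
    INSTALLED = "installed"
    ROLLED_BACK = "rolled back"
    RETRY_SKIPPED = "smss retry skipped"


def service_pack_outcome(critical_ok, noncritical_ok, retry_ok,
                         setup_execute_present=True):
    """Toy model of the shutdown/reboot flow (all names are hypothetical)."""
    # Shutdown: critical transactions are processed first.
    if not critical_ok:
        return Outcome.ROLLED_BACK
    # Shutdown: non-critical transactions are processed next.
    if noncritical_ok:
        return Outcome.INSTALLED
    # Reboot: smss only launches the retry if SetupExecute is still set.
    if not setup_execute_present:
        return Outcome.RETRY_SKIPPED
    # Reboot: retry the installation once, then roll back on failure.
    return Outcome.INSTALLED if retry_ok else Outcome.ROLLED_BACK


# Non-critical work fails at shutdown and again on reboot: rollback.
print(service_pack_outcome(True, False, False))
# Same failure, but the SetupExecute value was deleted: the retry never runs.
print(service_pack_outcome(True, False, False, setup_execute_present=False))
```

The last case is the one that breaks the reboot loop: nothing is retried, so nothing fails in a loop, but nothing is rolled back by smss either.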
Additionally, the pending.xml file has a checkpoint value that tells Windows where the critical transactions end and the non-critical transactions begin. When you delete the checkpoint value in pending.xml, you are effectively marking everything in the pending operation queue as critical. Because your machine has already rebooted, Windows thinks it has nothing to do and just boots normally. The problem is that there are still operations that need to be processed, and they will never get processed, which could leave the machine in an even worse state. Doing this should be an absolute last resort. The best thing to do here is to let the failure occur later on so that a rollback can take place.
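The effect of removing the checkpoint can be illustrated with another small Python sketch. This is an assumption-laden toy: the queue is a plain list, `CHECKPOINT` is a made-up marker, and the operation names are invented. It does not reflect the real pending.xml schema.

```python
CHECKPOINT = "CHECKPOINT"  # hypothetical stand-in for the checkpoint value


def split_queue(queue):
    """Split a pending-operation queue into (critical, non_critical)."""
    if CHECKPOINT in queue:
        i = queue.index(CHECKPOINT)
        return queue[:i], queue[i + 1:]
    # No checkpoint: every operation is treated as critical.
    return list(queue), []


def ops_retried_on_reboot(queue):
    """Only non-critical operations are left to retry after a reboot."""
    _, non_critical = split_queue(queue)
    return non_critical


original = ["move file A", "set registry value B", CHECKPOINT, "move file C"]
edited = [op for op in original if op != CHECKPOINT]

print(ops_retried_on_reboot(original))  # ['move file C']
print(ops_retried_on_reboot(edited))    # []
```

With the checkpoint removed, the retry list is empty, so the machine boots normally even though "move file C" in this example was never carried out.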
Joseph Conway Senior Support Escalation Engineer Microsoft Enterprise Platforms Support
As one of the IT support professionals who's been affected, I can tell you that "letting the failure occur later" is not an option, as the machine is stuck on the fatal error screen and does not get past it to execute a rollback. Editing the pending.xml, or replacing it, was a workaround to get the machine to a point where the service pack would roll back.
While it may be true that the problem is not totally fixed by editing the pending.xml file in this manner, it does at least allow access to an otherwise 'dead' machine! It is incredible that this service pack has been allowed out to wreak such havoc without being adequately tested. What fix will now be offered to users who have had to fight their own way to the position of having implemented this less-than-total fix?
This is ridiculous...Microsoft releases something that literally breaks your machine.