Blogs

The Case of the Slow Keynote Demo

  • Comments 32
  • Likes

A couple of weeks ago I participated for the first time in the keynote at Microsoft’s Teched US conference to a room of over 5,000 attendees. Bill Veghte, the Senior Vice President of Windows marketing, led the keynote and gave a tour of the user-focused features of Windows 7, Iain McDonald, General Manager for Windows Server, demonstrated new functionality in Hyper-V and Windows Server 2008 R2, and I demonstrated IT Pro-oriented enhancements in Windows 7 and Microsoft Desktop Optimization Pack (MDOP). 

I showed features like BitLocker To Go group policy settings, PowerShell v2’s remoting capabilities, PowerShell’s ability to script group policy objects, Microsoft Enterprise Desktop Virtualization (MEDV) and how the combination of App-V, roaming user profiles and folder redirection enable a replaceable PC scenario with minimal downtime. One point I reinforced was the fact that we made every effort to ensure that application-compatibility fixes (called shims) that IT Pros have developed for Vista applications work on Windows 7. I also demonstrated Windows 7’s new AppLocker feature, which allows IT Pros to restrict the software that users can run on enterprise desktops with flexible rules for identifying software.

In the weeks leading up to the keynote I worked with Jason Leznek, the owner of the IT Pro portion of the keynote, to identify the features I’d showcase and to design the demos. We used dry runs to walk through the script, tweaking the demos and creating transitions, trimming content to fit the time allotted to my segment, and tightening my narration to focus on the benefits of the new technologies. For the application-compatibility demo, we decided to use a sample program used internally at Microsoft, called Stock Viewer, that’s intentionality incompatible with Vista and Windows 7 in ways representative of actual line-of-business software that doesn’t run without assistance on these newer operating systems. In my demo, I would launch Stock Viewer on Windows 7 and show how its trends report function fails with an obscure error message caused by incompatibility:

image

Then I’d show how I could deploy an application compatibility shim that enables the application to work correctly on Vista and then rerun the application successfully.

We also wanted to show how AppLocker’s rule creation wizard makes it easy to allow software to run based on the publisher or version if the software is digitally signed. Originally, we planned on showing AppLocker after the application compatibility demo and enabling Adobe Acrobat Reader, an application commonly used in enterprises. We rehearsed this flow a couple of times, but found the transitions a little awkward, so I suggested that we sign the Stock Viewer executable and move the AppLocker demo before the shim demo. I’d be able to enable Stock Viewer to run with an AppLocker rule and then show how the shim helps it run correctly, using it for both demos.

I went back to my office, signed Stock Viewer with the Sysinternals signing certificate and sent it to Jason. A few hours later he emailed me saying that something was wrong with the demo system because Stock Viewer, which had previously launched instantly, now took over a minute to start. We were counting down to TechEd and he was panicked because we needed to nail down the demos. I had heard at some point in the past that .NET does authenticode signature checks when it loads digitally signed assemblies, so my first suspicion was that it was related to that. I asked Jason to capture a Process Monitor trace and he emailed it back a few minutes later.

After opening the log, the first thing I did was filter events for StockViewer.exe by finding its first operation and right-clicking to set a quick filter:

image

Then I looked at the timestamps on the first item, 2:27:20, and the last item, 2:28:32, which correlated with the minute delay Jason had observed. As I scrolled through the trace I saw many references to cryptography (crypto) Registry keys and file system directories, as well as references to TCP/IP settings, but I knew that there had to be at least one major gap in the timestamps to account for the long delay. I scanned the log from the beginning and found a gap of roughly 10 seconds at 2:27:22:

image

The operations immediately before were references to the Rasadhlp.dll, a networking-related DLL, and a little earlier there were lots of references to Winsock registry keys, with accesses to crypto Registry keys immediately after the 10 second delay. It appeared that the system was not connected to the Internet and that the application was held up by some networking timeout of roughly 10 seconds. I looked forward in order to find the next gap and came across a 12-second interval:

image

Again, networking-related activity before, and crypto related activity after. The subsequent gap, also of 12-seconds, was identical:

image

In fact, the next few gaps looked virtually identical. In each case there was a reference to HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Connections immediately before the pause, so I set a filter for that path and RegOpenKey and sure enough, could easily see six gaps of exactly 12-seconds each:

image

The sum of the gaps – 12 times 6 – equaled the delay Jason was seeing. Next, I wanted to verify that the repeated attempts to access the network were caused by signing verification so I started looking at the stacks of various events by selecting them and typing Ctrl+K to open the stack properties dialog. The stack for events related to the Internet connection settings revealed that crypto was the reason:

image

One final piece of evidence I wanted to check for was that .NET was ultimately responsible for these checks. I rescanned the log and I saw events in the trace that confirmed that Stock Viewer is a .NET application:

image

I also looked at the stacks of some of the early events referencing crypto Registry keys and saw that it was the .NET runtime performing the call to WinVerifyTrust, the Windows function for checking the digital signature on a file, that started the cascade of attempted Internet accesses:

image

Confident now that the cause of the startup delay was due to .NET seeing that Stockviewer.exe was signed and then checking to see if the signing certificate had been revoked, I entered Web searches looking for a way to make .NET to skip the check, since I knew that the keynote machines probably wouldn’t be connected to the Internet during the actual keynote. After a couple of minutes of reading through articles by others with similar experiences, I found this KB article:

image

The article describes exactly the symptoms we were seeing and notes that .NET 2.0, which is the version of .NET I could see Stock Viewer was using based on the paths of the .NET DLLs it accessed during the trace, supports a way to turn off its obligatory checking of assembly digital signatures: create a configuration file in the executable’s directory with the same name as the executable except with “.config” appended (e.g. StockViewer.exe.config) containing the following XML:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
      <runtime>
              <generatePublisherEvidence enabled="false"/>
      </runtime>
</configuration>

A total of about 15 minutes since I had received Jason’s email, I sent him a reply explaining my conclusion with the configuration file attached. Shortly after, he wrote back confirming the delays were gone and expressing amazement that I had figured out the problem and solution so quickly. It might have seemed like magic to him, but I had simply used basic Process Monitor troubleshooting techniques and the Web to solve the case. Needless to say, the revised demo flow and transition between AppLocker and application compatibility came off great.

Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment
  • Mark, your posts are always amazing and I look forward to the days where one hits my RSS feed. Thanks for taking the time to share your troubleshooting techniques with us. I've used many of your methods to perform similar "magic" in much less time than my older, shotgun-style trial & error approaches. Your surgical approach to getting to the root of issues is surely making a positive impact at Microsoft. In a time where they need you and your drive for perfection more than ever, keep up the good work and try to "Right the Ship" that is MS. THANK YOU!!!

  • Excellent post. This is a common problem with E12 servers that aren't connected to the Internet (either on purpose or because of a network outage/problem) when they start up.

    I've also had it happen with .NET 2.0 ASP.NET applications that have a signed assembly loaded, in which case you have to edit machine.config.

  • Hi Mark,

    Interestingly, you could have solved the situation even faster by not using Process Monitor.  A google or live search for the symptoms would have brought you to the same knowledgebase article.

    The #1 return for keywords “signed windows application starts slowly” on Google and Live Search is the relevant KB Article.

    And it’s the #7 result if you drop the word “signed”.

  • While it's easy to use Google to search for the answer, Mark was showing us the technical procedure involved in finding the answer ourselves.  It's the thought process and deductive reasoning behind the search for the answer that makes this blog worth reading.  The reader also has an opportunity to learn something.  That's the value of doing your own research and fact finding, rather than always turning to someone else to find the answers.

  • Great stuff, thanks for taking the time to post your method.  It is GOOD to TEST 'LITTLE' CHANGES to the demo before going live!  :)

  • Funny thing.  We ran into this issue when testing our final RTM bits with the "master" CD produced at our manufacturing facility.

    It really, really sucked for a variety of reasons.

    Ultimately, because there was no other workaround at the time (a few years ago), we had to remove the Authenticode certificates from our binaries.

    This invalidated our Vista logo compliance status - which wasn't a huge deal. But it was really annoying trying to convince Microsoft that this was a severe bug in the .NET framework.

    Eventually, someone at Microsoft independently got a clue and realized this was important enough to fix.  Unfortunately, the workaround (modifying your app.config) is known only to the people who have the pleasure of having to chase this issue down.  Since generatePublisherEvidence is not defaulted to false, you can be sure that everyone signing their apps will fall victim to this error.

    So Dear Mark, I have to admit my perverse satisfaction in seeing you, a Microsoft employee, trip over this bug. Maybe the pain will induce your colleagues to consider this issue a little more important.

    But, hey, thanks for blogging :)

  • Great article Mark. However, this demonstrates how easy it is to create a security vulnerability. Just sign some "bad" code with a bogus cert, then include the .config file to turn off cert checking for the app. I certainly hope that you received some form of forced UAC message indicating that the application's cert validity wasn't being checked.

  • @WhatAboutSecurity:

    Turning off the check does not expose any security issue or create a 'vulnerability'. A security vulnerability allows for malicious software to gain execution without any authorization.

    Signature checks serve as proof-of-origin for trust decisions (e.g. installing an activeX control) and integrity check, not as any indication that the software is non-malicious, free from exploitable defects, or carrying a malicous data payload.

    The only code in general checked for signature validity during loading are ActiveX controls, .NET assemblies and device drivers. OS components are not verified except on demand.  UAC elevations are not 'security' and the signature verification performed by the consent prompt is intended primarily to encourage ISVs to sign their code.

  • I honestly wish generatePublisherEvidence enabled="false" was the default for most .NET apps. I understand the reasoning behind the verifications, but there seems that there should be a better way of going about this. As more applications become .NET based, I suspect that KB article is going to get hit more often. Anyone running Exchange 2007/2010 in a lab environment without internet connectivity already knows that KB by heart... most likely. :)

  • The Stock Viewer demo app is available for the world to use here: http://blogs.msdn.com/cjacks/archive/2008/01/03/stock-viewer-shim-demo-application.aspx. You can steal my demos too, just like Mark does! :-) Unless you digitally sign it yourself, though, you shouldn't need to apply this fix.

  • There are two thing here that I do not fully understand:

    - why does it take 12 seconds to realize there's no internet connection?

    - why does it try this six times?

  • Surprising that people in Microsoft search Web for problems related to Microsoft products, rather than searching the internal/external "Knowledge Base" articles.

  • SharePoint is suffering from this problem as well.  It really did my head in until I went through a similar exercise to troubleshoot the slowdown.

    In addition to the web.config change there are other ways around it as well, including registry changes to turn this check off for all applications in one go.

    This does not reduce the security of the system as, the way I understand it, .net is checking crl.microsoft.com to check if the certificate of the signed assembly has been revoked. As there is no internet connectivity on affected system a failed (and delayed) check is as good or as bad as no check at all.

    I have blogged about it here:

    http://www.muhimbi.com/blog/2009/04/new-approach-to-solve-sharepoints.html

    This was a difficult one to troubleshoot, if an event log entry would have been written it would have literally saved me weeks.

  • This is a particular problem for services written using the .NET Framework since if they do not start within 30 seconds then the SCM will kill them off.  Other than applying the hotfix and workaround for all our customers (or trying to force our customers to download the revocation list weekly) the only way round this is to make sure our service executables are not signed.  Of course when the SCM kills off an application in this manner the error message in the event log is abysmal.

  • To Tim Heron,

    Putting generatePublisherEvidence=>false in your app.config is the solution for you.