Lately I have been dealing with the big challenge of teaching something that I have come to know as more of an art than a science: troubleshooting virtual applications that do not function correctly in App-V. I have seen many articles that walk through how someone figured out why one particular application was failing and how they arrived at a resolution. We always appreciate these brave pioneers who save us from reinventing the wheel when we happen to have the exact same application problem. A quick Bing search turns up the hero who rescues us from our current plight. The App-V community is full of these heroes, especially in the MVP space. What we often do not know is how they went down that particular path of troubleshooting. What we usually see (and I myself am quite guilty of this) is the abbreviated, cleaned-up version of how the problem was resolved. We do not see the many paths traversed or the rationale behind the particular troubleshooting approach used.
How do you learn to do that?
A customer asked me this question bluntly. I was helping them remediate an application and they came right out and asked. This got me thinking. What about the packager who is new to App-V? How does one learn to fix a broken virtualized application, or at the very least, determine the actual root cause of the failure? Where does one begin when troubleshooting a broken application? What are the best tools to use?
Before we can answer these questions, we need to explore why a lot of people struggle with this. What makes troubleshooting a virtual application challenging is the lack of source code, symbols, and, more importantly, an application SME. Every time you troubleshoot a virtual application for the first time, you are basically having to learn the very depths of the application. I used to work escalation-level support, and many of the rules of troubleshooting and reverse engineering that I employed there also apply to virtual applications. But it is one thing to support a broken application when you have every development resource at your disposal because you happen to work for the company that makes the very software you are troubleshooting. It is another, monumental task to isolate and troubleshoot an application that comes from another vendor. Fortunately, the scope of App-V is limited to applications that run on Windows-based operating systems.
To start, let us talk about how to avoid rabbit holes. We need to focus first on what not to do. The reason people struggle and often never resolve application issues is usually tied to one or more of the following rabbit holes:
Speaking of Process Monitor . . .
So let’s look at troubleshooting virtual applications using Process Monitor. How do you learn to do it effectively? In my opinion, it boils down to combining several areas of knowledge:
Foundation: This is knowledge of the platform. In order to understand how applications break, you need to understand how applications work. In this case, we are dealing with Windows-based applications. Reading up on the design and internals of the Windows operating system, the Win32 API, the native NT API, and so on all helps to build this foundation. You do not have to know everything there is to know about the Windows operating system; a good understanding and the ability to look things up in the development references will suffice. Even if you are not a programmer by trade, reading the Windows Internals book (sometimes several times) will open your eyes to how critical elements such as processes, threads, and objects work. The Windows Internals book (by Mark Russinovich, David Solomon, and Alex Ionescu) is now in its 6th edition and has been split into two volumes due to its mammoth size. In addition, there are several references online that you can research and consult for free. The Windows API is online at MSDN (http://msdn.microsoft.com/en-us/library/cc433218%28VS.85%29.aspx) and the Windows Driver Kit (not needed as much for virtual applications) is also online at the WDK site (http://msdn.microsoft.com/en-us/windows/hardware/gg487428). You can even have an offline reference for the Win32 and .NET APIs through the downloadable SDKs (Windows 7 and .NET 4 here at http://www.microsoft.com/en-us/download/details.aspx?id=8279).
Operational: This is knowledge of the tool. Taking Process Monitor as the example, you have to be able to efficiently leverage the right feature when needed. Since Process Monitor is the tool most commonly used in troubleshooting virtual applications, it is essential that you understand all of its features. For any Sysinternals tool, I would recommend the Windows Sysinternals Administrator's Reference by Mark Russinovich and Aaron Margosis. In the case of Process Monitor, it covers everything useful about the tool. Once you know how the tool is used, you can then apply it to the right task.
Application Context: This is the context of the failure within the application. You cannot teach a tool without the right context for its use, especially when the use of the tool will vary from issue to issue. The application that yields an “UNKNOWN ERROR” must be troubleshot differently from an application that performs poorly, crashes, freezes, disappears without a trace, etc. I find this to be the most important skillset: the “X” factor, or so-called “Art” of troubleshooting. Those who are great at this tend to know the context and are thus able to isolate problems more quickly.
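To make the operational piece a little more concrete, here is a minimal sketch of one way to take a first pass over a Process Monitor trace outside the tool. It assumes the trace was saved as CSV with the default columns (Time of Day, Process Name, PID, Operation, Path, Result, Detail). The process name, the sample rows, and the set of "suspect" results are made up for illustration, and which results actually matter depends on the application context described above.

```python
import csv
import io
from collections import Counter

# Result values that often point at a missing or blocked resource.
# This set is an illustrative assumption, not an exhaustive list.
SUSPECT_RESULTS = {"NAME NOT FOUND", "PATH NOT FOUND", "ACCESS DENIED"}


def suspect_events(csv_text, process_name):
    """Return the rows for one process whose Result is in SUSPECT_RESULTS."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if row["Process Name"].lower() == process_name.lower()
        and row["Result"] in SUSPECT_RESULTS
    ]


def summarize(rows):
    """Count suspect events by (Operation, Result) to suggest where to look first."""
    return Counter((row["Operation"], row["Result"]) for row in rows)


# Hypothetical rows standing in for a real CSV export of a trace.
SAMPLE = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:00:00.0000001","app.exe","1234","RegOpenKey","HKCU\Software\Contoso","NAME NOT FOUND",""
"10:00:00.0000002","app.exe","1234","CreateFile","C:\ProgramData\Contoso\app.ini","ACCESS DENIED",""
"10:00:00.0000003","app.exe","1234","ReadFile","C:\Windows\System32\kernel32.dll","SUCCESS",""
"10:00:00.0000004","other.exe","4321","CreateFile","C:\Temp\x.tmp","NAME NOT FOUND",""
'''

if __name__ == "__main__":
    rows = suspect_events(SAMPLE, "app.exe")
    for (operation, result), count in summarize(rows).items():
        print(f"{operation:12} {result:16} {count}")
    for row in rows:
        print(row["Operation"], row["Path"], row["Result"])
```

With a real export, you would read the file's text (opening it with encoding="utf-8-sig" tolerates a byte-order mark if one is present) and pass it to suspect_events. Keep in mind that many NAME NOT FOUND results are perfectly normal probing by the application, so a list like this is a starting point for isolation rather than a diagnosis.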
You may find this overwhelming at first . . .
. . . and it will take practice and repetition. But as with any “art,” the improvement will come and you will begin to develop your own style.
Sequencing is not Virtualization and Virtualization is not Sequencing
A final note: There are big differences between “sequencing problems” and actual virtualization issues, where a sequenced application is not functioning as expected. Virtualization problems are often labeled as “sequencing problems” when in fact no problems occurred during sequencing. The problem occurred *AFTER* sequencing. Now, it is true that the resolution to a virtualization problem may result in a modification to the sequencing process (i.e., a custom recipe). True “sequencing” issues rear their ugly heads during the actual sequencing of an application.
Great article, Steve! I have my own strategy too. Like you said, when there's no SME for the application itself it can be challenging; that's when you need to do some reverse engineering and use the 'oul noggin.
Very useful information!
We do a lot of application packaging and we've developed our own troubleshooting tool: SpyStudio. Process Monitor lacks user-mode information, and the errors that it shows are not meaningful. With SpyStudio, you can see a lot of user-mode information such as COM object and window creation, environment variable queries, application exceptions, and other information. In addition, you can compare two traces, which is very useful when you can trace a working application running on the base and the faulty one in the virtual environment.
Thank you very much for your information and insight!