***Updated on Sep 20th 2012…see below***
Unlike many software products, where most errors are presented by the application’s user interface, many of the issues you might encounter have little or no visual indication of what’s going on. In those cases, you might have no choice but to go under the hood, and ask UAG to perform a “trace”. A Trace sets the software to write a very detailed log of what it is doing. It logs functions, values and errors, which often helps you see what’s really going on. For example, if there’s a problem with a server’s certificate, UAG may not be able to create a secure communication channel and show a very generic error like “An unknown error occurred while processing the certificate”. In this type of situation, a trace would be essential for understanding what the problem is.
Keep in mind, though, that a trace is far from easy to read. In fact, the sheer volume of data in a trace can be overwhelming. A busy server may generate 600,000 of lines of trace data in a single minute of activity! Add to that the fact that it contains application-developer level data, and you can see how this can be scary. Often, Microsoft CSS support will ask you to generate such a trace, as the only means to investigate a problem, so knowing this could be useful. If you are feeling adventurous, you might even want to try reading a trace on your own.
For those of you who are veterans of IAG or EGAP, you probably remember some procedure involving editing some XML files or editing a registry key. With UAG, these days are over, as the server includes a friendly and visual tool that’s a lot easier to use. To launch the tool, navigate to the folder c:\Program Files\Microsoft Forefront Unified Access Gateway\common\bin\tracing. In that folder, launch the file LaunchHTA.VBS. Note that this needs to be run as an administrator to function (if you are not, you open a command prompt using the “run as” option, launch it as administrator, and then run the LaunchHTA.vbs script). The trace tool looks like this:
All these items on the long list on the left are names of various UAG components that you can trace. By default, tracing is active on a UAG server all the time, but only at a basic level. To get more details, we’ll have to reconfigure it to provide more data.
When you launch this utility, tracing will be ON, so we first need to turn it all off by scrolling all the way down and clicking “stop”:
You must be wondering which of the various components you should check. The answer to that is everything! This might seem wasteful, as selecting more components results in more data being recorded in the trace, but the reality is that one rarely knows exactly what they are looking for. Also, the components whlfilter, whlcppinfra and whlfiltsecureremote alone are the most traffic-heavy components, so you won’t gain much by checking any of the others…they will have little effect on the size of trace. Also, these three components are a critical part of almost any UAG operation, so we almost always need them on. Note, however, that a busy server can generate a large trace very quickly - 10 seconds of tracing might yield an 80 MB file (which is about 400,000 lines of text, or 6 copies of Tolstoy’s “war and peace”).
To select everything quickly, check the top box under “noise”. If you expect to need to gather a very long log, you might want to uncheck the option “Circular log file of ___ Mbytes”.
When ready, start your trace, and reproduce the issue you want to investigate. When done, click “stop” to stop logging, and retrieve the log file, which would be located by default in c:\windows\debug, and named Forefront-UAG.bin. Be sure to at least move it, because it will get overwritten if you turn on tracing again – better safe than sorry.
The trace file that you just generated, you might soon discover, is unreadable! It’s actually binary, and to read it, you have to perform a decryption using specially created TMF files that Microsoft distributes. These can be downloaded here. Get the file UAG_TMF_files.zip, and expand it to some folder on your hard drive. It’s possible to configure the tracing tool to do the decoding as part of the tracing, by specifying a location of the TMF files in the box “trace format search path”. Then, once tracing has completed, click “text” under it to decode the file and show it. You can also decode the BIN file at a later date, or on another computer using the TRACEFMT tool, which can be found in the same folder (\common\bin\tracing) on the UAG server. To do so, open a CMD prompt, and run the tracefmt tool using the following parameters:
tracefmt.exe <trace bin file> -p <path to extracted TMF files> -o <Target TXT file>.
The resulting file (UAGLog_09_02_10.txt in the above example) would be a simple text file that you can open with Notepad, or your favorite text editor (mine is MetaPad). Keep in mind that the file grows significantly in the decoding process. The above file was 1.5 MB in binary form, but 4 MB as text – expect the text file to be about X2-2.5 the size of the binary, and make sure your text editor can handle it. If not, you can use a file splitting tool to break it to chunks. Here’s an example of what you can expect to find in a log:
Don’t expect me to teach you in a single blog entry how to read a full log, but here is some info. The above piece depicts the start of a request from a UAG client to the UAG server. Each line has a time-stamp, down to the millisecond, followed by the component name (whlfilter in this example), and a function name. Then comes the name of the source file and the line number of where the function is…you don’t have access to the source so you can’t read the full function, of course. Then, after “info:” comes what the data that this function recorded. At the top, you can see the details of this request – the protocol (HTTPS), HTTP method (get), the URL and then more info, including the cookies, hostname of the server, user agent etc.
Below all this, you see the UAG filter start processing the session info, so it can match the request to an existing user session (or fail to do so). On some lines you see a PFC code, which is a unique number assigned to this request, and allows you to track it along the trace. Later on, you can see the header encoding being handled and finally, at the bottom, the application that was identified for this request (Internal Site). Naturally, this is not the end, and this request goes on much longer – several dozen pages, and this is very normal.
Trying to understand all these functions and jargon can be frustrating, no doubt, but with time, you’ll learn to recognize repeating patterns. You can also search the text for keywords like “Error” or “fail”, if you are looking to troubleshoot a specific error. You can do a search for the expression “request from client to filter” to find the URL you are looking to troubleshoot. Each request to UAG has 4 stages:
1. request from client to filter
2. Request sent from filter to web server
3. Response from RWS to filter
4. Response from filter to client
This means that depending on where you see an error, it can indicate if the problem is with UAG itself, or perhaps with what the backend server is sending. Sometimes, UAG will not complete all the steps. For example, if the request violates some access policy, or leads to a 500 error, UAG may not even send it to the backend server. However, even in such a case, the processing of the request will end with a line saying “Info:empty request context”, like this:
Once a request has concluded, the thread is returned to the queue, which also means the PFC code that was used may be reused on subsequent request, so if you need to examine a specific request, make sure you isolate it from subsequent requests that use the same PFC. Keep in mind that request are processed in order, but they can be intertwined. This means that even if you “cut out” a request, it may contain lines that relate to other requests that have been going on during.
Issues you might run into when running a trace would typically be about the trace.hta file not having the permissions to start the trace. This could happen if you are not logged into the UAG server as a local administrator, so make sure you are running the tool as one. Another common issue is that the trace file has many or all of the lines saying “No Format Information Found”:
This typically happens because the decoder couldn’t find the appropriate TMF files, or some of them. This could happen if the file you download is corrupted, did not open fully, or is for a version that’s different than yours. The link I suggested earlier is for UAG Update 1, but if you have installed a later update (as of writing, U1 is the latest, but that won’t be like that for long…), you would need updated TMF files. It’s also possible that when specifying the path to the TMFs, you misspelled it or specified an upper-level path (the decoder cannot search subfolders).
Last thing to know is that a similar procedure can be done on the client – the same utility is there, and the TMF files package includes data for client-side tracing. The trace utility would be at c:\Program Files\Microsoft Forefront UAG\Endpoint Components\3.1.0\trace.hta , and works the same as on the server.
***Updated on Sep 20th 2012: I cleared up some confusion about which components to select when taking a trace.
To get the TRACEFMT tool with the command-line parameters you mention, I had to turn to the WinDDK and install the tracing tools. Otherwise great article. Thanks!
Thanks for the article. Is there any way we can identify the originating client?
Yes, we can, as the trace specifies both the client IP and, if the user logged in, the user name too. I'm afraid I can't be more specific here.
Any sign of the TMF files for SP3?
To Simon: the product team will release them when they are ready.