Part 1: the problem.
With UAG, you can provide remote access to many types of applications:
· Web applications: the “client” is in this case a simple browser, and the dialog goes through UAG, which acts as the reverse proxy.
· “TCP” applications: the “client” of the application is an executable installed on the client machine, and it creates a “TCP” connection with the backend application. These are also called client/server applications.
· Terminal Server or Citrix: the data transported through UAG is in fact just the screen/mouse/keyboard data.
· Virtual Applications (VDI, MDOP).
As you can see, you can publish virtually any kind of application through UAG and, more importantly, the user will not even know which type of application is running on the backend.
After authentication, the user will reach the UAG Portal where all the applications are listed. The user just has to click one of them, and the appropriate UAG sub-technology will take care of the connectivity.
Figure 1 - the UAG portal page
Let’s now focus on web applications, which are the subject of this article.
You have probably noticed that “sometimes”, publishing a Web application through a reverse proxy fails.
As users, we may encounter different kinds of behavior such as error messages, portions of the page not displayed, broken links, or buttons that are supposed to be clickable but do nothing… Why do we see these behaviors? It should work like a charm!
In a set of 2 blog articles (where I explain the problem and the approach to fix it) and a dedicated section on my blog (where I explain real-world examples and how to fix them), I will try to explain what is going on behind the curtain, and “humbly” try to provide guidance to address such problems.
Here is the approach I propose:
· Once we understand the technologies and the problems, part 2 explains how to use UAG to fix the problem, mainly by teaching UAG how to do so with the AppWrap/SRA engine.
· Part 3 is in fact a link to my blog. Using post tags, this link groups all the posts related to this subject, covering problems I have encountered in my day-to-day activity (so examples from the field).
o You can bookmark this link: http://blogs.technet.com/fesnouf/archive/tags/howTo-Filters/default.aspx
First of all, let’s take a few minutes to understand the basic concepts of web-related technologies, in order to understand why you may experience this problem.
We need to understand how a Web page is displayed in a browser, and for this we need to understand a little bit of HTML. Look at this screenshot:
Figure 2 - Microsoft Expression view, code + rendering
This screenshot is taken from Expression, Microsoft’s HTML editor. It shows 2 sections: one contains the HTML code (top), and the other one (bottom) contains the “rendering” of this HTML code.
This is exactly how a browser works: it downloads a file (the .HTML file, using the HTTP protocol) and “renders” it, which means it takes each line of HTML sequentially (from top to bottom) and displays the result of this “syntax”. For example, when IE sees “<STRONG>”, it knows that the following text has to be displayed in “bold” characters.
In this example you can see how an HTML page is structured:
· The “HEAD” of the page, which contains many things such as keywords that search engines use to index this page.
· The “BODY”, which contains the HTML code that will generate our nice (or ugly) web page when rendered by the browser on the client.
In this example I used different html techniques to display data on the screen:
· Basic Text (Welcome)
· A hyperlink (<a> tag with an href attribute)
· A picture (Microsoft Logo)
· Text in bold or red
How can we make this page more “interactive”? In your HTML code, you can add some “code” to create any kind of user experience. As soon as you have the opportunity to inject some code, you can, as a developer, create any kind of action on the client side, and potentially create a very nice interface and a very nice user experience.
The good thing is that the only limit is your imagination, but the side effect is that this part of the page is no longer predictable (which has an impact on reverse proxies, including UAG).
IE uses a script engine (an interpreter, implemented as a .dll) to execute this code on the client side. This script engine will certainly detect syntax errors, but not code that does weird things or that does not work in some special scenarios.
To add script to the page, you can either put the code exactly “where” you want it to create its effect (remember, IE renders the page sequentially, from TOP to BOTTOM), for example to write text as in the first example below, or add your function in the HEAD section of the page and “call” it where you want to generate the effect (the open_win function below is called when the user clicks the “Open Window” button).
Figure 3 - Create your function (source w3schools.com)
Figure 4 - Call your function (source w3schools.com)
Most of the people who read this article already know HTML and scripting, mostly because they are not new at all. We have known them for years!
The way a “Framework” works is very simple. You install the “Framework” in a directory of your web application, you reference its script files in the HEAD of your pages, and then you just need to “call” its functions from your own code, again, where you want them to do the job.
In the screenshot below you can see an extract of an HTML page that “includes” (via the <script> tag in the HEAD) some .js files. Each file contains a lot of functions that the developer can then use at any time.
Figure 5 - how to add .js files in your page.
This “Framework” approach is very interesting because creating applications is quicker (you don’t reinvent the wheel), but on the other hand it transforms a basic “HTML PAGE” into something very complex, composed of a lot of sub-components.
Note: we all have this image of web applications in mind: a simple application that just requires a browser. How simple! In fact, it is more complex.
As I said previously, we may experience a problem when we publish an application through a reverse proxy. This means that “something” in the downloaded pages (HTML, HTML + code, or Framework) is causing a problem when rendered on the client. More precisely, the application works great over a direct connection (from the corporate network), but “breaks” when a reverse proxy is in the middle.
Which code is causing the problem? And how do we fix it? That will be our challenge.
Computers and software are very good at automating things that they know (predictable), but they are not very “smart” at identifying things that they are not aware of. And by definition, the development model we described in the previous sections (code, framework) has many things that make it unpredictable: multiple “frameworks” on the market, and many programmers coding on top of them.
As a result, you may experience situations where the code created by the programmer “breaks” through a reverse proxy. The main reason is that the developer did not code the application to be published through a reverse proxy. A good example is “links” inside the page. Internally you access the application via http://financeApplication, but from the internet it is called https://FinanceApplication.extranet.company.com. A reverse proxy can rewrite the links if they are part of the HTML (href attribute = predictable), but it cannot touch the code that generates such a link at run time.
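To make the “predictable vs. not predictable” idea concrete, here is a minimal Python sketch. It is only a toy rewriter that I wrote for illustration, not the way UAG works internally: the function name and the sample page are assumptions, only the two host names come from the example above.

```python
import re

INTERNAL = "http://financeApplication"
EXTERNAL = "https://FinanceApplication.extranet.company.com"


def rewrite_href_links(html: str) -> str:
    """Rewrite internal links, but only inside href="..." attributes."""
    pattern = re.compile(r'(href=")' + re.escape(INTERNAL), re.IGNORECASE)
    # Keep the 'href="' prefix and swap the internal host for the external one.
    return pattern.sub(lambda m: m.group(1) + EXTERNAL, html)


# Hypothetical page: one static link, one link built by script at run time.
page = (
    '<a href="http://financeApplication/report.aspx">Report</a>\n'
    '<script>var url = "http://" + "finance" + "Application" + "/export.aspx";</script>'
)

rewritten = rewrite_href_links(page)
print(rewritten)
```

The static link now points to the external name, but the script still assembles the internal name in the browser, where the proxy can no longer help.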
As a result of this breakage, the user experience can vary widely (it is not predictable either): error messages, parts of the page not displayed, etc.
Sitting and crying is not the solution! Let’s keep digging and see how we can fix it with UAG.
As I mentioned previously, since the “cause” of the problem is not predictable, UAG cannot “magically” fix this problem for you by itself. In fact, we need to teach UAG how to fix it, and UAG is designed to handle such problems.
The first thing we need to do is capture the traffic generated by the breaking application and try to figure out what is causing this behavior.
Let’s now look at how HTTP works.
The client talks to UAG using the HTTP protocol. This protocol is based on multiple “request/answer” exchanges. For example, when the browser wants to download the welcome page of my application, it sends UAG a “GET /welcome.htm” request. This “GET” verb asks UAG (or any kind of web application) to send back this document.
The HTTP request is composed of several things:
· The verb: GET, POST (used to send data), or another one
· The “path” to the document (the document to download in the case of a GET, or the application page (asp, aspx) we want to send data to in the case of a POST)
· The HTTP header, which contains several values, for example the type of browser used by the client, so the published application can “adapt” the page it sends back.
· If this is a POST, the request may also contain the data you want to send to the application, for example a login name and a password.
Figure 6 - HTTP request
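If you prefer code to screenshots, the structure of a request can be sketched in a few lines of Python. This is a simplified parser for illustration only; the host name, browser name and form fields are made up, and a real HTTP parser handles many more cases.

```python
def parse_http_request(raw: str) -> dict:
    """Split a raw HTTP/1.1 request into verb, path, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    verb, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"verb": verb, "path": path, "version": version,
            "headers": headers, "body": body}


# Hypothetical POST sending a login name and a password.
raw_request = (
    "POST /login.aspx HTTP/1.1\r\n"
    "Host: portal.example.com\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 29\r\n"
    "\r\n"
    "login=alice&password=secret12"
)

request = parse_http_request(raw_request)
print(request["verb"], request["path"])
print(request["headers"]["User-Agent"])
print(request["body"])
```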
In return, the web application answers that request. This answer contains:
· An HTTP status code, which tells the client whether the request worked (and if not, which error occurred). Status 200 means success.
· An HTTP header, this time with some information about the web application.
· The requested data, in this example the “HTML” code of the welcome.htm page requested with GET.
Figure 7 - HTTP response
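The answer can be sketched the same way. Again this is a simplified Python illustration with a made-up server name, not a complete HTTP implementation; the only fact taken from above is that 200 means success (more generally, the 2xx range).

```python
def parse_http_response(raw: str) -> dict:
    """Split a raw HTTP/1.1 response into status, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"version": version, "status": int(status), "reason": reason,
            "headers": headers, "body": body}


def is_success(status: int) -> bool:
    """2xx codes mean the request worked."""
    return 200 <= status < 300


# Hypothetical answer to "GET /welcome.htm".
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Server: ExampleServer/1.0\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body>Welcome</body></html>"
)

response = parse_http_response(raw_response)
print(response["status"], response["reason"], is_success(response["status"]))
```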
This dialog is the same whether the client talks directly to the web application or UAG sits in the middle. In the second scenario, the client talks to UAG, UAG talks to the application “on behalf of the client”, and the answer takes the same “2 hops” on the way back.
We will need to look at the HTTP traffic in order to fix the problem, since this HTTP traffic transports the HTML and code responsible for the bad behavior.
To do so, launch your HTTP analyzer (in this article I use HttpWatch, but Fiddler is very similar), start the capture, and use the application until you hit the problem.
Here is an example of what you can see with Httpwatch:
Figure 8 - The HTTPwatch console after tracing
The structure of the console is the same as Network Monitor. In the green section of the screenshot, you can see all the requests/answers (HTTP transactions). If you select one line, then the bottom section of the screen (note the tabs on the top of this section: headers, cookies, Cache, … used to go deeper in the http analysis) shows the details of this particular dialog.
As a “quick” example of a breaking application: you can notice in this screenshot that the capture quickly revealed a very strange behavior. You have two “ERRORs” in the middle of the transactions. This is a real example:
· User experience: part of the page is not displayed + error messages.
· What you see in the trace: URLs are sometimes weird: Https://:/. This means nothing to a browser.
· Reason: there is no obvious reason for that, but we can “feel” that it is due to “code” not working well through the reverse proxy. Probably somewhere we have code such as URLtoConnect = “https://”+MyVar1+MyVar2, and these variables are probably empty or bogus.
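Here is a small Python sketch of how such a URL can appear. The build_url function is purely hypothetical (I do not know the real code of that application); it only shows that concatenating empty variables gives exactly the “https://:/” seen in the trace, and that such a URL has no host name.

```python
from urllib.parse import urlsplit


def build_url(host: str, port: str) -> str:
    # Hypothetical stand-in for the page script: "https://" + MyVar1 + MyVar2
    return "https://" + host + ":" + port + "/"


def looks_bogus(url: str) -> bool:
    """A URL without a host name cannot be requested by a browser."""
    return urlsplit(url).hostname is None


good = build_url("portal.example.com", "443")
bad = build_url("", "")  # the variables were never filled in

print(good, looks_bogus(good))
print(bad, looks_bogus(bad))
```

When you see this pattern in a capture, the next step is to search the downloaded scripts for the place where the URL is assembled.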
Now that you have captured the traffic that is causing the problem, you will need to analyze it. Again, “think” like a browser!
Based on my experience, a good start is to just look at obvious errors. I mean errors that are identified by the analyzer itself:
· Names that are not resolved: most of the time the page contains some “internal” URLs or server names that cannot be resolved from the internet.
· Missing files: because links are not correct.
· HTTP errors
If you can’t fix these problems in the application (or server) configuration (at the end of the day, an application should be compatible with reverse proxy scenarios) and need to change the code itself on the fly as it goes through UAG, take a look at “part 2” of this article, where I explain how to tell UAG to do this job for you using a “powerful” internal UAG engine named AppWrap/SRA.
The first time you hit this problem, you have the feeling that it is extremely complex to identify what is going on. Even if I hope that these “articles” will help you analyze the cause of the problem, getting some help is definitely the best approach. It usually takes 1 day of training to feel more comfortable with these situations, so if you feel lost, do not hesitate to ask for help! The UAG community is here for this, either directly through your local presales team or through the UAG forum: http://social.technet.microsoft.com/Forums/en-SG/forefrontedgeiag/threads.
Understanding application (Web) complexity: part 2 Appwrap/SRA
UAG acts as the reverse proxy in the dialog, so it sits between the client and the web application.
UAG is a very advanced reverse proxy. Whereas “technical” reverse proxies (TMG, others) were invented to provide “Web caching”, UAG is designed to provide remote access in a secure way (based first of all on risk analysis).
We will use the AppWrap/SRA engine to “fix” the problems we “identified” previously during the reverse engineering phase (part 1 of this article).
Based on my experience, 80% of the job is to identify the problem and understand how to fix it (reverse engineering phase). Once we understand the problem, we tell UAG how to fix it using AppWrap/SRA, and this phase is pretty quick.
A few years ago, when I moved to Microsoft and started to work with the IAG team (that was the name of the product before it was renamed Unified Access Gateway), I asked them this question: “Why do you have two technologies to do this job, one named Application Wrapping and the other one Secure Remote Access (SRA)?” First, the names are not really explicit, and second, it is confusing to have 2 technologies.
The answer was “I don’t know, due to history”.
So this is a fact: we have two configuration files available in UAG to modify web applications (one for the SRA technology, one for AppWrap). One day you will use one, another day you may use both.
Here is the “second” fact: since the question cannot be answered, the right question to ask is rather “what can each technology do for me?” Then you select the one(s) you need to fix your problem. In the UAG advanced administration guide (which is more a reference guide than a “how to”), both of them are described in detail. As an example, SRA (compared to AppWrap) can go deeper in cookie manipulations.
In the previous article, I tried to explain why a web application could break through a reverse proxy, and to provide a basic understanding in order to analyze a particular situation. In this second part (we assume we know why the problem takes place), we will see how to teach UAG to fix it.
Step 1 is to create a configuration file in the appropriate directory. This screenshot shows where to put these configuration files (in my scenario I want to link these AppWrap/SRA files to my “Portal1” portal). If the customupdate directory does not exist, you can create it.
Figure 9 - AppWrap/SRA folders and files
· The “whlFiltAppwrap_Https.xml” file is used when you want to use the “Application Wrapping engine”.
· The “whlFiltSecureRemote_HTTPs.xml” is used for SRA (which means SecureRemoteAccess).
This screenshot below is extracted from the UAG admin guide. As you can see we have many “areas” where we can ask UAG to modify things “on the fly” for us:
Figure 10 - "Manipulations" available in UAG
The most frequent keyword is the “DATA_CHANGE” used to modify the payload.
Let’s use now a basic example in order to understand the structure of this XML.
Figure 11 - "All in one" example
In this capture, I tried to show both the XML file and the online documentation, in order to understand how they work together. Notice that the documentation shows the DATA_CHANGE->SAR->REPLACE hierarchy, and that we find the same XML hierarchy in the file.
Our first example is ready, let’s “read” and understand it:
· You always start with the <APP_WRAP> and <MANIPULATION> tags: the best way to do this is to start from one of the samples.
· Then you tell UAG what you want to do, here DATA_CHANGE (modify the payload of the transaction).
· Then you tell UAG on which page (you give the URL using regex) you want to do this job: a page named /directory/welcome.html.
· Then what to do. Here we do a “Search And Replace” (SAR) :
o Search for “AppTitle”
o Replace it with a UAG “internal” variable named WhlSessionTimeout.
As a result, when UAG sees that URL, it searches for the “AppTitle” text and, if it finds it, replaces it with the value of this UAG internal variable.
· <SEARCH encoding="base64">QXBwVGl0cmU=</SEARCH>
=> QXBwVGl0cmU= is “AppTitre” encoded in Base64 (the string “AppTitle” would be encoded as QXBwVGl0bGU=)
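If you don’t have a Base64 tool at hand, a few lines of Python do the job. This is just a helper sketch of mine (the function names are not part of UAG), assuming the text is plain ASCII/UTF-8. It also shows that the value quoted above decodes to “AppTitre”, so always decode your strings to double-check what UAG will really search for.

```python
import base64


def to_base64(text: str) -> str:
    """Encode a search/replace string the way the encoding="base64" attribute expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_base64(encoded: str) -> str:
    """Decode a Base64 value back to clear text."""
    return base64.b64decode(encoded).decode("utf-8")


print(to_base64("AppTitle"))        # value to paste in the SEARCH tag
print(from_base64("QXBwVGl0cmU="))  # what the sample value really contains
```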
TIP: as soon as you have finished and saved your XML file, double-click it. IE will open it and report an error if the structure of the XML is not correct (if you made a mistake such as forgetting an XML tag). If you skip this check and there is a problem with the XML (a missing tag, for example), UAG will not raise any alert during the activation phase and will simply ignore your configuration. You may then spend “hours” figuring out why your hello world example is not working.
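The same check can be scripted. Here is a minimal Python sketch using the standard XML parser; note that it only verifies that the XML is well formed (every tag closed and properly nested), exactly like the double-click in IE. It does not know the UAG schema, so it cannot tell you whether UAG will accept your tags.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text: str):
    """Return None if the XML is well formed, otherwise the parser error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


good_xml = "<APP_WRAP><MANIPULATION></MANIPULATION></APP_WRAP>"
bad_xml = "<APP_WRAP><MANIPULATION></APP_WRAP>"  # closing tag forgotten

print(check_well_formed(good_xml))
print(check_well_formed(bad_xml))
```

To check a file on disk, read it and pass its content to the function, or call ET.parse() with the path of your configuration file.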
TIP: never change more than 1 thing (in your AppWrap/SRA files) at a time. Change your XML, activate the configuration, test the result and, if it is OK, move on to the next change.
=> Modifying several things at a time is the best way to mess up your entire XML file and lose a lot of time.
This part is taken from a 1-day training I had the opportunity to deliver to several customers and partners.
During this training, I always use a very simple page where students can test any kind of modification that AppWrap/SRA can do. All the examples below are based on this HTML page:
Figure 12 - Training HTML page, the code and the rendering
Here is the HTML, and the page is named “fulldemo”:
<meta content="fr" http-equiv="Content-Language" />
<meta content="application/training; charset=utf-8" http-equiv="Content-Type" />
<title>TITLE OF THE PAGE</title>
<meta content="This is the keywords" name="keywords" />
<meta content="This is the PAGE DESCRIPTION" name="description" />
<p class="style2">WELCOME - IAG TRAINING</p>
<p>This is a link : <a href="http://www.microsoft.com/">www.microsoft.com</a></p>
Don’t forget to install HttpWatch or Fiddler on your client machine. We will constantly use these tools to capture and analyze the problem, and if you play with these examples, you will see the modifications at the HTTP level (cookies for example).
In this exercise, we want to do a basic “text” change.
We want to change the title of the page, which is “WELCOME”, into something else, for example “TheNewText”. Here is the AppWrap you have to set if you use IAG (I give the IAG example since there are a lot of examples on the internet, and I will then show the difference with UAG):
<!-- Example 1: change the title of an application, for example to display the version of the filter. Filename is whlFiltAppWrap_HTTPS -->
<APP_WRAP ver="3.0" id="RemoteAccess_HTTPS.xml">
<!-- We change any kind of TEXT information in the HTML page -->
<REPLACE encoding="" using_variables="false">TheNewText</REPLACE>
For those of you who had the opportunity to play with IAG: UAG introduces a new format for the configuration file. I advise you to read this blog post: http://blogs.technet.com/b/edgeaccessblog/archive/2009/11/17/appwrap-in-uag-what-s-new.aspx. As a result, the example above will fail on UAG. You have to modify a tiny thing: add a new XML tag named <MANIPULATION_PER_APPLICATION>, and provide the application type for which you want this manipulation to be done with another tag: <APPLICATION_TYPE>.
This configuration file will work on UAG (and fail on IAG of course):
Application type is one of the questions asked in the Web Publishing wizard. If you forgot what you set, edit your application in the UAG GUI, and the application type will appear in the title of the application. The value is case sensitive:
Figure 13 - UAG console with Application type
Also, if you run a trace (“trace.hta”) to see how UAG sees this application from the inside, you will find a log line like the one below (search for the keyword “ProcessHeaderFromBrowser”). It shows the “AppType”.
13e0.ef0 06/01/2010-15:11:57.664 [whlfiltruleset ProcessHeaderFromBrowser Info:ProcessHeaderFromBrowser() : AppID=8C1FA275B7E2448A895E8B10C6122BC6 AppType=SharePoint2007AAM AppName=BPOS MOSS
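If your trace is large, a small script can pull the application type out of such lines. This Python sketch is based only on the single log line shown above; I assume the other lines follow the same “AppID=… AppType=… AppName=…” layout, which you should verify on your own traces.

```python
import re

LOG_LINE = (
    "13e0.ef0 06/01/2010-15:11:57.664 [whlfiltruleset ProcessHeaderFromBrowser "
    "Info:ProcessHeaderFromBrowser() : AppID=8C1FA275B7E2448A895E8B10C6122BC6 "
    "AppType=SharePoint2007AAM AppName=BPOS MOSS"
)

# AppName may contain spaces, so it takes everything up to the end of the line.
FIELDS = re.compile(
    r"AppID=(?P<app_id>\S+)\s+AppType=(?P<app_type>\S+)\s+AppName=(?P<app_name>.+)$"
)


def extract_app_info(line: str):
    """Return a dict with app_id, app_type and app_name, or None if not found."""
    match = FIELDS.search(line)
    return match.groupdict() if match else None


info = extract_app_info(LOG_LINE)
print(info["app_type"])
```

The app_type value is the one to copy (respecting the case) into the <APPLICATION_TYPE> tag.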
Warning about these samples: if you copy an example from the web (this article, blogs, …), some characters may be transformed during the paste and could break your XML file, especially the “double quote” character. Again, always double-click the XML file once modified in order to ask IE to verify the structure, and fix what is wrong before the UAG activation.
Once this XML file is created in the appropriate directory and you have verified that the XML structure is OK, you need to activate the configuration in the UAG GUI. Also launch the activation monitor in order to know exactly when the new configuration is “On”, and test the result by connecting to the application via UAG.
In this example, we want to do 2 changes at the same time on the same page.
So there is only 1 line that defines the URL, but there are TWO <SAR> manipulations.
<URL case_sensitive="false">/samples/fulldemo\.htm</URL>
Note: in this example I show only the DATA_CHANGE “part” of the XML file. I will use the same approach for the other examples.
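To visualize what “one URL, two SAR” means, here is a small Python simulation. It is only my mental model of the engine, not UAG code: I assume the URL pattern is matched as a case-insensitive regex anywhere in the path, and that the SAR pairs are applied one after the other. The second search/replace pair is invented for the demo.

```python
import re

# One URL pattern, two search-and-replace pairs (the second pair is hypothetical).
rule = {
    "url": re.compile(r"/samples/fulldemo\.htm", re.IGNORECASE),
    "sar": [("WELCOME", "TheNewText"), ("IAG TRAINING", "UAG TRAINING")],
}


def apply_rule(url_path: str, body: str, rule: dict) -> str:
    """Apply every SAR pair to the body, but only if the URL matches."""
    if not rule["url"].search(url_path):
        return body
    for search, replace in rule["sar"]:
        body = body.replace(search, replace)
    return body


body = '<p class="style2">WELCOME - IAG TRAINING</p>'

print(apply_rule("/samples/fulldemo.htm", body, rule))
print(apply_rule("/samples/other.htm", body, rule))
```

The first call changes both texts, the second one leaves the page untouched because the URL does not match.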
In this example I would like to modify the HTTP HEADER.
When UAG is talking to the web application, I want UAG to add a new value to the HTTP header: “MyHeaderReq=On”.
On the way back, between UAG and the client, I want to add “MyHeaderResp=On”.
<!-- We add an HTTP header in the RESPONSE, which means between IAG and the client -->
<!-- We add an HTTP header in the REQUEST, which means between IAG and the Web Application -->
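The two directions are easy to mix up, so here is a tiny Python simulation of what we are asking for. The functions are hypothetical and only model the expected result; they do not show how UAG implements it. The header names are the ones used in this example.

```python
def forward_request(client_headers: dict) -> dict:
    """Headers the proxy sends to the web application (request direction)."""
    headers = dict(client_headers)  # copy, the client's headers stay untouched
    headers["MyHeaderReq"] = "On"
    return headers


def forward_response(app_headers: dict) -> dict:
    """Headers the proxy sends back to the client (response direction)."""
    headers = dict(app_headers)
    headers["MyHeaderResp"] = "On"
    return headers


from_client = {"User-Agent": "ExampleBrowser/1.0"}
from_app = {"Content-Type": "text/html"}

to_app = forward_request(from_client)
to_client = forward_response(from_app)
print(to_app)
print(to_client)
```

With an HTTP analyzer on the client you will only see MyHeaderResp; to see MyHeaderReq you need to capture the traffic on the application side.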
I had to work with a partner on a case involving software named “Sage” (finance, accounting). During the SSO phase, UAG was able to “inject” the login and password into the page, but for some strange reason the “data” disappeared instantly once the page was loaded on the client.
The way we fixed this problem was very simple. We just removed the “call” to the function executed when the page loads. The simplest way is the best: I don’t want to change the code of the function itself, since I don’t know why it is there or what the side effects of changing it could be.
Figure 14 - Clear text in remark, and corresponding base64
Note: you can use a text editor such as notepad++ or other that contains base64 encode/decode functions.
Here is the full appwrap. We used in both SEARCH and REPLACE the base64 encoding tag.
<!-- SEARCH <body leftmargin="0" topmargin="0" onLoad="javascript:void(0);doOnLoad(false)"> -->
<!-- REPLACE WITH <body leftmargin="0" topmargin="0"> -->
<REPLACE encoding="base64" using_variables="false">PGJvZHkgbGVmdG1hcmdpbj0iMCIgdG9wbWFyZ2luPSIwIj4=</REPLACE>
In this example, the “replace” part just removes the call to the function causing the problem.
Changing something in the HTTP dialog (client <-> UAG or UAG <-> application) is very easy and powerful with UAG. Take a look at the online UAG administration guide to discover all the possibilities you have, and to understand how to use each function.
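You can verify this with a few lines of Python. The Base64 value is the one from the REPLACE tag above; the search text is my reconstruction of the original body tag, assuming the onLoad attribute started with “javascript:”, and the surrounding page is invented for the demo.

```python
import base64

# Value of the REPLACE tag in the AppWrap above.
REPLACE_B64 = "PGJvZHkgbGVmdG1hcmdpbj0iMCIgdG9wbWFyZ2luPSIwIj4="

# Assumed clear text of the SEARCH part (original body tag of the page).
search_text = (
    '<body leftmargin="0" topmargin="0" '
    'onLoad="javascript:void(0);doOnLoad(false)">'
)

replace_text = base64.b64decode(REPLACE_B64).decode("utf-8")
search_b64 = base64.b64encode(search_text.encode("utf-8")).decode("ascii")

# Hypothetical page, only to show the effect of the replacement.
page = "<html>\n" + search_text + "\n<form>login form</form>\n</body>\n</html>"
patched = page.replace(search_text, replace_text)

print(replace_text)
print(patched)
```

The body tag keeps its margins, but the onLoad call is gone, so the fields injected by UAG are no longer cleared.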
· Introduction to appwrap/SRA: http://technet.microsoft.com/en-us/library/ff607339.aspx
· Appwrap advanced documentation : http://technet.microsoft.com/en-us/library/ff607388.aspx
· My blog, articles about this subject (Appwrap/SRA examples) : http://blogs.technet.com/b/fesnouf/archive/tags/howto_2d00_filters/
· UAG online forum: http://social.technet.microsoft.com/Forums/en-SG/forefrontedgeiag/threads.