On March 19 the Information Systems Security Association (ISSA) Brazil Chapter held its ISSA Day and distributed the latest issue of its magazine, Antebellum. Fernando Fonseca (Security Consultant) invited me to write about Forefront TMG Beta 2, and the article was published in the March/April edition of Antebellum. Here is the magazine's front cover:
Note: Portuguese-speaking readers can download the full magazine in PDF format from here.
I just want to share this tool, created by my Brazilian friend Roberto Farah, that uses PowerShell to control WinDbg: http://blogs.msdn.com/debuggingtoolbox/archive/2009/02/04/powershell-script-powerdbg-v5-0-using-powershell-to-control-windbg.aspx
Why is this cool? Here is an example of how it can make it easier to find leaked objects and send a customer-ready report: http://blogs.msdn.com/debuggingtoolbox/archive/2008/11/14/powershell-script-finding-the-managed-objects-that-leaked.aspx
Farah is currently writing a book about debugging with Dmitry Vostokov (from www.dumpanalysis.org), where he will include this material:
http://www.dumpanalysis.org/Forthcoming+Windows+Debugging+Notebook
Dmitry has written lots of books, but my favorite (so far) is Memory Dump Analysis Anthology, Volume 1.
My friend Paul Long from the Netmon team recorded a series of videos that will help you better understand how Netmon works and how to use it. Check them out here:
UI Overview Part 1
UI Overview Part 2
Capturing with the UI
Capturing with NmCap
Filtering
Conversation Tree
Creating a New Parser
It is very interesting to me that many people have not yet fully realized the benefits of the ISA Server Best Practices Analyzer (ISABPA). Plenty of admins already use this tool, but are you really using its full capability? This post describes the most common scenarios for using ISABPA and how to take full advantage of it. In this first part I’m going to discuss how ISABPA can help you proactively mitigate possible issues.
2. Proactive Health Check
When you deploy ISA Server you should first of all plan, plan and plan. I have worked on many cases where ISA was installed the way you install Microsoft Office, using NNF technology (Next, Next and Finish), no kidding. We all know that it is easy to install, but you need to collect information before you deploy. Here are some typical questions that can influence how you size ISA for your environment:
· In what type of scenario do you plan to install ISA:
o Web Proxy?
o Firewall?
o VPN Access?
o Secure Publishing Server?
o All of them?
· What applications are you planning to publish through ISA?
o Exchange OWA?
o Outlook Anywhere?
o SharePoint?
This is definitely not a complete list; it is just an example of the questions that you should ask your customer (or yourself) when planning an ISA Server installation. After gathering all the data, go ahead and use the ISA Server Capacity Planner to see whether you have the right hardware for ISA.
OK, but where does ISABPA come in? I didn’t want to miss the opportunity to stress how important the planning phase is, which is why I started with it. Running ISABPA with the Health Check option is a post-installation task.
Figure 1 – Starting a new scan.
The following screen shows ISABPA performing the scanning operation:
Figure 2 – Scanning in Progress
When this process finishes you can click to view the report and you will see (depending on the number of warnings or errors you have) a screen similar to Figure 3:
Figure 3 – ISA Report
This is an example of a pristine installation of ISA Server 2006 on top of Windows Server 2003 SP2 with some basic rules configured on it. Notice how many warnings I have and how many improvements I can make to this configuration. If you want more details about any of those suggestions, just click it and you will see what the recommendation is, as shown in Figure 4:
Figure 4 – Details about the warning message.
If you want to see a hierarchical view plus more details about this configuration, you can click Tree Reports and you will get a view like the one below:
Figure 5 – Tree Reports View.
3. Conclusion
In this first part of the article I explained some advantages of using ISABPA proactively. In the next article I will show you how the ISABPA tools can assist you in a troubleshooting scenario.
The fun has already started with TMG MBE, so how about some old friends (such as cachedir) to play with it? That’s right: the TMG MBE Tools are now available at the Microsoft Download Center. Go get them at: http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=82027864-4abd-4896-8255-55f6ea775489
My experience dealing with ISA Server cases on a daily basis has shown me that certificates are a delicate subject. They are the type of thing that is simple at first, but when one expires it can be a pain and can bring your ISA Server down if you don’t plan the renewal process ahead of time.
In the economic turmoil we are living through, it is getting quite normal for the IT pro who used to be responsible only for administering the messaging system to be “promoted” to administering the AD infrastructure and the company’s firewall as well. The result is sometimes quite frustrating because of the lack of documentation, the absence of knowledge transfer and the higher pressure to keep things working.
I remember one scenario where the new IT guy had been in the company for only 2 weeks when his ISA Server stopped working and Internet access was down for everyone. There was panic, nobody had a clue about what was going on, and this IT guy contacted us. We found out that his certificate had expired and the Firewall service was not starting (see an article about that in the ISA Blog next week). The problem at that time was that he had no idea about their PKI infrastructure, which Root CA had issued the certificate, and so on. Bottom line: a case that should have taken 5 minutes if we had had all the info we needed took 5 hours.
Last month our supportability team asked me to write an article about certificates that could help in scenarios like this. It took me some time to repro and document the most common issues, some members of our team reviewed it (see the tech reviewers in the article), and yesterday the article was published. Take a look at http://technet.microsoft.com/en-us/library/dd547090.aspx
My friend Tom Shinder, who is up in Seattle this week for the MVP Summit, just released Part 1 of an article that gives you an overview of the new features in TMG Beta 2. It is well worth reading to get an idea of the new functionality and improvements that this Beta brings. Check out the complete article at:
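Since an expired certificate is what stopped the Firewall service in that case, a small scheduled check can warn you before it happens. The sketch below is only an illustration and is not taken from the TechNet article: it assumes you already have the certificate’s expiry (notAfter) value as text in the format that Python’s ssl module uses, and the function names and the 30-day warning window are my own choices.

```python
import ssl
import time

def days_until_expiry(not_after, now=None):
    """Return the number of whole days until the certificate expires.

    not_after: expiry in the "Mon DD HH:MM:SS YYYY GMT" format used by
               Python's ssl module (for example from getpeercert()).
    now:       current time in seconds since the epoch (defaults to time.time()).
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    # Floor division keeps an already expired certificate negative.
    return int((expires - now) // 86400)

def renewal_status(not_after, warn_days=30, now=None):
    """Classify a certificate as 'expired', 'renew soon' or 'ok'."""
    remaining = days_until_expiry(not_after, now)
    if remaining < 0:
        return "expired"
    if remaining <= warn_days:
        return "renew soon"
    return "ok"
```

Running something like this against the certificates bound to your listeners gives you time to find out who the issuing CA is before the service goes down, not after.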
http://www.isaserver.org/tutorials/Overview-New-Features-TMG-Beta2-Part1.html
1. Introduction
This post is about a very interesting case that I worked on this week. The initial scenario was:
· ISA Server 2004 Publishing OWA
· FBA Enabled on Exchange Server
· ISA Server using RSA SecurID as authentication on the Web Listener
In this particular scenario we were doing “two-factor authentication”, first using RSA SecurID and then using FBA on the CAS, as shown below:
Figure 1 – How the scenario was set up
This whole scenario was working fine until they changed from ISA Server 2004 to ISA Server 2006.
2. The Issue
After replacing ISA Server 2004 with ISA Server 2006, the following behavior was noticed:
· When we tried to access https://mail.contoso.com/owa it worked just fine. We got the RSA screen first and then the OWA FBA page coming from the CAS server.
· When we tried to access https://mail.contoso.com/exchange we got the RSA page and authenticated, but then instead of receiving the OWA FBA page we got a blank page. Internet Explorer appeared to be processing something in the background but never opened the page.
The reason why we needed to use “/exchange” in the OWA URL is that some users’ mailboxes were still residing on an Exchange 2000 back-end server. For backward compatibility Exchange 2007 keeps the /exchange, /public and /exchweb virtual directories to allow users in this scenario to access their mailboxes through OWA. When you have FBA enabled on the /exchange folder, the request is redirected to /exchweb, since that is where the forms reside.
Interesting facts:
· ISA monitoring / logging was not showing any errors or denies; all the communication was green and flowing normally.
· If we tried to connect from inside (bypassing ISA) it worked fine.
· If we changed the authentication method on the /exchange folder to Basic instead of FBA it worked. In this case the second authentication (after the RSA one) is a prompt (since it is Basic) where the user can type his credentials.
With that, we had the following component as the main suspect at that time:
· The /exchange and /exchweb folders on the CAS server were having some type of issue when the traffic was coming from outside (passing through ISA), since internally it was working fine.
At that point I engaged the Exchange team to validate the settings for those virtual directories. My friend Vandy Rodrigues, from the Exchange CSI Team, started to review them.
3. The Eternal Loop
After reviewing the whole configuration and validating all the settings, permissions and everything else, he told me: Yuri, all the Exchange settings are clear, no errors. What should I say to him at that point? Nothing more than: Roger that, buddy!
Moving further in the collaboration, we got a netmon trace on the CAS server while the client was trying to log on to https://mail.contoso.com/exchange. The result was a beautiful eternal loop, check this out:
ISA sends the GET request to the CAS server:
10.20.20.1 10.20.20.10 HTTP HTTP:Request, GET /exchange
- Http: Request, GET /exchange
Command: GET
+ URI: /exchange
ProtocolVersion: HTTP/1.1
Reverse-Via: ISACONTOSO
Host: mail.contoso.com:443
Accept-Encoding: gzip, deflate
UserAgent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;.NET CLR 1.1.4322)
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/x-shockwave-flash, */*
Accept-Language: en-us
Connection: Keep-Alive
Front-End-Https: On
HeaderEnd: CRLF
The CAS server sends a 302 redirect saying that the request needs to go to /exchweb:
10.20.20.10 10.20.20.1 HTTP HTTP:Response, HTTP/1.1, Status Code = 302, URL: /exchange
- Http: Response, HTTP/1.1, Status Code = 302, URL: /exchange
StatusCode: 302, Moved temporarily
Reason: Moved Temporarily
Location: http://mail.contoso.com/exchweb/bin/auth/owalogon.asp?url=http://mail.contoso.com/exchange&reason=0&replaceCurrent=1
Set-Cookie: sessionid=; path=/; expires=Thu, 01-Jan-1999 00:00:00 GMT
Set-Cookie: cadata=; path=/; expires=Thu, 01-Jan-1999 00:00:00 GMT
Connection: close
ContentLength: 0
…and this GET and Redirect keeps going on and on forever:
10.20.20.1 10.20.20.10 HTTP HTTP:Request, GET /Exchange
10.20.20.10 10.20.20.1 HTTP HTTP:Response, HTTP/1.1, Status Code = 302, URL: /Exchange
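When the trace is long, a loop like this is easier to spot by counting than by reading. The sketch below is just an illustration under my own assumptions: it expects the frame summary lines exported as plain text in the same shape as the ones above, the function name is hypothetical, and the threshold of three redirects is an arbitrary choice.

```python
import re
from collections import Counter

# Matches the response summary lines shown in the trace above.
RESPONSE_RE = re.compile(r"Status Code = (\d{3}), URL: (\S+)")

def find_redirect_loops(summary_lines, threshold=3):
    """Return the URLs that received `threshold` or more 3xx responses.

    URLs are compared case-insensitively, so /exchange and /Exchange
    are counted as the same virtual directory.
    """
    redirects = Counter()
    for line in summary_lines:
        match = RESPONSE_RE.search(line)
        if match and match.group(1).startswith("3"):
            redirects[match.group(2).lower()] += 1
    return [url for url, count in redirects.items() if count >= threshold]
```

A URL that shows up in the result was answered with a redirect over and over, which is exactly the GET/302 ping-pong seen in this case.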
4. The Resolution
After researching more and more I found that the conditions for this problem perfectly matched KB935206. Not only the conditions but also the symptoms were similar. The solution was exactly that! Read KB935206 to better understand the resolution (and the cause) before applying it, and notice that the hotfix itself doesn’t fix the problem; you have to run the script to make it happen.
Last Friday the Microsoft Malware Protection Center released an update about the new Conficker variant. As the MMPC blog says: “The new sample has modifications which introduce new backdoor functionality. Previous versions of Conficker patched netapi32.dll in memory to prevent further exploitation of the vulnerability addressed by bulletin MS08-067.”
Check out the complete post at http://blogs.technet.com/mmpc/archive/2009/02/20/updated-conficker-functionality.aspx
It is not only ISA fans who are excited about TMG; the market is already giving great feedback about this Beta. Check out some of the coverage:
Microsoft's TMG adds antimalware, SSL inspection (CNET)
Microsoft Enhances Security Products (InternetNews.com)
Microsoft Forefront Threat Management Gateway Beta 2 (BetaNews)
Microsoft's TMG adds antimalware, SSL inspection (ITvoir)
Almost one year has passed since we first started this project, from the book’s conception, TOC and proposal to writing the first line of Chapter 1. We are now hitting 50% of the book and the feeling is that this guy is going to be really good. To celebrate this 50% mark we just released our book’s web site, which will have more info about it: http://www.mstmgbook.org. The road is long, but it is definitely worth it, and it is totally amazing to work with the bright minds that are part of this book.
Introduction
First let’s understand what a silent quit means:
When a silent exit occurs, the JIT debugger is never invoked because the process itself asked to be terminated. For example, two Win32 Application Programming Interface (API) functions that perform this action are TerminateProcess and ExitProcess.
From: http://support.microsoft.com/kb/329629
Note: Although this article is about Exchange, these are Windows (Win32) functions.
What about a graceful shutdown, what is that? That’s simple: a service received an expected command to stop gracefully.
The Scenario
The scenario in this article is based on a real case where the customer had to manually start the Firewall service every day; it was “apparently” quitting every night. The problem with a silent quit is that the debugger will not catch it, so there is no dump file to analyze. Even knowing that, we tried to get a dump, and of course the result was a first chance exception dump, with no second chance. Therefore we got useless data.
Moving Forward
After researching more and more we found out that the Telephony service was set to disabled, and ISA Server Control depends on Remote Access Connection Manager, which depends on the Telephony service:
Figure 1 – ISA Server Control Dependencies.
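To make the chain explicit, here is a minimal sketch that models the dependencies described above and works out which services are affected when one of them is disabled. It only contains the three services named in this post, so it is not the complete ISA dependency list, and the function name is my own.

```python
# Dependency chain described above: each service maps to the services it needs.
DEPENDS_ON = {
    "Microsoft ISA Server Control": ["Remote Access Connection Manager"],
    "Remote Access Connection Manager": ["Telephony"],
    "Telephony": [],
}

def affected_by(disabled, depends_on=DEPENDS_ON):
    """Return every service that directly or indirectly needs `disabled`."""
    affected = set()
    changed = True
    while changed:                      # repeat until no new service is added
        changed = False
        for service, needs in depends_on.items():
            if service in affected:
                continue
            if any(n == disabled or n in affected for n in needs):
                affected.add(service)
                changed = True
    return sorted(affected)
```

Calling affected_by("Telephony") returns both Remote Access Connection Manager and Microsoft ISA Server Control, which is why disabling a service that looks unrelated to the firewall ends up taking it down.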
Looking at the System log, the following sequence of events was showing up:
Event Type: Information
Event Source: Service Control Manager
Event Category: None
Event ID: 7040
Date: 2/19/2009
Time: 10:09:05 PM
User: NT AUTHORITY\SYSTEM
Computer: ISASRVSTD
Description:
The start type of the Telephony service was changed from demand start to disabled.
Event ID: 7035
Time: 10:09:06 PM
The Microsoft Firewall service was successfully sent a stop control.
Event ID: 7036
Time: 10:09:16 PM
User: N/A
The Microsoft Firewall service entered the stopped state.
Time: 10:09:17 PM
The Microsoft ISA Server Control service was successfully sent a stop control.
The Microsoft ISA Server Control service entered the stopped state.
Time: 10:09:18 PM
The Remote Access Connection Manager service was successfully sent a stop control.
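The timing is what gives this sequence away, so it helps to line the events up on a single clock. The sketch below is only an illustration: the times and descriptions are copied from the excerpt above (only the entries that show an explicit time), and the function name is hypothetical.

```python
from datetime import datetime

# (time, description) pairs copied from the System log excerpt above.
EVENTS = [
    ("10:09:05 PM", "Telephony start type changed from demand start to disabled"),
    ("10:09:06 PM", "Microsoft Firewall service was sent a stop control"),
    ("10:09:16 PM", "Microsoft Firewall service entered the stopped state"),
    ("10:09:17 PM", "Microsoft ISA Server Control service was sent a stop control"),
    ("10:09:18 PM", "Remote Access Connection Manager service was sent a stop control"),
]

def seconds_since_first(events):
    """Return (offset_in_seconds, description) for each event, in time order."""
    parsed = sorted(
        (datetime.strptime(stamp, "%I:%M:%S %p"), text) for stamp, text in events
    )
    start = parsed[0][0]
    return [(int((when - start).total_seconds()), text) for when, text in parsed]
```

Everything happens within 13 seconds of the Telephony start type being changed, which points to one coordinated action rather than a random crash.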
In the Application log we got the proof that this was not a silent exit; it was actually a graceful shutdown:
Event Source: Microsoft ISA Server Control
Event ID: 14181
The ISA Server Control service was stopped gracefully.
Event Source: Microsoft Firewall
Event ID: 14182
The Firewall service was stopped gracefully.
Now What?
If those services are stopping every night and the administrator needs to start them manually, the conclusion is that something (a process) is stopping them. For a domain-joined ISA the first thing you should check is Group Policy. A simple check that can be done without impacting production, just to see whether ISA Server is receiving any policy, is to run RSOP.MSC. The result for this case is shown in Figure 2:
Figure 2 – RSOP.MSC result.
Bingo!!! Now everything makes sense. What was happening here was that ISA Server was inside an OU with a policy that was disabling those services. To fix that we created a new OU, moved ISA Server to it and blocked inheritance on that OU.
Conclusion
Sometimes IT administrators, with the best of intentions, disable services that are considered unnecessary from a Windows perspective (in an attempt at hardening). However, on ISA Server this needs to be done carefully, since it can stop the Firewall service and cause downtime in your Internet access. Before doing this, review the article below, which lists the services that ISA Server depends on:
http://technet.microsoft.com/en-us/library/cc302488.aspx
Here comes another post about the Firewall service crashing, and again the best approach to catch the crash was to attach a debugger to the Firewall service (see this article to learn how to do this). The result was quite interesting because this time there was no third party involved. After some research I found out that this was a known issue, fixed by KB http://support.microsoft.com/kb/956268.
This brings up another important point: keep your ISA Server up to date on patches. Don’t think that because you already have ISA Server 2006 SP1 you are on the latest build of ISA. Keep watching for new updates, especially for hotfix packages. In this case, instead of applying 956268, I applied the latest hotfix package, which was http://support.microsoft.com/kb/960148. Since the hotfix package is cumulative, it fixes not only the crash issue but also other issues that might happen, such as:
957655 (http://support.microsoft.com/kb/957655/ ) When you configure Firewall Logging or Web Proxy Logging to use the ISA Server file format in ISA Server 2006, the reports only contain partial log entries
960145 (http://support.microsoft.com/kb/960145/ ) FIX: After you install Update Rollup 3 for Exchange Server 2007, the users may be unable to access their mailboxes by using OWA that is published by using ISA Server 2006
958607 (http://support.microsoft.com/kb/958607/ ) FIX: Users can unexpectedly bypass the ISA Server 2006 redirection rule for HTTPS when they try to access Outlook Web Access
959331 (http://support.microsoft.com/kb/959331/ ) You cannot disconnect a Web proxy session when you remotely manage ISA Server 2006 by using the ISA Server Management console
960146 (http://support.microsoft.com/kb/960146/ ) An update is available for ISA Server 2006 to control the domain name and user name format in Kerberos Constrained Delegation scenarios
Notice that this is a big set of fixes, so you shouldn’t ignore it. In summary, the tip of the day is: keep your ISA Server 2006 up to date with the latest hotfix package.
Yesterday I was talking with Richard Hicks and he was telling me how excited he was about the new set of features in TMG. This is indeed true: many people got really excited about the package of new features that TMG Beta 2 brought, but there is something that you might not have realized yet. TMG is designed to run on a Windows Server 2008 64-bit platform… OK, and what does this mean? Well, it means robustness for your edge device. It means the ability to scale your environment to other levels without your edge device being the bottleneck for the traffic. To summarize the technical value of what this really means, check out this table:
Architectural component | 64-bit Windows | 32-bit Windows
Virtual memory          | 16 terabytes   | 4 GB
Paging file size        | 256 terabytes  | 16 terabytes
Hyperspace              | 8 GB           | 4 MB
Paged pool              | 128 GB         | 470 MB
Non-paged pool          | 128 GB         | 256 MB
System cache            | 1 terabyte     | 1 GB
System PTEs             | 128 GB         | 660 MB
Source: http://support.microsoft.com/kb/294418
If you think about it, it is very interesting to see this from another perspective, because from a firewall standpoint it is not every day that you see such high numbers.
Last Wednesday I delivered a presentation here at TechReady 8 (in Seattle) about TMG Beta 2 SMTP Protection (a feature that I really like, since I worked with Exchange for so many years), and today I’m really happy to share with you that Forefront TMG Beta 2 is available for public download on the Microsoft web site. This TMG version has so much that is new that you have to take some time to install it, explore the new GUI, read the deployment guide and see with your own eyes the exciting new set of features that this version offers. Let me highlight some of them:
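To put those numbers in perspective, here is a quick back-of-the-envelope calculation of how much bigger some of the 64-bit limits are. It is only a sketch: it uses four of the rows from the table, assumes binary units (1 GB = 2^30 bytes), and the names are my own.

```python
UNITS = {"MB": 2**20, "GB": 2**30, "TB": 2**40}

def to_bytes(value, unit):
    """Convert a value in MB, GB or TB (binary units) to bytes."""
    return value * UNITS[unit]

# (64-bit limit, 32-bit limit) for four rows of the table.
LIMITS = {
    "Virtual memory": ((16, "TB"), (4, "GB")),
    "Hyperspace": ((8, "GB"), (4, "MB")),
    "Paged pool": ((128, "GB"), (470, "MB")),
    "System cache": ((1, "TB"), (1, "GB")),
}

def growth_factor(component):
    """Return how many times larger the 64-bit limit is than the 32-bit one."""
    x64, x86 = LIMITS[component]
    return to_bytes(*x64) / to_bytes(*x86)
```

Virtual memory alone comes out 4096 times larger, and the system cache 1024 times larger.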
• SMTP Protection
• Network Inspection System
• Integration with the Forefront Security Suite (codename Stirling). For this you will need Stirling Beta 2.
Now go ahead and download TMG Beta 2 and enjoy this great release. On the page below you will find the bits, release notes and deployment guide, so you are all set to test, evaluate and send feedback about TMG Beta 2:
http://www.microsoft.com/downloads/details.aspx?FamilyID=e05aecbc-d0eb-4e0f-a5db-8f236995bccd&DisplayLang=en
This post is about a scenario where users were not able to authenticate on an FBA page published by ISA Server using LDAP as the authentication repository. The error message that was showing up was:
Figure 1 – Unable to authenticate.
Although it says to double-check whether the domain name or password is wrong, this is a generic logon error and that may not be the cause. We recently wrote an article at Tales from the Edge that has a troubleshooting framework for LDAPs authentication on ISA. For more info, check the article at http://technet.microsoft.com/en-us/library/dd316279.aspx.
2. Logging is your Friend
The ISA Server realtime logging can be very helpful in scenarios like this. In this case the error message was the one below:
Figure 2 – Error 58.
As you can see in the figure above, the error message says that it was not possible to perform the requested operation. This is a good start, but you can see even more information if you copy the whole log to the clipboard using the option below in the task pane:
Figure 3 – Using the Clipboard option.
After copying, paste into a Notepad file and save it as TXT. The best thing to do is to open this file in Excel so you can see all the fields and filter them. After opening the file in Excel, I was able to see a key error in there:
Figure 4 – Using Excel to filter the logs.
Notice that in the Authentication Server field it says: dccont\No server available. This is it!! Now we can conclude that:
· ISA cannot reach the DC for some reason:
o Networking issue?
o Name resolution issue?
o DC not answering?
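The same filtering can be scripted instead of done by hand in Excel. This is a minimal sketch under my own assumptions: that the text pasted from the clipboard is tab-delimited, that its first line holds the column names, and that the column is called Authentication Server as in the figure. The function name is hypothetical.

```python
import csv
import io

def rows_with_auth_failures(log_text, field="Authentication Server",
                            marker="No server available"):
    """Return the log rows whose `field` contains `marker`.

    Assumes the pasted text is tab-delimited and that its first line
    holds the column names.
    """
    reader = csv.DictReader(io.StringIO(log_text), delimiter="\t")
    return [row for row in reader if marker in (row.get(field) or "")]
```

Each returned row is a dictionary keyed by column name, so you can print just the fields you care about for the failing requests.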
Before going crazy and investigating this more deeply, why not just try to ping the servers that are in the LDAP Server Set? This is what I did, and the result was:
Figure 5 – unable to resolve the name.
Bingo, unable to resolve the name. After fixing the name resolution problem the issue was gone and authentication worked.
This year I’m going to TechReady 8 in Seattle to present a TMG Beta 2 session with Mohit Saxena and Bala Natarajan. The session will be on Wednesday (02/04) morning; MSFTEs who will be there and are interested in seeing some cool features from this beta version, stop by and we will be glad to talk to you. Due to this trip I will be away from the blog for the next 10 days.
Last Tuesday night I was helping out a friend from my team who was handling a case where the customer was unable to access Outlook Anywhere from outside the network. As usual, everything works inside, so who’s to blame? ISA Server, of course; it is the only thing that is different, right? We’ll see…
To better isolate the problem we left the Outlook client out of the tests and just tried to access the RPC URL using IE (example: https://mail.contoso.com:443/rpc/rpcproxy.dll). The result was the error below:
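If the LDAP server set has several servers, the same check can be done for all of them in one go. The sketch below is just an illustration: it uses the operating system’s normal name resolution through Python’s socket module, and the function name and the idea of passing in a replaceable resolver are my own choices.

```python
import socket

def unresolvable(hostnames, resolver=socket.getaddrinfo):
    """Return the names from `hostnames` that fail name resolution."""
    failed = []
    for name in hostnames:
        try:
            resolver(name, None)
        except socket.gaierror:
            # The resolver could not turn this name into an address.
            failed.append(name)
    return failed
```

Run it on the ISA Server itself with the exact names configured in the LDAP server set, since that is the machine whose name resolution matters here.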
Figure 1 – host not available error.
We used Fiddler and we got an interesting result, see below:
Figure 2 – Fiddler result.
Since this is real traffic I’m hiding some of the legitimate URLs, but the meaning of the different colors is:
Color
Meaning
Expected traffic using the external URL (for example mail.contoso.com)
Unexpected traffic using the internal URL (for example mail.contoso.local)
What does this mean? It means that ISA is for some reason losing the host name during this conversation, which is exactly what error 64 means: "The specified network name is no longer available", a Win32 error originally called ERROR_NETNAME_DELETED.
At that time the question was: who is changing this name and sending it to ISA? Since the answer was not on our side (we saw in the netmon trace that the CAS was doing it), we collaborated with an engineer from the Exchange team who, after some other troubleshooting steps, fixed the issue by using the following article:
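Picking out the requests that carry the wrong name does not have to be done by eye. Here is a minimal sketch, assuming you have the list of request URLs from the capture as plain text; the function name is hypothetical and mail.contoso.com is just the example name used in this post.

```python
from urllib.parse import urlsplit

def unexpected_hosts(urls, expected_host):
    """Return the URLs whose host differs from the published external name."""
    expected = expected_host.lower()
    # urlsplit().hostname drops the port and lowercases the name.
    return [u for u in urls if (urlsplit(u).hostname or "").lower() != expected]
```

Anything this returns is a request where the external name has been swapped for something else, such as the internal mail.contoso.local name.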
http://blogs.technet.com/asksbs/archive/2008/12/10/intermittent-outlook-anywhere-connectivity-in-sbs-2008.aspx
Note: the error 400 mentioned in the above article is the same as the one that we received from the CAS server (seen in the netmon trace).
A very interesting case where, again, everything works internally but doesn’t work externally. And again we proved that ISA was not causing the issue, with very useful help (as usual) from the Exchange folks.
The Microsoft Malware Protection Center blog put together the latest information about the Conficker worm: the attack vectors, how to prevent infection and how to clean the system. It is all consolidated in their blog, which you can access here: http://blogs.technet.com/mmpc/archive/2009/01/22/centralized-information-about-the-conficker-worm.aspx
Just a quick follow-up on the article that I wrote for the ISA Team Blog about ISA no longer answering requests. Last week I was collaborating with the Networking team on another case where ISA stopped answering because of delays in DNS responses. They fixed the DNS issue by changing the registry values SocketPoolSize and MaxUserPort on all internal DNS servers, following the recommendations from KB956188.
Conclusion: stay alert to slow browsing issues and make sure that your DNS is working properly before you start troubleshooting ISA.
Check out this nice tool that allows you to analyze IIS logs and see whether your ASP pages were victims of a SQL injection attack:
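A quick way to see whether DNS is the slow part is to time a few lookups from the ISA Server itself. The sketch below is only an illustration: the one-second threshold is an arbitrary value I picked, the function name is hypothetical, and the resolver and clock are parameters so the logic can be exercised without touching the network.

```python
import socket
import time

def slow_lookups(hostnames, threshold_seconds=1.0,
                 resolver=socket.getaddrinfo, clock=time.perf_counter):
    """Return (name, seconds) for lookups that fail or exceed the threshold."""
    slow = []
    for name in hostnames:
        started = clock()
        try:
            resolver(name, None)
            failed = False
        except socket.gaierror:
            failed = True
        elapsed = clock() - started
        if failed or elapsed > threshold_seconds:
            slow.append((name, elapsed))
    return slow
```

If names that your users browse to every day show up in the result, look at DNS first and at ISA second.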
http://www.codeplex.com/Release/ProjectReleases.aspx?ProjectName=WSUS&ReleaseId=13436
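To give an idea of what this kind of log analysis looks like, here is a minimal sketch of my own. It is not how the tool above works: it simply URL-decodes each line of a W3C-format IIS log and looks for a handful of illustrative patterns. The pattern list and function name are assumptions and are far from exhaustive.

```python
import re
from urllib.parse import unquote_plus

# Illustrative patterns only; real attacks vary and this list is not exhaustive.
SUSPICIOUS = re.compile(
    r"(declare\s+@|cast\s*\(|exec\s*\(|xp_cmdshell|union\s+select|;\s*drop\s+table)",
    re.IGNORECASE,
)

def suspicious_requests(log_lines):
    """Return the log lines whose URL-decoded text matches a suspicious pattern."""
    hits = []
    for line in log_lines:
        if line.startswith("#"):      # W3C header lines such as #Fields:
            continue
        if SUSPICIOUS.search(unquote_plus(line)):
            hits.append(line)
    return hits
```

A hit only tells you that someone tried; whether the attempt succeeded still has to be checked on the database side.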
IAG 2007 SP2 hit the ground running, with many customers applying it and realizing that this service pack not only introduces lots of changes but also has some UI changes. It’s all about having a better experience for the end user / administrator. In this post I’m going to talk about three major UI enhancements:
· Getting Started Wizard
· Network Configuration
· Policy Editor
2. Getting Started Wizard and Network Configuration
The new Getting Started Wizard has the same format as the TMG Getting Started Wizard (already in RTM with TMG MBE). The idea is to help the administrator configure IAG 2007 correctly through an organized set of procedures. You can access it by choosing the Getting Started Wizard option in the Admin menu, as shown below:
Figure 1 – Accessing Getting Started Wizard.
The first screen has the core steps that this wizard will guide you through:
Figure 2 – Getting Started Wizard.
Instead of guiding you through each window I will leave it open so you can explore this feature. The step-by-step flow is very intuitive and I doubt that you will get stuck while following the wizard. It is important to mention that before you even run this wizard you should have the following elements already defined:
· How your IAG network configuration will be used - what is considered internal and external?
· Domain membership - what is the name of the domain that IAG will belong to?
· Trunk configuration - which IP are you going to use to create the trunk?
· Application – what application are you going to publish?
What is interesting is that the first option in this wizard can also be accessed on its own, by choosing the Network Configuration option in the Admin menu. The screen below will appear:
Figure 3 – Network configuration.
Either here or in the Getting Started Wizard you can specify the network configuration for your IAG 2007.
3. Policy Editor
The other UI change that SP2 introduced is the new Policy Editor. The UI was improved to make it easier for the administrator to create new policies based on a specific platform, such as Windows, Mac, Linux and others (see square A in the figure below). It also allows you to create a new policy from an expression without having to use a different window, as was the case before (see square B in the figure below):
Figure 4 – New Policy Editor.
4. Conclusion
The goal of this post is just to present some of the new UI enhancements in IAG 2007 SP2 and show how the product is becoming more mature by offering a better user experience. Go ahead and try SP2; I’m sure you will not regret it.
Quick post just to raise awareness of this new KB that explains how to manually remove Conficker. Follow the steps from:
http://support.microsoft.com/kb/962007
The reason why I’m saying “demystifying” is that many people still have wrong concepts, and therefore make wrong assumptions, about how networks are configured on ISA Server/TMG. Although this is well documented on TechNet (since it is a core concept), sometimes the massive amount of information makes you feel like: ah… I already know all this, I don’t need to read it.
That is a wrong assumption, and it takes me back to the days when I was a professor at a university in Brazil. I was teaching Operating Systems using Tanenbaum’s classic OS book, and I remember a student who clearly thought he knew all that stuff. He didn’t attend much, and when he did attend he didn’t pay attention. Well, that’s fine, let’s give him the benefit of the doubt and assume that he knows what he is doing. Six months later he came to me saying that he needed help to better understand preemptive multitasking, and confessed that he had missed that class because he thought he knew it and preferred to do other things that day. Moral of the story: never think that you know everything, even if it is a subject that you have read or heard about many times. The person who is writing or telling you something usually has a different perspective and insight on the same subject, and can show you things that you hadn’t realized before.
Sorry, off topic, but I couldn’t resist. Anyway, since I love self-explanatory pictures combined with a decent walkthrough, I think this is probably one of the most intuitive explanations of the networking concepts in ISA/TMG. I’m talking about the series of two articles written by my friend Tom Shinder, which will help you digest all you need to know about networks on ISA.
Check them out here:
http://www.isaserver.org/tutorials/Overview-ISA-TMG-Networking-ISA-Networking-Case-Study-Part1.html
http://www.isaserver.org/tutorials/Overview-ISA-TMG-Networking-ISA-Networking-Case-Study-Part2.html
This is another one of those cases where the ISA Server service mysteriously crashes once a day, at the same time, and nothing changed in the environment. It really makes me feel that the lack of communication between the teams that deal with technology is getting far worse than it should be. Many companies are investing money in security by adding layers and layers of technology, but they are still missing two important elements: process awareness and change control procedures. The absence of those elements can directly impact the availability of the environment. Why availability? Well, I will tell you at the end of this post.
2. Analyzing the Data
In this case ISA Server Service was crashing with the following errors:
Event Type: Error
Event Source: Microsoft ISA Server Web Proxy
Event ID: 14197
Date: 01/10/2009
Time: 2:58:03 AM
Computer: MYISA
ISA Server was unable to write content to the cache file.
Event ID: 14057
Time: 2:52:37 AM
The Firewall service stopped because an application filter module C:\Program Files\Microsoft ISA Server\w3filter.dll generated an exception code C0000005 in address 64754CD5 when function CompleteAsyncConnect was called. To resolve this error, remove recently installed application filters and restart the service.
Event 14057 is clear about one thing: this was an access violation exception (C0000005) in the filter module W3Filter.dll. That is too broad; it can be many things, including issues with the filter itself, so we need to get a crash dump of this guy to better understand what is going on. Following the approach from one of my earlier posts, we can use DebugDiag to attach to wspsrv.exe and get the dump. After getting the dump you can use this other post as an example of how to analyze it. Unfortunately this is one of those cases where the public symbols don’t help much, as you can see below:
STACK_TEXT:
WARNING: Frame IP not in any known module. Following frames may be wrong.
2b37fe10 6476e6df 27441f80 647717fe 275a5558 0x3a6169
2b37fe24 64778438 00000001 2bf579a0 64703de0 W3Filter!DllUnregisterServer+0x45ede
2b37fe90 0046d701 275a5558 00000000 00000040 W3Filter!DllUnregisterServer+0x4fc37
2b37fefc 0046e461 00000000 00000000 00000000 wspsrv+0x6d701
2b37ff20 0046e568 2bf57818 0046e3d7 2b37ff50 wspsrv+0x6e461
2b37ff30 0046d4ba 00000000 00000000 00000000 wspsrv+0x6e568
2b37ff50 00455fd7 2bf578bc 00000000 00000000 wspsrv+0x6d4ba
2b37ff7c 00456c8e 2bf578bc 00000000 00000000 wspsrv+0x55fd7
2b37ffb8 77e64829 00000015 00000000 00000000 wspsrv+0x56c8e
2b37ffec 00000000 00456b26 00000015 00000000 kernel32!GetModuleHandleA+0xdf
FAULTING_THREAD: 00001d88
DEFAULT_BUCKET_ID: WRONG_SYMBOLS
PRIMARY_PROBLEM_CLASS: SOFTWARE_NX_FAULT
BUGCHECK_STR: APPLICATION_FAULT_SOFTWARE_NX_FAULT_BAD_INSTRUCTION_PTR_CODE_RUNNING_ON_STACK
FOLLOWUP_IP:
W3Filter!DllUnregisterServer+45ede
6476e6df 8b4624 mov eax,dword ptr [esi+24h]
SYMBOL_STACK_INDEX: 1
SYMBOL_NAME: W3Filter!DllUnregisterServer+45ede
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: W3Filter
IMAGE_NAME: W3Filter.dll
STACK_COMMAND: ~50s; .ecxr ; kb
BUCKET_ID: WRONG_SYMBOLS
FAILURE_BUCKET_ID: W3Filter.dll!base_address_c0000005_SOFTWARE_NX_FAULT
The !analyze result shown above will give you the impression that W3Filter.dll is the culprit, but it is exactly the opposite: this guy is only a victim.
After analyzing the dump in depth using the private symbols, we came to the conclusion that something was locking the cache file when the Web filter was trying to write to it. Guess what was locking it? Once upon a time there was a system administrator who was following a plan he had received from his management to install backup software on all Windows servers, so he installed this backup software on ISA and configured a job to run every night…
The backup software was backing up the whole server (all hard drives), including the drive where the ISA cache was located. This is why the customer was saying that the issue only happened when the ISA Server cache was enabled; if they disabled the cache, the issue didn’t happen. That makes sense, and the recommendation to exclude the cache from backups is not new. As a matter of fact, the article that recommends it has been out there since October 2004:
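One way to get a feel for a stack like this when the symbols are poor is simply to count which module each frame belongs to. The sketch below is only an illustration and assumes the frame layout shown in the STACK_TEXT above (five hexadecimal columns followed by the symbol); the function name is hypothetical.

```python
import re
from collections import Counter

# Child EBP, return address, three arguments, then the symbol column.
FRAME_RE = re.compile(r"^(?:[0-9a-f]{8}\s+){5}(\S+)$", re.IGNORECASE)

def frames_per_module(stack_lines):
    """Count stack frames per module; frames with no module go under '<unknown>'."""
    counts = Counter()
    for line in stack_lines:
        match = FRAME_RE.match(line.strip())
        if not match:
            continue                      # skip warnings and headers
        symbol = match.group(1)
        if "!" in symbol:
            module = symbol.split("!", 1)[0]
        elif "+" in symbol:
            module = symbol.split("+", 1)[0]
        else:
            module = "<unknown>"          # raw address, no module resolved
        counts[module] += 1
    return counts
```

For the stack above this gives six wspsrv frames, two W3Filter frames, one kernel32 frame and one frame that belongs to no module at all. That last one is the top frame, and it is a hint that the W3Filter frames underneath it are where the problem surfaced, not necessarily where it started.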
Event ID 5, event ID 14079, and event ID 14176 are logged in the Application log on your Internet Security and Acceleration Server computer
http://support.microsoft.com/kb/887311
Now the answer to “Why availability?”: because the ISA Server service in this case was crashing due to the addition of a new product to the ISA box without testing it in a lab environment (where is the change control procedure?). Maintenance of the Windows OS was the responsibility of the system administrator who, with the best of intentions, configured the backup software to back up the whole hard drive. However, the firewall admin wasn’t aware of this addition since it was outside the scope of his duties (where is the process awareness?), and he swore from the beginning that nothing had changed in the environment and that ISA was crashing for no reason :(. But at least this story had a happy ending, so let’s finish this post with a smiling face :).