Well, I was not planning a round two on this subject, but it is amazing how often sysadmins blame ISA before any deep troubleshooting has been done. Every other week I hear someone say: my ISA is slow to answer, it takes forever to open a page during rush hour, etc.
We need to demystify this. Any claim that ISA is slowing down Internet access should be carefully analyzed before it is made. This post is about one more complaint about the Internet being slow, and of course ISA was what the admin was blaming.
2. Start from the Basics – Yeah, I know
Review the "start from the basics" section explained in the last post.
3. As usual, everything looks fine when we first look
When everything seems to be okay, the challenge is to concentrate on the details. In this particular case all the items checked out fine, including the logging (which was done locally to a text file). The footprint of what could be causing the issue was only discovered while the issue was happening.
4. Reviewing the Data
Here is the first set of data that was captured while the issue was happening:
· Perfmon and Netmon from ISA and DC/DNS
· ISA Data Packager
After looking at the netmon trace we could clearly see that, while the issue was happening, the DNS Server was not answering the queries sent by ISA. You might think: but you should have checked DNS in the basic review, right? Correct, but when we checked it at that time it was answering in a timely manner. This leads to the conclusion that the DNS Server works for a while and then stops answering for some reason.
Another interesting point was that the DNS.exe process was showing leaking behavior, very similar to the behavior explained in KB830381. The problem was that the customer's DNS.exe was much newer than the one covered by that KB, since he was running Windows Server 2003 SP2.
At least at that point we knew that the issue was not caused by ISA; it was DNS that was not answering for some reason.
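Because a DNS Server that answers fine during a one-time check can still stop answering later, it helps to probe it repeatedly over the problem window. Here is a minimal sketch of such a probe, not the tool we used in this case: it sends a plain A-record query over UDP and records how long the reply takes. The function names (build_query, probe, monitor), the timeout and the polling interval are all assumptions made for illustration.

```python
import socket
import struct
import time


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = A (1), QCLASS = IN (1).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, hostname: str, timeout: float = 2.0):
    """Return the response time in seconds, or None if no valid reply arrived."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable-server errors
            return None
        # Ignore replies whose transaction ID does not match our query.
        if reply[:2] != query[:2]:
            return None
        return time.monotonic() - start


def monitor(server: str, hostname: str, interval: float = 5.0, rounds: int = 12):
    """Probe the server several times and print one timestamped line per probe."""
    for _ in range(rounds):
        elapsed = probe(server, hostname)
        stamp = time.strftime("%H:%M:%S")
        if elapsed is None:
            print(f"{stamp} {server}: NO ANSWER")
        else:
            print(f"{stamp} {server}: answered in {elapsed * 1000:.0f} ms")
        time.sleep(interval)
```

Calling monitor() with the address of the DNS Server and a name that ISA normally resolves, and leaving it running through rush hour, gives a simple timeline that can be lined up with the netmon trace.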
5. Looking closer at the DNS Server
All right, since ISA was more or less out of the picture, we started to concentrate on the DNS Server itself, and we found that the server had logged the following event:
Event ID : 7502
Category : None
Source : DNS
Type : Error
Generated : 8/13/2008 8:52:34 PM
Machine : SRVDNS
Message : The DNS server was unable to service a client request due a shortage of available memory. Close any applications not in use or reboot the computer to free memory.
Clearly we have an issue here; looking at the perfmon data we found the following suspicious counters (data for the DNS.exe process):
The user-mode dump of the DNS.exe process gave us the remaining data that helped us categorize this as a memory leak. Here is the relevant data extracted from the DebugDiag report:
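A leak of this kind typically shows up in perfmon as a process memory counter, such as Private Bytes, that keeps climbing and never comes back down. Here is a minimal sketch of how that trend could be checked from exported samples, assuming you already have (timestamp, bytes) pairs for the process. The function names and the threshold are hypothetical, and the sample values in the usage note are made up for illustration; they are not the customer's data.

```python
from datetime import datetime


def growth_per_hour(samples):
    """Least-squares slope of (timestamp, bytes) samples, in bytes per hour."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    t0 = samples[0][0]
    # Express each timestamp as hours elapsed since the first sample.
    xs = [(t - t0).total_seconds() / 3600.0 for t, _ in samples]
    ys = [float(value) for _, value in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("samples must span more than one timestamp")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov_xy / var_x


def looks_like_leak(samples, min_growth_per_hour):
    """True when the counter grows at least min_growth_per_hour on average."""
    return growth_per_hour(samples) >= min_growth_per_hour
```

For example, three hypothetical samples taken one hour apart at 100 MB, 110 MB and 120 MB give a slope of 10 MB per hour, so looks_like_leak() with a 5 MB per hour threshold would flag them. A positive slope alone does not prove a leak (a cache warming up looks the same at first), which is why the dump analysis is still needed.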
Top 1 functions by allocation count
58 638 allocation(s)
Allocation type : Heap allocation(s)
Heap handle : 0x007c0000
Allocation Count : 58638 allocation(s)
Allocation Size : 106,91 MBytes
Leak Probability : 100%
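One quick sanity check on these numbers is the average size of each outstanding allocation. Here is a small sketch of that arithmetic, assuming that "MBytes" in the report means 1024 x 1024 bytes and that the comma in "106,91" is a decimal separator. The function name is hypothetical.

```python
def average_allocation_bytes(total_mbytes: float, allocation_count: int) -> float:
    """Average bytes per allocation, assuming 1 MByte = 1024 * 1024 bytes."""
    return total_mbytes * 1024 * 1024 / allocation_count


# Values taken from the DebugDiag report: 106,91 MBytes in 58638 allocations.
avg = average_allocation_bytes(106.91, 58638)
print(f"about {avg:.0f} bytes per allocation")
```

Under those assumptions this works out to roughly 1,900 bytes per allocation, which is consistent with a large number of small allocations from the same function never being freed, and not with one or two huge buffers.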
The problem really was caused by a leak in the DNS.exe process, and the issue was fixed by applying the post-SP2 update for DNS described in KB946565. If your DNS Server is running Windows Server 2003 SP2, make sure to update DNS.exe, and remember: don't blame ISA before you are 100% sure that ISA is causing the issue :).