I recently worked on a very interesting case where a customer’s Exchange Server ended up on a spam block list although the environment was clear of malware and no spam originated from that server at all. We identified why the server got blocked: an external server was using a reverse DNS lookup to verify whether the MX record for that email server matched the source IP address from which the SMTP traffic was coming. To make it easier to understand, let’s take a look at the following diagram of the contoso.com network:
Notice that the primary IP bound to ISA’s external interface is 192.168.1.113. The SMTP publishing rule correctly maps the internal Exchange Server IP, but outbound traffic will always leave with the primary IP of the ISA Server. This means that when the external Exchange Server performs the reverse lookup for the MX record (for example: mail.contoso.com), it resolves to 192.168.1.60, which doesn’t match the source IP received in the IP header of the SMTP packet.
The quick fix here is to change the primary IP to 192.168.1.60, but sometimes this cannot be done quickly because of other policies, for example. That’s the way it is on ISA Server; there is not much you can do other than plan to use the primary IP for scenarios like this.
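A minimal sketch of the kind of check the remote server performs, assuming it simply resolves the advertised mail host and compares the result with the source IP of the connection. The function name and the injectable resolver are hypothetical, used only so the logic can be exercised without a live DNS server; real anti-spam checks vary by product.

```python
import socket


def source_matches_mail_host(mail_host, source_ip, resolve=socket.gethostbyname_ex):
    """Return True if source_ip is one of the addresses mail_host resolves to.

    resolve must behave like socket.gethostbyname_ex and return
    (hostname, aliases, ip_list). It is injectable for offline testing.
    """
    _, _, addresses = resolve(mail_host)
    return source_ip in addresses


def contoso_resolver(host):
    # Hypothetical stand-in for public DNS: mail.contoso.com -> 192.168.1.60
    return (host, [], ["192.168.1.60"])


# Traffic leaving with ISA's primary IP fails the check...
print(source_matches_mail_host("mail.contoso.com", "192.168.1.113", contoso_resolver))
# ...while traffic leaving with the published IP passes it.
print(source_matches_mail_host("mail.contoso.com", "192.168.1.60", contoso_resolver))
```

Leaving out the third argument makes the function use the system resolver instead of the stand-in.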
The good thing here is: TMG resolves this problem! How? With a feature called Enhanced NAT (ENAT). Now you can create a network rule to specify which IP address you want to use for outbound traffic as shown below:
Isn’t that nice? It’s amazing for sure!!
Well, we are already in September and TMG is coming very soon…but while it is not RTM yet, you still have a chance to download Beta 3 and play with it.
1. Introduction
I wrote some time back about an issue on ISA Server related to VPN, and in that post I emphasized the following:
“…ISA Server VPN component uses Windows RRAS component and that when the ISA is isolated from the main problem, the troubleshooting line needs to follow RRAS tools and techniques.”
This is what it boils down to in most of the cases I work on where ISA Server is being used as a VPN server. Troubleshooting on the ISA Server side is usually minimal; most of the time it is just a matter of configuration, or should I say misconfiguration. The big and deep troubleshooting involves other elements around it:
· RRAS
· Networking in general
· Connectivity with the DC (or RADIUS Server)
This week I got a couple of VPN cases that were really interesting, and I would like to share with you the troubleshooting steps and solutions for them.
2. Scenario 1 – VPN Client was unable to authenticate
In this scenario, all VPN settings were correct on ISA and on the client; in the ISA Server logging we just saw error 0x80074E20, which means “A connection was gracefully closed in an orderly shutdown process with a three-way FIN-initiated handshake” (see error codes for more information). Basic connectivity with the DC was also good…but basic means only a ping with a reply. There was no other device between the internal NIC of ISA and the DC, and full communication was allowed.
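Since a ping reply says nothing about the services a DC exposes, one way to go a step beyond it is to try a TCP connection to a few well-known ports. This is only a sketch: the port list is an assumption about what a member server typically needs, and the function name and the injectable connect parameter are hypothetical.

```python
import socket

# Ports commonly needed between a member server and a DC (assumed list).
DC_PORTS = {88: "Kerberos", 135: "RPC endpoint mapper", 389: "LDAP", 445: "SMB"}


def check_dc_ports(host, ports=DC_PORTS, timeout=3.0, connect=socket.create_connection):
    """Try a TCP connection to each port and report True/False per port."""
    results = {}
    for port in sorted(ports):
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # Covers refused connections, timeouts and name resolution failures.
            results[port] = False
        else:
            conn.close()
            results[port] = True
    return results
```

Calling check_dc_ports("DCCONT.contoso.msft") from the ISA Server returns a dictionary keyed by port; any False entry is a better starting point than a successful ping.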
Investigating further it was possible to see the following:
· Event Viewer
Event Type: Error
Event Source: Microsoft Firewall
Event Category: None
Event ID: 21137
Date: 9/24/2009
Time: 5:51:18 PM
User: N/A
Computer: ISACONTN1
Description:
The connectivity verifier "Domain Controller" reported an error when trying to connect to DCCONT.contoso.msft.
Reason: The request has timed out.
Event Source: NETLOGON
Event ID: 5719
Time: 5:54:27 PM
This computer was not able to set up a secure session with a domain controller in domain CONTOSO due to the following:
There are currently no logon servers available to service the logon request.
This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator.
ADDITIONAL INFO
If this computer is a domain controller for the specified domain, it sets up the secure session to the primary domain controller emulator in the specified domain. Otherwise, this computer sets up the secure session to any domain controller in the specified domain.
· RRAS Tracing (you need to run “netsh ras set tracing * enabled” in order to have those files under %systemroot%\tracing)
Content of the file IASSAM.LOG:
[924] 09-24 17:53:39:429: NT-SAM Names handler received request with user identity CONTOSO\administrator.
[924] 09-24 17:53:39:429: Username is already an NT4 account name.
[924] 09-24 17:53:39:429: SAM-Account-Name is "CONTOSO\administrator".
[924] 09-24 17:53:39:429: NT-SAM Authentication handler received request for CONTOSO\administrator.
[924] 09-24 17:53:39:429: Processing MS-CHAP v2 authentication.
[924] 09-24 17:54:27:198: LogonUser succeeded.
[924] 09-24 17:54:27:198: NT-SAM User Authorization handler received request for CONTOSO\administrator.
[924] 09-24 17:54:27:198: Using native-mode dial-in parameters.
[924] 09-24 17:54:27:198: Sending LDAP search to DCCONT.contoso.msft.
[924] 09-24 17:54:27:198: LDAP ERROR in ldap_search_ext_sW. Code = 81
[924] 09-24 17:54:27:218: Extended error string: (null)
[924] 09-24 17:54:27:218: Retrying LDAP search.
[924] 09-24 17:54:27:618: Failed to connect to the cached DC, try DC locator ...
[828] 09-24 17:54:31:284: NT-SAM Names handler received request with user identity CONTOSO\administrator.
[828] 09-24 17:54:31:284: Username is already an NT4 account name.
[828] 09-24 17:54:31:284: SAM-Account-Name is "CONTOSO\administrator".
[828] 09-24 17:54:31:284: NT-SAM Authentication handler received request for CONTOSO\administrator.
[828] 09-24 17:54:31:284: Processing MS-CHAP v2 authentication.
[828] 09-24 17:54:31:284: LogonUser succeeded.
[828] 09-24 17:54:31:284: NT-SAM User Authorization handler received request for CONTOSO\administrator.
[924] 09-24 17:54:50:922: Failed to connect to the DC discovered by DC locator, try DC enumerator ...
[828] 09-24 17:55:02:979: Using native-mode dial-in parameters.
[828] 09-24 17:55:02:979: Could not open an LDAP connection to domain CONTOSO.
[924] 09-24 17:55:02:979: Could not open an LDAP connection to domain CONTOSO.
[828] 09-24 17:55:03:089: NTDomain::getConnection failed: This operation returned because the timeout period expired.
[828] 09-24 17:55:03:089: Retrying LDAP search.
[828] 09-24 17:55:03:089: Could not open an LDAP connection to domain CONTOSO.
[924] 09-24 17:55:03:089: NTDomain::getConnection failed: This operation returned because the timeout period expired.
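In a trace like the one above, the long pauses inside a single thread are often the most telling part. The sketch below groups lines by thread ID and reports gaps above a threshold; it assumes the line layout shown in this trace, and the function names, the default year (the trace carries none) and the 10-second threshold are hypothetical choices.

```python
import re
from datetime import datetime

# [thread] MM-DD HH:MM:SS:mmm: message
LINE = re.compile(r"^\[(\d+)\] (\d{2}-\d{2} \d{2}:\d{2}:\d{2}):(\d{3}): (.*)$")


def parse_trace(text, year=2009):
    """Yield (thread_id, timestamp, message) for each well-formed trace line."""
    for raw in text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue
        thread, stamp, millis, message = m.groups()
        ts = datetime.strptime(f"{year}-{stamp}.{millis}", "%Y-%m-%d %H:%M:%S.%f")
        yield int(thread), ts, message


def find_stalls(text, threshold_seconds=10.0):
    """Return (thread_id, gap_in_seconds, message_before_gap) for long pauses."""
    last_seen = {}
    stalls = []
    for thread, ts, message in parse_trace(text):
        if thread in last_seen:
            prev_ts, prev_msg = last_seen[thread]
            gap = (ts - prev_ts).total_seconds()
            if gap >= threshold_seconds:
                stalls.append((thread, gap, prev_msg))
        last_seen[thread] = (ts, message)
    return stalls


SAMPLE = """\
[924] 09-24 17:53:39:429: Processing MS-CHAP v2 authentication.
[924] 09-24 17:54:27:198: LogonUser succeeded.
[924] 09-24 17:54:27:198: Sending LDAP search to DCCONT.contoso.msft.
"""

for stall in find_stalls(SAMPLE):
    print(stall)
```

On the three sample lines, the only gap reported is the pause of roughly 48 seconds between the MS-CHAP v2 processing line and "LogonUser succeeded".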
By this time you may be thinking: how can we have basic connectivity and still not reach the DC? That was the key question. The next step was a simple test: change the groups that had permission to access the VPN and use a local group, from ISA itself. To do that I created a local group called VPNUsers, added the local administrator to it (for testing purposes) and tried to add CONTOSO\Domain Users when…I got the following error: The RPC server is unavailable. OK, now it makes sense, and to start troubleshooting this I used:
Troubleshooting RPC Endpoint Mapper errors using the Windows Server 2003 Support Tools from the product CD
It turns out that the issue was related to the binding order of the network adapter. On ISA Server we had the following binding order:
Figure 1
It is strongly recommended that the internal network (in this case LocalCorp) be at the top. This can avoid many issues such as this one. Here are some references on that:
· Delay in NetBIOS connections from a multi-homed computer
· How to properly configure network binding order on a BackOffice Small Business Server system
· DNS: Valid network interfaces should precede invalid interfaces in the binding order
After changing the binding order it worked like a charm!
3. Scenario 2 – Unable to Authenticate using RSA VPN Client
First and foremost, even before touching ISA Server, it is very important to review the article below from RSA about how to configure ISA Server 2006 to use RSA for VPN authentication:
http://www.rsa.com/rsasecured/guides/imp_pdfs/Microsoft_ISA2006_AM7.1_VPN.pdf
By following the above article you can configure pretty much all you need in order to make the VPN work with RSA. However, in this case, although everything was done, it was not authenticating. The key here was that the authentication was done using RADIUS.
By looking at the Netmon trace taken while authentication was happening (internal NIC of ISA) we could see that ISA (10.20.20.1) kept sending requests to the RADIUS Server (10.20.20.20) without getting an answer:
Figure 2
According to the system administrator this RADIUS server was being used for other purposes and it only failed with ISA. On the RADIUS Server it was possible to see the following event in Event Viewer:
Figure 3
That was simple, right? If the RADIUS Server doesn’t have ISA Server configured as a RADIUS client, it will not answer the request. To fix that we just need to add ISA Server as a RADIUS client and make sure that the shared secret between them is the same. Here are some references:
Event ID 13 — RADIUS Client Configuration
Error 930; The authentication server did not respond to authentication requests in a timely fashion
4. Conclusion
These are more examples of how to troubleshoot VPN on ISA Server 2006. Try not to focus so much on ISA itself because many times ISA is not the culprit.
There is nothing more frightening than knowing that you can’t back up your server, and I say knowing because many times administrators don’t back up because they forgot or because they think someone else is doing it. But when you know that the backup cannot be performed, then you know that Murphy’s Law might get you.
This post is about a scenario where a firewall administrator was trying to back up the ISA Server configuration and was receiving the following error:
Figure 1 – Error when trying to perform a backup.
1. Understanding the Error Message
Before panicking it is important to take a deep breath and understand what this error message is trying to tell you. If you search Bing for the error code 0xc0040357 (as simple as http://www.bing.com/search?q=0xc0040357), you will find KB 922222, which is a good start. But for now, let’s leave this on hold and understand the error description. It says:
“The Web Listener referenced by HTTP Compression
HTTP-Compression-Configuration does not exist.
The error occurred on object ‘HTTP-Compression-Configuration’ of class
‘HTTP Compression’ in the scope of array ISACONTN1”
The keywords of this error message (Web Listener, HTTP Compression, does not exist) tell us that the problem is related to:
· A Web Listener that no longer exists but is still referenced in the HTTP Compression option.
To confirm that this understanding is correct, the next step is to try to open the HTTP Compression Preferences under General and see if we are able to see the properties. When we do that we get the error below (which confirms that the problem resides there):
Figure 2 – Error when opening HTTP Compression options.
2. Investigating Further
This scenario happened on an ISA Server 2006 Enterprise Edition array that apparently had no problem: everything was working fine, nodes were in sync and the only operation that was failing was the backup. Since this is Enterprise Edition we have ADAM, and the configuration values are primarily stored there. According to the error message the error occurred in the object ‘HTTP-Compression-Configuration’, therefore we should take a look at this object.
In order to do that we can use the same approach explained in KB922222 (the one that I mentioned before). Since that article is for ISA 2004, there is a slight difference when trying to connect to CSS using ADAM on ISA 2006. Instead of using the Distinguished Name CN=FpcConfiguration, you will use CN=FPC2 as shown below:
Figure 3 – Connecting to CSS using ADAMADSIEdit.
After connecting to ADAM, browse to the following location:
Figure 4 – Looking for the HTTP-Compression-Configuration object.
Notice that under the HTTP-Compression-Configuration object we have another object called WebListenerUsed, and in this case it contains a GUID for a WebListener. This GUID is not the real name of the WebListener; it is actually the CN of this object. To see the name of the real web listener to which this object refers, right-click the object and choose Properties. Look for the attribute called msFPCName as shown below:
Figure 5 – Checking the msFPCName for this attribute.
If this is a valid listener, we should see this GUID under CN=RuleElements,CN=WebListeners as shown below:
Figure 6 – Valid Web Listeners.
Notice that in this case I do not have this value in there, which means that this is the reason why we are receiving this error. In other words: there is an object present under HTTP-Compression-Configuration object that has an attribute that points to an invalid object.
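The check done by hand above amounts to looking for a dangling reference: a value of msFPCName that has no matching object under the WebListeners container. A minimal sketch of that comparison, assuming both lists have already been read out of ADAM by some other means; the function name and the two "existing" GUIDs are hypothetical, while the reference pair comes from the dump shown later in this post.

```python
def find_dangling_references(references, existing_listeners):
    """Return {reference_cn: target} for references whose target is missing.

    references: maps the CN of each object under WebListenerUsed to the value
                of its msFPCName attribute (the listener it points to).
    existing_listeners: CNs actually present under the WebListeners container.
    """
    known = {cn.upper() for cn in existing_listeners}
    return {
        ref: target
        for ref, target in references.items()
        if target.upper() not in known
    }


references = {
    "{58231C84-C3B7-4BF7-9A18-1943A657D410}": "{DC6A3B0D-9E21-454D-BF68-00E9A79C4E3E}",
}
# Hypothetical listener CNs, for illustration only.
existing_listeners = [
    "{11111111-1111-1111-1111-111111111111}",
    "{22222222-2222-2222-2222-222222222222}",
]

print(find_dangling_references(references, existing_listeners))
```

An empty result would mean every reference points to a listener that still exists.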
3. Now What?
Since this is an invalid object we need to remove it. However, before any direct intervention on ADAM or the registry, make sure that you have a backup of your system. In this case, since we cannot back up the whole array, we should at least back up the Firewall Policy (which works fine since it doesn’t look for that object). Also, before deleting this object, you can dump it using LDIFDE so that you have a backup of its attributes (in case you need them). To export this object, use the command below:
C:\>ldifde -t 2171 -f backup.ldf -s isacontn1 -d "CN={58231C84-C3B7-4BF7-9A18-1943A657D410},CN=WebListenerUsed,CN=HTTP-Compression-Configuration,CN=WebProxy,CN=ArrayPolicy,CN={878CC789-AF34-48A1-849B-89A806E2CB88},CN=Arrays,CN=Array-Root,CN=FPC2"
Connecting to "isacontn1"
Logging in as current user using SSPI
Exporting directory to file backup.ldf
Searching for entries...
Writing out entries.
1 entries exported
The command has completed successfully
Notes:
· -t allows you to specify which port you are going to use to connect to ADAM. In this case port 2171.
· -d allows you to specify the Distinguished Name (DN) of the object that you want to dump. In this case you need to open the properties of the object and search for the DN.
The output (backup.ldf) of this command for this case is:
dn: CN={58231C84-C3B7-4BF7-9A18-1943A657D410},CN=WebListenerUsed,CN=HTTP-Compression-Configuration,CN=WebProxy,CN=ArrayPolicy,CN={878CC789-AF34-48A1-849B-89A806E2CB88},CN=Arrays,CN=Array-Root,CN=FPC2
changetype: add
objectClass: top
objectClass: msFPCRef
cn: {58231C84-C3B7-4BF7-9A18-1943A657D410}
distinguishedName:
 CN={58231C84-C3B7-4BF7-9A18-1943A657D410},CN=WebListenerUsed,CN=HTTP-Compressi
 on-Configuration,CN=WebProxy,CN=ArrayPolicy,CN={878CC789-AF34-48A1-849B-89A806
 E2CB88},CN=Arrays,CN=Array-Root,CN=FPC2
instanceType: 4
whenCreated: 20090917112759.0Z
whenChanged: 20090917112759.0Z
uSNCreated: 321498
uSNChanged: 321498
name: {58231C84-C3B7-4BF7-9A18-1943A657D410}
objectGUID:: CkNXAblXDkWSUJaZY5Bexw==
objectCategory:
 CN=msFPC-Ref,CN=Schema,CN=Configuration,CN={F2298771-D6AA-42E1-B32D-4C0DCFD325
 4D}
msFPCRefClass: msFPCWebListener
msFPCName: {DC6A3B0D-9E21-454D-BF68-00E9A79C4E3E}
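If you want to read such a dump programmatically, keep in mind that LDIF folds long values: a line that starts with a single space continues the previous line. A minimal sketch of unfolding and parsing one record, assuming plain "attribute: value" lines; base64 values (the "::" form) are kept undecoded, and the function names are hypothetical.

```python
def unfold_ldif(text):
    """Join LDIF continuation lines (starting with one space) to the previous line."""
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def parse_ldif_record(text):
    """Parse a single LDIF record into {attribute: [values]}."""
    record = {}
    for line in unfold_ldif(text):
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue
        # Drop the separator remnants; "::" (base64) values stay undecoded.
        record.setdefault(name, []).append(value.lstrip(": "))
    return record


SAMPLE = "\n".join([
    "objectClass: top",
    "objectClass: msFPCRef",
    "distinguishedName: ",
    " CN={58231C84-C3B7-4BF7-9A18-1943A657D410},CN=WebListenerUsed,CN=HTTP-Compressi",
    " on-Configuration,CN=WebProxy,CN=ArrayPolicy,CN={878CC789-AF34-48A1-849B-89A806",
    " E2CB88},CN=Arrays,CN=Array-Root,CN=FPC2",
    "msFPCRefClass: msFPCWebListener",
    "msFPCName: {DC6A3B0D-9E21-454D-BF68-00E9A79C4E3E}",
])

record = parse_ldif_record(SAMPLE)
print(record["msFPCName"][0])
print(record["distinguishedName"][0])
```

The value of msFPCName is the listener GUID that has to be looked up under the WebListeners container.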
Now that we have a backup of the Firewall Policy and a dump of the object that we are deleting, let’s get rid of this invalid object. For this particular scenario the object is the one below:
Figure 7 – Object that needs to be eliminated in this case.
After highlighting this object, press Delete or right-click the object and choose Delete.
Note: For ISA Server 2006 Standard Edition, you have to delete this value from the registry.
4. Validating the Procedure
After removing the bad entry you should make sure that the array configuration is in sync. You can force a change, such as disabling a rule or changing the name of a rule, just to trigger a new synchronization. After doing that you should be able to open the HTTP Compression properties, as shown below:
Figure 8 – The bogus web listener was listed here before; now the list is clear.
5. Conclusion
Fixing corrupted objects in ADAM is not always as straightforward as this; sometimes you can’t easily determine which object is corrupted because there are many objects to evaluate, values to compare, etc. In scenarios where you can’t determine it, don’t try to “guess” which object is corrupt and delete it without being 100% sure. It is better to collect data with the ISA Data Packager in repro mode using the Administration template and open a case with MS CSS for further analysis.
I was reading this month’s Windows IT Pro Magazine (September 2009) and found a nice article written by an Escalation Engineer here at Microsoft Texas (Michael Morales) where he describes how to use ProcDump to catch high CPU utilization. This is an amazing tool that can also help ISA administrators, mainly in scenarios where we just can’t get the right data (in most cases, dumps) because the issue is random and when it happens nobody is available to execute a command (for example: launch DebugDiag and choose the option to manually dump the process).
For an ISA Server high CPU utilization scenario, a simple example is to dump the Firewall Service process two times when the CPU for wspsrv.exe is at or above 90 percent for 5 seconds and store the dumps in the c:\dumps folder:
c:\procdump.exe -c 90 -s 5 -n 2 wspsrv.exe c:\dumps
Isn’t that cool?
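To make the meaning of the three switches concrete, here is a small simulation of the trigger condition described above: a dump is taken once CPU usage has stayed at or above the threshold for the given number of consecutive seconds, up to a maximum number of dumps. This is only an illustration of that description and not how ProcDump is implemented; the function name, the one-sample-per-second input and the counter reset after each dump are assumptions.

```python
def dump_points(cpu_samples, threshold=90, seconds=5, max_dumps=2):
    """Return the sample indices (one sample per second) where a dump would occur."""
    points = []
    streak = 0
    for i, cpu in enumerate(cpu_samples):
        # Count consecutive seconds at or above the threshold.
        streak = streak + 1 if cpu >= threshold else 0
        if streak == seconds:
            points.append(i)
            streak = 0  # assumption: the counter restarts after each dump
            if len(points) == max_dumps:
                break
    return points


# Four busy seconds, one quiet second, then a long busy period.
samples = [95] * 4 + [50] + [95] * 15
print(dump_points(samples))
```

The first four busy seconds do not trigger anything because the quiet second interrupts the streak.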
Make sure to read the article from Michael Morales to fully understand how this tool works:
http://windowsitpro.com/article/articleid/102479/got-high-cpu-usage-problems-procdump-em.html
As TMG RTM is very close, the number of downloads of TMG Beta 3 is growing and, with that, some users are facing issues while trying to install it. This one in particular refers to an error checking domain membership while installing AD LDS. If you open the file %windir%\temp\ISAADAM_INSTALL_XXX.log you might see the following error:
adamsetup 7E4.BEC 0163 10:05:53.536 Enter CheckServiceSecurity
adamsetup 7E4.BEC 0164 10:05:53.536 Enter InitSecWinntAuthIdentity
adamsetup 7E4.BEC 0165 10:05:53.536 Enter State::GetOperation UNIQUE
adamsetup 7E4.BEC 0166 10:05:53.957 Enter State::GetOperation UNIQUE
adamsetup 7E4.BEC 0167 10:05:53.957 NtdsAdamValidateServiceAccount() => 1789
adamsetup 7E4.BEC 0168 10:05:53.957 info.eValidationResult = 0
adamsetup 7E4.BEC 0169 10:05:53.957 Enter GetErrorMessage 800706FD
adamsetup 7E4.BEC 016A 10:05:53.957 ADAMERR_OK
adamsetup 7E4.BEC 016B 10:05:53.957 AD LDS Setup was unable to validate the selected service account.
Error code: 0x800706fd
The trust relationship between this workstation and the primary domain failed.
adamsetup 7E4.BEC 016C 10:05:53.957 ADAMERR_OK
adamsetup 7E4.BEC 016D 10:05:53.957 Enter Feedback::ShowMessage AD LDS Setup was unable to validate the selected service account.
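The log shows the same failure twice, once as the return value 1789 and once as the error code 0x800706FD. The two are related: an HRESULT with facility 7 (Win32) carries the original Win32 error code in its low 16 bits. A short sketch of that decoding; the function names are hypothetical.

```python
FACILITY_WIN32 = 7


def decode_hresult(hresult):
    """Split a 32-bit HRESULT into (is_failure, facility, code)."""
    hresult &= 0xFFFFFFFF
    is_failure = bool(hresult >> 31)      # severity bit
    facility = (hresult >> 16) & 0x7FF    # 11-bit facility field
    code = hresult & 0xFFFF               # low 16 bits
    return is_failure, facility, code


def win32_code_from_hresult(hresult):
    """Return the wrapped Win32 error code, or None if the HRESULT is not one."""
    is_failure, facility, code = decode_hresult(hresult)
    if is_failure and facility == FACILITY_WIN32:
        return code
    return None


print(win32_code_from_hresult(0x800706FD))  # → 1789
```

Win32 error 1789 is the trust relationship failure quoted in the log, which is why domain membership is the place to start looking.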
Also, keep in mind that if you try to install TMG Beta 3 on Windows Server 2008 R2 it might not install, or it might install and not work properly. The reason is that TMG Beta 3 was not designed to run on Windows Server 2008 R2 and therefore is not supported there (for this release, Beta 3). Don’t worry, by the TMG RTM version we will be fine with that; however, if you want to play with TMG Beta 3 now, make sure to install it on Windows Server 2008 (not R2). If you are indeed using Windows Server 2008 and you receive this error in the log, then go ahead and start troubleshooting domain membership problems, secure channel, RPC, etc…you know the drill :).
If you need to discuss technical issues about TMG Beta 3, Microsoft has an open forum for that on the link below:
http://social.technet.microsoft.com/Forums/en-US/FTMGNext/threads
Today ISSA released the ISSA Journal – September 2009 issue that contains an article that I wrote about unified threat management.
You can view the online version at:
https://www.issa.org/Library/Journals/2009/September/ISSA%20Journal%20September%202009.pdf
I know…but let me explain why.
We are in the final stage of the book; we MUST finish all reviews by the end of this month so MS Press can start the production phase in order to have this book available by December 2009. Since we started writing this book when TMG MBE was still in Beta (yeah, more than one year ago) we have lots of screens to update and also some final technical details to adjust to make sure that it is all up to date with the latest build that we have internally. In summary: it is a LOT of work to do before the end of this month.
Although I’m absent from here, I’m still writing for ISA Team Blog and this month I already posted two entries there:
· Behavioral Change on IE7 can affect Outbound access through ISA Server 2006 that is using Redirect on a Deny Rule
· Time Matters - When ISA Server is affected by Windows Time settings
Besides that I was working to get my Windows Internals exam done (took it last week and passed). It was a tough exam; I took it last year when it was still in Beta and failed (got 636 and needed 700 minimum). Last month I finished reading Windows Internals 4th Edition to prepare for this. Now I’m catching up on my “To Read List” (in the following order):
· Windows Internals 5th Edition (some interesting updates on it)
· Reversing: Secrets of Reverse Engineering (very cool book)
· ANSI C++ (this is the type of book to read in calm moments)
The good thing about being busy with so many projects is that the recognition will come (sooner or later). An example of that is what I’m harvesting now from my OOF time in Brazil last June. When I was there I delivered some presentations about information security and now that initiative is part of the ISSA Security Start 25th anniversary gallery, see this link here for more details. I’m not sure if I will be able to go to CA, but it is already an honor to be on this list with so many great people (like my fellow MS friend Russ McRee).
There are two other projects that I can’t reveal right now, but I’m very excited about them and looking forward to starting. It will take a year to get them done, but…it will be worth the wait.