Interesting call today. The customer reported that his Exchange Server queues were full of "bogus" email and that he was an open relay. He also stated that he had followed http://support.microsoft.com/default.aspx?scid=kb;EN-US;324958 to make sure he was not an open relay. The article had him telnet to the SBS box from a remote host and try to "relay" mail, but he was unable to issue any commands...as if the SBS box was allowing the port 25 connection but not accepting commands. Interesting...
He also stated that he had changed the "Relay Restrictions" to "Only the list below" and added the SBS's IP address to the list (per http://support.microsoft.com/default.aspx?scid=kb;EN-US;324958), and it did not change the behavior. It was previously set to "All except the list below" and the list was blank. He would clean out the queues per the article above and mail would get jammed up again.
The first thing I did was determine whether the mail was "relayed" mail (meaning neither the "from" nor the "to" was one of his local domains/users) or NDRs resulting from a "mailbomb" or "reverse NDR attack". Well, it was "relayed" mail (with a mix of NDRs as well, resulting from the relayed mail eventually failing). Since I could not test manually with telnet, I set up Outlook Express on my client machine here at the office to use his public IP as my outbound SMTP server and tried to send mail...sure enough, it went out. So now I know the IP I was connecting to is in fact an "open relay". This does not necessarily mean the SBS box itself is an open relay (some routers/firewalls can be configured, or misconfigured, to act as an open proxy or SOCKS proxy).
I got Terminal Server access to the box and started looking around. The first thing I did was turn on SMTP protocol logging in the properties of the SMTP Virtual Server, set it to Microsoft IIS Log File Format, and restart the Simple Mail Transfer Protocol service. I did this so I could get some logging on the source IPs of the bad guys relaying off him. By default, the log file is in c:\windows\system32\logfiles\smtpsvc1. So I checked the "logfiles" directory and there was no smtpsvc1 directory!! This directory should be created as soon as SMTP traffic is logged with the setting above. Ok, that's weird.
I noticed that the customer had 2 NICs installed. An IPCONFIG showed the internal NIC had 192.168.0.100 and the other NIC had 0.0.0.0. I went to the network properties and opened the properties of the "other" NIC, and it had a static address assigned (24.x.x.x). When you hovered over the "other" NIC, it said "acquiring network address". Ok, that's weird too...
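The relayed-versus-NDR check described above can be sketched roughly as follows. This is only an illustration of the reasoning, not a tool I ran on the call: the `LOCAL_DOMAINS` set, the function names, and the addresses are all made up, and I am assuming an NDR shows up with the null sender `<>`.

```python
from email.utils import parseaddr

# Hypothetical local domain; substitute the domains the server really owns.
LOCAL_DOMAINS = {"example.com"}


def domain_of(address):
    """Return the lowercased domain part of an email address."""
    _, addr = parseaddr(address)
    return addr.rpartition("@")[2].lower()


def classify(sender, recipient, local_domains=LOCAL_DOMAINS):
    """Label a queued message as 'ndr', 'relayed', or 'local'."""
    # Assumption: bounce messages carry the null sender "<>".
    if sender.strip() in ("", "<>"):
        return "ndr"
    sender_is_local = domain_of(sender) in local_domains
    recipient_is_local = domain_of(recipient) in local_domains
    # Neither end belongs to us, so the server is just passing it along.
    if not sender_is_local and not recipient_is_local:
        return "relayed"
    return "local"
```

For example, `classify("a@spam.test", "b@victim.test")` comes back as `"relayed"`, which is the pattern I was seeing in the customer's queues.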
At this point my thoughts were:
1. The router was changing the source IP address of the external SMTP connection to its own internal IP (192.168.0.1 for example) and SBS was relaying it that way (by default we allow the entire internal subnet to relay). This was a thought because the "settings were right" on the SBS box as far as relay is concerned AND I knew the router was playing with the SMTP traffic, because I could not issue my telnet commands when connected from the internet but I could issue them when connected via telnet on the LAN.
2. There was some weird IP routing problem on the SBS box itself. This was in the back of my mind because of the IPCONFIG weirdness showing 0.0.0.0 and "acquiring network address" when the NIC itself was hard-coded to 24.x.x.x.
3. SMTP was simply busted. My thoughts on this were due to the fact that I could not get SMTP logging to kick in.
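To make theory #1 concrete, here is a small sketch of why a router that rewrites the source address would defeat a subnet-based relay rule. It is a simplified model, not Exchange's actual logic: the /24 mask is my assumption based on the 192.168.0.100 address, and the function names and the 203.0.113.7 "attacker" address are made up for the example.

```python
import ipaddress

# Assumed internal subnet (the /24 mask is a guess from 192.168.0.100).
INTERNAL_SUBNET = ipaddress.ip_network("192.168.0.0/24")


def may_relay(source_ip, allowed=(INTERNAL_SUBNET,)):
    """Return True if the source address falls inside an allowed network."""
    ip = ipaddress.ip_address(source_ip)
    return any(ip in net for net in allowed)


def source_seen_by_server(client_ip, router_lan_ip=None):
    """Model a router that may replace the client's IP with its own LAN IP."""
    return router_lan_ip if router_lan_ip else client_ip


# Internet client, source address preserved: relay is refused.
direct = may_relay(source_seen_by_server("203.0.113.7"))

# Same client, but the router substitutes 192.168.0.1: relay is allowed.
rewritten = may_relay(source_seen_by_server("203.0.113.7", "192.168.0.1"))
```

In this model `direct` is `False` and `rewritten` is `True`, which is exactly the situation where the relay settings look right on the server but outsiders can still relay.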
So now I have some things to start acting on. The first thing I did was disable the 2nd NIC. This did not change the behavior. I then stopped the Default SMTP Virtual Server and created another one called "test". I tried to start the "test" SMTP Virtual Server and it would not start and gave me an error. This gave more weight to #3 above.
I looked around some more, double checking and triple checking the SMTP Virtual Server settings and the Recipient Policy in Exchange, and looking for weird settings in Routing and Remote Access (he had a single NIC but RRAS was running, providing VPN connectivity). Nothing out of the ordinary.
I then started focusing on #3 above. Whenever I troubleshoot pretty much any Exchange mail flow problem I always turn on logging, and it was bugging me that I could not see the logs. I created another SMTP Virtual Server and could not start that one either. I then wondered if the server might have the weird "DS2MB" issue with the .NET Framework and the "RootVer" registry key. One way to tell is to look for MSExchangeMU events in the application log (yes, I have seen this problem before). I filtered the app log on MSExchangeMU and there were a ton of them.
I then followed http://support.microsoft.com/default.aspx?scid=kb;EN-US;906154, checked the RootVer registry key, and it was wrong. I changed it per the article and restarted IIS and all the Exchange services. I deleted my extra test SMTP Virtual Servers and started the Default SMTP Virtual Server. Now I have a log in c:\windows\system32\logfiles\smtpsvc1. I opened it up and saw a TON of "550 unable to relay", which is a good sign. I then fired up Outlook Express again here on my work machine and tried to relay off his server, and it failed. Good again.
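Once the log exists, the "who is trying to relay off him" question can be answered with a quick tally. The sketch below is an assumption-heavy illustration: it assumes each log line is comma-separated with the client IP as the first field, it matches on the text "unable to relay", and the sample lines are simplified, made-up stand-ins rather than real log output.

```python
from collections import Counter


def relay_denials_by_ip(lines):
    """Count refused relay attempts per client IP.

    Assumes comma-separated lines whose first field is the client IP.
    """
    counts = Counter()
    for line in lines:
        if "unable to relay" not in line.lower():
            continue
        client_ip = line.split(",", 1)[0].strip()
        counts[client_ip] += 1
    return counts


# Simplified, hypothetical lines (not the real field layout).
sample = [
    "203.0.113.7, RCPT, 550 unable to relay",
    "203.0.113.7, RCPT, 550 Unable to relay",
    "198.51.100.9, RCPT, 550 unable to relay",
    "192.168.0.50, RCPT, 250 OK",
]
top_offenders = relay_denials_by_ip(sample).most_common()
```

Because the function only iterates over lines, an open file object can be passed in place of the `sample` list to run it against a real log.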
So what happened...?? Before the explanation, an understanding of "DS2MB" is needed. In a nutshell, DS2MB replicates configuration from Active Directory (Exchange-specific AD data in this case) to the Metabase. The Metabase is where all the action takes place for IIS (SMTP is a subset of IIS). My analogy is: "Registry is to Windows as Metabase is to IIS". So when you change some setting in Exchange that affects SMTP (add a connector, change the recipient policy, enable Recipient Filtering, etc.), DS2MB kicks in pretty quickly and replicates the change to the Metabase.
Here is my theory: At some point in the past the SBS box was misconfigured and was an open relay. While it was in that "open relay" state, the RootVer key broke, breaking DS2MB replication. The misconfiguration was later corrected in Exchange, but the changes never replicated down to the Metabase, so SMTP kept running with the old open-relay settings.
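The theory is easier to see with a toy model of one-way replication. To be clear, this is not how DS2MB is implemented; the class, its attributes, and the "open"/"restricted" values are all invented for illustration. It only shows how the settings you edit and the settings actually in effect can drift apart when replication is broken.

```python
class ToyReplication:
    """Toy model: a 'directory' copy you edit, a 'metabase' copy SMTP uses."""

    def __init__(self):
        self.directory = {"relay": "open"}   # what the admin tools show
        self.metabase = dict(self.directory)  # what the service really uses
        self.replication_working = True

    def change_setting(self, key, value):
        """Edit the directory; copy to the metabase only if replication works."""
        self.directory[key] = value
        if self.replication_working:
            self.metabase[key] = value

    def resync(self):
        """Push every directory setting down, as happens once the fix is in."""
        if self.replication_working:
            self.metabase.update(self.directory)

    def effective(self, key):
        """Return the value the service actually runs with."""
        return self.metabase[key]


box = ToyReplication()
box.replication_working = False           # replication breaks
box.change_setting("relay", "restricted")  # admin fixes the setting
stale = box.effective("relay")             # service still sees the old value

box.replication_working = True            # replication repaired
box.resync()
fixed = box.effective("relay")
```

Here `stale` is still `"open"` even though the directory says `"restricted"`, and `fixed` becomes `"restricted"` only after replication is repaired, which matches what I saw after correcting RootVer and restarting the services.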
The “DS2MB” issue can produce varying symptoms like the following:
--Unable to receive email for a domain that was recently added to the Exchange Recipient Policy.
--Relay settings in the SMTP Connector not taking effect.
I really enjoy reading this sort of post - the technical detail is interesting, but more useful are the insights into the diagnostic process. As an SBS technology consultant, I find this extremely useful.
Great article. I have been trying to resolve this issue for weeks on my Exchange 2003 server.