AD Troubleshooting

AD and Domain-related issues and troubleshooting methods for Active Directory.

The story of the Mysteriously Malfunctioning Mail Router (AKA EDNS and Exchange Escapades)

A small anecdote to illustrate how changes outside the control of the local administrators can adversely affect the internal infrastructure:

A colleague of mine in the Exchange team came to me with an issue: the customer's Exchange servers had suddenly stopped routing incoming mail for one specific domain.
Mail for all other domains was being routed without any problems; only that one domain was affected.
The response we were getting back from our DNS server looked highly suspicious: 'Server Failure'.

Looking at the Exchange logs from the mail router we saw the following:

Routing   Request ID=919: MX query 'EXSRV0106.CONTOSO.COM' completed with status 'ServerFailure' and target hosts: Array length=0 {  }
Routing   Request ID=919: last query response received
Routing   Request ID=919: DNS query 'EXSRV0106.CONTOSO.COM' failed with status 'ServerFailure'
Routing   Request ID=919: DNS query 'EXSRV0154.CONTOSO.COM' failed with status 'ServerFailure'
Routing   Request ID=919: DNS query 'EXSRV0117.CONTOSO.COM' failed with status 'ServerFailure'
Routing   Request ID=919: DNS query 'EXSRV0165.CONTOSO.COM' failed with status 'ServerFailure'
Routing   Request ID=919: failed to obtain any addresses
Routing   Request ID=919: EndResolve with status 'ServerFailure' and retry time 01/01/0001 00:00:00
SmtpSend  Acking Connection due to DNS error. Status -> Retry : The DNS query for 'SmartHostConnectorDelivery':'EXSRV0106':'37ff75b8-3366-4cdd-8cea-3a8eebcddabb' failed with error: ServerFailure


No change had been made to anything at the customer site, yet mail routing stopped working in the middle of the night, when no local admin was reported to be anywhere near a keyboard or a server.

After examining this we determined the following:

- The mail routing for the specific domain was making a DNS query to determine the IPs of the target servers. The DNS query would normally return no records, because the servers aren't registered externally, and the mail routing code would then fall back to a local file that contains the IPs of the servers.

- For all other incoming mail there was a conditional forwarder that routed the DNS requests to an internal DNS server.

- There was no conditional forwarder set up for the failing domain, so all DNS queries for that domain were being sent either to one of the root DNS servers on the Internet or to the ISP's DNS servers.

Essentially, an update had been made outside of the customer's network. This update was preventing the DNS response from coming back to us, which resulted in the 'Server failure' message. That in turn caused the mail routing code to terminate the routing attempt instead of falling back to the local file.
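
The difference between the two outcomes can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not Exchange's actual code: the names DnsStatus, LOCAL_HOSTS and resolve_targets, and the address 10.0.0.6, are made up for the example.

```python
from enum import Enum


class DnsStatus(Enum):
    SUCCESS = "Success"
    NO_RECORDS = "NoRecords"          # definitive answer: nothing registered
    SERVER_FAILURE = "ServerFailure"  # no usable answer came back


# Hypothetical stand-in for the local file that lists the server IPs.
LOCAL_HOSTS = {"EXSRV0106.CONTOSO.COM": ["10.0.0.6"]}


def resolve_targets(name, status, addresses=()):
    """Return (addresses, action) for one lookup result."""
    if status is DnsStatus.SUCCESS and addresses:
        return list(addresses), "deliver"
    if status is DnsStatus.SERVER_FAILURE:
        # Treated as a transient error: the attempt ends here and is
        # retried later, so the local file is never consulted.
        return [], "retry"
    # A definitive "no records" answer: fall back to the local file.
    local = LOCAL_HOSTS.get(name.upper(), [])
    return list(local), ("deliver" if local else "retry")
```

With NO_RECORDS the lookup for EXSRV0106.CONTOSO.COM ends up at the local entry and mail is delivered; with SERVER_FAILURE the same lookup returns no addresses and the connection goes into retry, which matches what the log showed.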

In other words, a firewall or router en route was dropping the EDNS packets, and that was the root cause.
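
One way to check for this kind of filtering is to send the same query twice, once plain and once with an EDNS0 OPT record, and compare what comes back. The following is a minimal sketch using only the Python standard library; it is not the tool we used on the case, the function names build_query and probe are made up for the example, and the payload size is whatever the caller chooses.

```python
import socket
import struct


def build_query(name, qtype=1, edns_payload=None, query_id=0x1234):
    """Build a DNS query; append an EDNS0 OPT record if edns_payload is set."""
    arcount = 1 if edns_payload else 0
    # Header: ID, flags (recursion desired), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    packet = header + qname + struct.pack("!HH", qtype, 1)  # QCLASS = IN
    if edns_payload:
        # OPT pseudo-record: root name, TYPE 41, CLASS carries the UDP
        # payload size, TTL (extended RCODE, version, flags) zero, RDLENGTH 0
        packet += b"\x00" + struct.pack("!HHIH", 41, edns_payload, 0, 0)
    return packet


def probe(server, name, edns_payload=None, timeout=3.0):
    """Send one query over UDP; return the RCODE, or None if nothing usable came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name, edns_payload=edns_payload), (server, 53))
            data, _ = sock.recvfrom(4096)
        except OSError:  # includes socket.timeout
            return None
    if len(data) < 4:
        return None
    return data[3] & 0x0F  # low four bits of the flags word hold the RCODE
```

Calling probe(server, "contoso.com") and then probe(server, "contoso.com", edns_payload=4096) against the same upstream server gives a quick comparison: if the plain query gets an answer and the EDNS query gets None, something on the path is likely dropping the EDNS packets.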

The case was ultimately resolved by simply adding a conditional forwarder for the affected domain - bypassing the external device that was causing the problem.
Disabling EDNS probes on the W2k3/W2k8 DNS server would also have been an option (dnscmd /config /enableEDNSprobes 0).

This is described in KB 828263, which also applies to Windows Server 2008 and 2008 R2. The difference is that EDNS probing is not enabled by default in W2k3, whereas it is the default behaviour in W2k8+.

 

Further details:

DNS query responses do not travel through a firewall in Windows Server 2003
http://support.microsoft.com/default.aspx?scid=kb;EN-US;828263

Comments
  • Clarification: EDNS0 was enabled by default on Windows Server 2003 RTM DNS servers and subsequently disabled by post-RTM service packs.

    EDNS was again enabled by default on Windows Server 2008 R2 DNS servers. EDNS can fail when a device on the path drops UDP DNS packets larger than 512 bytes. W2K8 R2 also featured fallback logic that will be improved in Windows Server 2008 R2 Service Pack 1.
