Scanning Exchange databases with file system antivirus is a recipe for disaster. This really should not come as a surprise for admins running Exchange services within the enterprise, since this has been the field requirement for a long time. The documentation provided by Microsoft is very clear in what exclusions are required for file system antivirus and Exchange to coexist. For reference the relevant articles are:
If this is so well documented, then what could possibly go wrong? Plenty….
Update 30-6-2014: Please also see this post on a related issue.
Every vendor who writes a file system AV product will implement theirs in a different way. Because of this, and the fact that I will not identify vendors by name, this article will be written in a generic style. The concepts however will apply to the vast majority of AV products.
TechNet does a good job of listing the types of file system antivirus scanners:
Other terminology that may be encountered is the term On-Access. This is where AV will process a file when it is accessed. Unlike the On-Demand scan, if a file is never opened then it is never scanned. Conversely, if it is opened multiple times then it will likely get scanned each time it is accessed. The exact details of this are at the discretion of the AV vendor.
The heuristics contained within each AV product vary greatly, and they behave differently on the above point and many others. Some do not show the configured file system exclusions in their admin tool graphical interface, and you have to look at the registry to see what file system paths are actually being excluded. Others allow the AV team to lock down the management application on the Exchange server, so that it is harder or impossible to see what scans are running, to troubleshoot issues and to terminate the AV scan (if required) without waiting for the AV team to respond.
Please consult with your AV team and review their vendor’s documentation to understand how their product works.
Regrettably there are multiple issues that can and will arise if you allow file system AV to scan Exchange. Note that this is not just the mailbox database file; there is a range of other locations that must also be exempted from file system AV scanning. For details see the links at the start of this post.
File-level scanners may scan a file when the file is being used or at a scheduled interval. This can cause the scanners to lock or quarantine an Exchange log file or a database file while Exchange tries to use the file. This behaviour may cause a severe failure in Microsoft Exchange and may also cause -1018 ESE errors.
One thing to note is that file-level scanners do not provide protection against e-mail viruses, such as the Storm Worm. Storm Worm was a backdoor Trojan horse virus that propagated itself through e-mail messages. The worm joined the infected computer to a botnet, where the computer was used to send spam e-mail messages in periodic bursts. Such viruses can affect the performance of the computer and the network that it is attached to.
This is not a new issue. As my friend Dave McGarr puts it over on his blog, Friends don’t let friends scan the M- drive! Exchange 2000, which introduced the M:\ drive, was often negatively impacted by file system AV scanning M:\. Because of this, the M:\ drive was hidden by default in Exchange 2003.
This is the story of a recent engagement where I ran into some serious AV issues. The customer in question had recently completed an Exchange Server Risk Assessment (ExRAP). ExRAP looks at both technical and process aspects of managing messaging services. One interview question specifically asks if the correct AV exclusions have been implemented. The customer stated that they were.
Fast forward 4 months. The customer’s stable Exchange environment suddenly started to exhibit strange behaviours. Issues included degraded database performance, database failover issues and very poor Outlook client response times. As part of initial troubleshooting Microsoft requested that the AV exclusions be checked to ensure that they were correct and were not causing any issues. Again they were stated as correct. Screen shots and remote assistance sessions showed that the settings were entered. So what was preventing databases from failing over between DAG members?
Well it turns out that only half of the puzzle was validated. Unbeknown to the Exchange admins, the AV team had implemented a weekly On-Demand scan that started late Sunday evening and scanned every single file on the server. Yes that's right -- zero exclusions… It gets better! These scans were taking a very long time to complete, and in some cases the scan did not complete until Wednesday or Thursday!
The AV product in use has a feature where it will lock a file that looks suspicious for an unspecified amount of time. The lock duration is controlled by the AV engine and is entirely at its discretion. This is what caused the database failover issues. When Exchange tried to mount a database on a server, AV locked the Exchange database file as it thought that MDB01.edb was suspicious. Since the file was locked, Exchange was unable to gain access to the database and mount it. If enough time elapsed then AV would release the file and the database could be mounted. Reviewing traces corroborated this, as we would see Exchange starting to read the database but not progressing further.
Not only was this an unsupported act as far as Microsoft is concerned, but the impact to the customer was also tremendous. Some of the issues experienced were:
Rather than just state that the required exclusions be implemented, I thought it would be more beneficial to discuss some of the areas which typically contribute to the above situation, and some resolutions.
All teams must be tightly aligned on how AV is deployed and configured. While server teams like Exchange do not need to know the exact details of implementing AV on the backend, they must understand how to communicate with the other teams effectively (more on this in a minute!). For example, how do the Exchange servers get the correct AV policy assigned? Is it based on server name, location in AD, or are Exchange servers manually tagged with a policy? This sounds minor, but this knowledge is critical in understanding the impact of choosing a different server name, or the steps required if reinstalling an Exchange server from scratch.
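To make the server name point concrete, here is a minimal sketch of name-based policy assignment. The rule patterns, policy names and server names are all hypothetical; a real AV product may key on AD location or a manual tag instead, and will have its own rule syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical assignment rules: (server-name pattern, policy name).
# Real AV products may instead key on AD location or a manual tag.
ASSIGNMENT_RULES = [
    ("EXCH-*", "Exchange-Exclusions"),
    ("SQL-*", "SQL-Exclusions"),
]
DEFAULT_POLICY = "Default-ScanEverything"


def policy_for(server_name):
    """Return the first policy whose pattern matches the server name."""
    name = server_name.upper()
    for pattern, policy in ASSIGNMENT_RULES:
        if fnmatchcase(name, pattern):
            return policy
    # No rule matched: the server silently receives the default policy.
    return DEFAULT_POLICY


for server in ("EXCH-MBX01", "MAIL-MBX01"):
    print(f"{server}: {policy_for(server)}")
```

In this sketch a server rebuilt as MAIL-MBX01 matches no rule and quietly falls back to the default policy, which is the kind of surprise the Exchange team needs to know about before choosing a new naming convention.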
To assist with communicating effectively, all teams should use the same terminology to minimise any potential misunderstandings. In the above example, the Exchange team understood an AV exclusion to apply to any and all AV scans. However the AV team did not share this viewpoint, and their terminology was more granular.
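The gap between the two viewpoints can be sketched as a simple check. This assumes, purely for illustration, that the AV product stores exclusions separately per scan type; the paths, the policy layout and the function names are hypothetical, and the real list of required exclusions should come from Microsoft's documentation for your Exchange version.

```python
from pathlib import PureWindowsPath

# Hypothetical paths the Exchange team needs excluded; take the real list
# from Microsoft's documentation for your Exchange version.
REQUIRED = [r"D:\ExchangeDatabases\MDB01", r"E:\ExchangeLogs\MDB01"]

# Hypothetical policy: exclusions are stored separately for each scan type.
POLICY = {
    "on-access": [r"D:\ExchangeDatabases", r"E:\ExchangeLogs"],
    "on-demand": [],  # the weekly scan from the story: zero exclusions
}


def is_covered(path, exclusions):
    """True if path equals, or sits beneath, one of the excluded folders."""
    target = PureWindowsPath(path)
    for excluded in exclusions:
        root = PureWindowsPath(excluded)
        if target == root or root in target.parents:
            return True
    return False


def coverage_gaps(policy, required):
    """Map each scan type to the required paths it would still scan."""
    return {
        scan_type: [p for p in required if not is_covered(p, exclusions)]
        for scan_type, exclusions in policy.items()
    }


for scan_type, missing in coverage_gaps(POLICY, REQUIRED).items():
    print(scan_type, "OK" if not missing else f"MISSING: {missing}")
```

Checking every scan type, rather than asking "are the exclusions in place?", is the point: the On-Access settings in this sketch look correct while the On-Demand scan still covers every required path.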
There must be a detailed discussion on the configuration of the AV policies that are applied to the Exchange infrastructure. Some examples include:
The AV agent health must be monitored by the AV team to ensure that an agent does not “go native” and ignore its configuration. The worst possible case here would be for an agent to revert to its default configuration, which typically means that there are no exclusions and all files and processes are scanned.
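One way to picture this kind of monitoring is a drift check that compares what an agent reports against an agreed baseline. This is only a sketch: the baseline entries, the setting names and the shape of the reported configuration are assumptions, and how you actually read an agent's effective configuration depends entirely on the AV vendor.

```python
# Hypothetical baseline agreed between the Exchange and AV teams.
BASELINE = {
    "excluded_paths": {r"d:\exchangedatabases", r"e:\exchangelogs"},
    "excluded_processes": {"store.exe", "msexchangerepl.exe"},
}


def normalise(values):
    """Lower-case and strip entries so cosmetic differences are ignored."""
    return {v.strip().lower() for v in values}


def find_drift(baseline, reported):
    """Return baseline entries that the agent's reported config lacks."""
    drift = {}
    for setting, expected in baseline.items():
        missing = normalise(expected) - normalise(reported.get(setting, ()))
        if missing:
            drift[setting] = sorted(missing)
    return drift


# An agent that has reverted to defaults reports no exclusions at all.
reverted_agent = {"excluded_paths": [], "excluded_processes": []}
healthy_agent = {
    "excluded_paths": [r"D:\ExchangeDatabases", r"E:\ExchangeLogs"],
    "excluded_processes": ["Store.exe", "MSExchangeRepl.exe"],
}

print("healthy:", find_drift(BASELINE, healthy_agent))
print("reverted:", find_drift(BASELINE, reverted_agent))
```

An empty result means the agent still carries everything in the baseline; anything else is worth raising with the AV team before the next scheduled scan runs.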
The AV team must accept that Exchange requires certain file system exclusions to operate in a manner supported by Microsoft. There is a tendency for AV teams to perceive a security risk in the fact that MDB01.edb is never scanned by file system AV. Their concern that NaughtyFile.edb will be stored on the Exchange server needs to be tempered with:
The above are only a few points in a typical discussion on this topic. Please engage with a security consultant to fully discuss such issues, as each enterprise will have different business requirements which translate into the underlying technical configuration. Some customers track these activities through a security sign off or waiver process.
Finally, do not assume that since a previous version of Exchange ran in a given environment, the AV conversation can be skipped! Take the time to ensure that all teams are on the same page, and that the correct exclusions are applied. Exchange 2010 has different exclusions compared to Exchange 2003! Additionally, there will likely have been staff changes in the years since the older AV policies were defined, so have this critical conversation to prevent a critical situation, aka a CritSit!
If you would like to have Microsoft Premier Field Engineering (PFE) visit your company and assist with the topic(s) presented in this blog post, then please contact your Microsoft Premier Technical Account Manager (TAM) for more information on scheduling and our varied offerings!
If you are not currently benefiting from Microsoft Premier support and you’d like more information about Premier, please email the appropriate contact below, and tell them how you got introduced!
For all other areas please use the US contact point.