In a previous post we saw the Microsoft requirements for the exclusions that must be added to file system AV on Exchange servers. In a recent CritSit, basically an uber urgent support request where the customer is down or as good as down, I also got to examine some of the other causes for file system AV not being correctly configured for Exchange.
In the aforementioned post the majority of the issues were caused by the lack of exclusions on a scheduled scan task. The issue below was not related to that, but to how different processes are identified and which exclusions get applied to each of them.
Please note that this post is not intended to slight the AV vendor’s product in any way whatsoever. The product was performing as designed; the issue was how the customer’s AV team had configured it. The underlying intent for this and the other post is to raise field awareness of the types of issues that we see, and to facilitate better and more focussed discussions with the various AV and security teams that we work with on a daily basis.
The file system AV product in question has the option to categorise processes into different risk levels. By default this feature is not enabled, and the customer must explicitly enable it. The different process levels that you may see are Default, Low Risk and High Risk. The below is a brief description:
The key concept to note is that the level a process is defined at dictates which set of exclusions applies to it. For example, a process like Trojan.exe can be defined at the high risk level. This means that the exclusions applied to whatever Trojan.exe touches will be the exclusions defined at the high risk level. Typically there will be minimal exclusions at the high risk level by default.
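To make the lookup concrete, here is a minimal sketch of the idea in Python. It is a toy model only: the level names, paths, dictionary layout and function names are all assumptions made for illustration and do not reflect any vendor's actual configuration schema. It also assumes that a process which has not been explicitly assigned a level falls back to the default level.

```python
# Hypothetical model of per-risk-level exclusions. Names and paths are
# illustrative only, not a real AV product's configuration format.
EXCLUSIONS_BY_LEVEL = {
    "default": {r"C:\AppData", r"D:\Databases"},
    "low": set(),
    "high": set(),  # typically minimal exclusions at the high risk level
}

# Processes explicitly assigned to a risk level (keys stored in lower case).
PROCESS_LEVELS = {"trojan.exe": "high"}


def level_for(process_name):
    """Return the risk level of a process (assumed to fall back to 'default')."""
    return PROCESS_LEVELS.get(process_name.lower(), "default")


def exclusions_for(process_name):
    """Return only the exclusions defined at the process's own level."""
    return EXCLUSIONS_BY_LEVEL[level_for(process_name)]


def is_scanned(process_name, path):
    """True if a file touched by the process is NOT covered by an exclusion."""
    path = path.lower()
    for excluded in exclusions_for(process_name):
        root = excluded.lower()
        # Match the excluded folder itself or anything beneath it.
        if path == root or path.startswith(root + "\\"):
            return False
    return True


print(is_scanned("Trojan.exe", r"D:\Databases\db1.edb"))
print(is_scanned("notepad.exe", r"D:\Databases\db1.edb"))
```

In this model the same file is scanned when Trojan.exe touches it and skipped when an unassigned process touches it, purely because of which level's exclusion list is consulted.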
What happened to cause the issue to get me onsite in a hurry?
(Subject should read Pete Tong MBE, and refers to cockney rhyming slang “It’s gone wrong”.)
The customer’s Exchange team correctly identified that file system AV exclusions were required as part of the design. The required exclusions were passed to the customer’s AV team. Consider this the WHAT of this story. The exclusions are WHAT is required. HOW they get implemented varies depending upon the file system AV product the customer has implemented. AV products each have their own best practices and implementation requirements; for details on this you must consult with your AV team and their vendor. Microsoft cannot provide guidance on HOW a 3rd party vendor’s product should be configured to achieve the required results.
In this case, the customer started off by defining all of the required exclusions in the default process section. As noted above this will apply to all processes on the system uniformly. What happened next was a bit baffling. For some reason that was not well understood, they then enabled the low risk process section (and by extension this also enabled high risk). All of the Exchange processes were then added to the low risk section. Job done, no?
<Borat> Not so much </Borat>
Since the Exchange processes were now defined as low risk processes, they picked up the exclusions that were defined at the low risk level. In the paragraph above note that there was no mention of the exclusions being copied over from the default process section, and that was the crux of the issue. The Exchange content was now being scanned by file system AV since it was not excluded at the same level as the defined process. In this case every read and write to the database was intercepted by file system AV. The performance on the system was terrible, CPU consumption was through the roof, and since the business was so dissatisfied with Exchange performance I won a free trip to go and fix it...
Again, the AV product was working as designed. Absolutely no issues were identified with it apart from the configuration the customer had applied. After I noted that not all of the required exclusions were present, I requested that the customer’s AV team, the AV vendor and the Exchange team get on a conference call to thrash this out. I have to applaud the level of support we got from the AV support person on the call. She was fantastic! In the space of 60 minutes she clearly and precisely identified the configuration issues, stated what needed to be corrected and then provided multiple other items the customer should address.
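The gap described above is the kind of thing a simple audit can catch. The following is a rough sketch in Python, assuming you can export the AV configuration into plain dictionaries. The function name, the data layout and the folder paths are hypothetical, and the two process names are used purely as examples. The comparison is an exact string match, so a real check would need to normalise case and path formats.

```python
def find_exclusion_gaps(required, process_levels, exclusions_by_level):
    """Map each process to the required exclusions missing at its own level.

    required            -- set of paths that must be excluded
    process_levels      -- {process name: risk level it is assigned to}
    exclusions_by_level -- {risk level: set of excluded paths}
    """
    gaps = {}
    for process, level in process_levels.items():
        # Only the exclusions defined at the process's own level count.
        missing = set(required) - set(exclusions_by_level.get(level, ()))
        if missing:
            gaps[process] = sorted(missing)
    return gaps


# Hypothetical export that mirrors the misconfiguration in this post:
# exclusions exist at the default level, but the processes sit at low risk.
required = {r"E:\ExchangeDB", r"F:\ExchangeLogs"}
exclusions_by_level = {"default": set(required), "low": set(), "high": set()}
process_levels = {"store.exe": "low", "EdgeTransport.exe": "low"}

gaps = find_exclusion_gaps(required, process_levels, exclusions_by_level)
for process, missing in gaps.items():
    print(process, "is missing:", ", ".join(missing))

# Copying the required exclusions to the low risk level closes the gap.
exclusions_by_level["low"] = set(required)
gaps_after_fix = find_exclusion_gaps(required, process_levels, exclusions_by_level)
```

Before the fix both processes are reported with every required path missing. After the exclusions are copied to the low risk level, nothing is reported.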
What can we take away from this?
Please also refer to the previous post for the other learning items presented there.
If you would like to have Microsoft Premier Field Engineering (PFE) visit your company and assist with the topic(s) presented in this blog post, then please contact your Microsoft Premier Technical Account Manager (TAM) for more information on scheduling and our varied offerings!
If you are not currently benefiting from Microsoft Premier support and you’d like more information about Premier, please email the appropriate contact below, and tell them how you got introduced!
For all other areas please use the US contact point.
file level antivirus continues to be a massive problem, i reckon it's caused 10 times more damage than it has prevented on the exchange systems i've been involved with. not just because it's been incorrectly configured, either. a well known and reputable manufacturer managed to release an update last year that wiped custom exclusions, which was nice. while the exchange exclusions are largely well documented, the OS and other applications are less so... there is a nice paper here (https://skydrive.live.com/view.aspx?resid=CF5623142A67FE0F!1321&app=Word&authkey=!APa6t2AxsVPfiic) which i wish MS would publish officially, as it seems pretty comprehensive...
Sorry Rhoderick - the link in the comment above was to the "Windows Antivirus Exclusions Recommendations" Word doc written by Brian Helmick (MSPFE) last year, but it seems it's been removed... oh well.
No worries Nick! I've seen the same thing where the agent loses its connection to its directory and goes native - i.e. reverts back to its factory default settings, which means scan everything...
Thanks for this Rhoderick! I went back and revisited our exceptions after reading this, and realized two new databases were brought online in the past few months and the AV team were not aware of (so not exempting) the new LUNs assigned to the mount points.