Testing Anti-Virus Application Interoperability with DFS Replication

Thanks to Shobana Balakrishnan, Richard Chinn, and Brian Collins for contributing to this blog.

Overview
Anti-virus applications have caused interoperability problems with file replication in the past, notably with the File Replication Service (NTFRS).  In particular, poorly behaved anti-virus applications can trigger excessive replication when their scanning activity is interpreted as file changes that need replication.  DFS Replication (DFSR) relies on the same file system facilities as NTFRS for detecting file changes, so it is subject to similar problems.  DFSR’s new design may make it more robust and tolerant of such applications at the cost of additional processing, but full interoperability must still be tested.

The tests should cover the following areas.

  • Triggering excessive replication
  • Interference with replication
  • Local and remote access to replicated folders
     

The following features of the anti-virus application should be covered.

  • Real-time scanning
  • On-demand scanning
  • Cleaning, deletion, and quarantine
     

Note that this blog only provides recommendations and guidelines for testing. Running all of these tests will not guarantee a given anti-virus program’s interoperability with DFSR.

Test Environment
Set up two replication partners, one with the anti-virus product of interest and one without any anti-virus software.  Both partners should have a scratch volume that is only used for replicated folders.  This will minimize noise in the tests.

Tests

Excessive Replication
Excessive replication occurs when programs make no effective changes to files but still cause change records to be written to the USN journal.  DFSR sees these changes and is able to suppress some of them because it recognizes that the file hashes are unchanged.  This is in contrast to NTFRS, which would replicate such changes.  Still, in the DFSR case, it is best not to have spurious changes introduced into the system, as they increase the load on the server.

To monitor USN activity, use PerfMon to show the USN Journal Records Accepted performance counter under the DFS Replication Service Volumes object.  If DFSR accepts a USN journal record, then this is a file change that is eligible for replication.  If you like, you may also add USN Journal Records Read; this reflects the USN journal records that DFSR has read on the given volume.  Changes outside the replicated folder are not accepted.  It is easiest to see the numbers if PerfMon is configured to display a report view.
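
If you prefer a number you can check in a script rather than reading the report view, the counter can also be sampled to a CSV file with typeperf.exe and compared before and after a test step. The following is a minimal sketch, not a definitive tool: it assumes the counter value is cumulative, that the CSV has a timestamp column followed by one counter column, and that the counter path in the comment (built from the object and counter names above) matches what your server exposes. The function name is made up for illustration.

```python
import csv
import io

# Assumed way to collect samples (verify the counter path on your server):
#   typeperf "\DFS Replication Service Volumes(*)\USN Journal Records Accepted" -si 5 -sc 12 -f CSV -o usn.csv


def accepted_delta(typeperf_csv: str) -> float:
    """Return last sample minus first sample for the first counter column."""
    samples = []
    for row in csv.reader(io.StringIO(typeperf_csv)):
        if len(row) < 2:
            continue  # blank line or single-field status line
        try:
            samples.append(float(row[1]))
        except ValueError:
            continue  # header row, empty sample, or status text
    if len(samples) < 2:
        raise ValueError("need at least two numeric samples")
    return samples[-1] - samples[0]
```

A delta of zero across an on-demand scan means no USN journal records were accepted during the sampled window. If the wildcard instance produces several counter columns, only the first one is read here.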

An alternate way to monitor USN activity is to watch the DFSR debug logs with a utility such as tail.exe.  Running tail -f on the most recent debug log shows the log entries in real time as DFSR writes them.  Log messages containing USNC are those related to the USN consumer.  Be aware that the logs wrap to the next log file, at which point you need to follow the new file.  To enable debug logs, run the following WMI command on the members.

wmic /namespace:\\root\microsoftdfs path DfsrMachineConfig set EnableDebugLog=true,DebugLogSeverity=5,MaxDebugLogFiles=10000
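
If tail.exe is not available, the same filtering can be approximated with a short script. This is a rough sketch under stated assumptions: the log path is supplied by you (the debug log location is not given here), the log is plain text, and the helper names are hypothetical. Like tail -f, it follows a single file and does not handle the log wrapping to the next file.

```python
import os
import time
from typing import Iterable, Iterator


def usn_consumer_lines(lines: Iterable[str], marker: str = "USNC") -> Iterator[str]:
    """Yield only the lines that mention the USN consumer marker."""
    for line in lines:
        if marker in line:
            yield line.rstrip("\r\n")


def follow(path: str, poll_seconds: float = 1.0) -> Iterator[str]:
    """Yield lines appended to a file, similar to tail -f (runs until interrupted)."""
    with open(path, "r", errors="replace") as log:
        log.seek(0, os.SEEK_END)  # start at the current end of the file
        while True:
            line = log.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)  # nothing new yet; wait and retry


# Example use (the path is an assumption; point it at your most recent debug log):
#   for entry in usn_consumer_lines(follow(r"C:\path\to\latest-dfsr-debug.log")):
#       print(entry)
```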

Here is the test procedure.

1. Set up replication between M1 and M2 for a replicated folder.

2. Configure the anti-virus product on M1 for real-time scanning of the replicated folder.

3. Set up USN journal monitoring using the method of your choice.

4. Populate the folder with various files of interest including executables and documents.

5. Ensure DFSR has synced the two folders and backlogs are zero (e.g. run a health report from DfsMgmt.msc or use DfsrDiag.exe with the backlog option).

6. Perform an on-demand scan on M1.

7. Verify no USN journal records are accepted on M1.

8. Try accessing files / running programs on M1.

9. Verify no USN journal records are accepted on M1.

10. Make all the files on M1 read-only using attrib.exe or Explorer.  This will cause USN journal activity.  Let DFSR sync and verify the backlog drops to zero.

11. Perform an on-demand scan on M1.

12. Verify no USN journal records are accepted on M1.

13. Copy a file from another volume, preferably something excluded from anti-virus monitoring, into the replicated folder.

14. Verify there is only one USN record accepted by DFSR.

15. Move a file from the same volume, preferably from a folder that is excluded from anti-virus monitoring, into the replicated folder.

16. Verify there is only one USN record accepted by DFSR.
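
Steps 5 and 10 depend on confirming that the backlog is zero before moving on. One way to script that check is to run DfsrDiag.exe with the backlog option and read the count from its output. The sketch below is illustrative only: the switch names and the output wording it matches ("Backlog File Count:" and "No Backlog") are assumptions about what your version of DfsrDiag prints, so compare them against real output and adjust the patterns before relying on it. The function names are made up.

```python
import re
import subprocess

# Assumed output wording; confirm against your DfsrDiag version.
BACKLOG_COUNT = re.compile(r"Backlog File Count:\s*(\d+)")


def parse_backlog(output: str) -> int:
    """Extract the backlog size from DfsrDiag backlog output."""
    match = BACKLOG_COUNT.search(output)
    if match:
        return int(match.group(1))
    if "No Backlog" in output:
        return 0
    raise ValueError("unrecognized DfsrDiag output")


def backlog(group: str, folder: str, sending: str, receiving: str) -> int:
    """Run DfsrDiag for one direction and return the backlog count."""
    cmd = [
        "dfsrdiag", "backlog",
        f"/rgname:{group}",    # replication group name
        f"/rfname:{folder}",   # replicated folder name
        f"/smem:{sending}",    # sending member (assumed switch name)
        f"/rmem:{receiving}",  # receiving member (assumed switch name)
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_backlog(result.stdout)
```

Call it once per direction (M1 to M2 and M2 to M1), since the backlog is tracked separately each way.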

Interfering with Replication
Anti-virus programs have the ability to delete or move an infected file to a special quarantine area.  Depending on how the infected file is detected and how the deletion or move is done, this may cause DFSR to become permanently backlogged and repeatedly attempt to download and install the file into the replicated folder.

Anti-virus programs with good interoperability are able to delete, clean, or quarantine the file on one member, and have the changed file replicate normally.  Sometimes this will require setting an exception filter on the entire DfsrPrivate folder.

Testing with infected files is facilitated by using a virus test file called eicar.com from http://www.eicar.org.

1. Set up replication between M1 and M2 for a replicated folder.

2. Verify replication between M1 and M2.

3. Configure real-time scanning on M1.  Configure it to quarantine or delete.

4. Copy the infected file into the replicated folder on M2 and monitor M1, M2, and the backlogs between the two servers in both directions.  There are a few things that can happen, so here are some to watch out for.

  • File does not appear on M1 but persists on M2.  A backlog of 1 persists from M2 to M1.  This may indicate that the anti-virus program detected the infected file in the DfsrPrivate\Installing folder and deleted it before DFSR could install the file into the replicated folder.  This is an interoperability issue as the backlog will persist until the file is deleted on M2.
  • File does not appear on M1 but persists on M2.  Backlogs are zero between the two machines.  This is an interoperability issue as the file systems are not in sync but DFSR believes they are in sync.
  • File does not appear on M1 (or disappears quickly) and it is deleted on M2.  Backlogs are zero between M1 and M2.  This probably means the anti-virus program was able to remove the file from the replicated folder on M1 such that DFSR saw a file deletion and replicated the deletion to M2.  This is not an interoperability issue.
  • File replicates to M1 and also persists on M2.  The backlogs are zero between M1 and M2, and everything is in sync.  When the file is opened on M1, the file is deleted / quarantined, and this is replicated to M2 as a delete.  Backlogs are zero.  This is not an interoperability issue.

5. The above tests should be repeated using the clean functionality.
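
The four outcomes listed in step 4 can be written down as a small decision function, which is handy if you record results for several products or versions. This is only a sketch of the cases described above: the type and function names are hypothetical, and only the M2 to M1 backlog is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    on_m1: bool            # infected file present in the replicated folder on M1
    on_m2: bool            # infected file present in the replicated folder on M2
    backlog_m2_to_m1: int  # backlog from M2 to M1 once things settle


def classify(obs: Observation) -> str:
    """Map an observation to one of the outcomes described in step 4."""
    if not obs.on_m1 and obs.on_m2:
        if obs.backlog_m2_to_m1 > 0:
            # File was probably removed from DfsrPrivate\Installing before install.
            return "issue: backlog persists until the file is deleted on M2"
        return "issue: folders differ but DFSR believes they are in sync"
    if obs.backlog_m2_to_m1 == 0:
        if not obs.on_m1 and not obs.on_m2:
            return "ok: deletion on M1 replicated to M2"
        if obs.on_m1 and obs.on_m2:
            return "ok: in sync; open the file on M1 and confirm the delete replicates"
    return "unclassified: investigate"
```

Anything that falls through to the last branch is a combination the list above does not cover and is worth examining by hand.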

Local and Remote Access
There may be subtle differences when a file is accessed over a share versus locally on the server itself.  Since DFSR is used between file servers, users will most likely be accessing files over shares.  As a result, it is important to try the above tests over shares as well as locally.

Similarly, client-side anti-virus applications that monitor remote shares as well as applications that scan shares should also be tried if applicable to your scenario.

Recommendations
Problems will generally be minimized if infected files are never allowed into the replicated folders, especially when one member sees a file as infected but another member does not.  To minimize this, you should do the following.

  • Install anti-virus software on all or no replication members.
  • Install anti-virus software and ensure the files are clean before enabling replication.
  • Keep the anti-virus signatures in sync across all replication members.
  • Set an exclusion filter for the DfsrPrivate folder for all replicated folders on the member.
  • Routinely monitor backlog and investigate persistent backlogs.
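
For the last recommendation, it helps to decide in advance what counts as a persistent backlog. One possible rule, sketched below, flags a backlog when the most recent readings are all non-zero and the count is not shrinking. The threshold of three readings and the function name are arbitrary choices for illustration, not values taken from DFSR.

```python
from typing import Sequence


def is_persistent(samples: Sequence[int], min_samples: int = 3) -> bool:
    """Return True when the last min_samples readings are non-zero and not draining."""
    recent = list(samples[-min_samples:])
    if len(recent) < min_samples:
        return False  # not enough history to judge
    if any(count == 0 for count in recent):
        return False  # the backlog cleared at some point
    return recent[-1] >= recent[0]  # still as large as, or larger than, before
```

Feed it backlog counts collected at a regular interval, one list per direction between each pair of members.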
     

Also, anti-virus application behavior will vary from version to version and operating system to operating system.  It is important to perform the testing procedure for each new version that is considered for deployment.

Here are some tests you can use to check interoperability with additional products:

Backup

Tests

1. Create replicated folder, start DFSR, replicate some files

   a. Verify replication works

2. Run backup application

   a. Verify no errors in DFSR eventlog

   b. Verify replication still works after backup

   c. Verify no lingering backlogs

   d. Verify backup did not trigger unexpected replication

3. Delete some data, then restore backed-up files

   a. Verify no errors in DFSR eventlog

   b. Verify replication still works after restore

   c. Verify no lingering backlogs

   d. Verify restored files are replicated

Encryption, Quotas, Defrag, Monitoring

Test 1: Creating a replicated folder on machines already configured with vendor’s product

1. Install product on test server(s) and configure as needed so it will impact the to-be-created replicated folder

2. Create replicated folder with existing data on some or all members, start DFSR, replicate some files

   a. Verify all members sync up with each other

   b. Verify no errors in DFSR eventlog

   c. Verify no extra replication

   d. Verify no lingering backlogs

3. Play with the replicated data (create new files, modify files, delete files, etc.)

   a. Verify no errors in DFSR eventlog

   b. Verify the file modifications replicate

   c. Verify no extra replication

   d. Verify no lingering backlogs

4. If application is capable of running some sort of scan (like an AV scan), run it

   a. Verify no errors in DFSR eventlog

   b. Verify no unexpected extra replication

 

Test 2: Configure vendor’s product on machines already hosting replicated folder

1. Create replicated folder with existing data on some or all members, start DFSR, replicate some files

   a. Verify replication works

2. Install product on test server(s) and configure as needed so it will impact the replicated folder

   a. Verify no errors in the DFSR eventlog

   b. Verify no unexpected extra replication

   c. Verify replication still works after product is installed & active

   d. Verify no lingering backlogs

 
