Stop 0x50 on Windows 2008 R2 Failover Cluster


Greetings Cluster fans,

 

John Marlin back for another go at it.  I wanted to write something about what we have been seeing involving the use of Cluster Shared Volumes (CSV) and File Server resources on the same Cluster.

 

We have seen multiple instances of a Stop 0x00000050 on Cluster Servers that point to CSVFILTER.SYS as the culprit.

 

CSVFILTER.SYS is a filter driver used by Failover Clustering to filter metadata I/O writes to a Cluster Shared Volume.  If a metadata write comes from the node that “owns” the volume (the coordinator node), CSVFILTER.SYS allows the direct I/O write.  If the node is not the coordinator, CSVFILTER.SYS redirects the I/O over the network to the node that is the coordinator.
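
To see which node is currently the coordinator for each CSV, you can query the cluster from PowerShell. This is only a minimal sketch, assuming the FailoverClusters module that ships with Windows Server 2008 R2 is available on the node and that CSV is enabled; it reads state and changes nothing.

```powershell
Import-Module FailoverClusters

# OwnerNode is the coordinator node for each Cluster Shared Volume.
Get-ClusterSharedVolume | Format-Table Name, OwnerNode, State -AutoSize

# RedirectedAccess reports whether a volume is in redirected access mode,
# where I/O is routed over the network through the coordinator.
Get-ClusterSharedVolume |
    Select-Object -ExpandProperty SharedVolumeInfo |
    Format-Table FriendlyVolumeName, RedirectedAccess, MaintenanceMode -AutoSize
```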

 

Since it is a filter driver, it will attach itself to all drives in the Cluster.  The stop error occurs because CSVFILTER sees SMB I/O that it does not want to see.

 

There are three different scenarios where you can get a Stop 0x00000050 error.

 

1.       A Cluster that has File Server resources only (no Hyper-V VMs) with Cluster Shared Volumes enabled.

2.       File Server resources or shares that are located on the Cluster Shared Volumes.

3.       A Cluster that has both Hyper-V VMs on Cluster Shared Volumes and File Server resources on non-CSV drives.
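
If you are not sure which of these scenarios applies to your Cluster, a quick inventory can help. The sketch below is an illustration rather than an official diagnostic, and it assumes the FailoverClusters module is available and that the resource type names are the default "File Server" and "Virtual Machine".

```powershell
Import-Module FailoverClusters

# Is CSV enabled on this Cluster at all?
(Get-Cluster).EnableSharedVolumes

# File Server resources and the groups that own them.
Get-ClusterResource |
    Where-Object { $_.ResourceType.Name -eq "File Server" } |
    Format-Table Name, OwnerGroup, State -AutoSize

# Hyper-V virtual machine resources, if any.
Get-ClusterResource |
    Where-Object { $_.ResourceType.Name -eq "Virtual Machine" } |
    Format-Table Name, OwnerGroup, State -AutoSize
```

File Server resources with no virtual machines and CSV enabled suggests Scenario 1. Both types present means you should check which disks the File Server groups use to tell Scenario 2 from Scenario 3.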

 

Scenario 1

==========

A Cluster that has File Server resources only (no Hyper-V VMs) with Cluster Shared Volumes enabled. 

 

When you are in this scenario, there is no need for Cluster Shared Volumes to be enabled.  To resolve this, you should disable CSV so that CSVFILTER.SYS is no longer in play. 

 

To do this, run PowerShell from the Administrative Tools with this command:

 

get-cluster | %{$_.EnableSharedVolumes="Disabled"}

 

This will disable Cluster Shared Volumes and you will no longer receive the stop errors.  In this type of configuration, there is no need for the enabling of Cluster Shared Volumes as they are not being used anyway.
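
If you want to confirm the change, the same property can be read back before and after. This is a small sketch assuming the FailoverClusters module is loaded; it is worth checking that no disks are still listed by Get-ClusterSharedVolume before you disable the feature.

```powershell
Import-Module FailoverClusters

$cluster = Get-Cluster

# Show the setting before the change.
"Before: $($cluster.EnableSharedVolumes)"

# Same change as the one-liner above, written without the pipeline.
$cluster.EnableSharedVolumes = "Disabled"

# Read the property back from the Cluster to confirm it took effect.
"After:  $((Get-Cluster).EnableSharedVolumes)"
```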

 

Scenario 2

==========

File Server resources or shares that are located on the Cluster Shared Volumes.

 

When you enable Cluster Shared Volumes, you will receive this dialog box:

[image: Cluster Shared Volumes warning dialog box]

 

As it states, you do not want any kind of user or application data on these volumes.  The key point in the dialog box is “may result in unpredictable behavior, including data corruption or data loss” and we all know that data integrity needs to be there.

 

So if you are keeping user or application data on a CSV drive, get it off or bad things can happen. This is not a valid or supported configuration.

 

Scenario 3

==========

A Cluster that has both Hyper-V VMs on Cluster Shared Volumes and File Server resources on non-CSV drives.

 

In this configuration, you have all the highly available virtual machines on CSV drives and separate groups for File Servers on non-CSV drives.  As mentioned at the beginning of this, CSVFILTER.SYS is attaching itself to all drives, including these non-CSV drives.  This is where you would need the workaround and there are two options to consider.

 

The first is to create a virtual machine that hosts the File Server resource and shares.  Add this VM into the Cluster on a drive that you can convert to a Cluster Shared Volume.  This one would take some work and a little bit of time to do.

 

The second option is to detach CSVFILTER.SYS from the non-CSV drives.  This one is the easiest and quickest to do, but it is a little kludgy. For example, say your non-CSV drive was Z:.  To detach it, the command would be:

 

Fltmc detach csvfilter z:

 

This would remove CSVFILTER.SYS as a filter on the drive.  The caveat to this is that if you restart the Cluster Service, reboot the machine, or simply move the group to another node, CSVFILTER.SYS may attach itself again. 
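
To check whether the filter is attached before and after the detach, `fltmc` can list filter instances. The sketch below assumes an elevated prompt and uses Z: only because it is the example drive in this post.

```powershell
# List every volume that csvfilter is currently attached to.
fltmc instances -f csvfilter

# Detach csvfilter from the non-CSV drive.
fltmc detach csvfilter Z:

# List the filters still attached to Z:; csvfilter should no longer appear.
fltmc instances -v Z:
```

Running the first command again after a failover or Cluster Service restart shows whether the filter has reattached.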

 

To get around this, create a batch file containing the above command and place it on the Z: drive.  Then create a Generic Application resource that runs this batch file.  Make the File Server resources depend on this Generic Application resource, and the Generic Application resource depend on the Drive Z: resource.  This way, no matter what happens, the disk comes online, CSVFILTER is told to detach, and the File Server resources do what they do.
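
The resource setup can also be scripted. What follows is a rough sketch under stated assumptions, not a tested procedure: the group, disk, and File Server resource names and the batch file name are all hypothetical placeholders, and the trailing `pause` is an assumption intended to keep the process alive so the Generic Application resource stays online.

```powershell
Import-Module FailoverClusters

# Hypothetical names: replace with the ones from your own Cluster.
$group   = "FileServerGroup"        # group that owns the Z: disk
$disk    = "Cluster Disk 2"         # Physical Disk resource for Z:
$fileSrv = "FileServer-(FS1)"       # File Server resource in that group
$appRes  = "Detach CSVFILTER"       # new Generic Application resource
$script  = "Z:\DetachCsvFilter.cmd" # batch file stored on the Z: drive

# Write the batch file: the detach command, then pause (assumption, see above).
Set-Content -Path $script -Value "fltmc detach csvfilter Z:", "pause" -Encoding ASCII

# Create the Generic Application resource in the same group as the disk.
Add-ClusterResource -Name $appRes -ResourceType "Generic Application" -Group $group

# Point the resource at the batch file.
Get-ClusterResource $appRes |
    Set-ClusterParameter -Multiple @{ CommandLine = "cmd.exe /c $script"; CurrentDirectory = "Z:\" }

# Disk first, then the detach script, then the File Server.
Add-ClusterResourceDependency -Resource $appRes  -Provider $disk
Add-ClusterResourceDependency -Resource $fileSrv -Provider $appRes

Start-ClusterResource $appRes
```

The two dependency lines are what enforce the ordering: the batch file cannot run until Z: is online, and the File Server resources wait for the batch file's resource.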

 

No more stop errors. Is it kludgy?  Yes.  Does it do the job?  Yes.

 

Microsoft is looking into this further.  There are no guarantees that a fix will be created at this point.  For now, we must utilize the workarounds mentioned above.

 

Happy Clustering !!

 

John Marlin

Senior Support Escalation Engineer

Microsoft Enterprise Platforms Support

Comments
  • Nice workaround!  So, in the end, does MS support the scenario of 'Hyper-V and File Server on the same cluster'? I know there are some folks on the scene that do not recommend this. Personally, I've seen some small cluster deployments in which this was used. No issues so far.

    Regards,

    Fred Larracuente

  • Can you post a how-to session for Scenario 3? I've tried this, but I'm not able to bring the generic script online...

  • Hi John, I too cannot bring the generic application resource online...

  • Solution

    1. You must create this resource in the same group as these disk resources and your application should have a dependency

    upon these disks. If each disk is in a different group, you should create different scripts for each group so that it kicks

    off when the disk comes online.

    2. You must add a "pause" at the end of your script so the script does not close out immediately after completing. If the

    script closes, then the generic application resource will fail as it checks to ensure that the script is still running and

    fails when it is not running.

    3. You must create a generic application (like in this article mentioned), not a generic script

    4. This resource must depend on the drive resource where the script is running.

    5. I do not recommend that you make the file server dependent on the script! If the script does not run you may want to bring

    up the file server regardless and rely on point 6.

    5a. Be sure to uncheck "If a restart is unsuccessful, fail over all resources..." in the policies of the resource

    6. You must create a task scheduler job as well since I have experienced that without any failover the filter attaches itself

    again. Run it every hour on all nodes.  Copy your original script and remove the "pause". I had to insert an "echo", otherwise I got a 0x1

    Script for the generic application

    Fltmc detach csvfilter Z:

    Pause

    Script for the task scheduler

    Fltmc detach csvfilter Z:

    echo "test" >z:\test.txt

  • Sadly, the BSOD did occur again, regardless of this workaround :(

  • Sorry for the last post...it does work, was just a misunderstanding!

  • Did Microsoft release a fix/patch for this issue ??