Microsoft Enterprise Platforms Support: Windows Server Core Team
My name is Sean Dwyer and I am a Support Escalation Engineer with the Microsoft CORE team.
I’d like to share a quick tip for Windows Server Cluster administrators.
There may come a time, for whatever reason, that a Cluster managed volume is flagged as dirty and you will see an event ID message indicating that CHKDSK needs to run against the volume. Just for a little background, the NTFS File System is monitoring the drive/partition at all times. If it detects corruption, it will flip a bit on the volume and mark it as dirty. During the online process of a Clustered drive, Cluster will check for the existence of this bit and spawn CHKDSK if it sees it. You can check, at any time, to see if a volume is dirty with the CHKNTFS command.
C:\> chkntfs z:
The type of the file system is NTFS.
Z: is not dirty.
Or, if the dirty bit is set, the last line reads:
Z: is dirty.
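If you want to check volumes from a script, the CHKNTFS output shown above can be parsed. The following is only a minimal Python sketch: the function name parse_chkntfs is hypothetical, and it assumes the English output format quoted in this post (a localized or newer OS may word the lines differently).

```python
import re

# Matches lines such as "Z: is dirty." or "Z: is not dirty."
# (assumption: English-language CHKNTFS output as quoted in this post).
_DIRTY_LINE = re.compile(r"^\s*([A-Za-z]):\s+is\s+(not\s+)?dirty\.?\s*$")


def parse_chkntfs(output: str) -> dict:
    """Return {drive letter: True if dirty, False if not dirty}."""
    states = {}
    for line in output.splitlines():
        match = _DIRTY_LINE.match(line)
        if match:
            drive = match.group(1).upper()
            # group(2) holds "not " when the volume is clean.
            states[drive] = match.group(2) is None
    return states


clean = "The type of the file system is NTFS.\nZ: is not dirty.\n"
dirty = "The type of the file system is NTFS.\nZ: is dirty.\n"
print(parse_chkntfs(clean))  # {'Z': False}
print(parse_chkntfs(dirty))  # {'Z': True}
```

On a real server, the text would come from running chkntfs yourself (for example, by capturing its output to a file) and passing it to the function.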
In a best case scenario, you can take the volume out of production, run CHKDSK on the volume if needed (refer to: http://technet.microsoft.com/en-us/library/cc772587.aspx), and then put the volume back into production.
In most situations though, the volume that needs attention is a heavily utilized production volume and it will be extremely disruptive to have the volume offline for any length of time.
For example, a recent case I was involved with had a 14TB* (see note 1 below) volume that was being flagged for CHKDSK to run on it about once a month. The volume had about 9TB of data on it. Apart from the concern of why the volume was continually being flagged as corrupt, the length of time that CHKDSK took to run on the volume was extremely painful for the customer’s business. When it ran initially, it took roughly 80 hours to complete a run on the volume.
It may be necessary to temporarily configure a problem volume to block CHKDSK from running against it while troubleshooting continues to determine why the volume is being flagged for CHKDSK to run.
I stress the word temporary here.
Turning off the health monitoring tool for the file system as a permanent solution could only lead to more downtime in the future. You may end up on the phone with one of the File Systems experts on my team, such as Robert Mitchell.
Ok – so let’s talk specifics about temporarily blocking CHKDSK from doing work on a Cluster volume.
Say we have determined that we need to suspend CHKDSK from running on a problem volume. For you old school Cluster admins, the first command parameter that probably jumps to mind is SKIPCHKDSK.
This works just fine for Windows 2003 Server Clusters, but will NOT work for Windows 2008 and 2008R2 Failover Clusters.
If SKIPCHKDSK is set for a Clustered volume on Windows 2008 or 2008R2, it will be ignored when the disk is next brought online and CHKDSK will be run. In a situation like the 14TB volume described above, the volume will remain unavailable for use until CHKDSK finishes* (See note 2 below).
The correct way to configure a volume to block CHKDSK from running on it is to use the DiskRunChkdsk parameter. Keep in mind that the two parameters we are discussing only apply to the Cluster environment. If the machine is restarted, the OS may still prompt for CHKDSK to run on the affected volumes.
For information on how to configure the OS to ignore the dirty bit, refer to:
How to Cancel CHKDSK After It Has Been Scheduled
Before walking through an example of setting the DiskRunChkdsk parameter, I first must explain what the values mean. In Windows 2003 Server Clusters, the SKIPCHKDSK parameter was either 0x0 (disabled) or 0x1 (enabled). In Windows 2008 and 2008R2 Failover Clusters, there are more settings, and what each one checks varies.
DiskRunChkDsk <0x0>: This is the default setting for all Failover Clusters. This policy will check the volume to see if the dirty bit is set and it will perform a Normal check of the file system. The Normal check is similar to running the DIR command at the root. If the dirty bit is set or if the Normal check returns a STATUS_FILE_CORRUPT_ERROR or STATUS_DISK_CORRUPT_ERROR, CHKDSK will be started in Verbose mode (Chkdsk /x /f).
DiskRunChkDsk <0x1>: This setting will check the volume to see if the dirty bit is set and it will perform a Verbose check. A Verbose check will scan the volume by traversing from the volume root and checking all the files of the file system. If the dirty bit is set or if the Verbose check returns a STATUS_FILE_CORRUPT_ERROR, CHKDSK will be started in Verbose mode (Chkdsk /x /f).
DiskRunChkDsk <0x2>: This setting will run CHKDSK in Verbose mode (Chkdsk /x /f) on the volume every time it is mounted.
DiskRunChkDsk <0x3>: This setting will check the volume to see if the dirty bit is set and it will perform a Normal check of the file system. The Normal check is similar to running the DIR command at the root. If the dirty bit is set or if the Normal check returns a STATUS_DISK_CORRUPT_ERROR, CHKDSK will be started in Verbose mode (Chkdsk /x /f), otherwise CHKDSK will be started in read only mode (Chkdsk without any switches).
DiskRunChkDsk <0x4>: This setting doesn’t perform any checks at all.
DiskRunChkDsk <0x5>: This setting will check the volume to see if the dirty bit is set and it will perform a Verbose check (scan the volume by traversing from the volume root and checking all the files) of the file system. If a problem is found, CHKDSK will not be started and the volume will not be brought online.
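To make the differences between the six values easier to compare, here is a small Python sketch that restates the descriptions above as a decision function. It is a simplification, not Cluster's actual logic: the function name online_action and the return strings are hypothetical, and a single problem_found flag stands in for "the dirty bit is set or the file system check returned one of the corruption statuses listed for that value".

```python
def online_action(policy: int, problem_found: bool) -> str:
    """Summarize what happens when the disk is brought online.

    policy        -- DiskRunChkdsk value, 0 through 5
    problem_found -- simplified flag: dirty bit set, or the Normal/Verbose
                     check reported corruption (assumption for illustration)
    """
    fix = "chkdsk /x /f"
    if policy in (0, 1):
        # 0x0 uses the Normal check, 0x1 the Verbose check; both repair
        # only when a problem is detected.
        return fix if problem_found else "bring online"
    if policy == 2:
        # Repair on every mount, regardless of volume state.
        return fix
    if policy == 3:
        # Repair if a problem is detected, otherwise run a read-only pass.
        return fix if problem_found else "chkdsk (read only)"
    if policy == 4:
        # No checks at all.
        return "bring online"
    if policy == 5:
        # Detect, but never repair; the disk stays offline on a problem.
        return "leave offline" if problem_found else "bring online"
    raise ValueError(f"unknown DiskRunChkdsk value: {policy}")


for value in range(6):
    print(hex(value), "->", online_action(value, problem_found=True))
```

Note that 0x4 is the only value where a detected problem never delays or blocks the online operation.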
So now that we know what the various settings do: to have CHKDSK never run during an online operation of the disk, we want to set DiskRunChkdsk to 0x4.
Here are the steps you can run through to accomplish this task.
Step 1: Determine the resource name as seen by Cluster
Step 2: Open either an Administrative command prompt or Windows PowerShell Modules and run the matching command below (the first is for the command prompt, the second for PowerShell):
C:\> cluster res "Cluster Disk 8" /priv DiskRunChkdsk=4
PS C:\> Get-ClusterResource "Cluster Disk 8" | Set-ClusterParameter DiskRunChkdsk 4
Note: For the setting to take effect, the disk must be taken offline and brought back online. Until then, the new value is simply stored and is not applied.
Step 4: Bring the disk offline, then online again.
Step 5: Verify the setting is applied
PS C:\> Get-ClusterResource "Cluster Disk 8" | Get-ClusterParameter DiskRunChkdsk
Object          Name           Value
------          ----           -----
Cluster Disk 8  DiskRunChkDsk  4
Step 6: Actively start troubleshooting what could cause the volume to end up flagged dirty and needing CHKDSK.
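If you need to repeat these steps for several disks, or want a written change plan before touching production, the command lines can be generated rather than typed. This is a hedged Python sketch: the function name chkdsk_policy_commands is hypothetical, the /priv form is the one shown in Step 2, and the /offline and /online switches are the cluster.exe resource options as I know them, so verify them with "cluster res /?" on your version before use. The sketch only prints the commands; it does not run anything.

```python
import subprocess


def chkdsk_policy_commands(resource, value):
    """Build cluster.exe argument lists to set DiskRunChkdsk and cycle the disk.

    resource -- disk resource name as seen by Cluster, e.g. "Cluster Disk 8"
    value    -- DiskRunChkdsk value, 0 through 5
    """
    if value not in range(6):
        raise ValueError("DiskRunChkdsk must be between 0 and 5")
    base = ["cluster", "res", resource]
    return [
        base + ["/priv", f"DiskRunChkdsk={value}"],  # store the new value
        base + ["/offline"],                          # take the disk offline
        base + ["/online"],                           # bring it back online
        base + ["/priv"],                             # list private properties to verify
    ]


# Print the plan using Windows-style quoting; nothing is executed here.
for args in chkdsk_policy_commands("Cluster Disk 8", 4):
    print(subprocess.list2cmdline(args))
```

Because the block is meant to be temporary, the same function called with a value of 0 produces the commands to return the disk to the default policy once troubleshooting is finished. Keep in mind that taking the disk offline also affects any resources that depend on it.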
Note 1: It’s not suggested to run with volumes this large. In my experience, once they exceed 2TB in size, they rapidly become an administrative liability, especially in a situation where CHKDSK has to run against the volume. We strongly suggest that mount points be used to carve up larger volumes like this into more administratively friendly chunks. CHKDSK runs against mount points just fine, too.
Note 2: While it’s not recommended to interrupt CHKDSK while it’s running, an admin is not locked into having to let CHKDSK finish once it starts. The process can be terminated if absolutely required. However, we cannot guarantee that the end result will be positive. If the process is interrupted during the “magic moment” when CHKDSK is making changes, the results may be worse than the initial reason for the volume being flagged as corrupt.
Additional reading material related to the components and tools mentioned in this post:
How to configure volume mount points on a server cluster in Windows Server 2008
The shared disk on Windows Server 2008 cluster fails to come online
FSUTIL utility; marking a volume dirty for testing
In summary: try to keep your production volumes’ size under control, be aware that command line switches may not persist through all versions of a product, and continue being successful with Windows Server 2008!
I hope this post has been helpful!
Support Escalation Engineer
Windows CORE Team
What do you suggest using instead of mount points when there is no logical place to divide up data into multiple volumes? We have many clients for which we need to load and process data, but each client's needs are variable, and often unknown at the outset. One client project could make up 90% of the data for all clients, but we don't know that when the project starts, and given that the data distribution for that project is also variable, there is no obvious place to segment the data.
One question I've had is whether it is supported to create a spanned volume consisting of multiple VHDs (on a CSV) in a clustered VM, but I've not been able to find anyplace that suggests that it is (or isn't) or what the implications might be if you do. That approach also sounds messy. Maybe the real answer doesn't come until ReFS? 2 TB is just not enough these days.
Very informative article. It gave me a very good insight.
2TB these days isn't all that much, you're right. Remember the olden days where a 100gb drive was considered huge? Ah, the good ole days =)
You're also right that MPs aren't always possible to get set up and running in an existing datacenter, or in situations where space consumption can't be quantified until it starts being used. It's always in the best interest of everyone involved to try to get some sort of scoping or usage metrics, but it's not always possible.
As for specific configurations, I can't really help in depth as I'm not aware of all the possible OEM solutions or configurations, but typically I see people run with volumes presented from a SAN to specific departments or customers, that can then be dynamically grown on the back end as more space is needed. This can still lead to a massive volume that will take a long time for chkdsk to run through, but it's at least better than the alternative.
Using mount points is a general suggestion to help keep customers from throwing out a single 2+TB volume for use just because they can.
Doing that can lead to much gnashing of teeth and lots of downtime when a major file system issue is hit and chkdsk needs to run its full course.
As for the spanned VHD question, I think I know what you're asking? :)
A VM with, say, 4 VHDs (1 OS, 3 data), where you have booted into the OS and created a Spanned set from those 3 data VHDs, will be completely fine to store on a CSV volume. Cluster has no clue what's *in* the VHDs, and doesn't care. The Spanned Volume and Dynamic Disk information is within the VHD files themselves and is only valid to the OS that bootstraps the file system in those VHDs.
Now, if I'm wrong, and that's not what you were asking, just clarify for me and I'll be glad to respond.
When following best practices for mountPoints:
•Try to use the root (host) volume exclusively for mount points. The root volume is the volume that hosts the mount points. This practice greatly reduces the time that is required to restore access to the mounted volumes if you have to run the Chkdsk.exe tool. This also reduces the time that is required to restore from backup on the host volume.
The question I have is: when a root volume is flagged dirty, does chkdsk run only on the root volume, or does it run on all of the mount points under the root volume?