When I first heard of ReFS, my first question was whether it was a direct replacement for NTFS. It didn’t take long before my customers began to ask me the same question. In learning about this new file system in Windows Server 2012, it became apparent fairly quickly that ReFS is not simply a newer NTFS; it is built differently and with different uses in mind. ReFS stands for Resilient File System. NTFS has its place, and so does ReFS. While ReFS may appear similar to NTFS, it does not contain all the underlying NTFS features, and it scales efficiently to handle data sets far larger than NTFS can.
Although ReFS inherited some of the NTFS code base initially, it is a different file system with different uses in mind. In fact, disk tools that work with the NTFS Master File Table (MFT) won’t be able to work with ReFS because ReFS has its own mechanism for keeping track of file metadata. ReFS is ideal for storing large amounts of data and can be leveraged for file shares. Applications that run locally on the server and rely on specific NTFS features may not work with ReFS. For example, Windows Deployment Services (WDS) explicitly requires NTFS because it relies on specific features in order to implement the RemoteInstall folder structure used for storing images. These are features that a conventional file server or data repository does not require. Many other applications may have no issues at all, because ReFS is compatible with many of the Win32 storage APIs.
CHKDSK isn’t applicable to ReFS. Yes…I did just indicate that there’s no need to run CHKDSK on a ReFS partition. Are you feeling that the tool you’ve wanted to avoid for so long is now something you might want to hold onto…just in case? A love-hate relationship perhaps? The counselor is in. It’s okay to have those feelings if you have them. The truth is that ReFS doesn’t need CHKDSK because repair functionality is built into the file system. Repair, if needed, occurs on-the-fly. Yep…there’s no need for extra tools to go fix corruption like with other file systems. And for what it’s worth, Windows Server 2012 contains improvements for CHKDSK on the NTFS side.
ReFS can use checksums to detect if data has changed since last written and is able to detect and recover from corruption quickly. In fact, when data is written to disk, it is written to a new location on disk rather than over the top of existing data. Once successfully written, the file system can free the space used by the old data stream. ReFS is able to recover from corruption within the file system rapidly without limiting availability of the volume. Further, ReFS may be used with clusters, Hyper-V, file shares, data archival, and many other uses.
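To make the write-to-a-new-location idea concrete, here is a toy sketch in Python. It is not how ReFS is implemented; the class name ToyCowStore, the slot numbering, and the use of CRC32 as the checksum are all assumptions made purely for illustration.

```python
import zlib


class ToyCowStore:
    """Toy block store (hypothetical): writes never overwrite live data in place."""

    def __init__(self):
        self.slots = {}      # slot id -> (data, checksum recorded at write time)
        self.current = {}    # file name -> slot id holding its live data
        self.next_slot = 0

    def write(self, name, data):
        # Put the new data in a fresh slot instead of over the top of the old one.
        slot = self.next_slot
        self.next_slot += 1
        self.slots[slot] = (data, zlib.crc32(data))
        old = self.current.get(name)
        # Only once the new copy is in place does the name point at it...
        self.current[name] = slot
        # ...and only then is the space used by the old data freed.
        if old is not None:
            del self.slots[old]
        return slot

    def read(self, name):
        data, checksum = self.slots[self.current[name]]
        # A mismatch means the stored bytes changed since they were last written.
        if zlib.crc32(data) != checksum:
            raise IOError(f"checksum mismatch for {name!r}")
        return data
```

Writing the same name twice lands in two different slots, and the first slot is released only after the second write has completed. If the stored bytes are altered behind the store's back, read raises an error instead of returning bad data.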
By default, ReFS uses conventional data streams, which behave identically to NTFS data streams. Don’t take that to mean file system metadata is handled conventionally as well: file system metadata is protected against corruption. If you have additional needs for protecting the data itself, you may enable Integrity Streams. When configured to do so, checksums are kept for written data and updates are done using copy-on-write. You may enable Integrity Streams on volumes, on particular folders, or even granularly on a per-file basis. When ReFS is coupled with redundancy through Storage Spaces, the default behavior is for Integrity Streams to be enabled for the entire volume. Storage Spaces and ReFS complement each other. On a mirrored Storage Space, ReFS automatically makes use of the duplicate copies of data: if corruption is encountered, ReFS can immediately use the redundant data within Storage Spaces to address the issue. Another example of how Storage Spaces complements ReFS is scrubbing: ReFS periodically reads file system data and looks for differences, which combats the bit flips that can occur when data sits in the same location for a long period of time.
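The mirror-plus-checksum repair described above can be pictured with another small Python sketch. Again, this is only a model under stated assumptions: the function name read_with_repair, the use of CRC32, and representing each mirror copy as a bytearray are hypothetical and say nothing about the real on-disk mechanics of ReFS or Storage Spaces.

```python
import zlib


def read_with_repair(copies, checksum):
    """Return good data from mirrored copies, rewriting any copy that fails the checksum.

    copies   -- list of bytearray objects standing in for the mirror copies
    checksum -- CRC32 recorded when the data was originally written
    """
    # Find the first copy whose contents still match the recorded checksum.
    good = next((bytes(c) for c in copies if zlib.crc32(bytes(c)) == checksum), None)
    if good is None:
        raise IOError("all copies are corrupt")
    # Overwrite every damaged copy with the known-good contents.
    for c in copies:
        if zlib.crc32(bytes(c)) != checksum:
            c[:] = good
    return good
```

A periodic scrub is then nothing more than calling a routine like this for every block, whether or not anyone has asked to read it, so that a flipped bit gets repaired from the healthy copy before the second copy has a chance to go bad as well.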
As a file system, ReFS is not only good for resiliency, it is great for maintaining extremely large amounts of data. With data integrity and recovery features built into the file system, there is no need to wait for CHKDSK to run to fix corruption. What if you needed to store 500 billion gigabytes in one place? But why stop there? I recently read that ReFS can handle up to 1 yottabyte (YB). That wasn’t in my geek vocabulary until now and I’m not sure I’m going to remember that name next week. To give you an idea of how large a number that is, 1 GB can be represented as 10^9 bytes. 1 YB is represented by 10^24 bytes. That is 1 quadrillion GB. That seriously messes with my head. Good luck purchasing that much available hard drive space for your home lab in the basement. Imagine trying to power all that storage without shorting out the power for the whole neighborhood. Can you imagine how long even a modern efficient version of CHKDSK might run on a volume that size? ReFS answers the needs of data integrity and efficiency and is mainly intended for very large volumes on file servers. It is a file system that goes beyond the capability of NTFS.
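If the exponents are hard to picture, the arithmetic is easy to check. The few lines of Python below assume decimal units (1 GB = 10^9 bytes, 1 YB = 10^24 bytes); the variable names are just for illustration.

```python
GB = 10**9    # bytes in one gigabyte (decimal units assumed)
YB = 10**24   # bytes in one yottabyte (decimal units assumed)

# How many gigabytes fit in one yottabyte?
gb_per_yb = YB // GB
print(f"{gb_per_yb:,} GB per YB")   # 1,000,000,000,000,000 (one quadrillion)

# The "500 billion gigabytes" from the text, expressed in bytes.
five_hundred_billion_gb = 500 * 10**9 * GB

# How many of those would it take to fill one yottabyte?
print(YB // five_hundred_billion_gb)   # 2000
```

In other words, even 500 billion gigabytes is only one two-thousandth of a yottabyte.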
NTFS remains the file system of choice for the operating system boot volume as well as any other general needs for data storage. CHKDSK remains the tool of choice for dealing with NTFS file system issues should they happen. CHKDSK in Windows Server 2012 contains improvements…so CHKDSK continues to evolve and get better. The average NTFS deployment currently is around 500GB. In many cases administrators were hesitant to go beyond that due to the time potentially required to run CHKDSK against a volume that size. The time required to run CHKDSK has not been predictable, because file system structure complexity differs from one volume to another and a variety of other factors come into play. The average mentioned above has increased over time because the efficiency of CHKDSK has improved with each new Windows Server release for quite some time now. As the following TechNet reference indicates, you can safely deploy multi-terabyte volumes based on the improvements in CHKDSK and the existing capabilities of the NTFS file system. Windows Server 2012 builds upon the self-healing capabilities of Windows Server 2008 R2 NTFS. NTFS fixes the corruption it can on-the-fly. For what can’t be addressed immediately, the required fix is worked out ahead of time, so that when you choose to mitigate the issue, the time required is truly minimal compared to the CHKDSK of years past.
Although ReFS is a different file system, there are similar features between ReFS and NTFS. The easiest way to compare those is to look at the feature list side by side. Consider the following feature sets:
Supports Case-sensitive filenames
Preserves Case of filenames
Supports Unicode in filenames
Preserves & Enforces ACL's
Supports file-based Compression
Supports Disk Quotas
Supports Sparse files
Supports Reparse Points
Supports Object Identifiers
Supports Encrypted File System
Supports Named Streams
Supports Hard Links
Supports Extended Attributes
Supports Open By FileID
Supports USN Journal
How did I pull this information? There are charts available on the net…and people that know me can testify that I like charts. Yet, it is easy to obtain the available features of each file system yourself from the live file systems that you already use. If you don’t have a volume available that uses ReFS, it is easy to create a VHD file, attach it, and format as ReFS to use for your own testing. Once you have NTFS and ReFS file systems at your fingertips, use FSUTIL to gather the file system supported features. Then it’s a matter of comparing each list of features. FSUTIL syntax is very simple:
FSUTIL fsinfo volumeinfo driveletter
This simple command lists the capabilities provided by the file system in use on the supplied drive letter. Simply run this command against each type of file system and compare.
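If you would rather not compare the two lists by eye, a small script can do it. The Python sketch below assumes you have saved the output of the command for each volume as text, and that capability lines begin with "Supports" or "Preserves" as they do in the feature list above. The function names and the shortened sample strings are hypothetical; the samples are not real FSUTIL output.

```python
def feature_lines(volumeinfo_text):
    """Collect capability lines from saved 'fsutil fsinfo volumeinfo' output.

    Assumption: capability lines start with 'Supports' or 'Preserves';
    everything else (volume name, serial number, and so on) is ignored.
    """
    features = set()
    for line in volumeinfo_text.splitlines():
        line = line.strip()
        if line.startswith(("Supports ", "Preserves ")):
            features.add(line)
    return features


def compare(ntfs_text, refs_text):
    """Split the capabilities into NTFS-only, ReFS-only, and shared."""
    ntfs = feature_lines(ntfs_text)
    refs = feature_lines(refs_text)
    return {
        "ntfs_only": sorted(ntfs - refs),
        "refs_only": sorted(refs - ntfs),
        "both": sorted(ntfs & refs),
    }


# Shortened, made-up samples purely to show the shape of the result.
sample_ntfs = "File System Name : NTFS\nSupports Sparse files\nSupports Disk Quotas\n"
sample_refs = "File System Name : ReFS\nSupports Sparse files\n"
result = compare(sample_ntfs, sample_refs)
print(result["ntfs_only"])   # ['Supports Disk Quotas']
```

To feed it real data, redirect the FSUTIL output for each drive to a text file from an elevated prompt and read those files in place of the sample strings.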
Here are a few screen shots of using Disk Management to create a VHD, initialize as GPT, format as ReFS, and using FSUTIL against the newly created ReFS volume.
For this example I used dynamic expansion. If I were really going to use this VHD for data I would not thin provision like in this example. I only want to inventory ReFS features for this example.
Right-click on the new disk. Choose Initialize. Choose GPT. Right-click on the disk and format as ReFS.
Notice that key capabilities missing from ReFS as compared to NTFS are EFS encryption, quotas, and compression. EFS is not an option, but BitLocker may be used for encrypting these volumes. Thus, when using ReFS, encryption doesn’t need to be part of the file system: BitLocker satisfies the need for encrypting data on the volume because it encrypts the contents of the entire disk. Quotas may be managed outside the file system rather than through the file system as with NTFS. Not supporting Hard Links is a key reason why you wouldn’t use ReFS for a system disk; files in the system32 folder are really hard linked back into the WinSxS folder structure. You might think that ReFS would have data deduplication built into the file system. It doesn’t, but that may not prohibit other components or third-party solutions from interfacing with ReFS through the API set provided.
For those that are script fanatics (you know who you are), I’ve written this section just for you. We all know that the need to script something comes up all the time. Perhaps you’re building re-creatable virtual environments for one reason or another…and now you need to include the creation of some ReFS volumes. Below are some valuable commands, including conventional command line and PowerShell examples.
While this can be done from the UI quite easily by choosing ReFS as the file system in the drop down from the Format dialog, this is also easily done from the command line. Full format example below:
Format /fs:ReFS J:
Format-Volume -DriveLetter J -FileSystem ReFS -Full
In fact, typical command line syntax and optional parameters apply. Therefore, if you want this to be a quick format, just append /q to the Format command line above (or leave -Full off the Format-Volume command). You also have the option to enable Integrity Streams for the volume. Note the following commands:
Format /fs:ReFS /q /i:enable J:
Format-Volume -DriveLetter J -FileSystem ReFS -SetIntegrityStreams $true
The preceding commands enable Integrity Streams on drive J: and perform a quick format. The /i option accepts enable or disable, which turns this feature on or off for the volume. If you enable this option, all files created on the volume will be configured with integrity. You may turn this off for individual files or folders; in the released product that is done with PowerShell cmdlets rather than the beta-era Integrity command. However, know that once a file is non-empty and contains data streams created with Integrity Streams, you cannot change its integrity status. As a workaround, you could copy the file to another partition, delete the original, and then copy it back without Integrity Streams.
During the beta for Windows Server 2012, there were blog posts on the net that mentioned a tool called INTEGRITY.EXE. In the released version of Windows Server 2012 this tool does not exist. This is not a bug or mistake. Development provided PowerShell cmdlets to address configuration options for ReFS instead of shipping yet another narrowly focused utility to keep track of. I plan to construct a future post on storage cmdlets that will include how to adjust ReFS behavior using PowerShell. Also, just because a file system is self-healing doesn’t mean that backups should be forgotten. If a large chunk of blue ice falls from the sky and crushes your massive ReFS file system in one loud thunk…you’re more than likely going to need to restore the data. While that’s my conservative side speaking, unexpected events do happen. It never hurts to be prepared for the unexpected or the unlikely in a datacenter.