Hi Folks –
Back in December, I wrote a blog article on Data Deduplication, which was first introduced in Windows Server 2012 and Windows Storage Server 2012. It’s been improved in Windows Server 2012 R2 and Windows Storage Server 2012 R2, and it is quickly becoming a very popular feature. In this post, I’ll share the perspective of Matthias Wollnik, a Senior Program Manager for Data Deduplication at Microsoft:
Q: Why did Microsoft develop Data Deduplication? What customer pain(s) did you set out to solve?
When we looked at what we could do to best serve our customers’ storage needs, we saw that:
Based on these factors, we saw an opportunity to deliver new customer value by building Data Deduplication into Windows Server—in a way that would both help companies save money on storage and reduce related storage management costs.
Q: Where can I get Data Deduplication and what does it cost?
Data Deduplication is built into Windows Server 2012 R2 Standard, Windows Server 2012 R2 Datacenter, and Windows Storage Server 2012 R2 Standard—as well as the 2012 (pre-R2) versions of those editions. It’s a configurable feature under the File and Storage Services role and can be managed via Server Manager or Windows PowerShell. And because it’s built-in and ready to use, there’s no additional cost beyond that of the operating system.
Q: Are there any system requirements for using Data Deduplication?
Unlike some other solutions, Data Deduplication does not require any additional hardware. System requirements and considerations include the following:
Because it works entirely at the file server level, the clients that connect to the server can be running any operating system. Data Deduplication doesn’t care if you’re using SMB3, SMB2.1, or NFS file protocols to access a share where the data is stored, or if it’s just local data that isn’t exposed for remote access.
Q: What are the recommended use cases for Data Deduplication?
Data Deduplication is recommended for, and delivers significant results on, home directory shares, group file and collaboration shares, software deployment shares, and VHD libraries. There are several things to consider when determining whether to use Data Deduplication:
Q: Microsoft says Data Deduplication can reduce required disk space by up to 90 percent. What kind of results can a company expect?
The amount of disk space you’ll save depends on the type of data being stored:
Q: How does Data Deduplication work?
Data Deduplication reduces the amount of physical disk space required to store a given amount of logical data. During the deduplication process,
When a deduplicated file is read, a filter in the read-path reassembles the file in a manner that is transparent to the calling application or user.
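To make the idea concrete, here is a minimal toy sketch in Python of chunk-level deduplication: each file is split into chunks, each unique chunk is stored once, and a read reassembles the file from its chunk list. It is an illustration only, not the Windows Server implementation. The `ChunkStore` class, the fixed 4-byte chunk size, and the use of SHA-256 as the chunk identifier are all assumptions chosen to keep the example short.

```python
import hashlib


class ChunkStore:
    """Toy chunk store: each unique chunk is kept once, keyed by its SHA-256 digest.

    Hypothetical illustration; names and chunk size are assumptions.
    """

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # tiny on purpose so the example is easy to follow
        self.chunks = {}              # digest -> chunk bytes (stored once)
        self.files = {}               # file name -> ordered list of chunk digests

    def write(self, name, data):
        """Split data into fixed-size chunks and store only chunks not seen before."""
        digests = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if the chunk already exists
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        """Reassemble the file from its chunk list; the caller sees the original bytes."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def physical_size(self):
        """Bytes actually stored, counting each unique chunk once."""
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore(chunk_size=4)
store.write("a.txt", b"AAAABBBBCCCC")
store.write("b.txt", b"AAAABBBBDDDD")

logical_size = len(store.read("a.txt")) + len(store.read("b.txt"))
print(logical_size, store.physical_size())
```

In this sketch the two files share two of their three chunks, so the store holds four unique chunks for six logical ones. The `read` method plays the role of the read-path filter described above: the caller gets the original bytes back without knowing how they are stored.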
Q: What opportunity does Data Deduplication present for Microsoft partners who help companies deploy Windows Server?
Data Deduplication is a great thing to offer to set up for customers for two reasons:
That said, the larger opportunity for Microsoft partners is that Data Deduplication is a great enabler for new VDI solutions, as supported by the ability to deduplicate running VDI workloads that we added in R2. Because it greatly reduces the amount of storage that’s required for VDI VHDs, it makes new VDI solutions a lot more cost-competitive.
Q: Can you provide an example of how much a company might save in a VDI scenario?
Let’s assume that you want to deploy 100 VDI VMs at 40 GB per desktop, and that for performance and reliability considerations you want to use mirrored, high-performance, dual-port SAS2 SSD drives:
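The capacity arithmetic behind a scenario like this can be sketched in a few lines of Python. The VM count and per-desktop size come from the scenario above. The two-way mirror factor and the 90 percent savings rate are assumptions for illustration (90 percent is the "up to" figure mentioned earlier, so treat the result as a best case rather than an expected outcome). The function name `raw_capacity_gb` is hypothetical.

```python
def raw_capacity_gb(vm_count, gb_per_vm, mirror_copies=2, savings_percent=0):
    """Raw disk capacity needed, in GB.

    mirror_copies=2 assumes a two-way mirror; savings_percent is an assumed
    deduplication savings rate, not a measured value.
    """
    logical_gb = vm_count * gb_per_vm
    stored_gb = logical_gb * (100 - savings_percent) / 100
    return stored_gb * mirror_copies


without_dedup = raw_capacity_gb(100, 40)
with_dedup = raw_capacity_gb(100, 40, savings_percent=90)  # illustrative best case

print(without_dedup, with_dedup)
```

Plugging in a savings rate measured on your own data, and multiplying the result by your per-GB drive cost, gives a rough estimate of the hardware savings.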
Thanks to Matthias for taking the time to discuss some of the most commonly asked questions about Data Deduplication. For more information on Data Deduplication, see my blog here; for more info on how to deploy it for VDI storage, see Matthias’s blog articles here and here.
Cheers,
Scott M. Johnson
Senior Program Manager
Windows Storage Server
@supersquatchy