Hyper-V backup at Private Cloud scale - System Center: Data Protection Manager Engineering Team Blog - Site Home - TechNet Blogs

Hyper-V backup at Private Cloud scale

Update Rollup 3 (UR3) for DPM 2012 R2 brings key enhancements for VM backups that help guarantee backup SLAs and make backups much more efficient at scale in virtualized deployments. This update aims to minimize the impact of backup on the production storage infrastructure in private cloud deployments (thousands of VMs) on Windows Server 2012 R2.

We support both of the following Hyper-V deployment configurations:

  • VMs hosted on a Hyper-V cluster with storage on SMB shares backed by a Scale-Out File Server cluster (Hyper-V over SOFS)
  • VMs hosted on a Hyper-V cluster with storage on Cluster Shared Volumes (Hyper-V over CSV)

Scale testing on SOFS

We ran extensive scale tests, taking continuous daily backups for 3 weeks using virtualized DPM servers. The guest OS on the protected VMs was Windows Server 2012 R2. The workloads running inside the VMs were spread across multiple IO profiles (SQL OLTP, Exchange, File Server, Video Streaming, SQL Decision Support System).

Here are the details of the Hyper-V over SOFS deployments:

Configuration                  Hyper-V over SOFS
Number of Hyper-V hosts        24
VM RAM                         2-8 GB
VM disk size                   120 GB (20 GB for OS + 100 GB for data)
Total number of VMs            1000
VM churn per day               5%
SOFS cluster nodes             4
Number of virtual DPM servers  8
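
For a rough sense of the data volume the configuration above implies, the short Python sketch below multiplies the numbers out. It assumes that the 5% daily churn applies to each VM's full 120 GB of disk and that VMs are spread evenly across the DPM servers. Neither assumption is stated in the post, and the variable names are illustrative only.

```python
# Back-of-the-envelope sizing from the SOFS test configuration.
# Assumptions (not stated in the post): churn applies to the full
# 120 GB per VM, and VMs are spread evenly across the DPM servers.
TOTAL_VMS = 1000
VM_DISK_GB = 120
DAILY_CHURN_PERCENT = 5
DPM_SERVERS = 8

protected_gb = TOTAL_VMS * VM_DISK_GB                          # total protected data
churn_gb_per_day = protected_gb * DAILY_CHURN_PERCENT // 100   # changed data per day
vms_per_server = TOTAL_VMS // DPM_SERVERS                      # average VMs per DPM server
churn_gb_per_server = churn_gb_per_day / DPM_SERVERS           # average daily churn per server

print(f"Protected data:            {protected_gb:,} GB")
print(f"Churn per day:             {churn_gb_per_day:,} GB")
print(f"Average VMs per server:    {vms_per_server}")
print(f"Average churn per server:  {churn_gb_per_server:,.0f} GB/day")
```

Under these assumptions the average of 125 VMs per DPM server sits inside the 50 to 250 range used in the tests.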

We scale tested with each DPM server protecting between 50 and 250 VMs. DPM VMs were deployed in a scale-out configuration to protect VMs from the same Hyper-V cluster nodes. We evaluated the results against the following criteria:

  • Backup success rate per day – The percentage of VMs with successful backups in a single day.
  • Overall backup success rate – The overall percentage of successful backups across all VMs over the 3-week test.

We achieved more than 98% on both metrics. This implies that more than 20,000 jobs ran successfully during the 3 weeks. The few errors that we encountered were due to known auto-recoverable failures, such as "Out of storage space" and "Retry-able VSS errors".
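
The 20,000-job figure follows from the test parameters. A minimal check in Python, assuming exactly one backup job per VM per day for 21 days (the post says only "continuous daily backups for 3 weeks"):

```python
# Sanity check of the job count implied by the scale test.
# Assumption: one backup job per VM per day, for exactly 21 days.
TOTAL_VMS = 1000
TEST_DAYS = 21
SUCCESS_PERCENT = 98  # lower bound reported in the post

total_jobs = TOTAL_VMS * TEST_DAYS
min_successful_jobs = total_jobs * SUCCESS_PERCENT // 100

print(total_jobs)           # 21000
print(min_successful_jobs)  # 20580
```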

Stress testing on SOFS

We stress tested the Hyper-V backups on a slightly different scale (2 DPM servers protecting 500 VMs), taking 8 backups a day (every 3 hours) for more than a week. Here’s a 3 min video which shows the backup in action:

Scale testing on CSV

We did scale testing for Hyper-V over CSV and got similar results. 

Configuration                  Hyper-V over CSV
Number of Hyper-V hosts        12
VM RAM                         1-8 GB
VM disk size                   50 GB (20 GB for OS + 30 GB for data)
Total number of VMs            600
SAN make/model                 Dell Compellent SC8000
Number of CSVs                 12
Number of virtual DPM servers  2

 

DPM Deployment

The recommended virtualized deployment model is to provision backup storage through VHDs residing on Scale-out File Server (SOFS) shares.

A suggested DPM deployment configuration is shown below:

Virtual processors  4
RAM                 8 GB
NIC                 10 Gbps
Storage             20 TB (20 x 1 TB dynamic VHDs) on an SMB share

This configuration has a few advantages:

  1. Virtualized DPM setup allows easy scale-out
  2. SOFS cluster provides storage resiliency
  3. VHDs used as the backup storage provide flexibility for data growth

Additionally, we heard some customers required the flexibility to run backups during off-peak hours, so the concept of a Backup Window for VM data sources was introduced. Here is how you can set the backup window using PowerShell (ensure that the backup schedule aligns with the StartTime parameter used in Set-DPMBackupWindow):

Set-DPMBackupWindow -ProtectionGroup <ModifiablePGObject> -StartTime 23:00 -DurationInHours 6

Set-DPMProtectionGroup <ModifiablePGObject>
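
Because a window that starts at 23:00 and lasts 6 hours wraps past midnight, it is easy to misjudge whether a scheduled backup time falls inside it. The Python sketch below is one way to check your own schedule before applying it. It is not part of DPM: the function name in_backup_window is hypothetical, and treating the window end as exclusive is an assumption, not documented DPM behavior.

```python
from datetime import time

def in_backup_window(scheduled: time, start: time, duration_hours: int) -> bool:
    """Return True if `scheduled` falls inside a window that opens at
    `start` and lasts `duration_hours`, wrapping past midnight if needed.
    Hypothetical helper; the window end is treated as exclusive (assumption)."""
    minutes_per_day = 24 * 60
    start_min = start.hour * 60 + start.minute
    sched_min = scheduled.hour * 60 + scheduled.minute
    # Minutes elapsed since the window opened, modulo one day.
    offset = (sched_min - start_min) % minutes_per_day
    return offset < duration_hours * 60

# Window from the example above: -StartTime 23:00 -DurationInHours 6
window_start = time(23, 0)
print(in_backup_window(time(23, 0), window_start, 6))  # True
print(in_backup_window(time(2, 30), window_start, 6))  # True
print(in_backup_window(time(5, 0), window_start, 6))   # False
print(in_backup_window(time(20, 0), window_start, 6))  # False
```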

Now that you have seen scalable VM backup in action, try it out yourself. Installation instructions for this DPM update are provided in KB 2966014.

    • Great improvements indeed, but where is the storage pool deduplication that was mentioned at TechEd NA 2014? From my understanding, UR3 should have provided this feature, which would benefit many of us.

    • Great post--thank you! Regarding the recommendation for virtualized DPM servers and using SOFS shares for storage, I'm a bit confused:

      - Virtualized DPM: I thought this prevents you from restoring individual files within a VHD/VHDX (without restoring the entire VHD/VHDX); Hyper-V is listed as a requirement for DPM for individual file restore from within a VHD/VHDX, which implies a physical DPM server environment. Is this no longer the case?

      - SOFS shares for storage: I thought Storage Spaces are not allowed or supported for DPM storage. SOFS (JBOD-style) would have Storage Spaces on the back end. Could you clarify what is supported with regard to Storage Spaces and DPM?

      Also, what is Microsoft's recommendation for backing up file servers? According to the documentation, DPM does not support backing up SOFS-based file servers directly (Hyper-V VMs on SOFS are supported by backing up through the Hyper-V node/cluster). This necessitates file server VMs running on Hyper-V with the VHDs/VHDXs on SOFS--a file server on top of a file server. Backing up the VM is MUCH, MUCH faster than backing up the file share itself. In a physical DPM environment where the files within the VHDs/VHDXs can be restored directly, is there any reason to back up anything other than the file server VM?

    • First, I'm sorry that UR3 did not solve our case with "old/deleted VMs, without any recovery point, still acting as production server in DPM". Second, I think it would be a huge improvement if DPM could integrate much more tightly with VMM. Maybe all VM backup/restore should basically be configurable from VMM?

    • Nice!
      I'm planning the following architecture for our next Hyper-V infrastructure:

      - 1 Hyper-V Cluster
      - 2 SOFS Clusters, each hosting half of the VM storage
      - 2 DPM VM groups, each backing up "their" SOFS to the other SOFS.

      Any thoughts about that? Seems like your tests validate it?

    • Thanks Miha. We are working on detailed guidance on configuration and deployment best practices for storage optimization. Will be publishing a new blog post as soon as it is ready.

    • Thanks Ryan for the feedback.
      1) Virtualized DPM - you can still restore individual files without restoring the entire VHD. Disks are directly attached to the DPM server, which is running as a VM.
      2) You are right in the sense that Storage Spaces is not supported as a DPM disk storage pool. Since DPM is running virtualized, it will see 20 disks of 1 TB each (VHDs on the SMB storage) in the config described above.
      3) We don't support backing up SOFS shares directly. Since you mentioned VM backup is much more efficient, you should just back up the VMs, and you don't need to run DPM in a physical environment.

    • Hi Rune,
      1. UR3 issue: Did you stop the protection (with or without retain data option) after the VM is deleted? If yes, can you send me the details to my email id: shivamg@microsoft.com
      2. DPM integration with VMM: Thanks for the feedback. It is in our roadmap.

      Thanks
      Shivam

    • Thanks Arthur. The architecture should absolutely work. We didn't test with 1 Hyper-V cluster connecting to 2 SOFS clusters though. Here are the 2 configs we tested with Hyper-V over SOFS:

      a) 1 Hyper-V cluster with 1 SOFS Cluster. DPM running in virtualized environment and part of the same Hyper-V cluster. In SOFS, we created different storage pools (spindle isolation) and assigned backup storage on one pool.

      b) 1 Hyper-V cluster with 1 SOFS cluster for Production VMs. Another Hyper-V cluster (only DPM servers) with a different SOFS Cluster for Backup storage.

      Shivam

    • Thank you for your quick response, Shivam! I was getting my information from this article: http://technet.microsoft.com/en-us/library/hh758184.aspx:

      "Server prerequisites

      "1. Install the Hyper-V role on the DPM server. DPM protects Hyper-V virtual machines even without the role, but you cannot perform item-level recovery (ILR) unless the role is installed."

      This documentation is apparently not accurate, then?

      Regarding Storage Spaces, it sounds like they are supported for DPM storage as long as they are not hosted on the DPM server itself, then? The documentation does not seem to make this exception clear.

      Thank you for the recommendation regarding file server backup (just back up the VM rather than the file shares within the VM). That kills two birds with one stone, and the performance is fantastic--a very nice feature of DPM. These types of scenarios (file server VM on top of a scale-out file server, or file-server-within-a-file-server) are not obvious when setting up an environment. The documentation isn't really helpful in this regard, because it seems very likely one could build a traditional SOFS file share, only to find later--buried in a caveat--that SOFS file share backup isn't supported in DPM.

    • Hi Shivam. I am curious to know how you went about generating the VM churn for your testing. Were any specific tools or scripts used to do this? I am interested in replicating this for our own testing/validation.