Information and announcements from Program Managers, Product Managers, Developers and Testers in the Microsoft Virtualization team.
Hyper-V Replica protects VMs by tracking and replicating changes to the VM's virtual hard disks (VHDs). Hyper-V Replica runs 24 hours, 365 days in a year; for any VM that has been enabled for replication, it ensures that the data on the primary site and the Replica site is kept as closely in sync as possible.
To begin with, Hyper-V Replica (HVR) requires that the data on the virtual hard disks (VHDs) of the primary and replica VMs be the same. This is achieved through the process of initial replication, and establishes a baseline on which replicated changes can be applied. However, due to factors beyond the control of the administrator – such as faulty hardware and OS bugchecks – it is possible that the primary and Replica VMs are not in sync.
Thus, in a rainy-day scenario (details in the following section), when HVR determines that the Replica VM can no longer be kept in sync with the primary by applying the replicated changes, resynchronization is required. Resynchronization (or resync) is the process of re-establishing the baseline by ensuring that the primary and Replica VHDs store exactly the same data.
(NOTE: In this post we will use a VM named “RESYNC VM” in all examples and screenshots.)
As the table below makes clear, resync is not expected to occur regularly; in the normal course of replication it is a rare event. The VM enters the “Resynchronization Required” state when any one of the following conditions is encountered:
Condition | Example scenario
Modify VHD when VM is turned off | Mount/modify VHD outside the VM, edit disk, offline patching
Size of tracking log files > 50% of total VHD size for a VM | Network outage causes logs to accumulate
Write failure to tracking log file | VHD and logs are on SMB and connectivity to the SMB storage is flaky
Tracking log file is not closed gracefully | Host crash with primary VM running (also applicable to VMs in a cluster)
Reverting the volume to an older point in time; reverting the VM to an older snapshot | Volume/snapshot backup and restore
Out-of-sequence or invalid log file is applied | Restoring a backed-up copy of the Replica VM; importing an older VM copy when migrating by using export-import; reverting the volume to an older point in time using volume backup and restore
When the VM enters the “Resynchronization Required” state, the replication health becomes “Critical” and the VM is scheduled for resynchronization. At the same time, HVR stops tracking the guest writes for the VM and nothing is replicated.
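To spot affected VMs without opening the replication health screen, something along these lines could be run on the host. This is a minimal sketch: it filters only on the `Health` value, since that is the symptom described above, and it shows `State` for inspection rather than assuming which state name corresponds to “Resynchronization Required”.

```powershell
# List every replication-enabled VM on this host whose replication health is Critical.
# State and Mode are shown so that you can tell why the VM is unhealthy.
Get-VMReplication |
    Where-Object { $_.Health -eq 'Critical' } |
    Select-Object VMName, State, Health, Mode
```

A Critical health value can have other causes as well, so treat the output as a list of VMs to look at, not as proof that resync is pending.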
The replication health will also show this message:
Depending on the VM setting, the user might have to trigger the resynchronization operation explicitly. When that is required, follow the instructions as given in the replication health screen:
You will be presented with the screen to schedule the resynchronization operation:
To start the resync operation from PowerShell, use the Resume-VMReplication cmdlet:
Resume-VMReplication -VMName "RESYNC VM" -Resynchronize -ResynchronizeStartTime "04/15/2013 12:00:00"
User-initiated resynchronization is also possible, but unless absolutely necessary it should be avoided. In order to explicitly force resynchronization on a VM that is not in the “Resynchronization Required” state, first suspend the replication and then initiate resync:
Suspend-VMReplication -VMName "RESYNC VM"
Resume-VMReplication -VMName "RESYNC VM" -Resynchronize
The scheduling of the resynchronization operation can be configured for each VM:
The default option is to schedule the resynchronization operation during off-peak hours. The resource intensive nature of the operation makes such scheduling useful, and aims to reduce the impact on running VMs.
The same can be configured in PowerShell using the Set-VMReplication cmdlet:
# Manual resync
Set-VMReplication -VMName "RESYNC VM" -AutoResynchronizeEnabled $false
# Automatic resync (allowed at any time of day)
Set-VMReplication -VMName "RESYNC VM" -AutoResynchronizeEnabled $true -AutoResynchronizeIntervalStart 00:00:00 -AutoResynchronizeIntervalEnd 23:59:59
# Scheduled resync (allowed only between 00:00 and 06:00)
Set-VMReplication -VMName "RESYNC VM" -AutoResynchronizeEnabled $true -AutoResynchronizeIntervalStart 00:00:00 -AutoResynchronizeIntervalEnd 06:00:00
To see the resynchronization settings in PowerShell, use the Get-VMReplication cmdlet and look for the AutoResynchronizeEnabled, AutoResynchronizeIntervalStart, and AutoResynchronizeIntervalEnd fields:
Get-VMReplication -VMName "RESYNC VM" | Format-List *
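If the full property list is too noisy, the output can be narrowed to the three fields named above. This is only a convenience variation on the same command; the field names are the ones mentioned in this post.

```powershell
# Show only the resynchronization settings for the example VM.
Get-VMReplication -VMName "RESYNC VM" |
    Format-List VMName, AutoResynchronizeEnabled, AutoResynchronizeIntervalStart, AutoResynchronizeIntervalEnd
```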
When the resync operation is triggered – either automatically or by the user – the following high-level sub-operations are executed in sequence:
Resynchronization performance was tested and compared against the performance of Online Initial Replication (IR). The setup consisted of a standalone server with 4 running VMs: 2 file servers and 2 SQL servers running typical workloads. Two VMs were replicated to a standalone Replica server. The network bandwidth was varied to see the impact. The data replicated during Online IR was approximately 80 GB.
The tests indicate that resync is preferable to Online IR on low-speed networks. When the two sites are connected by a high-speed network, resync works well for low-churn workloads.
There is also a perfmon counter for measuring the resynchronized bytes: \Hyper-V Replica VM\Resynchronized Bytes.
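The counter can also be sampled from PowerShell with Get-Counter. The sketch below assumes the counter set exposes one instance per replicating VM, which is why the path uses the `(*)` instance wildcard; the sample interval and count are arbitrary illustration values.

```powershell
# Take 3 samples, 5 seconds apart, of the resynchronized-bytes counter for all instances.
# The (*) instance wildcard is an assumption about how the counter set is organized.
Get-Counter -Counter '\Hyper-V Replica VM(*)\Resynchronized Bytes' -SampleInterval 5 -MaxSamples 3 |
    ForEach-Object { $_.CounterSamples | Select-Object InstanceName, CookedValue }
```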
Disks going out of sync is a rainy-day event in Hyper-V Replica. With the resynchronization operation, however, this is handled gracefully within the product, minimizing the administrative overhead and the resources used in bringing the disks back into sync.
Hyper-V Replica runs 24 hours, 365 days in a year ::-> I see a leap year bug right there ;)
Question: Is it possible to run a separate VM on the replica server that isn't part of the replication topology? For example, I want to replicate a critical production server to remote site, and at the remote replica site, also run a VM that isn't being replicated on the replica server.
Chris-Arch: Absolutely! A host that has been enabled as a Replica server is first and foremost a Hyper-V host, and being a Hyper-V host it can run VMs.
Your question also points to a well known deployment style. Since the Replica VMs are turned off during the normal course of replication, there are system resources that can be put to good use - and customers do run VMs on these hosts to utilize these resources. Prevents Replica servers from sitting idle (figuratively speaking).
A word of caution here: going down this path requires you to be aware of the resources required at the time of failover. If the Replica servers cannot accommodate all VMs when running then there has to be a plan to handle that situation.
Great writeup. We had a power outage and the Replica 2012 R2 servers rebooted. However, one of the running 'source' VMs of the Replica came up with file corruption that was not caught immediately; people had to log on and work for a few minutes before it became apparent. In that time, the target VM that it was replicating to also became corrupted. Now in theory we could have a day's worth of snapshots, but my preference would be that Replica not auto-start on a reboot. I have not had any luck yet in finding out how that could be done. Your input is appreciated in advance. Arlester.
Let me answer this in a few stages. The short answer to your question is: there is no out-of-the-box way to do this today. Any solution to solve your problem will involve some amount of scripting.
1) Regarding file corruption:
The way Hyper-V Replica works is that if there is a possible VHD corruption detected, we set the VMs to resynchronize. So if your VM starts replicating on host restart, it means that the VHD is okay. Of course, that says nothing about the application using the VHD... and it is quite possible that there are inconsistencies at an application-level that would still be consistent at the VHD level.
A good practice here would be to schedule the resynchronization or make it manual so that it is not triggered immediately. The settings for resynchronization can be found under VM Settings --> Replication --> Resynchronization.
2) Pausing replication:
In the scenario that you have described, you seem to be looking for a way to pause the replication on a host restart. There are two places where you can pause the replication: on the source or on the destination. The suggestion I would give is to write a script to monitor your source and pause VMs on the destination when the source is unreachable. Pausing replication on the destination will reject incoming packets from the source, while pausing replication on the source will stop sending the packets itself. The reason I think pausing on the destination is better is because you will need a VM object to trigger this action... and on the source this needs to be done as soon as the service is up. This can lead to timing issues which can be easily avoided by pausing on the destination site.
You can use PowerShell scripts or create a SCO runbook to monitor and pause the VMs.
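As a rough illustration of the PowerShell approach, a script like the one below could run on the Replica server, for example from a scheduled task. Treat it as a sketch under assumptions, not a tested solution: the host name is a placeholder, a ping check stands in for whatever "source is unreachable" should mean in your environment, and it assumes the `PrimaryServer` value reported by Get-VMReplication matches the name you supply.

```powershell
# Hypothetical name of the primary (source) Hyper-V host; replace with your own.
$primaryServer = 'primary01.contoso.com'

# Run on the Replica server. If the primary does not answer two pings,
# pause replication for every Replica VM that is currently replicating from it.
if (-not (Test-Connection -ComputerName $primaryServer -Count 2 -Quiet)) {
    Get-VMReplication |
        Where-Object {
            $_.Mode -eq 'Replica' -and
            $_.PrimaryServer -eq $primaryServer -and
            $_.State -eq 'Replicating'
        } |
        ForEach-Object { Suspend-VMReplication -VMName $_.VMName }
}
```

Replication then stays paused until someone has checked the source VM and resumes it with Resume-VMReplication.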
Hope this helps.
I resized my disk (VHD), and now it says: "Cannot perform operation for virtual machine as virtual size of one or more virtual hard disks are different between primary and replica servers. Delete and re-enable replication."
What are the steps?
Hi Geoffrey, I would suggest reading the blog post about online resize:
Also, please check which OS version you are running. Online resize is supported only from Windows Server 2012 R2 onwards.