A recent conversation with a customer raised the question: what is the best way to create an entire Replica site from scratch? On the surface this seems simple enough: configure initial replication to send the data over the network, one VM after another. For this specific customer, however, there were some additional constraints:
This left out-of-band initial replication (OOB IR) as the only realistic way to transfer the data. But at 300GB per VM, it is easy to exhaust a 1TB removable drive. That got us thinking about deduplication. After all, deduplication is supported on the Replica site, so why not use it to deduplicate the OOB IR data?
So I tested this out in my lab environment with a removable USB drive and a bunch of VMs created from the same Windows Server 2012 VHDX file. The expectation was that at least 20% to 40% of the data would be the same across the VMs, so the overall deduplication rate would be quite high and we could fit a good number of VMs onto the removable USB drive.
I started this experiment by attaching the removable drive to my server and attempted to enable deduplication on the associated volume in Server Manager.
Interesting discovery #1: Deduplication is not allowed on volumes on removable disks
Whoops! This seems like a fundamental block to our scenario: how do you build a deduplicated OOB IR if deduplication is not supported on removable media? This limitation is officially documented here: http://technet.microsoft.com/en-us/library/hh831700.aspx, which says: “Volumes that are candidates for deduplication must conform to the following requirements: Must be exposed to the operating system as non-removable drives. Remotely-mapped drives are not supported.”
Fortunately my colleague Paul Despe in the Windows Server Data Deduplication team came to the rescue. There is a (slightly) convoluted way to get the data on the removable drive and deduplicated. Here goes:
You can also enable deduplication in PowerShell with the following cmdlets:
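A minimal sketch of what enabling deduplication from PowerShell might look like. The drive letter `E:` is a hypothetical placeholder for the volume prepared in the steps above, and setting the minimum file age to 0 days is an assumption made so that freshly copied VHDs become eligible for optimization right away; adjust both to your environment.

```powershell
# Assumption: "E:" is a hypothetical drive letter for the volume prepared above.
$volume = "E:"

# Install the Data Deduplication role service if it is not already present.
Install-WindowsFeature -Name FS-Data-Deduplication

# Turn on deduplication for the volume.
Enable-DedupVolume -Volume $volume

# Assumption: newly copied files should be optimized immediately, so lower
# the minimum file age (which otherwise skips recent files) to 0 days.
Set-DedupVolume -Volume $volume -MinimumFileAgeDays 0
```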
Now you are set to start the OOB IR process and take advantage of the deduplicated volume. This is what I saw after 1 VM was enabled for replication with OOB IR:
That’s about 32.6GB of storage used. Wait… shouldn’t there be a reduction in size because of deduplication?
Ah… so if you were expecting the VHD data to arrive on the volume in deduplicated form, this is going to be a bit of a surprise. On the first pass, the VHD data is present on the volume at its original size. Deduplication happens after the fact, as a job that crunches the data and reduces the size of the VHD once it has been fully copied as part of the OOB IR process. This is because deduplication needs an exclusive handle on the file in order to do its work.
The good part is that you can trigger the job on demand and start deduplication as soon as the first VHD is copied. You can do that by using the PowerShell cmdlet provided:
There are other parameters provided by the cmdlet that allow you to control the deduplication job. You can explore the various options in the TechNet documentation: http://technet.microsoft.com/en-us/library/hh848442.aspx.
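As a rough sketch, an on-demand optimization job can be started with `Start-DedupJob`. The drive letter `E:` is again a hypothetical placeholder, and the use of `-Wait` is just one possible choice that blocks until the job finishes.

```powershell
# Assumption: "E:" is a hypothetical drive letter for the deduplicated volume.
$volume = "E:"

# Start an optimization (deduplication) job now and block until it completes.
Start-DedupJob -Volume $volume -Type Optimization -Wait

# Alternative: start the job without -Wait and poll its progress instead.
# Start-DedupJob -Volume $volume -Type Optimization
# Get-DedupJob -Volume $volume
```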
This is what I got after the deduplication job completed:
That’s a 54% saving with just one VM – a very good start!
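If you prefer to read the savings from PowerShell rather than from the UI, something along these lines should work. The drive letter `E:` is a hypothetical placeholder, and the selected properties are simply the ones that seemed most relevant here.

```powershell
# Assumption: "E:" is a hypothetical drive letter for the deduplicated volume.
$volume = "E:"

# Overall savings for the volume (space saved and savings rate in percent).
Get-DedupVolume -Volume $volume |
    Format-List Volume, SavedSpace, SavingsRate

# More detail: free space and how many files have been optimized so far.
Get-DedupStatus -Volume $volume |
    Format-List Volume, FreeSpace, SavedSpace, OptimizedFilesCount, InPolicyFilesCount
```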
After this I threw in a few more virtual machines with completely different applications installed and here is the observed savings after each step:
I think the excellent results speak for themselves! Notice how, between VM2 and VM3, almost all of the data (~9GB) has been absorbed by deduplication, with an increase of only 300MB! As the deduplication team has published on TechNet, VDI VMs have a high degree of similarity in their disks and so yield a much higher deduplication rate. A random mix of VMs yields surprisingly good results as well.
Once you are done with the OOB IR and deduplication of your VMs, you need to do the following steps:
Hope this blog post helps with setting up your own Hyper-V Replica sites from scratch using OOB IR! Try it out and let us know your feedback.