First published on TechNet on Nov 18, 2012
Have you ever had someone – either a customer or colleague – build a mission-critical server without taking time, up front, to ensure they are following best practices? Or what about the build document you spent untold hours creating, only to have it completely ignored? If neither of these has happened to you, consider yourself lucky! For the rest of us, however, this happens far too often.
In my first official posting on Ask Premier Field Engineering (PFE) Platforms, I would like to share a checklist I developed over time while working with customers’ Hyper-V environments. I find it’s a great tool to use not only when reviewing an existing Hyper-V implementation, but one which can be easily leveraged as part of pre-planning stages, to ensure best practices are implemented from the start.
Although the majority of the items within this checklist still apply to Hyper-V in Server 2012, I will be sharing an updated checklist in the near future specific to the latest Hyper-V version. Stay tuned, folks!! :)
My colleague, Mike Hildebrand, posted a somewhat similar entry, which I strongly encourage you to review, as well: https://techcommunity.microsoft.com/t5/core-infrastructure-and-security/the-journey-of-a-thousand-vm...
Disclaimer: As with all Best Practices, not every recommendation can – or should – be applied. Best Practices are general guidelines, not hard-and-fast rules that must be followed. As such, you should carefully review each item to determine if it makes sense in your environment. If implementing one (or more) of these Best Practices seems sensible, great; if it doesn't, simply ignore it. In other words, it's up to you to decide if you should apply these in your setting.
⎕ Use Server Core if possible, to reduce OS overhead, reduce potential attack surface, and to minimize reboots (due to fewer software updates)
⎕ Hyper-V services should be configured to start automatically, to ensure uninterrupted VM services after reboots. (Verify in Administrative Tools → Services.)
⎕ Ensure hosts are up-to-date with recommended Microsoft updates, to ensure critical patches and updates – addressing security concerns or fixes to the core OS – are applied.
⎕ Ensure all applicable Hyper-V hotfixes and Cluster hotfixes (if the host is clustered) have been applied. Review the following sites and compare them against your environment, since not all hotfixes will be applicable:
⎕ Install the latest PowerShell version (currently 3.0) on each Hyper-V host:
⎕ Download and install the Hyper-V PowerShell Management Library
⎕ Ensure hosts have the latest BIOS version, to address any known issues/supportability
⎕ Host should be domain joined, unless security standards dictate otherwise. Doing so makes it possible to centralize the management of policies for identity, security, and auditing. Additionally, hosts must be domain joined before you can create a Hyper-V High-Availability Cluster.
⎕ RDP Printer Mapping should be disabled on hosts, to remove any chance of a printer driver causing instability issues on the host machine.
⎕ Set host power plan to High Performance, to ensure maximum CPU performance.
⎕ Do not install any other Roles on a host besides the Hyper-V role
⎕ The only Features that should be installed on the host are: Failover Cluster Manager (if host will become part of a cluster) and Multipath I/O (if host will be connecting to an iSCSI SAN, for example). (See explanation above for reasons why installing additional features is not recommended.)
⎕ Anti-virus software can be installed, if desired; however, be sure to exclude Hyper-V specific files using KB 961804:
⎕ Default VM path and VHD path should be set to a non-system drive, because placing them on the system drive can cause disk latency, as well as create the potential for running out of disk space.
⎕ Enable iSCSI Service TCP-In (for Inbound) and iSCSI Service TCP-Out (for outbound) in Firewall settings on host (Port 3260), to allow iSCSI traffic to pass to and from host and SAN device. Not enabling these rules will prevent iSCSI communication.
⎕ Periodically run performance counters against the host, to ensure optimal performance.
Recommend using the Hyper-V R2 SP1 performance counter that can be extracted from the (free) Codeplex PAL application:
⎕ If server has more than 32 physical cores, do not enable Hyper Threading, as it creates more logical cores than Hyper-V supports on Server 2008 R2. (Max is 64.)
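The arithmetic behind this item is simple enough to sketch. The snippet below is only an illustration, assuming Hyper-Threading presents two logical processors per physical core and using the 64 logical processor limit quoted above; the function names are hypothetical and not part of any Microsoft tool.

```python
# Limit quoted in this checklist for Hyper-V on Server 2008 R2.
MAX_LOGICAL_PROCESSORS = 64


def logical_processors(physical_cores: int, hyper_threading: bool) -> int:
    """Assumption: Hyper-Threading exposes two logical processors per core."""
    return physical_cores * 2 if hyper_threading else physical_cores


def within_hyperv_limit(physical_cores: int, hyper_threading: bool) -> bool:
    """True if the resulting logical processor count stays within the limit."""
    return logical_processors(physical_cores, hyper_threading) <= MAX_LOGICAL_PROCESSORS


if __name__ == "__main__":
    for cores in (16, 32, 40):
        for ht in (False, True):
            status = "OK" if within_hyperv_limit(cores, ht) else "exceeds limit"
            print(f"{cores} cores, Hyper-Threading={ht}: "
                  f"{logical_processors(cores, ht)} logical processors ({status})")
```

A host with exactly 32 physical cores lands right on the limit with Hyper-Threading enabled; anything above that only fits with Hyper-Threading turned off.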
⎕ Ensure NICs have the latest firmware, which often addresses known issues with hardware.
⎕ Ensure latest NIC drivers have been installed on the host, which resolve known issues and/or increase performance.
⎕ Consider disabling Chimney Offload, as it has been found to cause slowness of virtual machines.
From an elevated command-prompt, type the following:
netsh int tcp set global chimney=disabled
⎕ Jumbo frames should be turned on and set for 9000 or 9014 (depending on your hardware) for CSV, iSCSI and Live Migration networks. This can significantly increase throughput (up to 6x) while also reducing CPU cycles.
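Jumbo frames only help if every hop (NIC, switch, SAN port) accepts them, so it is worth testing end to end with a ping that is not allowed to fragment. The sketch below works out the payload size for such a test. It assumes IPv4 with no header options (20 bytes) plus the 8-byte ICMP header, and it assumes that drivers offering 9014 are counting the 14-byte Ethernet header on top of a 9000-byte MTU; the helper names and the target address are placeholders of my own.

```python
IPV4_HEADER_BYTES = 20      # IPv4 header without options
ICMP_HEADER_BYTES = 8       # ICMP echo request header
ETHERNET_HEADER_BYTES = 14  # assumed to be included in a "9014" driver setting


def ip_mtu(driver_setting: int, includes_ethernet_header: bool) -> int:
    """Convert a NIC driver's jumbo frame setting to the IP MTU."""
    if includes_ethernet_header:
        return driver_setting - ETHERNET_HEADER_BYTES
    return driver_setting


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def ping_command(target: str, mtu: int = 9000) -> str:
    """Build a Windows ping command: -f sets Don't Fragment, -l sets payload size."""
    return f"ping -f -l {max_ping_payload(mtu)} {target}"


if __name__ == "__main__":
    # 10.0.0.2 is a placeholder for a host on the iSCSI, CSV or Live Migration network.
    print(ping_command("10.0.0.2", ip_mtu(9014, includes_ethernet_header=True)))
```

If the resulting ping reports that the packet needs to be fragmented, at least one device in the path is still using a smaller frame size.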
⎕ NICs used for iSCSI communication should have all Networking protocols (on the Local Area Connection Properties) unchecked, with the exception of:
Unbinding other protocols (not listed above) helps eliminate non-iSCSI traffic/chatter on these NICs.
⎕ When creating virtual switches, uncheck the "Allow management operating system to share this network adapter" option, in order to create a dedicated network for your VM(s) to communicate with other computers on the physical network.
⎕ Recommended network configuration when clustering:
Min # of Networks on Host
VM Network Access
** CSV & Live Migration Networks can be crossover cables, if you are building a 2 node cluster **
VIRTUAL NETWORK ADAPTERS (NICs):
⎕ Legacy Network Adapters (a.k.a. Emulated NIC drivers) should only be used for PXE booting a VM or when installing non-Hyper-V aware Guest operating systems. Hyper-V's synthetic NICs (the default NIC selection; a.k.a. Synthetic NIC drivers) are far more efficient, because they use a dedicated VMBus to communicate between the virtual NIC and the physical NIC; as a result, they consume fewer CPU cycles and require far fewer hypervisor/guest transitions per operation.
⎕ Disks should be Fixed or Pass-Through in a production environment, to increase disk throughput. Differencing and Dynamic disks are not recommended for production, due to possible data loss (differencing disks) and increased disk read/write latency times (differencing/dynamic disks).
⎕ Disable snapshots from all production VMs. Snapshots can cause disk space issues, as well as additional physical I/O overhead.
⎕ The physical format of hard disk drives used for hosting VMs should be 512-byte sectors, to prevent compatibility issues (see http://support.microsoft.com/kb/2515143).
It is not recommended to use 512e formatting for disks that will house VHDs, because internal testing has shown a performance degradation of around 30% for most workloads.
Regarding 4K Disks:
“The VHD driver in Server 2008 R2 assumes that the physical sector size of the disk to be 512 bytes and issues 512 byte IOs, which makes it incompatible with these disks. The VHD stack fails to open the VHD files on physical 4kB sector disks for this reason.”
Taken from: http://support.microsoft.com/kb/2515143
Side-Note: Windows 2012 fully supports 4K disks out of the box.
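To make the three disk formats concrete, here is a small sketch that classifies a disk from its logical and physical sector sizes. It assumes you have already obtained both values (on systems that report them, fsutil fsinfo ntfsinfo lists "Bytes Per Sector" and "Bytes Per Physical Sector"); the function names are hypothetical and the suitability check simply restates the guidance above for Server 2008 R2.

```python
def sector_format(logical_bytes: int, physical_bytes: int) -> str:
    """Classify a disk by its logical and physical sector sizes (in bytes)."""
    if logical_bytes == 512 and physical_bytes == 512:
        return "512-byte native"
    if logical_bytes == 512 and physical_bytes == 4096:
        return "512e"        # 4K physical sectors, 512-byte emulation
    if logical_bytes == 4096 and physical_bytes == 4096:
        return "4K native"
    return "unknown"


def recommended_for_vhds_on_2008_r2(logical_bytes: int, physical_bytes: int) -> bool:
    """Per the checklist: only 512-byte native disks are recommended for VHDs."""
    return sector_format(logical_bytes, physical_bytes) == "512-byte native"


if __name__ == "__main__":
    for logical, physical in ((512, 512), (512, 4096), (4096, 4096)):
        print(logical, physical, sector_format(logical, physical),
              recommended_for_vhds_on_2008_r2(logical, physical))
```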
⎕ Page file on Hyper-V Host should be set to a fixed size (4GB max) on the system drive, since most Hyper-V implementations have large amounts of physical memory, and, by default, the page file is the same size as the physical amount of memory.
⎕ Set reserved Hyper-V Parent Host memory, to ensure memory is set aside for the host, itself.
⎕ Use Dynamic Memory on all VMs (unless not supported, e.g., Lync).
⎕ Guest OS should be configured with (minimum) recommended memory
⎕ Ensure Integration Components (IC) have been installed on all VMs (Pre 2008/Pre Win 7/Other OS). ICs significantly improve interaction between the VM and the physical host.
⎕ Set preferred network for CSV communication, to ensure the correct network is used for this traffic. (Note: This will only need to be run on one of your Hyper-V nodes.)
⎕ Set preferred network for Live Migration, to ensure the correct network(s) are used for this traffic:
⎕ The Cluster Shutdown Time (ShutdownTimeoutInMinutes registry entry) should be set to an acceptable number
⎕ Each node in the cluster requires an identically named (case sensitive!) virtual switch. Failovers and Live Migrations will fail without identically named switches
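Because the match is case sensitive, a mismatch such as "VM-External" versus "vm-external" is easy to miss by eye. The following sketch compares switch names across nodes using exact string comparison. It assumes you have already collected the names from each node by whatever means you prefer; the node and switch names shown are made up for illustration.

```python
from typing import Dict, Set


def switch_name_mismatches(switches_by_node: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each node, list switch names found on other nodes but missing here.

    Comparison is exact and case sensitive, so "VM-External" and
    "vm-external" are treated as two different switches.
    """
    all_names: Set[str] = set().union(*switches_by_node.values())
    missing = {node: all_names - names for node, names in switches_by_node.items()}
    # Only report nodes that are actually missing something.
    return {node: names for node, names in missing.items() if names}


if __name__ == "__main__":
    # Hypothetical inventory: NODE2's switch differs only in letter case.
    inventory = {
        "NODE1": {"VM-External", "VM-Internal"},
        "NODE2": {"vm-external", "VM-Internal"},
    }
    for node, names in sorted(switch_name_mismatches(inventory).items()):
        print(f"{node} is missing: {', '.join(sorted(names))}")
```

An empty result means every node has the same set of switch names.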
⎕ Run Cluster Validation periodically to remediate any issues
VIRTUAL DOMAIN CONTROLLERS (DCs):
⎕ It is recommended to partially disable the time synchronization between the VM DC and the host (using registry change). This enables the guest DC to synchronize time for the domain hierarchy, but protects it from having a time skew if it is restored from a saved state:
reg add HKLM\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\VMICTimeProvider /v Enabled /t reg_dword /d 0
⎕ DC VMs should have "Shut down the guest operating system" in the Automatic Stop Action setting applied (in the settings on the Hyper-V Host)
⎕ If VHDs are IDE/ATA drives, ensure disk write caching is disabled, to reduce the chance of AD corruption.
Thanks for reading!