Guest clustering describes an increasingly popular deployment configuration for Windows Server Failover Clusters where the entire infrastructure is virtualized. With a traditional cluster, the hosts are physical servers and run virtual machines (VMs) as their highly available workloads. With a guest cluster, the hosts are also VMs which form a virtual cluster, and they run additional virtual machines nested within them as their highly available workloads. Microsoft now recommends dedicating clusters for each class of enterprise workload, such as your Exchange Server, SQL Server, File Server, etc., because each application has different cluster settings and configuration requirements. Setting up additional clusters became expensive for organizations when they had to purchase and maintain more physical hardware. Other businesses wanted guest clustering as a cheaper test, demo or training infrastructure. To address this challenge, Microsoft Hyper-V supports “nested virtualization” which allows you to create virtualized hosts and run VMs from them, creating fully-virtualized clusters. While this solves the hardware problem, it has created new obstacles for backup providers as each type of guest cluster has special considerations.
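To make a Hyper-V VM capable of hosting its own nested VMs, nested virtualization must be enabled per virtual machine on the physical host. As a minimal sketch (the VM name "GuestNode1" is a placeholder, and the VM must be powered off first):

```powershell
# Expose the processor's virtualization extensions to a guest cluster node.
# Run on the physical Hyper-V host while the VM is powered off.
Set-VMProcessor -VMName "GuestNode1" -ExposeVirtualizationExtensions $true

# Nested guests need MAC address spoofing (or NAT) to reach the network.
Get-VMNetworkAdapter -VMName "GuestNode1" | Set-VMNetworkAdapter -MacAddressSpoofing On
```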
Hyper-V Guest Cluster Configuration and Storage
Let’s first review the basic configuration and storage requirements for a guest cluster. Fundamentally, a guest cluster has the same requirements as a physical cluster, including two or more hosts (nodes), a highly available workload or VM, redundant networks, and shared storage. The entire solution must also pass the built-in cluster validation tests. You should also force every virtualized cluster node to run on a different physical host, so that the failure of a single server does not bring down your entire guest cluster. This can be easily configured using Failover Clustering’s AntiAffinityClassNames or Azure Availability Sets. Some of the guest cluster requirements will also vary based on the nested application which you are running, so always check for workload-specific requirements during your planning.
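As a hedged sketch of the anti-affinity approach mentioned above, you can assign the same anti-affinity class name to each guest node's cluster group on the physical host cluster, which tells Failover Clustering to keep those VMs apart (group and class names below are placeholders):

```powershell
# Keep the virtualized cluster nodes on different physical hosts by giving
# each node's cluster group the same anti-affinity class name.
# Run on the physical host cluster; names are placeholders.
$class = New-Object System.Collections.Specialized.StringCollection
$class.Add("GuestClusterNodes")
(Get-ClusterGroup -Name "GuestNode1").AntiAffinityClassNames = $class
(Get-ClusterGroup -Name "GuestNode2").AntiAffinityClassNames = $class
```

AntiAffinityClassNames is a soft preference: the cluster will still place two nodes together if no other host is available, which is usually preferable to leaving a guest node offline.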
Shared storage used to be a requirement for all clusters because it allows the workload or VM to access the same data regardless of which node is running that workload. When the workload fails over to a different node, its services are restarted and it resumes accessing the same shared data it was previously using. Windows Server 2012 R2 and later support guest clusters with shared storage using a shared VHDX disk, iSCSI, or virtual fibre channel. Microsoft added support for replicating local DAS drives using Storage Spaces Direct (S2D) in Windows Server 2016 and continued to improve S2D in the 2019 release.
For a guest cluster deployment guide, you can refer to Microsoft’s documentation on creating a guest cluster using Hyper-V. If you want to do this in Microsoft Azure, you can also follow Microsoft’s guidance on enabling nested virtualization within Azure.
Backup and Restore the Entire Hyper-V Guest Cluster
The easiest backup solution for guest clustering is to save the entire environment by protecting all the VMs in that set. This has almost-universal support from third-party backup vendors such as Altaro, as it is essentially just protecting traditional virtual machines which have a relationship to each other. If another VM in the set acts as an isolated domain controller, iSCSI target, or file share witness, make sure it is backed up too.
A (guest) cluster-wide backup is also the easiest solution for scenarios where you wish to clone or redeploy an entire cluster for test, demo or training purposes by restoring it from a backup. If you are restoring a domain controller, make sure you bring it back online first. If you are deploying copies of a VM, especially one containing a domain controller, make sure the images have been Sysprepped to give them new global identifiers and avoid conflicts. Also, use DHCP to assign new IP addresses to all network interfaces. In this scenario, it is usually much easier to deploy the cloned infrastructure in a fully isolated environment so that the cloned domain controllers do not cause conflicts.
The downside to cluster-wide backup and restore is that you will lack the granularity to protect and recover a single workload (or item) running within the VM, which is why most admins will select another backup solution for guest clusters. Before you pick one of the alternative options, make sure that both your storage and backup vendor support this guest clustering configuration.
Backup and Restore a Guest Cluster using iSCSI or Virtual Fibre Channel
When guest clusters first became supported for Hyper-V, the most popular storage configurations used an iSCSI target or virtual fibre channel. iSCSI was popular because it was entirely Ethernet-based, which meant that inexpensive commodity hardware could be used, and Microsoft offered a free iSCSI Target Server. Virtual fibre channel was also prevalent since it was the first type of SAN-based storage supported by Hyper-V guest clusters through its virtualized HBAs. Either solution works fine, and most backup vendors support Hyper-V VMs running on these shared storage arrays, making this a perfectly acceptable solution for reliable backups and recovery if you are deploying a stable guest cluster. The main challenge was that in earlier versions, Cluster Shared Volumes (CSV) disks and live migration had limited support from backup vendors. This meant that basic backups would work, but many scenarios would cause backups to fail, such as when a VM was live migrating between hosts. Most scenarios are now supported in production, but still make sure that your storage and backup vendors support and recommend this configuration.
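As a rough sketch of the iSCSI approach, the free Microsoft iSCSI Target Server role can present a shared LUN to the guest cluster nodes; target names, paths, sizes, and initiator IQNs below are all placeholders for illustration:

```powershell
# Install the free iSCSI Target Server role on the storage server.
Install-WindowsFeature -Name FS-iSCSITarget-Server

# Create a target and restrict access to the guest nodes by initiator IQN
# (placeholder IQNs shown).
New-IscsiServerTarget -TargetName "GuestClusterTarget" `
    -InitiatorIds "IQN:iqn.1991-05.com.microsoft:guestnode1","IQN:iqn.1991-05.com.microsoft:guestnode2"

# Create a virtual disk and map it to the target as the shared LUN.
New-IscsiVirtualDisk -Path "C:\iSCSIVirtualDisks\SharedData.vhdx" -SizeBytes 100GB
Add-IscsiVirtualDiskTargetMapping -TargetName "GuestClusterTarget" `
    -Path "C:\iSCSIVirtualDisks\SharedData.vhdx"
```

Each guest node then connects to the target with the built-in iSCSI initiator, after which the disk appears as local shared storage that can pass cluster validation.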
Backup and Restore a Guest Cluster using a Shared Virtual Hard Disk (VHDX) & VHD Set
Windows Server 2012 R2 introduced a new type of shared storage disk which was optimized for guest clustering scenarios, known as the shared virtual hard disk (.vhdx file), or Shared VHDX. This allowed multiple VMs to simultaneously access a single data file which represents a shared disk (similar to a drive shared by an iSCSI target). This disk could be used as a witness disk or, more commonly, to store shared application data used by the workload running on the guest cluster. The Shared VHDX file could be stored either on a CSV disk or on an SMB file share (using a Scale-Out File Server).
This first release of a shared virtual hard disk had some limitations and was generally not recommended for production. The main criticisms were that backups were not reliable, and backup vendors were still catching up to support this new format. Windows Server 2016 addressed these issues by adding support for online resizing, Hyper-V Replica, and application-consistent checkpoints. These enhancements were released as a newer Hyper-V VHD Set (.vhds) file format. The VHD Set included additional file metadata which allowed each node to have a consistent view of that shared drive’s metadata, such as the block size and structure. Prior to this, nodes might have an inconsistent view of the Shared VHDX file structure which could cause backups to fail.
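As a minimal sketch of the newer format, creating a file with the .vhds extension produces a VHD Set, which is then attached to every node of the guest cluster (paths and VM names below are placeholders):

```powershell
# Create a VHD Set on a CSV volume (Windows Server 2016+).
# The .vhds extension tells New-VHD to create a VHD Set rather than a plain VHDX.
New-VHD -Path "C:\ClusterStorage\Volume1\SharedData.vhds" -SizeBytes 100GB -Fixed

# Attach the same VHD Set file to each node of the guest cluster.
Add-VMHardDiskDrive -VMName "GuestNode1" -Path "C:\ClusterStorage\Volume1\SharedData.vhds"
Add-VMHardDiskDrive -VMName "GuestNode2" -Path "C:\ClusterStorage\Volume1\SharedData.vhds"
```

Inside the guest nodes, the VHD Set appears as a standard shared disk that can be brought online, validated, and added to the cluster's available storage.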
While the VHD Set format was optimized to support guest clusters, some issues were inevitably discovered, and these are documented by Microsoft Support. When using Shared VHDX / VHD Sets for your guest cluster, it is important that all of your storage, virtualization, and clustering components are patched with any related hotfixes specific to your environment, including any from your storage and backup provider. Also, make sure you explicitly check that your ISVs support this updated file format and follow Microsoft’s best practices. Today this is the recommended deployment configuration for most new guest clusters.
Backup and Restore a Guest Cluster using Storage Spaces Direct (S2D)
Microsoft introduced another storage management technology, known as Storage Spaces Direct (S2D), in Windows Server 2016 and improved it in Windows Server 2019. S2D was designed as a low-cost solution to support clusters without any requirement for shared storage. Instead, local DAS drives are synchronously replicated between cluster nodes to maintain a consistent state. This is certainly the easiest guest clustering solution to configure; however, Microsoft has documented some limitations in the current release (this link also includes a helpful video showing how to deploy an S2D cluster in Azure).
First, you are restricted to a 2-node or 3-node cluster, and in either case you can only sustain the loss or outage of a single node. You also want to ensure that the disks have low latency and high performance, ideally by using SSD drives or Azure’s Premium Storage managed disks. One major limitation remains around backups: host-level virtual disk backups are currently not supported, so if you deploy an S2D guest cluster, you are restricted to taking backups from within the guest OS. Until this is resolved and your backup vendor supports S2D, the safest option with the most flexibility is to deploy a guest cluster using Shared VHDX / VHD Sets.
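For reference, standing up a basic 2-node S2D guest cluster is only a few commands once both guest nodes have unformatted local disks attached; the cluster, node, and volume names below are placeholders in this sketch:

```powershell
# Run from one of the guest nodes after attaching local disks to both.
Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
New-Cluster -Name "GuestS2D" -Node "GuestNode1","GuestNode2" -NoStorage

# Claim the eligible local disks and build the S2D storage pool.
Enable-ClusterStorageSpacesDirect

# Create a mirrored CSV volume on the pool for the clustered workload's data.
New-Volume -FriendlyName "Data" -FileSystem CSVFS_ReFS `
    -StoragePoolFriendlyName "S2D*" -Size 100GB
```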
Microsoft is striving to improve guest clustering with each subsequent release. Unfortunately, this makes it challenging for third-party vendors to keep up with their support of the latest technology. It can be especially frustrating when your preferred backup vendor has not yet added support for the latest version of Windows, so share your feedback on what you need with your ISVs. It is always a good practice to select a vendor with close ties to Microsoft, as they are given early access to code and aim to support the latest technology as soon as possible. The leading backup companies like Altaro are staffed by Microsoft MVPs and regularly consult with former Microsoft engineers, such as myself, to support the newest technologies as quickly as possible. But always do your homework before you deploy any of these guest clusters so you can pick the best configuration which is supported by your backup and storage provider.