Save to My DOJO
This article will help you understand how to plan, configure and optimize your SOFS infrastructure, primarily focused on Hyper-V scenarios.
Over the past decade, it seems that an increasing number of components are recommended when building a highly-available Hyper-V infrastructure. I remember my first day as a program manager at Microsoft when I was tasked with building my first Windows Server 2008 Failover Cluster. All I had to do was connect the hardware, configure shared storage, and pass Cluster Validation, which was fairly straightforward.
Figure 1 – A Failover Cluster with Traditional Cluster Disks
Nowadays, the recommend cluster configuration for Hyper-V virtual machines (VMs) requires adding additional management layers such as Cluster Shared Volumes (CSV), disks which must also cluster a file server to host the file path to access it, known as a Scale-Out File Server (SOFS). While the SOFS provides the fairly basic functionality of keeping a file share online, understanding this configuration can be challenging for experienced Windows Server administrators. To see the complete stack which Microsoft recommends, scroll down to see the figures throughout this article. This may appear daunting, but do not worry, we’ll explain what all of these building blocks are for.
While there are management tools like System Center Virtual Machine Manager (SCVMM) that can automate the entire infrastructure deployment, most organizations need to configure these components independently. There is limited content online explaining how Scale-Out File Server clusters work and best practices for optimizing them. Let’s get into it!
Scale-Out File Server (SOFS) Capabilities & Limitations
A SOFS cluster should only be used for specific scenarios. The following list of features have been tested and are either supported, supported but not recommended, or not supported with the SOFS.
Supported SOFS scenarios
- File Server
- Deduplication – VDI Only
- DFS Namespace (DFSN) – Folder Target Server Only
- File System
- Continuous Availability
- Transparent Failover
- Other Roles
- IIS Web Server
- Remote Desktop (RDS) – User Profile Disks Only
- SQL Server
- System Center Virtual Machine Manager (VMM)
Supported, but not recommended SOFS scenarios
- File Server
- Folder Redirection
- Home Directories
- Offline Files
- Roaming User Profiles
Unsupported SOFS scenarios
- File Server
- Deduplication – General Purpose
- DFS Namespace (DFSN) – Root Server
- DFS Replication (DFSR)
- Dynamic Access Control (DAC)
- File Server Resource Manager (FSRM)
- File Classification Infrastructure (FCI)
- Network File System (NFS)
- Work Folders
Scale-Out File Server (SOFS) Benefits
Fundamentally, a Scale-Out File Server is a Failover Cluster running the File Server role. It keeps the file share path (\ClusterStorageVolume1) continually available so that it can always be accessed. This is critical because Hyper-V VMs us this file path to access their virtual hard disks (VHDs) via the SMB3 protocol. If this file path is unavailable, then the VMs cannot access their VHD and cannot operate.
Additionally, it also provides the following benefits:
- Deploy Multiple VMs on a Single Disk – SOFS allows multiple VMs running on different nodes to use the same CSV disk to access their VHDs.
- Active / Active File Connections – All cluster nodes will host the SMB namespace so that a VM can connect or quickly reconnect to any active server and have access to its CSV disk.
- Automatic Load Balancing of SOFS Clients – Since multiple VMs may be using the same CSV disk, the cluster will automatically distribute the connections. Clients are able to connect to the disk through any cluster node, so they are sent to the server with fewest file share connections. By distributing the clients across different nodes, the network traffic and its processing overhead are spread out across the hardware which should maximize its performance and reduce bottlenecks.
- Increased Storage Traffic Bandwidth – Using SOFS, the VMs will be spread across multiple nodes. This also means that the disk traffic will be distributed across multiple connections which maximizes the storage traffic throughput.
- Anti-Affinity – If you are hosting similar roles on a cluster, such as two active/active file shares for a SOFS, these should be distributed across different hosts. Using the cluster’s anti-affinity property, these two roles will always try to run on different hosts eliminating a single point of failure.
- CSV Cache – SOFS files which are frequently accessed will be copied locally on each cluster node in a cache. This is helpful if the same type of VM file is read many times, such as in VDI scenarios.
- CSV CHKDSK – CSV disks have been optimized to skipping the offline phase, which means that they will come online faster after a crash. Faster recovery time is important for high-availability since it minimizes downtime.
Scale-Out File Server (SOFS) Cluster Architecture
This section will explain the design fundaments of Scale-Out File Servers for Hyper-V. The SOFS can run on the same cluster as the Hyper-V VMs it is supporting, or on an independent cluster. If you are running everything on a single cluster, the SOFS must be deployed as a File Server role directly on the cluster; it cannot run inside a clustered VM since that VM won’t start without access to the File Server. This would cause a problem since neither the VM nor the virtualized File Server could start-up since they have a dependency on each other.
Hyper-V Storage and Failover Clustering
When Hyper-V was first introduced with Windows Server 2008 Failover Clustering, it had several limitations that have since been addressed. The main challenge was that each VM required its own cluster disk, which made the management of cluster storage complicated. Large clusters could require dozens or hundreds of disks, one for each virtual machine. This was sometimes not even possible due to limitations created by hardware vendors which required a unique drive letter for each disk. Technically you could run multiple VMs on the same cluster disk, each with their own virtual hard disks (VHDs). However, this configuration was not recommended, because if one VM crashed and had to failover to a different node, it would force all the VMs using that disk to shut down and failover to other nodes. This causes unplanned downtime, and as virtualization becomes more popular, a cluster-aware file system was created known as Cluster Shared Volumes (CSV). See Figure 1 (above) for the basic architecture of a cluster using traditional cluster disks.
Cluster Shared Volume (CSV) Disks and Failover Clustering
CSV Disks were introduced in Windows Server 2008 R2 as a distributed file system that is optimized for Hyper-V VMs. The disk must be visible by all cluster nodes, use NTFS or ReFS, and can be created from pools of disks using Storage Spaces.
The CSV disk is designed to host VHDs from multiple VMs from different nodes and run them simultaneously. The VMs can distribute themselves across the cluster nodes, balancing the hardware resources which they are consuming. A cluster can host multiple CSV disks and their VMs can freely move around the cluster, without any planned downtime. The CSV disk traffic communicates over standard networks using SMB, so traffic can be routed across different cluster communication paths for additional resiliency, without being restricted to use a SAN.
A Cluster Shared Volumes disk functions similar to a file share hosting the VHD file since it provides storage and controls access. Virtual machines can access their VHDs like clients would access a file hosted in a file share using a path like \ClusterStorageVolume1. This file path is identical on every cluster node, so as a VM moves between servers it will always be able to access its disk using the same file path. Figure 2 shows a Failover Cluster storing its VHDs on a CSV disk. Note that multiple VHDs for different VMs on different nodes can reside on the same disk which they access through the SMB Share.
Figure 2 – A Failover Cluster with a Cluster Shared Volumes (CSV) Disk
Scale-Out File Server (SOFS) and Failover Clustering
The SMB file share used for the CSV disk must be hosted by a Windows Server File Server. However, the file share should also be highly-available so that it does not become a single point of failure. A clustered File Server can be deployed as a SOFS through Failover Cluster Manager as described at the end of this article.
The SOFS will publish the VHD’s file share location (known as the “CSV Namespace”) on every node. This active/active configuration allows clients to be able to access their storage through multiple pathways. This provides additional resiliency and availability because if one node crashes, the VM will temporarily pause its transactions until it can quickly reconnect to the disk via another active node, but it remains online.
Since the SOFS runs on a standard Windows Server Failover Cluster, it must follow the hardware guidance provided by Microsoft. One of the fundamental rules of failover clustering is that all the hardware and software should be identical. This allows a VM or file server to be able to operate the same way on any cluster node, as all the setting, file paths, and registry settings will be the same. Make sure you run the Cluster Validation tests and follow Altaro’s Cluster Validation troubleshooting guidance if you see any warnings or errors.
The following figure shows a SOFS deployed in the same cluster. The clustered SMB shares create a highly-available CSV namespace allowing VMs to access their disk through multiple file paths.
Figure 3 – A Failover Cluster using Clustered SMB File Shares for CSV Disk Access
Storage Spaces Direct (S2D) with SOFS
Storage Spaces Direct (S2D) lets organizations deploy small failover clusters with no shared storage. S2D will generally use commodity servers with direct-attached storage (DAS) to create clusters that use mirroring to replicate their data between local disks to keep their states consistent. These S2D clusters can be deployed as Hyper-V hosts, storage hosts or in a converged configuration running both roles. The storage uses Scale-Out File Servers to host the shares for the VHD files.
In Figure 4, a SOFS cluster is shown which uses storage spaces direct, rather than shared storage, to host the CSV volumes and VHD files. Each CSV volume and its respective VHDs are mirrored between each of the local storage arrays.
Figure 4 – A Failover Cluster with Storage Spaces Direct (S2D)
Infrastructure Scale-Out File Server (SOFS)
Windows Server 2019 introduced a new Scale-Out File Server role called the Infrastructure File Server. This functions as the traditional SOFS, but it is specifically designed to only support Hyper-V virtual infrastructure with no other types of roles. There can also be only one Infrastructure SOFS per cluster.
The Infrastructure SOFS can be created manually via PowerShell or automatically when it is deployed by Windows Azure Stack or System Center Virtual Machine Manager (SCVMM). This role will automatically create a CSV namespace share using the syntax \InfraSOFSNameVolume1. Additionally, it will enable the Continuous Availability (CA) setting for the SMB shares, also known as SMB Transparent Failover.
Figure 5 – Infrastructure File Server Role on a Windows Server 2019 Failover Cluster
Windows Server 2019 Failover Clustering introduced the management concept of cluster sets. A cluster set is a collection of failover cluster which can be managed as a single logical entity. It allows VMs to seamlessly move between clusters which then lets organizations create a highly-available infrastructure with almost limitless capacity. To simplify the management of the cluster sets, a single namespace can be used to access the cluster. This namespace can run on a SOFS for continual availability and clients will automatically get redirected to the appropriate location within the cluster set.
The following figure shows two Failover Clusters within a cluster set, both of which are using a SOFS. Additionally, a third independent SOFS is deployed to provide highly-available access to the cluster set itself.
Figure 6 – A Scale-Out File Server with Cluster Sets
Guest Clustering with SOFS
Acquiring dedicated physical hardware is not required for the SOFS as this can be fully-virtualized. When a cluster runs inside of VMs instead of physical hardware, this is known as guest clustering. However, you should not run a SOFS within a VM which it is providing the namespace for, as it can get into a situation where it cannot start the VM since it cannot access the VM’s own VHD.
Microsoft Azure with SOFS
Microsoft Azure allows you to deploy virtualized guest clusters in the public cloud. You will need at least 2 storage accounts, each with a matching number and size of disks. It is recommended to use at least DS-series VMs with premium storage. Since this cluster is already running in Azure, it can also use a cloud witness for its quorum disk.
You can even download an Azure VM template which comes as a pre-configure two-node Windows Server 2016 Storage Spaces Direct (S2D) Scale-Out File Server (SOFS) cluster.
System Center Virtual Machine Manager (VMM) with SOFS
Since the Scale-Out File Server has become an important role in virtualized infrastructures, System Center Virtual Machine Manager (VMM) has tightly integrated it into their fabric management capabilities.
VMM makes it fairly easy to deploy SOFS throughout your infrastructure on bare-metal or Hyper-V hosts. You can add existing file servers under management or deploy each SOFS throughout your fabric. For more information visit:
When VMM is used to create a cluster set, an Infrastructure SOFS is automatically created on the Management Server (if it does not already exist). This file share will host the single shared namespace used by the cluster set.
Many of the foundational components of a Scale-Out File Server can be deployed and managed by VMM. This includes the ability to use physical disks to create storage pools that can host SOFS file shares. The SOFS file shares themselves can also be created through VMM. If you are also using Storage Spaces Direct (S2D) then you will need to create a disk witness which will use the SOFS to host the file share. Quality of Service (QoS) can also be adjusted to control network traffic speed to resources or VHDs running on the SOFS shares.
In large virtualized environments, it is recommended to have a dedicated management cluster for System Center VMM. The virtualization management console, database, and services are highly-available so that they can continually monitor the environment. The management cluster can use unified storage namespace runs on a Scale-Out File Server, granting additional resiliency to accessing the storage and its clients.
VMM uses a library to store files which may be deployed multiple times, such as VHDs or image files. The library uses an SMB file share as a common namespace to access those resources, which can be made highly-available using a SOFS. The data in the library itself cannot be stored on a SOFS, but rather on a traditional clustered file server.
Cluster patch management is one of the most tedious tasks which administrators face as it is repetitive and time-consuming. VMM has automated this process through serially updating one node at a time while keeping the other workloads online. SOFS clusters can be automatically patched using VMM.
Rolling upgrades refers to the process where infrastructure servers are gradually updated to the latest version of Windows Server. Most of the infrastructure servers managed by VMM can be included in the rolling upgrade cycle which functions like the Update Management feature. Different nodes in the SOFS cluster are sequentially placed into maintenance mode (so the workloads are drained), updated, patched, tested and reconnected to the cluster. Workloads will gradually migrate to the newly installed nodes while the older nodes wait to be updated. Gradually all the SOFS cluster nodes are updated to the latest version of Windows Server.
Internet Information Services (IIS) Web Server with SOFS
Everything in this article so far has referenced SOFS in the context of being used for Hyper-V VMs. SOFS is gradually being adopted by other infrastructure services to provide high-availability to their critical components which use SMB file shares.
The Internet Information Services (IIS) Web Server is used for hosting websites. To distribute the network traffic, usually, multiple IIS Servers are deployed. If they have any shared configuration information or data, this can be stored in the Scale-Out File Server.
Remote Desktop Services (RDS) with SOFS
The Remote Desktop Services (RDS) role has a popular feature known as user profile disks (UPDs) which allows users to have a dedicated data disk stored on a file server. The file share path can be placed on a SOFS to make access to that share highly-available.
SQL Server with SOFS
Certain SQL Server roles have been able to use SOFS to make their SMB connections highly-available. Starting with SQL Server 2012, the SMB file server storage option is offered for SQL Server, databases (including Master, MSDB, Model and TempDB) and the database engine. The SQL Server itself can be standalone or deployed as a failover cluster installation (FCI).
Deploying a SOFS Cluster & Next Steps
Now that you understand the planning considerations, you are ready to deploy the SOFS. From Failover Cluster Manager, you will launch the High Availability Wizard and select the File Server role. Next, you will select the File Server Type. Traditional clustered file servers will use the File Server for general use. For SOFS, select Scale-Out File Server for application data.
The interface is shown in the following figure and described as, “Use this option to provide storage for server applications or virtual machines that leave files open for extended periods of time. Scale-Out File Server client connections are distributed across nodes in the cluster for better throughput. This option supports the SMB protocol. It does not support the NFS protocol, Data Deduplication, DSF Replication, or File Server Resource Manager.”
Figure 7 – Installing a Scale-Out File Server (SOFS)
Now you should have a fundamental understanding of the use and deployment options for the SOFS. For additional information about deploying a Scale-Out File Server (SOFS), please visit https://docs.microsoft.com/en-us/windows-server/failover-clustering/sofs-overview. If there’s anything you want to ask about SOFS, let me know in the comments below and I’ll get back to you!
Not a DOJO Member yet?
Join thousands of other IT pros and receive a weekly roundup email with the latest content & updates!