How to Architect and Implement Networks for a Hyper-V Cluster

How to Architect and Implement Networks for a Hyper-V Cluster

We recently published a quick tip article recommending the number of networks you should use in a cluster of Hyper-V hosts. I want to expand on that content to make it clear why we’ve changed practice from pre-2012 versions and how we arrive at this guidance. Use the previous post for quick guidance; read this one to learn the supporting concepts. These ideas apply to all versions from 2012 onward.

Why Did We Abandon Practices from 2008 R2?

If you dig on TechNet a bit, you can find an article outlining how to architect networks for a 2008 R2 Hyper-V cluster. While it was perfect for its time, we have new technologies that make its advice obsolete. I have two reasons for bringing it up:

  • Some people still follow those guidelines on new builds — worse, they recommend it to others
  • Even though we no longer follow that implementation practice, we still need to solve the same fundamental problems

We changed practices because we gained new tools to address our cluster networking problems.

What Do Cluster Networks Need to Accomplish for Hyper-V?

Our root problem has never changed: we need to ensure that we always have enough available bandwidth to prevent choking out any of our services or inter-node traffic. In 2008 R2, we could only do that by using multiple physical network adapters and designating traffic types to individual pathways. Note: It was possible to use third-party teaming software to overcome some of that challenge, but that was never supported and introduced other problems.

Starting from our basic problem, we next need to determine how to delineate those various traffic types. That original article did some of that work. We can immediately identify what appears to be four types of traffic:

  • Management (communications with hosts outside the cluster, ex: inbound RDP connections)
  • Standard inter-node cluster communications (ex: heartbeat, cluster resource status updates)
  • Cluster Shared Volume traffic
  • Live Migration

However, it turns out that some clumsy wording caused confusion. Cluster communication traffic and Cluster Shared Volume traffic are exactly the same thing. That reduces our needs to three types of cluster traffic.

What About Virtual Machine Traffic?

You might have noticed that I didn’t say anything about virtual machine traffic above. Same would be true if you were working up a different kind of cluster, such as SQL. I certainly understand the importance of that traffic; in my mind, service traffic prioritizes above all cluster traffic. Understand one thing: service traffic for external clients is not clustered. So, your cluster of Hyper-V nodes might provide high availability services for virtual machine vmabc, but all of vmabc‘s network traffic will only use its owning node’s physical network resources. So, you will not architect any cluster networks to process virtual machine traffic.

As for preventing cluster traffic from squelching virtual machine traffic, we’ll revisit that in an upcoming section.

Fundamental Terminology and Concepts

These discussions often go awry over a misunderstanding of basic concepts.

  • Cluster Name Object: A Microsoft Failover Cluster has its own identity separate from its member nodes known as a Cluster Name Object (CNO). The CNO uses a computer name, appears in Active Directory, has an IP, and registers in DNS. Some clusters, such as SQL, may use multiple CNOs. A CNO must have an IP address on a cluster network.
  • Cluster Network: A Microsoft Failover Cluster scans its nodes and automatically creates “cluster networks” based on the discovered physical and IP topology. Each cluster network constitutes a discrete communications pathway between cluster nodes.
  • Management network: A cluster network that allows inbound traffic meant for the member host nodes and typically used as their default outbound network to communicate with any system outside the cluster (e.g. RDP connections, backup, Windows Update). The management network hosts the cluster’s primary cluster name object. Typically, you would not expose any externally-accessible services via the management network.
  • Access Point (or Cluster Access Point): The IP address that belongs to a CNO.
  • Roles: The name used by Failover Cluster Management for the entities it protects (e.g. a virtual machine, a SQL instance). I generally refer to them as services.
  • Partitioned: A status that the cluster will give to any network on which one or more nodes does not have a presence or cannot be reached.
  • SMB: ALL communications native to failover clustering use Microsoft’s Server Message Block (SMB) protocol. With the introduction of version 3 in Windows Server 2012, that now includes innate multi-channel capabilities (and more!)

Are Microsoft Failover Clusters Active/Active or Active/Passive?

Microsoft Failover Clusters are active/passive. Every node can run services at the same time as the other nodes, but no single service can be hosted by multiple nodes. In this usage, “service” does not mean those items that you see in the Services Control Panel applet. It refers to what the cluster calls “roles” (see above). Only one node will ever host any given role or CNO at any given time.

How Does Microsoft Failover Clustering Identify a Network?

The cluster decides what constitutes a network; your build guides it, but you do not have any direct input. Any time the cluster’s network topology changes, the cluster service re-evaluates.

First, the cluster scans a node for logical network adapters that have IP addresses. That might be a physical network adapter, a team’s logical adapter, or a Hyper-V virtual network adapter assigned to the management operating system. It does not see any virtual NICs assigned to virtual machines.

For each discovered adapter and IP combination on that node, it builds a list of networks from the subnet masks. For instance, if it finds an adapter with an IP of and a subnet mask of, then it creates a network.

The cluster then continues through all of the other nodes, following the same process.

Be aware that every node does not need to have a presence in a given network in order for failover clustering to identify it; however, the cluster will mark such networks as partitioned.

What Happens if a Single Adapter has Multiple IPs?

If you assign multiple IPs to the same adapter, one of two things will happen. Which of the two depends on whether or not the secondary IP shares a subnet with the primary.

When an Adapter Hosts Multiple IPs in Different Networks

The cluster identifies networks by adapter first. Therefore, if an adapter has multiple IPs, the cluster will lump them all into the same network. If another adapter on a different host has an IP in one of the networks but not all of the networks, then the cluster will simply use whichever IPs can communicate.

As an example, see the following network:

The second node has two IPs on the same adapter and the cluster has added it to the existing network. You can use this to re-IP a network with minimal disruption.

A natural question: what happens if you spread IPs for the same subnet across different existing networks? I tested it a bit and the cluster allowed it and did not bring the networks down. However, it always had the functional IP pathway to use, so that doesn’t tell us much. Had I removed the functional pathways, then it would have collapsed the remaining IPs into an all-new network and it would have worked just fine. I recommend keeping an eye on your IP scheme and not allowing things like that in the first place.

When an Adapter Hosts Multiple IPs in the Same Network

The cluster will pick a single IP in the same subnet to represent the host in that network.

What if Different Adapters on the Same Host have an IP in the Same Subnet?

The same outcome occurs as if the IPs were on the same adapter: the cluster picks one to represent the cluster and ignores the rest.

The Management Network

All clusters (Hyper-V, SQL, SOFS, etc.) require a network that we commonly dub Management. That network contains the CNO that represents the cluster as a singular system. The management network has little importance for Hyper-V, but external tools connect to the cluster using that network. By necessity, the cluster nodes use IPs on that network for their own communications.

The management network will also carry cluster-specific traffic. More on that later.

Note: Replica uses a management network.

Cluster Communications Networks (Including Cluster Shared Volume Traffic)

A cluster communications network will carry:

  • Cluster heartbeat information. Each node must hear from every other node within a specific amount of time (1 second by default). If it does not hear from a minimum of nodes to maintain quorum, then it will begin failover procedures. Failover is more complicated than that, but beyond the scope of this article.
  • Cluster configuration changes. If any configuration item changes, whether to the cluster’s own configuration or the configuration or status of a protected service, the node that processes the change will immediately transmit to all of the other nodes so that they can update their own local information store.
  • Cluster Shared Volume traffic. When all is well, this network will only carry metadata information. Basically, when anything changes on a CSV that updates its volume information table, that update needs to be duplicated to all of the other nodes. If the change occurs on the owning node, less data needs to be transmitted, but it will never be perfectly quiet. So, this network can be quite chatty, but will typically use very little bandwidth. However, if one or more nodes lose direct connectivity to the storage that hosts a CSV, all of its I/O will route across a cluster network. Network saturation will then depend on the amount of I/O the disconnected node(s) need(s).

Live Migration Networks

That heading is a bit of misnomer. The cluster does not have its own concept of a Live Migration network per se. Instead, you let the cluster know which networks you will permit to carry Live Migration traffic. You can independently choose whether or not those networks can carry other traffic.

Other Identified Networks

The cluster may identify networks that we don’t want to participate in any kind of cluster communications at all. iSCSI serves as the most common example. We’ll learn how to deal with those.

Architectural Goals

Now we know our traffic types. Next, we need to architect our cluster networks to handle them appropriately. Let’s begin by understanding why you shouldn’t take the easy route of using a singular network. A minimally functional Hyper-V cluster only requires that “management” network. Stopping there leaves you vulnerable to three problems:

  • The cluster will be unable to select another IP network for different communication types. As an example, Live Migration could choke out the normal cluster hearbeat, causing nodes to consider themselves isolated and shut down
  • The cluster and its hosts will be unable to perform efficient traffic balancing, even when you utilize teams
  • IP-based problems in that network (even external to the cluster) could cause a complete cluster failure

Therefore, you want to create at least one other network. In the pre-2012 model we could designate specific adapters to carry specific traffic types. In the 2012 and later model, we simply create at least one more additional network to allow cluster communications but not client access. Some benefits:

  • Clusters of version 2012 or new will automatically employ SMB multichannel. Inter-node traffic (including Cluster Shared Volume data) will balance itself without further configuration work.
  • The cluster can bypass trouble on one IP network by choosing another; you can help by disabling a network in Failover Cluster Manager
  • Better load balancing across alternative physical pathways

The Second Supporting Network… and Beyond

Creating networks beyond the initial two can add further value:

  • If desired, you can specify networks for Live Migration traffic, and even exclude those from normal cluster communications. Note: For modern deployments, doing so typically yields little value
  • If you host your cluster networks on a team, matching the number of cluster networks to physical adapters allows the teaming and multichannel mechanisms the greatest opportunity to fully balance transmissions. Note: You cannot guarantee a perfectly smooth balance

Architecting Hyper-V Cluster Networks

Now we know what we need and have a nebulous idea of how that might be accomplished. Let’s get into some real implementation. Start off by reviewing your implementation choices. You have three options for hosting a cluster network:

  • One physical adapter or team of adapters per cluster network
  • Convergence of one or more cluster networks onto one or more physical teams or adapters
  • Convergence of one or more cluster networks onto one or more physical teams claimed by a Hyper-V virtual switch

A few pointers to help you decide:

  • For modern deployments, avoid using one adapter or team for a cluster network. It makes poor use of available network resources by forcing an unnecessary segregation of traffic.
  • I personally do not recommend bare teams for Hyper-V cluster communications. You would need to exclude such networks from participating in a Hyper-V switch, which would also force an unnecessary segregation of traffic.
  • The most even and simple distribution involves a singular team with a Hyper-V switch that hosts all cluster network adapters and virtual machine adapters. Start there and break away only as necessary.
  • A single 10 gigabit adapter swamps multiple gigabit adapters. If your hosts have both, don’t even bother with the gigabit.

To simplify your architecture, decide early:

  • How many networks you will use. They do not need to have different functions. For example, the old management/cluster/Live Migration/storage breakdown no longer makes sense. One management and three cluster networks for a four-member team does make sense.
  • The IP structure for each network. For networks that will only carry cluster (including intra-cluster Live Migration) communication, the chosen subnet(s) do not need to exist in your current infrastructureAs long as each adapter in a cluster network can reach all of the others at layer 2 (Ethernet), then you can invent any IP network that you want.

I recommend that you start off expecting to use a completely converged design that uses all physical network adapters in a single team. Create Hyper-V network adapters for each unique cluster network. Stop there, and make no changes unless you detect a problem.

Comparing the Old Way to the New Way (Gigabit)

Let’s start with a build that would have been common in 2010 and walk through our options up to something more modern. I will only use gigabit designs in this section; skip ahead for 10 gigabit.

In the beginning, we couldn’t use teaming. So, we used a lot of gigabit adapters:


There would be some variations of this. For instance, I would have added another adapter so that I could use MPIO with two iSCSI networks. Some people used Fiber Channel and would not have iSCSI at all.

Important Note: The “VMs” that you see there means that I have a virtual switch on that adapter and the virtual machines use it. It does not mean that I have created a VM cluster network. There is no such thing as a VM cluster network. The virtual machines are unaware of the cluster and they will not talk to it (if they do, they’ll use the Management access point like every other non-cluster system).

Then, 2012 introduced teaming. We could then do all sorts of fun things with convergence. My very least favorite:

This build takes teams to an excess. Worse, the management, cluster, and Live Migration teams will be idle almost all the time, meaning that this 60% of this host’s networking capacity will be generally unavailable.

Let’s look at something a bit more common. I don’t like this one either, but I’m not revolted by it either:

A lot of people like that design because, so they say, it protects the management adapter from problems that affect the other roles. I cannot figure out how they perform that calculus. Teaming addresses any probable failure scenarios. For anything else, I would want the entire host to fail out of the cluster. In this build, a failure that brought the team down but not the management adapter would cause its hosted VMs to become inaccessible because the node would remain in the cluster. That’s because the management adapter would still carry cluster heartbeat information.

My preferred design follows:

Now we are architected against almost all types of failure. In a “real-world” build, I would still have at least two iSCSI NICs using MPIO.

What is the Optimal Gigabit Adapter Count?

Because we had one adapter per role in 2008 R2, we often continue using the same adapter count in our 2012+ builds. I don’t feel that’s necessary for most builds. I am inclined to use two or three adapters in data teams and two adapters for iSCSI. For anything past that, you’ll need to have collected some metrics to justify the additional bandwidth needs.

10 Gigabit Cluster Network Design

10 gigabit changes all of the equations. In reasonable load conditions, a single 10 gigabit adapter moves data more than 10 times faster than a single gigabit adapter. When using 10 GbE, you need to change your approaches accordingly. First, if you have both 10GbE and gigabit, just ignore the gigabit. It is not worth your time. If you really want to use it, then I would consider using it for iSCSI connections to non-SSD systems. Most installations relying on iSCSI-connected spinning disks cannot sustain even 2 Gbps, so gigabit adapters would suffice.

Logical Adapter Counts for Converged Cluster Networking

I didn’t include the Hyper-V virtual switch in any of the above diagrams, mostly because it would have made the diagrams more confusing. However, I would use a Hyper-V team to host all of the logical adapters necessary. For a non-Hyper-V cluster, I would create a logical team adapter for each role. Remember that on a logical team, you can only have a single logical adapter per VLAN. The Hyper-V virtual switch has no such restrictions. Also remember that you should not use multiple logical team adapters on any team that hosts a Hyper-V virtual switch. Some of the behavior is undefined and your build might not be supported.

I would always use these logical/virtual adapter counts:

  • One management adapter
  • A minimum of one cluster communications adapter up to n-1, where n is the number of physical adapters in the team. You can subtract one because the management adapter acts as a cluster adapter as well

In a gigabit environment, I would add at least one logical adapter for Live Migration. That’s optional because, by default, all cluster-enabled networks will also carry Live Migration traffic.

In a 10 GbE environment, I would not add designated Live Migration networks. It’s just logical overhead at that point.

In a 10 GbE environment, I would probably not set aside physical adapters for storage traffic. At those speeds, the differences in offloading technologies don’t mean that much.

Architecting IP Addresses

Congratulations! You’ve done the hard work! Now you just need to come up with an IP scheme. Remember that the cluster builds networks based on the IPs that it discovers.

Every network needs one IP address for each node. Any network that contains an access point will need an additional IP for the CNO. For Hyper-V clusters, you only need a management access point. The other networks don’t need a CNO.

Only one network really matters: management. Your physical nodes must use that to communicate with the “real” network beyond. Choose a set of IPs available on your “real” network.

For all the rest, the member IPs only need to be able to reach each other over layer 2 connections. If you have an environment with no VLANs, then just make sure that you pick IPs in networks that don’t otherwise exist. For instance, you could use for something, as long as that’s not a “real” range on your network. Any cluster network without a CNO does not need to have a gateway address, so it doesn’t matter that those networks won’t be routable. It’s preferred, in fact.

Implementing Hyper-V Cluster Networks

Once you have your architecture in place, you only have a little work to do. Remember that the cluster will automatically build networks based on the subnets that it discovers. You only need to assign names and set them according to the type of traffic that you want them to carry. You can choose:

  • Allow cluster communication (intra-node heartbeat, configuration updates, and Cluster Shared Volume traffic)
  • Allow client connectivity to cluster resources (includes cluster communication) and cluster communications (you cannot choose client connectivity without cluster connectivity)
  • Prevent participation in cluster communications (often used for iSCSI and sometimes connections to external SMB storage)

As much as I like PowerShell for most things, Failover Cluster Manager makes this all very easy. Access the Networks tree of your cluster:

I’ve already renamed mine in accordance with their intended roles. A new build will have “Cluster Network”, “Cluster Network 1”, etc. Double-click on one to see which IP range(s) it assigned to that network:

Work your way through each network, setting its name and what traffic type you will allow. Your choices:

  • Allow cluster network communication on this network AND Allow clients to connect through this network: use these two options together for the management network. If you’re building a non-Hyper-V cluster that needs access points on non-management networks, use these options for those as well. Important: The adapters in these networks SHOULD register in DNS.
  • Allow cluster network communication on this network ONLY (do not check Allow clients to connect through this network): use for any network that you wish to carry cluster communications (remember that includes CSV traffic). Optionally use for networks that will carry Live Migration traffic (I recommend that). Do not use for iSCSI networks. Important: The adapters in these networks SHOULD NOT register in DNS.
  • Do not allow cluster network communication on this network: Use for storage networks, especially iSCSI. I also use this setting for adapters that will use SMB to connect to a storage server running SMB version 3.02 in order to run my virtual machines. You might want to use it for Live Migration networks if you wish to segregate Live Migration from cluster traffic (I do not do or recommend that).

Once done, you can configure Live Migration traffic. Right-click on the Networks node and click Live Migration Settings:

Check a network’s box to enable it to carry Live Migration traffic. Use the Up and Down buttons to prioritize.

What About Traffic Prioritization?

In 2008 R2, we had some fairly arcane settings for cluster network metrics. You could use those to adjust which networks the cluster would choose as alternatives when a primary network was inaccessible. We don’t use those anymore because SMB multichannel just figures things out. However, be aware that the cluster will deliberately choose Cluster Only networks over Cluster and Client networks for inter-node communications.

What About Hyper-V QoS?

When 2012 first debuted, it brought Hyper-V networking QoS along with it. That was some really hot new tech, and lots of us dove right in and lost a lot of sleep over finding the “best” configuration. And then, most of us realized that our clusters were doing a fantastic job balancing things out all on their own. So, I would recommend that you avoid tinkering with Hyper-V QoS unless you have tried going without and had problems. Before you change QoS, determine what traffic needs to be attuned or boosted before you change anything. Do not simply start flipping switches, because the rest of us already tried that and didn’t get results. If you need to change QoS, start with this TechNet article.

Your thoughts?

Does your preferred network management system differ from mine? Have you decided to give my arrangement a try? How id you get on? Let me know in the comments below, I really enjoy hearing from you guys!

How to Monitor Hyper-V Performance using PNP4Nagios

How to Monitor Hyper-V Performance using PNP4Nagios

At a high level, you need three things to run a trouble-free datacenter (even if your datacenter consists of two mini-tower systems stuffed in a closet): intelligent architecture, monitoring, and trend analysis. Intelligent architecture consists of making good purchase decisions and designing virtual machines that can appropriately handle their load. Monitoring allows you to prevent or respond quickly to emergent situations. Trend analysis helps you to determine how well your reality matches your projections and greatly assists in future architectural decisions. In this article, we’re going to focus on trend analysis. We will set up a data collection and graphing system called “PNP4Nagios” that will allow you to track anything that you can measure. It will hold that data for four years. You can display it in graphs on demand.

What You Get

I know that intro was a little heavy. So, to put it more simply, I’m giving you graphs. Want to know how much CPU that VM has been using? Trying to figure out how quickly your VMs are filling up your Cluster Shared Volumes? Curious about a VM’s memory usage? We have all of that.

Where I find it most useful: Getting rid of vendor excuses. We all have at least one of those vendors that claim that we’re not providing enough CPU or memory or disk or a combination. Now, you can visually determine the reasonableness of their demands.

First, the host and service screens in Nagios will get a new graph icon next to every host and service that track performance data. Also, hovering over one of those graph icons will show a preview of the most recent chart:


Second, clicking any of those icons will open a new tab with the performance data graph for the selected item.


Just as the Nagios pages periodically refresh, the PNP4Nagios page will update itself.

Additionally, you can do the following:

  • Click-dragging a section on a graph will cause it to zoom. If you’ve ever used the zoom feature in Performance Monitor, this is similar.
  • In the Actions bar, you can:
    • Set a custom time/date range to graph
    • Generate a PDF of the visible charts
    • Generate XML summary data
  • Create a “basket” of the graphs that you view most. The basket persists between sessions, so you can build a dashboard of your favorite charts

What You Need

Fortunately, you don’t need much to get going with PNP4Nagios.

Fiscal Cost

Let’s answer the most important question: what does it cost? PNP4Nagios does not require you to purchase anything. Their site does include a Donate button. If your organization finds PNP4Nagios useful, it would be good to throw a few dollars their way.

You’ll need an infrastructure to install PNP4Nagios on, of course. We’ll wrap that up into the later segments.


As its name implies, PNP4Nagios needs Nagios. PNP4Nagios installs alongside Nagios on the same system. We have a couple of walkthroughs for installing Nagios as a Hyper-V guest, divided by distribution.

The installation really doesn’t change much between distributions. The differences lie in how you install the prerequisites and in how you configure Apache. If you know those things about your distribution, then you should be able to use either of the two linked walkthroughs to great effect. If you’d rather see something on your exact distribution, the official Nagios project has stepped up its game on documentation. If we haven’t got instructions for your distribution, maybe they do. There are still things that I do differently, but nothing of critical importance. Also, being a Hyper-V blog, I have included special items just for monitoring Hyper-V, so definitely look at the post-installation steps of my articles.

Also, if you want to use SSL and Active Directory to secure your Nagios installation, we’ve got an article for that.

Disk Space

According to the PNP4Nagios documentation, each item that you monitor will require about 400 kilobytes once it has reached maximum data retention. That assumes that you will leave the default historical interval and retention lengths. More information can be found on the PNP4Nagios site. So, 20 systems with 12 monitors apiece will use about 96 megabytes.

PNP4Nagios itself appears to use around 7 megabytes once installed and extracted.

Downloading PNP4Nagios

PNP4Nagios is distributed on Sourceforge:

As always, I recommend that you download to a standard workstation and then transfer the files to the Nagios server. Since I operate using a Windows PC and run Nagios on a Linux system, WinSCP is my choice of transfer tool.

On my Linux systems, I create a “Download” directory in my home folder and place everything there. The install portion of my instructions will be written using the file’s location as a starting point. So, for me, I begin with cd ~/Downloads.

Installing PNP4Nagios

PNP4Nagios installs quite easily.

PNP4Nagios Prerequisites

Most of the prerequisites for PNP4Nagios automatically exist in most Linux distributions. Most of the remainder will have been satisfied when you installed Nagios. The documentation lists them:

  • Perl, at least version 5. To check your installed Perl version: perl -v
  • RRDTool: This one will not be installed automatically or during a regular Nagios build. Most distributions include it in their mainstream repositories. Install with your distribution’s package manager.
    • CentOS and most other RedHat-based distributions: sudo yum install perl-rrdtool
    • SUSE-based systems: sudo zypper install rrdtool
    • Ubuntu and most other Debian-based distributions: sudo apt install rrdtool librrds-perl
  • PHP, at least version 5. This would have been installed with Nagios. Check with: php -v
  • GD extension for PHP. You might have installed this with Nagios. Easiest way to check is to just install it; it will tell you if you’ve already got it.
    • CentOS and most other RedHat-based distributions: sudo yum install php-gd
    • SUSE-based systems: sudo zypper install php-gd
    • Ubuntu and most other Debian-based distributions: sudo apt install php-gd
  • mod_rewrite extension for Apache. This should have been installed along with Nagios. How you check depends on whether your distribution uses “apache2” or “httpd” as the name of the Apache executable:
    • CentOS and most other RedHat-based distributions: sudo httpd -M | grep rewrite
    • Ubuntu, openSUSE, and most Debian and SUSE distributions: sudo apache2ctl -M | grep rewrite
  • There will be a bit more on this in the troubleshooting section near the end of the article, but if you’re running a more current version of PHP (like 7), then you may not have the XML extension built-in. I only ran into this problem on my Ubuntu installation. I solved it with this: sudo apt install php-xml
  • openSUSE was missing a couple of PHP modules on my system: sudo zypper install php-sockets php-zlib

If you are missing anything that I did not include instructions for, you can visit one of my articles on installing Nagios. If I haven’t got one for your distribution, then you’ll need to search for instructions elsewhere.

Unpacking and Installing PNP4Nagios

As I mentioned in the download section, I place my downloaded files in ~/Downloads. I start from there (with cd ~/Downloads). Start these directions in the folder where you placed your downloaded PNP4Nagios tarball.

  1. Unpack the tarball. I wrote these directions with version 0.6.26. Modify your command as necessary (don’t forget about tab completion!): tar xzf pnp4nagios-0.6.26.tar.gz
  2. Move to the unpacked folder: cd ./pnp4nagios-0.6.26/
  3. Next, you will need to configure the installer. Most of us can just use it as-is. Some of us will need to override some things, such as the Nagios user groups. To determine if that applies to you, open /usr/local/nagios/etc/nagios.cfg. Look for the following section:

    If both nagios_user and nagios_group are “nagios”, then you don’t need to do anything special.
    Regular configuration: ./configure
    Configuration with overrides: ./configure --with-nagios-user=naguser --with-nagios-group=nagcmd .
    Other overrides are available. You can view them all with ./configure --help. One useful override would be to change the location of the emitted perfdata files to an auxiliary volume to control space usage. On my Ubuntu system, I needed to override the location of the Apache conf files: ./configure --with-httpd-conf=/etc/apache2/sites-available
  4. When configure completes, check its output. Verify that everything looks OK. Especially pay attention to “Apache Config File” — note the value because you will access it later. If anything looks off, install any missing prerequisites and/or use the appropriate configure options. You can continue running ./configure until everything suits your needs.
  5. Compile the program: make all. If you have an “oh no!” moment in which you realize that you missed something, you can still re-run ./configure and then compile again.
  6. Because we’re doing a new installation, we will have it install everything: sudo make fullinstall. Be aware that we are now using sudo. That’s because it will need to copy files into locations that your regular account won’t have access to. For an upgrade, you’d likely only want sudo make install. Please check the documentation for additional notes about upgrading. If you didn’t pay attention to the output file locations during configure, they’ll be displayed to you again.
  7. We’re going to be adding a bit of flair to our Nagios links. Enable the pop-up extension with: sudo cp ./contrib/ssi/status-header.ssi /usr/local/nagios/share/ssi/

Installation is complete. We haven’t wired it into Nagios yet, so don’t expect any fireworks.

Configure Apache Security for PNP4Nagios

If you just use the default Apache security for Nagios, then you can skip this whole section. As outlined in my previous article, I use Active Directory authentication. Really, all that you need to do is duplicate your existing security configuration to the new site. Remember how I told you to pay attention to the output of configure, specifically “Apache Config File”? That’s the file to look in.

My “fixed” file looks like this:

Only a single line needed to be changed to match my Nagios virtual directories.

Initial Verification of PNP4Nagios Installation

Before we go any further, let’s ensure that our work to this point has done what we expected.

  1. If you are using a distribution whose Apache enables and disables sites by symlinking into sites-available and you instructed PNP4Nagios to place its files there (ex: Ubuntu), enable the site: sudo a2ensite pnp4nagios.conf
  2. Restart Apache.
    1. CentOS and most other RedHat-based distributions: sudo service httpd restart
    2. Almost everyone else: sudo service apache2 restart
  3. If necessary, address any issues with Apache starting. For instance, Apache on my openSUSE box really did not like the “Order” and “Allow” directives.
  4. Once Apache starts correctly, access http://yournagiosserveraddress/pnp4nagios. For instance, my internal URL is Remember that you copied over your Nagios security configuration, so you will log in using the same credentials that you use on a normal Nagios site.
  5. Fix any problems indicated by the web page. Continue reloading the Apache server and the page as necessary until you get the green light:
  6. Remove the file that validates the installation: sudo rm /usr/local/pnp4nagios/share/install.php

Installation was painless on my CentOS and Ubuntu systems. openSUSE gave me more drama. In particular, it complained about “PHP zlib extension not available” and “PHP socket extension not available”. Very easy to fix: sudo zypper install php-sockets php-zlib. Don’t forget to restart Apache after making these changes.

Initial Configuration of Nagios for PNP4Nagios

At this point, you have PNP4Nagios mostly prepared to do its job. However, if you try to access the URL, you’ll get a message that says that it doesn’t have any data: “perfdata directory “/usr/local/pnp4nagios/var/perfdata/” is empty. Please check your Nagios config.” Nagios needs to start feeding it data.

We start by making several global changes. If you are comparing my walkthrough to the official PNP4Nagios documentation, be aware that I am guiding you to a Bulk + NPCD configuration. I’ll talk about why after the how-to.

Global Nagios Configuration File Changes

In the text editor of your choice, open /usr/local/nagios/etc/nagios.cfg. Find each of the entries that I show in the following block and change them accordingly. Some don’t need anything other than to be uncommented:


Next, open /usr/local/nagios/etc/objects/templates.cfg. At the end, you’ll find some existing commands that mention “perfdata”. After those, add the commands from the following block. If you don’t use the initial Nagios sample files, then just place these commands in any active cfg file that makes sense to you.

Configuring NPCD

The performance collection method that we’re employing involves the Nagios Perfdata C Daemon (NPCD). The default configuration will work perfectly for this walkthrough. If you need something more from it, you can edit /usr/local/pnp4nagios/etc/npcd.cfg. We just want it to run as a daemon:

Enable it to run automatically at startup.

  • Most Red Hat and SUSE based distributions: sudo chkconfig --add npcd
  • Ubuntu and most other Debian-based distributions: sudo update-rc.d npcd defaults

Configuring Hosts in Nagios for PNP4Nagios Graphing

If you made it here, you’ve successfully completed all the hard work! Now you just need to tell Nagios to start collecting performance data so that PNP4Nagios can graph it.

Note: I deviate substantially from the PNP4Nagios official documentation. If you follow those directions, you will quickly and easily set up every single host and every single service to gather data. I didn’t want that because I don’t find such a heavy hand to be particularly useful. You’ll need to do more work to exert finer control. In my opinion, that extra bit of work is worth it. I’ll explain why after the how-to.

If you followed the path of least resistance, every single host in your Nagios environment inherits from a single root source. Open /usr/local/nagios/etc/objects/templates.cfg. Find the define host object with a name of generic-host. Most likely, this is your master host object. Look at its configuration:

Now that you’ve enabled performance data processing in nagios.cfg, this means that Nagios and PNP4Nagios will now start graphing for every single host in your Nagios configuration. Sound good? Well, wait a second. What it really means is that it will graph the output of the check_command for every single host in your Nagios configuration. What is check_command in this case? Probably check_ping or check_icmp. The performance data that those output are the round-trip average and packets lost during pings from the Nagios server to the host in question. Is that really useful information? To track for four years?

I don’t really need that information. Certainly not for every host. So, I modified mine to look this:

What we have:

  • Our existing hosts are untouched. They’ll continue not recording performance data just as they always have.
  • A new, small host definition called “perf-host”. It also does not set up the recording of host performance data. However, its “action_url” setting will cause it to display a link to any graphs that belong to this host. You can use this with hosts that have graphed services but you don’t want the ping statistics tracked. To use it, you would set up/modify hosts and host templates to inherit from this template in addition to whatever host templates they already inherit from. For example: use perf-host,generic-host.
  • A new, small host definition called “perf-host-pingdata”. It works exactly like “perf-host” except that it will capture the ping data as well. The extra bit on the end of the “action_url” will cause it to draw a little preview when you mouseover the link. To use it, you will set up/modify hosts and host templates to inherit from this template in addition to whatever host templates they already inherit from. For example: use perf-host-pingdata,generic-host.

Note: When setting the inheritance:

  • perf-host or perf-host-pingdata must come before any other host templates in a use line.
  • In some instances, including a space after the comma in a use line causes Nagios to panic if the name of the host does not also have a space (ex: you are using tabs instead of spaces on the name generic_host line. Make sure that all of your use directives have no spaces after any commas and you will never have a problem. Ex: use perf-host,generic-host.

Remember to check the configuration and restart Nagios after any changes to the .cfg files:

Couldn’t You Just Set a Single Root Host for Inheritance?

An alternative to the above would be:

In this configuration, perf-host inherits directly from generic-host. You could then have all of your other systems inherit from perf-host instead of generic-host. The problem is that even in a fairly new Nagios installation, a fair number of hosts already inherit from generic-host. You’d need to determine which of those you wanted to edit and carefully consider how inheritance works. If you’re going to all of that trouble, it seems to me that maybe you should just directly edit the generic-host template and be done with it.

Truthfully, I’m only telling you what I do. Do whatever makes sense to you.

Configuring Services in Nagios for PNP4Nagios Graphing

You’ll get much more use of out service graphing than host graphing. Just as with hosts, the default configuration enables performance graphing for all services. Not all services emit performance data, and you may not want data from all services that do produce data. So, let’s fine-tune that configuration as well.

Still in /usr/local/nagios/etc/objects/templates.cfg, find the define service object with a name of generic-service. Disable performance data collection on it and add a stub service that enables performance graphing:

When you want to capture performance data from a service, prepend the new stub service to its use line. Ex: use perf-service,generic-service. The warnings from the host section about the order of items and the lack of a space after the comma in the use line transfer to the service definition.

Remember to check the configuration and restart Nagios after any changes to the .cfg files:

Example Configurations

In case the above doesn’t make sense, I’ll show you what I’m doing.

Most of the check_nt services emit performance data. I’m especially interested in CPU, disk, and memory. The uptime service also emits data, but for some reason, it doesn’t use the defined “counter” mode. Instead, it’s just a graph that steadily increases at each interval until you reboot, then it starts over again at zero. I don’t find that terribly useful, especially since Nagios has its own perfectly capable host uptime graphs. So, I first configure the “windows-server” host to show the performance action_url. Then I configure the desired default Windows services to capture performance data.

My /usr/local/nagios/etc/objects/windows.cfg:

Now, my hosts that inherit from the default Windows template have the extra action icon, but my other hosts do not:

p4n_hostswithiconsThe same story on the services page; services that track performance data have an icon, but the others do not:


Troubleshooting your PNP4Nagios Deployment

Not getting any data? First of all, be patient, especially when you’re just getting started. I have shown you how to set up the bulk mode with NPCD which means that data captures and graphing are delayed. I’ll explain why later, but for now, just be aware that it will take some time before you get anything at all.

If it’s been some time, say, 15 minutes, and you’re still not getting any data. Go to and download the verify_pnp_config file. Transfer it to your Nagios host. I just plop it into my Downloads folder as usual. Navigate to the folder where you placed yours, then run:

That should give you the clues that you need to fix most any problems.

I did have one leftover problem, but only my Ubuntu system where I had updated to PHP 7. The verify script passed everything, but trying to load any PNP4Nagios page gave me this error: “Call to undefined function simplexml_load_file()”. I only needed to install the PHP XML package to fix that: sudo apt install php-xml. I didn’t look up the equivalent on the other distributions.

Plugin Output for Performance Graphing

To determine if a plugin can be graphed, you could just look at its documentation. Otherwise, you’ll need to manually execute it from /usr/local/nagios/libexec. For instance, we’ll just use the first one that shows up on an Ubuntu system, check_apt:


See the pipe character (|) there after the available updates report? Then the jumble of characters after that? That’s all in the standard format for Nagios performance charting. That format is:

  1. A pipe character after the standard Nagios service monitoring result.
  2. A human-readable label. If the label includes any special characters, the entire label should be enclosed in single quotes.
  3. An equal sign (=)
  4. The reported value.
  5. Optionally, a unit of measure.
  6. A semi-colon, optionally followed by a value for the warning level. If the warning level is visible on the produced chart, it will be indicated by a horizontal yellow line.
  7. A semi-colon, optionally followed by a value for the critical level. If the warning level is visible on the produced chart, it will be indicated by a horizontal red line.
  8. A semicolon, optionally followed by the minimum value for the chart’s y-axis. Must be the same unit of measure as the value in #4. If not specified, PNP4Nagios will automatically set the minimum value. If this value would make the current value invisible, PNP4Nagios will set its own minimum.
  9. A semicolon, optionally followed by the maximum value for the chart’s y-axis. Must be the same unit of measure as the value in #4. If not specified, PNP4Nagios will automatically set the maximum value. If this value would make the current value invisible, PNP4Nagios will set its own maximum.

This format is defined by Nagios and PNP4Nagios conforms to it. You can read more about the format at:

My plugins did not originally emit any performance data. I have been working on that and should hopefully have all of that work completed before you read this article.

My PNP4Nagios Configuration Philosophy

I had several decision points when setting up my system. You may choose to diverge as it meets your needs. I’ll use this section to explain why I made the choices that I did.

Why “Bulk with NPCD” Mode?

Initially, I tried to set up PNP4Nagios in “synchronous” mode. That would cause Nagios to instantly call on PNP4Nagios to generate performance data immediately after every check’s results were returned. I chose that initially because it seemed like the path of least resistance.

It didn’t work for me. I’m betting that I did something wrong. But, I didn’t get my problem sorted out. I found a lot more information on the NPCD mode. So, I switched. Then I researched the differences. I feel like I made the correct choice.

You can read up on the available modes yourself:

In synchronous mode, Nagios can’t do anything while PNP4Nagios processes the return information. That’s because it all occurs in the same thread; we call that behavior “blocking”. According to the PNP4Nagios documentation, that method “will work very good up to about 1,000 services in a 5-minute interval”. I assume that’s CPU-driven, but I don’t know. I also don’t know how to quantify or qualify “will work very good”. I also don’t know what sort of environments any of my readers are using.

Bulk mode moves the processing of data from per-return-of-results to gathering results for a while and then processing them all at once. The documentation says that testing showed that 2,000 services were processed in .06 seconds. That’s easier to translate to real-world systems, although I still don’t know the overall conditions that generated that benchmark.

When we add NPCD onto bulk mode, then we don’t block Nagios at all. Nagios still does the bulk gathering, but NPCD processes the data, not Nagios. I chose this method as it means that as long as your Nagios system is multi-core and not already overloaded, you should not encounter any meaningful interruption to your Nagios service by adding PNP4Nagios. It should also work well with most installation sizes. For really big Nagios/PNP4Nagios installations (also not qualified or quantified), you can follow their instructions on configuring “Gearman Mode”.

One drawback to this method: Your “4 Hour” charts will frequently show an empty space at the right of their charts. That’s because they will be drawn in-between collection/processing periods. All of the data will be filled in after a few minutes. You just may not have instant gratification.

Why Not Just Allow Every Host and Service to be Monitored?

The default configuration of PNP4Nagios results in every single host and every single service being enabled for monitoring. From an “ease-of-configuration” standpoint, that’s tempting. Once you’ve set the globals, you literally don’t have to do anything else.

However, we are also integrating directly with Nagios’ generated HTML pages. Whereas PNP4Nagios can determine that a service doesn’t have performance data because Nagios won’t have generated anything, the front-end just has an instruction to add a linked icon to every single service. So, if you just globally enable it, then you’ll get a lot of links that don’t work.

If you’re the only person using your environment, maybe that’s OK. But, if you share the environment, then you’ll start getting calls wanting to you to “fix” all those broken links. It won’t take long before you’re spending more time explaining (and re-explaining) that not all of the links have anything to show.

Why Not Just Change the Inheritance Tree?

If you want, you could have your performance-enabled hosts and services inherit from the generic-host/generic-service templates, then have later templates, hosts, and services inherit from those. If that works for you, then take that approach.

I chose to employ multiple inheritance as a way of overriding the default templates because it seemed like less effort to me. When I went to modify the services, I simply copied “perf-service,” to the clipboard and then selectively pasted it into the use line of every service that I wanted. It worked easier for me than a selective find-replace operation or manual replacement. It also seems to me that it would be easier to revert that decision if I make a mistake somewhere.

I can envision very solid arguments for handling this differently. I won’t argue. I just think that this approach was best for my situation.

6 Hardware Tweaks that will Skyrocket your Hyper-V Performance

6 Hardware Tweaks that will Skyrocket your Hyper-V Performance

Few Hyper-V topics burn up the Internet quite like “performance”. No matter how fast it goes, we always want it to go faster. If you search even a little, you’ll find many articles with long lists of ways to improve Hyper-V’s performance. The less focused articles start with general Windows performance tips and sprinkle some Hyper-V-flavored spice on them. I want to use this article to tighten the focus down on Hyper-V hardware settings only. That means it won’t be as long as some others; I’ll just think of that as wasting less of your time.

1. Upgrade your system

I guess this goes without saying but every performance article I write will always include this point front-and-center. Each piece of hardware has its own maximum speed. Where that speed barrier lies in comparison to other hardware in the same category almost always correlates directly with cost. You cannot tweak a go-cart to outrun a Corvette without spending at least as much money as just buying a Corvette — and that’s without considering the time element. If you bought slow hardware, then you will have a slow Hyper-V environment.

Fortunately, this point has a corollary: don’t panic. Production systems, especially server-class systems, almost never experience demand levels that compare to the stress tests that admins put on new equipment. If typical load levels were that high, it’s doubtful that virtualization would have caught on so quickly. We use virtualization for so many reasons nowadays, we forget that “cost savings through better utilization of under-loaded server equipment” was one of the primary drivers of early virtualization adoption.

2. BIOS Settings for Hyper-V Performance

Don’t neglect your BIOS! It contains some of the most important settings for Hyper-V.

  • C States. Disable C States! Few things impact Hyper-V performance quite as strongly as C States! Names and locations will vary, so look in areas related to Processor/CPU, Performance, and Power Management. If you can’t find anything that specifically says C States, then look for settings that disable/minimize power management. C1E is usually the worst offender for Live Migration problems, although other modes can cause issues.
  • Virtualization support: A number of features have popped up through the years, but most BIOS manufacturers have since consolidated them all into a global “Virtualization Support” switch, or something similar. I don’t believe that current versions of Hyper-V will even run if these settings aren’t enabled. Here are some individual component names, for those special BIOSs that break them out:
    • Virtual Machine Extensions (VMX)
    • AMD-V — AMD CPUs/mainboards. Be aware that Hyper-V can’t (yet?) run nested virtual machines on AMD chips
    • VT-x, or sometimes just VT — Intel CPUs/mainboards. Required for nested virtualization with Hyper-V in Windows 10/Server 2016
  • Data Execution Prevention: DEP means less for performance and more for security. It’s also a requirement. But, we’re talking about your BIOS settings and you’re in your BIOS, so we’ll talk about it. Just make sure that it’s on. If you don’t see it under the DEP name, look for:
    • No Execute (NX) — AMD CPUs/mainboards
    • Execute Disable (XD) — Intel CPUs/mainboards
  • Second Level Address Translation: I’m including this for completion. It’s been many years since any system was built new without SLAT support. If you have one, following every point in this post to the letter still won’t make that system fast. Starting with Windows 8 and Server 2016, you cannot use Hyper-V without SLAT support. Names that you will see SLAT under:
    • Nested Page Tables (NPT)/Rapid Virtualization Indexing (RVI) — AMD CPUs/mainboards
    • Extended Page Tables (EPT) — Intel CPUs/mainboards
  • Disable power management. This goes hand-in-hand with C States. Just turn off power management altogether. Get your energy savings via consolidation. You can also buy lower wattage systems.
  • Use Hyperthreading. I’ve seen a tiny handful of claims that Hyperthreading causes problems on Hyper-V. I’ve heard more convincing stories about space aliens. I’ve personally seen the same number of space aliens as I’ve seen Hyperthreading problems with Hyper-V (that would be zero). If you’ve legitimately encountered a problem that was fixed by disabling Hyperthreading AND you can prove that it wasn’t a bad CPU, that’s great! Please let me know. But remember, you’re still in a minority of a minority of a minority. The rest of us will run Hyperthreading.
  • Disable SCSI BIOSs. Unless you are booting your host from a SAN, kill the BIOSs on your SCSI adapters. It doesn’t do anything good or bad for a running Hyper-V host but slows down physical boot times.
  • Disable BIOS-set VLAN IDs on physical NICs. Some network adapters support VLAN tagging through boot-up interfaces. If you then bind a Hyper-V virtual switch to one of those adapters, you could encounter all sorts of network nastiness.

3. Storage Settings for Hyper-V Performance

I wish the IT world would learn to cope with the fact that rotating hard disks do not move data very quickly. If you just can’t cope with that, buy a gigantic lot of them and make big RAID 10 arrays. Or, you could get a stack of SSDs. Don’t get six or so spinning disks and get sad that they “only” move data at a few hundred megabytes per second. That’s how the tech works.

Performance tips for storage:

  • Learn to live with the fact that storage is slow.
  • Remember that speed tests do not reflect real world load and that file copy does not test anything except permissions.
  • Learn to live with Hyper-V’s I/O scheduler. If you want a computer system to have 100% access to storage bandwidth, start by checking your assumptions. Just because a single file copy doesn’t go as fast as you think it should, does not mean that the system won’t perform its production role adequately. If you’re certain that a system must have total and complete storage speed, then do not virtualize it. The only way that a VM can get that level of speed is by stealing I/O from other guests.
  • Enable read caches
  • Carefully consider the potential risks of write caching. If acceptable, enable write caches. If your internal disks, DAS, SAN, or NAS has a battery backup system that can guarantee clean cache flushes on a power outage, write caching is generally safe. Internal batteries that report their status and/or automatically disable caching are best. UPS-backed systems are sometimes OK, but they are not foolproof.
  • Prefer few arrays with many disks over many arrays with few disks.
  • Unless you’re going to store VMs on a remote system, do not create an array just for Hyper-V. By that, I mean that if you’ve got six internal bays, do not create a RAID-1 for Hyper-V and a RAID-x for the virtual machines. That’s a Microsoft SQL Server 2000 design. This is 2017 and you’re building a Hyper-V server. Use all the bays in one big array.
  • Do not architect your storage to make the hypervisor/management operating system go fast. I can’t believe how many times I read on forums that Hyper-V needs lots of disk speed. After boot-up, it needs almost nothing. The hypervisor remains resident in memory. Unless you’re doing something questionable in the management OS, it won’t even page to disk very often. Architect storage speed in favor of your virtual machines.
  • Set your fibre channel SANs to use very tight WWN masks. Live Migration requires a hand off from one system to another, and the looser the mask, the longer that takes. With 2016 the guests shouldn’t crash, but the hand-off might be noticeable.
  • Keep iSCSI/SMB networks clear of other traffic. I see a lot of recommendations to put each and every iSCSI NIC on a system into its own VLAN and/or layer-3 network. I’m on the fence about that; a network storm in one iSCSI network would probably justify it. However, keeping those networks quiet would go a long way on its own. For clustered systems, multi-channel SMB needs each adapter to be on a unique layer 3 network (according to the docs; from what I can tell, it works even with same-net configurations).
  • If using gigabit, try to physically separate iSCSI/SMB from your virtual switch. Meaning, don’t make that traffic endure the overhead of virtual switch processing, if you can help it.
  • Round robin MPIO might not be the best, although it’s the most recommended. If you have one of the aforementioned network storms, Round Robin will negate some of the benefits of VLAN/layer 3 segregation. I like least queue depth, myself.
  • MPIO and SMB multi-channel are much faster and more efficient than the best teaming.
  • If you must run MPIO or SMB traffic across a team, create multiple virtual or logical NICs. It will give the teaming implementation more opportunities to create balanced streams.
  • Use jumbo frames for iSCSI/SMB connections if everything supports it (host adapters, switches, and back-end storage). You’ll improve the header-to-payload bit ratio by a meaningful amount.
  • Enable RSS on SMB-carrying adapters. If you have RDMA-capable adapters, absolutely enable that.
  • Use dynamically-expanding VHDX, but not dynamically-expanding VHD. I still see people recommending fixed VHDX for operating system VHDXs, which is just absurd. Fixed VHDX is good for high-volume databases, but mostly because they’ll probably expand to use all the space anyway. Dynamic VHDX enjoys higher average write speeds because it completely ignores zero writes. No defined pattern has yet emerged that declares a winner on read rates, but people who say that fixed always wins are making demonstrably false assumptions.
  • Do not use pass-through disks. The performance is sometimes a little bit better, but sometimes it’s worse, and it almost always causes some other problem elsewhere. The trade-off is not worth it. Just add one spindle to your array to make up for any perceived speed deficiencies. If you insist on using pass-through for performance reasons, then I want to see the performance traces of production traffic that prove it.
  • Don’t let fragmentation keep you up at night. Fragmentation is a problem for single-spindle desktops/laptops, “admins” that never should have been promoted above first-line help desk, and salespeople selling defragmentation software. If you’re here to disagree, you better have a URL to performance traces that I can independently verify before you even bother entering a comment. I have plenty of Hyper-V systems of my own on storage ranging from 3-spindle up to >100 spindle, and the first time I even feel compelled to run a defrag (much less get anything out of it) I’ll be happy to issue a mea culpa. For those keeping track, we’re at 6 years and counting.

4. Memory Settings for Hyper-V Performance

There isn’t much that you can do for memory. Buy what you can afford and, for the most part, don’t worry about it.

  • Buy and install your memory chips optimally. Multi-channel memory is somewhat faster than single-channel. Your hardware manufacturer will be able to help you with that.
  • Don’t over-allocate memory to guests. Just because your file server had 16GB before you virtualized it does not mean that it has any use for 16GB.
  • Use Dynamic Memory unless you have a system that expressly forbids it. It’s better to stretch your memory dollar farther than wring your hands about whether or not Dynamic Memory is a good thing. Until directly proven otherwise for a given server, it’s a good thing.
  • Don’t worry so much about NUMA. I’ve read volumes and volumes on it. Even spent a lot of time configuring it on a high-load system. Wrote some about it. Never got any of that time back. I’ve had some interesting conversations with people that really did need to tune NUMA. They constitute… oh, I’d say about .1% of all the conversations that I’ve ever had about Hyper-V. The rest of you should leave NUMA enabled at defaults and walk away.

5. Network Settings for Hyper-V Performance

Networking configuration can make a real difference to Hyper-V performance.

  • Learn to live with the fact that gigabit networking is “slow” and that 10GbE networking often has barriers to reaching 10Gbps for a single test. Most networking demands don’t even bog down gigabit. It’s just not that big of a deal for most people.
  • Learn to live with the fact that a) your four-spindle disk array can’t fill up even one 10GbE pipe, much less the pair that you assigned to iSCSI and that b) it’s not Hyper-V’s fault. I know this doesn’t apply to everyone, but wow, do I see lots of complaints about how Hyper-V can’t magically pull or push bits across a network faster than a disk subsystem can read and/or write them.
  • Disable VMQ on gigabit adapters. I think some manufacturers are finally coming around to the fact that they have a problem. Too late, though. The purpose of VMQ is to redistribute inbound network processing for individual virtual NICs away from CPU 0, core 0 to the other cores in the system. Current-model CPUs are fast enough that they can handle many gigabit adapters.
  • If you are using a Hyper-V virtual switch on a network team and you’ve disabled VMQ on the physical NICs, disable it on the team adapter as well. I’ve been saying that since shortly after 2012 came out and people are finally discovering that I’m right, so, yay? Anyway, do it.
  • Don’t worry so much about vRSS. RSS is like VMQ, only for non-VM traffic. vRSS, then, is the projection of VMQ down into the virtual machine. Basically, with traditional VMQ, the VMs’ inbound traffic is separated across pNICs in the management OS, but then each guest still processes its own data on vCPU 0. vRSS splits traffic processing across vCPUs inside the guest once it gets there. The “drawback” is that distributing processing and then redistributing processing causes more processing. So, the load is nicely distributed, but it’s also higher than it would otherwise be. The upshot: almost no one will care. Set it or don’t set it, it’s probably not going to impact you a lot either way. If you’re new to all of this, then you’ll find an “RSS” setting on the network adapter inside the guest. If that’s on in the guest (off by default) and VMQ is on and functioning in the host, then you have vRSS. woohoo.
  • Don’t blame Hyper-V for your networking ills. I mention this in the context of performance because your time has value. I’m constantly called upon to troubleshoot Hyper-V “networking problems” because someone is sharing MACs or IPs or trying to get traffic from the dark side of the moon over a Cat-3 cable with three broken strands. Hyper-V is also almost always blamed by people that just don’t have a functional understanding of TCP/IP. More wasted time that I’ll never get back.
  • Use one virtual switch. Multiple virtual switches cause processing overhead without providing returns. This is a guideline, not a rule, but you need to be prepared to provide an unflinching, sure-footed defense for every virtual switch in a host after the first.
  • Don’t mix gigabit with 10 gigabit in a team. Teaming will not automatically select 10GbE over the gigabit. 10GbE is so much faster than gigabit that it’s best to just kill gigabit and converge on the 10GbE.
  • 10x gigabit cards do not equal 1x 10GbE card. I’m all for only using 10GbE when you can justify it with usage statistics, but gigabit just cannot compete.

6. Maintenance Best Practices

Don’t neglect your systems once they’re deployed!

  • Take a performance baseline when you first deploy a system and save it.
  • Take and save another performance baseline when your system reaches a normative load level (basically, once you’ve reached its expected number of VMs).
  • Keep drivers reasonably up-to-date. Verify that settings aren’t lost after each update.
  • Monitor hardware health. The Windows Event Log often provides early warning symptoms, if you have nothing else.


Further reading

If you carry out all (or as many as possible) of the above hardware adjustments you will witness a considerable jump in your hyper-v performance. That I can guarantee. However, for those who don’t have the time, patience or prepared to make the necessary investment in some cases, Altaro has developed an e-book just for you. Find out more about it here: Supercharging Hyper-V Performance for the time-strapped admin.


Hyper-V Backup Best Practices: Terminology and Basics

Hyper-V Backup Best Practices: Terminology and Basics


One of my very first jobs performing server support on a regular basis was heavily focused on backup. I witnessed several heart-wrenching tragedies of permanent data loss but, fortunately, played the role of data savior much more frequently. I know that most, if not all, of the calamities could have at least been lessened had the data owners been more educated on the subject of backup. I believe very firmly in the value of a solid backup strategy, which I also believe can only be built on the basis of a solid education in the art. This article’s overarching goal is to give you that education by serving a number of purposes:

  • Explain industry-standard terminology and how to apply it to your situation
  • Address and wipe away 1990s-style approaches to backup
  • Clearly illustrate backup from a Hyper-V perspective

Backup Terminology

Whenever possible, I avoid speaking in jargon and TLAs/FLAs (three-/four-letter acronyms) unless I’m talking to a peer that I’m certain has the experience to understand what I mean. When you start exploring backup solutions, you will have these tossed at you rapid-fire with, at most, brief explanations. If you don’t understand each and every one of the following, stop and read those sections before proceeding. If you’re lucky enough to be working with an honest salesperson, it’s easy for them to forget that their target audience may not be completely following along. If you’re less fortunate, it’s simple for a dishonest salesperson to ridiculously oversell backup products through scare tactics that rely heavily on your incomplete understanding.

  • Backup
  • Full/incremental/differential backup
  • Delta
  • Deduplication
  • Inconsistent/crash-consistent/application-consistent
  • Bare-metal backup/restore (BMB/BMR)
  • Disaster Recovery/Business Continuity
  • Recovery point objective (RPO)
  • Recovery time objective (RTO)
  • Retention
  • Rotation — includes terms such as Grandfather-Father-Son (GFS)

There are a number of other terms that you might encounter, although these are the most important for our discussion. If you encounter a vendor making up their own TLAs/FLAs, take a moment to investigate their meaning in comparison to the above. Most are just marketing tactics — inherently harmless attempts by a business entity trying to turn a coin by promoting its products. Some are more nefarious — attempts to invent a nail for which the company just conveniently happens to provide the only perfectly matching hammer (with an extra “value-added” price, of course).


This heading might seem pointless — doesn’t everyone know what a backup is? In my experience, no. In order to qualify as a backup, you must have a distinct, independent copy of data. A backup cannot have any reliance on the health or well-being of its source data or the media that contains that data. Otherwise, it is not a true backup.

Full/Incremental/Differential Backups

Recent technology changes and their attendant strategies have made this terminology somewhat less popular than in past decades, but it is still important to understand because it is still in widespread use. They are presented in a package because they make the most sense when compared to each other. So, I’ll give you a brief explanation of each and then launch into a discussion.

  • Full Backups: Full backups are the easiest to understand. They are a point-in-time copy of all target data.
  • Differential Backups: A differential backup is a point-in-time copy of all data that is different from the last full backup that is its parent.
  • Incremental Backups: An incremental backup is a point-in-time copy of all data that is different from the backup that is its parent.

The full backup is the safest type because it is the only one of the three that can stand alone in any circumstances. It is a complete copy of whatever data has been selected.

Full Backup

Full Backup

A differential backup is the next safest type. Remember the following:

  • To fully restore the latest data, a differential backup always requires two backups: the latest full backup and the latest differential backup. Intermediary differential backups, if any exist, are not required.
  • It is not necessary to restore from the most recent differential backup if an earlier version of the data is required.
  • Depending on what data is required and the intelligence of the backup application, it may not be necessary to have both backups available to retrieve specific items.

The following is an illustration of what a differential backup looks like:

Differential Backup

Differential Backup

Each differential backup goes all the way back to the latest full backup as its parent. Also, notice that each differential backup is slightly larger than the preceding differential backup. This phenomenon is conventional wisdom on the matter. In theory, each differential backup contains the previous backup’s changes as well as any new changes. In reality, it truly depends on the change pattern. A file backed up on Monday might have been deleted on Tuesday, so that part of the backup certainly won’t be larger. A file that changed on Tuesday might have had half its contents removed on Wednesday, which would make that part of the backup smaller. A differential backup can range anywhere from essentially empty (if nothing changed) to as large as the source data (if everything changed). Realistically, you should expect each differential backup to be slightly larger than the previous.

The following is an illustration of an incremental backup:

Incremental Backup

Incremental Backup

Incremental backups are best thought of as a chain. The above shows a typical daily backup in an environment that uses a weekly full with daily incrementals. If all data is lost and restoring to Wednesday’s backup is necessary, then every single night’s backup from Sunday onward will be necessary. If any one is missing or damaged, then it will likely not be possible to retrieve anything from that backup or any backup afterward. Therefore, incremental backups are the riskiest; they are also the fastest and consume the least amount of space.

Historically, full/incremental/differential backups have been facilitated by an archive bit in Windows. Anytime a file is changed, Windows sets its archive bit. The backup types operate with this behavior:

  • A full backup captures all target files and clears any archive bits that it finds.
  • A differential backup captures only target files that have their archive bit set and it leaves the bit in the state that it found it.
  • An incremental backup captures only files with the archive bit set and clears it afterward.
Archive Bit Example

Archive Bit Example


“Delta” is probably the most overthought word in all of technology. It means “difference”. Do not analyze it beyond that. It just means “difference”. If you have $10 in your pocket and you buy an item for $8, the $8 dollars that you spent is the “delta” between the amount of money that you had before you made the purchase and the amount of money that you have now.

The way that vendors use the term “delta” sometimes changes, but usually not by a great deal. In the earliest incarnation that I am aware of “delta” as applied to backups, it meant intra-file changes. All previous backup types operated with individual files being the smallest level of granularity (not counting specialty backups such as Exchange item-level). Delta backups would analyze the blocks of individual files, making the granularity one step finer.

The following image illustrates the delta concept:

Delta Backup

Delta Backup

A delta backup is essentially an incremental backup, but at the block level instead of the file level. Somebody got the clever idea to use the word “delta”, probably so that it wouldn’t be confused with “differential”, and the world thought it must mean something extra special because it’s Greek.

The major benefit of delta backups is that they use much less space than even incremental backups. The trade-off is in the computing power to calculate deltas. The archive bit can tell it if a file needs to be scanned, but it cannot tell it which blocks to cover. Backup systems that perform delta operations require some other method for change tracking.


Deduplication represents the latest iteration of backup innovation. The term explains itself quite nicely. The backup application searches for identical blocks of data and reduces them to a single copy.

Deduplication involves three major feats:

  • The algorithm that discovers duplicate blocks must operate in a timely fashion
  • The system that tracks the proper location of duplicated blocks must be foolproof
  • The system that tracks the proper location of duplicated blocks must use significantly less storage than simply keeping the original blocks

So, while deduplication is conceptually simple, implementations can depend upon advanced computer science.

Deduplication’s primary benefit is that it can produce backups that are even smaller than delta systems. Part of that will depend on the overall scope of the deduplication engine. If you were to run fifteen new Windows Server 2016 virtual machines through even a rudimentary deduplicator, it would reduce all of them to the size of approximately a single Windows Server 2016 virtual machine — a 93% savings.

There is risk in overeager implementations, however. With all data blocks represented by a single copy, each block becomes a single point of failure. The loss of a single vital block could spell disaster for a backup set. This risk can be mitigated by employing a single pre-existing best practice: always maintain multiple backups.


We already have an article set that explores these terms in some detail. Quickly:

  • Inconsistent backups would be effectively the same thing as performing a manual file copy of a directory tree.
  • Crash-consistent backup captures data as it sits on the storage volume at a given point in time, but cannot touch anything passing through the CPU or waiting in memory. You could lose any in-flight I/O operations.
  • Application-consistent backup coordinates with the operating system and, where possible, individual applications to ensure that in-flight I/Os are flushed to disk so that there are no active file changes at the moment that the backup is taken

I occasionally see people twisting these terms around, although I believe that’s most accidental. The definitions that I used above have been the most common, stretching back into the 90s. Be aware that there are some disagreements, so ensure that you clarify terminology with any salespeople.

Bare-Metal Backup/Restore

A so-called “bare-metal backup” and/or “bare metal restore” involves capturing the entirety of a storage unit including metadata portions such as the boot sector. These backup/restore types essentially mean that you could restore data to a completely empty physical system without needing to install an operating system and/or backup agent on it first.

Disaster Recovery/Business Continuity

The terms “Disaster Recovery” (DR) and “Business Continuity” are often used somewhat interchangeably in marketing literature. “Disaster Recovery” is the older term and more accurately reflects the nature of the involved solutions. “Business Continuity” is a newer, more exciting version that sounds more positive but mostly means the same thing. These two terms encompass not just restoring data, but restoring the organization to its pre-disaster state. “Business Continuity” is used to emphasize the notion that, with proper planning, disasters can have little to no effect on your ability to conduct normal business. Of course, the more “continuous” your solution is, the higher your costs are. That’s not necessarily a bad thing, but it must be understood and expected.

One thing that I really want to make very clear about disaster recovery and/or business continuity is that these terms extend far beyond just backing up and restoring your data. DR plans need to include downtime procedures, phone trees, alternative working sites, and a great deal more. You need to think all the way through a disaster from the moment that one occurs to the moment that everything is back to some semblance of normal.

Recovery Point Objective

The maximum acceptable span of time between the latest backup and a data loss event is called a recovery point objective (RPO). If the words don’t sound very much like their definition, that’s because someone worked really hard to couch a bad situation within a somewhat neutral term. If it helps, the “point” in RPO means “point in time.” Of all data adds and changes, anything that happens between backup events has the highest potential of being lost. Many technologies have some sort of fault tolerance built in; for instance, if your domain controller crashes and it isn’t due to a completely failed storage subsystem, you’re probably not going to need to go to backup. Most other databases can tell a similar story. RPOs mostly address human error and disaster. More common failures should be addressed by technology branches other than backup, such as RAID.

A long RPO means that you are willing to lose a greater span of time. A daily backup gives you a 24-hour RPO. Taking backups every two hours results in a 2-hour RPO. Remember that an RPO represents a maximum. It is highly unlikely that a failure will occur immediately prior to the next backup operation.

Recovery Time Objective

Recovery time objective (RTO) represents the maximum amount of time that you are willing to wait for systems to be restored to a functional state. This term sounds much more like its actual meaning than RPO. You need to take extra care when talking with backup vendors about RTO. They will tend to only talk about RTO in terms of restoring data to a replacement system. If your primary site is your only site and you don’t have a contingency plan for complete building loss, your RTO is however long it takes to replace that building, fill it with replacement systems, and restore data to those systems. Somehow, I suspect that a six-month or longer RTO is unacceptable for most institutions. That is one reason that DR planning must extend beyond taking backups.

In more conventional usage, RTOs will be explained as though there is always a target system ready to receive the restored data. So, if your backup drives are taken offsite to a safety deposit box by the bookkeeper when she comes in at 8 AM, your actual recovery time is essentially however long it takes someone to retrieve the backup drive plus the time needed to perform a restore in your backup application.


Retention is the desired amount of time that a backup should be kept. This deceptively simple description hides some complexity. Consider the following:

  • Legislation mandates a ten-year retention policy on customer data for your industry. A customer was added in 2007. Their address changed in 2009. Must the customer’s data be kept until 2017 or 2019?
  • Corporate policy mandates that all customer information be retained for a minimum of five years. The line-of-business application that you use to record customer information never deletes any information that was placed into it and you have a copy of last night’s data. Do you need to keep the backup copy from five years ago or is having a recent copy of the database that contains five-year-old data sufficient?

Questions such as these can plague you. Historically, monthly and annual backup tapes were simply kept for a specific minimum number of years and then discarded, which more or less answered the question for you. Tape is an expensive solution, however, and many modern small businesses do not use it. Furthermore, laws and policies only dictate that the data be kept; nothing forced anyone to ensure that the backup tapes were readable after any specific amount of time. One lesson that many people learn the hard way is that tapes stored flat can lose data after a few years. We used to joke with customers that their bits were sliding down the side of the tape. I don’t actually understand the governing electromagnetic phenomenon, but I can verify that it does exist.

With disk-based backups, the possibilities are changed somewhat. People typically do not keep stacks of backup disks lying around, and their ability to hold data for long periods of time is not the same as backup tape. The rules are different — some disks will outlive tape, others will not.


Backup rotations deal with the media used to hold backup information. This has historically meant tape, and tape rotations often came in some very grandiose schemes. One of the most widely used rotations is called “Grandfather-Father-Son” (GFS):

  • One full backup is taken monthly. The media it is taken on is kept for an extended period of time, usually one year. One of these is often considered an annual and kept longer. This backup is called the “Grandfather”.
  • Each week thereafter, on the same day, another full backup is taken. This media is usually rotated so that it is re-used once per month. This backup is known as the “Father”.
  • On every day between full backups, an incremental backup is taken. Each day’s media is rotated so that it is re-used on the same day each week. This backup is known as the “Son”.

The purpose of rotation is to have enough backups to provide sufficient possible restore points to guard against a myriad of possible data loss instances without using so much media that you bankrupt yourself and run out of physical storage room. Grandfathers are taken offsite and placed in long-term storage. Fathers are taken offsite, but perhaps not placed in long-term storage so that they are more readily accessible. Sons are often left onsite, at least for a day or two, to facilitate rapid restore operations.

Replacing Old Concepts with New Best Practices

Some backup concepts are simply outdated, especially for the small business. Tape used to be the only feasible mass storage device that could be written and rewritten on a daily basis and were sufficiently portable. I recall being chastised by a vendor representative in 2004 because I was “still” using tape when I “should” be backing up to his expensive SAN. I asked him, “Oh, do employees tend to react well when someone says, ‘The building is on fire! Grab the SAN and get out!’?” He suddenly didn’t want talk to me anymore.

The other somewhat outdated issue is that backups used to take a very, very long time. Tape was not very fast, disks were not very fast, networks were not very fast. Differential and incremental backups were partly the answer to that problem, and partly to the problem that tape capacity was an issue. Today, we have gigantic and relatively speedy portable hard drives, networks that can move at least many hundreds of megabits per second, and external buses like USB 3 that outrun both of those things. We no longer need all weekend and an entire media library to perform a full backup.

One thing that has not changed is the need for backups to exist offsite. You cannot protect against a loss of a building if all of your data stays in that building. Solutions have evolved, though. You can now afford to purchase large amounts of bandwidth and transmit your data offsite to your alternative business location(s) each night. If you haven’t got an alternative business location, there are an uncountable number of vendors that would be happy to store your data each night in exchange for a modest (or not so modest) sum of money. I still counsel periodically taking an offline offsite backup copy, as that is a solid way to protect your organization against malicious attacks (some of which can be by disgruntled staff).

These are the approaches that I would take today that would not have been available to me a few short years ago:

  • Favor full backups whenever possible — incremental, differential, delta, and deduplicated backups are wonderful, but they are incomplete by nature. It must never be forgotten that the strength of backup lies in the fact that it creates duplicates of data. Any backup technique that reduces duplication dilutes the purpose of backup. I won’t argue against anyone saying that there are many perfectly valid reasons for doing so, but such usage must be balanced. Backup systems are larger and faster than ever before; if you can afford the space and time for full copies, get full copies.
  • Steer away from complicated rotation schemes like GFS whenever possible. Untrained staff will not understand them and you cannot rely on the availability of trained staff in a crisis.
  • Encrypt every backup every time.
  • Spend the time to develop truly meaningful retention policies. You can easily throw a tape in a drawer for ten years. You’ll find that more difficult with a portable disk drive. Then again, have you ever tried restoring from a ten-year-old tape?
  • Be open to the idea of using multiple backup solutions simultaneously. If using a combination of applications and media types solves your problem and it’s not too much overhead, go for it.

There are a few best practices that are just as applicable now as ever:

  • Periodically test your backups to ensure that data is recoverable
  • Periodically review what you are backing up and what your rotation and retention policies are to ensure that you are neither shorting yourself on vital data nor wasting backup media space on dead information
  • Backup media must be treated as vitally sensitive mission-critical information and guarded against theft, espionage, and damage
    • Magnetic media must be kept away from electromagnetic fields
    • Tapes must be stored upright on their edges
    • Optical media must be kept in dark storage
    • All media must be kept in a cool environment with a constant temperature and low humidity
  • Never rely on a single backup copy. Media can fail, get lost, or be stolen. Backup jobs don’t always complete.

Hyper-V-Specific Backup Best Practices

I want to dive into the nuances of backup and Hyper-V more thoroughly in later articles, but I won’t leave you here without at least bringing them up.

  • Virtual-machine-level backups are a good thing. That might seem a bit self-serving since I’m writing for Altaro and they have a virtual-machine-level backup application, but I fit well here because of shared philosophy. A virtual-machine-level backup gives you the following:
    • No agent installed inside the guest operating system
    • Backups are automatically coordinated for all guests, meaning that you don’t need to set up some complicated staggered schedule to prevent overlaps
    • No need to reinstall guest operating systems separately from restoring their data
  • Hyper-V versions prior to 2016 do not have a native changed block tracking mechanism, so virtual-machine-level backup applications that perform delta and/or deduplication operations must perform a substantial amount of processing. Keep that in mind as you are developing your rotations and scheduling.
  • Hyper-V will coordinate between backup applications that run at the virtual-machine-level (like Altaro VM) and the VSS writer(s) within guest Windows operating systems and the integration components within Linux guest operating systems. This enables application-consistent backups without doing anything special other than ensuring that the integration components/services are up-to-date and activated.
  • For physical installations, no application can perform a bare metal restore operation any more quickly than you can perform a fresh Windows Server/Hyper-V Server installation from media (or better yet, a WDS system). Such a physical server should only have very basic configuration and only backup/management software installed. Therefore, backing up the management operating system is typically a completely pointless endeavor. If you feel otherwise, I want to know what you installed in the management operating system that would make a bare-metal restore worth your time, as I’m betting that such an application or configuration should not be in the management operating system at all.
  • Use your backup application’s ability to restore a virtual machine next to its original so that you can test data integrity

Follow-Up Articles

With the foundational material supplied in this article, I intend to work on further posts that expand on these thoughts in greater detail. If you have any questions or concerns about backing up Hyper-V, let me know. Anything that I can’t answer quickly in comments might find its way into an article.

Troubleshooting Adapter Teaming in Hyper-V Server

Troubleshooting Adapter Teaming in Hyper-V Server

Native adapter teaming is a hot topic in the world of Hyper-V. It’s certainly nice for Windows Server as well, but the ability to spread out traffic for multiple virtual machines is practically a necessity. Unfortunately, there is a still a lot of misunderstanding out there about the technology and how to get it working correctly. (more…)

How to Install and Configure TCP/IP Routing in a Hyper-V Guest

How to Install and Configure TCP/IP Routing in a Hyper-V Guest

One of the great things about the Hyper-V virtual switch is that it can be used to very effectively isolate your virtual machines from the physical network. This grants them a layer of protection that’s nearly unparalleled. Like any security measure, this can be a double-edged sword. Oftentimes, these isolated guests still need some measure of access to the outside world, or they at least need to have access to a system that can perform such access on their behalf.

There are a few ways to facilitate this sort of connection. The biggest buzzword-friendly solution today is network virtualization, but that currently requires additional software (usually System Center VMM) and a not-unsubstantial degree of additional know-how. For most small, and even many medium-sized organizations, this is an unwelcome burden not only in terms of financial expense, but also in training/education and maintenance.

A simpler solution that’s more suited to smaller and less complicated networks is software routing. Because we’re talking about isolation using a Hyper-V internal or private switch, such software would need to be inside a virtual machine on the same Hyper-V host as the isolated guests.

If you’re not clear on how this would work, you can refer to one of our articles on the Hyper-V virtual switch. This is the same image that I used in that article:

External and Private Switch

External and Private Switch

The routing VM would be represented in that image as the Dual-Presence VM.

Choosing a Software Router

There are major commercial software routing solutions available, such as Vyatta. There are free routing software packages available, such as VyOS, the community fork for Vyatta. These are all Linux-based, and as such, are not within my scope of expertise. However, it should be possible to deploy them as Hyper-V guests.

What we’re going to look at in this article is Microsoft’s Routing and Remote Access Service. It’s an included component of Windows Server, and it’s highly recommended that an RRAS system perform no other functions. If you have a spare virtualization right, then this environment is free for you. Otherwise, you’ll need to purchase an additional Windows Server license.

Do not enable RRAS in Hyper-V’s management operating system! Doing so does not absolve you of the requirement to provide a virtualization right and networking performance of both the management operating system and RAS will be degraded. It’s also an unsupported configuration. To make matters worse, the performance will be unpredictable.

Step 1: Build the Virtual Machine

Sizing a software router depends heavily upon the quantity of traffic that it’s going to be dealing with. However, systems administrators that don’t work with physical routers very often are usually surprised at just how few hardware resources are in even some of the major physical devices. My starting recommendation for the routing virtual machine is as follows:

  • vCPUs: 2
  • Dynamic Memory: 512MB Startup
  • Dynamic Memory: 256MB Minimum
  • Dynamic Memory: 2GB Maximum
  • Disk: 1 VHDX, Dynamically Expanding, 80GB (expect < 20 GB use, fairly stagnant growth)
  • vNIC: 1 to connect to the external switch
  • vNIC: 1 per private subnet per private/internal switch (wait on this during initial build)
  • OS: latest Windows Server that you are licensed for
  • This VM won’t necessarily need to be a member of your domain. If it’s going to be sitting in the perimeter, then you might consider leaving it out. Another option is to leave it in the domain but with higher-than-usual security settings. A great thing to do for machines such as this is disable cached credentials, whether it’s left in the domain or not.

Once the virtual machine is deployed, monitor it for CPU contention, high memory usage, and long network queue lengths. Adjust provisioning upward as necessary.

During VM setup, I would only create a single virtual adapter to start, and place it on the external virtual switch. Then use PowerShell to rename that adapter:

Then, boot up the virtual machine and install Windows Server (I’ll be using 2012 R2 for this article). Rename the adapter inside Windows Server as well to reflect that it connects to the outside world. These steps will help you avoid a lot of issues later. If you’ve already created your adapters and would like a way to identify them, you can disconnect them from their switches and watch which show as being unplugged in the virtual machine.

Before proceeding, ensure you have added all virtual adapters necessary, one for each switch. The virtual adapter connected to the external virtual switch is the only one that requires complete IP information. It should have a default gateway pointing to the next external hop and it should know about DNS servers. It can use any IP on that network that you have available. It only needs a memorable IP if other systems will need to be able to send traffic directly to hosts on the isolated network.

All other adapters require only an IP and a subnet mask. The IP that you choose will be used as the default gateway for all other systems on that switch, so you’ll probably want it to be something easy to remember. If you’re using the router’s operating system as a domain member or in some other situation in which it will be able to register its IP addresses in DNS, make sure that you disable DNS registration for all adapters other than the one that’s in the primary network.

For reference, my test configuration is as follows:

  • System name: SVRRAS
  • “External” adapter: IP:, Mask:, GW
  • “Isolated” adapter: IP:, Mask:, GW: None

In order for this to work, you’ll need to make some adjustments to the Windows Firewall. If you want, you can just turn it off entirely. However, it is a moderately effective barrier and better than nothing. We’ll follow up on the firewall after the RRAS configuration explanation.

Step 2: Installing and Configuring RRAS

Once Windows is installed and you have all the necessary network adapters connected to their respective virtual switches, the next thing to do is install RRAS. This is done through the Add Roles and Features wizard, just like any other Windows Server role. Choose Remote Access on the Roles screen:

RRAS Server Role

RRAS Server Role

On the Role Services screen, check the box for Routing. You’ll be prompted to add several other items in order to fully enable routing, which will include DirectAcccess and VPN (RAS) on the same screen:

RRAS Role Services

RRAS Role Services

After this, just proceed through, accepting the suggested settings.

Once the installation is complete, a reboot is unnecessary. You just need to configure the router. Open up the Start menu, and find the Routing and Remote Access snap-in. Open that up, and you’ll find something similar to the following screenshot. Right-click on your server’s name and click Configure and Enable Routing and Remote Access.

Configure RRAS

Configure RRAS

On the next screen, you have one of two choices for our purposes. The first is Network Address Translation (NAT). This mode works like a home Internet router does. All the virtual machines behind the router will appear to have the same IP address. This is a great solution for isolation purposes. It’s also useful when you haven’t got a hardware router available and want to connect your virtual machines into another network, such as the Internet. Your second choice is to select Custom configuration. This mode allows you to build a standard router, in which the virtual machines on the private network can be reached from other virtual machines by using their IP addresses. I won’t be illustrating this method as it doesn’t do a great deal for isolation.

On the next screen, you’ll tell the routing system which of the adapters is connected to the external network:

RRAS Adapter Selection

RRAS Adapter Selection

On the next screen, you’ll choose how addressing is handled on the private networks. The first option will set up your RRAS system to perform DHCP services for them and forward their DNS requests out to the DNS server(s) you specified for the external network. If you choose the second option, you can build your own addressing services as desired. I’m just going to work with the first option for the sake of ease (and a quicker article):

RRAS Name Services

RRAS Name Services

Once this wizard completes, you’ll be returned to the main RRAS screen where the server should now show online (with a small upward pointing green arrow). That’s really all it takes to configure RRAS inside a Hyper-V guest.

Step 3: Configure the Virtual Machines

This is probably the best part: there is no necessary configuration at all. Just attach the virtual machines to the private switch and leave them in the default networking configuration. Here’s a screenshot from a Windows 7 virtual machine I connected to my test isolated switch:

RRAS in a VM: Demo

RRAS in a VM: Demo

You’ll see that it’s using the IP information of SVRRAS’s adapter on the isolated switch for DHCP, gateway, and DNS information. As long as the firewall is down or properly set on the RRAS system, it will be able to communicate as expected.

[optin-monster-shortcode id=”gaua0qrhzreh818c”]

Windows Firewall on the RRAS System

The easiest thing to do with the Windows Firewall is to just turn it off. If you’re using it at the perimeter, that’s a pretty bad idea. While it may not be the best firewall available, it does pretty well for a software firewall and seems to have the fewest negative side effects in that genre.

For many, a build of this kind exposes them to a completely new configuration style, one that many hardware firewall administrators have dealt with for a very long time. In a traditional software firewall build running on a single system, you usually only have to worry about inbound and outbound traffic on a single adapter. Now, you have to worry about it on at least two adapters. This is a visualization:

RRAS and Windows Firewall

RRAS and Windows Firewall

When configuring Windows Firewall, you have to be mindful of how this will work. In order to allow guests on the private network to access web sites on the external network, you’ll need to open port 80 INBOUND on the adapter connected to the private network. This kind of management can get tedious, but it’s very effective if you want to further isolate those guests.

What you might want to do instead is leave the firewall on that connects to the external network but disable it for the private network. Then, your private guests will have receive the default levels of protection from the router VM’s firewall. You could then turn off the firewalls in the guests, if you like. In order to do this, open up Windows Firewall with Advanced Security on the router VM (or target the MMC from a remote computer). In the center pane, click Windows Firewall Properties or, in the left pane, right-click Windows Firewall with Advanced Security and click Properties. Next to Protected network connections, click Customize. In the dialog that appears, uncheck the box that represents the network adapter on the private switch. Click OK all the way out. I’ve renamed my adapters to “Isolated” and “External for the following screenshot:

RRAS Firewall Adapter Selection

RRAS Firewall Adapter Selection

DMZs and Port Forwarding in RRAS

You might want to selectively expose some of your guests on the private network to inbound traffic from the external network. Personally, I don’t see much value in using the DMZ mode. It’s functionally equivalent and less computationally expensive to just connect a “DMZ” virtual machine directly to the external switch and not have it go through a routing VM at all. Port forwarding (sometimes called “pincushioning”), on the other hand, does have its uses.

Open the Routing and Remote Access console. Expand your server, then expand the IP version (IPv4 or IPv6) that you want to configure forwarding for. Click NAT. In the center pane, locate the interface that is connected to the external switch. Right-click it and click Properties.

If you wish to configure one or more “DMZ” virtual machines, the Address Pool tab is where this is done. First, you create a pool that contains the IP address(es) that you want to map to the private VM(s). For instance, if you want to make the IP address of a private VM, then you would enter that IP address with a subnet mask of Once you have your range(s) configured, you use the Reservations button to map the external address(es) to the private VMs address(es).

For port forwarding, go to the Services and Ports tab. If you check one of the existing boxes, it will present you with a prefilled dialog where all you need to enter is the IP address of the private VM for which you wish to forward traffic:

RRAS Port Forwarding

RRAS Port Forwarding

In this screenshot, the On this address pool entry field is unavailable. That’s because I did not add any other IP addresses on the Address Pool tab. If you do that, then the external adapter will have multiple IPs assigned to it. If you don’t use those additional IPs for DMZ purposes, then you can use them here. The reason for doing so is so that multiple private virtual machines can have port forwarding for the same service. One use case for doing so is if you have two or more virtual machines on your private switch serving web pages and you want to make them all visible to computers on the external network.


This is a topic that comes up often. If you’ve read our networking series, then hopefully you already know how this works. VLANs are a layer 2 concept while routing is layer 3. Therefore, VLAN assignments for virtual machines on your private virtual switch will have no relation to any other VLANs anywhere else. Using VLANs on the private switch is probably not useful unless you need to isolate them from each other. If that’s the case, then you’ll need to make a distinct virtual adapter inside the routing virtual machine for each of those VLANs, or it won’t be able to communicate with the other guests on that VLAN.

Happy Routing

That’s really all there is to getting routing working through a Hyper-V virtual machine. This is a great solution for isolating virtual machines and for test labs.

Do remember that it’s not as efficient or as robust as using a true hardware router.


The Hyper-V Hub 2014 Year in Review

The Hyper-V Hub 2014 Year in Review

Like any creative work, a blog post is never really done; it’s just abandoned. Unlike many other mediums, blogs do allow us to easily refresh those older articles, but we so rarely ever do it. To close out this year, a few of us on the editorial team got together and selected a few highlights from the past year.

Our 14 selections from 2014 (in no particular order):

Hyper-V Guest Licensing

This was our first licensing article directly related to guest licensing. We followed it up with a downloadable eBook that was expanded to include a number of examples, and Andy Syrewicze and Thomas Maurer gave a fantastic webinar on the topic. We’ve received quite a few questions and some great feedback. Keep an eye out for a follow-up post that takes on some of those questions and incorporates some of the suggestions. If you’ve got questions or suggestions of your own, feel free to send them in (by leaving a comment)!

Demystifying Virtual Domain Controllers (series)

This two part series started with a look at all the myths surrounding the virtualization of domain controllers and exposed the truths. In part 2, we explained how to successfully virtualize your domain controllers without headaches. I felt that both of these articles explained the situation and remedies very well, except that it seems that some people have been unable to make time synchronization work correctly when the Hyper-V Time Synchronization service is partially disabled for virtualized domain controllers. Microsoft has changed their published policy to include a recommendation that time sync be fully disabled for DC guests, although personally, I think the jury is still out. Completely disabling it is certainly an expedient solution to a frustrating situation, but that doesn’t automatically make it the best option. I’ve been able to make partial disablement work every time I’ve tried, and it’s the only way to guarantee that a DC that was saved (even accidentally) will be able to properly recover.

7 Reasons not to Make Hyper-V a Domain Controller

I wrote this post as I saw a lot of people really trying hard to justify using their Hyper-V hosts as domain controllers. It’s a really bad idea, so I collected all the reasoning I could think of not to do it. In retrospect, I probably should have ordered them a little better to address the topmost concerns first. Those are:

  1. Licensing costs. People want to save on licensing by just installing the domain controller in the Hyper-V host so that they don’t have to pay for a Windows license just to run domain services. As explained in #4, this doesn’t work.
  2. The chicken-and-egg myth. People believe that if they join the Hyper-V host to the domain and the only DC is a guest, then the Hyper-V host won’t boot or will have other problems. That wasn’t even mentioned in this article, at least not outright, although it was a major point of the articles that it links to.
  3. The myth that Hyper-V hosts should be left in workgroup mode to increase security. This was included as point #1, which isn’t a bad placement for it. People get too close to a situation and sometimes make irrational decisions. They don’t want to put their systems at unnecessary risk and a compromised Hyper-V host can potentially put a lot of guests at risk all at once, so they do take steps to distance the host from the guests. While there are solutions, workgroup mode isn’t one of them. I mean, just try to say this out loud without laughing: “Workgroup mode is more secure than domain mode”. Or, write this on your resume: “I once put a computer in workgroup mode to increase security over domain mode”. I’ll bet that won’t get you many callbacks.

Common Hyper-V Deployment Mistakes

“There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves.” – Will Rogers

One really good way to learn is from your own mistakes. Most people that do something really wrong are much less likely to do that same thing twice. But, as far as I’m concerned, it’s a whole lot better if you can learn from other people’s mistakes. This article compiles a list of the ones I’ve seen most frequently in the hopes that at least a few people can avoid them.

Storage Formatting and File Systems (part 4 of a series)

We had an unusually long (7 parts!) series on storage and Hyper-V. Even though this particular piece looked at parts of storage that some people might find very basic, we often forget that there are always newcomers, and sometimes even experienced administrators missed some of the basics during their career. A highlight of this article is that it puts to bed an old recommendation about spending a lot of time on sector alignment. It also takes a generalized look at architecting disk layout for Hyper-V.

Connecting to Storage in Hyper-V (part 6 of a series)

Everyone likes a good how-to. Even if you can figure something out on your own, it makes little sense to do so if someone else has already done the work. For many administrators, moving to a virtualization platform is the first time they’ll connect to external storage, which is why I took the time to lay out exactly how it’s done. I noticed that we had intended to publish a Powershell-equivalent article to this one, but that never came to fruition. We’ll rectify that in 2015.

Creating a Virtual Machine from a Hyper-V Snapshot (series)

Another outstanding submission from Jeff Hicks, this article shows a clever way to create a snapshot (now checkpoint) of a virtual machine and turn it into its own virtual machine. This gives you some cloning powers without needing to incur the expense of something like Virtual Machine Manager. Be sure to keep reading into part 2, where he shows you how to do it all much more efficiently with PowerShell.

10 Awesome Hyper-V Cmdlets

Jeff Hicks compiled a list of his 10 favorite Hyper-V cmdlets and took us all for a quick tour. If you’re thinking about integrating PowerShell with Hyper-V into your toolkit as a New Year’s resolution, this is a great place to start with one of the topmost experts.

A PowerShell-Based Hyper-V Health Report

This fantastic article was written by Jeff Hicks, and is one of my personal favorites. This is a wonderful little script that quickly runs against the target hosts that you specify and returns a snappy-looking HTML health report. Even better, Jeff shows you how to set it up to run automatically against all your hosts, so you can easily have a daily report.

Best Practices to Maximize Hyper-V Performance

In this article, Nirmal discusses the best approaches to ensure your Hyper-V guests are operating at their peak performance. Since Nirmal focused on peak performance and we had more than a few comments about that, we’re planning a follow-up article that contains more generalized best practices.

Resizing Online VM Disks

Another great how-to guide from Nirmal shows you how to use the new feature of 2012 R2 that allows you to resize a Hyper-V VM’s virtual SCSI disk without shutting the guest down.

Hyper-V Virtual CPUs Explained

Out of all my articles, this is one of my personal favorites. I feel really bad for people who spend a lot of time wringing their hands over how many cores should be in their Hyper-V hosts, especially when they wind up spending too much money on CPUs that are just going to sit idle. One thing I probably should have mentioned in that article is that one of the first things many of us do to address certain Hyper-V performance issues is disable many of the power-saving features of our processors (C-states, especially). If you’ve got a lot of cores sitting idle, that’s a lot of wasted energy. And, if Aidan Finn’s prediction about per-core licensing comes true in 2015 (I really hope it doesn’t), then it’s going to translate into lots of wasted licensing dollars as well.

The Hyper-V and PowerShell Series

PowerShell still hasn’t become of serious importance to far too many administrators, and that’s really a shame. Since I clearly remember the days before I learned to embrace it and all the reasons I invoked to avoid it, I can certainly understand why. It’s just one of those things that can be tough to see the value in until you have your own personal “A ha!” moment. This series is meant to serve a dual role: one is to provide useful scripts to the Hyper-V community. The other is to simply to teach by example. If you’ve just come for the scripts, great! If you happen to learn something while you’re here, that’s even better!

Finding Orphaned Hyper-V VM Files

I’ve written a number of PowerShell scripts for you now, but far and away this one is both the most complicated and my favorite. For a very long time prior, I had been thinking, “Someone should do that.” I was eventually forced to accept that I qualify as “someone”. A few days into this, I realized just why no one had done it. I learned a lot, though, and there are certainly a great many uncommon techniques included in this script. So, it has its surface value of being the only tool I know of that can track down orphaned Hyper-V files, but it would also be something that intermediate PowerShell scripters can tear apart for tidbits to add to their own scripting toolkit.

The Future

Time keeps on slippin’, slippin’, slippin’, into the future. – The Steve Miller Band, “Fly Like an Eagle”

As we close out 2014 by examining our successes, we acknowledge that all of it has been made possible by you, our readers. If you want to have a say in how in how we approach 2015, now is your chance! Let us know in the comments what you’d like to see more of, what you’d like to see less of, and what big things we’ve missed entirely.

On behalf of the Altaro blogging crew, we’ve had a wonderful time writing for you and interacting with you in the comments section. We wish you and yours a wonderful holiday season and look forward to another wonderful year!

Hyper-V Load-Balancing Algorithms

Hyper-V Load-Balancing Algorithms

We’ve had a long run of articles in this series that mostly looked at general networking technologies. Now we’re going to look at a technology that gets us closer to Hyper-V. Load-balancing algorithms are a feature of the network team, which can be used with any Windows Server installation, but is especially useful for balancing the traffic of several operating systems sharing a single network team.

We’ve already had a couple of articles on the subject of teaming in the Server 2012+ products. The first, not part of this series, talked about MPIO, but outlined the general mechanics of teaming. The second was part of this series and took a deeper look at teaming and the aggregation options available.

The selected load-balancing method is how the team decides to utilize the team members for sending traffic. Before we go through these, it’s important to reinforce that this is load-balancing. There isn’t a way to just aggregate all the team members into a single unified pipe.

I will periodically remind you of this point, but keep in mind that the load-balancing algorithms apply only to outbound traffic. The connected physical switch decides how to send traffic to the Windows Server team. Some of the algorithms have a way to exert some influence over the options available to the physical switch, but the Windows Server team is only responsible for balancing what it sends out to the switch.

Hyper-V Port Load-Balancing Algorithm

This method is commonly chosen and recommended for all Hyper-V installation based solely on its name. This is a poor reason. The name wasn’t picked because it’s the automatic best choice for Hyper-V, but because of how it operates.

The operation is based on the virtual network adapters. In versions 2012 and prior, it was by MAC address. In 2012 R2, and presumably onward, it will be based on the actual virtual switch port. Distribution depends on the teaming mode of the virtual switch.

Switch-independent: Each virtual adapter is assigned to a specific physical member of the team. It sends and receives only on that member. Distribution of the adapters is just round-robin. The impact on VMQ is that each adapter gets a single queue on the physical adapter it is assigned to, assuming there are enough left.

Everything else: Virtual adapters are still assigned to a specific physical adapter, but this will only apply to outbound traffic. The MAC addresses of all these adapters appear on the combined link on the physical switch side, so it will decide how to send traffic to the virtual switch. Since there’s no way for the Hyper-V switch to know where inbound traffic for any given virtual adapter will be, it must register a VMQ for each virtual adapter on each physical adapter. This can quickly lead to queue depletion.

Recommendations for Hyper-V Port Distribution Mode

If you somehow landed here because you’re interested in teaming but you’re not interested in Hyper-V, then this is the worst possible distribution mode you can pick. It only distributes virtual adapters. The team adapter will be permanently stuck on the primary physical adapter for sending operations. The physical switch can still distribute traffic if the team is in a switch-dependent mode.

By the same token, you don’t want to use this mode if you’re teaming from within a virtual machine. It will be pointless.

Something else to keep in mind is that outbound traffic from a VM is always limited to a single physical adapter. For 10 Gb connections, that’s probably not an issue. For 1 Gb, think about your workloads.

For 2012 (not R2), this is a really good distribution method for inbound traffic if you are using the switch-independent mode. This is the only one of the load-balancing modes that doesn’t force all inbound traffic to the primary adapter when the team is switch-independent. If you’re using any of the switch-dependent modes, then the best determinant is usually the ratio of virtual adapters to physical adapters. The higher that number is, the better result you’re likely to get from the Hyper-V port mode. However, before just taking that and running off, I suggest that you continue reading about the hash modes and think about how it relates to the loads you use in your organization.

For 2012 R2 and later, the official word is that the new Dynamic mode universally supersedes all applications of Hyper-V port. I have a tendency to agree, and you’d be hard-pressed to find a situation where it would be inappropriate. That said, I recommend that you continue reading so you get all the information needed to compare the reasons for the recommendations against your own system and expectations.

Hash Load-Balancing Algorithms

The umbrella term for the various hash balancing methods is “address hash”. This covers three different possible hashing modes in an order of preference. Of these, the best selection is the “Transport Ports”. The term “4-tuple” is often seen with this mode. All that means is that when deciding how to balance outbound traffic, four criteria are considered. These are: source IP address, source port, destination IP address, destination port.

Each time traffic is presented to the team for outbound transmission, it needs to decide which of the team members it will use. At a very high level, this is just a round-robin distribution. But, it’s inefficient to simply set the next outbound packet onto the next path in the rotation. Depending on contention, there could be a lot of issues with stream sequencing. So, as explained in the earlier linked posts, the way that the general system works is that a single TCP stream stays on a single physical path. In order to stay on top of this, the load-balancing system maintains a hash table. A hash table is nothing more than a list of entries with more than one value, with each entry being unique from all the others based on the values contained in that entry.

To explain this, we’ll work through a complete example. We’ll start with an empty team passing no traffic. A request comes in to the team to send from a VM with IP address to the Altaro web address. The team sends that packet out the first adapter in the team and places a record for it in a hash table:

Source IP Source Port Destination IP Destination Port Physical Adapter 49152 80 1

Right after that, the same VM request a web page from the Microsoft web site. The team compares it to the first entry:

Source IP Source Port Destination IP Destination Port Physical Adapter 49152 80 1 49153 80 ?

The source ports and the destination IPs are different, so it sends the packet out the next available physical adapter in the rotation and saves a record of it in the hash table. This is the pattern that will be followed for subsequent packets; if any of the four fields for an entry make it unique when compared to all current entries in the table, it will be balanced to the next adapter.

As we know, TCP “conversations” are ongoing streams composed of multiple packets. The client’s web browser will continue sending requests to the above systems. The additional packets headed to the Altaro site will continue to match on the first hash entry, so they will continue to use the first physical adapter.

IP and MAC Address Hashing

Not all communications have the capability of participating in the 4-tuple hash. For instance, ICMP (ping) messages only use IP addresses, not ports. Non-TCP/IP traffic won’t even have that. In those cases, the hash algorithm will fall back from the 4-tuple method to the most suitable of the 2-tuple matches. These aren’t as granular, so the balancing won’t be as even, but it’s better than nothing.

Recommendations for Hashing Mode

If you like, you can use PowerShell to limit the hash mode to IP addresses, which will allow it to fall back to MAC address mode. You can also limit it to MAC address mode. I don’t know of a good use case for this, but it’s possible. Just check the options on New- and Set-NetLbfoTeam. In the GUI, you can only pick “Address Hash” unless you’ve already used PowerShell to set a more restrictive option.

For 2012 (not R2), this is the best solution in non-Hyper-V teaming, including teaming within a virtual machine. For Hyper-V, it’s good when you don’t have very many virtual adapters or when the majority of the traffic coming out of your virtual machines is highly varied in a way that way that would have a high number of balancing hits. Web servers are likely to fit this profile.

In contrast to Hyper-V Port balancing, this mode will mode always balance outbound traffic regardless of the teaming mode. But, in switch-independent mode, all inbound traffic comes across the primary adapter. This is not a good combination for high quantities of virtual machines whose traffic balance is heavier on the receive side. This part of the reason that the Hyper-V port mode almost always makes more sense in a switch independent mode, especially as the number of virtual adapters increases.

For 2012 R2, the official recommendation is the same as with the Hyper-V port mode. You’re encourage to use the new Dynamic mode. Again, this is generally a good recommendation that I’m overly inclined to agree with. However, I still recommend that you keep reading so you understand all your options.

Dynamic Balancing

This mode is new in 2012 R2, and it’s fairly impressive. For starters, it combines features from the Hyper-V port and Address Hash modes. The virtual adapters are registered separately across physical adapters in switch independent mode so received traffic can be balanced, but sending is balanced using the Address Hash method. In switch independent mode, this gives you an impressive balancing configuration. This is why the recommendations are so strong to stop using the other modes. However, if you’ve got an overriding use case, don’t be shy about using it. I suppose it’s possible that limiting virtual adapters to a single physical adapter for sending might have some merits in some cases.

There’s another feature added by the Dynamic mode that its name is derived from. It makes use of flowlets. I’ve read a whitepaper that explains this technology. To say the least, it’s a dense work that’s not easy for mortals to follow. The simple explanation is that it is a technique that can break an existing TCP stream and move it to another physical adapter. Pay close attention to what that means: the Dynamic mode cannot, and does not, send a single TCP stream across multiple adapters simultaneously. The odds of out-of-sequence packets and encountering interim or destination connections that can’t handle the parallel data is just too high for this to be feasible at this stage of network evolution. What it can do is move a stream from one physical adapter.

Let’s say you have two 10 GbE cards in a team using Dynamic load-balancing. A VM starts a massive outbound file transfer and it gets balanced to the first adapter. Another VM starts a small outbound transfer that’s balanced to the second adapter. A third VM begins its own large transfer and is balanced back to the first adapter. The lone transfer on the second adapter finishes quickly, leaving two large transfers to share the same 10 Gb adapter. Using the Hyper-V port or any address hash load-balancing method, there would be nothing that could be done about this short of canceling a transfer and restarting it, hoping that it would be balanced to the second adapter. With the new method, one of the streams can be dynamically moved to the other adapter, hence the name “Dynamic”. Flowlets require the split to be made at particular junctions in the stream. It is possible for Dynamic to work even when a neat flowlet opportunity doesn’t present itself.

Recommendations for Dynamic Mode

For the most part, Dynamic is the way to go. The reasons have been pretty well outlined above. For switch independent modes, it solves the dilemma of choosing Hyper-V port for inbound balancing against Address Hash for outbound balancing. For both switch independent and dependent modes, the dynamic rebalancing capability allows it to achieve a higher rate of well-balanced outbound traffic.

It can’t be stressed enough that you should never expect a perfect balancing of network traffic. Normal flows are anything but even or predictable, especially when you have multiple virtual machines working through the same connections. The Dynamic method is generally superior to all other load-balancing method but you’re not going to see perfectly level network utilization by using it.

Remember that if your networking goal is to enhance throughput, you’ll get the best results by using faster network hardware. No software solution will perform on par with dedicated hardware.


Link Aggregation and Teaming in Hyper-V

Link Aggregation and Teaming in Hyper-V

In part 3, I showed you a diagram of a couple of switches that were connected together using a single port. I mentioned then that I would likely use link aggregation to connect those switches in a production environment. Windows Server introduced the ability to team adapters natively starting with the 2012 version. Hyper-V can benefit from this ability.

To save you from needing to click back to part 2, here is the visualization again:

Switch Interaction ExamplePort 19 is empty on each of these switches. That’s not a good use of our resources. But, we can’t just go blindly plugging in a wire between them, either. Even if we configure ports 19 just like we have ports 20 configured, it still won’t work. In fact, either of these approaches will fail with fairly catastrophic effects. That’s because we’ll have created a loop.

Imagine that we have configured ports 19 and 20 on each switch identically and wired them together. Then, switch port 1 on switch 1 sends out a broadcast frame. Switch 1 will know that it needs to deliver that frame to every port that’s a member of VLAN 10. So, it will go to ports 2-6 and, because they are trunk ports with a native VLAN of 10, it will also deliver it to 19 and 20. Ports 19 and 20 will carry the packet over to switch 2. When it comes out on port 19, it will try to deliver it to ports 1-6 and 20. When it comes out on port 20, it will try to deliver it to ports 1-6 and port 19. So, the frame will go back to ports 19 and 20 on switch 1, where it will repeat the process. Because Ethernet doesn’t have a time to live like TCP/IP does (at least, as far as I know, it doesn’t), this process will repeat infinitely. That’s a loop.

Most switches can identify a loop long before any frames get caught up. The way Cisco switches will handle it is by cutting off the offending loop ports. So, if it’s the only connection that switch has with the outside world, all its endpoints will effectively go out. I’ve never put any other manufacturer into a loop, so I’m not sure how the various other vendors will deal with it. No matter what, you can’t just connect switches to each other using multiple cables without some configuration work.

Port Channels and Link Aggregation

The answer to the above problem is found in Port Channels or Link Aggregation. A port channel is Cisco’s version. Everyone else calls it link aggregation. Cisco does have some proprietary technology wrapped up in theirs, but it’s not necessary to understand that for this discussion. So, to make the above problem go away, we would assign ports 19 and 20 on the Cisco switch into a port channel. On any other hardware vendor, we would assign them to a link aggregation group (LAG). Once that’s done, the port channel or LAG is then configured just like a single port would be, as in trunk/(un)tagged or access/PVID. What’s really important to understand here is that the MAC addresses that the switch assigned to the individual ports are gone. The MAC address now belongs to the port channel/LAG. MAC addresses that it knows about on the connecting switch are delivered to the port channel, not to a switch port.

LAG Modes

It’s been quite a while since I worked on a Cisco environment, but as I recall, a port channel is just a port channel. You don’t need to do a lot of configuration once it’s set up. For other vendors, you have to set up the mode. We’re going to see these modes again with the Windows NIC team, so we’ll get acquainted with that first.

NIC Teaming

Now we look at how this translates into the Windows and Hyper-V environment. For a number of years, we’ve been using NIC teaming in our data centers to provide a measure of redundancy for servers. This uses multiple connections as well, but the most common types don’t include the same sort of cooperation between server and switch that you saw above between switches. Part of it is that a normal server doesn’t usually host multiple endpoints the way a switch does, so it doesn’t really need a trunk mode. A server is typically not concerned with VLANs. So, usually a teamed interface on a server isn’t maintaining two active connections. Instead, it has its MAC address registered on one of the two connected switch ports and the other is just waiting in reserve. Remember that it can’t actually be any other way, because a MAC address can only appear on a single port. So, even though a lot of people thought that they were getting aggregated bandwidth, they really weren’t. But, the nice thing about this configuration is that it doesn’t need any special configuration on the switch, except perhaps if there is a security restriction that prevents migration of MAC addresses.

New, starting in Windows/Hyper-V Server 2012, is NIC teaming built right into the operating system. Before this, all teaming schemes were handled by manufacturers’ drivers. There are three teaming modes available.

Switch Independent

This is the same mode as the traditional teaming mode. The switch doesn’t need to participate. Ordinarily, the Hyper-V switch will register all of its virtual adapters’ MAC addresses on a single port, so all inbound traffic comes through a single physical link. We’ll discuss the exceptions in another post. Outbound traffic can be sent using any of the physical links.

The great benefit of this method is that it can work with just about any switch, so small businesses don’t need to make special investments in particular hardware. You can even use it to connect to multiple switches simultaneously for redundancy. The downside, of course, is that all incoming traffic is bound to a single adapter.


The Hyper-V virtual switch and many physical switches can operate in this mode. The common standard is 802.3ad, but not all implementations are equal. In this method, each member is grouped into a single unit as explained in the Port Channels and Link Aggregation section above. Both switches (whether physical or virtual) much have their matching members configured into a static mode.

MAC addresses on all sides are registered on the overall aggregated group, not on any individual port. This allows incoming and outgoing traffic to use any of the available physical links. The drawbacks are that the switches all have to support this and you lose the ability to split connections across physical switches (with some exceptions, as we’ll talk about later).

If a connection experiences troubles but isn’t down, then the static switch will experience problems that might be difficult to troubleshoot. For instance, if you create a static team on 4 physical adapters in your Hyper-V host but only three of the switch’s ports are configured in a static trunk, then the Hyper-V system will still attempt to use all four.


LACP stands for “Link Aggregation Control Protocol”. This is defined in the 802.1ax standard, which supersedes 802.3ad. Unfortunately, there is a common myth that gives the impression that LACP provides special bandwidth consolidation capabilities over static aggregation. This is not true. An LACP group is functionally like a static group. The difference is that connected switches communicate using LACPDU packets to detect problems in the line. So, if the example setup at the end of the Static teaming section used LACP instead of Static, the switches would detect that one side was configured using only 3 of the 4 connected ports and would not attempt to use the 4th link. Other than that, LACP works just like static. The physical switch needs to be setup for it, as does the team in Windows/Hyper-V.

Bandwidth Usage in Aggregated Links

Bandwidth usage in aggregated links is a major confusion point. Unfortunately, it’s not a simple matter of all physical links being simply combined into one bigger one. It’s more likely that load-balancing will occur than bandwidth aggregation.

In most cases, the sending switch/team controls traffic flow. Specific load-balancing algorithms will be covered in another post. However it chooses to perform it, the sending system will transmit on a specific link. But, any given communication will almost exclusively use only one physical link. This is mostly because it helps ensure that the frames that make up a particular conversation arrive in order. If they were broken up and sent down separate pipes, contention and buffering would dramatically increase the probability that they would be scrambled before reaching their destination. TCP and a few other protocols have built-in ways to correct this, but this is a computationally expensive operation that usually doesn’t outweigh the restrictions of simply using a single physical link.

Another reason for the single-link restriction is simple practicality. Moving a transmission through multiple ports from Switch A to Switch B is fairly trivial. From Switch B to Switch C, it becomes less likely that enough links will be available. The longer the communications chain, the more likely a transmission won’t have the same bandwidth available as the initial hop. Also, the final endpoint is most likely on a single adapter. The available methods to deal with this are expensive and create a drag on network resources.

The implications of all this aren’t exactly clear. A quick explanation is that no matter what teaming mode you pick, when you run a network performance test across your team, the result is going to show the maximum speed of a single team member. But, if you run two such tests simultaneously, it might use two of the links. What I normally see is people trying to use a file copy to test bandwidth aggregation. Aside from the fact that file copy is a horrible way to test anything other than permissions, it’s not going to show anything more than the speed of a single physical link.

The exception to the sender-controlling rule is the switch-independent teaming mode. Inbound traffic is locked to a single physical adapter as all MAC addresses are registered in a single location. It can still load-balance outbound traffic across all ports. If used with the Hyper-V port load-balancing algorithm, then the MAC addresses for virtual adapters will be evenly distributed across available physical adapters. Each virtual adapter can still only receive at the maximum speed of a single port, though.

Stacking Switches

Some switches have the power to “stack”. What this means is that individual physical switches can be combined into a single logical unit. Then, they share a configuration and operate like a single unit. The purpose is for redundancy. If one of the switch(es) fails, the other(s) will continue to operate. What this means is that you can split a static or LACP inter-switch connection, including to a Hyper-V switch, across multiple physical switch units. It’s like having all the power of the switch independent mode with none of the drawbacks.

One concern with stacked switches is the interconnect between them. Some use a special interlink cable that provides very high data transfer speeds. With those, the only bad thing about the stack is the monetary cost. Cheaper stacking switches often just use regular Ethernet or 1gb or 2gb fiber. This could lead to bandwidth contention between the stack members. Since most networks use only a fraction of their available bandwidth at any given time, this may not be an issue. For heavily loaded core switches, a superior stacking method is definitely recommended.

Aggregation Summary

Without some understanding of load-balancing algorithms, it’s hard to get the complete picture here. These are the biggest things to understand:

  • The switch independent mode is the closest to the original mode of network adapter teaming that has been in common use for years. It requires that all inbound traffic flow to a single adapter. You cannot choose this adapter. If combined with the Hyper-V switch port load-balancing algorithm, virtual switch ports are distributed evenly across the available adapters and each will use only its assigned port for inbound traffic.
  • Static and LACP modes are common to the Windows/Hyper-V Server NIC team and most smart switches.
  • Not all static and LACP implementations are created equally. You may encounter problems connecting to some switches.
  • LACP doesn’t have any capabilities for bandwidth aggregation that the static method does not have.
  • Bandwidth aggregation occurs by balancing different communications streams across available links, not by using all possible paths for each stream.

What’s Next

While it might seem logical that the next post would be about the load-balancing algorithms, that’s actually a little more advanced than where I’m ready for this series to proceed. Bandwidth aggregation using static and LACP modes is a fairly basic concept in terms of switching. I’d like to continue with the basics of traffic flow by talking about DNS and protocol bindings.


The complete guide to Mapping the OSI Model to Hyper-V

The complete guide to Mapping the OSI Model to Hyper-V

After storage, Hyper-V’s next most confusing subject is networking. There are a dizzying array of choices and possibilities. To make matters worse, many administrators don’t actually understand that much about the fundamentals because, up until now, they’ve never really had to.

Why It Matters

In the Windows NT 4.0 days, the Microsoft Certified Systems Engineer exam track required passage of “Networking Essentials” and the electives included a TCP/IP exam. Neither of these exams had a corollary in the Windows 2000 track and, although I haven’t kept up much with the world of certification since the Windows 2003 series, I’m fairly certain that networking has largely disappeared from Microsoft certifications. That’s both a blessing and a curse. Basic networking isn’t overly difficult and a working knowledge can be absorbed through simple hands-on experience. More advanced, and sometimes even intermediate skills, can be involved and require a fair level of dedication. If all you really need to do is plug a Windows Server into an existing network and get it going, then a lot of that is probably excess detail that you can leave to someone else. There are certification, expertise, and career tracks available just for networking, and the network engineers and administrators that earn them deserve to have their own world separate from system engineering and administration. Learning all of that is burdensome for systems administrators and is unlikely to pay dividends, especially with the risk of skill rot. The downside is that it’s no longer good enough to know how to set up a vendor team and slam in some basic IP information. Too many systems people have ignored the networking stacks in favor of their servers and applications and are now playing catch-up as integrated teaming, datacenter bridging, software-defined networking, and other technologies escape the confines of netops and intrude into the formerly tidy world of sysops.

The first post of this series will (re)introduce you to the fundamentals of networking that you will build the rest of your Hyper-V networking understanding upon.

The OSI Model

If you’ve never heard the phrases “All People Seem To Need Data Processing” or “Please Do Not Throw Sausage Pizza Away”, then someone along your technical education path has done you a great disservice (or you learned the OSI model in a non-English language). These are mnemonics used by students to drill for exams that test on the seven layers of the OSI model, which obviously worked because I can still recall them fifteen years later:

  1. Physical
  2. Data Link
  3. Network
  4. Transport
  5. Session
  6. Presentation
  7. Application

Oddly enough, I’ve never been asked on any test what “OSI” stands for and I had to look that up: Open Systems Interconnection. Now you know what to put on that blank Apples to Apples card if you want to never be invited to a party again.

The reason that we have two mnemonics is because traffic travels both ways in the model. If your application is Skype, then the model covers your voice being broken into a rush of electrons (from seventh down to first layer) and back into something that might sound almost like you on the other side of an ocean (from first up to seventh layer).

The OSI model is a true model in that it does nothing but describe how a complete networking stack might look. In practice, there is nothing that perfectly matches to this model. The idea is that each of the seven layers performs a particular function in network communications, but only knows enough to interoperate with the layer immediately above and immediately below. So, no jumping from the physical layer to the presentation layer, for instance.

I’m not going to spend a bunch of time on the seven layers. There are a lot of great references and guides available on the Internet, so if you really care, do some searching and find the resource that suits your learning model. If you’re in systems or network administration/engineering, layers six and seven will likely never be of any real concern to you. You might occasionally care about layer five. We’re really focused on layers one through four, and that’s what we’ll talk about.

Use the following diagram as a visual reference for the upcoming sections:

OSI and Hyper-VLayer One

In theory, layer one is extremely easy to understand; it’s all in the name: “physical”. This is the electrons and the wires and fiber and switch ports and network adapters and such. It’s the world of twisted pairs and CAT-5e and crossover. In practice, it’s always messier. This is also the world of crosstalk and interference and phrases like, “Hey, we can use the ballasts in these fluorescent lights as anchors for the network cable, right?” and, “I only use cables with red jackets because they have better throughput than the blue ones,” and all sorts of other things that are pretty maddening if you spend too much time thinking about them. We’ll move along quickly. Just be aware that cables and switches are important, and they break, and they need to be cared for.

Layer Two

Layer two is where things start to get fuzzy. From this point upward, everything exists because we say it does. It’s the first level at which those pulses of light and electron bursts take on some sort of meaning. For us in the Hyper-V world, it’s mostly going to be Ethernet. The unit of communication in Ethernet is the frame. In keeping with our layered model concept, the frame was a sequence of light or electric pulses that some physical device, like a network adapter, has re-interpreted into digital bits. The Ethernet specification says that a series of bits has a particular format and meaning. An incoming series of these bits starts with a header and is then followed by what is expected to be a data section (called the payload), and ends with a set of validation bits. This is the first demonstration point of the OSI model: layer one handles all the nasty parts of converting pulses to data bits and back. Layer two is only aware of and concerned with the ordering of these bits.

The Ethernet Frame

By tearing apart the Ethernet frame header, we can see most of the basic features that live in this layer.

Tagged Ethernet Frame

Tagged Ethernet Frame

The first thing of note is the destination and source MAC addresses (“media access control address”). On any Windows machine, run IPCONFIG /ALL and you’ll find the MAC address in the Physical Address field. Run Get-NetAdapter in PowerShell and you can retrieve the value of the MacAddress field or the LinkLayerAddress field. The MAC address comes in six binary octets, usually represented in two-digit hexadecimal number groupings, like this: E0-06-E6-2A-CD-FB. In case it’s not obvious, the hyphens are only present to make it human readable. You’ll sometimes see colons used, or no delimiters at all. Every network device manufacturer and (some other entities) have their own prefix(-es), indicated in the first three octets. If you search the Internet for “MAC address prefix lookup”, you’ll find a number of sites that allow you to identify the actual manufacturer of the network chip on your branded adapter.

The presence of the MAC address in the Ethernet frame tells us that layer 2 is what deals with these addresses. Therefore, it could also be said that this is the level at which we will find ARP (address resolution protocol), although, as a tangent, it could also be considered as layer 3. Either way, all data that travels across an Ethernet network knows only about MAC addresses. There is no other addressing scheme available here. TCP/IP and its attendant IP addresses have no presence in Ethernet, and unless you get really deep into the technicalities, TCP/IP isn’t considered to be in layer two at all. It’s vital that you understand this, as it is a common stumbling point that presents a surprisingly high barrier to comprehension. As a bit of a trivia piece, the ability to manage MAC addresses and tables is what differentiates a switch from a hub.

Next, we might encounter the 802.1q tag. This is the technology that enables VLANs to work. This is a potentially confusing topic and will get its own section later. For now, just be aware that, if present, VLAN information lives in the Ethernet frame which means it is part of layer 2. Layer 3 and upward have no idea that VLANs even exist.

What puts layer two right in the face of the Windows administrator is the fact that the Hyper-V virtual switch and Windows network adapter teaming live at this level. Without an ability to parse the Ethernet frame, teaming cannot work at all. It must be able to work with MAC addresses. The Hyper-V virtual switch is a switch, and as such it must also be aware of MAC addresses. It also happens to be a smart switch, so it must also know how to work with 802.1q VLAN tags.

A fairly recent addition to the Ethernet specification is Datacenter Bridging (DCB). This is an advanced subject that I might write a dedicated article about, as it is a large and complex topic in its own right. The basic goal of DCB is to overcome the lossy nature of TCP/IP in the datacenter where data loss is both unnecessary and undesirable. There are a number of implementations, but the Ethernet versions include some way of tagging the frame. The significance is that Windows can apply a DCB tag to traffic and DCB-aware physical switches are able to process and prioritize traffic according to these tags. You need a fairly large TCP/IP network for this to be an issue of major concern as most LANs see so little contention that any data loss usually indicates a broken component.

The final thing we’re going to talk about here is the payload. In the modern Windows world, the content of this payload is a TCP/IP packet. It doesn’t have to be that, though. In days of yore, it might have been an IPX/SPX packet. Or a NetBEUI packet. Or anything. All that Ethernet cares about is the destination MAC address. Once the frame is delivered, layer two will unpackage the packet and deliver it up to layer three to deal with.

Layer Three

Here is where we first begin to encounter TCP/IP. A couple of things to note here. First, TCP/IP is not really a protocol, but a protocol group. TCP is one of them, IP is another, so on and so forth. Second, it’s also where you really start to see that the layers of the OSI model are only conceptual, because a number of things could be considered to exist in multiple layers simultaneously.

Layer three is where we start talking about the packet as opposed to the frame. Ethernet (or Token Ring, or any other layer 2 protocol… it doesn’t really matter) has delivered the frame and the payload has been extracted for processing. Everything layer 2-related is now gone: no MAC address. No 802.1q tag. In general, the network adapter driver is the first and last thing in your Windows system to know anything about the Ethernet frame. After that, Windows takes over with the TCP/IP stack.

What we have at this level is IP. The stand-out feature of IP is, of course, the IP address. This is a four-octet binary number that is usually represented in dotted-decimal notation, like this: IP is the addressing mechanism of layer three.

TCP/IP traffic is packaged in the packet. In many ways, it looks similar to the Ethernet frame. It has a defined sequence that includes a header and a data section. Inside the header, we find source and destination IP addresses. This is also the point at which we can start thinking about routing.

A very important fact to know when you’re testing a network is that ICMP (which means PING for most of us) lives in layer 3, not layer 4. You need to be aware of this because you will see behaviors in ICMP that don’t make a lot of sense when you try to think of them in terms of layer 4 behavior, especially in comparison to TCP and UDP. We’ll talk about this again once we are introduced to layer 4.

What’s not here is the Hyper-V virtual switch. It has no IP address of its own and is generally oblivious to the fact that IP addresses exist. When you “share” the physical adapter that a Hyper-V switch is assigned to, what actually happens is that a virtual network adapter is created for the management operating system. That virtual adapter “connects” to the Hyper-V virtual switch at layer one (which is, of course virtual). It does the work of bringing the layer two information off the Hyper-V switch into the layer three world of the management operating system. So, the virtual switch and virtual adapter are in layers one and two, but only the adapter can be said to meaningfully participate in layer three at all.

The Hyper-V Server/Windows Server team is also not really in level three. You do create a team interface, but it also works much like Hyper-V’s virtual adapter.

Layer Four

Layer four is where we find much more of the TCP/IP stack, particularly TCP and UDP. The OSI model is really fuzzy at this point, because these protocols are advertised right there in the TCP/IP packet header, which is definitely a layer three object. However, it is the TCP/IP control software operating in this layer that is responsible for the packaging and handling of these various packets, and the actual differences are seen inside the payload portion of the packet. For the most part, Hyper-V administrators don’t really need to think much about layer four operations, but having no understanding of them will hurt.

The features that we see in layer four are really what made TCP/IP the most popular protocol. This is especially true for TCP, which allows for packets to be lost while preventing data loss. TCP packets are tracked from source to destination, and if one never arrives, the recipient can signal for a retransmission. So, if a few packets in a stream happen to travel a different route and arrive out of order, this protocol can put them back into their original intended pattern. UDP does not do this, but it shares TCP’s ability to detect problems.

This capability is really what separates layer three from layer four, and why ICMP doesn’t behave like a layer four protocol. For instance, if you’re running a Live Migration and a ping is dropped, that doesn’t mean that TCP will be affected at all. I’ve heard it said that ICMP is designed to find network problems and that’s why it fails when other protocols don’t. That’s true to some degree, but it’s also because the functionality that allows TCP and UDP to deal with aberrations in the network are not layer 3 functions.


The best summary of the process described by the OSI model is that networking is a series of encapsulation. The following illustration shows this at the levels we’ve discussed:

OSI Encapsulation

Each successive layer takes the output of the previous layer, and depending on the direction that the data is flowing, either encapsulates it with that layer’s information or unpacks the data for further processing.

What’s Next

In the next installment of this series, we’ll start to see application of these concepts by tearing into VLANs.


Page 1 of 212