Hyper-V Load-Balancing Algorithms

We’ve had a long run of articles in this series that mostly looked at general networking technologies. Now we’re going to look at a technology that gets us closer to Hyper-V. Load-balancing algorithms are a feature of the network team, which can be used with any Windows Server installation, but is especially useful for balancing the traffic of several operating systems sharing a single network team.

We’ve already had a couple of articles on the subject of teaming in the Server 2012+ products. The first, not part of this series, talked about MPIO, but outlined the general mechanics of teaming. The second was part of this series and took a deeper look at teaming and the aggregation options available.

The selected load-balancing method is how the team decides to utilize the team members for sending traffic. Before we go through these, it’s important to reinforce that this is load-balancing. There isn’t a way to just aggregate all the team members into a single unified pipe.

I will periodically remind you of this point, but keep in mind that the load-balancing algorithms apply only to outbound traffic. The connected physical switch decides how to send traffic to the Windows Server team. Some of the algorithms have a way to exert some influence over the options available to the physical switch, but the Windows Server team is only responsible for balancing what it sends out to the switch.

Hyper-V Port Load-Balancing Algorithm

This method is commonly chosen, and recommended for all Hyper-V installations, based solely on its name. That is a poor reason. The name wasn’t picked because it’s the automatic best choice for Hyper-V, but because of how it operates.

The operation is based on the virtual network adapters. In 2012 and earlier versions, distribution was by MAC address. In 2012 R2 and onward, it is based on the actual virtual switch port. Distribution depends on the teaming mode of the virtual switch.

Switch-independent: Each virtual adapter is assigned to a specific physical member of the team. It sends and receives only on that member. Distribution of the adapters is just round-robin. The impact on VMQ is that each adapter gets a single queue on the physical adapter it is assigned to, assuming there are enough left.

Everything else: Virtual adapters are still assigned to a specific physical adapter, but this will only apply to outbound traffic. The MAC addresses of all these adapters appear on the combined link on the physical switch side, so it will decide how to send traffic to the virtual switch. Since there’s no way for the Hyper-V switch to know where inbound traffic for any given virtual adapter will be, it must register a VMQ for each virtual adapter on each physical adapter. This can quickly lead to queue depletion.
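In both teaming modes, the starting point is the same: virtual adapters are handed out to team members in rotation. A minimal sketch of that round-robin assignment, with invented names and no claim to match Microsoft’s actual implementation, might look like this:

```python
def assign_vnics(vnics, pnics):
    """Map each virtual adapter to one physical team member, round-robin."""
    return {vnic: pnics[i % len(pnics)] for i, vnic in enumerate(vnics)}

# Five vNICs across a two-member team: vm1/vm3/vm5 land on pNIC1,
# vm2/vm4 on pNIC2. In switch-independent mode this mapping governs
# both sending and receiving; otherwise only sending.
mapping = assign_vnics(["vm1", "vm2", "vm3", "vm4", "vm5"], ["pNIC1", "pNIC2"])
```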

Recommendations for Hyper-V Port Distribution Mode

If you somehow landed here because you’re interested in teaming but you’re not interested in Hyper-V, then this is the worst possible distribution mode you can pick. It only distributes virtual adapters. The team adapter will be permanently stuck on the primary physical adapter for sending operations. The physical switch can still distribute traffic if the team is in a switch-dependent mode.

By the same token, you don’t want to use this mode if you’re teaming from within a virtual machine. It will be pointless.

Something else to keep in mind is that outbound traffic from a VM is always limited to a single physical adapter. For 10 Gb connections, that’s probably not an issue. For 1 Gb, think about your workloads.

For 2012 (not R2), this is a really good distribution method for inbound traffic if you are using the switch-independent mode. This is the only one of the load-balancing modes that doesn’t force all inbound traffic to the primary adapter when the team is switch-independent. If you’re using any of the switch-dependent modes, then the best determinant is usually the ratio of virtual adapters to physical adapters. The higher that number is, the better result you’re likely to get from the Hyper-V port mode. However, before just taking that and running off, I suggest that you continue reading about the hash modes and think about how it relates to the loads you use in your organization.

For 2012 R2 and later, the official word is that the new Dynamic mode universally supersedes all applications of Hyper-V port. I have a tendency to agree, and you’d be hard-pressed to find a situation where it would be inappropriate. That said, I recommend that you continue reading so you get all the information needed to compare the reasons for the recommendations against your own system and expectations.

Hash Load-Balancing Algorithms

The umbrella term for the various hash balancing methods is “address hash”. This covers three different possible hashing modes in an order of preference. Of these, the best selection is the “Transport Ports”. The term “4-tuple” is often seen with this mode. All that means is that when deciding how to balance outbound traffic, four criteria are considered. These are: source IP address, source port, destination IP address, destination port.

Each time traffic is presented to the team for outbound transmission, the team must decide which member to use. At a very high level, this is a round-robin distribution. But simply placing the next outbound packet onto the next path in the rotation is inefficient; depending on contention, it could cause serious stream-sequencing problems. So, as explained in the earlier linked posts, the general rule is that a single TCP stream stays on a single physical path. To keep track, the load-balancing system maintains a hash table: a list of entries, each holding several values, with every entry unique from all the others based on the combination of values it contains.

To explain this, we’ll work through a complete example. We’ll start with an empty team passing no traffic. A request comes in to the team to send from a VM with IP address 192.168.50.20 to the Altaro web address. The team sends that packet out the first adapter in the team and places a record for it in a hash table:

Source IP       Source Port   Destination IP    Destination Port   Physical Adapter
192.168.50.20   49152         108.168.254.197   80                 1

Right after that, the same VM requests a web page from the Microsoft web site. The team compares the new tuple to the first entry:

Source IP       Source Port   Destination IP    Destination Port   Physical Adapter
192.168.50.20   49152         108.168.254.197   80                 1
192.168.50.20   49153         65.55.57.27       80                 ?

The source ports and the destination IPs are different, so it sends the packet out the next available physical adapter in the rotation and saves a record of it in the hash table. This is the pattern that will be followed for subsequent packets; if any of the four fields for an entry make it unique when compared to all current entries in the table, it will be balanced to the next adapter.

As we know, TCP “conversations” are ongoing streams composed of multiple packets. The client’s web browser will continue sending requests to the above systems. The additional packets headed to the Altaro site will continue to match on the first hash entry, so they will continue to use the first physical adapter.
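The whole lookup can be modeled in a few lines. This is an illustrative sketch of the 4-tuple idea, not Microsoft’s implementation; the class name and structure are invented:

```python
class AddressHashTeam:
    """Toy model of "Transport Ports" balancing: each unique
    (src IP, src port, dst IP, dst port) tuple is pinned to one team
    member, so a TCP stream never spans adapters."""

    def __init__(self, member_count):
        self.members = member_count
        self.table = {}       # hash table: 4-tuple -> adapter index
        self.next_member = 0  # round-robin pointer for new streams

    def pick_adapter(self, src_ip, src_port, dst_ip, dst_port):
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.table:  # new stream: next adapter in rotation
            self.table[key] = self.next_member
            self.next_member = (self.next_member + 1) % self.members
        return self.table[key]     # existing stream: stays on its adapter

team = AddressHashTeam(2)
a = team.pick_adapter("192.168.50.20", 49152, "108.168.254.197", 80)  # new stream
b = team.pick_adapter("192.168.50.20", 49153, "65.55.57.27", 80)      # new stream, next member
c = team.pick_adapter("192.168.50.20", 49152, "108.168.254.197", 80)  # matches first entry
```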

IP and MAC Address Hashing

Not all communications have the capability of participating in the 4-tuple hash. For instance, ICMP (ping) messages only use IP addresses, not ports. Non-TCP/IP traffic won’t even have that. In those cases, the hash algorithm will fall back from the 4-tuple method to the most suitable of the 2-tuple matches. These aren’t as granular, so the balancing won’t be as even, but it’s better than nothing.
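The fallback amounts to choosing the most specific key the traffic supports. A rough sketch, with field names invented for illustration:

```python
def hash_key(frame):
    """Build the most specific balancing key available: 4-tuple for
    TCP/UDP, IP pair for port-less traffic such as ICMP, MAC pair for
    non-IP frames. Illustrative only."""
    if frame.get("src_port") is not None and frame.get("dst_port") is not None:
        return (frame["src_ip"], frame["src_port"],
                frame["dst_ip"], frame["dst_port"])
    if frame.get("src_ip") is not None:
        return (frame["src_ip"], frame["dst_ip"])    # 2-tuple, e.g. ICMP
    return (frame["src_mac"], frame["dst_mac"])      # non-TCP/IP traffic

tcp = {"src_ip": "192.168.50.20", "src_port": 49152,
       "dst_ip": "108.168.254.197", "dst_port": 80}
icmp = {"src_ip": "192.168.50.20", "dst_ip": "8.8.8.8",
        "src_port": None, "dst_port": None}
```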

Recommendations for Hashing Mode

If you like, you can use PowerShell to limit the hash mode to IP addresses, which allows it to fall back to MAC address mode, or to restrict it to MAC address mode alone. I don’t know of a good use case for this, but it’s possible. Just check the -LoadBalancingAlgorithm parameter on New-NetLbfoTeam and Set-NetLbfoTeam (the IPAddresses and MacAddresses values, respectively). In the GUI, you can only pick “Address Hash” unless you’ve already used PowerShell to set a more restrictive option.

For 2012 (not R2), this is the best solution for non-Hyper-V teaming, including teaming within a virtual machine. For Hyper-V, it’s a good fit when you don’t have very many virtual adapters, or when the majority of the traffic coming out of your virtual machines is varied in a way that produces a high number of balancing hits. Web servers are likely to fit this profile.

In contrast to Hyper-V Port balancing, this mode always balances outbound traffic regardless of the teaming mode. But in switch-independent mode, all inbound traffic comes across the primary adapter. This is not a good combination for a high quantity of virtual machines whose traffic balance is heavier on the receive side, and it is part of the reason that the Hyper-V Port mode almost always makes more sense in switch-independent mode, especially as the number of virtual adapters increases.

For 2012 R2, the official recommendation is the same as with the Hyper-V Port mode: you’re encouraged to use the new Dynamic mode. Again, this is generally a good recommendation that I’m inclined to agree with. However, I still recommend that you keep reading so you understand all your options.

Dynamic Balancing

This mode is new in 2012 R2, and it’s fairly impressive. For starters, it combines features from the Hyper-V Port and Address Hash modes. The virtual adapters are registered separately across physical adapters in switch-independent mode so received traffic can be balanced, but sending is balanced using the Address Hash method. In switch-independent mode, this gives you an impressive balancing configuration. This is why the recommendations to stop using the other modes are so strong. However, if you’ve got an overriding use case, don’t be shy about using them. I suppose it’s possible that limiting a virtual adapter to a single physical adapter for sending might have merit in some cases.

There’s another feature added by the Dynamic mode that its name is derived from. It makes use of flowlets. I’ve read a whitepaper that explains this technology. To say the least, it’s a dense work that’s not easy for mortals to follow. The simple explanation is that it is a technique that can break an existing TCP stream and move it to another physical adapter. Pay close attention to what that means: the Dynamic mode cannot, and does not, send a single TCP stream across multiple adapters simultaneously. The odds of out-of-sequence packets and of encountering interim or destination connections that can’t handle the parallel data are just too high for this to be feasible at this stage of network evolution. What it can do is move an entire stream from one physical adapter to another.

Let’s say you have two 10 GbE cards in a team using Dynamic load-balancing. A VM starts a massive outbound file transfer and it gets balanced to the first adapter. Another VM starts a small outbound transfer that’s balanced to the second adapter. A third VM begins its own large transfer and is balanced back to the first adapter. The lone transfer on the second adapter finishes quickly, leaving two large transfers to share the same 10 Gb adapter. Using the Hyper-V port or any address hash load-balancing method, there would be nothing that could be done about this short of canceling a transfer and restarting it, hoping that it would be balanced to the second adapter. With the new method, one of the streams can be dynamically moved to the other adapter, hence the name “Dynamic”. Flowlets require the split to be made at particular junctions in the stream. It is possible for Dynamic to work even when a neat flowlet opportunity doesn’t present itself.
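The rebalancing idea in that scenario can be modeled very roughly as follows. This is a toy sketch, not the real algorithm, which works on flowlet boundaries and live utilization rather than simple stream counts:

```python
def rebalance(assignments, member_count):
    """assignments: stream name -> member index. If one member carries at
    least two more streams than another, move one whole stream to the
    lighter member. Whole streams move; packets never split in parallel."""
    load = [0] * member_count
    for member in assignments.values():
        load[member] += 1
    busiest = load.index(max(load))
    lightest = load.index(min(load))
    if load[busiest] - load[lightest] >= 2:
        stream = next(s for s, m in assignments.items() if m == busiest)
        assignments[stream] = lightest  # stream sequencing is preserved
    return assignments

# Two large transfers stuck sharing member 0 after the small transfer
# on member 1 finished; one of them moves to the idle member.
flows = {"vm1-transfer": 0, "vm3-transfer": 0}
rebalance(flows, 2)
```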

Recommendations for Dynamic Mode

For the most part, Dynamic is the way to go. The reasons have been pretty well outlined above. For switch independent modes, it solves the dilemma of choosing Hyper-V port for inbound balancing against Address Hash for outbound balancing. For both switch independent and dependent modes, the dynamic rebalancing capability allows it to achieve a higher rate of well-balanced outbound traffic.

It can’t be stressed enough that you should never expect a perfect balancing of network traffic. Normal flows are anything but even or predictable, especially when you have multiple virtual machines working through the same connections. The Dynamic method is generally superior to all other load-balancing methods, but you’re not going to see perfectly level network utilization by using it.

Remember that if your networking goal is to enhance throughput, you’ll get the best results by using faster network hardware. No software solution will perform on par with dedicated hardware.

 

24 thoughts on "Hyper-V Load-Balancing Algorithms"

  • Daniel Handy says:

    Great series of articles.

    I have a question about Dynamic load balancing on *physical-only* servers.

    It seems that inbound traffic for a physical NIC team would be restricted to just one NIC from your description ? That is, a physical NIC team could not take advantage of inbound aggregation?
    Outbound is obviously straightforward, but I am left with ambiguity about inbound traffic for a physical server.

    Thanks. Hopefully you can clarify.

    • Eric Siron says:

      There is no inbound aggregation on Dynamic in switch-independent mode anyway. For Hyper-V, there is only load distribution by virtual NIC. For a physical-only server, inbound traffic to a Dynamic switch will be only on the primary adapter.


  • Boudewijn Plomp says:

I have done a lot of testing with NIC Teaming. I always prefer to use LACP/Dynamic for normal NIC Teams presented to the Management OS. But if you use 10GbE and enable a Hyper-V vSwitch I have noticed that LACP/Dynamic most often gives significantly lower throughput than Switch Independent/Hyper-V Port. I have seen this so many times that I most often configure Switch Independent/Hyper-V Port when using 10GbE and a Hyper-V vSwitch is involved. This is not really a big deal because 10GbE gives plenty of throughput, but something you have to test when you work with different hardware combinations.

    The only downside is, if you use 1GbE instead and you add converged networking by adding vNICs (e.g. Management, Cluster and Live Migration) on top of the vSwitch presented to the Management OS, they are limited to a single 1GbE NIC. So in case of 1GbE, vSwitch with vNIC’s I tend to use Switch Independent/Dynamic.

    NOTE: VMQ is not the issue, configured that properly.

    • Eric Siron says:

      Thank you for the input, Boudewijn. I don’t know if I’m in a great position to see if my experience matches yours, but I’d certainly like to try it out.


  • Boudewijn Plomp says:

    By the way, I forgot to mention; Great blog! Very well written.

  • Majid Karimpour says:

    Dear Eric,
    I have a question about switch independent/address Hashing mechanism.
In my environment I have a server that has 2 NICs. These 2 NICs are connected to a switch. If I configure switch independent with address hashing, NIC Teaming Common Core sends my server’s outbound traffic through both pNICs with the same source team interface MAC address. This is not acceptable to the physical switch because, from its perspective, a server cannot connect to 2 switch ports with the same MAC address. What happens now?
    How Microsoft Teaming handles MAC Flapping problem?
    Thanks

    • Eric Siron says:

      You have two choices.
      If you have more vNICs than pNICs by a substantial ratio, such as 3:1 or more, then your best bet is to disable the MAC port security for those physical ports and continue using address hash or switch to dynamic load balancing.
      If you have roughly equal vNICs to pNICs and/or you don’t want to change the port security settings, then your best bet is to use Hyper-V Port load balancing. This will cause each vNIC to register its MAC on only one physical adapter and eliminate the flapping.

      • Majid Karimpour says:

        I don’t use Hyper-V on my servers and i use native nic teaming on my native OS. So i should use Address hash or Dynamic for load balancing algorithm. Suppose i disabled port security but this load balancing mechanism has highly overhead on the switches and is not acceptable by network engineers. i think Microsoft designed this feature without knowledge of problems that may happen on the network. what is you idea?

        • Eric Siron says:

Microsoft followed industry standards. Their teaming method is fully compliant with 802.1AX. If it’s a real problem for your networking team, then your choice is to go active/passive the way that old vendor teaming software did. The physical switch ports still need to have MAC port security disabled or failover won’t work.

          • Majid Karimpour says:

            I know Microsoft follows industry standards and teaming is fully compliant with IEEE 802.1AX. but I talk about Switch Independent mode and not Switch Dependent (LACP or Static).
            Even in some switches when it detect there is a MAC Flapping, STP tries to disable one port. I know i can use Active/Standby mode, but my problem is why Microsoft lets to select other algorithms in switch independent mode when it knows it creates big problems.

          • Eric Siron says:

            Let me reflect back to you what I’m hearing you say: “I set my Windows Server in a load-balancing configuration. Because it is balancing its load across two physical adapters, its MAC address appears on two physical switch ports. I configured my physical switch to panic when it sees a MAC address on two different switch ports. I do not understand why my physical switch is panicking when it sees a MAC address on two different switch ports.”
            Your switch and the Microsoft team are both doing exactly what you told them to do. In active/active mode, ALL of the non-Hyper-V load-balancing algorithms will transmit from all active adapters regardless of the teaming mode. If you don’t want that, do not use active/active. It’s not a Microsoft problem. ANY active/active team technology is going to behave this way because that is how Ethernet works.
            I don’t know how you managed to get STP mixed up in this. It sounds like your switch is poorly configured or heavily over-configured.

