Few Hyper-V topics burn up the Internet quite like “performance”. No matter how fast it goes, we always want it to go faster. If you search even a little, you’ll find many articles with long lists of ways to improve Hyper-V’s performance. The less focused articles start with general Windows performance tips and sprinkle some Hyper-V-flavored spice on them. I want to use this article to tighten the focus down on Hyper-V hardware settings only. That means it won’t be as long as some others; I’ll just think of that as wasting less of your time. So in the name of speed, let’s get right into it!
Upgrade your system
I would prefer if everyone just knew this upfront. Unfortunately, it seems like I need to frequently remind folks that hardware cannot exceed its capabilities. So, every performance article I write will always include this point front-and-center. Each piece of hardware has an upper limit on maximum speed. Where that speed barrier lies in comparison to other hardware in the same category almost always correlates directly with the cost. You cannot tweak a go-kart to outrun a Corvette without spending at least as much money as just buying a Corvette, and that’s without considering the time element. If you bought slow hardware, then you will have a slow Hyper-V environment.
Fortunately, this point has a corollary: don’t panic. Production systems, especially server-class systems, almost never experience demand levels that compare to the stress tests that admins put on new equipment. If typical load levels were that high, it’s doubtful that virtualization would have caught on so quickly. We use virtualization for so many reasons nowadays, we forget that “cost savings through better utilization of under-loaded server equipment” was one of the primary drivers of early virtualization adoption.
BIOS Settings for Hyper-V Performance
Don’t neglect your BIOS! It contains some of the most important settings for Hyper-V.
- C States. Disable C States! Few things impact Hyper-V performance quite as strongly as C States! Names and locations will vary, so look in areas related to Processor/CPU, Performance, and Power Management. If you can’t find anything that specifically says C States, then look for settings that disable/minimize power management. C1E is usually the worst offender for Live Migration problems, although other modes can cause issues.
- Virtualization support: A number of features have popped up through the years, but most BIOS manufacturers have since consolidated them all into a global “Virtualization Support” switch, or something similar. I don’t believe that current versions of Hyper-V will even run if these settings aren’t enabled. Here are some individual component names, for those special BIOSs that break them out:
- Virtual Machine Extensions (VMX)
- AMD-V — AMD CPUs/mainboards. Be aware that Hyper-V can’t (yet?) run nested virtual machines on AMD chips
- VT-x, or sometimes just VT — Intel CPUs/mainboards. Required for nested virtualization with Hyper-V in Windows 10/Server 2016
- Data Execution Prevention: DEP means less for performance and more for security. It’s also a requirement. But, we’re talking about your BIOS settings and you’re in your BIOS, so we’ll talk about it. Just make sure that it’s on. If you don’t see it under the DEP name, look for:
- No Execute (NX) — AMD CPUs/mainboards
- Execute Disable (XD) — Intel CPUs/mainboards
- Second Level Address Translation: I list this feature primarily for the sake of completeness. It’s been many years since any system was built new without SLAT support. If you have one, following every point in this post to the letter still won’t make that system fast. Starting with Windows 8 and Server 2016, you cannot use Hyper-V without SLAT support. Names that you will see SLAT under:
- Nested Page Tables (NPT)/Rapid Virtualization Indexing (RVI) — AMD CPUs/mainboards
- Extended Page Tables (EPT) — Intel CPUs/mainboards
- Disable power management. This goes hand-in-hand with C States. Just turn off power management altogether. Get your energy savings via consolidation. You can also buy lower wattage systems.
- Use Hyperthreading. I’ve seen a tiny handful of claims that Hyperthreading causes problems on Hyper-V. I’ve heard more convincing stories about space aliens. I’ve personally seen the same number of space aliens as I’ve seen Hyperthreading problems with Hyper-V (that would be zero). If you’ve legitimately encountered a problem that was fixed by disabling Hyperthreading AND you can prove that it wasn’t a bad CPU, that’s great! Please let me know. But remember, you’re still in a minority of a minority of a minority. The rest of us will run Hyperthreading. If you want to minimize your exposure to Spectre and similar cache side-channel attacks, then upgrade to at least 2016 and employ the core scheduler.
- Disable SCSI BIOSs. Unless you plan to boot your host from a SAN, kill the BIOSs on your SCSI adapters. A SCSI card’s BIOS doesn’t do anything good or bad for a running Hyper-V host, but it slows down physical boot times.
- Disable BIOS-set VLAN IDs on physical NICs. Some network adapters support VLAN tagging through boot-up interfaces. If you then bind a Hyper-V virtual switch to one of those adapters, you could encounter all sorts of network nastiness.
Storage Settings for Hyper-V Performance
I wish the IT world would learn to accept that rotating hard disks do not move data very quickly. If you just can’t cope with that, buy a gigantic lot of them and make big RAID 10 arrays. Or, you could get a stack of SSDs. Don’t get six or so spinning disks and get sad that they “only” move data at a few hundred megabytes per second. That’s how the tech works.
Performance tips for storage:
- Learn to live with the fact that storage is slow.
- Remember that speed tests do not reflect real-world load and that file copy does not test anything except permissions.
- Learn to live with Hyper-V’s I/O scheduler. If you want a computer system to have 100% access to storage bandwidth, start by checking your assumptions. Just because a single file copy doesn’t go as fast as you think it should, does not mean that the system won’t perform its production role adequately. If you’re certain that a system must have total and complete storage speed, then do not virtualize it. A VM cannot achieve that level of speed without stealing I/O from other guests.
- Enable read caches
- Your SAN/NAS will likely have a read cache
- Your internal storage will likely have a read cache
- Cluster Shared Volumes have a read cache
- Carefully consider the potential risks of write caching. If acceptable, enable write caches. If your internal disks, DAS, SAN, or NAS has a battery backup system that can guarantee clean cache flushes on a power outage, write caching is generally safe. Internal batteries that report their status and/or automatically disable caching are best. UPS-backed systems are sometimes OK, but they are not foolproof.
- Prefer few arrays with many disks over many arrays with few disks.
- Unless you’re going to store VMs on a remote system, do not create an array just for Hyper-V. By that, I mean that if you’ve got six internal bays, do not create a RAID-1 for Hyper-V and a RAID-x for the virtual machines. That’s a Microsoft SQL Server 2000 design. This is 2019 and you’re building a Hyper-V server. Use all the bays in one big array.
- Do not architect your storage to make the hypervisor/management operating system go fast. I can’t believe how many times I read on forums that Hyper-V needs lots of disk speed. After boot-up, it needs almost nothing. The hypervisor remains resident in memory. Unless you’re doing something questionable in the management OS, it won’t even page to disk very often. Architect storage speed in favor of your virtual machines.
- Set your Fibre Channel SANs to use very tight WWN masks. Live Migration requires a handoff from one system to another, and the looser the mask, the longer that takes. With 2016 the guests shouldn’t crash, but the hand-off might be noticeable.
- Keep iSCSI/SMB networks clear of other traffic. I see a lot of recommendations to put each and every iSCSI NIC on a system into its own VLAN and/or layer-3 network. I’m on the fence about that advice. On one hand, a network storm in one iSCSI network might justify it. However, keeping those networks quiet would go a long way on its own. For clustered systems, multi-channel SMB needs each adapter to be on a unique layer 3 network (according to the docs; from what I can tell, it works even with same-net configurations).
- If using gigabit, try to physically separate iSCSI/SMB from your virtual switch. Meaning, don’t make that traffic endure the overhead of virtual switch processing if you can help it.
- Round robin MPIO might not be the best, although it’s the most recommended. If you have one of the aforementioned network storms, Round Robin will negate some of the benefits of VLAN/layer 3 segregation. I like least queue depth, myself.
- MPIO and SMB multi-channel are much faster than the best teaming.
- If you must run MPIO or SMB traffic across a team, create multiple virtual or logical NICs. That will give the teaming implementation more opportunities to create balanced streams.
- Use jumbo frames for iSCSI/SMB connections if everything supports it (host adapters, switches, and back-end storage). You’ll improve the header-to-payload bit ratio by a meaningful amount.
- Enable RSS on SMB-carrying adapters. If you have RDMA-capable adapters, absolutely enable that.
- Use dynamically expanding VHDX, but not dynamically expanding VHD. I still see people recommending fixed VHDX for operating system VHDXs, which is just absurd. Fixed VHDX is good for high-volume databases, but mostly because they’ll probably expand to use all the space anyway. Dynamic VHDX enjoys higher average write speeds because it completely ignores zero writes. No defined pattern has yet emerged that declares a winner on read rates, but people who say that fixed always wins are making demonstrably false assumptions.
- Do not use pass-through disks. The performance is sometimes a little bit better, but sometimes it’s worse, and it almost always causes some other problems elsewhere. The trade-off is not worth it. Just add one spindle to your array to make up for any perceived speed deficiencies. If you insist on using pass-through for performance reasons, then I want to see the performance traces of production traffic that prove it.
- Don’t let fragmentation keep you up at night. Fragmentation is a problem for single-spindle desktops/laptops, “admins” that never should have been promoted above first-line help desk, and salespeople selling defragmentation software. If you’re here to disagree, you better have a URL to performance traces that I can independently verify before you even bother entering a comment. I have plenty of Hyper-V systems of my own on storage ranging from 3-spindle up to >100-spindle, and the first time I even feel compelled to run a defrag (much less get anything out of it) I’ll be happy to issue a mea culpa. For those keeping track, we’re at 8 years and counting.
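To put a rough number on the jumbo frames tip above, here is a minimal sketch of the per-frame arithmetic. It assumes plain TCP over IPv4 over Ethernet with no VLAN tag and no TCP options, and it ignores iSCSI and SMB protocol headers entirely, so treat the output as an approximation rather than a measurement.

```python
# Per-frame byte counts for TCP over IPv4 over Ethernet.
# Assumptions: no VLAN tag, no IP or TCP options, iSCSI/SMB headers ignored.
WIRE_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, Ethernet header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20          # IPv4 header + TCP header, both carried inside the MTU


def payload_efficiency(mtu: int) -> float:
    """Fraction of on-the-wire bytes that carry application payload."""
    payload = mtu - IP_TCP_HEADERS
    return payload / (mtu + WIRE_OVERHEAD)


standard = payload_efficiency(1500)
jumbo = payload_efficiency(9000)
print(f"MTU 1500: {standard:.1%} payload")   # about 94.9%
print(f"MTU 9000: {jumbo:.1%} payload")      # about 99.1%
```

Under these assumptions the gain is roughly four percentage points of usable bandwidth. Jumbo frames also mean fewer frames to process per megabyte moved, which this sketch does not model.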
Memory Settings for Hyper-V Performance
There isn’t much that you can do for memory. Buy what you can afford and, for the most part, don’t worry about it.
- Buy and install your memory chips optimally. Multi-channel memory is somewhat faster than single-channel. Your hardware manufacturer will be able to help you with that.
- Don’t over-allocate memory to guests. Just because your file server had 16GB before you virtualized it does not mean that it has any use for 16GB.
- Use Dynamic Memory unless you have a system that expressly forbids it. It’s better to stretch your memory dollar farther than to wring your hands about whether or not Dynamic Memory is a good thing. Until directly proven otherwise for a given server, it’s a good thing.
- Don’t worry so much about NUMA. I’ve read volumes and volumes on it. I even spent a lot of time configuring it on a high-load system. Wrote some about it. I never got any of that time back. I’ve had some interesting conversations with people that really did need to tune NUMA. They constitute… oh, I’d say about .1% of all the conversations that I’ve ever had about Hyper-V. The rest of you should leave NUMA enabled at defaults and walk away.
Network Settings for Hyper-V Performance
Networking configuration can make a real difference to Hyper-V performance.
- Learn to live with the fact that gigabit networking is “slow” and that 10GbE networking often has barriers to reaching 10Gbps for a single test. Most networking demands don’t even bog down gigabit. It’s just not that big of a deal for most people.
- Learn to live with the fact that a) your four-spindle disk array can’t fill up even one 10GbE pipe, much less the pair that you assigned to iSCSI, and that b) it’s not Hyper-V’s fault. I know this doesn’t apply to everyone, but wow, do I see lots of complaints about how Hyper-V can’t magically pull or push bits across a network faster than a disk subsystem can read and/or write them.
- Disable VMQ on gigabit adapters. I think some manufacturers are finally coming around to the fact that they have a problem. Too late, though. The purpose of VMQ is to redistribute inbound network processing for individual virtual NICs away from CPU 0, core 0 to the other cores. Current-model CPUs are fast enough to handle many gigabit adapters.
- If you use a Hyper-V virtual switch on a network team and you’ve disabled VMQ on the physical NICs, disable it on the team adapter as well. I’ve been saying that since shortly after 2012 came out and people are finally discovering that I’m right, so, yay? Anyway, do it.
- Don’t worry so much about vRSS. RSS is like VMQ, only for non-VM traffic. vRSS, then, is the projection of VMQ down into the virtual machine. Basically, with traditional VMQ, the VMs’ inbound traffic is separated across pNICs in the management OS, but then each guest still processes its own data on vCPU 0. vRSS splits traffic processing across vCPUs inside the guest once it gets there. The “drawback” is that distributing processing and then redistributing processing costs more processing. So, you have a nicely distributed load, but you also have more overall processing. The upshot: almost no one will care either way. Set it or don’t set it, you probably can’t detect the difference in production. If you’re new to all of this, then you’ll find an “RSS” setting on the network adapter inside the guest. If that’s on in the guest (off by default) and VMQ is on and functioning in the host, then you have vRSS. woohoo.
- Don’t blame Hyper-V for your networking ills. I mention this in the context of performance because your time has value. I’m constantly called upon to troubleshoot Hyper-V “networking problems” because someone is sharing MACs or IPs or trying to get traffic from the dark side of the moon over a Cat-3 cable with three broken strands. Hyper-V is also frequently blamed by people that just don’t have a functional understanding of TCP/IP. More wasted time that I’ll never get back.
- Use one virtual switch. Multiple virtual switches cause processing overhead without providing returns. This is a guideline, not a rule, but you need to be prepared to provide an unflinching, sure-footed defense for every virtual switch in a host after the first.
- Don’t mix gigabit with 10 gigabit in a team. Teaming will not automatically select 10GbE over the gigabit. 10GbE is so much faster than gigabit that it’s best to just kill gigabit and converge on the 10GbE.
- Ten one-gigabit cards do not equal a single 10GbE card. I’m all for only using 10GbE when you can justify it with usage statistics, but gigabit just cannot compete.
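The point above about a four-spindle array not filling a 10GbE pipe comes down to simple arithmetic. The sketch below uses a per-disk sequential rate of 150 MB/s, which is purely an illustrative assumption; substitute your own measured figure. It also assumes the best case, where every spindle streams in parallel, and it ignores network protocol overhead.

```python
def link_mb_per_s(gigabits: float) -> float:
    """Raw line rate in megabytes per second (decimal units, protocol overhead ignored)."""
    return gigabits * 1000 / 8


def array_mb_per_s(spindles: int, per_disk_mb_per_s: float) -> float:
    """Best-case sequential throughput if every spindle streams in parallel."""
    return spindles * per_disk_mb_per_s


ASSUMED_DISK_MB_PER_S = 150.0  # hypothetical sequential rate for one spinning disk

array = array_mb_per_s(4, ASSUMED_DISK_MB_PER_S)
link = link_mb_per_s(10)
print(f"array: {array:.0f} MB/s, one 10GbE link: {link:.0f} MB/s, fill: {array / link:.0%}")
```

With those numbers, the array tops out at about half of a single 10GbE link, and random I/O would land far below that. The second iSCSI link adds nothing the disks can use.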
Maintenance Best Practices
Don’t neglect your systems once they’re deployed!
- Take a performance baseline when you first deploy a system and save it.
- Take and save another performance baseline when your system reaches a normative load level (basically, once you’ve reached its expected number of VMs).
- Keep drivers reasonably up-to-date. Verify that settings aren’t lost after each update.
- Monitor hardware health. The Windows Event Log often provides early warning symptoms, if you have nothing else.
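As one way to make the two baselines above comparable, the sketch below reduces a series of counter samples to a mean, a 90th percentile, and a maximum, then reports how much each moved. The sample values and function names are hypothetical; it assumes you have already exported a counter (CPU, disk latency, or similar) to a plain list of numbers by whatever tool you prefer.

```python
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct% of n
    return ordered[rank - 1]


def summarize(samples):
    """Reduce a series of counter samples to a few comparable figures."""
    return {"mean": mean(samples), "p90": percentile(samples, 90), "max": max(samples)}


def compare(baseline, current):
    """Change in each summary figure (current minus baseline)."""
    return {name: current[name] - baseline[name] for name in baseline}


# Hypothetical CPU utilization samples (percent) at deployment and at normal load.
at_deployment = summarize([2, 3, 2, 4, 3, 2, 5, 3, 2, 3])
at_normal_load = summarize([12, 15, 11, 18, 14, 13, 25, 16, 12, 14])
print(compare(at_deployment, at_normal_load))
```

Keeping the summaries alongside the raw captures makes it easier to tell, months later, whether a "slow" host has actually changed or only feels that way.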
If you carry out all (or as many as possible) of the above hardware adjustments, you will witness a considerable jump in your Hyper-V performance. That I can guarantee. However, for those who don’t have the time or patience, or aren’t prepared to make the necessary investment in some cases, Altaro has developed an e-book just for you. Find out more about it here: Supercharging Hyper-V Performance for the time-strapped admin.
Note: This article was originally published in August 2017. It has been fully updated to be relevant as of October 2019.
162 thoughts on "6 Hardware Tweaks that will Skyrocket your Hyper-V Performance"
Thanks for the comprehensive and thorough coverage, helps especially to get rid of a few features I was never really sure if I should invest time to uncover, that it has almost no use or just can be neglected, like vRSS on gigabit networking gear 🙂
One thing I still struggle with is that HP nowhere provides any best practices specifically for deploying Hyper-V on, say, a DL380 G9 server, even though the DL380 is by far the most used server line in the world.
While your writings are general recommendations, the specifics are of course different with every server type and model.
I really struggle, for example, with RAID 10 creation on an HP DL380 G9. The array will mainly run RDS server VMs and some low-end MS SQL VMs. Should I go with the default stripe size when creating the array or not? What is adequate for that scenario? Or does Hyper-V have a general recommendation, except maybe for high-load Exchange/SQL deployments, which always have specifics?
The next point is what allocation unit size I should choose for the NTFS-formatted disk that sits on that RAID 10 array. Is the general rule 4k for small files and 64k for few but big files, which is what Hyper-V generally stores?
And finally the VMs themselves. Does it matter what NTFS allocation size I choose inside the VHDX? Is an RDS VM best run with NTFS at 4k, and a SQL server’s C drive as well, but the D drive, which holds the db, at 64k and the E drive, which holds logs, at 4k again?
The above is just an example of what currently seems quite hard to get facts or at least some encouraging recommendations on. I know I could spend many hours finding out myself, but somehow I can’t believe that so few good recommendations exist in this area. I’m starting to believe what you often tell us in your blogs, that most admins overestimate their storage needs, since they never really tested them, except with a dumb file copy test, hehe.
And maybe it’s another example, where the answer is somewhere in between “it doesn’t matter so much” and “makes a real difference only with db based usage scenarios”…
Great comment. Thanks for taking the time!
Re: the DL380 — I don’t have one and I almost never use HP. I’m not opposed to writing an article on one, but I have literally zero experience to share.
I have not ever seen a practical upside to tinkering with stripe sizes or allocation units. I can’t say that it never matters, but I can say that adding more spindles or mixing in SSDs will pay better, quicker dividends than worrying about either of those things. Really, the important thing has always been to ensure that your byte alignment is correct, which it automatically has been ever since WS2008. After that, the performance differences are mostly theoretical. In practice, you would need to consistently drive your disks at high queue depths for any difference to even become measurable, and then you would still have many other factors in play. In order to tune for those loads, you’d need to have a thorough understanding of how your particular system picks up and lays down data.
An excellent guide, glad I took the time to read through it. I was unaware that there was an issue with VMQ – I’ve never had issues with it on the servers we run, but a quick check showed it enabled on almost everything (all 1GB NICs), so I’ve now disabled it universally (no point introducing a potential risk for zero benefit).
I have a question about fixed VHDX – there was always a recommendation to run fixed VHD’s for performance, and had assumed that applied to VHDX also. In the case where storage for a VM is allocated a full physical array (we have a couple of Windows fileserver VMs like this as well as a Linux video security system), is there any benefit to *not* creating a fixed VHDX? The drives/array are dedicated to the VM anyway, so there’s no requirement to share, and the VHDX will be allocated the full space regardless, so it seems like an either/or scenario where it doesn’t really matter either way (and I’d still err on the side of fixed, simply because it means less write overhead over time, even if it doesn’t affect performance).
Appreciate your thoughts 🙂
For VHD, fixed provided better performance. The difference peaked at a few percentage points, but the difference was there. For VHDX, the differences are all situational.
For an already-deployed full-utilization system like yours, the primary benefit of dynamic VHDX will be portability. If a dynamic VHDX outlives its physical storage system and has not yet fully expanded, moving to the newer system will be faster. Is it an important difference? Eh, unless it hovers at some low utilization percentage, probably not. Unless your system is damaged and you’re moving under duress, I figure that a Storage Live Migration can take as long as it wants. But, if I were to create a new system like that, I’d choose dynamic just so I wouldn’t have to sit there and do nothing while it zeroed out all that space.
It doesn’t really result in less write overhead, not to any meaningful degree. Allocate and zero the space in the beginning or allocate and zero it later; it’s all the same. There is the block allocation table, but that’s not enough of anything that it needs to be worried about.
Thanks for the detailed reply. Is there anything to be concerned about regarding dynamic VHDX and a Linux ext3/4 volume? Pretty sure the boot volume is dynamic, so hopefully not!
I use dynamic VHDX with ext4 on a regular basis and have had no problems.
Excellent, great to know. Thanks again Eric 🙂
Thanks for this write up. Just got finished with setting up four 2012 Hyper-V servers, and they each have one guest, and that solo guest on each was running woefully slow doing even basic things – logging into domain, copying files, etc…
Disabled VMQ on the hosts, and the guests all got perky real quick just from that change.
The problem with Hyper-V and hyperthreading is Hyper-V thinks they are actual cores when they are not. An example would be one 8-core processor with Hyperthreading (16 logical cores) on the host and then you configure one VM with 4 vCPU.
That single VM will only be able to use up to 25% of the total host’s CPU resources when you might expect it to be able to use up to 50%.
Either way it works out the same, so to use up to 50% you would have to give the VM 8vCPU. Now you are having multiple VM’s with crazy number of vCPU.
That’s not precisely true. Hyper-V is quite aware of the Hyperthreading situation. Also, assigning vCPU to a VM establishes no mapping to the underlying hardware. Your example seems to imply that assigning 4 vCPU to a VM on an 8/16 host would set up a 2 “real” 2 Hyperthread VM. In truth, it sets up a VM that can schedule a maximum of 4 active threads. The hypervisor thread scheduler will figure out where best to put guest processes when their time slice comes up. You can see which cores the distributor favors by watching the applicable counters under “Hyper-V Hypervisor Root Virtual Processor”.
But, if you’re talking about guaranteeing 50% through reservations, that changes things. On the one hand, the discussion seems mostly academic to me. In all but one of the cases where I’ve been told that I absolutely must set aside a huge amount of CPU or *insert apocalyptic threat here*, the application wound up topping out at 3% usage in the 90th percentile. I think that maybe one of the apps occasionally sat above 5% for a few minutes. In the other case, the app burned so much CPU that it never made sense to virtualize it in the first place. But, if you happen to really need that 50%, then I do not know how the underlying reservation calculations work, if it only does simple math to guarantee access to 50% of the logical cores, or if it does bigger math to figure out what 50% really means, or if it does something in-between. I might test that if I ever encounter a practical reason to know.
The situation we ran into is multiple Citrix servers running on multiple hosts. We found that we had complaints about the servers running slow at various times but we noticed our hosts never had high CPU utilization. Yes, instead of giving the virtual machine 4 vCPU we could have given it 8 vCPU to potentially solve the issue, but instead we decided to disable hyperthreading.
As a test, run one host with an 8-core processor with hyperthreading enabled and one virtual machine with 4 vCPU. You will find that the host utilization attributable to the guest never goes above 25%.
One way or the other, adjusting the guest number of processors or disabling hyperthreading resolved the issue but it was easier in our minds to assume if we had 24-cores and gave a vm 12vCPU, we can expect it to use up to 50% of the host resources if it were available.
Since hyper-threading only boosts the “real” core by a few percentage points, an 8 vCPU guest on a stressed non-HT host will outperform an 8 vCPU guest on an otherwise equivalent stressed HT host. Windows and Hyper-V are both aware of the nature of hyper-threading but still need to distribute threads where capacity exists. I’m almost at the point where I just want to turn off HT by default and be done with the charade.
How were you measuring host CPU? Just checking if you’re aware that Task Manager and most other common tools cannot see the virtualization load.
I would definitely consider de-fragging the disks both at the virtual level and at the OS level. This results in a double secret de-frag that will significantly boost your performance and restore your hair when done properly.
I anxiously await the forthcoming publication of your real-world equivalent testing methodology and performance traces that prove this. Or anyone’s. Ever.
Great article! I’ve been “messing around” with Hyper-V for a few years and have a good grip on a “basic” setup by now. These tips will definitely improve my future and existing configs. One simple question for you… I’ve always built physical servers with 1 logical drive for the OS(RAID1) and 1 for DATA(RAID10)… I’m ordering a pretty ridiculous HP DL380 server right now and want to build it right for Hyper-V. It’s going to be home to an Exchange 2016 VM and I have those requirements all set. But I also plan on throwing 1 or 2 other VMs on it so I’m trying to get these disks done right. I was originally going to use 2 480GB SSDs for the OS and then 6 1.6TB SSDs for DATA. I usually keep my VMs on this drive. How would you do it? Is my setup style outdated or not ideal for Hyper-V?
I would probably not do that, but I also would not criticize anyone that did. I don’t like to see the management OS languishing on SSDs while the VMs suffer on spinners. In your projected build, the VMs will live on SSDs as well, so no major foul.
I have not seen your performance traces so I cannot say with any certainty what I would do. Following my current practice, I would build one large array and use logical disks to separate the management OS from the VMs. Maybe you can get away with just using the 6x 1.6TB drives and tossing the smaller drives completely. Or, you might bump up to 8x 1.6TB.
If you want to split the arrays, then I would not use 480GB drives for the boot array. 250GB for a management OS drive should cover even the most paranoid concerns.
Great write up, and keeping things real!
Hey there, just one suggestion, as it took me five seconds to get it straight. Instead of:
“10-gigabit cards do not equal a single 10GbE card.”
consider:
“Ten 1-gigabit cards do not equal a single 10GbE card.”
I know, I’m overly focused on tiny details.
That’s odd, I remember spending time writing that line so that it would not confuse readers. No idea what happened. I’ve reworded it, hopefully it’s clearer now.
We are working on a relatively simple WS2019 deployment that will include domain, database and Hyper-V.
As expected, few questions come up regarding the following configuration (Host = OS running on the actual hardware):
Host – WS2019: AD, Hyper-V
Guest 1 (WS2019): DHCP/DNS/WINS/RAS
Guest 2 (WS2019): Database, File Server, Print Server
Multiple Client Guest (WIN10, hosted by Hyper-V, accessed via Remote Desktop)
1. Should the Hyper-V role run on the Host OS instance or virtualized on a nested WS2019 VM ?
2. Are there any recommended server roles that should be assigned to the Host OS compared to ones that should be assigned to a guest WS2019 VM ?
3. Regarding Hyper-V: we plan on storing the VHD/VHDX on a separate disk. Any reservations about storing the Configuration Files on the main Host disk where the main Host OS is stored ?
Any comments or references would be helpful. Thank you.
answered on Dojo forums: https://dojoforums.altaro.com/topic/server-roles-assignment-hyper-v/
Eric Siron, you are a really cool dude! I express my gratitude to you. I have read almost all of your articles, no one writes in more detail than you! Would like a detailed article on using RoCE in a failover cluster HV environment.