Did your software vendor indicate that you can virtualize their application, but only if you dedicate one or more CPU cores to it? Not clear on what happens when you assign CPUs to a virtual machine? You are far from alone.
Physical Processors are Never Assigned to Specific Virtual Machines
This is the most important note. Assigning 2 vCPUs to a system does not mean that Hyper-V plucks two cores out of the physical pool and permanently marries them to your virtual machine. I’ve seen IBM systems that do something like this, but I don’t believe that any other hypervisor does. Hyper-V certainly doesn’t. You can’t actually assign a physical core to a VM at all. So, does this mean that the vendor’s request to dedicate a core just can’t be met? Well, not exactly. More on that toward the end.
Start by Understanding Operating System Processor Scheduling
Let’s kick this off by looking at how CPUs are used in regular Windows. Here’s a shot of my Task Manager screen:
Nothing fancy, right? Looks familiar, right?
Now, back when computers never, or almost never, came in multi-CPU multi-core boxes, we all knew that computers couldn’t really multitask. They had one CPU and one core, so there was only one possible thread of execution. But aside from the fancy graphical updates, Task Manager then looked pretty much like Task Manager now. You had a long list of running processes, all of them with a metric indicating what percentage of the CPU’s time they were using.
Then, as now, each line item you see is a process (or, new in the recent Task Manager versions, a process group). A process might consist of one or many threads. A thread is nothing more than a sequence of CPU instructions (key word: sequence).
What happens is that (in Windows, this started in 95 and NT) the operating system would stop a running thread, preserve its state, and then start another thread. After a bit of time, it would repeat those operations for the next thread. Remember that this is pre-emptive, meaning that it is the operating system that decides when a new thread will run. The thread can beg for more, and you can set priorities that affect where a process goes in line, but the OS is in charge of thread scheduling.
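To make the stop, save, and switch cycle concrete, here is a minimal sketch of round-robin time slicing on a single core. It is a toy model, not how the Windows scheduler is actually implemented: the function name, the `quantum` parameter, and the thread names are all made up for illustration, and priorities are ignored entirely.

```python
from collections import deque

def round_robin(work, quantum):
    """Simulate one core time-slicing between threads (toy model).

    work:    dict of thread name -> units of CPU time the thread still needs
    quantum: longest stretch a thread may run before the OS pre-empts it
    Returns the execution order as a list of (thread, units_run) tuples.
    """
    queue = deque(work.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Pre-empted: the thread's state is preserved and it rejoins the line.
            queue.append((name, remaining - ran))
    return timeline

print(round_robin({"A": 3, "B": 1, "C": 2}, quantum=1))
```

Only one thread ever runs at a time, yet all three make progress, which is why a single-core machine could look like it was multitasking.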
The only difference today is that you have multiple cores and/or multiple CPUs in practically every system (as well as hyper-threading in Intel processors), so Windows can actually multi-task now.
Taking These Concepts to the Hypervisor
Because of its role as a thread manager, Windows can be called a “supervisor” (very old terminology that you really never see anymore): a system that manages processes that are made up of threads. Hyper-V is a hypervisor: a system that manages supervisors that manage processes that are made up of threads. Pretty easy to understand, right?
Task Manager doesn’t work the same way for Hyper-V, but the same thing is going on. There is a list of partitions, and inside those partitions are processes and threads. The thread scheduler works pretty much the same way. What follows is a rethought version of the original image that was submitted for the book, changed to avoid plagiarism:
Of course, there are always going to be a lot more than just nine threads going at any given time. They’ll be queued up in the thread scheduler.
What About Processor Affinity?
You probably know that you can affinitize threads in Windows so that they always run on a particular core or set of cores. As far as I know there’s no way to do that in Hyper-V with vCPUs. Doing so would be of questionable value anyway; dedicating a thread to a core is not the same thing as dedicating a core to a thread, which is what many people really want to try to do. You can’t prevent a core from running other threads in the Windows world.
How Does Thread Scheduling Work?
The simplest answer is that Hyper-V makes the decision at the hypervisor level and doesn’t really let the guests have any input. Guest operating systems decide which of their threads they wish to run. The image I presented is necessarily an oversimplification, as it’s not simple first-in-first-out. NUMA plays a role, for instance. Really understanding this topic requires a fairly deep dive into some complex ideas, and that level of depth is not really necessary for most administrators.
The first thing that matters is that (affinity aside) you never know where any given thread is going to actually execute. A thread that was paused to yield CPU time to another thread may very well be assigned to another core when it is resumed. Did you ever wonder why an application consumes right at 50% of a dual core system and each core looks like it’s running at 50% usage? That behavior indicates a single-threaded application. Each time it is scheduled, it consumes 100% of the core that it’s on. The next time it’s scheduled, it goes to the other core and consumes 100% there. When the performance is aggregated for Task Manager, that’s an even 50% utilization for the app. Since the cores are handing the thread off at each scheduling event and are mostly idle while the other core is running that app, they amount to 50% utilization for the measured time period. If you could reduce the period of measurement to capture individual time slices, you’d actually see the cores spiking to 100% and dropping to 0% (or whatever the other threads are using) in an alternating pattern.
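The 50% effect can be reproduced with a few lines of arithmetic. This is a deliberately simplified sketch: it assumes the thread moves to the next core at every single time slice, which a real scheduler does not guarantee, and the function name is invented for the example.

```python
def bounce_single_thread(slices, cores=2):
    """One always-busy thread that lands on the next core at each time slice.

    Returns (per-core utilization percentages, overall utilization percentage).
    """
    busy = [0] * cores
    for t in range(slices):
        # The thread saturates whichever core it is on for this slice.
        busy[t % cores] += 1
    per_core = [100 * b / slices for b in busy]
    overall = sum(per_core) / cores
    return per_core, overall

print(bounce_single_thread(1000))  # ([50.0, 50.0], 50.0)
```

Within any individual slice one core is at 100% and the other at 0%. The 50% figures only appear once the slices are averaged over the measurement window.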
What we’re really concerned with is the number of vCPUs assigned to a system and priority.
What Does the Number of vCPUs I Select Actually Mean?
You should first notice that you can’t assign more vCPUs to a virtual machine than you have logical processors in your host (the physical cores, or twice that number when Hyper-Threading is enabled).
So, a virtual machine’s CPU count means the maximum number of threads that it is allowed to operate on physical cores at any given time. I can’t set that virtual machine to have more than two vCPUs because the host only has two CPUs. Therefore, there is nowhere for a third thread to be scheduled. But, if I had a 24-core system and left this VM at 2 vCPUs, then it would only ever send a maximum of two threads up to the hypervisor for scheduling. Other threads would be kept in the guest’s thread scheduler (the supervisor), waiting their turn.
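Put as arithmetic, the vCPU count is just a cap on concurrency. The helper below is a hypothetical illustration of that cap, not anything Hyper-V exposes:

```python
def threads_on_cores(ready_threads, vcpus):
    """Split a guest's runnable threads into (sent to the hypervisor, waiting in the guest)."""
    running = min(ready_threads, vcpus)
    return running, ready_threads - running

# A 2 vCPU guest with 5 runnable threads, regardless of how many cores the host has.
print(threads_on_cores(5, 2))  # (2, 3)
```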
But Can’t You Assign More Total vCPUs to all VMs than Physical Cores?
Absolutely. Not only can you, but you’re almost definitely going to. It’s no different than the fact that I’ve got 40+ processes “running” on my dual core laptop right now. I can’t actually run more than two threads at a time, but I’m always going to have far more than two threads scheduled. Windows has been doing this for a very long time now, and Windows is so good at it (usually) that most people don’t even pause to consider just what’s going on. Your VMs (supervisors) will bubble up threads to run and Hyper-V (hypervisor) will schedule them the way (mostly) that Windows has been scheduling them ever since it outgrew cooperative scheduling in Windows 3.x.
What’s The Proper Ratio of vCPU to pCPU/Cores?
This is the question that’s on everyone’s mind. I’ll tell you straight: in the generic sense, this question has no answer.
Sure, way back when, people said 1:1. Some people still say that today. And you know, you can do it. It’s wasteful, but you can do it. I could run my current desktop configuration on a quad 16 core server and I’d never have any contention. But, I probably wouldn’t see much performance difference. Why? Because almost all my threads sit idle almost all the time. If something needs 0% CPU time, what does giving it its own core do? Nothing, that’s what.
Later, the answer was upgraded to 8 vCPUs per 1 physical core. OK, sure, good.
Then it became 12.
And then the recommendations went away.
They went away because they were dumb. I mean, it was probably a good rule of thumb that was built out of aggregated observations and testing, but really, think about it. You know that mostly, operating threads will be evenly distributed across whatever hardware is available. So then, the amount of physical CPUs needed doesn’t depend on how many virtual CPUs there are. It’s entirely dependent on what the operating threads need. And, even if you’ve got a bunch of heavy threads going, that doesn’t mean their systems will die as they get pre-empted by other heavy threads. It really is going to depend on how many other heavy threads they wait for.
I’m going to let you in on a dirty little secret about CPUs: Every single time a thread runs, no matter what it is, it drives the CPU at 100% (power-throttling changes the clock speed, not workload saturation). The CPU is a binary device; it’s either processing or it isn’t. The 100% or 20% or 50% or whatever number you see is completely dependent on a time measurement. If you see it at 100%, it means that the CPU was completely active across the measured span of time. 20% means it was running a process one-fifth of the time and sat idle for the other four-fifths. What this means is that a single thread can’t actually consume 100% of the CPU the way people think it can, because Windows/Hyper-V will pre-empt it when it’s another thread’s turn. You can actually have multiple “100%” CPU threads running on the same system. The problem is that a normally responsive system expects some idle time, meaning that some threads will simply let their time slice go by, freeing it up so other threads get CPU access more quickly. When you have multiple threads always queuing for active CPU time, the overall system becomes less responsive because the other threads have to wait longer for their turns. Using additional cores will address this concern as it spreads the workload out.
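If it helps, think of a utilization percentage as nothing more than a count of busy instants divided by total instants. The sketch below assumes we could sample a core at regular instants and record whether it was executing anything; the sampling scheme and the names are illustrative only.

```python
def utilization(samples):
    """samples: sequence of booleans, True if the core was executing at that instant."""
    return 100 * sum(samples) / len(samples)

# Busy for one instant out of every five across the measured window.
window = [True, False, False, False, False] * 20
print(utilization(window))  # 20.0
```

At every individual sample the core is either fully busy or fully idle. The 20% only exists at the level of the window.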
What this means is, if you really want to know how many physical cores you need, then you need to know what your actual workload is going to be. If you don’t know, then go with the 8:1 or 12:1, because you’ll probably be fine.
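If you do fall back on a rule of thumb, the check itself is simple division. This sketch uses made-up VM sizes and hypothetical function names. It only tells you where a host sits relative to 8:1 or 12:1, not whether the workload will actually be happy.

```python
def vcpu_ratio(vcpus_per_vm, physical_cores):
    """Total assigned vCPUs divided by physical cores in the host."""
    return sum(vcpus_per_vm) / physical_cores

def within_rule_of_thumb(vcpus_per_vm, physical_cores, max_ratio=8):
    """True if the host stays at or under the chosen vCPU:core ratio."""
    return vcpu_ratio(vcpus_per_vm, physical_cores) <= max_ratio

vms = [2, 2, 4, 4, 8, 4]              # vCPU counts of six example VMs
print(vcpu_ratio(vms, 4))             # 6.0
print(within_rule_of_thumb(vms, 4))   # True
```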
What About Reserve and Weighting (Priority)?
I don’t recommend you tinker with CPU settings unless you really understand what’s going on. Let the thread scheduler do its job. Just like setting CPU priorities on threads in Windows can get initiates into trouble in a hurry, fiddling with hypervisor vCPU settings can throw a wrench into the operations. In fact, I’ll confess that I haven’t spent a great deal of time testing it because I trust the hypervisor enough.
Let’s look at the config screen:
The first group of boxes is the reserve. The first box represents the percentage that I want to set, and its actual meaning depends on how many vCPUs I’ve given the VM. In this case, I have a 2 vCPU system on a dual core host, so the two boxes will be the same. If I set 10 percent reserve, that’s 10 percent of the total physical resources. If I drop this down to 1 vCPU, then 10 percent reserve becomes 5 percent physical. The second box, which is grayed out, will be calculated for you as you adjust the first box.
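The relationship between the two boxes can be sketched as a one-line formula. This is inferred from the behavior described above rather than taken from Hyper-V documentation, and the function and parameter names are my own.

```python
def reserve_of_host(reserve_percent, vcpus, host_cores):
    """Convert a per-VM reserve percentage into a percentage of total host CPU.

    reserve_percent: value typed in the first box
    vcpus:           vCPUs assigned to the VM
    host_cores:      cores (logical processors) in the host
    """
    return reserve_percent * vcpus / host_cores

print(reserve_of_host(10, vcpus=2, host_cores=2))  # 10.0
print(reserve_of_host(10, vcpus=1, host_cores=2))  # 5.0
```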
The reserve is a hard minimum… sort of. If the total of all reserve settings of all virtual machines on a given host exceeds 100%, then at least one virtual machine isn’t going to start. But, if a VM’s reserve is 0%, then it doesn’t count toward the 100% at all (seems pretty obvious, but you never know). But, if a VM with a 20% reserve is sitting idle, then other processes are allowed to use up to 100% of the available processor power… until such time as the VM with the reserve starts up. Then, once the CPU capacity is available, the reserved VM will be able to dominate up to 20% of the total computing power. Because time slices are so short, it’s effectively like it always has 20% available, but it does have to wait like everyone else.
So, that vendor that wants a dedicated CPU? If you really want to honor their wishes, this is how you do it. You enter whatever number in the top box that makes the second box the equivalent processor power of however many pCPUs/cores they think they need. If they want one whole CPU and you have a quad core host, then make the second box show 25%. Do you really have to? Well, I don’t know. Their software probably doesn’t need that kind of power, but if they can kick you off support for not listening to them, well… don’t get me in the middle of that. The real reason virtualization densities never hit what the hypervisor manufacturers say they can do is because of software vendors’ arbitrary rules, but that’s a rant for another day.
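Working backward from "the vendor wants N cores" to the numbers in the two boxes can be sketched like this. It assumes the second box is simply the VM's reserved share of total host CPU, as described above. The helper is hypothetical, so check the result against what the dialog actually shows.

```python
def reserve_for_cores(cores_wanted, vcpus, host_cores):
    """Return (percent to type in the top box, percent the second box should show)."""
    if cores_wanted > vcpus:
        raise ValueError("a VM cannot reserve more cores than it has vCPUs")
    vm_percent = 100 * cores_wanted / vcpus
    host_percent = 100 * cores_wanted / host_cores
    return vm_percent, host_percent

# One whole core for a 2 vCPU VM on a quad core host.
print(reserve_for_cores(1, vcpus=2, host_cores=4))  # (50.0, 25.0)
```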
The next two boxes are the limit. Now that you understand the reserve, you can understand the limit. It’s a resource cap. It keeps a greedy VM’s hands out of the cookie jar.
The final box is the weight. As indicated, this is relative. Every VM set to 100 (the default) has the same pull with the scheduler, but they’re all beneath all the VMs that have 200, so on and so forth. If you’re going to tinker, this is safer than fiddling with reserves because you can’t ever prevent a VM from starting by changing relative weights. What the weight means is that when a bunch of VMs present threads to the hypervisor thread scheduler at once, the higher weighted VMs go first. That’s it, that’s all.
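As a rough mental model only, you can picture the weight as a sort key applied when several VMs want the CPU at the same moment. The real scheduler is considerably more nuanced than this, and the VM names below are invented.

```python
def dispatch_order(vms):
    """vms: list of (name, weight) in arrival order.

    Higher weight goes first; VMs with equal weight keep their arrival order
    because Python's sort is stable.
    """
    return [name for name, _ in sorted(vms, key=lambda vm: -vm[1])]

print(dispatch_order([("web", 100), ("sql", 200), ("file", 100)]))
# ['sql', 'web', 'file']
```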
But What About Hyper-Threading?
If you want to know what Hyper-Threading is and how it functions, please check the comments section for a great explanation by Jordan. It’s much more accurate and clear than my explanation.
If you want to know how to plan for it, the official guideline is to not treat the second logical processor presented by Hyper-Threading as a true core. It does boost performance, but by no more than about 25% at the maximum.
Hyper-Threading in the host is exposed to guests. Some application vendors will require you to disable Hyper-Threading in hardware because there are situations where a thread’s execution can be very briefly halted due to issues with another thread, but in the absence of such specific directions or direct evidence that it is causing problems, leave it on.
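For capacity planning, one way to apply that guideline is to count physical cores and add at most the roughly 25% uplift as a best case. The 0.25 default below comes straight from the figure above and is an upper bound, not a measurement; the function itself is just an illustration.

```python
def effective_cores(physical_cores, hyperthreading=True, uplift=0.25):
    """Best-case planning figure: physical cores plus the Hyper-Threading uplift."""
    if hyperthreading:
        return physical_cores * (1 + uplift)
    return float(physical_cores)

# An 8-core host shows 16 logical processors, but plan for about 10 at best.
print(effective_cores(8))  # 10.0
```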
Hyper-Threading is an Intel-specific technology that lets a single core process two separate instructions in parallel (called pipelines). Neat, right? One problem: the pipelines run in lockstep. If the instruction in pipeline one finishes before the instruction in pipeline two, pipeline one sits and does nothing. But, that second pipeline shows up as another core. So the question is, do you count it? As far as I know, the official response is: No, Hyper-Threading should not be counted toward physical cores when considering hypervisor processing capabilities. Me, I’m a little more lenient. It’s not quite as good as another actual core, but it’s not useless either. Your mileage may vary.
A Note on Book Errata
As I mentioned in the January 2014 roundup article, my book, Microsoft Hyper-V Cluster Design, contains some errors that were introduced in the final edit phase. One of them was the illustration that I included on this subject (chapter 2). Apparently, they wanted to convert the words in the image to actual text items to grant copy/paste ability in the ebook version. Unfortunately, whoever did it didn’t notice that the labels weren’t all the same and just duplicated them in a way that made the graphic nonsensical. The text around it is OK and I stand by that, but the image is all sorts of wrong. The image used in this post provides a much more accurate presentation.