Few Hyper-V topics burn up the Internet quite like “performance”. No matter how fast it goes, we always want it to go faster. If you search even a little, you’ll find many articles with long lists of ways to improve Hyper-V’s performance. The less focused articles start with general Windows performance tips and sprinkle some Hyper-V-flavored spice on them. I want to use this article to tighten the focus down on Hyper-V hardware settings only. That means it won’t be as long as some others; I’ll just think of that as wasting less of your time.
1. Upgrade your system
I guess this goes without saying, but every performance article I write will always include this point front-and-center. Each piece of hardware has its own maximum speed. Where that speed barrier lies in comparison to other hardware in the same category almost always correlates directly with cost. You cannot tweak a go-kart to outrun a Corvette without spending at least as much money as just buying a Corvette, and that’s without considering the time element. If you bought slow hardware, then you will have a slow Hyper-V environment.
Fortunately, this point has a corollary: don’t panic. Production systems, especially server-class systems, almost never experience demand levels that compare to the stress tests that admins put on new equipment. If typical load levels were that high, it’s doubtful that virtualization would have caught on so quickly. We use virtualization for so many reasons nowadays, we forget that “cost savings through better utilization of under-loaded server equipment” was one of the primary drivers of early virtualization adoption.
2. BIOS Settings for Hyper-V Performance
Don’t neglect your BIOS! It contains some of the most important settings for Hyper-V.
- C States. Disable C States! Few things impact Hyper-V performance quite as strongly as C States! Names and locations will vary, so look in areas related to Processor/CPU, Performance, and Power Management. If you can’t find anything that specifically says C States, then look for settings that disable/minimize power management. C1E is usually the worst offender for Live Migration problems, although other modes can cause issues.
- Virtualization support: A number of features have popped up through the years, but most BIOS manufacturers have since consolidated them all into a global “Virtualization Support” switch, or something similar. I don’t believe that current versions of Hyper-V will even run if these settings aren’t enabled. Here are some individual component names, for those special BIOSs that break them out:
  - Virtual Machine Extensions (VMX)
  - AMD-V: AMD CPUs/mainboards. Be aware that Hyper-V can’t (yet?) run nested virtual machines on AMD chips
  - VT-x, or sometimes just VT: Intel CPUs/mainboards. Required for nested virtualization with Hyper-V in Windows 10/Server 2016
- Data Execution Prevention: DEP means less for performance and more for security. It’s also a requirement. But, we’re talking about your BIOS settings and you’re in your BIOS, so we’ll talk about it. Just make sure that it’s on. If you don’t see it under the DEP name, look for:
  - No Execute (NX): AMD CPUs/mainboards
  - Execute Disable (XD): Intel CPUs/mainboards
- Second Level Address Translation: I’m including this for completeness. It’s been many years since any system was built new without SLAT support. If you have one, following every point in this post to the letter still won’t make that system fast. Starting with Windows 8 and Server 2016, you cannot use Hyper-V without SLAT support. Names that you will see SLAT under:
  - Nested Page Tables (NPT)/Rapid Virtualization Indexing (RVI): AMD CPUs/mainboards
  - Extended Page Tables (EPT): Intel CPUs/mainboards
- Disable power management. This goes hand-in-hand with C States. Just turn off power management altogether. Get your energy savings via consolidation. You can also buy lower wattage systems.
- Use Hyperthreading. I’ve seen a tiny handful of claims that Hyperthreading causes problems on Hyper-V. I’ve heard more convincing stories about space aliens. I’ve personally seen the same number of space aliens as I’ve seen Hyperthreading problems with Hyper-V (that would be zero). If you’ve legitimately encountered a problem that was fixed by disabling Hyperthreading AND you can prove that it wasn’t a bad CPU, that’s great! Please let me know. But remember, you’re still in a minority of a minority of a minority. The rest of us will run Hyperthreading.
- Disable SCSI BIOSs. Unless you are booting your host from a SAN, kill the BIOSs on your SCSI adapters. It doesn’t do anything good or bad for a running Hyper-V host but slows down physical boot times.
- Disable BIOS-set VLAN IDs on physical NICs. Some network adapters support VLAN tagging through boot-up interfaces. If you then bind a Hyper-V virtual switch to one of those adapters, you could encounter all sorts of network nastiness.
3. Storage Settings for Hyper-V Performance
I wish the IT world would learn to cope with the fact that rotating hard disks do not move data very quickly. If you just can’t cope with that, buy a gigantic lot of them and make big RAID 10 arrays. Or, you could get a stack of SSDs. Don’t get six or so spinning disks and get sad that they “only” move data at a few hundred megabytes per second. That’s how the tech works.
Performance tips for storage:
- Learn to live with the fact that storage is slow.
- Remember that speed tests do not reflect real world load and that file copy does not test anything except permissions.
- Learn to live with Hyper-V’s I/O scheduler. If you want a computer system to have 100% access to storage bandwidth, start by checking your assumptions. Just because a single file copy doesn’t go as fast as you think it should, does not mean that the system won’t perform its production role adequately. If you’re certain that a system must have total and complete storage speed, then do not virtualize it. The only way that a VM can get that level of speed is by stealing I/O from other guests.
- Enable read caches
- Carefully consider the potential risks of write caching. If acceptable, enable write caches. If your internal disks, DAS, SAN, or NAS has a battery backup system that can guarantee clean cache flushes on a power outage, write caching is generally safe. Internal batteries that report their status and/or automatically disable caching are best. UPS-backed systems are sometimes OK, but they are not foolproof.
- Prefer few arrays with many disks over many arrays with few disks.
- Unless you’re going to store VMs on a remote system, do not create an array just for Hyper-V. By that, I mean that if you’ve got six internal bays, do not create a RAID-1 for Hyper-V and a RAID-x for the virtual machines. That’s a Microsoft SQL Server 2000 design. This is 2017 and you’re building a Hyper-V server. Use all the bays in one big array.
- Do not architect your storage to make the hypervisor/management operating system go fast. I can’t believe how many times I read on forums that Hyper-V needs lots of disk speed. After boot-up, it needs almost nothing. The hypervisor remains resident in memory. Unless you’re doing something questionable in the management OS, it won’t even page to disk very often. Architect storage speed in favor of your virtual machines.
- Set your fibre channel SANs to use very tight WWN masks. Live Migration requires a hand off from one system to another, and the looser the mask, the longer that takes. With 2016 the guests shouldn’t crash, but the hand-off might be noticeable.
- Keep iSCSI/SMB networks clear of other traffic. I see a lot of recommendations to put each and every iSCSI NIC on a system into its own VLAN and/or layer-3 network. I’m on the fence about that; a network storm in one iSCSI network would probably justify it. However, keeping those networks quiet would go a long way on its own. For clustered systems, multi-channel SMB needs each adapter to be on a unique layer 3 network (according to the docs; from what I can tell, it works even with same-net configurations).
- If using gigabit, try to physically separate iSCSI/SMB from your virtual switch. Meaning, don’t make that traffic endure the overhead of virtual switch processing, if you can help it.
- Round robin MPIO might not be the best, although it’s the most recommended. If you have one of the aforementioned network storms, Round Robin will negate some of the benefits of VLAN/layer 3 segregation. I like least queue depth, myself.
- MPIO and SMB multi-channel are much faster and more efficient than the best teaming.
- If you must run MPIO or SMB traffic across a team, create multiple virtual or logical NICs. It will give the teaming implementation more opportunities to create balanced streams.
- Use jumbo frames for iSCSI/SMB connections if everything supports it (host adapters, switches, and back-end storage). You’ll improve the header-to-payload bit ratio by a meaningful amount.
- Enable RSS on SMB-carrying adapters. If you have RDMA-capable adapters, absolutely enable that.
- Use dynamically-expanding VHDX, but not dynamically-expanding VHD. I still see people recommending fixed VHDX for operating system VHDXs, which is just absurd. Fixed VHDX is good for high-volume databases, but mostly because they’ll probably expand to use all the space anyway. Dynamic VHDX enjoys higher average write speeds because it completely ignores zero writes. No defined pattern has yet emerged that declares a winner on read rates, but people who say that fixed always wins are making demonstrably false assumptions.
- Do not use pass-through disks. The performance is sometimes a little bit better, but sometimes it’s worse, and it almost always causes some other problem elsewhere. The trade-off is not worth it. Just add one spindle to your array to make up for any perceived speed deficiencies. If you insist on using pass-through for performance reasons, then I want to see the performance traces of production traffic that prove it.
- Don’t let fragmentation keep you up at night. Fragmentation is a problem for single-spindle desktops/laptops, “admins” that never should have been promoted above first-line help desk, and salespeople selling defragmentation software. If you’re here to disagree, you better have a URL to performance traces that I can independently verify before you even bother entering a comment. I have plenty of Hyper-V systems of my own on storage ranging from 3-spindle up to >100 spindle, and the first time I even feel compelled to run a defrag (much less get anything out of it) I’ll be happy to issue a mea culpa. For those keeping track, we’re at 6 years and counting.
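To put a rough number on the jumbo frames tip above, here is a minimal sketch in Python. It assumes TCP over IPv4 with no options (40 bytes of headers inside the MTU) and 38 bytes of per-frame Ethernet overhead on the wire. Those figures and the function name are my assumptions for illustration, and iSCSI and SMB add their own protocol headers on top, so treat the results as a best case.

```python
# Assumed per-frame costs (illustrative, not measured on any particular network).
ETHERNET_OVERHEAD = 38  # preamble/SFD 8 + header 14 + FCS 4 + inter-frame gap 12
TCP_IP_HEADERS = 40     # IPv4 header 20 + TCP header 20, no options


def payload_efficiency(mtu: int) -> float:
    """Return the fraction of on-the-wire bytes that carry payload."""
    payload = mtu - TCP_IP_HEADERS
    wire_bytes = mtu + ETHERNET_OVERHEAD
    return payload / wire_bytes


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {payload_efficiency(mtu):.2%} of wire bytes are payload")
```

Under these assumptions, the standard 1500-byte MTU spends roughly 5% of the wire on headers and framing, while a 9000-byte MTU spends under 1%.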
4. Memory Settings for Hyper-V Performance
There isn’t much that you can do for memory. Buy what you can afford and, for the most part, don’t worry about it.
- Buy and install your memory chips optimally. Multi-channel memory is somewhat faster than single-channel. Your hardware manufacturer will be able to help you with that.
- Don’t over-allocate memory to guests. Just because your file server had 16GB before you virtualized it does not mean that it has any use for 16GB.
- Use Dynamic Memory unless you have a system that expressly forbids it. It’s better to stretch your memory dollar farther than wring your hands about whether or not Dynamic Memory is a good thing. Until directly proven otherwise for a given server, it’s a good thing.
- Don’t worry so much about NUMA. I’ve read volumes and volumes on it. Even spent a lot of time configuring it on a high-load system. Wrote some about it. Never got any of that time back. I’ve had some interesting conversations with people that really did need to tune NUMA. They constitute… oh, I’d say about .1% of all the conversations that I’ve ever had about Hyper-V. The rest of you should leave NUMA enabled at defaults and walk away.
5. Network Settings for Hyper-V Performance
Networking configuration can make a real difference to Hyper-V performance.
- Learn to live with the fact that gigabit networking is “slow” and that 10GbE networking often has barriers to reaching 10Gbps for a single test. Most networking demands don’t even bog down gigabit. It’s just not that big of a deal for most people.
- Learn to live with the fact that a) your four-spindle disk array can’t fill up even one 10GbE pipe, much less the pair that you assigned to iSCSI and that b) it’s not Hyper-V’s fault. I know this doesn’t apply to everyone, but wow, do I see lots of complaints about how Hyper-V can’t magically pull or push bits across a network faster than a disk subsystem can read and/or write them.
- Disable VMQ on gigabit adapters. I think some manufacturers are finally coming around to the fact that they have a problem. Too late, though. The purpose of VMQ is to redistribute inbound network processing for individual virtual NICs away from CPU 0, core 0 to the other cores in the system. Current-model CPUs are fast enough that they can handle many gigabit adapters.
- If you are using a Hyper-V virtual switch on a network team and you’ve disabled VMQ on the physical NICs, disable it on the team adapter as well. I’ve been saying that since shortly after 2012 came out and people are finally discovering that I’m right, so, yay? Anyway, do it.
- Don’t worry so much about vRSS. RSS is like VMQ, only for non-VM traffic. vRSS, then, is the projection of VMQ down into the virtual machine. Basically, with traditional VMQ, the VMs’ inbound traffic is separated across pNICs in the management OS, but then each guest still processes its own data on vCPU 0. vRSS splits traffic processing across vCPUs inside the guest once it gets there. The “drawback” is that distributing processing and then redistributing processing causes more processing. So, the load is nicely distributed, but it’s also higher than it would otherwise be. The upshot: almost no one will care. Set it or don’t set it, it’s probably not going to impact you a lot either way. If you’re new to all of this, then you’ll find an “RSS” setting on the network adapter inside the guest. If that’s on in the guest (off by default) and VMQ is on and functioning in the host, then you have vRSS. woohoo.
- Don’t blame Hyper-V for your networking ills. I mention this in the context of performance because your time has value. I’m constantly called upon to troubleshoot Hyper-V “networking problems” because someone is sharing MACs or IPs or trying to get traffic from the dark side of the moon over a Cat-3 cable with three broken strands. Hyper-V is also almost always blamed by people that just don’t have a functional understanding of TCP/IP. More wasted time that I’ll never get back.
- Use one virtual switch. Multiple virtual switches cause processing overhead without providing returns. This is a guideline, not a rule, but you need to be prepared to provide an unflinching, sure-footed defense for every virtual switch in a host after the first.
- Don’t mix gigabit with 10 gigabit in a team. Teaming will not automatically select 10GbE over the gigabit. 10GbE is so much faster than gigabit that it’s best to just kill gigabit and converge on the 10GbE.
- 10x gigabit cards do not equal 1x 10GbE card. I’m all for only using 10GbE when you can justify it with usage statistics, but gigabit just cannot compete.
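The point above about a four-spindle array and a 10GbE pipe comes down to simple arithmetic. The following Python sketch works it through. The 150 MB/s per-spindle figure is a hypothetical best-case sequential rate that I picked for illustration, and the function names are mine. RAID penalties and random I/O would only make the disks look worse.

```python
def link_capacity_mb_per_s(link_gbps: float) -> float:
    """Convert a nominal link speed in Gbps to MB/s (decimal units)."""
    return link_gbps * 1000 / 8


def array_throughput_mb_per_s(spindles: int, mb_per_s_per_spindle: float) -> float:
    """Best-case aggregate throughput, ignoring RAID and seek overhead."""
    return spindles * mb_per_s_per_spindle


def link_fill_fraction(spindles: int, mb_per_s_per_spindle: float,
                       link_gbps: float) -> float:
    """Fraction of the link that the array could fill at best."""
    return (array_throughput_mb_per_s(spindles, mb_per_s_per_spindle)
            / link_capacity_mb_per_s(link_gbps))


if __name__ == "__main__":
    # Hypothetical: four spindles at 150 MB/s each against one 10GbE link.
    print(f"{link_fill_fraction(4, 150, 10):.0%} of one 10GbE link")
```

Even with that generous per-disk number, four spindles fill less than half of a single 10GbE link.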
6. Maintenance Best Practices
Don’t neglect your systems once they’re deployed!
- Take a performance baseline when you first deploy a system and save it.
- Take and save another performance baseline when your system reaches a normative load level (basically, once you’ve reached its expected number of VMs).
- Keep drivers reasonably up-to-date. Verify that settings aren’t lost after each update.
- Monitor hardware health. The Windows Event Log often provides early warning symptoms, if you have nothing else.
If you carry out all (or as many as possible) of the above hardware adjustments, you will witness a considerable jump in your Hyper-V performance. That I can guarantee. However, for those who don’t have the time or patience, or who aren’t prepared to make the necessary investment in some cases, Altaro has developed an e-book just for you. Find out more about it here: Supercharging Hyper-V Performance for the time-strapped admin.
One of my very first jobs performing server support on a regular basis was heavily focused on backup. I witnessed several heart-wrenching tragedies of permanent data loss but, fortunately, played the role of data savior much more frequently. I know that most, if not all, of the calamities could have at least been lessened had the data owners been more educated on the subject of backup. I believe very firmly in the value of a solid backup strategy, which I also believe can only be built on the basis of a solid education in the art. This article’s overarching goal is to give you that education by serving a number of purposes:
- Explain industry-standard terminology and how to apply it to your situation
- Address and wipe away 1990s-style approaches to backup
- Clearly illustrate backup from a Hyper-V perspective
Whenever possible, I avoid speaking in jargon and TLAs/FLAs (three-/four-letter acronyms) unless I’m talking to a peer that I’m certain has the experience to understand what I mean. When you start exploring backup solutions, you will have these tossed at you rapid-fire with, at most, brief explanations. If you don’t understand each and every one of the following, stop and read those sections before proceeding. If you’re lucky enough to be working with an honest salesperson, it’s easy for them to forget that their target audience may not be completely following along. If you’re less fortunate, it’s simple for a dishonest salesperson to ridiculously oversell backup products through scare tactics that rely heavily on your incomplete understanding.
- Full/incremental/differential backup
- Bare-metal backup/restore (BMB/BMR)
- Disaster Recovery/Business Continuity
- Recovery point objective (RPO)
- Recovery time objective (RTO)
- Rotation — includes terms such as Grandfather-Father-Son (GFS)
There are a number of other terms that you might encounter, although these are the most important for our discussion. If you encounter a vendor making up their own TLAs/FLAs, take a moment to investigate their meaning in comparison to the above. Most are just marketing tactics — inherently harmless attempts by a business entity trying to turn a coin by promoting its products. Some are more nefarious — attempts to invent a nail for which the company just conveniently happens to provide the only perfectly matching hammer (with an extra “value-added” price, of course).
Backup
This heading might seem pointless. Doesn’t everyone know what a backup is? In my experience, no. In order to qualify as a backup, you must have a distinct, independent copy of data. A backup cannot have any reliance on the health or well-being of its source data or the media that contains that data. Otherwise, it is not a true backup.
Full/Incremental/Differential Backup
Recent technology changes and their attendant strategies have made these terms somewhat less popular than in past decades, but they are still important to understand because they are still in widespread use. They are presented as a package because they make the most sense when compared to each other. So, I’ll give you a brief explanation of each and then launch into a discussion.
- Full Backups: Full backups are the easiest to understand. They are a point-in-time copy of all target data.
- Differential Backups: A differential backup is a point-in-time copy of all data that is different from the last full backup that is its parent.
- Incremental Backups: An incremental backup is a point-in-time copy of all data that is different from the backup that is its parent.
The full backup is the safest type because it is the only one of the three that can stand alone in any circumstances. It is a complete copy of whatever data has been selected.
A differential backup is the next safest type. Remember the following:
- To fully restore the latest data, a differential backup always requires two backups: the latest full backup and the latest differential backup. Intermediary differential backups, if any exist, are not required.
- It is not necessary to restore from the most recent differential backup if an earlier version of the data is required.
- Depending on what data is required and the intelligence of the backup application, it may not be necessary to have both backups available to retrieve specific items.
The following is an illustration of what a differential backup looks like:
Each differential backup goes all the way back to the latest full backup as its parent. Also, notice that each differential backup is slightly larger than the preceding differential backup. This phenomenon is conventional wisdom on the matter. In theory, each differential backup contains the previous backup’s changes as well as any new changes. In reality, it truly depends on the change pattern. A file backed up on Monday might have been deleted on Tuesday, so that part of the backup certainly won’t be larger. A file that changed on Tuesday might have had half its contents removed on Wednesday, which would make that part of the backup smaller. A differential backup can range anywhere from essentially empty (if nothing changed) to as large as the source data (if everything changed). Realistically, you should expect each differential backup to be slightly larger than the previous.
The following is an illustration of an incremental backup:
Incremental backups are best thought of as a chain. The above shows a typical daily backup in an environment that uses a weekly full with daily incrementals. If all data is lost and restoring to Wednesday’s backup is necessary, then every single night’s backup from Sunday onward will be necessary. If any one is missing or damaged, then it will likely not be possible to retrieve anything from that backup or any backup afterward. Therefore, incremental backups are the riskiest; they are also the fastest and consume the least amount of space.
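The restore requirements for differential and incremental backups can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the history is in chronological order, a given week uses either differentials or incrementals (not a mix), and the `restore_chain` name and the `"full"`/`"diff"`/`"incr"` labels are hypothetical.

```python
from typing import List, Tuple

Backup = Tuple[str, str]  # (day label, kind) where kind is "full", "diff", or "incr"


def restore_chain(history: List[Backup], target_day: str) -> List[str]:
    """Return the backups needed, oldest first, to restore to target_day."""
    days = [day for day, _ in history]
    target = days.index(target_day)  # raises ValueError for an unknown day
    chain = []
    # Walk backward from the target until the parent full backup is found.
    for day, kind in reversed(history[: target + 1]):
        if kind == "full":
            chain.append(day)
            return list(reversed(chain))
        # Every incremental in the chain is required; for differentials,
        # only the target itself is required (chain is still empty then).
        if kind == "incr" or not chain:
            chain.append(day)
    raise ValueError("no full backup at or before " + target_day)


if __name__ == "__main__":
    week_incr = [("Sun", "full"), ("Mon", "incr"), ("Tue", "incr"), ("Wed", "incr")]
    week_diff = [("Sun", "full"), ("Mon", "diff"), ("Tue", "diff"), ("Wed", "diff")]
    print(restore_chain(week_incr, "Wed"))
    print(restore_chain(week_diff, "Wed"))
```

With a Sunday full, restoring to Wednesday needs every backup from Sunday onward in the incremental scheme, but only Sunday and Wednesday in the differential scheme.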
Historically, full/incremental/differential backups have been facilitated by an archive bit in Windows. Anytime a file is changed, Windows sets its archive bit. The backup types operate with this behavior:
- A full backup captures all target files and clears any archive bits that it finds.
- A differential backup captures only target files that have their archive bit set and it leaves the bit in the state that it found it.
- An incremental backup captures only files with the archive bit set and clears it afterward.
Archive Bit Example
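The archive bit rules are easy to model. The sketch below is a toy simulation in Python, not how any real backup product is written. A dictionary stands in for the file system, `True` stands in for a set archive bit, and the `modify` and `run_backup` names are hypothetical.

```python
from typing import Dict, List


def modify(files: Dict[str, bool], name: str) -> None:
    """Simulate Windows setting the archive bit when a file changes."""
    files[name] = True


def run_backup(files: Dict[str, bool], kind: str) -> List[str]:
    """Return the files captured and update archive bits per backup type."""
    if kind == "full":
        captured = sorted(files)  # everything, regardless of the bit
        clear_bits = True
    elif kind == "differential":
        captured = sorted(n for n, bit in files.items() if bit)
        clear_bits = False        # leaves the bit as it found it
    elif kind == "incremental":
        captured = sorted(n for n, bit in files.items() if bit)
        clear_bits = True
    else:
        raise ValueError("unknown backup type: " + kind)
    if clear_bits:
        for name in captured:
            files[name] = False
    return captured


if __name__ == "__main__":
    files = {"a.docx": True, "b.xlsx": True, "c.txt": True}
    print(run_backup(files, "full"))          # all three files
    modify(files, "a.docx")
    print(run_backup(files, "differential"))  # a.docx, bit stays set
    modify(files, "b.xlsx")
    print(run_backup(files, "differential"))  # a.docx and b.xlsx
    print(run_backup(files, "incremental"))   # a.docx and b.xlsx, bits cleared
    print(run_backup(files, "incremental"))   # nothing left to capture
```

Because a differential never clears the bit, each one keeps picking up everything changed since the last full (or incremental) backup. That is why differentials tend to grow and incrementals stay small.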
Delta
“Delta” is probably the most overthought word in all of technology. It means “difference”. Do not analyze it beyond that. It just means “difference”. If you have $10 in your pocket and you buy an item for $8, the $8 that you spent is the “delta” between the amount of money that you had before you made the purchase and the amount of money that you have now.
The way that vendors use the term “delta” sometimes changes, but usually not by a great deal. In the earliest use of “delta” for backups that I am aware of, it meant intra-file changes. All previous backup types operated with individual files being the smallest level of granularity (not counting specialty backups such as Exchange item-level). Delta backups would analyze the blocks of individual files, making the granularity one step finer.
The following image illustrates the delta concept:
A delta backup is essentially an incremental backup, but at the block level instead of the file level. Somebody got the clever idea to use the word “delta”, probably so that it wouldn’t be confused with “differential”, and the world thought it must mean something extra special because it’s Greek.
The major benefit of delta backups is that they use much less space than even incremental backups. The trade-off is in the computing power to calculate deltas. The archive bit can tell the backup application whether a file needs to be scanned, but it cannot tell it which blocks changed. Backup systems that perform delta operations require some other method for change tracking.
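To make block-level granularity concrete, here is a minimal Python sketch that finds changed blocks by comparing hashes of the old and new versions of a file. This brute-force comparison is my stand-in for illustration. Real products generally rely on some form of change tracking so that they do not have to re-read everything. The 4 KB block size and function names are hypothetical.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # hypothetical block size for this example


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return indexes of blocks in `new` that differ from, or do not exist in, `old`.

    Truncation (a shorter `new`) is not reported; a real tool would track it.
    """
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        index
        for index, digest in enumerate(new_hashes)
        if index >= len(old_hashes) or old_hashes[index] != digest
    ]


if __name__ == "__main__":
    old = b"A" * (BLOCK_SIZE * 4)
    new = old[:BLOCK_SIZE] + b"B" * BLOCK_SIZE + old[2 * BLOCK_SIZE:]
    print(changed_blocks(old, new))  # only the second block changed
```

A file-level incremental would copy all four blocks of that file; a delta backup only needs the one that changed.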
Deduplication
Deduplication represents the latest iteration of backup innovation. The term explains itself quite nicely. The backup application searches for identical blocks of data and reduces them to a single copy.
Deduplication involves three major feats:
- The algorithm that discovers duplicate blocks must operate in a timely fashion
- The system that tracks the proper location of duplicated blocks must be foolproof
- The system that tracks the proper location of duplicated blocks must use significantly less storage than simply keeping the original blocks
So, while deduplication is conceptually simple, implementations can depend upon advanced computer science.
Deduplication’s primary benefit is that it can produce backups that are even smaller than delta systems. Part of that will depend on the overall scope of the deduplication engine. If you were to run fifteen new Windows Server 2016 virtual machines through even a rudimentary deduplicator, it would reduce all of them to the size of approximately a single Windows Server 2016 virtual machine — a 93% savings.
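The fifteen-VM figure can be reproduced with a toy deduplicator. The Python sketch below keeps one copy of each unique block, keyed by its SHA-256 digest, plus an ordered “recipe” for rebuilding the original stream. It assumes equal-size blocks and fifteen byte-for-byte identical images, which is an idealization, and it ignores the storage used by the index itself. The function names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def deduplicate(blocks: List[bytes]) -> Tuple[Dict[str, bytes], List[str]]:
    """Store each unique block once and record the order needed to rebuild."""
    store: Dict[str, bytes] = {}  # digest -> the single stored copy
    recipe: List[str] = []        # digests in original block order
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe


def rebuild(store: Dict[str, bytes], recipe: List[str]) -> List[bytes]:
    """Reconstruct the original block stream from the store and recipe."""
    return [store[digest] for digest in recipe]


def savings(store: Dict[str, bytes], recipe: List[str]) -> float:
    """Fraction of blocks that did not need to be stored (equal-size blocks)."""
    return 1 - len(store) / len(recipe)


if __name__ == "__main__":
    # One idealized VM image of 100 distinct 4 KB blocks, copied fifteen times.
    vm_image = [i.to_bytes(4, "big") * 1024 for i in range(100)]
    all_blocks = vm_image * 15
    store, recipe = deduplicate(all_blocks)
    assert rebuild(store, recipe) == all_blocks
    print(f"{savings(store, recipe):.0%} savings")
```

Note that the `store` dictionary also illustrates the risk: lose one entry, and every backup that references that digest can no longer be rebuilt.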
There is risk in overeager implementations, however. With all data blocks represented by a single copy, each block becomes a single point of failure. The loss of a single vital block could spell disaster for a backup set. This risk can be mitigated by employing a single pre-existing best practice: always maintain multiple backups.
Inconsistent/Crash-Consistent/Application-Consistent Backup
We already have an article set that explores these terms in some detail. Quickly:
- Inconsistent backups would be effectively the same thing as performing a manual file copy of a directory tree.
- Crash-consistent backup captures data as it sits on the storage volume at a given point in time, but cannot touch anything passing through the CPU or waiting in memory. You could lose any in-flight I/O operations.
- Application-consistent backup coordinates with the operating system and, where possible, individual applications to ensure that in-flight I/Os are flushed to disk so that there are no active file changes at the moment that the backup is taken.
I occasionally see people twisting these terms around, although I believe that’s mostly accidental. The definitions that I used above have been the most common, stretching back into the 90s. Be aware that there are some disagreements, so ensure that you clarify terminology with any salespeople.
Bare-Metal Backup/Restore
A so-called “bare-metal backup” and/or “bare metal restore” involves capturing the entirety of a storage unit including metadata portions such as the boot sector. These backup/restore types essentially mean that you could restore data to a completely empty physical system without needing to install an operating system and/or backup agent on it first.
Disaster Recovery/Business Continuity
The terms “Disaster Recovery” (DR) and “Business Continuity” are often used somewhat interchangeably in marketing literature. “Disaster Recovery” is the older term and more accurately reflects the nature of the involved solutions. “Business Continuity” is a newer, more exciting version that sounds more positive but mostly means the same thing. These two terms encompass not just restoring data, but restoring the organization to its pre-disaster state. “Business Continuity” is used to emphasize the notion that, with proper planning, disasters can have little to no effect on your ability to conduct normal business. Of course, the more “continuous” your solution is, the higher your costs are. That’s not necessarily a bad thing, but it must be understood and expected.
One thing that I really want to make very clear about disaster recovery and/or business continuity is that these terms extend far beyond just backing up and restoring your data. DR plans need to include downtime procedures, phone trees, alternative working sites, and a great deal more. You need to think all the way through a disaster from the moment that one occurs to the moment that everything is back to some semblance of normal.
Recovery Point Objective
The maximum acceptable span of time between the latest backup and a data loss event is called a recovery point objective (RPO). If the words don’t sound very much like their definition, that’s because someone worked really hard to couch a bad situation within a somewhat neutral term. If it helps, the “point” in RPO means “point in time.” Of all data adds and changes, anything that happens between backup events has the highest potential of being lost. Many technologies have some sort of fault tolerance built in; for instance, if your domain controller crashes and it isn’t due to a completely failed storage subsystem, you’re probably not going to need to go to backup. Most other databases can tell a similar story. RPOs mostly address human error and disaster. More common failures should be addressed by technology branches other than backup, such as RAID.
A long RPO means that you are willing to lose a greater span of time. A daily backup gives you a 24-hour RPO. Taking backups every two hours results in a 2-hour RPO. Remember that an RPO represents a maximum. It is highly unlikely that a failure will occur immediately prior to the next backup operation.
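The RPO arithmetic is simple enough to sketch in Python. The worst case follows directly from the backup interval. The average figure additionally assumes evenly spaced backups and a failure that is equally likely at any moment, which is my simplification rather than a rule. The function names are hypothetical.

```python
def worst_case_loss_hours(backups_per_day: int) -> float:
    """The RPO: the longest possible gap between the last backup and a failure."""
    return 24 / backups_per_day


def average_loss_hours(backups_per_day: int) -> float:
    """Expected loss if a failure is equally likely at any moment (assumption)."""
    return worst_case_loss_hours(backups_per_day) / 2


if __name__ == "__main__":
    for per_day in (1, 12):
        print(per_day, worst_case_loss_hours(per_day), average_loss_hours(per_day))
```

A daily backup yields a 24-hour RPO and backups every two hours yield a 2-hour RPO, while the typical loss under these assumptions is about half of that.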
Recovery Time Objective
Recovery time objective (RTO) represents the maximum amount of time that you are willing to wait for systems to be restored to a functional state. This term sounds much more like its actual meaning than RPO. You need to take extra care when talking with backup vendors about RTO. They will tend to only talk about RTO in terms of restoring data to a replacement system. If your primary site is your only site and you don’t have a contingency plan for complete building loss, your RTO is however long it takes to replace that building, fill it with replacement systems, and restore data to those systems. Somehow, I suspect that a six-month or longer RTO is unacceptable for most institutions. That is one reason that DR planning must extend beyond taking backups.
In more conventional usage, RTOs will be explained as though there is always a target system ready to receive the restored data. So, if your backup drives are taken offsite to a safety deposit box by the bookkeeper when she comes in at 8 AM, your actual recovery time is essentially however long it takes someone to retrieve the backup drive plus the time needed to perform a restore in your backup application.
Retention
Retention is the desired amount of time that a backup should be kept. This deceptively simple description hides some complexity. Consider the following:
- Legislation mandates a ten-year retention policy on customer data for your industry. A customer was added in 2007. Their address changed in 2009. Must the customer’s data be kept until 2017 or 2019?
- Corporate policy mandates that all customer information be retained for a minimum of five years. The line-of-business application that you use to record customer information never deletes any information that was placed into it and you have a copy of last night’s data. Do you need to keep the backup copy from five years ago or is having a recent copy of the database that contains five-year-old data sufficient?
Questions such as these can plague you. Historically, monthly and annual backup tapes were simply kept for a specific minimum number of years and then discarded, which more or less answered the question for you. Tape is an expensive solution, however, and many modern small businesses do not use it. Furthermore, laws and policies only dictate that the data be kept; nothing forced anyone to ensure that the backup tapes were readable after any specific amount of time. One lesson that many people learn the hard way is that tapes stored flat can lose data after a few years. We used to joke with customers that their bits were sliding down the side of the tape. I don’t actually understand the governing electromagnetic phenomenon, but I can verify that it does exist.
With disk-based backups, the possibilities change somewhat. People typically do not keep stacks of backup disks lying around, and a disk’s ability to hold data for long periods differs from that of backup tape. The rules are different: some disks will outlive tape, others will not.
Backup rotations deal with the media used to hold backup information. This has historically meant tape, and tape rotations often came in some very grandiose schemes. One of the most widely used rotations is called “Grandfather-Father-Son” (GFS):
- One full backup is taken monthly. The media it is taken on is kept for an extended period of time, usually one year. One of these is often considered an annual and kept longer. This backup is called the “Grandfather”.
- Each week thereafter, on the same day, another full backup is taken. This media is usually rotated so that it is re-used once per month. This backup is known as the “Father”.
- On every day between full backups, an incremental backup is taken. Each day’s media is rotated so that it is re-used on the same day each week. This backup is known as the “Son”.
The purpose of rotation is to have enough backups to provide sufficient possible restore points to guard against a myriad of possible data loss instances without using so much media that you bankrupt yourself and run out of physical storage room. Grandfathers are taken offsite and placed in long-term storage. Fathers are taken offsite, but perhaps not placed in long-term storage so that they are more readily accessible. Sons are often left onsite, at least for a day or two, to facilitate rapid restore operations.
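As a rough illustration of the GFS schedule described above, the following sketch labels a calendar date with the type of backup that would run on it. It assumes, purely for the example, that full backups run on Sundays and that the first Sunday of each month is the monthly backup. The function name Get-GfsBackupType is hypothetical and not part of any backup product.

```powershell
# Assumptions for illustration only: full backups run on Sundays, and the
# first Sunday of each month is the monthly (Grandfather) backup.
function Get-GfsBackupType {
    param([Parameter(Mandatory)][datetime]$Date)

    if ($Date.DayOfWeek -ne [System.DayOfWeek]::Sunday) {
        return 'Son'          # daily incremental; media re-used on the same weekday each week
    }
    if ($Date.Day -le 7) {
        return 'Grandfather'  # first Sunday of the month; media kept long-term
    }
    return 'Father'           # remaining weekly fulls; media re-used once per month
}

# Label two weeks of dates, starting from an arbitrary example Sunday.
$start = [datetime]'2024-09-01'
0..13 | ForEach-Object {
    $day = $start.AddDays($_)
    '{0:yyyy-MM-dd ddd}  {1}' -f $day, (Get-GfsBackupType -Date $day)
}
```

Even in this toy form, the schedule needs three media pools with three different re-use intervals, which hints at why such schemes are hard to hand to untrained staff.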
Replacing Old Concepts with New Best Practices
Some backup concepts are simply outdated, especially for the small business. Tape used to be the only feasible mass storage device that could be written and rewritten on a daily basis and was sufficiently portable. I recall being chastised by a vendor representative in 2004 because I was “still” using tape when I “should” be backing up to his expensive SAN. I asked him, “Oh, do employees tend to react well when someone says, ‘The building is on fire! Grab the SAN and get out!’?” He suddenly didn’t want to talk to me anymore.
The other somewhat outdated issue is that backups used to take a very, very long time. Tape was not very fast, disks were not very fast, networks were not very fast. Differential and incremental backups were partly the answer to that problem, and partly to the problem that tape capacity was an issue. Today, we have gigantic and relatively speedy portable hard drives, networks that can move many hundreds of megabits per second, and external buses like USB 3 that outrun both of those things. We no longer need all weekend and an entire media library to perform a full backup.
One thing that has not changed is the need for backups to exist offsite. You cannot protect against a loss of a building if all of your data stays in that building. Solutions have evolved, though. You can now afford to purchase large amounts of bandwidth and transmit your data offsite to your alternative business location(s) each night. If you haven’t got an alternative business location, there are an uncountable number of vendors that would be happy to store your data each night in exchange for a modest (or not so modest) sum of money. I still counsel periodically taking an offline offsite backup copy, as that is a solid way to protect your organization against malicious attacks (some of which can be by disgruntled staff).
These are the approaches that I would take today that would not have been available to me a few short years ago:
- Favor full backups whenever possible — incremental, differential, delta, and deduplicated backups are wonderful, but they are incomplete by nature. It must never be forgotten that the strength of backup lies in the fact that it creates duplicates of data. Any backup technique that reduces duplication dilutes the purpose of backup. I won’t argue against anyone saying that there are many perfectly valid reasons for doing so, but such usage must be balanced. Backup systems are larger and faster than ever before; if you can afford the space and time for full copies, get full copies.
- Steer away from complicated rotation schemes like GFS whenever possible. Untrained staff will not understand them and you cannot rely on the availability of trained staff in a crisis.
- Encrypt every backup every time.
- Spend the time to develop truly meaningful retention policies. You can easily throw a tape in a drawer for ten years. You’ll find that more difficult with a portable disk drive. Then again, have you ever tried restoring from a ten-year-old tape?
- Be open to the idea of using multiple backup solutions simultaneously. If using a combination of applications and media types solves your problem and it’s not too much overhead, go for it.
There are a few best practices that are just as applicable now as ever:
- Periodically test your backups to ensure that data is recoverable
- Periodically review what you are backing up and what your rotation and retention policies are to ensure that you are neither shorting yourself on vital data nor wasting backup media space on dead information
- Backup media must be treated as vitally sensitive mission-critical information and guarded against theft, espionage, and damage
- Magnetic media must be kept away from electromagnetic fields
- Tapes must be stored upright on their edges
- Optical media must be kept in dark storage
- All media must be kept in a cool environment with a constant temperature and low humidity
- Never rely on a single backup copy. Media can fail, get lost, or be stolen. Backup jobs don’t always complete.
Hyper-V-Specific Backup Best Practices
I want to dive into the nuances of backup and Hyper-V more thoroughly in later articles, but I won’t leave you here without at least bringing them up.
- Virtual-machine-level backups are a good thing. That might seem a bit self-serving since I’m writing for Altaro and they have a virtual-machine-level backup application, but I fit well here because of shared philosophy. A virtual-machine-level backup gives you the following:
- No agent installed inside the guest operating system
- Backups are automatically coordinated for all guests, meaning that you don’t need to set up some complicated staggered schedule to prevent overlaps
- No need to reinstall guest operating systems separately from restoring their data
- Hyper-V versions prior to 2016 do not have a native changed block tracking mechanism, so virtual-machine-level backup applications that perform delta and/or deduplication operations must perform a substantial amount of processing. Keep that in mind as you are developing your rotations and scheduling.
- Hyper-V will coordinate between backup applications that run at the virtual-machine-level (like Altaro VM) and the VSS writer(s) within guest Windows operating systems and the integration components within Linux guest operating systems. This enables application-consistent backups without doing anything special other than ensuring that the integration components/services are up-to-date and activated.
- For physical installations, no application can perform a bare metal restore operation any more quickly than you can perform a fresh Windows Server/Hyper-V Server installation from media (or better yet, a WDS system). Such a physical server should only have very basic configuration and only backup/management software installed. Therefore, backing up the management operating system is typically a completely pointless endeavor. If you feel otherwise, I want to know what you installed in the management operating system that would make a bare-metal restore worth your time, as I’m betting that such an application or configuration should not be in the management operating system at all.
- Use your backup application’s ability to restore a virtual machine next to its original so that you can test data integrity
With the foundational material supplied in this article, I intend to work on further posts that expand on these thoughts in greater detail. If you have any questions or concerns about backing up Hyper-V, let me know. Anything that I can’t answer quickly in comments might find its way into an article.
Native adapter teaming is a hot topic in the world of Hyper-V. It’s certainly nice for Windows Server as well, but the ability to spread out traffic for multiple virtual machines is practically a necessity. Unfortunately, there is still a lot of misunderstanding out there about the technology and how to get it working correctly.
One of the great things about the Hyper-V virtual switch is that it can be used to very effectively isolate your virtual machines from the physical network. This grants them a layer of protection that’s nearly unparalleled. Like any security measure, this can be a double-edged sword. Oftentimes, these isolated guests still need some measure of access to the outside world, or they at least need to have access to a system that can perform such access on their behalf.
There are a few ways to facilitate this sort of connection. The biggest buzzword-friendly solution today is network virtualization, but that currently requires additional software (usually System Center VMM) and a not-insubstantial degree of additional know-how. For most small and even many medium-sized organizations, this is an unwelcome burden not only in terms of financial expense, but also in training/education and maintenance.
A simpler solution that’s more suited to smaller and less complicated networks is software routing. Because we’re talking about isolation using a Hyper-V internal or private switch, such software would need to be inside a virtual machine on the same Hyper-V host as the isolated guests.
If you’re not clear on how this would work, you can refer to one of our articles on the Hyper-V virtual switch. This is the same image that I used in that article:
External and Private Switch
The routing VM would be represented in that image as the Dual-Presence VM.
Choosing a Software Router
There are major commercial software routing solutions available, such as Vyatta. There are free routing software packages available, such as VyOS, the community fork of Vyatta. These are all Linux-based, and as such, are not within my scope of expertise. However, it should be possible to deploy them as Hyper-V guests.
What we’re going to look at in this article is Microsoft’s Routing and Remote Access Service. It’s an included component of Windows Server, and it’s highly recommended that an RRAS system perform no other functions. If you have a spare virtualization right, then this environment is free for you. Otherwise, you’ll need to purchase an additional Windows Server license.
Do not enable RRAS in Hyper-V’s management operating system! Doing so does not absolve you of the requirement to provide a virtualization right, and networking performance of both the management operating system and RRAS will be degraded. It’s also an unsupported configuration. To make matters worse, the performance will be unpredictable.
Step 1: Build the Virtual Machine
Sizing a software router depends heavily upon the quantity of traffic that it’s going to be dealing with. However, systems administrators that don’t work with physical routers very often are usually surprised at just how few hardware resources are in even some of the major physical devices. My starting recommendation for the routing virtual machine is as follows:
- vCPUs: 2
- Dynamic Memory: 512MB Startup
- Dynamic Memory: 256MB Minimum
- Dynamic Memory: 2GB Maximum
- Disk: 1 VHDX, Dynamically Expanding, 80GB (expect < 20 GB use, fairly stagnant growth)
- vNIC: 1 to connect to the external switch
- vNIC: 1 per private subnet per private/internal switch (wait on this during initial build)
- OS: latest Windows Server that you are licensed for
- This VM won’t necessarily need to be a member of your domain. If it’s going to be sitting in the perimeter, then you might consider leaving it out. Another option is to leave it in the domain but with higher-than-usual security settings. A great thing to do for machines such as this is disable cached credentials, whether it’s left in the domain or not.
Once the virtual machine is deployed, monitor it for CPU contention, high memory usage, and long network queue lengths. Adjust provisioning upward as necessary.
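The starting recommendation above could be provisioned with something like the following sketch, run on the Hyper-V host in an elevated session. The VHDX path and the switch name 'External' are placeholders for illustration; the VM name matches the svrras name used elsewhere in this article.

```powershell
# Run on the Hyper-V host in an elevated session.
# Hypothetical values: the VHDX path and the switch name 'External' are placeholders.
$vmName = 'svrras'

# New-VM creates a dynamically expanding VHDX and a single adapter on the named switch.
New-VM -Name $vmName `
       -MemoryStartupBytes 512MB `
       -NewVHDPath 'C:\VMs\svrras\svrras.vhdx' `
       -NewVHDSizeBytes 80GB `
       -SwitchName 'External'

# Two virtual processors.
Set-VMProcessor -VMName $vmName -Count 2

# Dynamic Memory: 512MB startup, 256MB minimum, 2GB maximum.
Set-VMMemory -VMName $vmName -DynamicMemoryEnabled $true `
             -StartupBytes 512MB -MinimumBytes 256MB -MaximumBytes 2GB
```

Creating the VM with only the external adapter keeps the initial build simple; the remaining adapters can be added once the operating system is installed.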
During VM setup, I would only create a single virtual adapter to start, and place it on the external virtual switch. Then use PowerShell to rename that adapter:
Get-VMNetworkAdapter -VMName svrras | Rename-VMNetworkAdapter -NewName External
Then, boot up the virtual machine and install Windows Server (I’ll be using 2012 R2 for this article). Rename the adapter inside Windows Server as well to reflect that it connects to the outside world. These steps will help you avoid a lot of issues later. If you’ve already created your adapters and would like a way to identify them, you can disconnect them from their switches and watch which show as being unplugged in the virtual machine.
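The adapter housekeeping can be sketched as follows. The switch name 'Isolated' and the in-guest adapter names 'Ethernet' and 'Ethernet 2' are assumptions; check what your own system reports before renaming anything. Comparing MAC addresses, shown here, is an alternative to the unplug-and-watch approach for telling adapters apart.

```powershell
# --- On the Hyper-V host ---
# Add an adapter for the isolated switch. On 2012 R2 the VM must be shut down first.
# 'Isolated' is a hypothetical switch name; use your private/internal switch.
Add-VMNetworkAdapter -VMName svrras -SwitchName 'Isolated' -Name 'Isolated'

# List the VM's adapters with their MAC addresses (dynamic MACs are assigned when the VM starts).
Get-VMNetworkAdapter -VMName svrras | Select-Object Name, SwitchName, MacAddress

# --- Inside the guest ---
# List the guest's adapters and match them to the host's list by MAC address.
Get-NetAdapter | Select-Object Name, MacAddress

# Rename to match the host-side names. 'Ethernet' and 'Ethernet 2' are assumed defaults.
Rename-NetAdapter -Name 'Ethernet' -NewName 'External'
Rename-NetAdapter -Name 'Ethernet 2' -NewName 'Isolated'
```

The host reports MAC addresses without separators while the guest shows them with dashes, so compare the hexadecimal digits only.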
Before proceeding, ensure you have added all virtual adapters necessary, one for each switch. The virtual adapter connected to the external virtual switch is the only one that requires complete IP information. It should have a default gateway pointing to the next external hop and it should know about DNS servers. It can use any IP on that network that you have available. It only needs a memorable IP if other systems will need to be able to send traffic directly to hosts on the isolated network.
All other adapters require only an IP and a subnet mask. The IP that you choose will be used as the default gateway for all other systems on that switch, so you’ll probably want it to be something easy to remember. If you’re using the router’s operating system as a domain member or in some other situation in which it will be able to register its IP addresses in DNS, make sure that you disable DNS registration for all adapters other than the one that’s in the primary network.
For reference, my test configuration is as follows:
- System name: SVRRAS
- “External” adapter: IP: 192.168.25.254, Mask: 255.255.255.0, GW 192.168.25.1
- “Isolated” adapter: IP: 172.16.0.1, Mask: 255.255.0.0, GW: None
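A sketch of applying that test configuration from PowerShell inside the router guest follows. It assumes the adapters have already been renamed to External and Isolated. The DNS server address is a placeholder, since the configuration above does not list one.

```powershell
# Run inside the router guest in an elevated session.

# External adapter: complete IP information, including the default gateway.
New-NetIPAddress -InterfaceAlias 'External' -IPAddress 192.168.25.254 `
                 -PrefixLength 24 -DefaultGateway 192.168.25.1

# Placeholder DNS server address; replace with your real DNS server(s).
Set-DnsClientServerAddress -InterfaceAlias 'External' -ServerAddresses 192.168.25.1

# Isolated adapter: IP and mask only, no gateway.
New-NetIPAddress -InterfaceAlias 'Isolated' -IPAddress 172.16.0.1 -PrefixLength 16

# Keep the isolated adapter's address out of DNS.
Set-DnsClient -InterfaceAlias 'Isolated' -RegisterThisConnectionsAddress $false
```

The prefix lengths 24 and 16 correspond to the masks 255.255.255.0 and 255.255.0.0 listed above.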
In order for this to work, you’ll need to make some adjustments to the Windows Firewall. If you want, you can just turn it off entirely. However, it is a moderately effective barrier and better than nothing. We’ll follow up on the firewall after the RRAS configuration explanation.
Step 2: Installing and Configuring RRAS
Once Windows is installed and you have all the necessary network adapters connected to their respective virtual switches, the next thing to do is install RRAS. This is done through the Add Roles and Features wizard, just like any other Windows Server role. Choose Remote Access on the Roles screen:
RRAS Server Role
On the Role Services screen, check the box for Routing. You’ll be prompted to add several other items in order to fully enable routing, which will include DirectAccess and VPN (RAS) on the same screen:
RRAS Role Services
After this, just proceed through, accepting the suggested settings.
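For those who prefer to skip the wizard, the same role service can likely be installed from PowerShell. This is a minimal sketch that assumes the feature names used by Windows Server 2012 R2; confirm them with Get-WindowsFeature on your own system before installing.

```powershell
# Run inside the router guest in an elevated session.
# Feature names are assumed from Windows Server 2012 R2; verify them first.
Get-WindowsFeature -Name RemoteAccess, Routing, DirectAccess-VPN

# Installing Routing should pull in the role services it depends on.
Install-WindowsFeature -Name Routing -IncludeManagementTools
```

This only installs the components. The router itself still needs to be configured as described next.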
Once the installation is complete, a reboot is unnecessary. You just need to configure the router. Open up the Start menu, and find the Routing and Remote Access snap-in. Open that up, and you’ll find something similar to the following screenshot. Right-click on your server’s name and click Configure and Enable Routing and Remote Access.
On the next screen, you have two choices for our purposes. The first is Network Address Translation (NAT). This mode works like a home Internet router does. All the virtual machines behind the router will appear to have the same IP address. This is a great solution for isolation purposes. It’s also useful when you haven’t got a hardware router available and want to connect your virtual machines into another network, such as the Internet. Your second choice is to select Custom configuration. This mode allows you to build a standard router, in which the virtual machines on the private network can be reached from other virtual machines by using their IP addresses. I won’t be illustrating this method as it doesn’t do a great deal for isolation.
On the next screen, you’ll tell the routing system which of the adapters is connected to the external network:
RRAS Adapter Selection
On the next screen, you’ll choose how addressing is handled on the private networks. The first option will set up your RRAS system to perform DHCP services for them and forward their DNS requests out to the DNS server(s) you specified for the external network. If you choose the second option, you can build your own addressing services as desired. I’m just going to work with the first option for the sake of ease (and a quicker article):
RRAS Name Services
Once this wizard completes, you’ll be returned to the main RRAS screen where the server should now show online (with a small upward pointing green arrow). That’s really all it takes to configure RRAS inside a Hyper-V guest.
Step 3: Configure the Virtual Machines
This is probably the best part: there is no necessary configuration at all. Just attach the virtual machines to the private switch and leave them in the default networking configuration. Here’s a screenshot from a Windows 7 virtual machine I connected to my test isolated switch:
RRAS in a VM: Demo
You’ll see that it’s using the IP information of SVRRAS’s adapter on the isolated switch for DHCP, gateway, and DNS information. As long as the firewall is down or properly set on the RRAS system, it will be able to communicate as expected.
Windows Firewall on the RRAS System
The easiest thing to do with the Windows Firewall is to just turn it off. If you’re using it at the perimeter, that’s a pretty bad idea. While it may not be the best firewall available, it does pretty well for a software firewall and seems to have the fewest negative side effects in that genre.
For many, a build of this kind exposes them to a completely new configuration style, one that many hardware firewall administrators have dealt with for a very long time. In a traditional software firewall build running on a single system, you usually only have to worry about inbound and outbound traffic on a single adapter. Now, you have to worry about it on at least two adapters. This is a visualization:
RRAS and Windows Firewall
When configuring Windows Firewall, you have to be mindful of how this will work. In order to allow guests on the private network to access web sites on the external network, you’ll need to open port 80 INBOUND on the adapter connected to the private network. This kind of management can get tedious, but it’s very effective if you want to further isolate those guests.
What you might want to do instead is leave the firewall on for the adapter that connects to the external network but disable it for the private network. Then, your private guests will receive the default levels of protection from the router VM’s firewall. You could then turn off the firewalls in the guests, if you like. In order to do this, open up Windows Firewall with Advanced Security on the router VM (or target the MMC from a remote computer). In the center pane, click Windows Firewall Properties or, in the left pane, right-click Windows Firewall with Advanced Security and click Properties. Next to Protected network connections, click Customize. In the dialog that appears, uncheck the box that represents the network adapter on the private switch. Click OK all the way out. I’ve renamed my adapters to “Isolated” and “External” for the following screenshot:
RRAS Firewall Adapter Selection
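The same exclusion can probably be made from PowerShell rather than through the dialog. The sketch below assumes the adapter on the private switch is named Isolated, as in the test configuration, and applies the exclusion to all three firewall profiles.

```powershell
# Run inside the router guest in an elevated session.
# 'Isolated' is the adapter name from this article's test configuration.
Set-NetFirewallProfile -Name Domain, Private, Public -DisabledInterfaceAliases 'Isolated'

# Review the result: the firewall stays enabled, with 'Isolated' excluded.
Get-NetFirewallProfile | Select-Object Name, Enabled, DisabledInterfaceAliases
```

Excluding a single adapter this way leaves the firewall active on the external adapter, which is the goal of this section.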
DMZs and Port Forwarding in RRAS
You might want to selectively expose some of your guests on the private network to inbound traffic from the external network. Personally, I don’t see much value in using the DMZ mode. It’s functionally equivalent and less computationally expensive to just connect a “DMZ” virtual machine directly to the external switch and not have it go through a routing VM at all. Port forwarding (sometimes called “pincushioning”), on the other hand, does have its uses.
Open the Routing and Remote Access console. Expand your server, then expand the IP version (IPv4 or IPv6) that you want to configure forwarding for. Click NAT. In the center pane, locate the interface that is connected to the external switch. Right-click it and click Properties.
If you wish to configure one or more “DMZ” virtual machines, the Address Pool tab is where this is done. First, you create a pool that contains the IP address(es) that you want to map to the private VM(s). For instance, if you want to make 192.168.25.78 the IP address of a private VM, then you would enter that IP address with a subnet mask of 255.255.255.255. Once you have your range(s) configured, you use the Reservations button to map the external address(es) to the private VMs’ address(es).
For port forwarding, go to the Services and Ports tab. If you check one of the existing boxes, it will present you with a prefilled dialog where all you need to enter is the IP address of the private VM for which you wish to forward traffic:
RRAS Port Forwarding
In this screenshot, the On this address pool entry field is unavailable. That’s because I did not add any other IP addresses on the Address Pool tab. If you do that, then the external adapter will have multiple IPs assigned to it. If you don’t use those additional IPs for DMZ purposes, then you can use them here. The reason for doing so is so that multiple private virtual machines can have port forwarding for the same service. One use case for doing so is if you have two or more virtual machines on your private switch serving web pages and you want to make them all visible to computers on the external network.
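To see why additional external addresses are needed for that use case, consider this purely conceptual model of a port-forwarding table. It is not how RRAS stores its mappings, and the function name Add-PortForward and the 172.16.0.x guest addresses are hypothetical. It only shows that a given external IP and port pair can point at one private host.

```powershell
# Conceptual model only; this does not touch RRAS.
# The 172.16.0.x guest addresses are hypothetical.
$forwards = @{}

function Add-PortForward {
    param(
        [hashtable]$Table,
        [string]$PublicIP,
        [int]$PublicPort,
        [string]$PrivateIP
    )
    $key = '{0}:{1}' -f $PublicIP, $PublicPort
    if ($Table.ContainsKey($key)) {
        # One external IP/port pair can only point at one private host.
        return $false
    }
    $Table[$key] = $PrivateIP
    return $true
}

# Two web servers cannot share port 80 on the same external address...
$first  = Add-PortForward -Table $forwards -PublicIP '192.168.25.254' -PublicPort 80 -PrivateIP '172.16.0.10'
$second = Add-PortForward -Table $forwards -PublicIP '192.168.25.254' -PublicPort 80 -PrivateIP '172.16.0.11'
# ...but a second external address from the pool makes room for the second server.
$third  = Add-PortForward -Table $forwards -PublicIP '192.168.25.78' -PublicPort 80 -PrivateIP '172.16.0.11'

"first: $first, second: $second, third: $third"
```

The second mapping is rejected because the external address and port are already taken, while the third succeeds because it uses a different external address.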
VLANs and RRAS
This is a topic that comes up often. If you’ve read our networking series, then hopefully you already know how this works. VLANs are a layer 2 concept while routing is layer 3. Therefore, VLAN assignments for virtual machines on your private virtual switch will have no relation to any other VLANs anywhere else. Using VLANs on the private switch is probably not useful unless you need to isolate them from each other. If that’s the case, then you’ll need to make a distinct virtual adapter inside the routing virtual machine for each of those VLANs, or it won’t be able to communicate with the other guests on that VLAN.
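A minimal sketch of giving the routing VM one adapter per VLAN on the private switch follows. The VLAN IDs 10 and 20, the switch name 'Isolated', and the adapter naming pattern are all hypothetical.

```powershell
# Run on the Hyper-V host in an elevated session.
# VLAN IDs 10 and 20 and the switch name 'Isolated' are hypothetical.
foreach ($vlanId in 10, 20) {
    $adapterName = "Isolated-VLAN$vlanId"

    # One virtual adapter in the routing VM per VLAN on the private switch.
    Add-VMNetworkAdapter -VMName svrras -SwitchName 'Isolated' -Name $adapterName

    # Put that adapter in access mode on its VLAN.
    Set-VMNetworkAdapterVlan -VMName svrras -VMNetworkAdapterName $adapterName -Access -VlanId $vlanId
}

# Review the assignments.
Get-VMNetworkAdapterVlan -VMName svrras
```

Each of these adapters then needs its own IP address and subnet inside the guest, just like the single isolated adapter in the earlier configuration.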
That’s really all there is to getting routing working through a Hyper-V virtual machine. This is a great solution for isolating virtual machines and for test labs.
Do remember that it’s not as efficient or as robust as using a true hardware router.
Like any creative work, a blog post is never really done; it’s just abandoned. Unlike many other mediums, blogs do allow us to easily refresh those older articles, but we so rarely ever do it. To close out this year, a few of us on the editorial team got together and selected a few highlights from the past year.
Our 14 selections from 2014 (in no particular order):
This was our first licensing article directly related to guest licensing. We followed it up with a downloadable eBook that was expanded to include a number of examples, and Andy Syrewicze and Thomas Maurer gave a fantastic webinar on the topic. We’ve received quite a few questions and some great feedback. Keep an eye out for a follow-up post that takes on some of those questions and incorporates some of the suggestions. If you’ve got questions or suggestions of your own, feel free to send them in (by leaving a comment)!
This two-part series started with a look at all the myths surrounding the virtualization of domain controllers and exposed the truths. In part 2, we explained how to successfully virtualize your domain controllers without headaches. I felt that both of these articles explained the situation and remedies very well, except that it seems that some people have been unable to make time synchronization work correctly when the Hyper-V Time Synchronization service is partially disabled for virtualized domain controllers. Microsoft has changed their published policy to include a recommendation that time sync be fully disabled for DC guests, although personally, I think the jury is still out. Completely disabling it is certainly an expedient solution to a frustrating situation, but that doesn’t automatically make it the best option. I’ve been able to make partial disablement work every time I’ve tried, and it’s the only way to guarantee that a DC that was saved (even accidentally) will be able to properly recover.
I wrote this post as I saw a lot of people really trying hard to justify using their Hyper-V hosts as domain controllers. It’s a really bad idea, so I collected all the reasoning I could think of not to do it. In retrospect, I probably should have ordered them a little better to address the topmost concerns first. Those are:
- Licensing costs. People want to save on licensing by just installing the domain controller in the Hyper-V host so that they don’t have to pay for a Windows license just to run domain services. As explained in #4, this doesn’t work.
- The chicken-and-egg myth. People believe that if they join the Hyper-V host to the domain and the only DC is a guest, then the Hyper-V host won’t boot or will have other problems. That wasn’t even mentioned in this article, at least not outright, although it was a major point of the articles that it links to.
- The myth that Hyper-V hosts should be left in workgroup mode to increase security. This was included as point #1, which isn’t a bad placement for it. People get too close to a situation and sometimes make irrational decisions. They don’t want to put their systems at unnecessary risk and a compromised Hyper-V host can potentially put a lot of guests at risk all at once, so they do take steps to distance the host from the guests. While there are solutions, workgroup mode isn’t one of them. I mean, just try to say this out loud without laughing: “Workgroup mode is more secure than domain mode”. Or, write this on your resume: “I once put a computer in workgroup mode to increase security over domain mode”. I’ll bet that won’t get you many callbacks.
“There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves.” – Will Rogers
One really good way to learn is from your own mistakes. Most people that do something really wrong are much less likely to do that same thing twice. But, as far as I’m concerned, it’s a whole lot better if you can learn from other people’s mistakes. This article compiles a list of the ones I’ve seen most frequently in the hopes that at least a few people can avoid them.
We had an unusually long (7 parts!) series on storage and Hyper-V. Even though this particular piece looked at parts of storage that some people might find very basic, we often forget that there are always newcomers, and sometimes even experienced administrators missed some of the basics during their career. A highlight of this article is that it puts to bed an old recommendation about spending a lot of time on sector alignment. It also takes a generalized look at architecting disk layout for Hyper-V.
Everyone likes a good how-to. Even if you can figure something out on your own, it makes little sense to do so if someone else has already done the work. For many administrators, moving to a virtualization platform is the first time they’ll connect to external storage, which is why I took the time to lay out exactly how it’s done. I noticed that we had intended to publish a PowerShell-equivalent article to this one, but that never came to fruition. We’ll rectify that in 2015.
Another outstanding submission from Jeff Hicks, this article shows a clever way to create a snapshot (now checkpoint) of a virtual machine and turn it into its own virtual machine. This gives you some cloning powers without needing to incur the expense of something like Virtual Machine Manager. Be sure to keep reading into part 2, where he shows you how to do it all much more efficiently with PowerShell.
Jeff Hicks compiled a list of his 10 favorite Hyper-V cmdlets and took us all for a quick tour. If you’re thinking about integrating PowerShell with Hyper-V into your toolkit as a New Year’s resolution, this is a great place to start with one of the topmost experts.
This fantastic article was written by Jeff Hicks, and is one of my personal favorites. This is a wonderful little script that quickly runs against the target hosts that you specify and returns a snappy-looking HTML health report. Even better, Jeff shows you how to set it up to run automatically against all your hosts, so you can easily have a daily report.
In this article, Nirmal discusses the best approaches to ensure your Hyper-V guests are operating at their peak performance. Since Nirmal focused on peak performance and we had more than a few comments about that, we’re planning a follow-up article that contains more generalized best practices.
Another great how-to guide from Nirmal shows you how to use the new feature of 2012 R2 that allows you to resize a Hyper-V VM’s virtual SCSI disk without shutting the guest down.
Out of all my articles, this is one of my personal favorites. I feel really bad for people who spend a lot of time wringing their hands over how many cores should be in their Hyper-V hosts, especially when they wind up spending too much money on CPUs that are just going to sit idle. One thing I probably should have mentioned in that article is that one of the first things many of us do to address certain Hyper-V performance issues is disable many of the power-saving features of our processors (C-states, especially). If you’ve got a lot of cores sitting idle, that’s a lot of wasted energy. And, if Aidan Finn’s prediction about per-core licensing comes true in 2015 (I really hope it doesn’t), then it’s going to translate into lots of wasted licensing dollars as well.
Far too many administrators still don’t treat PowerShell as seriously important, and that’s really a shame. Since I clearly remember the days before I learned to embrace it and all the reasons I invoked to avoid it, I can certainly understand why. It’s just one of those things that can be tough to see the value in until you have your own personal “A ha!” moment. This series is meant to serve a dual role: one is to provide useful scripts to the Hyper-V community. The other is simply to teach by example. If you’ve just come for the scripts, great! If you happen to learn something while you’re here, that’s even better!
I’ve written a number of PowerShell scripts for you now, but far and away this one is both the most complicated and my favorite. For a very long time prior, I had been thinking, “Someone should do that.” I was eventually forced to accept that I qualify as “someone”. A few days into this, I realized just why no one had done it. I learned a lot, though, and there are certainly a great many uncommon techniques included in this script. So, it has its surface value of being the only tool I know of that can track down orphaned Hyper-V files, but it would also be something that intermediate PowerShell scripters can tear apart for tidbits to add to their own scripting toolkit.
Time keeps on slippin’, slippin’, slippin’, into the future. – The Steve Miller Band, “Fly Like an Eagle”
As we close out 2014 by examining our successes, we acknowledge that all of it has been made possible by you, our readers. If you want to have a say in how we approach 2015, now is your chance! Let us know in the comments what you’d like to see more of, what you’d like to see less of, and what big things we’ve missed entirely.
On behalf of the Altaro blogging crew, we’ve had a wonderful time writing for you and interacting with you in the comments section. We wish you and yours a wonderful holiday season and look forward to another wonderful year!
We’ve had a long run of articles in this series that mostly looked at general networking technologies. Now we’re going to look at a technology that gets us closer to Hyper-V. Load-balancing algorithms are a feature of the network team. They can be used with any Windows Server installation, but they are especially useful for balancing the traffic of several operating systems sharing a single network team.
We’ve already had a couple of articles on the subject of teaming in the Server 2012+ products. The first, not part of this series, talked about MPIO, but outlined the general mechanics of teaming. The second was part of this series and took a deeper look at teaming and the aggregation options available.
The selected load-balancing method is how the team decides to utilize the team members for sending traffic. Before we go through these, it’s important to reinforce that this is load-balancing. There isn’t a way to just aggregate all the team members into a single unified pipe.
I will periodically remind you of this point, but keep in mind that the load-balancing algorithms apply only to outbound traffic. The connected physical switch decides how to send traffic to the Windows Server team. Some of the algorithms have a way to exert some influence over the options available to the physical switch, but the Windows Server team is only responsible for balancing what it sends out to the switch.
Hyper-V Port Load-Balancing Algorithm
This method is commonly chosen and recommended for all Hyper-V installations based solely on its name. This is a poor reason. The name wasn’t picked because it’s the automatic best choice for Hyper-V, but because of how it operates.
The operation is based on the virtual network adapters. In versions 2012 and prior, it was by MAC address. In 2012 R2, and presumably onward, it will be based on the actual virtual switch port. Distribution depends on the teaming mode of the virtual switch.
Switch-independent: Each virtual adapter is assigned to a specific physical member of the team. It sends and receives only on that member. Distribution of the adapters is just round-robin. The impact on VMQ is that each adapter gets a single queue on the physical adapter it is assigned to, assuming there are enough left.
Everything else: Virtual adapters are still assigned to a specific physical adapter, but this will only apply to outbound traffic. The MAC addresses of all these adapters appear on the combined link on the physical switch side, so it will decide how to send traffic to the virtual switch. Since there’s no way for the Hyper-V switch to know where inbound traffic for any given virtual adapter will be, it must register a VMQ for each virtual adapter on each physical adapter. This can quickly lead to queue depletion.
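To put rough numbers on the queue depletion problem, here is a minimal sketch of the two cases described above. It is only a counting model under my own assumptions: the adapter names and the eight-VM host are hypothetical, and it ignores the "assuming there are enough left" limit on real hardware.

```python
from itertools import cycle


def assign_round_robin(virtual_adapters, physical_adapters):
    """Pin each virtual adapter to one physical team member, in rotation."""
    rotation = cycle(physical_adapters)
    return {vnic: next(rotation) for vnic in virtual_adapters}


def vmq_queues_needed(virtual_adapters, physical_adapters, switch_independent):
    """Count the VMQs that each physical adapter would have to register."""
    if switch_independent:
        # Inbound traffic only arrives on the assigned member,
        # so each virtual adapter needs a single queue there.
        counts = {pnic: 0 for pnic in physical_adapters}
        assignment = assign_round_robin(virtual_adapters, physical_adapters)
        for pnic in assignment.values():
            counts[pnic] += 1
        return counts
    # Any other teaming mode: the physical switch picks the inbound link,
    # so every virtual adapter needs a queue on every member.
    return {pnic: len(virtual_adapters) for pnic in physical_adapters}


vnics = [f"vm{n}" for n in range(1, 9)]  # eight hypothetical virtual adapters
pnics = ["pNIC1", "pNIC2"]               # two hypothetical team members

independent = vmq_queues_needed(vnics, pnics, switch_independent=True)
dependent = vmq_queues_needed(vnics, pnics, switch_independent=False)
print(independent)  # {'pNIC1': 4, 'pNIC2': 4}
print(dependent)    # {'pNIC1': 8, 'pNIC2': 8}
```

With the same eight virtual adapters, the switch-dependent case asks each physical adapter for twice as many queues as the switch-independent case, and the gap widens with every team member you add.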
Recommendations for Hyper-V Port Distribution Mode
If you somehow landed here because you’re interested in teaming but you’re not interested in Hyper-V, then this is the worst possible distribution mode you can pick. It only distributes virtual adapters. The team adapter will be permanently stuck on the primary physical adapter for sending operations. The physical switch can still distribute traffic if the team is in a switch-dependent mode.
By the same token, you don’t want to use this mode if you’re teaming from within a virtual machine. It will be pointless.
Something else to keep in mind is that outbound traffic from a VM is always limited to a single physical adapter. For 10 Gb connections, that’s probably not an issue. For 1 Gb, think about your workloads.
For 2012 (not R2), this is a really good distribution method for inbound traffic if you are using the switch-independent mode. This is the only one of the load-balancing modes that doesn’t force all inbound traffic to the primary adapter when the team is switch-independent. If you’re using any of the switch-dependent modes, then the best determinant is usually the ratio of virtual adapters to physical adapters. The higher that number is, the better result you’re likely to get from the Hyper-V port mode. However, before just taking that and running off, I suggest that you continue reading about the hash modes and think about how it relates to the loads you use in your organization.
For 2012 R2 and later, the official word is that the new Dynamic mode universally supersedes all applications of Hyper-V port. I have a tendency to agree, and you’d be hard-pressed to find a situation where it would be inappropriate. That said, I recommend that you continue reading so you get all the information needed to compare the reasons for the recommendations against your own system and expectations.
Hash Load-Balancing Algorithms
The umbrella term for the various hash balancing methods is “address hash”. This covers three different possible hashing modes in an order of preference. Of these, the best selection is the “Transport Ports”. The term “4-tuple” is often seen with this mode. All that means is that when deciding how to balance outbound traffic, four criteria are considered. These are: source IP address, source port, destination IP address, destination port.
Each time traffic is presented to the team for outbound transmission, it needs to decide which of the team members it will use. At a very high level, this is just a round-robin distribution. But, it’s inefficient to simply set the next outbound packet onto the next path in the rotation. Depending on contention, there could be a lot of issues with stream sequencing. So, as explained in the earlier linked posts, the way that the general system works is that a single TCP stream stays on a single physical path. In order to stay on top of this, the load-balancing system maintains a hash table. A hash table is nothing more than a list of entries with more than one value, with each entry being unique from all the others based on the values contained in that entry.
To explain this, we’ll work through a complete example. We’ll start with an empty team passing no traffic. A request comes in to the team to send from a VM with IP address 192.168.50.20 to the Altaro web address. The team sends that packet out the first adapter in the team and places a record for it in a hash table:
Right after that, the same VM requests a web page from the Microsoft web site. The team compares it to the first entry:
The source ports and the destination IPs are different, so it sends the packet out the next available physical adapter in the rotation and saves a record of it in the hash table. This is the pattern that will be followed for subsequent packets; if any of the four fields for an entry make it unique when compared to all current entries in the table, it will be balanced to the next adapter.
As we know, TCP “conversations” are ongoing streams composed of multiple packets. The client’s web browser will continue sending requests to the above systems. The additional packets headed to the Altaro site will continue to match on the first hash entry, so they will continue to use the first physical adapter.
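The walkthrough above can be sketched in a few lines of Python. Treat this as a toy model of the behavior as I've described it, not as the team's actual implementation. Only the VM's IP address comes from the example; the destination addresses are placeholders from the documentation ranges, and the ports and adapter names are made up.

```python
class AddressHashTeam:
    """Toy model: unseen 4-tuples take the next adapter in the rotation."""

    def __init__(self, adapters):
        self.adapters = list(adapters)
        self.table = {}      # (src ip, src port, dst ip, dst port) -> adapter
        self.next_index = 0  # current position in the rotation

    def send(self, src_ip, src_port, dst_ip, dst_port):
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.table:
            # Any differing field makes the entry unique: record it and
            # advance the rotation.
            self.table[key] = self.adapters[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.adapters)
        # Known streams always stay on the adapter they started on.
        return self.table[key]


team = AddressHashTeam(["pNIC1", "pNIC2"])  # hypothetical team members
vm = "192.168.50.20"
first_site = "203.0.113.10"    # placeholder address, not the real site
second_site = "198.51.100.20"  # placeholder address, not the real site

first = team.send(vm, 49152, first_site, 80)    # new entry
second = team.send(vm, 49153, second_site, 80)  # different port and IP
repeat = team.send(vm, 49152, first_site, 80)   # same stream as the first

print(first, second, repeat)  # pNIC1 pNIC2 pNIC1
```

The third packet matches the first entry on all four fields, so it follows the first stream instead of advancing the rotation.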
IP and MAC Address Hashing
Not all communications have the capability of participating in the 4-tuple hash. For instance, ICMP (ping) messages only use IP addresses, not ports. Non-TCP/IP traffic won’t even have that. In those cases, the hash algorithm will fall back from the 4-tuple method to the most suitable of the 2-tuple matches. These aren’t as granular, so the balancing won’t be as even, but it’s better than nothing.
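Here is a rough sketch of that fallback order. The dictionary layout and the key labels are my own invention for illustration; the only thing taken from the text is the order of preference (ports first, then IP addresses, then MAC addresses).

```python
def hash_key(packet):
    """Return the most granular hash key that the packet can supply."""
    if all(field in packet for field in ("src_ip", "src_port", "dst_ip", "dst_port")):
        return ("4-tuple", packet["src_ip"], packet["src_port"],
                packet["dst_ip"], packet["dst_port"])
    if "src_ip" in packet and "dst_ip" in packet:
        # No ports available (ICMP, for example): fall back to the IP pair.
        return ("ip 2-tuple", packet["src_ip"], packet["dst_ip"])
    # Not IP traffic at all: only the MAC pair is left.
    return ("mac 2-tuple", packet["src_mac"], packet["dst_mac"])


# Hypothetical packets; every address below is a placeholder.
macs = {"src_mac": "00-00-5E-00-53-01", "dst_mac": "00-00-5E-00-53-02"}
ips = {"src_ip": "192.168.50.20", "dst_ip": "203.0.113.10"}

web_request = {**macs, **ips, "src_port": 49152, "dst_port": 80}
ping = {**macs, **ips}
non_ip_frame = dict(macs)

print(hash_key(web_request)[0])   # 4-tuple
print(hash_key(ping)[0])          # ip 2-tuple
print(hash_key(non_ip_frame)[0])  # mac 2-tuple
```

Every ping between the same two hosts produces the same key, which is why the 2-tuple modes balance less evenly than the 4-tuple mode.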
Recommendations for Hashing Mode
If you like, you can use PowerShell to limit the hash mode to IP addresses, which will allow it to fall back to MAC address mode. You can also limit it to MAC address mode. I don’t know of a good use case for this, but it’s possible. Just check the options on New- and Set-NetLbfoTeam. In the GUI, you can only pick “Address Hash” unless you’ve already used PowerShell to set a more restrictive option.
For 2012 (not R2), this is the best solution in non-Hyper-V teaming, including teaming within a virtual machine. For Hyper-V, it’s good when you don’t have very many virtual adapters or when the majority of the traffic coming out of your virtual machines is highly varied in a way that would have a high number of balancing hits. Web servers are likely to fit this profile.
In contrast to Hyper-V Port balancing, this mode will always balance outbound traffic regardless of the teaming mode. But, in switch-independent mode, all inbound traffic comes across the primary adapter. This is not a good combination for high quantities of virtual machines whose traffic balance is heavier on the receive side. This is part of the reason that the Hyper-V port mode almost always makes more sense in a switch independent mode, especially as the number of virtual adapters increases.
For 2012 R2, the official recommendation is the same as with the Hyper-V port mode. You’re encouraged to use the new Dynamic mode. Again, this is generally a good recommendation that I’m inclined to agree with. However, I still recommend that you keep reading so you understand all your options.
The Dynamic mode is new in 2012 R2, and it’s fairly impressive. For starters, it combines features from the Hyper-V port and Address Hash modes. The virtual adapters are registered separately across physical adapters in switch independent mode so received traffic can be balanced, but sending is balanced using the Address Hash method. In switch independent mode, this gives you an impressive balancing configuration. This is why the recommendations are so strong to stop using the other modes. However, if you’ve got an overriding use case, don’t be shy about using one of the older modes. I suppose it’s possible that limiting virtual adapters to a single physical adapter for sending might have some merits in some cases.
There’s another feature added by the Dynamic mode that its name is derived from. It makes use of flowlets. I’ve read a whitepaper that explains this technology. To say the least, it’s a dense work that’s not easy for mortals to follow. The simple explanation is that it is a technique that can break an existing TCP stream and move it to another physical adapter. Pay close attention to what that means: the Dynamic mode cannot, and does not, send a single TCP stream across multiple adapters simultaneously. The odds of out-of-sequence packets and encountering interim or destination connections that can’t handle the parallel data are just too high for this to be feasible at this stage of network evolution. What it can do is move a stream from one physical adapter to another.
Let’s say you have two 10 GbE cards in a team using Dynamic load-balancing. A VM starts a massive outbound file transfer and it gets balanced to the first adapter. Another VM starts a small outbound transfer that’s balanced to the second adapter. A third VM begins its own large transfer and is balanced back to the first adapter. The lone transfer on the second adapter finishes quickly, leaving two large transfers to share the same 10 Gb adapter. Using the Hyper-V port or any address hash load-balancing method, there would be nothing that could be done about this short of canceling a transfer and restarting it, hoping that it would be balanced to the second adapter. With the new method, one of the streams can be dynamically moved to the other adapter, hence the name “Dynamic”. Flowlets require the split to be made at particular junctions in the stream. It is possible for Dynamic to work even when a neat flowlet opportunity doesn’t present itself.
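The scenario above can be sketched as a very small rebalancer. To be clear about what this is: I don't know the internals of the real Dynamic algorithm, so the rule below (move the largest stream that strictly shrinks the gap between the busiest and the least busy adapter) is purely my own stand-in, and the stream names and rates are hypothetical.

```python
def load_per_adapter(placement, adapters):
    """Sum the demand (in Gb/s) of the streams pinned to each adapter."""
    load = {adapter: 0 for adapter in adapters}
    for adapter, rate in placement.values():
        load[adapter] += rate
    return load


def rebalance_once(placement, adapters):
    """Move one whole stream off the busiest adapter, if that helps."""
    load = load_per_adapter(placement, adapters)
    busiest = max(adapters, key=lambda adapter: load[adapter])
    idlest = min(adapters, key=lambda adapter: load[adapter])
    gap = load[busiest] - load[idlest]
    # Only a stream smaller than the gap makes the imbalance smaller.
    candidates = [(rate, stream) for stream, (adapter, rate) in placement.items()
                  if adapter == busiest and rate < gap]
    if not candidates:
        return None
    rate, stream = max(candidates)
    placement[stream] = (idlest, rate)  # the stream moves as one unit
    return stream


adapters = ["pNIC1", "pNIC2"]
placement = {
    "vm1-large": ("pNIC1", 6),
    "vm2-small": ("pNIC2", 1),
    "vm3-large": ("pNIC1", 5),
}

del placement["vm2-small"]  # the small transfer finishes first
before = load_per_adapter(placement, adapters)
moved = rebalance_once(placement, adapters)
after = load_per_adapter(placement, adapters)

print(before)  # {'pNIC1': 11, 'pNIC2': 0}
print(moved)   # vm1-large
print(after)   # {'pNIC1': 5, 'pNIC2': 6}
```

Notice that the stream is reassigned whole. At no point does it use both adapters at once, which matches the restriction described earlier.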
Recommendations for Dynamic Mode
For the most part, Dynamic is the way to go. The reasons have been pretty well outlined above. For switch independent modes, it solves the dilemma of choosing Hyper-V port for inbound balancing against Address Hash for outbound balancing. For both switch independent and dependent modes, the dynamic rebalancing capability allows it to achieve a higher rate of well-balanced outbound traffic.
It can’t be stressed enough that you should never expect a perfect balancing of network traffic. Normal flows are anything but even or predictable, especially when you have multiple virtual machines working through the same connections. The Dynamic method is generally superior to all other load-balancing methods, but you’re not going to see perfectly level network utilization by using it.
Remember that if your networking goal is to enhance throughput, you’ll get the best results by using faster network hardware. No software solution will perform on par with dedicated hardware.
In part 3, I showed you a diagram of a couple of switches that were connected together using a single port. I mentioned then that I would likely use link aggregation to connect those switches in a production environment. Windows Server introduced the ability to team adapters natively starting with the 2012 version. Hyper-V can benefit from this ability.
To save you from needing to click back to part 2, here is the visualization again:
Port 19 is empty on each of these switches. That’s not a good use of our resources. But, we can’t just go blindly plugging in a wire between them, either. Even if we configure ports 19 just like we have ports 20 configured, it still won’t work. In fact, either of these approaches will fail with fairly catastrophic effects. That’s because we’ll have created a loop.
Imagine that we have configured ports 19 and 20 on each switch identically and wired them together. Then, switch port 1 on switch 1 sends out a broadcast frame. Switch 1 will know that it needs to deliver that frame to every port that’s a member of VLAN 10. So, it will go to ports 2-6 and, because they are trunk ports with a native VLAN of 10, it will also deliver it to 19 and 20. Ports 19 and 20 will carry the packet over to switch 2. When it comes out on port 19, it will try to deliver it to ports 1-6 and 20. When it comes out on port 20, it will try to deliver it to ports 1-6 and port 19. So, the frame will go back to ports 19 and 20 on switch 1, where it will repeat the process. Because Ethernet doesn’t have a time to live like TCP/IP does (at least, as far as I know, it doesn’t), this process will repeat infinitely. That’s a loop.
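If you'd like to watch the loop happen, here is a small simulation of the two switches. It is a deliberately naive model resting on my own assumptions: each switch floods a broadcast out of every port except the one it arrived on, ports 19 and 20 are wired to their twins on the other switch, and there is no loop protection of any kind.

```python
ACCESS_PORTS = (1, 2, 3, 4, 5, 6)
UPLINKS = (19, 20)  # wired 19-to-19 and 20-to-20 between the switches


def flood(hops):
    """Track one broadcast that enters switch 1 on port 1.

    Returns a (frames still in flight, endpoint deliveries so far)
    pair for each hop.
    """
    in_flight = [(1, 1)]  # (switch number, ingress port)
    deliveries = 0
    history = []
    for _ in range(hops):
        next_hop = []
        for switch, ingress in in_flight:
            for port in ACCESS_PORTS + UPLINKS:
                if port == ingress:
                    continue  # never send a frame back out its ingress port
                if port in UPLINKS:
                    other_switch = 2 if switch == 1 else 1
                    next_hop.append((other_switch, port))
                else:
                    deliveries += 1
        in_flight = next_hop
        history.append((len(in_flight), deliveries))
    return history


history = flood(10)
print(history[0])   # (2, 5)
print(history[1])   # (2, 17)
print(history[-1])  # (2, 113)
```

Two copies of the frame are still circulating after every hop, and the endpoints keep receiving duplicates. Nothing in the model ever makes the count of in-flight frames drop to zero, which is the whole problem.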
Most switches can identify a loop long before any frames get caught up. The way Cisco switches will handle it is by cutting off the offending loop ports. So, if it’s the only connection that switch has with the outside world, all its endpoints will effectively go out. I’ve never put any other manufacturer into a loop, so I’m not sure how the various other vendors will deal with it. No matter what, you can’t just connect switches to each other using multiple cables without some configuration work.
Port Channels and Link Aggregation
The answer to the above problem is found in Port Channels or Link Aggregation. A port channel is Cisco’s version. Everyone else calls it link aggregation. Cisco does have some proprietary technology wrapped up in theirs, but it’s not necessary to understand that for this discussion. So, to make the above problem go away, we would assign ports 19 and 20 on the Cisco switch into a port channel. On any other hardware vendor, we would assign them to a link aggregation group (LAG). Once that’s done, the port channel or LAG is then configured just like a single port would be, as in trunk/(un)tagged or access/PVID. What’s really important to understand here is that the MAC addresses that the switch assigned to the individual ports are gone. The MAC address now belongs to the port channel/LAG. MAC addresses that it knows about on the connecting switch are delivered to the port channel, not to a switch port.
It’s been quite a while since I worked on a Cisco environment, but as I recall, a port channel is just a port channel. You don’t need to do a lot of configuration once it’s set up. For other vendors, you have to set up the mode. We’re going to see these modes again with the Windows NIC team, so we’ll get acquainted with that first.
Now we look at how this translates into the Windows and Hyper-V environment. For a number of years, we’ve been using NIC teaming in our data centers to provide a measure of redundancy for servers. This uses multiple connections as well, but the most common types don’t include the same sort of cooperation between server and switch that you saw above between switches. Part of it is that a normal server doesn’t usually host multiple endpoints the way a switch does, so it doesn’t really need a trunk mode. A server is typically not concerned with VLANs. So, usually a teamed interface on a server isn’t maintaining two active connections. Instead, it has its MAC address registered on one of the two connected switch ports and the other is just waiting in reserve. Remember that it can’t actually be any other way, because a MAC address can only appear on a single port. So, even though a lot of people thought that they were getting aggregated bandwidth, they really weren’t. But, the nice thing about this configuration is that it doesn’t need any special configuration on the switch, except perhaps if there is a security restriction that prevents migration of MAC addresses.
New, starting in Windows/Hyper-V Server 2012, is NIC teaming built right into the operating system. Before this, all teaming schemes were handled by manufacturers’ drivers. There are three teaming modes available.
The first, switch independent, is the same as the traditional teaming mode. The switch doesn’t need to participate. Ordinarily, the Hyper-V switch will register all of its virtual adapters’ MAC addresses on a single port, so all inbound traffic comes through a single physical link. We’ll discuss the exceptions in another post. Outbound traffic can be sent using any of the physical links.
The great benefit of this method is that it can work with just about any switch, so small businesses don’t need to make special investments in particular hardware. You can even use it to connect to multiple switches simultaneously for redundancy. The downside, of course, is that all incoming traffic is bound to a single adapter.
The Hyper-V virtual switch and many physical switches can operate in static mode. The common standard is 802.3ad, but not all implementations are equal. In this method, each member is grouped into a single unit as explained in the Port Channels and Link Aggregation section above. Both switches (whether physical or virtual) must have their matching members configured into a static mode.
MAC addresses on all sides are registered on the overall aggregated group, not on any individual port. This allows incoming and outgoing traffic to use any of the available physical links. The drawbacks are that the switches all have to support this and you lose the ability to split connections across physical switches (with some exceptions, as we’ll talk about later).
If a connection experiences troubles but isn’t down, then the static switch will experience problems that might be difficult to troubleshoot. For instance, if you create a static team on 4 physical adapters in your Hyper-V host but only three of the switch’s ports are configured in a static trunk, then the Hyper-V system will still attempt to use all four.
LACP stands for “Link Aggregation Control Protocol”. This is defined in the 802.1ax standard, which supersedes 802.3ad. Unfortunately, there is a common myth that gives the impression that LACP provides special bandwidth consolidation capabilities over static aggregation. This is not true. An LACP group is functionally like a static group. The difference is that connected switches communicate using LACPDU packets to detect problems in the line. So, if the example setup at the end of the Static teaming section used LACP instead of Static, the switches would detect that one side was configured using only 3 of the 4 connected ports and would not attempt to use the 4th link. Other than that, LACP works just like static. The physical switch needs to be set up for it, as does the team in Windows/Hyper-V.
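The four-adapter example boils down to a set comparison. In this sketch I've reduced the whole LACPDU exchange to "a link is only used when both ends have it in the group", which is a big simplification of the real protocol; the link names are hypothetical.

```python
def links_host_will_use(host_members, switch_members, mode):
    """Return the links that the host team will transmit on."""
    if mode == "static":
        # No negotiation: the host simply trusts its own configuration.
        return set(host_members)
    if mode == "lacp":
        # Stand-in for the LACPDU exchange: both ends must agree on a link.
        return set(host_members) & set(switch_members)
    raise ValueError(f"unknown teaming mode: {mode}")


host_team = {"link1", "link2", "link3", "link4"}  # four adapters in the team
switch_group = {"link1", "link2", "link3"}        # only three ports configured

static_links = links_host_will_use(host_team, switch_group, "static")
lacp_links = links_host_will_use(host_team, switch_group, "lacp")
unmatched = static_links - switch_group

print(sorted(unmatched))   # ['link4']
print(sorted(lacp_links))  # ['link1', 'link2', 'link3']
```

Both modes end up with the same usable bandwidth on the three good links. The only difference is that the static team keeps sending on the fourth one.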
Bandwidth Usage in Aggregated Links
Bandwidth usage in aggregated links is a major confusion point. Unfortunately, it’s not a simple matter of all physical links being simply combined into one bigger one. It’s more likely that load-balancing will occur than bandwidth aggregation.
In most cases, the sending switch/team controls traffic flow. Specific load-balancing algorithms will be covered in another post. However it chooses to perform it, the sending system will transmit on a specific link. But, any given communication will almost exclusively use only one physical link. This is mostly because it helps ensure that the frames that make up a particular conversation arrive in order. If they were broken up and sent down separate pipes, contention and buffering would dramatically increase the probability that they would be scrambled before reaching their destination. TCP and a few other protocols have built-in ways to correct this, but reordering is computationally expensive, and the extra bandwidth usually isn’t worth that cost compared to simply using a single physical link.
Another reason for the single-link restriction is simple practicality. Moving a transmission through multiple ports from Switch A to Switch B is fairly trivial. From Switch B to Switch C, it becomes less likely that enough links will be available. The longer the communications chain, the more likely a transmission won’t have the same bandwidth available as the initial hop. Also, the final endpoint is most likely on a single adapter. The available methods to deal with this are expensive and create a drag on network resources.
The implications of all this aren’t exactly clear. A quick explanation is that no matter what teaming mode you pick, when you run a network performance test across your team, the result is going to show the maximum speed of a single team member. But, if you run two such tests simultaneously, it might use two of the links. What I normally see is people trying to use a file copy to test bandwidth aggregation. Aside from the fact that file copy is a horrible way to test anything other than permissions, it’s not going to show anything more than the speed of a single physical link.
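A bit of arithmetic shows why a single test never exceeds one member's speed. This is a minimal sketch under simplifying assumptions of my own: every member runs at a hypothetical 10 Gb/s, each stream is pinned to exactly one link, and streams that share a link split it evenly.

```python
LINK_SPEED_GBPS = 10  # hypothetical speed of every team member


def per_stream_throughput(stream_to_link):
    """Streams pinned to the same link split that link's speed evenly."""
    streams_on_link = {}
    for stream, link in stream_to_link.items():
        streams_on_link.setdefault(link, []).append(stream)
    return {
        stream: LINK_SPEED_GBPS / len(streams_on_link[link])
        for stream, link in stream_to_link.items()
    }


one_test = per_stream_throughput({"test-a": "pNIC1"})
two_tests_spread = per_stream_throughput({"test-a": "pNIC1", "test-b": "pNIC2"})
two_tests_collide = per_stream_throughput({"test-a": "pNIC1", "test-b": "pNIC1"})

print(sum(one_test.values()))           # 10.0
print(sum(two_tests_spread.values()))   # 20.0
print(sum(two_tests_collide.values()))  # 10.0
```

The last case is why I say two simultaneous tests "might" use two links: if both streams land on the same member, the total is no better than a single test.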
The exception to the sender-controlling rule is the switch-independent teaming mode. Inbound traffic is locked to a single physical adapter as all MAC addresses are registered in a single location. It can still load-balance outbound traffic across all ports. If used with the Hyper-V port load-balancing algorithm, then the MAC addresses for virtual adapters will be evenly distributed across available physical adapters. Each virtual adapter can still only receive at the maximum speed of a single port, though.
Some switches have the power to “stack”. What this means is that individual physical switches can be combined into a single logical unit. Then, they share a configuration and operate like a single unit. The purpose is for redundancy. If one of the switch(es) fails, the other(s) will continue to operate. What this means is that you can split a static or LACP inter-switch connection, including to a Hyper-V switch, across multiple physical switch units. It’s like having all the power of the switch independent mode with none of the drawbacks.
One concern with stacked switches is the interconnect between them. Some use a special interlink cable that provides very high data transfer speeds. With those, the only bad thing about the stack is the monetary cost. Cheaper stacking switches often just use regular Ethernet or 1 Gb or 2 Gb fiber. This could lead to bandwidth contention between the stack members. Since most networks use only a fraction of their available bandwidth at any given time, this may not be an issue. For heavily loaded core switches, a superior stacking method is definitely recommended.
Without some understanding of load-balancing algorithms, it’s hard to get the complete picture here. These are the biggest things to understand:
- The switch independent mode is the closest to the original mode of network adapter teaming that has been in common use for years. It requires that all inbound traffic flow to a single adapter. You cannot choose this adapter. If combined with the Hyper-V switch port load-balancing algorithm, virtual switch ports are distributed evenly across the available adapters and each will use only its assigned port for inbound traffic.
- Static and LACP modes are common to the Windows/Hyper-V Server NIC team and most smart switches.
- Not all static and LACP implementations are created equal. You may encounter problems connecting to some switches.
- LACP doesn’t have any capabilities for bandwidth aggregation that the static method does not have.
- Bandwidth aggregation occurs by balancing different communications streams across available links, not by using all possible paths for each stream.
While it might seem logical that the next post would be about the load-balancing algorithms, that’s actually a little more advanced than where I’m ready for this series to proceed. Bandwidth aggregation using static and LACP modes is a fairly basic concept in terms of switching. I’d like to continue with the basics of traffic flow by talking about DNS and protocol bindings.
After storage, Hyper-V’s next most confusing subject is networking. There are a dizzying array of choices and possibilities. To make matters worse, many administrators don’t actually understand that much about the fundamentals because, up until now, they’ve never really had to.
Why It Matters
In the Windows NT 4.0 days, the Microsoft Certified Systems Engineer exam track required passage of “Networking Essentials” and the electives included a TCP/IP exam. Neither of these exams had a counterpart in the Windows 2000 track and, although I haven’t kept up much with the world of certification since the Windows 2003 series, I’m fairly certain that networking has largely disappeared from Microsoft certifications. That’s both a blessing and a curse. Basic networking isn’t overly difficult and a working knowledge can be absorbed through simple hands-on experience. More advanced, and sometimes even intermediate skills, can be involved and require a fair level of dedication. If all you really need to do is plug a Windows Server into an existing network and get it going, then a lot of that is probably excess detail that you can leave to someone else. There are certification, expertise, and career tracks available just for networking, and the network engineers and administrators that earn them deserve to have their own world separate from system engineering and administration. Learning all of that is burdensome for systems administrators and is unlikely to pay dividends, especially with the risk of skill rot. The downside is that it’s no longer good enough to know how to set up a vendor team and slam in some basic IP information. Too many systems people have ignored the networking stacks in favor of their servers and applications and are now playing catch-up as integrated teaming, datacenter bridging, software-defined networking, and other technologies escape the confines of netops and intrude into the formerly tidy world of sysops.
The first post of this series will (re)introduce you to the fundamentals of networking that you will build the rest of your Hyper-V networking understanding upon.
The OSI Model
If you’ve never heard the phrases “All People Seem To Need Data Processing” or “Please Do Not Throw Sausage Pizza Away”, then someone along your technical education path has done you a great disservice (or you learned the OSI model in a non-English language). These are mnemonics used by students to drill for exams that test on the seven layers of the OSI model, which obviously worked because I can still recall them fifteen years later:
- Application
- Presentation
- Session
- Transport
- Network
- Data Link
- Physical
Oddly enough, I’ve never been asked on any test what “OSI” stands for and I had to look that up: Open Systems Interconnection. Now you know what to put on that blank Apples to Apples card if you want to never be invited to a party again.
The reason that we have two mnemonics is because traffic travels both ways in the model. If your application is Skype, then the model covers your voice being broken into a rush of electrons (from seventh down to first layer) and back into something that might sound almost like you on the other side of an ocean (from first up to seventh layer).
The OSI model is a true model in that it does nothing but describe how a complete networking stack might look. In practice, there is nothing that perfectly matches to this model. The idea is that each of the seven layers performs a particular function in network communications, but only knows enough to interoperate with the layer immediately above and immediately below. So, no jumping from the physical layer to the presentation layer, for instance.
I’m not going to spend a bunch of time on the seven layers. There are a lot of great references and guides available on the Internet, so if you really care, do some searching and find the resource that suits your learning model. If you’re in systems or network administration/engineering, layers six and seven will likely never be of any real concern to you. You might occasionally care about layer five. We’re really focused on layers one through four, and that’s what we’ll talk about.
Use the following diagram as a visual reference for the upcoming sections:
In theory, layer one is extremely easy to understand; it’s all in the name: “physical”. This is the electrons and the wires and fiber and switch ports and network adapters and such. It’s the world of twisted pairs and CAT-5e and crossover. In practice, it’s always messier. This is also the world of crosstalk and interference and phrases like, “Hey, we can use the ballasts in these fluorescent lights as anchors for the network cable, right?” and, “I only use cables with red jackets because they have better throughput than the blue ones,” and all sorts of other things that are pretty maddening if you spend too much time thinking about them. We’ll move along quickly. Just be aware that cables and switches are important, and they break, and they need to be cared for.
Layer two is where things start to get fuzzy. From this point upward, everything exists because we say it does. It’s the first level at which those pulses of light and electron bursts take on some sort of meaning. For us in the Hyper-V world, it’s mostly going to be Ethernet. The unit of communication in Ethernet is the frame. In keeping with our layered model concept, the frame was a sequence of light or electric pulses that some physical device, like a network adapter, has re-interpreted into digital bits. The Ethernet specification says that a series of bits has a particular format and meaning. An incoming series of these bits starts with a header and is then followed by what is expected to be a data section (called the payload), and ends with a set of validation bits. This is the first demonstration point of the OSI model: layer one handles all the nasty parts of converting pulses to data bits and back. Layer two is only aware of and concerned with the ordering of these bits.
The Ethernet Frame
By tearing apart the Ethernet frame header, we can see most of the basic features that live in this layer.
Tagged Ethernet Frame
The first thing of note is the destination and source MAC addresses (“media access control address”). On any Windows machine, run IPCONFIG /ALL and you’ll find the MAC address in the Physical Address field. Run Get-NetAdapter in PowerShell and you can retrieve the value of the MacAddress field or the LinkLayerAddress field. The MAC address comes in six binary octets, usually represented in two-digit hexadecimal number groupings, like this: E0-06-E6-2A-CD-FB. In case it’s not obvious, the hyphens are only present to make it human readable. You’ll sometimes see colons used, or no delimiters at all. Every network device manufacturer (and some other entities) has its own prefix(es), indicated in the first three octets. If you search the Internet for “MAC address prefix lookup”, you’ll find a number of sites that allow you to identify the actual manufacturer of the network chip on your branded adapter.
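Since the same address can show up with hyphens, colons, or nothing at all, a little normalizing helper is handy before you paste a prefix into a lookup site. This is a small sketch; the function names are my own, and it only handles the delimiter styles mentioned above plus the dotted style some switch vendors print.

```python
import re


def normalize_mac(text):
    """Return a MAC address as hyphen-separated upper-case hex octets."""
    digits = re.sub(r"[-:.]", "", text.strip())
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError(f"not a 48-bit MAC address: {text!r}")
    digits = digits.upper()
    return "-".join(digits[i:i + 2] for i in range(0, 12, 2))


def vendor_prefix(text):
    """Return the first three octets, which identify the manufacturer."""
    return normalize_mac(text)[:8]


print(normalize_mac("e0:06:e6:2a:cd:fb"))  # E0-06-E6-2A-CD-FB
print(normalize_mac("E006E62ACDFB"))       # E0-06-E6-2A-CD-FB
print(vendor_prefix("E0-06-E6-2A-CD-FB"))  # E0-06-E6
```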
The presence of the MAC address in the Ethernet frame tells us that layer 2 is what deals with these addresses. Therefore, it could also be said that this is the level at which we will find ARP (address resolution protocol), although, as a tangent, it could also be considered as layer 3. Either way, all data that travels across an Ethernet network knows only about MAC addresses. There is no other addressing scheme available here. TCP/IP and its attendant IP addresses have no presence in Ethernet, and unless you get really deep into the technicalities, TCP/IP isn’t considered to be in layer two at all. It’s vital that you understand this, as it is a common stumbling point that presents a surprisingly high barrier to comprehension. As a bit of a trivia piece, the ability to manage MAC addresses and tables is what differentiates a switch from a hub.
Next, we might encounter the 802.1q tag. This is the technology that enables VLANs to work. This is a potentially confusing topic and will get its own section later. For now, just be aware that, if present, VLAN information lives in the Ethernet frame which means it is part of layer 2. Layer 3 and upward have no idea that VLANs even exist.
What puts layer two right in the face of the Windows administrator is the fact that the Hyper-V virtual switch and Windows network adapter teaming live at this level. Without an ability to parse the Ethernet frame, teaming cannot work at all. It must be able to work with MAC addresses. The Hyper-V virtual switch is a switch, and as such it must also be aware of MAC addresses. It also happens to be a smart switch, so it must also know how to work with 802.1q VLAN tags.
A fairly recent addition to the Ethernet specification is Datacenter Bridging (DCB). This is an advanced subject that I might write a dedicated article about, as it is a large and complex topic in its own right. The basic goal of DCB is to overcome the lossy nature of Ethernet in the datacenter, where data loss is both unnecessary and undesirable. There are a number of implementations, but the Ethernet versions include some way of tagging the frame. The significance is that Windows can apply a DCB tag to traffic, and DCB-aware physical switches are able to process and prioritize traffic according to these tags. You need a fairly large TCP/IP network for this to be an issue of major concern, as most LANs see so little contention that any data loss usually indicates a broken component.
The final thing we’re going to talk about here is the payload. In the modern Windows world, the content of this payload is a TCP/IP packet. It doesn’t have to be that, though. In days of yore, it might have been an IPX/SPX packet. Or a NetBEUI packet. Or anything. All that Ethernet cares about is the destination MAC address. Once the frame is delivered, layer two will unpackage the packet and deliver it up to layer three to deal with.
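If it helps to see the header fields as software sees them, the following Python sketch splits a raw frame into the pieces discussed in this section: destination MAC, source MAC, the optional 802.1q tag, and the payload. It assumes the preamble and trailing validation bits have already been stripped off, and the sample frame (including the source MAC and VLAN 20) is made up for illustration. Treat it as a sketch of the layout, not as how any particular driver does the job.

```python
import struct

TPID_8021Q = 0x8100  # type value that marks the presence of an 802.1q tag

def parse_ethernet_header(frame):
    """Split raw frame bytes into header fields and payload."""
    # Six bytes of destination MAC, six of source MAC, two of type.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    offset = 14
    vlan_id = None
    if ethertype == TPID_8021Q:
        # A tagged frame carries 16 bits of tag control information
        # (the low 12 bits are the VLAN ID), then the real type field.
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        vlan_id = tci & 0x0FFF
        offset = 18
    return {
        "destination": dst.hex("-").upper(),
        "source": src.hex("-").upper(),
        "vlan_id": vlan_id,
        "ethertype": ethertype,
        "payload": frame[offset:],
    }

# Hypothetical tagged frame: VLAN 20, type 0x0800 (IPv4), dummy payload.
tagged = bytes.fromhex("E006E62ACDFB" "001122334455" "8100" "0014" "0800") + b"packet"
fields = parse_ethernet_header(tagged)
print(fields["destination"], fields["vlan_id"], fields["payload"])
```

Notice that nothing in the parser looks inside the payload. Layer two hands those bytes upward without knowing or caring what they contain.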
Here is where we first begin to encounter TCP/IP. A couple of things to note here. First, TCP/IP is not really a protocol, but a protocol group. TCP is one of them, IP is another, so on and so forth. Second, it’s also where you really start to see that the layers of the OSI model are only conceptual, because a number of things could be considered to exist in multiple layers simultaneously.
Layer three is where we start talking about the packet as opposed to the frame. Ethernet (or Token Ring, or any other layer 2 protocol… it doesn’t really matter) has delivered the frame and the payload has been extracted for processing. Everything layer 2-related is now gone: no MAC address. No 802.1q tag. In general, the network adapter driver is the first and last thing in your Windows system to know anything about the Ethernet frame. After that, Windows takes over with the TCP/IP stack.
What we have at this level is IP. The stand-out feature of IP is, of course, the IP address. In IPv4, this is a four-octet binary number that is usually represented in dotted-decimal notation, like this: 192.168.25.37. IP is the addressing mechanism of layer three.
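The relationship between the four octets and the dotted-decimal form is easy to see with Python's standard ipaddress module. The two helper names below are made up for this example.

```python
import ipaddress

def octets_of(dotted):
    """Return the four octets of an IPv4 address as a list of integers."""
    return list(ipaddress.IPv4Address(dotted).packed)

def dotted_from(octets):
    """Rebuild the dotted-decimal text from four octets."""
    return str(ipaddress.IPv4Address(bytes(octets)))

octets = octets_of("192.168.25.37")
print(octets)                                  # [192, 168, 25, 37]
print(" ".join(f"{o:08b}" for o in octets))    # 11000000 10101000 00011001 00100101
print(dotted_from(octets))                     # 192.168.25.37
```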
TCP/IP traffic is packaged in the packet. In many ways, it looks similar to the Ethernet frame. It has a defined sequence that includes a header and a data section. Inside the header, we find source and destination IP addresses. This is also the point at which we can start thinking about routing.
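As a rough illustration of that header, this Python sketch reads the fixed 20-byte portion of an IPv4 header and pulls out the source and destination addresses. It also reads the protocol field, which names what the payload holds (1 for ICMP, 6 for TCP, 17 for UDP). The sample packet is hand-built with invented addresses and a zeroed checksum, and the sketch does not validate checksums or handle header options.

```python
import ipaddress
import struct

PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}

def parse_ipv4_header(packet):
    """Read the fixed 20-byte part of an IPv4 header."""
    (version_ihl, _tos, total_length, _ident, _flags_fragment,
     ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    # The low four bits give the header length in 32-bit words.
    header_length = (version_ihl & 0x0F) * 4
    return {
        "version": version_ihl >> 4,
        "ttl": ttl,
        "protocol": PROTOCOL_NAMES.get(protocol, protocol),
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
        "payload": packet[header_length:total_length],
    }

# Hypothetical packet: 20-byte header plus four bytes of payload.
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 24, 0, 0, 128, 6, 0,
    ipaddress.IPv4Address("192.168.25.37").packed,
    ipaddress.IPv4Address("192.168.25.1").packed,
) + b"data"
header = parse_ipv4_header(sample)
print(header["source"], header["destination"], header["protocol"])
```

There is no MAC address and no VLAN tag anywhere in this structure, which is the point: by the time the packet is being read, the frame that carried it is gone.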
A very important fact to know when you’re testing a network is that ICMP (which means PING for most of us) lives in layer 3, not layer 4. You need to be aware of this because you will see behaviors in ICMP that don’t make a lot of sense when you try to think of them in terms of layer 4 behavior, especially in comparison to TCP and UDP. We’ll talk about this again once we are introduced to layer 4.
What’s not here is the Hyper-V virtual switch. It has no IP address of its own and is generally oblivious to the fact that IP addresses exist. When you “share” the physical adapter that a Hyper-V switch is assigned to, what actually happens is that a virtual network adapter is created for the management operating system. That virtual adapter “connects” to the Hyper-V virtual switch at layer one (which is, of course, virtual). It does the work of bringing the layer two information off the Hyper-V switch into the layer three world of the management operating system. So, the virtual switch and virtual adapter are in layers one and two, but only the adapter can be said to meaningfully participate in layer three at all.
The Hyper-V Server/Windows Server team is also not really in layer three. You do create a team interface, but it works much like Hyper-V’s virtual adapter.
Layer four is where we find much more of the TCP/IP stack, particularly TCP and UDP. The OSI model is really fuzzy at this point, because these protocols are advertised right there in the TCP/IP packet header, which is definitely a layer three object. However, it is the TCP/IP control software operating in this layer that is responsible for the packaging and handling of these various packets, and the actual differences are seen inside the payload portion of the packet. For the most part, Hyper-V administrators don’t really need to think much about layer four operations, but having no understanding of them will hurt.
The features that we see in layer four are really what made TCP/IP the most popular protocol suite. This is especially true for TCP, which allows for packets to be lost while preventing data loss. TCP packets are tracked from source to destination, and if one never arrives, the recipient can signal for a retransmission. Likewise, if a few packets in a stream happen to travel a different route and arrive out of order, this protocol can put them back into their original intended pattern. UDP does not do this, but it shares TCP’s ability to detect problems: both carry a checksum that lets the recipient spot damaged data.
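The ordering and retransmission idea can be modeled in a few lines of Python. This is a deliberately simplified toy: real TCP numbers individual bytes rather than whole segments and uses acknowledgements and timers, none of which appear here. The function name and the sample data are invented for the example.

```python
def reassemble(segments, expected_count):
    """Toy model of receive-side ordering and loss detection.

    segments is a list of (sequence_number, data) pairs in arrival order.
    Returns the data that can be delivered in order, plus the sequence
    numbers that would need to be retransmitted.
    """
    received = dict(segments)
    missing = [n for n in range(expected_count) if n not in received]
    delivered = []
    for n in range(expected_count):
        if n not in received:
            break  # hold later data back until the gap is filled
        delivered.append(received[n])
    return b"".join(delivered), missing

# Segments arrive out of order but complete.
print(reassemble([(0, b"Live "), (2, b"gration"), (1, b"Mi")], 3))
# Segment 1 never arrives, so only the first part can be delivered.
print(reassemble([(0, b"Live "), (2, b"gration")], 3))
```

In the second call, the receiver knows exactly which piece is missing, and that knowledge is what makes a retransmission request possible.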
This capability is really what separates layer three from layer four, and why ICMP doesn’t behave like a layer four protocol. For instance, if you’re running a Live Migration and a ping is dropped, that doesn’t mean that TCP will be affected at all. I’ve heard it said that ICMP is designed to find network problems and that’s why it fails when other protocols don’t. That’s true to some degree, but it’s also because the functionality that allows TCP and UDP to deal with aberrations in the network is not a layer 3 function.
The best summary of the process described by the OSI model is that networking is a series of encapsulations. The following illustration shows this at the levels we’ve discussed:
Each successive layer takes the output of the previous layer, and depending on the direction that the data is flowing, either encapsulates it with that layer’s information or unpacks the data for further processing.
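The same wrapping and unwrapping can be sketched in Python with nested dictionaries standing in for real headers. The layer labels and function names are placeholders chosen for this example; actual headers are binary structures, not text labels.

```python
# Order in which wrappers are added on the way down the stack.
LAYERS = ["layer 4 (TCP)", "layer 3 (IP)", "layer 2 (Ethernet)"]

def encapsulate(data):
    """Wrap application data once per layer, innermost layer first."""
    unit = data
    for layer in LAYERS:
        unit = {"header": layer, "payload": unit}
    return unit

def decapsulate(unit):
    """Peel the wrappers off on the way back up, outermost first."""
    seen = []
    while isinstance(unit, dict):
        seen.append(unit["header"])
        unit = unit["payload"]
    return seen, unit

frame = encapsulate("hello")
print(frame["header"])       # layer 2 (Ethernet)
print(decapsulate(frame))
```

Each layer only ever looks at its own header and treats everything inside as an opaque payload.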
In the next installment of this series, we’ll start to see application of these concepts by tearing into VLANs.
Ever wonder why your virtual machines report that their network speed is 10 Gbps, even if you haven’t got a 10 Gbps adapter in the physical box? If so, you’re certainly not alone. Knowing why depends on an understanding of the Hyper-V virtual switch.
Anyone who has read much of my work on Hyper-V knows that I’m of the opinion that networking is one of the most complicated aspects of setting up Hyper-V, especially in a clustered environment. Part of it is that a lot of the concepts in Hyper-V networking lack a counterpart in the physical realm, so previous experience doesn’t carry forward very well. Another part of it is that Hyper-V is the first time that many administrators will see multiple network cards in the same physical unit that aren’t teamed together. One aspect that confuses a lot of people is the role of binding order for all those adapters in the parent partition. The contents of this post will apply to both Hyper-V R2 and 2012.