Hyper-V Backup Best Practices: Terminology and Basics

One of my very first jobs performing server support on a regular basis was heavily focused on backup. I witnessed several heart-wrenching tragedies of permanent data loss but, fortunately, played the role of data savior much more frequently. I know that most, if not all, of the calamities could have at least been lessened had the data owners been more educated on the subject of backup. I believe very firmly in the value of a solid backup strategy, which I also believe can only be built on the basis of a solid education in the art. This article’s overarching goal is to give you that education by serving a number of purposes:

  • Explain industry-standard terminology and how to apply it to your situation
  • Address and wipe away 1990s-style approaches to backup
  • Clearly illustrate backup from a Hyper-V perspective

Backup Terminology

Whenever possible, I avoid speaking in jargon and TLAs/FLAs (three-/four-letter acronyms) unless I’m talking to a peer that I’m certain has the experience to understand what I mean. When you start exploring backup solutions, you will have these tossed at you rapid-fire with, at most, brief explanations. If you don’t understand each and every one of the following, stop and read those sections before proceeding. If you’re lucky enough to be working with an honest salesperson, it’s easy for them to forget that their target audience may not be completely following along. If you’re less fortunate, it’s simple for a dishonest salesperson to ridiculously oversell backup products through scare tactics that rely heavily on your incomplete understanding.

  • Backup
  • Full/incremental/differential backup
  • Delta
  • Deduplication
  • Inconsistent/crash-consistent/application-consistent
  • Bare-metal backup/restore (BMB/BMR)
  • Disaster Recovery/Business Continuity
  • Recovery point objective (RPO)
  • Recovery time objective (RTO)
  • Retention
  • Rotation — includes terms such as Grandfather-Father-Son (GFS)

There are a number of other terms that you might encounter, although these are the most important for our discussion. If you encounter a vendor making up their own TLAs/FLAs, take a moment to investigate their meaning in comparison to the above. Most are just marketing tactics — inherently harmless attempts by a business entity trying to turn a coin by promoting its products. Some are more nefarious — attempts to invent a nail for which the company just conveniently happens to provide the only perfectly matching hammer (with an extra “value-added” price, of course).

Backup

This heading might seem pointless — doesn’t everyone know what a backup is? In my experience, no. In order to qualify as a backup, you must have a distinct, independent copy of data. A backup cannot have any reliance on the health or well-being of its source data or the media that contains that data. Otherwise, it is not a true backup.

Full/Incremental/Differential Backups

Recent technology changes and their attendant strategies have made this terminology somewhat less popular than in past decades, but it is still important to understand because it is still in widespread use. They are presented in a package because they make the most sense when compared to each other. So, I’ll give you a brief explanation of each and then launch into a discussion.

  • Full Backups: Full backups are the easiest to understand. They are a point-in-time copy of all target data.
  • Differential Backups: A differential backup is a point-in-time copy of all data that is different from the last full backup that is its parent.
  • Incremental Backups: An incremental backup is a point-in-time copy of all data that is different from the backup that is its parent.

The full backup is the safest type because it is the only one of the three that can stand alone in any circumstances. It is a complete copy of whatever data has been selected.
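If it helps to see the relationship spelled out, here is a small Python sketch of which backup files a restore would need under each scheme. The weekly-full/daily schedule and the file names are purely illustrative assumptions:

```python
# Which backups does a restore need? A minimal sketch assuming a weekly full on
# day 0 and daily incrementals or differentials afterward. Names are hypothetical.

def restore_chain(day, scheme):
    """Return the list of backups required to fully restore data as of 'day'."""
    if scheme == "full":
        return [f"full_day{day}"]                         # a full backup stands alone
    if scheme == "differential":
        return ["full_day0", f"diff_day{day}"]            # latest full + latest differential only
    if scheme == "incremental":
        # the latest full plus every incremental up to and including the target day
        return ["full_day0"] + [f"incr_day{d}" for d in range(1, day + 1)]
    raise ValueError(f"unknown scheme: {scheme}")

print(restore_chain(3, "differential"))  # ['full_day0', 'diff_day3']
print(restore_chain(3, "incremental"))   # ['full_day0', 'incr_day1', 'incr_day2', 'incr_day3']
```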

Full Backup

A differential backup is the next safest type. Remember the following:

  • To fully restore the latest data, a differential backup always requires two backups: the latest full backup and the latest differential backup. Intermediary differential backups, if any exist, are not required.
  • It is not necessary to restore from the most recent differential backup if an earlier version of the data is required.
  • Depending on what data is required and the intelligence of the backup application, it may not be necessary to have both backups available to retrieve specific items.

The following is an illustration of what a differential backup looks like:

Differential Backup

Each differential backup goes all the way back to the latest full backup as its parent. Also, notice that each differential backup is slightly larger than the preceding differential backup. That growth pattern is the conventional wisdom on the matter. In theory, each differential backup contains the previous backup’s changes as well as any new changes. In reality, it truly depends on the change pattern. A file backed up on Monday might have been deleted on Tuesday, so that part of the backup certainly won’t be larger. A file that changed on Tuesday might have had half its contents removed on Wednesday, which would make that part of the backup smaller. A differential backup can range anywhere from essentially empty (if nothing changed) to as large as the source data (if everything changed). Realistically, though, you should expect each differential backup to be slightly larger than the previous one.

The following is an illustration of an incremental backup:

Incremental Backup

Incremental backups are best thought of as a chain. The above shows a typical daily backup in an environment that uses a weekly full with daily incrementals. If all data is lost and restoring to Wednesday’s backup is necessary, then every single night’s backup from Sunday onward will be necessary. If any one is missing or damaged, then it will likely not be possible to retrieve anything from that backup or any backup afterward. Therefore, incremental backups are the riskiest; they are also the fastest and consume the least amount of space.

Historically, full/incremental/differential backups have been facilitated by an archive bit in Windows. Anytime a file is changed, Windows sets its archive bit. The backup types operate with this behavior:

  • A full backup captures all target files and clears any archive bits that it finds.
  • A differential backup captures only target files that have their archive bit set and leaves the bit exactly as it found it.
  • An incremental backup captures only files with the archive bit set and clears it afterward.

Archive Bit Example
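To make that behavior concrete, here is a minimal Python sketch of the three capture rules. The file list and bit values are hypothetical; real backup software reads the bit from the file system:

```python
# Archive-bit behavior of the three backup types, as described above.
# The file names and bit states are made up for illustration.

files = {"report.docx": True, "notes.txt": False, "ledger.xlsx": True}  # True = archive bit set

def run_backup(kind, files):
    """Return the files captured by a backup of the given kind and update the bits."""
    if kind == "full":
        captured = list(files)                            # everything, regardless of the bit
        for name in files:
            files[name] = False                           # clears every archive bit it finds
    elif kind == "differential":
        captured = [name for name, bit in files.items() if bit]
        # leaves each bit exactly as it found it
    elif kind == "incremental":
        captured = [name for name, bit in files.items() if bit]
        for name in captured:
            files[name] = False                           # clears the bits it captured
    else:
        raise ValueError(kind)
    return captured

print(run_backup("differential", dict(files)))  # ['report.docx', 'ledger.xlsx']
```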

Delta

“Delta” is probably the most overthought word in all of technology. It means “difference”. Do not analyze it beyond that. It just means “difference”. If you have $10 in your pocket and you buy an item for $8, the $8 that you spent is the “delta” between the amount of money that you had before you made the purchase and the amount of money that you have now.

The way that vendors use the term “delta” sometimes changes, but usually not by a great deal. In the earliest incarnation of “delta” applied to backups that I am aware of, it meant intra-file changes. All previous backup types operated with individual files being the smallest level of granularity (not counting specialty backups such as Exchange item-level). Delta backups would analyze the blocks of individual files, making the granularity one step finer.

The following image illustrates the delta concept:

Delta Backup

A delta backup is essentially an incremental backup, but at the block level instead of the file level. Somebody got the clever idea to use the word “delta”, probably so that it wouldn’t be confused with “differential”, and the world thought it must mean something extra special because it’s Greek.

The major benefit of delta backups is that they use much less space than even incremental backups. The trade-off is the computing power required to calculate deltas. The archive bit can tell the backup application whether a file needs to be scanned, but not which blocks within that file have changed. Backup systems that perform delta operations require some other method for change tracking.
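As a rough illustration of what that change tracking has to accomplish, the following Python sketch compares block hashes against those recorded at the previous backup. The 4 KiB block size and the hashing approach are assumptions for illustration, not any particular vendor’s method:

```python
# Minimal sketch of block-level delta detection. Real products typically use
# changed block tracking instead of rehashing everything at backup time.
import hashlib

BLOCK = 4096

def block_hashes(data: bytes):
    """Hash every 4 KiB block of the data."""
    return [hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def delta_blocks(old_hashes, new_data: bytes):
    """Return (block_index, block_bytes) pairs for blocks that differ from the previous backup."""
    changed = []
    for i, h in enumerate(block_hashes(new_data)):
        if i >= len(old_hashes) or old_hashes[i] != h:
            changed.append((i, new_data[i * BLOCK:(i + 1) * BLOCK]))
    return changed

previous = b"A" * BLOCK * 4
current = previous[:BLOCK] + b"B" * BLOCK + previous[2 * BLOCK:]      # only the second block changed
print([i for i, _ in delta_blocks(block_hashes(previous), current)])  # [1]
```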

Deduplication

Deduplication represents the latest iteration of backup innovation. The term explains itself quite nicely. The backup application searches for identical blocks of data and reduces them to a single copy.

Deduplication involves three major feats:

  • The algorithm that discovers duplicate blocks must operate in a timely fashion
  • The system that tracks the proper location of duplicated blocks must be foolproof
  • The system that tracks the proper location of duplicated blocks must use significantly less storage than simply keeping the original blocks

So, while deduplication is conceptually simple, implementations can depend upon advanced computer science.
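Conceptually, a rudimentary deduplicator could look something like the Python sketch below, which collapses identical blocks to a single stored copy plus an index for rebuilding the original data. The block size and hash choice are assumptions; production engines are far more sophisticated:

```python
# Minimal sketch of block deduplication: identical blocks collapse to one stored
# copy, and an ordered index of hashes records how to rebuild the original data.
import hashlib

BLOCK = 4096

def deduplicate(data: bytes):
    store = {}      # hash -> unique block contents
    index = []      # ordered list of hashes needed to rebuild the original data
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)   # keep only the first copy of each block
        index.append(digest)
    return store, index

def rehydrate(store, index):
    return b"".join(store[digest] for digest in index)

data = b"A" * BLOCK * 3 + b"B" * BLOCK            # three identical blocks plus one unique
store, index = deduplicate(data)
assert rehydrate(store, index) == data
print(len(store), "unique blocks for", len(index), "logical blocks")  # 2 unique, 4 logical
```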

Deduplication’s primary benefit is that it can produce backups that are even smaller than delta systems. Part of that will depend on the overall scope of the deduplication engine. If you were to run fifteen new Windows Server 2016 virtual machines through even a rudimentary deduplicator, it would reduce all of them to the size of approximately a single Windows Server 2016 virtual machine — a 93% savings.

There is risk in overeager implementations, however. With all data blocks represented by a single copy, each block becomes a single point of failure. The loss of a single vital block could spell disaster for a backup set. This risk can be mitigated by employing a single pre-existing best practice: always maintain multiple backups.

Inconsistent/Crash-Consistent/Application-Consistent

We already have an article set that explores these terms in some detail. Quickly:

  • Inconsistent backups would be effectively the same thing as performing a manual file copy of a directory tree.
  • Crash-consistent backup captures data as it sits on the storage volume at a given point in time, but cannot touch anything passing through the CPU or waiting in memory. You could lose any in-flight I/O operations.
  • Application-consistent backup coordinates with the operating system and, where possible, individual applications to ensure that in-flight I/Os are flushed to disk so that there are no active file changes at the moment that the backup is taken.

I occasionally see people twisting these terms around, although I believe that’s mostly accidental. The definitions that I used above have been the most common, stretching back into the 90s. Be aware that there are some disagreements, so ensure that you clarify terminology with any salespeople.

Bare-Metal Backup/Restore

A so-called “bare-metal backup” and/or “bare metal restore” involves capturing the entirety of a storage unit including metadata portions such as the boot sector. These backup/restore types essentially mean that you could restore data to a completely empty physical system without needing to install an operating system and/or backup agent on it first.

Disaster Recovery/Business Continuity

The terms “Disaster Recovery” (DR) and “Business Continuity” are often used somewhat interchangeably in marketing literature. “Disaster Recovery” is the older term and more accurately reflects the nature of the involved solutions. “Business Continuity” is a newer, more exciting version that sounds more positive but mostly means the same thing. These two terms encompass not just restoring data, but restoring the organization to its pre-disaster state. “Business Continuity” is used to emphasize the notion that, with proper planning, disasters can have little to no effect on your ability to conduct normal business. Of course, the more “continuous” your solution is, the higher your costs are. That’s not necessarily a bad thing, but it must be understood and expected.

One thing that I really want to make very clear about disaster recovery and/or business continuity is that these terms extend far beyond just backing up and restoring your data. DR plans need to include downtime procedures, phone trees, alternative working sites, and a great deal more. You need to think all the way through a disaster from the moment that one occurs to the moment that everything is back to some semblance of normal.

Recovery Point Objective

The maximum acceptable span of time between the latest backup and a data loss event is called a recovery point objective (RPO). If the words don’t sound very much like their definition, that’s because someone worked really hard to couch a bad situation within a somewhat neutral term. If it helps, the “point” in RPO means “point in time.” Of all data adds and changes, anything that happens between backup events has the highest potential of being lost. Many technologies have some sort of fault tolerance built in; for instance, if your domain controller crashes and it isn’t due to a completely failed storage subsystem, you’re probably not going to need to go to backup. Most other databases can tell a similar story. RPOs mostly address human error and disaster. More common failures should be addressed by technology branches other than backup, such as RAID.

A long RPO means that you are willing to lose a greater span of changes. A daily backup gives you a 24-hour RPO. Taking backups every two hours results in a 2-hour RPO. Remember that an RPO represents a maximum; it is highly unlikely that a failure will occur immediately prior to the next backup operation, so the actual loss will usually be smaller than the objective.
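The arithmetic is simple enough to show directly. Under the simplifying assumption that a failure is equally likely at any moment, the backup interval is your worst case and the average loss window is about half of it:

```python
def rpo_hours(backup_interval_hours: float) -> dict:
    """Worst-case and rough average data-loss window for a given backup interval."""
    return {
        "worst_case_loss_hours": backup_interval_hours,     # failure right before the next backup
        "average_loss_hours": backup_interval_hours / 2,    # failures spread evenly over the interval
    }

print(rpo_hours(24))  # daily backups   -> 24-hour RPO
print(rpo_hours(2))   # two-hour cycle  ->  2-hour RPO
```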

Recovery Time Objective

Recovery time objective (RTO) represents the maximum amount of time that you are willing to wait for systems to be restored to a functional state. This term sounds much more like its actual meaning than RPO. You need to take extra care when talking with backup vendors about RTO. They will tend to only talk about RTO in terms of restoring data to a replacement system. If your primary site is your only site and you don’t have a contingency plan for complete building loss, your RTO is however long it takes to replace that building, fill it with replacement systems, and restore data to those systems. Somehow, I suspect that a six-month or longer RTO is unacceptable for most institutions. That is one reason that DR planning must extend beyond taking backups.

In more conventional usage, RTOs will be explained as though there is always a target system ready to receive the restored data. So, if your backup drives are taken offsite to a safety deposit box by the bookkeeper when she comes in at 8 AM, your actual recovery time is essentially however long it takes someone to retrieve the backup drive plus the time needed to perform a restore in your backup application.

Retention

Retention is the desired amount of time that a backup should be kept. This deceptively simple description hides some complexity. Consider the following:

  • Legislation mandates a ten-year retention policy on customer data for your industry. A customer was added in 2007. Their address changed in 2009. Must the customer’s data be kept until 2017 or 2019?
  • Corporate policy mandates that all customer information be retained for a minimum of five years. The line-of-business application that you use to record customer information never deletes any information that was placed into it and you have a copy of last night’s data. Do you need to keep the backup copy from five years ago or is having a recent copy of the database that contains five-year-old data sufficient?

Questions such as these can plague you. Historically, monthly and annual backup tapes were simply kept for a specific minimum number of years and then discarded, which more or less answered the question for you. Tape is an expensive solution, however, and many modern small businesses do not use it. Furthermore, laws and policies only dictate that the data be kept; nothing forced anyone to ensure that the backup tapes were readable after any specific amount of time. One lesson that many people learn the hard way is that tapes stored flat can lose data after a few years. We used to joke with customers that their bits were sliding down the side of the tape. I don’t actually understand the governing electromagnetic phenomenon, but I can verify that it does exist.

With disk-based backups, the possibilities are changed somewhat. People typically do not keep stacks of backup disks lying around, and their ability to hold data for long periods of time is not the same as backup tape. The rules are different — some disks will outlive tape, others will not.

Rotation

Backup rotations deal with the media used to hold backup information. This has historically meant tape, and tape rotations often came in some very grandiose schemes. One of the most widely used rotations is called “Grandfather-Father-Son” (GFS):

  • One full backup is taken monthly. The media it is taken on is kept for an extended period of time, usually one year. One of these is often considered an annual and kept longer. This backup is called the “Grandfather”.
  • Each week thereafter, on the same day, another full backup is taken. This media is usually rotated so that it is re-used once per month. This backup is known as the “Father”.
  • On every day between full backups, an incremental backup is taken. Each day’s media is rotated so that it is re-used on the same day each week. This backup is known as the “Son”.

The purpose of rotation is to have enough backups to provide sufficient possible restore points to guard against a myriad of possible data loss instances without using so much media that you bankrupt yourself and run out of physical storage room. Grandfathers are taken offsite and placed in long-term storage. Fathers are taken offsite, but perhaps not placed in long-term storage so that they are more readily accessible. Sons are often left onsite, at least for a day or two, to facilitate rapid restore operations.
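If you prefer to see the scheme as logic rather than prose, here is a small Python sketch that assigns a GFS role to a calendar date. The calendar rules used here (first of the month for the Grandfather, Sundays for the Father) are illustrative assumptions; real rotations vary:

```python
# Rough model of a Grandfather-Father-Son rotation with monthly fulls,
# weekly fulls, and daily incrementals. The calendar rules are assumptions.
import datetime

def gfs_backup_type(day: datetime.date) -> str:
    if day.day == 1:
        return "Grandfather (monthly full, retained about a year)"
    if day.weekday() == 6:  # Sunday
        return "Father (weekly full, media reused monthly)"
    return "Son (daily incremental, media reused weekly)"

for d in range(1, 9):
    date = datetime.date(2017, 1, d)
    print(date, gfs_backup_type(date))
```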

Replacing Old Concepts with New Best Practices

Some backup concepts are simply outdated, especially for the small business. Tape used to be the only feasible mass storage medium that could be written and rewritten on a daily basis and was sufficiently portable. I recall being chastised by a vendor representative in 2004 because I was “still” using tape when I “should” be backing up to his expensive SAN. I asked him, “Oh, do employees tend to react well when someone says, ‘The building is on fire! Grab the SAN and get out!’?” He suddenly didn’t want to talk to me anymore.

The other somewhat outdated issue is that backups used to take a very, very long time. Tape was not very fast, disks were not very fast, networks were not very fast. Differential and incremental backups were partly the answer to that problem, and partly to the problem that tape capacity was an issue. Today, we have gigantic and relatively speedy portable hard drives, networks that can move at least many hundreds of megabits per second, and external buses like USB 3 that outrun both of those things. We no longer need all weekend and an entire media library to perform a full backup.

One thing that has not changed is the need for backups to exist offsite. You cannot protect against a loss of a building if all of your data stays in that building. Solutions have evolved, though. You can now afford to purchase large amounts of bandwidth and transmit your data offsite to your alternative business location(s) each night. If you haven’t got an alternative business location, there are an uncountable number of vendors that would be happy to store your data each night in exchange for a modest (or not so modest) sum of money. I still counsel periodically taking an offline offsite backup copy, as that is a solid way to protect your organization against malicious attacks (some of which can be by disgruntled staff).

These are the approaches that I would take today that would not have been available to me a few short years ago:

  • Favor full backups whenever possible — incremental, differential, delta, and deduplicated backups are wonderful, but they are incomplete by nature. It must never be forgotten that the strength of backup lies in the fact that it creates duplicates of data. Any backup technique that reduces duplication dilutes the purpose of backup. I won’t argue against anyone saying that there are many perfectly valid reasons for doing so, but such usage must be balanced. Backup systems are larger and faster than ever before; if you can afford the space and time for full copies, get full copies.
  • Steer away from complicated rotation schemes like GFS whenever possible. Untrained staff will not understand them and you cannot rely on the availability of trained staff in a crisis.
  • Encrypt every backup every time.
  • Spend the time to develop truly meaningful retention policies. You can easily throw a tape in a drawer for ten years. You’ll find that more difficult with a portable disk drive. Then again, have you ever tried restoring from a ten-year-old tape?
  • Be open to the idea of using multiple backup solutions simultaneously. If using a combination of applications and media types solves your problem and it’s not too much overhead, go for it.

There are a few best practices that are just as applicable now as ever:

  • Periodically test your backups to ensure that data is recoverable
  • Periodically review what you are backing up and what your rotation and retention policies are to ensure that you are neither shorting yourself on vital data nor wasting backup media space on dead information
  • Backup media must be treated as vitally sensitive mission-critical information and guarded against theft, espionage, and damage
    • Magnetic media must be kept away from electromagnetic fields
    • Tapes must be stored upright on their edges
    • Optical media must be kept in dark storage
    • All media must be kept in a cool environment with a constant temperature and low humidity
  • Never rely on a single backup copy. Media can fail, get lost, or be stolen. Backup jobs don’t always complete.

Hyper-V-Specific Backup Best Practices

I want to dive into the nuances of backup and Hyper-V more thoroughly in later articles, but I won’t leave you here without at least bringing them up.

  • Virtual-machine-level backups are a good thing. That might seem a bit self-serving since I’m writing for Altaro and they have a virtual-machine-level backup application, but I fit well here because of shared philosophy. A virtual-machine-level backup gives you the following:
    • No agent installed inside the guest operating system
    • Backups are automatically coordinated for all guests, meaning that you don’t need to set up some complicated staggered schedule to prevent overlaps
    • No need to reinstall guest operating systems separately from restoring their data
  • Hyper-V versions prior to 2016 do not have a native changed block tracking mechanism, so virtual-machine-level backup applications that perform delta and/or deduplication operations must perform a substantial amount of processing. Keep that in mind as you are developing your rotations and scheduling.
  • Hyper-V will coordinate between backup applications that run at the virtual-machine-level (like Altaro VM) and the VSS writer(s) within guest Windows operating systems and the integration components within Linux guest operating systems. This enables application-consistent backups without doing anything special other than ensuring that the integration components/services are up-to-date and activated.
  • For physical installations, no application can perform a bare metal restore operation any more quickly than you can perform a fresh Windows Server/Hyper-V Server installation from media (or better yet, a WDS system). Such a physical server should only have very basic configuration and only backup/management software installed. Therefore, backing up the management operating system is typically a completely pointless endeavor. If you feel otherwise, I want to know what you installed in the management operating system that would make a bare-metal restore worth your time, as I’m betting that such an application or configuration should not be in the management operating system at all.
  • Use your backup application’s ability to restore a virtual machine next to its original so that you can test data integrity

Follow-Up Articles

With the foundational material supplied in this article, I intend to work on further posts that expand on these thoughts in greater detail. If you have any questions or concerns about backing up Hyper-V, let me know. Anything that I can’t answer quickly in comments might find its way into an article.

Why Hyper-V Replica Doesn’t Replace Backups

Once upon a time, insurance was the only product that you purchased in the hopes that you’d never need to use it. Then, Charles Babbage made the horrible mistake of inventing computers, which gave us all so much more to worry about. The good news is that, whereas insurance can’t do much more than pay money when you lose something, computers have the ability to recover data that you lose. The bad news is that, just like insurance, you must spend a lot of money for that power. Not only has the tech world come up with a cornucopia of schemes to protect your data, it’s produced catchy names like “Disaster Recovery” and “Business Continuity” to get you in the money-spending mood. This article compares two of those product categories within the scope of the Hyper-V world: Hyper-V Replica and virtual machine backups.

Meet the Players

If you’re here looking for a quick answer to the question of whether you should use Hyper-V Replica or a virtual machine backup product, the answer is that they are both on the same team but they play in different positions. If you can’t afford to have both, virtual machine backup is your MVP. Hyper-V Replica is certainly valuable, but ultimately nonessential.

What is Hyper-V Replica?

The server editions of Hyper-V include a built-in feature named Hyper-V Replica. It requires a minimum of two separate hosts running Hyper-V. A Hyper-V host periodically sends the changed blocks of a virtual machine to another system. That system maintains a replica of the virtual machine and incorporates the incoming changes. At any time, an administrator can initiate a “failover” event. This causes the replica virtual machine to activate from the point of the last change that it received.

What is the Purpose of Hyper-V Replica?

The general goal of Hyper-V Replica is to provide rapid Disaster Recovery protection to a virtual machine. Disaster Recovery has become more of a broad marketing buzzword than a useful technical term, but it is important here: Hyper-V Replica does not have any other purpose. To use another marketing term, it can also be said to enable Business Continuity. Little time is necessary to start up a replica, making it ideal when extended outages are unacceptable. However, this does not change the core purpose of Hyper-V Replica, nor does it qualify as an automatic edge over virtual machine backup.

What is Virtual Machine Backup?

I am generically using the phrase “virtual machine backup” in this article, not specifically referring to Altaro’s product. A virtual machine backup is a software-created duplicate copy of a virtual machine that is kept in what we usually call a cold condition. It must undergo a restore operation in order to be used. Virtual machine backups require one Hyper-V host and some sort of storage subsystem — it could be magnetic disk, optical disc, magnetic tape, or solid-state storage.

I am deliberately scoping backup to the virtual machine level in this article in order to make the fairest comparison to Hyper-V Replica. There are backup applications available with wider or different targets.

What is the Purpose of Virtual Machine Backup?

At first glance, the purpose of virtual machine backup might seem identical to Hyper-V Replica. Virtual machine backups can certainly provide disaster recovery protection. However, they also allow for multi-layered protection. Not all failures qualify as disasters (although impacted parties may disagree). Equally as important, not all uses for virtual machine backup involve a failure. The purpose of virtual machine backup is to provide an historical series of a virtual machine’s contents.

A Brief Overview of Hyper-V Replica

To understand why Hyper-V Replica can’t replace backup, you’ll first need to see what it truly does. Fortunately, all of the difficult parts are in the configuration and failover processes. A properly running replica system is easy to understand.

The Hyper-V host that owns the “real” (or “source”) virtual machine tracks changes. At short, regular intervals, it transmits those changes to the Hyper-V host that owns the replica.

Hyper-V Replica in Action

For this discussion, the most important part is the short, regular interval. You can choose a different value for short, but there can be only one.

Where Hyper-V Replica Falls Behind Backup

As I start this section, I want it made clear that I am not attempting to disparage Hyper-V Replica. It is a fantastic technology. The stated goal of this article is to explain why Hyper-V Replica does not replace virtual machine backup. Therefore, this section will lay out what virtual machine backup gives you that Hyper-V Replica cannot.

Retention

You can instruct Hyper-V Replica to maintain multiple Recovery Points. These are similar to backups in that each one represents a complete virtual machine. You can recover from one of them independently of any other recovery points. However, these recovery points are captured once every hour and you cannot change that interval. Therefore, opting to keep more than a few recovery points will result in a great deal of space utilization in a very short amount of time. All of that space will only represent a very short period in the life of the virtual machine. You won’t be able to maintain a very long history using only Hyper-V Replica.

In contrast, you can easily configure virtual machine backups for much longer retention periods. It’s not strange to encounter retention policies measured in years.

No Separate Copies

When Hyper-V Replica receives new change information, it merges the data directly into the Replica virtual machine. If you are maintaining recovery points, those are essentially change block records that are temporarily spared from the normal delete process. The files generated by the replica system are useless if you separate them from the replica virtual machine’s configuration files and virtual hard disks. More simply: Hyper-V Replica maintains exactly one standalone copy of a virtual machine per target replica server. As long as that copy survives, you can use it to recover from a disaster. Any damage to that copy essentially makes its entire replica architecture pointless.

Virtual machine backup, on the other hand, grants the ability to create multiple distinct copies of a virtual machine. They exist independently. If the backup copy that you want to use is damaged, try another.

No Separate Storage Locations

Since a virtual machine replica only has a single set of data, each of its components exists in only one location. For Hyper-V Replica’s intended purpose, that’s not a problem. But, what if something damages a storage location? What if a drive system fails and corrupts all of its data? What if the replica site suffers a catastrophe?

Most virtual machine backup applications allow you to place your backups on separate storage media. For instance, Altaro Virtual Machine Backup is friendly to the idea of rotating through disks. Others let you cycle through different tapes. With copies on separate physical media, you can put distance between unique copies of your virtual machines’ backups. That allows some to survive when others might not. It prevents one bad event from destroying everything.

All or Nothing

You can either bring up the replica of your virtual machine or not. There really isn’t any in-between. You don’t want to tinker with the VHDX file(s) of a replica in any way because that would break the replica chain. It wouldn’t write new change blocks and you’d be forced to start the replica process anew from the beginning. There are alternatives, of course. You could perform an export on the replica. If the replica has recovery points, you could export one of them. You’d then be able to do whatever you need with the exported copy. It’s a rather messy way to extract data from a replica, but it could work.

Almost all commercial virtual machine backup vendors design their applications to handle situations in which you don’t want complete data restoration. The features list will typically mention “granular” or “file-level” restoration capabilities. You shouldn’t need to endure any intermediary complications.

Limited Support and Interoperability

Microsoft does not support Exchange Server with Hyper-V Replica. Active Directory Domain Services sometimes stutters with Hyper-V Replica. Microsoft will support SQL Server with Hyper-V Replica, under some conditions. That list only includes Microsoft products; your line-of-business applications might have similar problems. Furthermore, each of the three items that I mentioned has its own built-in replication technology. In all three cases, the native capabilities vastly outpace Hyper-V Replica’s abilities. For starters, they all allow for active/active configurations and transfer much less data than Hyper-V Replica.

I’ve seen a lot of strange things in my time, but I don’t believe that I’ve encountered any software vendor that wouldn’t support a customer taking backups of their product. You sometimes need to go through some documentation to properly restore an application. Some applications, like SQL Server, do include their own backup tools, but you can enhance them with external offerings.

Little Control Over Pacing

Once Hyper-V Replica is running, it just goes. It’s going to ship differences at each configured interval, and that’s all there is to it. You can change the interval and you can pause replication, but you don’t have any other options. Pausing replication for an extended period of time is a poor choice, as catching up might take longer than starting fresh.

One of the many nice things about backup applications is that you have the power to set the schedule. You define when backups run and when they don’t. You can restrict your large backup jobs to quiet hours. If you have a high churn virtual machine and need to take backups that are frequent, but not as frequent as Hyper-V Replica, you have that option. If you have a very small domain that doesn’t change often, you might want to only capture weekly backups of your domain controller. You might decide to interject a one-off backup around major changes. Backup applications allow you to set the pacing of each virtual machine’s backup according to what makes the most sense.

Heavy Configuration Burden

Hyper-V Replica requires a fair bit of effort to properly set up and configure. With basic settings, you can set up each involved host in just a few minutes. If you need to use HTTPS for any reason, that will need some more effort. But, initial configuration is only the beginning. By default, no virtual machines are replicated. You’ll need to touch every virtual machine that you want to be covered by Hyper-V Replica. Yes, you can use PowerShell to ease the burden, but I know how so many of you feel about that.

Even the worst virtual machine backup that I’ve ever used at least tried to make configuring backups easy. They all try to employ some sort of mechanism to set up guests in bulk. The biggest reason that this matters is configuration fatigue. If it takes a great deal of effort to reach an optimal configuration, you might take some shortcuts. Even if you suffer all the way through a difficult initial setup, anyone would be resistant to revisiting the process if something in the environment changes. Whereas you’ll likely find it simple to configure a virtual machine backup program exactly as you like it, most people will have almost no variance in their Hyper-V Replica build — even if it isn’t the best choice.

Higher Software Licensing Costs

If your source virtual machine’s operating system license is covered by Software Assurance, then you can use Hyper-V Replica without any further OS licensing cost. Otherwise, a separate operating system license is required to cover the replica. I don’t keep up with application licensing requirements, but those terms might be even less favorable.

Virtual machine backup applications generate copies that Microsoft labels “cold” backups. Microsoft does not require you to purchase additional licenses for any of their operating systems or applications when they are protected like that. I don’t know of any other vendor that requires it, either.

Higher Hardware Costs

The replica host needs to be powerful enough to run the virtual machines that it hosts. You might choose a system that isn’t quite as powerful as the original, but you can’t take that too far. Most organizations employing Hyper-V Replica tend to build a secondary system that rivals the primary system.

If we accept that the primary use case for a replica system involves the loss of the primary system, we see where backup can save money. Only the most foolish business owners do not carry insurance on at least their server equipment. It’s fair to presume that whatever destroyed them would qualify as a covered event. Smaller organizations commonly rely on that fact as part of their disaster recovery strategy. After a disaster, they order replacement equipment and insurance pays for it. They drag their backup disks out of the bank vault and restore to the new equipment.

Of course, they lose out on time. A Hyper-V Replica system can return you to operational status in a few minutes. If you’ve only got backups, then leaning on local resources might have you running in a few hours at best. However, small budgets lead to compromises. Backup alone is cheaper than Hyper-V Replica alone.

The Choice is Clear

I don’t like setting down ultimatums or dictating “best practices”, but this is one case where there is little room for debate. If you can afford and justify both virtual machine backup and Hyper-V Replica, employ both. If you must choose, the only rational option is virtual machine backup.

Hyper-V Backup Strategies: Full vs. Reverse Delta

Most of you know me as a blog writer for the Altaro Hyper-V blog, but I began my relationship with Altaro as a customer. One of the features that impressed me right from the start was Reverse Delta. “Delta” in backup jargon just means “difference”. The reason that we don’t just say “difference” is because that would cause confusion with the “differential” backup method. Reverse delta technology allows you to reduce the size of your backups and perform restores more quickly.

What is Reverse Delta?

To understand Reverse Delta, you first need to understand delta. When a file is backed up, it requires an equivalent amount of space on backup media as it does on live media. Keeping a unique backup of a file each time it changes can consume a great deal of media space, especially for frequently-changing files. Compression algorithms help reduce the space utilization, but they have never lived up to their hype. Several alternative techniques have been introduced over the years, but one of the most effective is “delta”. When a file changes, rather than backing up the entire file again, only the changed bits are kept. If a restore is ever necessary, the original file is recovered and then all of the changes are applied to it in order, usually with some calculations to skip right to the final version of each bit, until the file is restored to the condition it was in at the time the desired backup was taken. Due to the overhead of working with individual bits, deltas are typically handled in block chunks.
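For the curious, here is a minimal Python sketch of that restore process: start from the full copy and overlay each backup’s changed blocks in order. The block size and the data structures are assumptions made purely for illustration:

```python
# Restoring from a traditional delta chain: begin with the full backup and apply
# each set of changed blocks in order, oldest first. Structures are illustrative.
BLOCK = 4096

def apply_deltas(full_copy: bytes, delta_chain):
    """delta_chain: list of backups, each a list of (block_index, block_bytes) pairs."""
    blocks = [full_copy[i:i + BLOCK] for i in range(0, len(full_copy), BLOCK)]
    for backup in delta_chain:                 # oldest delta first
        for index, block in backup:
            blocks[index] = block              # later deltas overwrite earlier ones
    return b"".join(blocks)
```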

This is a visual of one possible delta implementation:

Delta Backup

The big thing to notice is how much less space is consumed by the delta backups than by the full backup. Delta is a tried-and-true solution that does a good job of preserving space on backup media. The full backup file is still necessary, of course, and the intermediate backups might be required as well, depending upon the delta technique in use.

The folks at Altaro looked at delta, and they looked at how things go in typical restores, and realized something: most restores are trying to get to the most recent version of a file. The older a backup gets, the less likely it will be used in a restore. With the traditional delta method, that means it’s likely that you’ll need to use multiple backup media to restore any given file. So, they came up with Reverse Delta as an answer. Reverse Delta works by keeping the full copy close to the recent backup, rather than at the opposite end of it.

Before we proceed, I need to make it clear that I do not work directly for Altaro software. I am on a completely different continent than the brilliant people that develop this software. I do not drive into the office and talk shop with them. Just like you, I only know as much about Reverse Delta as Altaro has made public. Fortunately for us, they aren’t hiding anything. If you want to read what they have to say on the matter, this is the official page, which includes a link to a PDF that diagrams the entire process. Since they’ve already done such a good job documenting it there, I’m only going to do the short form here for the sake of continuity. You already understand deltas, so you don’t really need further in-depth explanation.

The Altaro Reverse Delta Process

This is how Reverse Delta operates:

  1. The first backup is taken. This is a full backup, so the entire VHDX is captured and saved to backup media.

    Reverse Delta Backup 1

  2. The second backup knows about the first backup. So, it captures only the parts that have changed. In older versions of Altaro VM Backup, that meant a manual scan of the VHDX. In newer versions, it performs changed block tracking (CBT), so the scan time is significantly reduced. The effect is the same, though. Only changed blocks are captured and sent to the Altaro VM Backup server.

    Reverse Delta Backup Day 2

  3. As the second backup is written to disk, its blocks are combined with the data saved from the first backup, and that combined result is saved as backup number two. Only the superseded blocks (the previous versions of the bits that changed) are kept in backup 1.

    Reverse Delta Storage on Day 2

This process then continues each day so that the latest backup is always a full backup being built by combining the previous full backup with that cycle’s changed blocks.
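As a conceptual sketch only, and not Altaro’s actual on-disk format, the storage step might be modeled like this in Python: the newest backup is kept whole, and the previous full is reduced to just the blocks it no longer shares with the new one:

```python
# Rough model of the reverse-delta idea: keep the latest backup as a full copy
# and shrink the previous full to only its superseded blocks. Illustrative only.
BLOCK = 4096

def reverse_delta_ingest(previous_full: bytes, new_full: bytes):
    """Return (reverse_delta_for_previous_backup, stored_new_full)."""
    reverse_delta = []
    for i in range(0, max(len(previous_full), len(new_full)), BLOCK):
        old_block = previous_full[i:i + BLOCK]
        new_block = new_full[i:i + BLOCK]
        if old_block != new_block:
            # keep the superseded version so the older point in time stays restorable
            reverse_delta.append((i // BLOCK, old_block))
    return reverse_delta, new_full
```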

Benefits of Reverse Delta

Like standard delta, Reverse Delta is intended to save space on backup media. As you can see from the above, it uses storage in the same fashion, with only the difference being where the full backup is stored. Just like manufacturers that implement standard delta practices, Altaro recommends periodically taking full backups to reduce restore processing times.

What Reverse Delta does is shift when data combination processing occurs. Let’s say that you’re using a traditional delta backup application. You come to work on a Friday morning and are immediately greeted with an emergency e-mail from the accountant: “I accidentally deleted our payroll spreadsheet this morning and I need to upload my figures to the payroll processor today!” That’s your paycheck on the line! So, you fire up your application and start to restore the spreadsheet from last night’s backup. The first thing that your application needs to do is dig back to the first full copy of the file. Let’s say you take full backups every Sunday. So, it will need to go back five days (depending on your cycle) to retrieve that backup. Then, it will need to scan Monday’s backup, Tuesday’s backup, Wednesday’s backup, and Thursday’s backup for changes to that file. It will integrate all of these changes into the latest version, and then restore your file. Crisis averted!

The question is, how long does that take? That will depend on several factors, of course. What’s certain, though, is that you needed more than one day’s backup to retrieve data that was recorded less than a day ago, and the system is doing all of its calculating and file crunching while you’re sweating over whether or not you’re going to be able to pay your mortgage on time this month. With Reverse Delta, Altaro VM Backup only needs to perform a direct read from last night’s backup. All of the difficult file processing was done while you slept.

How to Set Reverse Delta

By default, Reverse Delta runs for a maximum of thirty days. After that, the next backup is not replaced by deltas. Instead, it is kept as a full backup. It will then be used as the reference point if you wish to restore any data from backups prior to it. Reverse Delta is configured per virtual machine. To access its settings:

  1. Open Altaro VM Backup and connect to the backup system.
  2. Under the Setup tree, click Advanced Settings.
  3. In the main pane, locate the VM that you wish to modify.
  4. Click the number that appears in the Reverse Delta column. It will become editable; change it to your desired number.
    Set Reverse Delta

  5. Change other virtual machines as desired. Alternatively, you can use the Modify All link to change the global policy or Modify for host to change all virtual machines on a specific host.
  6. Click Save Changes in the lower right when you are satisfied.

This is a “going forward” modification. Old backups are not changed.

The number that you set specifies the maximum number of Reverse Delta backups that can be taken before a full backup will be saved in its entirety and not replaced on the next cycle by deltas. The smaller the number that you use, the more frequently full backups will be kept.
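A rough way to reason about the setting, assuming one backup per day: with a Reverse Delta value of N, roughly every (N+1)th backup ends up being kept as a full. This is only a back-of-the-envelope model, not a description of the product’s internal scheduling:

```python
# Back-of-the-envelope model: how often a full backup gets retained for a given
# Reverse Delta setting, assuming exactly one backup per day.
def kept_as_full(backup_number: int, reverse_delta_setting: int) -> bool:
    return backup_number % (reverse_delta_setting + 1) == 0

fulls = [n for n in range(1, 91) if kept_as_full(n, 30)]
print(fulls)  # [31, 62] plus the initial full -> about one retained full per month
```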

How to Set Retention Policy

This article is not specifically about retention policy, but it is highly related to the Reverse Delta setting. To modify retention policy, just go to the Retention Policy page under the Setup heading.

Retention Policy

Drag virtual machines into the slots that you wish to apply to them. If you don’t see a policy that you like, use the Add New buttons to build policies that suit you.

Reverse Delta Strategies

Now that you know what Reverse Delta is, you can start thinking about how to apply it optimally within your environment. A quick overview:

  • There isn’t a perfect one-size-fits-all approach, but no one says that you must be perfect.
  • Think on the VHD/X scale, not the individual file scale.
  • The two factors that most influence how you design your Reverse Delta scheme are data churn rates and available backup media space.
  • Consider your retention plan when adjusting Reverse Delta settings.
  • Watch your usage meter.

Altaro VM Backup Looks at VHD/X Files

Remember not to treat Altaro VM Backup the same way that you would a traditional in-operating system backup program. When we talk about changes and data churn, we’re talking about the blocks of a VHD/X file. If there are a few dozen files that are changing all of the time and they cumulatively consume 100 megabytes on a 100 GB VHDX, that’s not something that you need to worry about. If there are 60GB worth of files changing on that same 100GB VHDX, that’s worth taking the time to architect a Reverse Delta strategy around.

The Effect of Data Churn on Reverse Delta

Deltas/Reverse Deltas are only captured at the moment that the backup is taken. If a large portion of a file is being changed often, then its deltas can easily wind up being at or near the same size as the original. That means that delta/Reverse Delta won’t save you very much space. Worse, restoring to a particular point in time requires all of those deltas to be processed in order from the nearest full backup. For Altaro VM Backup, that won’t be too bad as long as the restore target is fairly recent. For standard delta backup applications, it won’t be too bad if the target restore date is fairly close to a full backup. This leads to our first recommendation:

For data that changes frequently, use full backups more often.

Frequent full backups are especially recommended for database servers that see meaningful amounts of writes. Due to the way that SQL works, there will be data changing in the database’s file and in its logs for every modification action (CREATE, INSERT, UPDATE, and DELETE statements). Since virtual machine backup software is examining the .VHD/X file, even transient files like .TRNs will cause deltas to be generated even though they may not be of much use. I would caution you not to automatically lump all SQL databases into any “high churn” category, though. Many SQL databases do very little work. Some are very read-heavy. SQL statistics is a large topic and one that I am not especially well-versed in, but Microsoft’s SQL Server documentation can get you started.

The Effect of Available Media Space on Reverse Delta

Before you can worry much about your media’s space, make sure you spend some time on the previous section regarding data churn. That will determine how Reverse Delta is going to make use of your space. For VHD/X changes that generate only a few deltas, your space utilization will be dominated by how frequently full backups are taken. If your system has a great deal of data churn, then your deltas may not be significantly smaller than your full backups. There are a few recommendations:

  • Because large deltas are still smaller than full backups, consider using more delta for high-churn VHD/Xs when backup media space is at a premium. This will result in lower overall space utilization, but with higher-than-average times required for restore operations.
  • Reducing your retention policy length will be the best way for you to conserve space used by high-churn VHD/X files.
  • Use compression. It may or may not save a lot of space, but it will certainly work better if it’s on.

With both Reverse Delta and compression, you are trading computational power for consumed space. If compression is off and you don’t use Reverse Delta, there is no space savings but no calculation time. If you use compression and very long Reverse Delta settings, you will use the least amount of space, but you will spend more CPU cycles calculating compression and deltas.

The Effect of Retention Policy on Reverse Delta

Retention policies determine how long data is kept. In conjunction with your Reverse Delta policy, they determine the total number of full backups and delta captures. If I set a Retention Policy of six months and take full backups every 30 days, then, depending on how those line up in any given time frame, I’ll have as many as six full backups and somewhere around 175 separate Reverse Delta backups. If I were to reduce the Reverse Delta policy to every fifteen days, I’d have twelve full backups and around 170 Reverse Delta backups. A longer Reverse Delta policy results in reduced space consumption at the expense of longer restore times. The related recommendations are:
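The counts in that example come straight from simple division. Here is the arithmetic, assuming daily backup cycles; the results are approximate, just as in the text:

```python
# Arithmetic behind the six-month example: daily backups, retention in days,
# and a full backup kept every N days. Counts are approximate.
def backup_counts(retention_days: int, full_every_n_days: int):
    fulls = retention_days // full_every_n_days
    deltas = retention_days - fulls
    return fulls, deltas

print(backup_counts(180, 30))  # -> (6, 174)   ~six fulls, ~175 reverse deltas
print(backup_counts(180, 15))  # -> (12, 168)  ~twelve fulls, ~170 reverse deltas
```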

To conserve maximum space, use a longer Reverse Delta policy with a shorter retention policy.

To balance space usage and restore speed, use a longer Reverse Delta policy with a longer retention policy.

To ensure that restores occur quickly, use a shorter Reverse Delta policy.

Balancing a smart retention policy against a smart Reverse Delta policy is the best way to control your data usage.

Watch Your Usage Meters

Nobody is perfect. You may not build the best policy your first time out. Don’t worry about that. Hindsight is always easier to work with, and it helps when you have nice charts to look at.

You’ve probably already seen the charts on the Altaro VM Backup dashboard. If you haven’t already, spend some time going through them to see what they can offer.

The best way to determine how a virtual machine should be dealt with is by seeing how it is currently using your data. In the dashboard, on the top right graph, click the Data Backed Up / Day button. It’s the middle button at the left of that particular chart. On the right, choose a cluster or host, then choose a virtual machine to look at.

Data Backed Up Per Day

The tooltips are extremely helpful as they show the compressed and uncompressed statistics for any given day.

I can then switch over to the Total Backup Size / Day graph for the same virtual machine. What this graph is showing me is the total amount of space consumed by this virtual machine’s backup on any given day.

Total VM Backup Space

What I see on this virtual machine is that my deltas are working very well. My high-water mark for compressed size is 520MB. This virtual machine only has a single full backup. That plus all of the deltas is only about 7 GB. If I were to set it with another full, it would jump to around 12 GB of consumed space. If backup media space is a concern for me, I would not want to shorten the Reverse Delta length. However, I also need to understand that if I wish to restore to the July 07 date, I will need to step through every single data point on the graph, which would certainly take quite a bit of time.

 

Evaluating Hyper-V Backup Storage Solutions

It’s not difficult to find recommendations about what storage to use with your backup solution — from the people that make backup storage solutions. I certainly can’t begrudge a company trying to turn a coin by promoting their products, but it’s also nice to get some neutral assistance. What I won’t do is throw a list of manufacturers at you and send you on your way; that doesn’t help anyone except the manufacturers on the list. What I am going to do is give you guidance on how to analyze your situation to determine what solutions are most applicable to you which gives you the ability to select the manufacturer(s) on your own terms. I’m also going to show you some possible backup designs that might inspire you in your own endeavors.

Needs Assessment

The very first thing to do is determine what your backup storage needs are. Most people are not going to be able to work from a simple formula such as: we have 1 TB of data so we need 1 TB of backup storage. Figure out the following two items first:

  • How long does any given bit of data need to be stored?
  • Do we only need the most recent copy of that data or do we need a historical record of the original and changed versions? For instance, if your CRM application is tracking all customer interactions and you do not purge data from it, how many backups of that data are necessary to meet your data retention goals?

As you are considering these, be mindful of any applicable legal regulations. This is especially true in finance and related fields, such as insurance. Do not try to get everything absolutely perfect in this first wave. This is the part where you prioritize your data and determine what its lifespan should be. You’ll need to have a decent understanding of the concepts in the next section before you can begin architecting your solution.

If you need to brush up on any of the basics to help you complete your needs assessment, we have an article that covers them.

Backup Storage Options

Alongside a needs assessment, you need to know what storage options are available to you. This will guide you to your final design. At a high level, the options are:

  • Non-hard disk media, such as tape and optical
  • Portable disk drives
  • Solid state media
  • Permanently-placed disk drives
  • Over-the-network third-party provider storage

Non-Hard Disk Media

Disk drives have precision internal mechanical components and electronics that fail. The conventional decades-old wisdom has been to copy data to some other type of media in order to protect it from these shortcomings. The two primary media types that fall into this category are tapes and optical systems.

Pros of Non-Hard Disk Media

  • Tried-and-true
  • Vendors have specialized to the particular needs of backup and restoration
  • Portable
  • Durable long-term storage (tape)
  • Relatively inexpensive long-term storage

Cons of Non-Hard Disk Media

  • Expense (tape)
  • Special drives and software are needed, which may fail and/or become obsolete while the media is still viable
  • Easily damaged
  • Very slow recovery process

Tape is the traditional king of backup media and is still going strong today. It’s not very fast, but it’s highly portable, well-understood, and usually provides a solid ratio of expense, risk, and protection. It can be very expensive, however, and it’s typically the tape drive that drives the cost up the most. Media costs vary; smaller is cheaper, obviously, but there is also a difference in formats. DAT drives are cheaper than LTO drives, but even the highest capacity DAT tapes are nowhere near as large as most same-generation LTO tapes.

Tape must be cared for properly — it absolutely must be kept away from electromagnetic fields and heat. Tapes should be stored upright, preferably in a shielded container designed specifically for holding backup tapes. If these precautions are followed, tapes can easily last a decade. That said, the drives that can read a particular tape have a much shorter lifespan and you might have trouble finding a working drive that can read those old tapes. I’ve also run into issues where I had a good tape and a tape drive that was probably good enough to read it, but we couldn’t locate the software that recorded it. If you’re looking to hold onto backups for a very long time, tape has the highest shelf lifespan-to-cost ratio of all backup media.

Optical backup media popped up as an inexpensive alternative to tape. Optical drives are much cheaper than tape drives and optical media provides the same capacity at a fraction of the price of tape. However, optical media’s star never burned very brightly and dimmed very quickly. Optical media backups are very slow, the capacity-per-unit is not ideal, and durability is questionable. Optical media does have the ability to survive in electromagnetic conditions that would render tape useless, but is otherwise inferior. Unless you only have an extremely small amount of data to protect and your retention needs are no more than a few years, I would recommend skipping optical media.

A very large problem with tape and other non-disk media is that restoring data is a time-consuming process. If you want to restore just a few items, it will almost undoubtedly take far longer to locate that data on media than it will to restore it.

Portable Disk Drives

In my mind, portable disk drives are a relative newcomer to the backup market, although many of you have probably been using them for backup your entire career.

Pros of Portable Disk Drives

  • Inexpensive
  • Reasonably durable
  • Portable
  • Common interfaces that are likely to still be usable in the years to come
  • Relatively quick recovery process

Cons of Portable Disk Drives

  • Mechanical and electronic failure can render data inaccessible except by specialized, expensive processes
  • Long-term offline storage capability is not well-known
  • Drive manufacturers do not tailor their products to the backup market (although some enclosure manufacturers do)
  • Fairly expensive long-term storage

The expense and physical size of portable drives have shrunk while their bandwidth and storage capacity have grown substantially, making them a strong contender against tape. Their great weakness is a reliance on internal mechanisms that are more delicate and complicated than tape, not to mention their electronic circuitry. Most should be well-shielded enough that minor electromagnetic and static electricity fields should not be of major concern.

What you must consider is that tapes have been designed around the notion of holding their magnetic state for extended periods of time; if kept upright in a container with even modest shielding, they can easily last a decade. Hard drives are not designed or built to such standards. You’ll hear many stories of really old disks pulled out of storage and working perfectly with no data loss — I have several myself. The issue is that those old platters did not have anything resembling the ultra-high bit densities that we enjoy today. What that means is that the magnetic state for any given bit might have degraded a small amount without affecting the ability of the read/write head to properly ascertain its original magnetic state. The effects of magnetic field degradation will be more pronounced on higher density platters. I do not have access to any statistics, primarily because these ultra-high capacity platters haven’t been in existence long enough to gather such information, but at this time, I personally would not bank on a very large stored hard drive keeping a perfect record for more than a few years.

Hard disks that are rotated often will suffer mechanical or electronic failure long before magnetic state becomes a concern. A viable option is to simply swap new physical drives in periodically. If you want to use hard drives for very long-term offline storage, add it to your maintenance cycle to spin up old drives and transfer their contents to new drives that replace them.

Solid State Media

The latest entry in the backup market is solid state media. The full impact of solid state on the backup market has not yet been felt. I expect that it will cause great changes in the market as costs decline.

Pros of Solid State Media

  • Extremely durable
  • Fast (newer types)
  • Very portable

Cons of Solid State Media

  • High cost-to-capacity ratio

Solid state’s high cost-to-capacity ratio is the primary reason that it has not overtaken all other media types. It is far more durable than disk or tape, and some types are faster than both. If you can justify the expense, solid state is the preferred option.

Permanently-Placed Disk Drives

Another option that has only become viable within the last few years is storage that never physically moves, such as NAS devices.

Pros of Permanently-Placed Drives

  • Very high reliability and redundancy — dependent upon manufacturer/model
  • High performance
  • Can be physically secured and monitored

Cons of Permanently-Placed Drives

  • High equipment expense
  • Best used with multi-site facilities
  • Dependent upon speed and reliability of inter-site connections

Loss of the primary business location and theft of backup media are ever-present concerns; the traditional solution has been to transport backup tapes offsite to a secured location, such as a bank safety deposit box (or somebody’s foyer credenza; that sort of thing happens a lot more often than many will admit). With the cost of Internet bandwidth declining, we now have the capability to transmit backup data over the wire to remote locations in a timely and secured fashion.

While I do not recommend it, it would theoretically be acceptable to use on-premises permanent disk drives for very short-term backup storage. This would allow for a very short RTO to address minor accidents. As long as it is made abundantly clear to all interested parties that such a solution is equally vulnerable to anything that threatens the safety and security of the site, there are viable applications for such a solution.

Over-the-Network Third Party Provider Storage

The primary distinguishing factor between this category and the prior entry is ownership. You can pay someone else to maintain the facility and the equipment that houses your offsite copies.

Pros of Third-Party Offsite Providers

  • In theory, it is a predictable recurring expense
  • Potential for additional included services at a lower expense than a do-it-yourself solution
  • Full-time subject-matter experts maintain your data for you

Cons of Third-Party Offsite Providers

  • In theory, providers could make dramatic changes in pricing and service offerings, effectively holding your data, and the reliability of its storage, hostage
  • Trust and integrity concerns
  • You may not be able to control the software and some other components of your backup strategy

There are several enticing reasons to work with offsite backup providers. Many offer additional services, such as hosting your data in a Remote Desktop Services environment as a contingency plan. Truthfully, I believe that the primary barrier in the cloud-based storage market is trust. Several of the organizations offering these services are “fly-by-night” operations trying to turn a fast dollar by banking on the fact that almost none of their customers will ever need to rely on their restoration or hosting services. I also don’t think that the world will soon forget how Microsoft did everything but make it a requirement that we sync our Windows profiles into OneDrive and then radically increased the cost of using the service. Large service providers can do that sort of thing to their customers and survive the fallout.

You can approach third-party offsite storage in two ways:

  • A full-service provider that supplies you with software that operates on your equipment and transmits to theirs
  • A no-/low-frills provider that supplies you with a big, empty storage space for you to use as you see fit

What you receive will likely have great correlation with what you spend.

Designing a Backup Storage Solution

At this point, you know what you need and you know what’s available. All that’s left is to put your knowledge to work by designing a solution that suits your situation.

Backup Strategies

Let’s start with a few overall strategies that you can take.

Disk-to-Tape (or other Non-Disk Media)

This is the oldest and most common backup method. You have a central system that dumps all of your data on tape (or some other media, such as optical) using a rotation method that you choose.

Disk-to-Disk

A more recent option is disk-to-disk. Your backup application transfers data to portable disks which are then transferred offsite or to a permanent installation, hopefully in another physical location.

Disk-to-Disk-to-Tape

A somewhat less common method is to first place regular backups on disk. At a later time, these backups, or a subset of them, can be transferred to tape. This gives you the benefit of rapidly recovering recent items while keeping fairly inexpensive long-term copies. You wouldn’t need to rotate as many tapes through, and the constant rewriting of the disks means that they won’t be expected to retain their data for long.

Disk-to-Local-to-Offsite

Another recent option that can serve as a viable alternative to tape is first backing up data locally, then transferring it to offsite long-term storage, whether it’s a site that you control or one owned by a third-party provider. This type of solution eliminates the need to entrust someone with physically carrying media offsite. In order for this type of solution to be viable, you must have sufficient outbound bandwidth to finish backups within an acceptable window.

Disk-to-Offsite

You could also opt to transfer your data directly offsite without keeping a local copy. This approach is essentially the same as the previous, but there’s nothing left at the primary location.

Backup Storage Examples

Let’s consider a few real-world-style examples.

Scenario 1

  • 4 virtual machines
    • 1 domain controller
    • 1 file/print VM
    • 1 application VM
    • 1 SQL VM
  • 300 GB total data
  • Cloud or ISP-based e-mail provider
  • No particular retention requirements
  • Uses line-of-business software with a database

This example is a fair match for a large number of small businesses. Some might have mixed their roles into fewer VMs and most will have somewhat different total backup data requirements, but this scenario should apply to a wide audience.

I would recommend using a set of portable hard drives in a rotation. I’d want a solid monthly full backup and a weekly full with at least two drives rotated daily. If using a delta-style application like Altaro VM Backup, the daily deltas are going to be very small so you won’t need large drives. Keeping historical data is probably not going to be helpful as long as at least one known good copy of the database exists.

If budget allows, I would strongly encourage using an offsite or third party storage-only provider to hold the monthly backups.

Probably the biggest thing to note here is that retention isn’t really an issue.

Scenario 2

  • 4 virtual machines
    • 1 domain controller
    • 1 file/print VM
    • 1 application VM
    • 1 SQL VM
  • 300 GB total data
  • Cloud or ISP-based e-mail provider
  • No particular retention requirements
  • Uses line-of-business software with a database

The layout here is the same as Scenario 1. As small as this is, it would be a good candidate for direct offsite transmission. Most backup applications that allow for such a thing allow a “seed” backup. You copy everything to a portable disk, have the disk physically transported to the destination backup site, then place that backup onto permanently-placed storage. From then on, nightly backups are just deltas from that original. Small businesses typically do not have a great deal of daily data churn, so this is a viable solution.

Scenario 3

  • 4 virtual machines
    • 1 domain controller
    • 1 file/print VM
    • 1 application VM
    • 1 SQL VM
  • 300 GB total data
  • Cloud or ISP-based e-mail provider
  • 5-year retention requirements for financial data
  • Uses line-of-business software with a database

This is the same as the first scenario, only now we have a retention requirement. To figure out how to deal with that, we need to know what qualifies as “financial data”. If your accountant keeps track of everything in Excel, then those Excel files probably qualify. If it’s all in the line-of-business app and it holds financial records in the database for at least five years before purging, then you probably don’t need to worry about retention in backup.

I want to take a moment here to talk about retention, because I’ve had some issues getting customers to understand it in the past. If you’ve got a 5-year retention requirement, that typically means that you must be able to produce any data that was generated within the last five years. It does not necessarily mean that you need to have every backup ever taken for the last five years. If I created a file in December of 2012 and that file is still sitting on my file server, then it was included in the full backup that I took on July 4th, 2016. I don’t need to produce an old backup. Retention mainly applies to deleted and changed data. So, in more real-world terms, if all of the data that is in scope for your retention plan is handled by your line-of-business application and it is tracking changes in the database for at least as far back as the retention policy, then the only thing that you need old backups for is if you suspect that people are purging data before it reaches its five-year lifespan. That’s a valid reason and I won’t discount it, but I also think it’s important for customers to understand how retention works.

Let’s say that the data applicable to the long-term retention plan is file-based and is not protected in the database. In that case, I would recommend investigating options for capturing annual backups. Retain twelve monthly backups and keep one per year. Annual backups can be discarded after five years. My preference for storage of annual backups:

  1. Third-party offsite storage provider
  2. Self-hosted offsite permanent disk storage
  3. Portable hard disk
  4. Tape

Remember that we’re talking about up to 5 TB of long-term storage, although I wouldn’t recommend trying to keep 100% of the 300 GB in each annual backup. 5 TB of offline storage is not expensive (unless you’re buying a tape drive just for that purpose), so this should be a relatively easily attainable goal.

Scenario 4

  • 7 virtual machines
    • 2 domain controllers
    • 1 file/print VM
    • 2 application VMs
    • 1 SQL VM
    • 1 Exchange VM
  • 1.2 TB total data
  • 5-year retention requirements for financial data

This is a larger company than the preceding ones and it has some different requirements. The first thing to sort out will be what the 5-year retention requirement applies to and whether it can be met just by ensuring that there is a solid copy of the database. Read the expanded retention notes in scenario 3, as they apply here as well.

Truthfully, I would follow generally the same plan as in scenario 1. The drives would need to be larger, of course, but 1.2 TB in a single backup is very small these days. With applications such as Altaro VM Backup able to target multiple storage drives simultaneously, this system could grow substantially before portable disks become too much of a burden for a nightly rotation. This is in contrast to my attitude from only a few years ago, when I would have almost undoubtedly installed a tape drive and instituted something akin to a GFS rotation.

Scenario 5

Let’s look at a larger institution.

  • 25 virtual machines
    • Multiple domain controllers
    • Large central file server
    • Multiple application servers
    • Highly available Exchange
    • Highly available SQL
  • 10 TB total data
  • 5-year retention plan; financial only by law, but CTO has expanded the scope to all data

Honestly, even though it seems like there is a lot going on here, 10 TB is much more than most installations that fit this description will realistically be using. But, I wanted to aim large. This scenario is probably not going to be well-handled by portable drives unless you have someone on staff that enjoys carting them around and plugging them in. Even tape is going to struggle with this unless you’ve got the money for a multi-drive library.

Here’s what I would recommend:

  • A data diet. 10 TB? Really?
  • A reassessment of the universal 5-year retention goal
  • 2 inexpensive 8-bay NAS devices, filled with 3 TB SATA disks in RAID-6, with a plan in the budget to bring in a third and fourth NAS

Part of this exercise is to encourage you to really work on assessing your environment, not just nodding and smiling and playing the ball as it lies. Ask the questions, do the investigations, find out what is really necessary. The last thing that you want to do is back up someone’s pirated Blu-Ray collection and then store it somewhere that you’re responsible for. “Employment gap to fulfill a prison sentence due to activities at a previous employer” is an unimpressive entry on a resume. Also, be prepared to gently challenge retention expectations. Blanket statements are often issued in very small and very large institutions because it sometimes costs them more to carefully architect around an issue than it does to just go all-in. Organizations in the middle can often benefit from mixed retention policies. So, before you just start drawing up plans to back up that 10 TB and keep it for 5 years, find out if that’s truly necessary.

My third bullet point assumes that you discover that you have 10 TB of data that needs to be kept for 5 years. That does happen. I’m also working from the assumption that any organization that needs to hold on to 10 TB of data has the finances to make that happen. I would configure the first NAS as a target for a solid rotation scheme similar to GFS with annuals. Use software deltas and compression to keep from overrunning your storage right away. All data should be replicated to the second NAS which should live in some far away location connected by a strong Internet connection. As space is depleted on the devices, bring in the second pair of NAS devices — by that time, 4 TB drives (or larger) might be a more economical choice.

I would also recommend bringing in a second tier of backup for long-term storage. That might take the form of an offsite provider or tape.

Hopefully, though, you discover that you really don’t need to back up 10 TB of storage and can just follow a plan similar to scenario 3.

Hyper-V Backup Strategies: Don’t Worry about the Management OS


It’s no secret that I’m a vocal advocate for a solid, regular, well-tested backup strategy. It’s why I work so well with my friends at Altaro. So, it might surprise you that I do not back up any of my Hyper-V management operating systems. I don’t recommend that you back up yours, either. If you are backing it up, and the proposition to cease that activity worries you, then let’s take some time to examine why that is and see if there’s a better approach. I’ll admit that my position is not without controversy. It’s worth investigating the situation.

Before we even open this discussion, I want to make it very clear that I am only referring to the management operating system environment itself, which would include Hyper-V settings. In no way am I advocating that you avoid backing up your virtual machines.

Challenge Your Own Assumptions

Any time we’re dealing with a controversy, I think the best thing to do is decide just how certain we are that we’re on the correct side. So, I’m going to start this discussion by examining the reasons that people back up their management operating systems to test their validity.

Must We Back Up Everything?

Through my years of service in this field, I have developed a “back up everything” attitude. Things that seemed pointless when they were available can become invaluable when they’re missing (as the sage Joni Mitchell tells us, “you don’t know what you’ve got ’til it’s gone”). When I first started working with hypervisors, I intended to bring that attitude and practice with me. However, the hypervisor and the backup products that I was using at that time (not Hyper-V and not Altaro) could not capture backups of the hypervisor. So, I worked with what I had and came up with a mitigation strategy that did not require a hypervisor backup for recovery. It’s coming up on a decade since I switched to that technique, and I have not yet regretted it.

Should Type 1 and Type 2 Hypervisors be Treated the Same Way?

I had been working with type 1 hypervisors in the datacenter for years before I even saw a type 2 hypervisor in action. The first time that I ever heard of one, it was from other Microsoft Certified Trainers raving about all the time that VMware Workstation was saving them. Back then, VMware offered a free copy of the product to MCTs. So, I filled out all the required paperwork and sent it in to see what all of the fuss was about… and never heard back from VMware. I shrugged and went on with my life. Over the years, I learned of other products, such as Virtual PC, but the proposition held very little interest for me. So, just by that quirk of fate, I never really used any virtualization products on the desktop until I had developed hypervisor habits from the server level.

It seems to me that many people have probably gained their experience with hypervisors in the opposite direction. Desktop type 2s first, then server type 1s. It also occurs to me that someone who started with type 2 hypervisors probably has a completely different mindset than I do. In order to examine that, I took a look at my current home system, which does use a desktop hypervisor. It looks like this:

Desktop Virtualization Environment


Whether my desktop hypervisor is type 1 or type 2 is irrelevant, really. I’d also like to avoid talking about the fact that the current security best practice is to not have all the management stuff be in a privileged virtual machine. The big point here is that my management operating system contains some really important stuff that I would be very sad to lose. At least one of my virtual machines is important as well. I’ve got a lot of script and code in one of them that I’d like to hold on to. So, I back up the whole box.

Now, let’s look at one of my Hyper-V hosts:

Host Server Environment

Do you see the difference? My server’s management operating system doesn’t have any data in it that doesn’t exist elsewhere.

So, let’s draw an important distinction: a Hyper-V server machine should not have any important data in the management operating system. I used a little “s” on server for a reason: follow the same practice whether it’s Hyper-V Server or Windows Server + Hyper-V. This statement is a truism and holds its own even when we’re not talking about backup and recovery. It is simply a best practice that happens to impact our options for backup and recovery.

Is a Bare Metal Restore Faster than Rebuilding the Management Operating System from Scratch?

The common wisdom for your average Windows system is that a bare metal restore returns you to full operating mode much more quickly than rebuilding that system from installation media and restoring all of its applications and settings from backup. Is that wisdom applicable to Hyper-V, though? Not really. You have the Hyper-V role and your backup application, maybe an antivirus agent, and that’s it (or should be). The bulk of the data is in the virtual machines, which don’t care if you retrieve them from bare metal restore or using a VM-aware method.

The truth of that will depend on how you’re performing your backup. If you have 100% local virtual machines and you are using a single method to back up the entire host — management operating system, guests, and all — then the full backup/bare metal restore method is faster. If there is any division in your process — one backup and restore method for the management operating system and another for the guests — or if the virtual machines are not stored locally, then bare metal restore is not faster.

Challenge Common Practice

Don’t do things just because it’s what other people do. We all start off doing that because anyone that’s done it even once has more experience than someone that’s just started. I switched from my habit of backing up everything to leaving the hypervisor out of backup because I was forced to and it worked out well for me. You can use my experience (and the experiences of others) to make an informed decision.

Challenge Your Own Practice

If you’ve been backing up your management operating system for a while, I wouldn’t be surprised if you’re reluctant to stop. No one likes to alter their routines, especially the ones that work. So, what I want you to do is not think so much about changing your routine, but think more about what brought you to that routine and if that is resulting in optimal administrative quality of life. There might be something that you should be changing anyway.

Are you doing any of these things?

  • Running another server role in the management operating system, such as Active Directory Domain Services or Routing and Remote Access
  • Using the management operating system as a file server
  • Storing guest backups in the management operating system’s file structure
  • Relying on backup to protect your management server’s configuration

You should not be doing any of these things. Every single item on that list is bad practice.

Running any role besides Hyper-V in the management operating system causes you to forfeit one of your guest virtualization rights. That role will also suffer in performance because the management operating system is last in line for resources. Some roles, such as Active Directory Domain Services, are known to cause problems when installed alongside Hyper-V. No other role is officially supported in the management operating system. If Hyper-V is installed, then you have immediate access to a hypervisor environment where you can place any such roles inside a guest. So, there is no valid reason to run a non-Hyper-V role alongside Hyper-V.

If you’re using the management operating system as a file server, then you’ve done everything that I just warned against in the previous paragraph. I only separated it out because some people are convinced it’s somehow different. It’s not. “File server” is a server role, as indicated by the word “server”. Even if the shared files are only used by the guests of that host, it’s still a file server, it is still competing with guests, it is still best suited to a virtual machine, and it is still a violation of your license agreement — and you’re doing it wrong. Create a file server guest, move all the files there, and back up the guest.

I truly hope that no one is actually storing their guest backups within the management operating system’s file structure. I hesitate to even legitimize that in any way by calling it a “backup”. It’s like copying the front side of a sheet of paper to the back side of that same sheet of paper and feeling like you’ve somehow protected the information on it.

The fourth item, using backup to protect your host configuration, has some validity. Let’s cover what that entails:

  • IP addresses
  • iSCSI/SMB storage connections, if any
  • Network card configuration
  • Teaming configuration
  • Windows Update configuration
  • Hyper-V host settings, such as VM file and storage locations and Live Migration settings

If a basic Hyper-V Server/Windows Server installation is around 20 GB, how much of it is consumed by the above? Hardly any. A completely negligible amount, in fact. There’s really no justification for backing up the entire management operating system just to capture what you see above.

A Better Strategy for Protecting Your Hyper-V Management Operating System

If we’re not going to back up the management operating system, then what’s our backup and recovery strategy? In the event of total system loss, what is our path back to a fully functional operating environment?

In a typical disaster recovery document, each system, or system type, is broken into two parts: the backup portion and the recovery portion. We’ll investigate our situation from that perspective and create a Hyper-V “backup” strategy.

“Backing Up” the Management Operating System

There are three parts to the “backup” phase.

  • A clean environment
  • A copy of all settings and drivers
  • Regularly updated installation media

Make Sure the Environment is Clean

Before you can even think about not running a standard backup tool, you need to make sure that your management operating system is “clean”. That means there can’t be anything there, besides the virtual machines, that you would miss in the event of a complete system failure. Look in your documents and downloads folders, check the root of the C: drive for non-standard folders, and use my orphaned VM file locator to see if you’ve got a detached VHDX with important things in it. Find a more suitable home for all of those items and institute a practice of never leaving anything of value on a hypervisor’s management operating system.

Protect Settings and Drivers

Next, we need to deal with the host settings. I don’t want you to need to memorize these or anything crazy like that. What I recommend is that you save these settings in a script file. I have designed one for you to customize. You can modify that and save it anywhere. You can keep it on one or more USB keys and leave one wherever your offsite backups are kept. You can copy it up to your cloud provider. It’s small enough to fit in any free online storage account. I also counsel you to keep those settings in a plain text file; doing so is a must if you’re not going to script your settings. Even if you are going to script them, you might have other data bits that you can’t add to that script, such as the IP address for an out-of-band management device. For small environments, I like to make paper copies of these and store them with other disaster recovery paper documents. Just for the record, this is not a Hyper-V-specific practice. Good backups or not, server configurations should always be documented.
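To give you an idea of the shape such a script can take, here is a minimal sketch. Every adapter name, IP address, and path in it is a placeholder that you would swap for your own values, and my customizable script does considerably more than this:

    # Minimal host rebuild sketch; all names, addresses, and paths are placeholders.
    # Recreate the team and the virtual switch
    New-NetLbfoTeam -Name 'ConvergedTeam' -TeamMembers 'NIC1','NIC2' -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic
    New-VMSwitch -Name 'ConvergedSwitch' -NetAdapterName 'ConvergedTeam' -AllowManagementOS $true
    # Management IP and DNS
    New-NetIPAddress -InterfaceAlias 'vEthernet (ConvergedSwitch)' -IPAddress 192.168.10.20 -PrefixLength 24 -DefaultGateway 192.168.10.1
    Set-DnsClientServerAddress -InterfaceAlias 'vEthernet (ConvergedSwitch)' -ServerAddresses 192.168.10.5
    # Hyper-V host defaults and Live Migration
    Set-VMHost -VirtualMachinePath 'D:\VMs' -VirtualHardDiskPath 'D:\VMs\Virtual Hard Disks'
    Enable-VMMigration
    # Dump the running configuration to plain text for the paper/offline copy
    Get-VMHost | Format-List * | Out-File 'C:\HostDocs\vmhost-settings.txt'
    Get-NetIPConfiguration | Out-File 'C:\HostDocs\ip-settings.txt'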

With your settings properly stored away, think next about what you need to run the environment. I’m specifically thinking about drivers and computer system manufacturer’s tools. If your operating environment is large enough, just keeping them on a file server might do. If your file server is dependent upon your Hyper-V host, that might not work so well. You could place them on the USB key with your settings, you could copy them to an external source that holds your virtual machine backups, or you could burn a DVD with them. You could also take the view that if you had to rebuild a host you’d probably skip using anything that’s been sitting in storage since your last disaster recovery update and just get the latest stuff from the manufacturer. If that’s the case, then record the must-have items in your documentation so that you don’t spend a lot of time looking through download lists. As an example:

  • Chipset drivers
  • Network adapter drivers
  • RAID drivers

Protect the Installation

I anticipate that the number one complaint I’ll get about the title/content of this article is that recovering a Windows Server or Hyper-V Server from installation media is a long and time-consuming process. That’s not because installation needs much time, but because Microsoft has ceased making any attempt to keep operating system ISOs up-to-date. As far as I know, the most recent 2012 R2 ISO that is generally available only includes Update 1, which is over two years old. Patching from that point to current could easily take over an hour, and that’s if you’ve got an in-house WSUS system and can skip the Internet pull. Therefore, that complaint is almost valid. However, it’s so easily mitigated that once you know how it can be done, it would be embarrassing not to.

The basic approach is simple. Stop using DVDs to deploy physical hosts. It’s slow and unnecessary. You have three superior options:

  • Deploy from USB media
  • Deploy from Windows Deployment Services (WDS)
  • Bare metal deployment from System Center Virtual Machine Manager

Out of all of these methods, the second is my favorite. WDS is relatively lightweight, with the heaviest component being the large WIM files that you must save. If you’re a smaller business, it would work well as a role on your file server. It’s still not necessarily feasible for the very small shops that only do a handful of deployments in the course of a year (if that). For those, the first option is probably the best. For the cost of two or three USB keys, you can fairly easily maintain ready-to-go reinstallation images. Bare metal deployment from VMM is my least favorite. I’ve used it myself for a while because I have a fair number of hosts to deal with and it has some Hyper-V-centric features that WDS does not. However, I’m starting to lean back toward the WDS method. VMM’s bare metal deployment has always been a rickety contraption, and each new update rollup adds new bugs and untested features that are relentlessly driving it toward being yet another vestigial organ in the morass of uselessness that is VMM.

The first two deployment methods utilize the Windows Imaging File Format (WIM). VMM uses VHDX. Native tools in Windows Server can update a Windows installation in either file type without bringing it online. That’s an extremely important time saver. You can schedule a patch cycle against these files while you sleep and have an always-ready, always-current deployment source if you should need to replace a failed host or stand up a new one. The only things that you need are an installation of Windows Server Update Services (WSUS) and a copy of a script that I wrote specifically for the purpose of updating VHDX and WIM files – get it here. If you haven’t been using WSUS, you should start as quickly as possible. It’s another lightweight deployment that mostly relies on disk space. If you only select Windows Server 2012 R2, it can use as little as 8 GB of drive space (the end-user products can require dozens of GB of space so be careful). That article also provides links to setting up a WDS system, if you’re ready to break free of media-based deployments once and for all.
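The heart of the technique, stripped down to its simplest form, looks something like the following sketch. The paths and the update file name are placeholders, and my script adds the WSUS integration and looping that this sketch leaves out:

    # Minimal sketch: apply a downloaded update package to an offline WIM image.
    # Adjust -Index to match the edition inside your WIM.
    $mountDir = 'C:\Mount'
    New-Item -Path $mountDir -ItemType Directory -Force | Out-Null
    Mount-WindowsImage -ImagePath 'C:\Deploy\install.wim' -Index 2 -Path $mountDir
    Add-WindowsPackage -Path $mountDir -PackagePath 'C:\Updates\example-update.msu'
    Dismount-WindowsImage -Path $mountDir -Save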

I typically don’t add drivers right into the image’s Windows installation because so many require full .EXEs, but you can add .inf-based drivers using the Add-WindowsDriver cmdlet. You can hack into my script a bit and see how I get to that point. What I am more inclined to do is mount the image file and copy the driver installers into a non-system folder so that I can run them after deployment.
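If you do want to slipstream .inf drivers, the same mount-and-commit cycle applies. A brief sketch with placeholder paths:

    # Inject .inf-based drivers into the mounted image, and stage .EXE-style
    # installers in a non-system folder to run after deployment.
    Mount-WindowsImage -ImagePath 'C:\Deploy\install.wim' -Index 2 -Path 'C:\Mount'
    Add-WindowsDriver -Path 'C:\Mount' -Driver 'C:\Drivers' -Recurse
    Copy-Item -Path 'C:\DriverInstallers' -Destination 'C:\Mount\PostDeploy' -Recurse
    Dismount-WindowsImage -Path 'C:\Mount' -Save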

“Restoring” the Management Operating System

With the above completed, you’re ready in case a Hyper-V host should need to be rebuilt. All you need to do is:

  • Install from your prepared media
  • Reload the drivers that you have ready (or just download new ones)
  • Re-establish your settings
  • Restore your virtual machines

I haven’t timed the process on current hardware, but I’d say that you should reasonably expect to be able to get through the first three in somewhere between 20 minutes and an hour. I think the hour is a bit long and depends on how quickly your host can boot. I know that the more recent Dell rackmount hosts take longer to warm boot than my old 286 needed to cold boot, but that time investment would be a constant whether you were performing a traditional bare metal restore or using one of these more streamlined methods. Once that’s done, the time to restore virtual machines is all a factor of the virtual machines’ sizes, your software, and your hardware.

How To Copy or Backup a VHD File While the VM is Running


I think that we can all agree that backup software exists for a reason. Well, lots of reasons. Very good reasons. So, if you ask me in the abstract how to make a copy or backup of a virtual machine’s virtual hard disk file while the virtual machine is running, I’m probably going to refer you to your backup vendor.

If you don’t have one, or don’t have one that you can trust, then I am naturally going to recommend that you download Altaro VM Backup. Backup is their wheelhouse and they’re going to have a lot more experience in it than any of us. The outcomes will be better than anything that we administrators can do on our own.

But, I also understand that sometimes you have one-off needs and you need to get something done right now.

Or, you need to script something.

Or your backup software isn’t quite granular enough.

Or you have some other need that’s unique to your situation.

If you need to get a copy or backup of one or more of a virtual machine’s hard disks without shutting down the virtual machine, you have three options, shown in their preferred order:

  1. Use your backup application, as we discussed.
  2. Export the virtual machine with Hyper-V Manager or Export-VM (a quick example follows this list). Exporting a running virtual machine only works on Hyper-V versions 2012 R2 and later.
  3. Copy the file manually.
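As a quick illustration of the second option, a live export from PowerShell can be as simple as the following; the virtual machine name and destination path are placeholders:

    # Export a running virtual machine (2012 R2 and later) as a complete offline copy.
    Export-VM -Name 'svtest' -Path 'D:\Exports'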

I’m not going to change my mind that a backup application is the best way to get that copy. But, I’m done beating that horse in this article.

Export is the second-best solution. The biggest problem with that is that it exports the entire virtual machine, which might be undesirable for some reason or another. It also locks the virtual machine. That won’t necessarily be a bad thing, especially if all you’re doing is getting a copy, but maybe it’s a showstopper for whatever you’re trying to accomplish.

That leaves us with option 3, which I will illustrate in this article. But first, I’m going to try to talk you out of it.

You Really Shouldn’t Manually Copy the Disks of a Live Virtual Machine

Manually copying a live VHD/X file isn’t the greatest idea. The best that you can hope for is that your copy will be “crash consistent”. The copy will only contain whatever data was within the VHD/X file at the moment of the copy. Any in-flight I/Os will be completely lost if you’re lucky or partially completed if you’re not. Databases will probably be in very bad shape. I’m sure that whatever reason that you have for wanting to do this is very good, but the old adage, “Because you can do a thing, it does not follow that you must do a thing,” is applicable. Please, think of the data.

OK, the guilt trip is over.

Just remember that if you attach the copied disk to a virtual machine and start it up, the volume will be marked as dirty and your operating system is going to want to run a repair pass on it.

Manually Copying a Hyper-V Disk the Dangerous Way

That header is a bit scarier than the reality. Most importantly, you’re not going to hurt your virtual machine doing this. I tested this several times and did not have any data loss or corruption issues. I was very careful not to try this process with a disk that housed a database, because I was fairly certain that doing so would break my perfect streak.

Use robocopy in restartable mode.
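The general format is as follows, with placeholders where your own folder and file names go:

    robocopy <source folder> <destination folder> <file name(s)> /Z

For example, copying a single VHDX to a network share might look like this (paths and file name are hypothetical):

    robocopy "D:\LocalVMs\Virtual Hard Disks" "\\backupserver\VHDCopies" demo.vhdx /Z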

It is important that you do not use a trailing slash on the folder names! If you want to copy multiple files, just enter them with a space between each.

Pros of the robocopy method:

  • It’s easy to remember
  • It works on anything that your account can reach — local storage, CSVs, SMB shares, whatever
  • Is “good enough”

Cons of the robocopy method:

  • Restartable mode (specified by the /z switch) is sssssllllllllooooooooow, especially over networks
  • There is no guarantee of data integrity. But, there’s no real guarantee of data integrity when manually copying a live VHD/X anyway, so that probably doesn’t matter.
  • I doubt that anyone will ever give you support if you use this method

For basic file storage VHD/X files, this is probably the best of these bad methods to use. I would avoid it for frequently written VHD/X files.

Manually Copying a Hyper-V Disk the Safer Way

A somewhat safer method is to invoke VSS. It’s more involved, though. The following is a sample; do not copy/paste it!
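The whole sequence looks something like this sketch. The shadow copy device name comes from the output of the first command, and the file locations are placeholders for your own:

    vssadmin create shadow /for=C:
    mklink /D C:\vssvolume \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy2\
    xcopy "C:\vssvolume\LocalVMs\Virtual Hard Disks\demo.vhdx" "D:\VHDCopies\"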

This is going to need somewhat more explanation than the robocopy method. We’ll take it line-by-line.

The first line tells VSSADMIN to create a shadow copy for the C: volume. The VHD/X file that I’m targeting lives on C:. Substitute your own drive here. The shadow copy becomes a standard Windows volume.

The second line creates a symbolic link to the newly created volume so that we can access its contents with the usual tools. You can discover what that line’s contents should be from the output of the previous command.

VSS Admin Create Shadow


We use mklink to create the symbolic link. The /D switch lets it know that we’re going to make a directory link, not a file link. After that, we only need to tell it what to call the link (I used C:\vssvolume) and then the target of our link. It is vitally important that you place a trailing slash on the target or your symbolic link will not work.

Next, we copy the file out of the shadow copy to our destination. I used XCOPY because I like XCOPY (and because it allows for tab completion of the file name, which robocopy does not). You can use any tool that you like.

That’s all for the file. You can copy anything else out of the shadow copy that you want.

We need to clean up after ourselves. Do not leave shadow copies lying around. It’s bad for your system’s health. The first step is to destroy the symbolic link:
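Assuming the link name used earlier (adjust to whatever you called yours):

    rmdir C:\vssvolume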

The last thing is to delete the shadow. There are multiple ways to do this, but my preferred way is to delete the exact shadow copy that we made. If you look at the output of your vssadmin create shadow command, it has all of the information that you need. Just look at the Shadow Copy ID line (it’s directly above the red box in my screen shot). Since VSSADMIN was nice enough to place all of that on one line, you can copy/paste it into the deletion command.
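The command looks like this, with an obvious placeholder standing in for your actual Shadow Copy ID:

    vssadmin delete shadows /shadow={00000000-0000-0000-0000-000000000000}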

You’ll be prompted to confirm that you want to delete the shadow. Press [Y] and you’re all finished! If you want to see other ways to remove VSS shadows, type vssadmin delete shadows without any other parameters. It will show you all of the ways to use that command.

Yes, this works for Cluster Shared Volumes. Get a shadow of C: as shown above and copy from the shadow’s ClusterStorage folder. I would take caution to only perform this from the node that owns the CSV.

Pros of the VSSADMIN method:

  • It’s completely safe to use, if you do it right.
  • It’s not entirely perfect, but some quiescing of data is done. The copied volume is still dirty, though.
  • Faster when copying to a network destination than robocopy in restartable mode.
  • Works for local disks and CSVs. Won’t work for SMB 3 from the Hyper-V host side.

Cons of the VSSADMIN method:

  • Tough to remember (but this blog article should live for a long time, and that’s what Favorites and Bookmarks are for)
  • If you don’t clean up, you could cause your system to have problems. For instance, you might prevent your actual backup software from running later on in the same evening.
  • May not work at all if another VSS snapshot exists
  • May have problems when third-party and hardware VSS providers are in use

If the copied data is important and/or changes frequently, you’ll likely get better results from the VSSADMIN method than from the robocopy method. It’s still not as good as the other techniques that I promised to stop harping on.

How to Import a Hyper-V Virtual Machine


 

One of the many benefits of virtualization is the ease of moving an entire operating environment from one place to another. Hyper-V includes several mechanisms for such portability. One of these is the export/import system, which involves creating an offline copy of a virtual machine. In 2008 R2 and prior versions of Hyper-V, the import system was quite fragile. It was drastically improved with 2012. Some functionality was lost in 2012 R2, but the process is nearly bulletproof.

Importing a Hyper-V Virtual Machine

In order to begin, you need a source Hyper-V virtual machine. As of 2012, the source no longer needs to have been exported first. Follow these steps:

  1. In Hyper-V Manager, select (highlight) the host that will own the imported virtual machine in the left pane.
    Select Host


     

  2. Right-click on the host and choose Import Virtual Machine. Alternatively, find the same entry under the Actions pane at the far right:
    Import Menu Option


     

  3. The Import Virtual Machine wizard will appear. The first screen is merely informational. Click Next.
  4. You will be presented with the Locate Folder page, where you’ll need to type or browse to the location of the virtual machine’s files. This screen can cause a great deal of confusion. To simplify it as much as possible, be aware that the wizard is looking for an .XML file that describes a virtual machine. It will look in the specified folder. If the specified folder has a Virtual Machines sub-folder, it will look there too. Once you have a source selected, click Next.
    Import Source Folder


     

  5. What happens on the next screen depends on whether or not it detects an importable virtual machine in the target location.
    1. If a suitable virtual machine was found, its name will appear and be selectable. If you see the virtual machine that you want to import, highlight it and click Next.
      Virtual Machine to Import


       

    2. If a suitable virtual machine was not found, you will receive a somewhat misleading error that says, The folder <foldername>\Virtual Machines could not be found. You might not have permission to access it. In this case, the folder that it mentions truly does not exist, but it’s also not the folder that I specified. I only told it to look at “C:\Exports”. When it didn’t find an XML file there, it then tried to find a Virtual Machines sub-folder. When it didn’t find that either, it generated the error. What makes it more confusing is that even if the folder does exist, you’ll get the same message if the only problem is that it didn’t find a suitable XML.
      No VM Source Folder Found


       

  6. If you found a suitable virtual machine in step 5, you’ll now be presented with the Choose Import Type screen. The “Import Types” discussion section after these steps details what the options mean. Make your selection and click Next.
    Import Type


     

  7. If you chose either Restore or Copy in step 6, the next screen will ask where you want to place the individual components of the virtual machine. If you do nothing on this screen, the files are placed in the host’s default location. Make your choices and click Next.
    Import Target Locations


     

  8.  Also for the Restore or Copy options in step 6, you’ll be asked where to place the virtual hard disks. They’ll all be placed in the folder that you choose here, but you can use Storage [Live] Migration to make adjustments later.
    Import Target Virtual Disk Locations


     

  9. The final screen is a summary, which allows you to review your selections. Click Back to make adjustments or Finish to start the process. If you chose the Restore option, the import will happen almost instantly. Otherwise, you’ll need to wait while the files are copied to their destination. Once that completes, the virtual machine will appear in Hyper-V Manager on the host that you initially selected.

Notes on Hyper-V Import

The import procedure is quite simple when all is well, but there are quite a few issues involved.

  • The native import process in 2012 R2 only works with virtual machines that were created or run on 2012 or 2012 R2. The 2012 import process is the most versatile, as it will accept virtual machines as old as 2008 and as new as 2012 R2. It has not been explained why Microsoft chose to cripple the import procedure in newer versions. If you wish to import a pre-2012 virtual machine, you will need to use a 2012 host.
  • Export and import can be used as a backup mechanism, but it lacks a great deal of functionality in comparison to true backup applications. If used for backup, it is best applied to virtual machines that require only a single backup instance. An example would be a software firewall implementation that is more or less steady-state.
  • In 2008 R2 and earlier, the import process required an EXP file that was only created by the export process. The EXP file is no longer used and the “export” procedure is mostly just a file copy.
  • The contents of the virtual machine files are not altered except to reflect any location or VMID changes necessary. If not properly handled, this can lead to the same types of issues that occur with cloned operating systems.
  • As indicated in the steps, all of the imported virtual machine’s virtual hard disks are placed in the same location. If you have issues with your storage space that might lead to problems, you have two options:
    • Place all of the virtual hard disks in the largest available space. Once the import completes, use Storage [Live] Migration to relocate the files as necessary.
    • If the export and import are on the same system, you can detach one or more of the disks before exporting the original. The import process will not be aware of them. Once the virtual machine is imported, reattach the disks.
  • In 2012 R2 and prior, your credentials only control whether or not you can start the import. It is the Virtual Machine Management Service (VMMS.EXE) on the importing host that performs the operation. Any Access denied errors that you receive will be because the computer account on the host performing the import does not have the proper permissions.

Import Types

In step 6, you are asked which type of import to perform. How you answer depends on the condition of the source files and on your goals.

Register the Virtual Machine In-Place (Use the Existing Unique ID)

With this option, none of the source files are moved or changed in any way. A symbolic link is created in C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines that points to the virtual machine’s true XML file (this behavior changes in the Windows 10/2016 code base; I do not know how these virtual machines are registered).

In-place registration is the preferred choice when a virtual machine has become unregistered from the Hyper-V host in some way, such as the symbolic link being damaged or destroyed by an anti-virus application or a failed Live Migration procedure. This is also the preferred choice if you manually copy files from a backup to their final destination. If you do not want to run the new virtual machine from where the source files are, make another choice. Choose in-place registration if:

  • The source files do not need to be preserved as-is.
  • The imported virtual machine will be the only live copy of this particular virtual machine.
  • The source location is appropriate for running a virtual machine on the specified host.

Hyper-V and many applications that are aware of Hyper-V virtual machines (such as the Altaro Backup for Hyper-V product) track virtual machines by their ID. As stated in the option text, in-place registration does not modify this ID.

Restore the Virtual Machine (Use the Existing Unique ID)

The primary difference between this option and the Register the Virtual Machine In-Place option is that all of the files are copied from the specified source location into the specified destination location. Hyper-V will run the imported virtual machine from that target folder. In the abstract sense, VMMS needs permissions to copy to that location in order for the import to work, so that ability is a positive sign that it will probably be able to load a virtual machine from that location as well. In concrete terms, that’s not a guarantee. Be certain to verify permissions.

Restoration is the optimal choice if you are importing from a manual backup location that stores copies of virtual machines in their operating format (most backup software does not store virtual machine backups in this way). Choose restoration if:

  • You want to preserve the source files unmodified.
  • The imported virtual machine will be the only live copy of this particular virtual machine.
  • The source location is inappropriate for running a virtual machine (ex: a NAS that only supports the SMB 2 protocol).

The note at the end of the in-place registration section regarding the virtual machine ID is the same for restoration.

Copy the Virtual Machine (Create a New Unique ID)

The third option makes the most changes. It copies the files from the source to the specified destination, but it also modifies the XML file and the virtual machine’s files and folders to use a new VMID. As for the new destination, much of what was said in the Restore section applies.

Copying the virtual machine is a good decision when you truly need to duplicate the virtual machine. The ID will change, so ID-aware applications like Altaro Backup for Hyper-V will see it as an all-new virtual machine. Use the Copy option if any of the following are true:

  • You want to preserve the source files unmodified.
  • You want the virtual machine to exist alongside the virtual machine it was copied from.
  • You want any software outside or inside the virtual machine to treat the copy as a new operating system instance.

Applications inside the copy, including Windows Server, will be able to detect that a change occurred. It will be treated as a hardware change, much like restoring to a new computer. What does not happen is any change to guest operating system identifiers. The Windows SID will not change and no computer names will be modified. IP addresses will be kept. SSL certificates will not be changed. While copying a virtual machine is certainly a powerful tool, consider using generalizing tools such as sysprep before exporting.
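If you prefer PowerShell, the three import types map roughly onto the Import-VM cmdlet as sketched below. The configuration file path and destination folders are placeholders, and <VMID> stands in for the virtual machine’s actual GUID-named XML file:

    # Register in place (existing unique ID); the files stay where they are.
    Import-VM -Path 'D:\Exports\svtest\Virtual Machines\<VMID>.xml'

    # Restore (existing unique ID); the files are copied to the specified destinations.
    Import-VM -Path 'D:\Exports\svtest\Virtual Machines\<VMID>.xml' -Copy `
        -VirtualMachinePath 'D:\ImportedVMs' -VhdDestinationPath 'D:\ImportedVMs\Virtual Hard Disks'

    # Copy (new unique ID); suitable for running alongside the original.
    Import-VM -Path 'D:\Exports\svtest\Virtual Machines\<VMID>.xml' -Copy -GenerateNewId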

 

Hyper-V and the Small Business: Sample Host Builds


 

I get quite a few requests to assist with sizing of hosts, and I turn all of them away. Some of it is that I can’t offer free consulting — it’s not fair to me and it’s not fair to the consultants that I would be so dramatically undercutting. The other part of it is that no one that has ever asked has given me nearly enough information to do a good job of it. While I do not intend to change my policy on not designing systems, I will share some theoretical builds based on hypothetical situations. You can measure your projections against them, adjusting for any differences, to see where you might be overspending or under-allocating.

General Guidelines

First things first. If you haven’t read my article with tips for host provisioning, start there, then come back. I won’t restate the contents of that post, but I will rely on its information.

Now, set your priorities. Most small businesses do not have sufficient funds to meet every desire. Therefore, you must decide what is most important. When it comes time, you’ll have a better idea of where to add or cut. Almost every small design will follow this priority order:

  • Storage
  • Memory
  • CPU
  • Network

Storage

All else being equal, I prefer having a larger number of small platters to having a few very large platters. The larger disks tend to be slower and present more risks. Adding spindles is the best way to improve storage performance. You might find yourself constrained by drive bay count, however. Your greatest challenge, then, is to not buy more storage than you need. How much do you need? Well, how much are you using today? Do you know how quickly it grew to that point? If you do, just project from there out to the maximum amount of time that you expect to use the new host, and add in a bit for overflow. If you're not certain, a good rule of thumb is that 100 GB today will need 130 GB in 5 years. Capacity is the most important thing to know when sizing storage.
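 

If you want to turn that projection into a number, a few lines of PowerShell arithmetic are enough. The figures below are purely illustrative; substitute your own current usage and measured growth rate.

  $currentGB    = 1024      # what you are using today
  $yearlyGrowth = 0.054     # measured or estimated annual growth (about 5.4% compounds to roughly 30% over 5 years)
  $lifespan     = 5         # years you expect to run the new host
  $overflow     = 1.10      # a little headroom on top

  $projectedGB = $currentGB * [math]::Pow(1 + $yearlyGrowth, $lifespan) * $overflow
  '{0:N0} GB projected at end of life' -f $projectedGB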

Once you have an idea of capacity, the next thing to figure out is what sort of fault tolerance you're going to provide. If you're using very large platters (I usually draw the line at anything over 1 TB per disk drive), then RAID-10 is really your best bet. For smaller platters, or if you're using solid-state disks, it really comes down to how many spindles you have and whether you want your money to go more toward capacity or more toward redundancy. With a very small number of disks (4), RAID-5 is usually the best balance. RAID-10 will give a small performance boost but not much protective benefit over RAID-5. RAID-10 actually has a worse redundancy profile than RAID-6 in a 4-disk array, although its performance will be noticeably better. If I were building a 4-disk array of spinning disks and couldn't meet the capacity requirement with drives small enough to be comfortable in RAID-5, I would definitely add spindles rather than move to larger disks.
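 

To make the capacity-versus-redundancy trade-off concrete, here is a quick sketch comparing a hypothetical 4-disk array of 600 GB drives. The failure-tolerance notes are the general rules for each level, not a statement about any particular controller.

  $disks  = 4
  $sizeTB = 0.6   # 600 GB drives
  'RAID-5 : {0:N1} TB usable, tolerates 1 failed disk'                  -f (($disks - 1) * $sizeTB)
  'RAID-6 : {0:N1} TB usable, tolerates 2 failed disks'                 -f (($disks - 2) * $sizeTB)
  'RAID-10: {0:N1} TB usable, tolerates 1 failed disk per mirror pair'  -f (($disks / 2) * $sizeTB)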

From there, I would counsel you to read my article on the RAID types. The biggest thing is to just avoid the FUD. I know of more than a few people out there trying their best to terrorize everyone into using RAID-10 by waving a lot of scary-looking charts and quoting statistically-unsound junk science. RAID-10 is perfectly fine if you’re willing to sacrifice half of your capacity for somewhat better performance, but it is not the automatic best fit for every situation. If you want better performance, your first choice should be to add spindles or switch to solid-state disks, not wring your hands over the RAID level.

Also, don’t do this:

[Image: Bad Hyper-V Array Design]

Splitting arrays to segregate I/O comes from early SQL Server design, where both arrays do different work constantly. If you build your arrays like that for Hyper-V, you’re going to have two disks that do nothing all of the time while the rest handle all of the work. It’s a waste of money and places an unnecessarily high burden on the virtual machine-bearing spindles. This is what you want to do:

[Image: Good Array Design for Hyper-V]

With all the drives in a single array, the work is spread out more evenly and across a greater number of spindles. You get better performance and reduced odds of a disk failing.

This also helps with the question of, “How should I partition my drives for Hyper-V?” If they’re all internal, that’s completely up to you. This is what I like to do:

[Image: Hyper-V Multiple Partitions]

This gives me good logical separation and safe places to use Storage Live Migration to fix things. The 100 GB for Hyper-V isn’t a hard number, but it’s worked very well for me.

Since you’re using internal storage, this is also a perfectly viable option:

[Image: Hyper-V Single Partition]

I know that the very idea that anyone might do something like this ruffles a lot of people's feathers, but there aren't any quantifiable reasons not to do it. Split partitions come from the days of split arrays, which, as we said before, are a SQL Server thing. Larger shops install Hyper-V on internal storage but run their virtual machines from a SAN, so they physically cannot do this. You're building a Hyper-V host on all-local storage. Having one big partition doesn't prevent upgradeability or portability or anything else. It's just unusual. If it works for you, that's completely fine.
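 

Whichever partition layout you settle on, point the host's default virtual machine and virtual hard disk locations at the volume that is meant to hold them, so that new guests don't silently land on the system drive. A minimal sketch, assuming a data volume mounted as V: (the paths are placeholders):

  Set-VMHost -VirtualMachinePath 'V:\Hyper-V' -VirtualHardDiskPath 'V:\Hyper-V\Virtual Hard Disks'
  # Confirm the defaults took effect.
  Get-VMHost | Select-Object VirtualMachinePath, VirtualHardDiskPath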

Finally, don’t do this:

[Image: Bad SSD Design for Hyper-V]

The above computer has its fastest, most reliable disks doing nothing and its slowest, most failure-prone disks bearing all the weight. There is no sense in this design whatsoever. If you can’t afford SSDs for your virtual machines, then don’t buy any at all. Put comparable spinning disks in those first two slots and make your array encompass them all as described above.

I will reiterate from my previous post that only hardware RAID controllers should be used, especially with internal storage.

Memory

There isn’t much to say about memory. If you know what you need, buy that, plus some for expansion, plus 2 GB for the management operating system. If you don’t know what you need, well, you’re probably going to buy too much.

CPU

Modern CPUs are difficult to overdrive, especially in the small business environment. If you're looking forward to Windows Server 2016 licensing, buy a dual 8-core system to get the most bang for your licensing dollar. I did a little poking around as I was writing this post, and it does appear that everything above 8 cores per CPU gives diminishing returns. While it's difficult to give great guidelines here, I would be surprised if any typical 50-user business couldn't cover all of its server needs with a single 6-core CPU.

Do not scale out to the second CPU unless you have good reason to believe that you'll need it. Unlike a lot of other resources, CPUs are bought in pairs and quads to give more computing power to a single box, not for redundancy. In my entire career, I've seen exactly two CPUs fail (not counting dead-on-arrival units and improper user installations). Of those two, one failed because the owner didn't believe us when we told him that CPUs run hot and that he shouldn't touch it. His disbelief was rewarded with a burned-out CPU and a permanent reverse imprint of the Intel logo on his thumb. I'll tell you what we told him: "No, that's not covered under warranty." Now that we use heatsinks on all CPUs, I don't think that will be a problem again. So, with only one non-user-error CPU failure across a 15-year career, I think it's very safe to say that buying a second CPU just for redundancy is a waste of money.

When you have two or more CPUs in the same box, memory balancing becomes an issue. While I struggle to envision the small business whose performance needs would be negatively impacted by a poorly designed NUMA layout, I still encourage you to work with your hardware vendor to ensure an optimal configuration.
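 

If you do end up with two sockets and want a quick sanity check on the NUMA layout, the Hyper-V module can show how logical processors and memory are split across nodes. A minimal sketch:

  # One row per NUMA node, including its logical processors and the memory local to it.
  Get-VMHostNumaNode
  # NUMA spanning is enabled by default; this shows the current setting on the host.
  Get-VMHost | Select-Object NumaSpanningEnabled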

Network

Networking is probably the easiest thing to size. For a standalone host with all internal storage in a typical small business, two gigabit adapters is perfect. If you’re going to use iSCSI/SMB, four gigabit adapters is perfect. To justify anything more, you’re either not fitting the small business mold very well or you’re building a cluster. For a small business cluster, four gigabit adapters for non-iSCSI/SMB hosts and six gigabit adapters for hosts that will use iSCSI/SMB is a solid number. If anyone tries to sell you anything more or anything bigger, demand concrete supporting evidence.
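 

For the two-adapter standalone host, the team and the virtual switch can be created with two commands. This is only a sketch; the adapter, team, and switch names are placeholders, and it follows the switch-independent/dynamic arrangement recommended above.

  # Team the two physical gigabit adapters (adapter names are placeholders).
  New-NetLbfoTeam -Name 'HostTeam' -TeamMembers 'NIC1','NIC2' `
      -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic
  # Bind the Hyper-V virtual switch to the team and share it with the management operating system.
  New-VMSwitch -Name 'vSwitch' -NetAdapterName 'HostTeam' -AllowManagementOS $true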

Small Business Management Operating System Choice

Ordinarily, I push for everyone to install Hyper-V Server (that's the free one) and manage it remotely. However, I would be inclined to counsel a lot of small businesses to consider installing the full GUI of Windows Server as the management operating system instead. This is one of those cases where it's necessary to understand the best practice and why it may not apply to you.

  • Security: One common reason to use Hyper-V Server or Windows Server in Core mode is security. There are fewer components in use, which means fewer things for attackers to target. There’s also no Explorer, which is one of the favorite targets of attackers. However, these concerns are easy to blow out of proportion. In a major datacenter with thousands of servers, worrying about patching Explorer flaws on all of them can drive a security office insane. In a small office with one physical server and no security office, the concern is much less. Don’t browse the Internet from the server and most worries are instantly alleviated.
  • Reboots: Supposedly, a system only running Hyper-V Server needs fewer reboots for patches than a full GUI because it needs fewer patches. Well, whether this patch cycle has eight patches that need a reboot or one patch that needs a reboot, it all winds up with the same outcome. Truthfully, I don’t even know how it pans out because all of my patch reboots are automated to occur when nobody is around to care and I almost never follow up. I’m going to make a guess that if I set any given small business’s Hyper-V host to patch as necessary at 4 AM every Wednesday and allow a reboot if one is required, I’m probably not going to need to run from anyone chasing me with a pitchfork. If you have a cluster, Cluster Aware Updating should be taking care of everything for you.
  • Management: I often push for Hyper-V Server because I think that a Hyper-V host should be deployed and then never directly logged on to again. In organizations with multiple physical hosts, that’s a relatively simple reality to accomplish. In sites with one or two hosts, it’s less likely. If it would be a hardship to have a management system, don’t use one.
  • Features: You aren’t supposed to run anything alongside Hyper-V unless it is for managing or protecting Hyper-V and/or the virtual machines. Anything else causes you to forfeit one of your Windows Server guest virtualization rights, places a drag on the guests, and is probably better suited to a guest anyway. However, you might want to use Storage Spaces in the management operating system. If that’s you, then Windows Server is the only way to go because none of the advanced storage capabilities are supported by Hyper-V Server. However, take these notes to heart before you do that:
    • Like Hyper-V, Storage Spaces works best when it is all by itself. I like Storage Spaces as a technology and I think Microsoft has done some fantastic work with storage in the last few years, but I still would be reluctant to combine these two roles in most cases. A hardware RAID controller will be more cost-efficient and supportable. If you choose to use Storage Spaces anyway, make sure to increase CPU, RAM, and, if possible, the count and speed of the disks that you use. For context, I am using a stand-alone Storage Spaces host in my lab and it routinely alerts me that it is using 3.75 GB out of the installed 4 GB, and almost all of it is for storage-related tasks.
    • If you only use Storage Spaces to hold virtual machine files (BIN, VSV, VHD/X, XML, and SLP) and anything that will be used on the local host, it does not cost you a guest virtualization privilege. If you create a share on it and host anything else at all, you lose a privilege.
    • Windows Server's deduplication feature (it applies to volumes, whether or not they sit on Storage Spaces) is only supported with running virtual machines when those are VDI (virtual desktop) workloads. It does work for server loads (I have it in my lab), but you can find yourself in a position where no one can or will help you in production. Also, the outcome of deduplicating server loads is not nearly as satisfying as it is for VDI. A supported method that does not forfeit a guest right is to build a file server virtual machine on a dynamically expanding VHDX and use normal deduplication within the virtual machine. That method's space yield is commonly not very satisfying either, but there is no question around supportability.
  • PowerShell: I'm never going to stop beating the drum for PowerShell adoption. DevOps and control by keyboard are the present and the future as much as they are the past. However, there's not as much going on in a one- or two-host environment to create a strong driver for intimate PowerShell knowledge. To be sure, if you're vying for that sysadmin job at a larger company, you're going to be deservedly knocked out of consideration very quickly if you've decided that you're too good for PowerShell. But, if you're a small business principal who is only wearing the IT hat because you can't afford to pay someone else, acquiring a strong PowerShell skill set is probably asking too much. You'll still need enough familiarity to not be afraid of using PowerShell solutions to fix your problems, but it's OK to stay rooted in the GUI. A few starter commands appear after this list.
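 

To show how shallow the water really is, these are the kinds of commands a reluctant small-business administrator is most likely to lean on. The virtual machine name is a placeholder.

  Get-VMHost                                                       # basic host settings at a glance
  Get-VM | Select-Object Name, State, CPUUsage, MemoryAssigned     # what's running and what it's consuming
  Checkpoint-VM -Name 'SVR1' -SnapshotName 'Before application upgrade'   # quick safety net before a change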

Sample Build #1

So, let’s move on to our samples.

Scenario

These are the parameters for this scenario:

  • 20 user organization, single site
  • Uses a cloud service for e-mail
  • Has 1TB of file storage with typical growth
  • 2 printers on-premises

Suggested Build

This one looks really easy, but one thing should never be forgotten: I would spend some time with Performance Monitor and build from what it actually reports. From what I've seen of organizations in this scenario, the server that I would build would look like this (a scripted sketch of the domain controller guest follows the list):

  • Quad-core CPU
  • 8-16 GB RAM
  • 4x 600 GB SAS in RAID-5; prefer chassis with additional bays
  • 2x gigabit; paired in a switch-independent, dynamic team. It would handle management operating system traffic and all virtual machine traffic.
  • 1x Windows Server 2012 R2 Standard license
  • Virtual machines
    • 1x domain controller
      • Generation 2
      • 2 vCPU
      • 60GB dynamically expanding C:
      • Dynamic RAM; 512MB startup, 512MB minimum, 3GB maximum
    • 1x file/print server
      • Generation 1 or 2 — Generation 1 still seems to be a bit more stable
      • 2 vCPU
      • 60GB dynamically expanding C:
      • 1.5TB dynamically expanding D:
      • Dynamic RAM; 512MB startup, 512MB minimum, 4GB maximum
  • 2x 2TB external HDDs for daily backup rotation
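 

As promised above, here is a minimal sketch of how the domain controller guest in this build could be created with the Hyper-V PowerShell module. The VM name, paths, and switch name are placeholders, and you would still install and promote the guest operating system afterward.

  # Generation 2 guest with a 60 GB dynamically expanding system disk.
  New-VM -Name 'DC01' -Generation 2 -MemoryStartupBytes 512MB `
      -NewVHDPath 'V:\Hyper-V\Virtual Hard Disks\DC01.vhdx' -NewVHDSizeBytes 60GB `
      -SwitchName 'vSwitch'
  # Dynamic memory: 512 MB startup/minimum, 3 GB maximum; two virtual processors.
  Set-VM -Name 'DC01' -DynamicMemory -MemoryMinimumBytes 512MB -MemoryMaximumBytes 3GB
  Set-VMProcessor -VMName 'DC01' -Count 2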

Discussion

This build is difficult to undersize. Even though it will likely not need anything more than 8GB of memory, doubling it to 16GB will probably not be expensive and will give the system room for growth. The main issue holding this system back is really the Windows Server licensing. Adding even one more virtual machine will require the purchase of another Windows Server license (assuming, of course, that Windows Server is the operating system that would be used).

Anything larger than a quad-core CPU will be wasteful, and even a quad-core won’t be used to anything resembling its capacity.

Disk space is also well-sized against the need. At current growth rates, the system is unlikely to need even as much as it has until the fifth year. If even one bay is left open, the array can later be expanded with another 600GB drive, which is quite a substantial increase when growing from 1.8TB.

The 2TB backup disks might be a bit small, but remember that most data is static in nature. Data churn is likely to be small. These disks should be sufficient for day one, but the organization should be prepared to acquire larger disks if necessary.

Sample Build #2

The above scenario is more likely to exist in a textbook than in the real world, although I've seen some that fit. This sample build is going to be more realistic and incorporate some of the harder realities faced by small businesses.

Scenario

These are the parameters for this scenario:

  • 20 user organization, single site
  • Uses a cloud service for e-mail
  • Has 1 TB of file storage with typical growth
  • Has a line-of-business application server whose developer publishes the following requirements:
    • 4 GB of RAM, fixed
    • 500 GB of HDD, fixed
    • 2 CPU reservation
  • Multiple client-based (no server) applications
  • 2 printers on-premises

This scenario takes the company from scenario #1 and adds in what most small businesses really have: an application system that is provided by a software house that caters to their industry along with a smattering of other applications. Ideally, that application would be placed on its own server. Realistically, that would mean another full Windows Server license, which might not be feasible.

Suggested Build

  • Single hex-core CPU
  • 16 GB RAM
  • 6x 600 GB SAS in RAID-5; prefer chassis with additional bays
  • 2x gigabit; paired in a switch-independent, dynamic team. It would handle management operating system traffic and all virtual machine traffic.
  • 1x Windows Server 2012 R2 Standard license
  • Virtual machines
    • 1x domain controller
      • Generation 2
      • 2 vCPU
      • 60GB dynamically expanding C:
      • Dynamic RAM; 512MB startup, 512MB minimum, 3GB maximum
    • 1x file/print/application server
      • Generation 1 or 2 — Generation 1 still seems to be a bit more stable — app vendor recommendations apply
      • 4 vCPU; 50% reserved
      • 60GB dynamically expanding C:
      • 1.5TB dynamically expanding D:
      • 500GB fixed E:
      • Fixed RAM: 8GB
  • 2x 2TB or 3TB external HDDs for daily backup rotation

This build doesn’t change a great deal over the original. The virtual machine that was serving as only file and print has been augmented to host the application server. We’ve doubled its memory allotment to 8GB fixed, given it another drive that exactly matches the vendor’s specification, and established CPU reservations to ensure that the application vendor is satisfied.

As with the previous build, I believe that the CPU on this system will be largely wasted. However, the fiscal distance between a quad-core and a hex-core processor is usually minimal. Since your application vendor demands two of them, having another two at your disposal gives you a bit more flexibility to grow.

The bit about the client-based applications is just there to round out the scenario. They don't run from a server, so you'll probably have a "Software" or an "Apps" share with their installers on the file server. Other than that, any provisioning that you do for them happens at the desktop/laptop computer level.

Sample Build #3

Let’s move up a little bit to a somewhat larger company.

  • 50 user organization, single working site with disaster recovery site
  • Hosts Exchange on-premises
  • Has 2 TB of file storage with typical growth
  • Has a line-of-business application server whose developer publishes the following requirements:
    • 4 GB of RAM, fixed
    • 50 GB of HDD, fixed
    • 2 CPU reservation
    • SQL
  • The line-of-business SQL server has these requirements:
    • 8 GB of RAM
    • 1TB of HDD
    • 4 CPU reservation
  • Multiple client-based (no server) applications
  • 5 printers on-premises

Even though the bullet points don’t look substantially different, we’ve made some major changes. Let’s begin by breaking down the roles we need to plan for:

  • Active Directory (including DNS & DHCP)
  • Exchange
  • SQL
  • Application
  • File/print

We’ve come up with an odd number of roles again (5 printers is not nearly enough to separate them from the file role). Since Windows Server license coverage for virtual guests comes in pairs, we need to decide whether to bump up to three licenses and have an unused set or try to keep it at two. An additional $800USD for a third Windows Server license probably would not be a major barrier for most 50-user organizations, but let’s assume the worst and try to architect for only two licenses.

Suggested Build

  • Single 8-core or dual hex-core CPU
  • 48 GB RAM
  • 6x 2TB SATA in RAID-10
  • 2x gigabit; paired in a switch-independent, dynamic team. It would handle management operating system traffic and all virtual machine traffic.
  • 2x Windows Server 2012 R2 Standard license
  • Virtual machines
    • 1x domain controller
      • Generation 2
      • 2 vCPU
      • 60GB dynamically expanding C:
      • Dynamic RAM; 512MB startup, 512MB minimum, 3GB maximum
    • 1x file/print/application server
      • Generation 1 or 2 — Generation 1 still seems to be a bit more stable — app vendor recommendations apply
      • 4 vCPU; 50% reserved
      • 6 GB fixed RAM
      • 60GB dynamically expanding C:
      • 3TB dynamically expanding D:
      • 50GB fixed E:
    • 1x SQL Server
      • Generation 1
      • 4 vCPU; 100% reserved
      • 8 GB fixed RAM
      • 60GB dynamically expanding C:
      • 500 GB fixed D: (data)
      • 500 GB fixed E: (logs)
    • 1x Exchange Server
      • Generation 1
      • 4 vCPU
      • 12 GB fixed RAM
      • 60GB dynamically expanding C:
      • 400GB fixed D:
  • 4x 2TB external HDDs for daily backup rotation, two per day

One thing that you need to be aware of is that a lot of people will insist that this disk configuration is insufficient for either SQL or Exchange separately, much less together. That thinking is over a decade out of date. Exchange Server has needed progressively fewer IOPS with each release, to the point that Microsoft now certifies it to run on so-called "near-line" SATA. Exchange now uses lots of RAM instead of lots of disk, and the 12 GB in this design will keep it happy while also meeting Microsoft's requirements. Your only concern here is having sufficient space. At 400GB in the initial build, that's close to 8 GB per mailbox, which is far more generous than I tend to be (anything over 2 GB and I start reminding users that Outlook is neither a customer-relations database nor a file server).

As for SQL, there is no magical number of IOPS to satisfy it. It all comes down to how it will be used. No one can determine this in the abstract. In my experience, most databases backing a vendor application for 50 users will not be a very heavy load.

My concern with this build is storage utilization. In order to maintain expansion capabilities, I have switched from fast SAS to somewhat slower SATA, albeit in the faster RAID-10 configuration. Had we continued with the 600GB SAS drives, we would have needed to use 8 bays, which is the most commonly available maximum. While it is possible to acquire hosts with more than 8 drive bays, it’s not the most practical approach for a single-host site. Even with an 8-bay system, the host would have started life by bumping up against the maximum capacity. One thing that I would try to do is investigate the possibility of reducing disk needs. 2 TB is a lot of file storage for only 50 users. If they’re engineers storing CAD drawings, it’s plausible. If they’re insurance agents, someone needs to transfer their MP3 collection to their own device.

You can’t always give much pushback on things like that. If they’re accustomed to having lots of space, taking it away will be difficult.

One other option would be to use the filled 8-bay SAS system. I would plan for such a system to expand out within its lifetime. You would start with that host, but set the expectation of an additional host at some point in the future. When its day comes, use Shared Nothing Live Migration to move roles to it. A secondary host would also open the door to a secondary domain controller.

Another option would be to transition the organization to hosted Exchange. The space reduction would allow you to back down the storage requirement significantly.

 

How to Setup an Altaro Offsite Server in Microsoft Azure

Welcome back, everyone, for Part 2 of our series on hosting an Altaro Offsite Server in Microsoft Azure! In Part 1 we covered the planning and pricing aspects of placing an Altaro Offsite Server in Microsoft Azure. While that post was light on technical how-to, this post is absolutely filled with it!

Below you’ll find a video that walks through the entire process from beginning to end. In this video we’ll be doing the following:

  1. Provision a new 2012 R2 virtual machine inside of Azure
  2. Configure Azure Security Group port settings
  3. Define external DNS settings
  4. RDP into the new server and install the Altaro Offsite Server software
  5. Attach a 1TB virtual data disk to the VM
  6. Configure a new Altaro Offsite Server user account and attach the virtual disk from step 5
  7. Log into the on-premises instance of Altaro VM Backup and define a new offsite backup location

Once these steps are complete, you'll have a good starting offsite server to vault your backups to. I would like to note, however, that for the purposes of this demo, it is assumed that you have no more than 1 TB's worth of data to vault offsite. Microsoft Azure imposes a hard 1 TB size limitation on virtual hard disks, and while there are ways around this limitation, they are outside the scope of the basic installation and setup instructions included in this post. I will be covering those situations in the next part of this series. Outside of that, the installation instructions covered here are the same regardless.
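 

For reference, attaching the 1 TB data disk (step 5) can also be scripted. The sketch below assumes the classic (service management) Azure PowerShell module that matches the walkthrough in the video; the cloud service and virtual machine names are placeholders, and 1,023 GB is the largest data disk the platform allowed at the time of writing.

  # Create a new empty data disk just under the 1 TB ceiling and attach it at LUN 0.
  Get-AzureVM -ServiceName 'altaro-offsite-svc' -Name 'AltaroOffsite01' |
      Add-AzureDataDisk -CreateNew -DiskSizeInGB 1023 -DiskLabel 'OffsiteStorage' -LUN 0 |
      Update-AzureVM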

The process is fairly straightforward, and I've presented it in a way that doesn't require a full understanding of Azure to work. However, I highly encourage you to take the time to learn about how Azure functions. With that said, let's get to the video!

 

As you can see, the process really isn't that difficult once it's broken down. If you have any follow-up questions or need clarification on anything, feel free to let me know in the comments section below, and stay tuned for more advanced scenarios coming up next in this series!

Thanks for reading!

 

 
