Although there are countless backup applications on the market, nearly all of them rely on one of a few high-level methods for creating backups. Some create image backups, while others concentrate on file-level backups. A few solutions let the backup operator choose which method to use. This article explores the difference between image-based backups and file-based backups and compares the advantages and disadvantages of each approach.
Note: Block-level backups will be covered in a separate article.
What is a system image backup?
An image backup is exactly what it sounds like. It is a full copy of a computer’s contents. In other words, an image backup is a mirror image of the computer’s hard drive.
What is the operational impact of a system image backup?
There are a few things to consider regarding the operational impact of image backups. First and foremost, depending on the backup software being used, an image backup may limit your options for restoring data. Some backup vendors allow you to perform granular recovery operations from an image-based backup (such as restoring an individual file). Others only allow a full restoration and do not support restoring individual files or folders.
It is also worth considering that a PC’s hardware may limit its ability to restore an image backup. Suppose that a PC is equipped with a 2 TB hard drive but contains only half a terabyte of data. Now imagine that the hard disk fails and that the only spare immediately available is a 1 TB disk. Using the smaller disk shouldn’t be a problem, because the PC contains only half a terabyte of data. Even so, restoring an image backup to the smaller disk may be impossible because the image was based on a much larger disk.
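The disk-size scenario above can be expressed as a simple pre-restore check. This is only a sketch: the function name and the supports_shrink flag are hypothetical, standing in for whether a particular product can resize the image during restoration, which varies by vendor.

```python
TB = 10**12  # decimal terabyte, as used on drive labels


def can_restore_image(source_disk_bytes, used_bytes, target_disk_bytes,
                      supports_shrink):
    """Return True if an image of the source disk can go onto the target disk.

    supports_shrink is a hypothetical flag: does the backup tool allow an
    image to be restored to a disk smaller than the original?
    """
    if target_disk_bytes >= source_disk_bytes:
        return True  # same size or larger: the image always fits
    # Smaller target: the data must fit AND the tool must be able to shrink.
    return supports_shrink and used_bytes <= target_disk_bytes


# The example from the text: 2 TB source, 0.5 TB of data, 1 TB spare disk.
without_shrink = can_restore_image(2 * TB, TB // 2, 1 * TB, supports_shrink=False)
with_shrink = can_restore_image(2 * TB, TB // 2, 1 * TB, supports_shrink=True)
print(without_shrink)  # False
print(with_shrink)     # True
```

The data fits in both cases. What decides the outcome is whether the tool can restore an image to a disk smaller than the one it was taken from.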
Another consideration is that because images are essentially full copies of a computer’s hard disk, image backups tend to be quite large. Most image backup solutions do create images that are smaller than the hard disks they are backing up, but the images still tend to be large because they typically do not use technologies such as deduplication to reduce their size, and because they may include temporary files or a copy of the Windows pagefile. Again, each vendor has its own way of doing things, so some image backup solutions will inevitably create smaller images than others.
Image size is an important consideration because it has a direct impact on backup storage cost. Similarly, an organization that wants to use a cloud-based backup target may find it impractical to do so if its backup software produces excessively large images.
What is a file-based backup?
Whereas an image-based backup attempts to create a full copy of an entire hard disk, a file-based backup focuses on backing up individual files and folders.
What is the operational impact of a file-based backup?
Early on, file-based backups performed direct copies of the files residing on a protected system. The problem with this approach, however, is that most of the backup applications of the time were unable to back up open files. This meant that the operating system and the applications could not be protected, nor could the backup application protect any documents or data files that a user was actively working on.
Modern file-based backup solutions tend to use changed block tracking as an alternative to direct file copies. With changed block tracking, the backup software initially makes a backup copy of every storage block on the protected system. Deduplication is often used to ensure that duplicate blocks are not backed up more than once, thereby reducing the time required to create the initial backup and shrinking the backup footprint. Subsequent backups protect any newly created or modified storage blocks and run every few minutes, as opposed to the nightly backups that were once common practice.
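The following is a minimal sketch of that idea, not any vendor's implementation. The names (BlockStore, backup) and the tiny 4-byte block size are assumptions made for illustration. Note also that a real product learns which blocks changed from the storage layer; here, comparing block hashes against the previous recovery point stands in for that tracking.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

BLOCK_SIZE = 4  # unrealistically small, so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut a byte string into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BlockStore:
    """Content-addressed store: identical blocks are kept only once."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}

    def put(self, block: bytes) -> str:
        digest = hashlib.sha256(block).hexdigest()
        self.blocks.setdefault(digest, block)  # deduplication happens here
        return digest


def backup(data: bytes, store: BlockStore,
           previous: Optional[List[str]] = None) -> Tuple[List[str], int]:
    """Return (recovery point, number of blocks treated as new or changed)."""
    recovery_point: List[str] = []
    changed = 0
    for index, block in enumerate(split_blocks(data)):
        digest = hashlib.sha256(block).hexdigest()
        unchanged = (previous is not None and index < len(previous)
                     and previous[index] == digest)
        if not unchanged:
            changed += 1
            store.put(block)
        recovery_point.append(digest)
    return recovery_point, changed


store = BlockStore()
full, changed_full = backup(b"AAAABBBBAAAACCCC", store)
unique_after_full = len(store.blocks)
incr, changed_incr = backup(b"AAAABBBBAAAADDDD", store, previous=full)

print(changed_full, unique_after_full)  # 4 3
print(changed_incr, len(store.blocks))  # 1 4
```

The initial backup examines all four blocks but stores only three, because the repeated block is deduplicated. The second run stores just the one block that changed.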
The main advantage of this type of backup is that it allows for the frequent creation of recovery points. Additionally, because each backup cycle is only protecting the storage blocks that have been created or modified since the previous backup, backups tend to be very small in size.
It is worth noting, however, that some (but not all) file-based backup solutions are incapable of performing a full restoration of an entire physical or virtual machine. Such solutions protect data but may be incapable of restoring the operating system or applications.
Image-Based Backup vs. File-Level Backup
When properly implemented, both image-based backups and file-based backups are viable solutions for protecting a computer’s contents. Even though some file-based backup applications are incapable of performing full system restorations, there are file-based backup solutions that are able to protect the operating system, applications, and everything else on a computer. Similarly, some image-based backup solutions do not allow for granular restoration of files, folders, and other objects, but there are those that do. As such, the ability to perform both bare metal and granular restorations needs to be considered when selecting a backup solution, but this consideration does not by itself rule out either image-based or file-based backup technology.
The main things that need to be considered when choosing between the two technologies are the frequency with which recovery points can be created, and the size of the backups. Although there are some image-based backup products that are able to create differential images, such products tend to be extremely limited in the number of recovery points that they are able to create over the course of the day. Conversely, file-based backup solutions that are based on changed block tracking are generally able to create recovery points every few minutes. Likewise, image-based backup solutions tend to create much larger backups than those produced by file-level backup solutions.
How does Altaro tackle backups?
Altaro takes a block-based approach to backups (which is the best of both worlds) but does so in a way that allows entire physical or virtual machines to be backed up and restored. In fact, Altaro fully supports Windows VSS and is application-aware as well.
Figure 1
Because Altaro uses a block-based approach to backups, it supports continuous data protection, with backups being created as frequently as every five minutes. Additionally, technologies such as augmented inline deduplication help to increase the speed of the backup process, while also significantly reducing backup storage requirements. This can be extremely beneficial to organizations that wish to back up or restore their data to the cloud.
Continuous data protection solutions, such as the one used by Altaro, are based on changed block tracking. This means that the backup application monitors the protected system’s storage to keep track of how storage blocks are being used. If the operating system writes data to a storage block, that data is backed up.
One of the problems that has long been associated with this approach is that because each scheduled backup only backs up storage blocks that have been created or modified, restorations almost always require both new and previously existing storage blocks to be recovered. This isn’t a problem if the backup is healthy, but if any corruption exists within the backup, then that corruption may inhibit an organization’s ability to recover data from even the most recent recovery point.
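A toy model can make the tracking step concrete. The TrackedDisk class below is hypothetical and is not how Altaro or any specific product implements it. It simply records the number of every block that is written, so that a backup cycle can collect only those blocks.

```python
class TrackedDisk:
    """In-memory stand-in for a disk that records which blocks were written."""

    def __init__(self, block_count, block_size=4):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(block_count)]
        self.dirty = set()  # numbers of blocks written since the last backup

    def write(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = data
        self.dirty.add(index)  # the tracking step

    def take_changes(self):
        """Return the changed blocks and reset tracking (one backup cycle)."""
        changes = {i: self.blocks[i] for i in sorted(self.dirty)}
        self.dirty.clear()
        return changes


disk = TrackedDisk(block_count=8)
disk.write(2, b"data")
disk.write(5, b"more")
disk.write(2, b"DATA")  # the same block written twice is backed up once

first_cycle = disk.take_changes()
second_cycle = disk.take_changes()
print(first_cycle)   # {2: b'DATA', 5: b'more'}
print(second_cycle)  # {}
```

Only two of the eight blocks are collected in the first cycle, and a cycle with no intervening writes collects nothing. This is why such backups can run every few minutes.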
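The dependency on older blocks can be illustrated with a short sketch. The data layout is an assumption made for the example: each recovery point is a dictionary mapping a block number to its contents, and a corrupt block is represented by None.

```python
def restore(chain, upto):
    """Rebuild the disk as of recovery point `upto`.

    chain[0] is the initial full backup; later entries hold only the
    blocks that changed in that backup cycle.
    """
    latest = {}
    for point in chain[:upto + 1]:
        latest.update(point)  # newer blocks replace older ones
    damaged = [i for i, block in sorted(latest.items()) if block is None]
    if damaged:
        raise ValueError(f"cannot restore: blocks {damaged} are corrupt")
    return b"".join(latest[i] for i in sorted(latest))


chain = [
    {0: b"AAAA", 1: b"BBBB", 2: b"CCCC"},  # initial full backup
    {1: b"bbbb"},                          # second cycle: block 1 changed
    {2: b"cccc"},                          # third cycle: block 2 changed
]
healthy = restore(chain, upto=2)
print(healthy)  # b'AAAAbbbbcccc'

# Corrupt a block that only exists in the oldest backup.
chain[0][0] = None
try:
    restore(chain, upto=2)
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```

Block 0 was never modified after the initial backup, so the newest recovery point still depends on the oldest copy of it. Damage to that one old block is enough to stop the most recent restore.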
Altaro avoids this problem with its Backup Health Monitor. The Backup Health Monitor regularly checks the backup repository for the existence of missing or corrupt storage blocks. If any problems are detected, then the affected blocks are automatically repaired (re-backed up) within the next backup cycle. This type of self-healing is one of the things that really sets Altaro apart from competing backup solutions.
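The general pattern of checking a repository and re-backing up damaged blocks can be sketched as follows. This is an illustration under stated assumptions, not Altaro's implementation: the manifest of expected SHA-256 digests, the dictionary-based repository, and the function names are all hypothetical.

```python
import hashlib


def digest(block):
    return hashlib.sha256(block).hexdigest()


def find_bad_blocks(manifest, repository):
    """Return the block numbers whose stored copy is missing or corrupt.

    manifest:   list of expected digests, indexed by block number
    repository: dict mapping digest -> stored block contents
    """
    bad = []
    for index, expected in enumerate(manifest):
        stored = repository.get(expected)
        if stored is None or digest(stored) != expected:
            bad.append(index)
    return bad


def repair(bad, source_blocks, manifest, repository):
    """Re-back up the affected blocks from the protected system."""
    for index in bad:
        fresh = source_blocks[index]
        repository[digest(fresh)] = fresh
        manifest[index] = digest(fresh)


source = [b"AAAA", b"BBBB", b"CCCC"]
manifest = [digest(b) for b in source]
repository = {digest(b): b for b in source}

repository[manifest[1]] = b"BxBB"  # simulate corruption of block 1
del repository[manifest[2]]        # simulate a missing block 2

bad_before = find_bad_blocks(manifest, repository)
repair(bad_before, source, manifest, repository)
bad_after = find_bad_blocks(manifest, repository)
print(bad_before)  # [1, 2]
print(bad_after)   # []
```

The repair step reads the affected blocks from the live system again rather than trying to reconstruct them, which mirrors the "re-backed up" behavior described above.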
Figure 2