Microsoft’s Active Directory technology enables system administrators to group large numbers of computers together inside security boundaries. The directory holds a potentially vast amount of information on users, computers, and all other types of objects. Any object with proper security rights can query the directory and retrieve information. This directory allows for single sign-on, easily controlled access to resources, software deployment, group policy management, and many other benefits over the older workgroup model. The directory’s layout, called the “schema”, can even be extended to hold other data types so that third-party software can use the repository as well.
Management of Active Directory requires a special type of server called a domain controller. This server is responsible for handling queries and keeping the directory up to date. Microsoft had domain controllers in earlier versions of Windows Server, but with the introduction of Active Directory in Windows 2000, domain controllers ceased working under the single-master/multiple-slave model and began operating under a multi-master model. This means that in a domain with multiple domain controllers, objects can be changed on any domain controller. Changes are periodically synchronized between domain controllers using Active Directory replication.
Identifying the Challenge
Most of the intricacies and details of the multi-master Active Directory operation are well beyond the scope of this article, but some explanation is necessary to understand the special challenges domain controllers pose for both backup and restore operations. Every domain controller is considered “authoritative”; that means it can make changes to the directory without gaining consent from the others or having a centralized domain controller handle update operations. To do this, each domain controller stores a local copy of the directory and controls updates to it. To keep track of changes, Active Directory assigns an Update Sequence Number (USN) to every attribute for every object that it is tracking. These aren’t easily viewable, but some USN-related attributes, such as uSNCreated and uSNChanged, can be seen on an object in ADSI Edit.
When a domain controller updates an object’s attributes, it also stamps its local copy of each changed attribute with a new USN, which is based on the domain controller’s own USN. When it replicates with another domain controller, each one uses USNs to determine whether it needs to update objects from the other. If an object attribute on a remote domain controller has a higher USN than it did during the last replication cycle, then the local domain controller knows it needs to retrieve that data. In the event of a collision, other mechanisms determine which attribute update is kept (usually, the most recent wins).
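The USN comparison described above can be sketched in a few lines of Python. This is a toy model under simplifying assumptions, not how Active Directory is implemented: the class name `DomainController`, its fields, and the `high_watermark` dictionary are hypothetical names chosen for illustration, replication runs in one direction only, and collision handling is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class DomainController:
    """Toy model of one domain controller (hypothetical, for illustration only)."""
    name: str
    usn: int = 0  # this DC's own highest USN
    # attribute name -> (value, USN stamped locally when the change was written)
    attributes: dict = field(default_factory=dict)
    # partner name -> highest partner USN seen at the last replication cycle
    high_watermark: dict = field(default_factory=dict)

    def write(self, attr, value):
        """Local change: advance this DC's USN and stamp the attribute with it."""
        self.usn += 1
        self.attributes[attr] = (value, self.usn)

    def pull_from(self, partner):
        """Retrieve only attributes whose USN on the partner rose since last time."""
        seen = self.high_watermark.get(partner.name, 0)
        pulled = []
        # Walk the partner's attributes in the order they were stamped.
        for attr, (value, usn) in sorted(partner.attributes.items(),
                                         key=lambda item: item[1][1]):
            if usn > seen:
                self.write(attr, value)  # stored under this DC's own USN
                pulled.append(attr)
        self.high_watermark[partner.name] = partner.usn
        return pulled


dc1 = DomainController("DC1")
dc2 = DomainController("DC2")
dc1.write("title", "Engineer")
dc1.write("phone", "555-0100")
first_cycle = dc2.pull_from(dc1)   # both attributes are new to DC2
second_cycle = dc2.pull_from(dc1)  # nothing changed, nothing retrieved
dc1.write("phone", "555-0199")
third_cycle = dc2.pull_from(dc1)   # only the changed attribute is retrieved
```

The point of the sketch is that DC2 never compares values. It only remembers how far into DC1's USN sequence it has already read, which is exactly why an unexpected jump backward in that sequence is so damaging.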
Challenges of Virtualization
With the multi-master model, all domain controllers are considered “authoritative”. With the exception of a handful of operations, no single domain controller is ever “in charge”. So, if problems occur in the directory, there isn’t a simple way for domain controllers to “work it out amongst themselves”. In most cases, there won’t be a problem, but virtualization adds some potential issues. When Active Directory was first introduced, no one was virtualizing anything. The biggest uncontrolled threat to Active Directory was a system crash. Microsoft built in a mechanism so that if a domain controller crashes, when it comes back up, it essentially reaches out to all the other domain controllers and says, “I was out for a bit, what did I miss?” Replication is not hindered. However, a virtualized domain controller can be paused or suspended for an indefinite length of time. When it comes back, the only thing it’s going to know is that its clock has changed. This has the potential to behave exactly like restoring an old copy of Active Directory into a functioning forest. This condition is discussed in the “Challenges of Domain Controller Restores” section.
Challenges of Virtualized Domain Controller Backup
Running a traditional-style backup that specifically triggers the VSS writer to operate on the System State of a domain controller ensures that Active Directory knows it’s been backed up and therefore the consistency of the Active Directory database is guaranteed. Backing up the virtual machine from the host level using a method that does not trigger the guest’s VSS writer does not result in a state that’s guaranteed to be accurate. More details are given in the “Challenges of Domain Controller Restores” section.
Challenges of Domain Controller Restores
There are multiple issues around restoring a domain controller, many of which can be problematic even in a non-virtualized environment. Most of them revolve around restoring an old copy of the directory database without properly preparing Active Directory. This happens when you use backup software that doesn’t properly trigger the VSS writer or when a paused/suspended virtualized domain controller is started. The serious part of the problem is that there’s no simple way to know that a problem has occurred. If you restore an old domain controller or bring one out of a paused state, it will most likely tell you that everything is fine. If you’re lucky, the other domain controllers will notice that something is amiss, and the newly restored domain controller will mark itself as being in a USN rollback state and stop participating in replication. That will generally require a rebuild of the domain controller, which is bad, but not as bad as the alternative. If the other domain controllers don’t realize anything happened, then objects may exist in the domain with different attributes that will never be properly reconciled, because the various domain controllers may not realize that they don’t agree with each other. A more serious possibility is that the revived domain controller might have active copies of objects that other domain controllers had deleted. These objects (called “lingering objects”) will be returned to active status and replicated to the other domain controllers.
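To make the failure mode concrete, here is a small Python sketch of why an unannounced restore loses changes silently. It is a simplified model, not Active Directory's real algorithm. The function names, the `"inv-A"`/`"inv-B"` identifiers, and the USN values are all made up for illustration. The one mechanism it borrows from the real product, and which this article does not otherwise cover, is that a properly restored domain controller presents a new invocation ID to its partners.

```python
def changes_to_pull(source_usns, watermarks, invocation_id):
    """USNs a partner would request: only those above its stored watermark."""
    seen = watermarks.get(invocation_id, 0)
    return sorted(usn for usn in source_usns if usn > seen)


def looks_like_rollback(source_highest_usn, watermarks, invocation_id):
    """True when a known source reports a lower USN than the partner already saw."""
    return source_highest_usn < watermarks.get(invocation_id, 0)


# The partner has already replicated DC1 (hypothetical ID "inv-A") up to USN 100.
partner_watermarks = {"inv-A": 100}

# DC1 is put back to a copy taken at USN 50, then makes ten new changes (51-60).
restored_usns = range(1, 61)

# Unannounced restore: DC1 keeps its old identity, so the partner asks only for
# USNs above 100 and the ten new changes are never requested.
silent = changes_to_pull(restored_usns, partner_watermarks, "inv-A")

# The "lucky" case: the partner notices the source's USN went backward.
detected = looks_like_rollback(60, partner_watermarks, "inv-A")

# Announced restore: DC1 presents a new ID, the partner has no watermark for it,
# and in this toy model everything is offered again.
proper = changes_to_pull(restored_usns, partner_watermarks, "inv-B")
```

In this model `silent` is empty even though DC1 holds ten changes its partner has never seen, which is the divergence described above. The detection check only fires here because DC1's highest USN (60) is still below the partner's watermark (100); once a rolled-back domain controller has made enough changes to pass the old watermark, this simple comparison no longer catches it.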
How to Approach Virtualized Domain Controller Backup and Restore
The most important thing is to select a backup application that is Active Directory-aware. Not all programs are marketed this way; if your application is VSS-aware and can specifically back up the System State of a Windows machine, it should work. Because this article specifically has Hyper-V guest machines in mind, the application must also be able to work with Hyper-V’s VSS writer so that it can communicate with the guest’s VSS writer. Also, in the virtual machine’s property dialog, on the Integration Services tab, the “Backup” integration component must be enabled. If any of these requirements is not met, then Hyper-V will pause the domain controller while it is being backed up. If that domain controller is ever restored, you run the risk of being in a USN rollback condition.
As mentioned earlier, restores are the dangerous part and something you want to avoid if at all possible. If you are recovering from a catastrophe and any domain controller survived with a suitably recent copy of the directory, the preferred recovery method is to have it seize all FSMO roles, make it a global catalog server if it wasn’t already, and delete the missing domain controllers from Active Directory Sites and Services. After that is complete, just deploy new domain controllers. If no domain controllers survived at all, then the recommended approach would be to restore only one domain controller and proceed as though it was the sole survivor mentioned earlier. Even if its copy of the database has some problems, it won’t cause a USN rollback if it’s the only domain controller still standing and it won’t be as problematic as potentially introducing multiple inconsistent domain controllers.
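The decision described above can be written down as a short Python sketch. It only encodes the order of the steps in this article and performs no actual recovery. The function name `recovery_plan` and its two parameters are hypothetical, and the caller is assumed to list the best candidate first in each list.

```python
def recovery_plan(surviving_dcs, restorable_dcs):
    """Return the ordered recovery steps as plain-text instructions.

    surviving_dcs: domain controllers that survived with a suitably recent
        copy of the directory, best candidate first.
    restorable_dcs: domain controllers that have a usable backup, best
        candidate first. Only consulted when nothing survived.
    """
    if surviving_dcs:
        anchor = surviving_dcs[0]
        steps = []
    elif restorable_dcs:
        # Restore exactly one DC, then treat it as the sole survivor.
        anchor = restorable_dcs[0]
        steps = [f"restore only {anchor} from backup"]
    else:
        raise ValueError("no surviving domain controller and no backup to restore")

    steps += [
        f"seize all FSMO roles on {anchor}",
        f"make {anchor} a global catalog server if it is not one already",
        "delete the missing domain controllers from Active Directory Sites and Services",
        "deploy new domain controllers",
    ]
    return steps


for step in recovery_plan([], ["DC1", "DC3"]):
    print(step)
```

Note that the second backup in the example is deliberately ignored: restoring more than one domain controller is what risks introducing multiple inconsistent copies of the directory.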
Of course, sometimes you just have to restore a domain controller into an existing forest amid other functioning domain controllers. In that case, you need to ensure that your software is Active Directory-aware (hopefully you did this before needing to restore). The only thing required to bring back the directory itself is a restore of System State data. If you are going to be restoring a virtualized domain controller using some sort of “bare metal recovery” method, then you need to double-check with your software vendor that it is designed specifically to handle virtualized domain controllers in this scenario. If a domain controller is not notified that a restore took place, then it will operate as though it is current and this will always trigger a USN rollback condition.
How to Verify Your Software is Active Directory-Aware
After a backup, check the domain controller’s event logs. Look under “Applications and Services” > “Directory Services” for Event ID 1917. If you find it, you can be certain that your software is properly triggering the VSS writer. However, some applications still take a consistent backup of Active Directory without generating this event, as long as they trigger the VSS writer. Ensure that your application is set to at least take a System State backup and use the VSS writer, then look in the “Application” log for several Event 2001 and Event 2003 entries generated by ESENT. If there are no associated errors, then your directory is being safely backed up.
After you restore a domain controller into a domain with other active domain controllers, immediately check its “Directory Services” event log for ID 1109. If it is not there, disconnect that domain controller from the network as soon as possible or it could lead to a USN rollback condition.
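If you would rather script these checks than click through Event Viewer, the following Python sketch builds command lines for the built-in `wevtutil` tool. Treat it as a starting point under assumptions: the helper names are hypothetical, and the log name `Directory Service` is assumed here to be the underlying name of the log this article calls “Directory Services”, so confirm it on your own domain controller (for example with `wevtutil el`) before relying on the result.

```python
import subprocess


def build_event_query(log_name, event_ids, provider=None, max_events=5):
    """Build a wevtutil command that lists the newest matching events as text."""
    id_filter = " or ".join(f"EventID={event_id}" for event_id in event_ids)
    if provider:
        xpath = f"*[System[Provider[@Name='{provider}'] and ({id_filter})]]"
    else:
        xpath = f"*[System[({id_filter})]]"
    return [
        "wevtutil", "qe", log_name,
        f"/q:{xpath}",       # XPath filter on event ID (and source, if given)
        "/f:text",           # human-readable output instead of XML
        f"/c:{max_events}",  # cap the number of events returned
        "/rd:true",          # newest events first
    ]


def run_query(args):
    """Run the command on the domain controller and return what it prints."""
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout


# After a backup: was the Active Directory VSS writer triggered?
backup_check = build_event_query("Directory Service", [1917])
# After a backup: the ESENT entries in the Application log.
esent_check = build_event_query("Application", [2001, 2003], provider="ESENT")
# After a restore: was the domain controller told that it was restored?
restore_check = build_event_query("Directory Service", [1109])
```

Calling `run_query(restore_check)` on a freshly restored domain controller and getting empty output back corresponds to the “disconnect it from the network” case above.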
I Only Have One Domain Controller. How Does this Affect Me?
If you only have one domain controller, then there’s not as much to worry about. It could crash without a commit and lose some data, but that is a potential problem regardless of how many domain controllers you have. Since a lone domain controller doesn’t participate in replication, a USN rollback state will not corrupt the directory or force you to demote and promote any domain controllers. It just means that later changes are lost. USN rollbacks are very rare in practice, especially if administrators are sufficiently educated on the subject, so the benefit of never having a USN rollback is completely eclipsed by the benefits of having multiple domain controllers. Only the smallest domains should ever operate with only one domain controller.
Can’t I Just Never Back Up and Instead Rely on Multiple Domain Controllers?
If your domain controllers are split across multiple physically separate sites and replication is reliable and consistent, there is a temptation to just not back them up. You should be protected from most environmental and physical disasters. However, without a backup, you aren’t protected from human disasters or malice. Always take backups.
I Can Rebuild the Domain with One Restore, so Should I Only Back Up One Domain Controller?
You should back up several domain controllers, the more the better. Even though you only need to restore one, you cannot predict the circumstances that will require a restore or if any given backup will be successful and remain uncorrupted until needed. The more backups you have, the more likely you are to have the backup that you want. It is always better to have too many backups than not enough.
The source for much of the research in this article is a piece by Sander Berkouwer at “The things that are better left unspoken”. You can read the original article as well and download a PDF with more detailed explanation of USNs and how they are impacted by backups and restores.