Through version 2012 R2, all Hyper-V virtual machines are defined by an XML file. It conforms to the XML specification and can be viewed in any text editor. Microsoft has never supported editing this file, but being able to read it has its uses. For instance, if you find an orphaned virtual machine file set, you just open its XML and navigate to Configuration\Properties\Name to find the name of the virtual machine that the XML file describes. You could also create a custom XML reader to quickly poll one or more virtual machines for other information that you find relevant. Out of all the files that belong to a virtual machine, the XML file is the most important. The purpose of this article is to investigate the importance of these XML files and how Hyper-V utilizes them.
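As a rough illustration of such a custom reader, here is a minimal Python sketch that pulls the virtual machine name out of a definition file. It assumes the name lives at configuration > properties > name, as described above, and it matches element names without regard to case because the exact casing is an assumption. The function names are hypothetical. The file is only ever read, never written.

```python
import xml.etree.ElementTree as ET


def _child(element, tag):
    """Return the first direct child whose tag matches, ignoring case."""
    for child in element:
        if child.tag.lower() == tag.lower():
            return child
    return None


def vm_name_from_root(root):
    """Follow configuration -> properties -> name and return its text.

    The element layout is an assumption taken from the article text.
    Returns None when any step of the path is missing.
    """
    if root.tag.lower() != "configuration":
        return None
    properties = _child(root, "properties")
    if properties is None:
        return None
    name = _child(properties, "name")
    return name.text if name is not None else None


def vm_name_from_file(xml_path):
    """Parse a VM definition file and return the VM name, or None."""
    # ET.parse opens the file for reading only; nothing is modified.
    return vm_name_from_root(ET.parse(xml_path).getroot())
```

Calling vm_name_from_file on a copy of an orphaned GUID.xml file should return the name of the virtual machine it describes, or None if the file does not follow the assumed layout.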
Changes in Client Hyper-V in Windows 10 and Hyper-V Server 2016
The content of this article applies only through version 2012 R2. Microsoft has elected to replace their open, comprehensible system with a new model that utilizes a larger, undocumented file format that causes problems for text-based file readers. I have begun investigating the differences and hope to return to this subject for versions 2016+ once I have a better understanding. The only thing that I’ve learned so far is that the hoopla around the format being “binary” is mostly meaningless. The new format contains the same data as the old; it just takes a bit more effort to look at it:
As you can see, the same information in the original file type on the left is more or less human-readable in a hex dump of the new file format on the right. The format doesn’t seem like it’s overly complicated across the board. For example, I saw fairly quickly that it appears that each device leads in with four bytes of sequential numbering. While I didn’t line up the highlighting perfectly, look at offset 70B000 (right at the beginning of the screen capture) and you’ll see the byte pattern 00 00 00 07. The next device appears to start at offset 722005 (very near the end of the highlighted area) and has the byte pattern 00 00 00 08. I don’t really see how this format is “more efficient” since it is larger, has nearly the same layout, and adds a lot of empty padding at the beginning, but it is the format we’ll have to get used to. I’m certain that someone will come out with a parser in fairly short order, if it hasn’t been done already.
The other change in 2016 is the way that Hyper-V keeps tabs on the virtual machines via the VMCX files. I don’t understand that at all, yet. From this point forward, I doubt that much, if any, of the content of this article will apply to 2016 or later.
How Hyper-V Works with Virtual Machine XML Files
By default, all virtual machines are created in C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines. If you do not change the default, then the XML file that represents each virtual machine that is created on the host is placed in this folder. This includes not only virtual machines created by Hyper-V Manager, Failover Cluster Manager, and New-VM, but also applies to guests that are introduced to the host by Quick Migration or any kind of Live Migration.
Here’s a quick, simple rule: if an XML file exists in this folder that contains a GUID in the file name and can be parsed as a virtual machine definition, Hyper-V will automatically treat it as a valid virtual machine.
Any virtual machine defined by a file that fits the above rule will appear in Hyper-V Manager and Get-VM. No other file is required to make this happen — not VHDs, not BINs, not VSVs, nothing. If the contents of the file are incorrect, then the virtual machine will be inoperable, but it will still appear. The XML file is the centerpiece for all the rest of the components. From the XML file, Hyper-V will be able to find its way to all of the other files that make up a virtual machine. The VHDs and some of the other files will be specifically indicated in the XML while others will be placed in locations that are relative to the XML file.
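To make the rule concrete, here is a small Python sketch that applies the same two tests to a folder: the file name must be a GUID followed by .xml, and the contents must parse. It only checks that the XML is well-formed, which is a weaker test than whatever Hyper-V itself applies, and the function name and regular expression are illustrative assumptions rather than anything Hyper-V exposes.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# A GUID in its usual 8-4-4-4-12 hexadecimal form, followed by ".xml".
GUID_XML = re.compile(
    r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}\.xml$", re.IGNORECASE
)


def find_vm_definitions(folder):
    """Return the GUIDs of files in `folder` that look like VM definitions.

    A file qualifies when its name matches GUID.xml and its contents are
    well-formed XML. Symbolic links are followed by the operating system,
    so a link to a file on other storage counts like a local file.
    """
    found = []
    for entry in sorted(Path(folder).iterdir()):
        if not GUID_XML.match(entry.name):
            continue
        try:
            ET.parse(entry)  # read-only parse; raises on malformed XML
        except (ET.ParseError, OSError):
            continue  # unparseable or unreadable: skipped in this sketch
        found.append(entry.stem.upper())
    return found
```

Pointing find_vm_definitions at a copy of the Virtual Machines folder gives a list that can be compared against what Hyper-V Manager or Get-VM reports.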
XML Files in Other Locations
Of course, most people do not leave the virtual machine placement default unchanged, as it’s generally accepted that you don’t want your virtual machines to be in the same place as the hypervisor files for a variety of reasons. However, the C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines folder retains its significance. Here is a view of mine on my primary host:
Every single one of the virtual machines currently running on this host is represented here, even though some are on SMB 3 storage and some are in Cluster Shared Volumes.
This may be a bit confusing at this point in the explanation, but here’s another solid rule: only XML files that are present in C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines can be detected by Hyper-V as virtual machines.
You might want to tell me that all of your virtual machines’ XML files are elsewhere and that they are working just fine. That’s also true. However, they are also present in C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines. If you look at the Type column of my screenshot (and your own folder), you’ll see that these are all of type .symlink, not of type XML Document. Hyper-V looks in this folder, and only in this folder, for XML files. Because these .symlink objects exist in that location, it retrieves exactly what it’s looking for. Hyper-V does not really know that the XML is elsewhere. That might not be technically true under the covers, but conceptually, this is how it works.
To grasp how Hyper-V functions when the XML files are not physically present in C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines, you must understand the concept of the symbolic link. Many people think of these as shortcut files, but they are not. A shortcut file (identifiable by the .lnk extension) is almost completely different.
This is how the traditional shortcut file functions:
- The shortcut file (.lnk) is activated.
- The operating system retrieves the location of the .lnk from the volume’s file allocation table.
- The operating system parses the contents of the .lnk file to find information on the target file.
- The operating system retrieves the location of the target file from the indicated volume’s file allocation table.
- The operating system opens the target file.
A visual representation of accessing a file on a D: volume via a shortcut file on the C: volume:
Symbolic links have a similar purpose, but quite different functionality. As you can see from the screenshot of my directory listing, they appear as the actual target file, not as a separate shortcut file. The operating system handles all of the functionality much more smoothly:
The first major difference between a symbolic link and a shortcut is that a symbolic link has the exact same file name as its target. This makes it far easier for an automated system, like Hyper-V, to use a single search to find what it is looking for. In the case of Hyper-V, the only thing it does is scan C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines for files that match the pattern GUID.xml. It’s up to the operating system to deliver the actual files that it asks for, even if they are in other locations. The second difference is that a symbolic link exists entirely in the file allocation table. Where the process of translating a shortcut takes five steps, following a symbolic link requires only one extra hop over directly accessing a file.
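The behavior is easy to see outside of Hyper-V. The following Python sketch builds two throwaway folders, one standing in for remote storage and one standing in for the folder that Hyper-V scans, and places a symbolic link in the second that points at a file in the first. The folder names, the GUID, and the function name are made up for the demonstration. On Windows, creating the link requires the right to create symbolic links (typically an elevated prompt).

```python
import os
import tempfile


def demo_symlink():
    """Show that a symbolic link is listed and opened like its target."""
    with tempfile.TemporaryDirectory() as base:
        storage = os.path.join(base, "storage")    # stand-in for SMB/CSV storage
        registry = os.path.join(base, "registry")  # stand-in for the scanned folder
        os.mkdir(storage)
        os.mkdir(registry)

        file_name = "01234567-89AB-CDEF-0123-456789ABCDEF.xml"
        target = os.path.join(storage, file_name)
        with open(target, "w", encoding="utf-8") as f:
            f.write("<configuration/>")

        # The link carries the same file name as its target.
        link = os.path.join(registry, file_name)
        os.symlink(target, link)

        # Opening the link needs no special handling; the operating system
        # resolves it and hands back the target's contents.
        with open(link, encoding="utf-8") as f:
            content = f.read()

        return {
            "listed_name": os.listdir(registry)[0],
            "is_link": os.path.islink(link),
            "content": content,
            "same_file": os.path.realpath(link) == os.path.realpath(target),
        }
```

A program that simply lists the registry folder and opens what it finds never has to know the file lives elsewhere, which is conceptually how Hyper-V gets away with scanning a single folder.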
Broken or Lost XML Files
Prior to 2012, Hyper-V’s XML parser was very fragile. Simply having a well-formed XML file wasn’t enough. It was very easy to make very minor modifications to the XML that would not have broken a traditional parser but would completely throw Hyper-V for a loop. Beginning in 2012, the XML parser became much more resilient to the point that we should all be sad to see it go. Of course, it’s neither perfect nor omniscient. However, it does follow rules that are easily understood.
Rule: If a VM does not have a GUID.xml file in C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines, then it does not exist.
Hyper-V does not look anywhere else for these files. They could be deleted; antivirus software has been known to do this in the past, although I believe most now understand to leave these files alone. They also might not be created when they should. I’ve had a few instances where a Live Migration had some sort of problem and the XML file wound up in a sort of limbo where it wasn’t on either system. I’m not entirely certain how it gets bound up, but usually, restarting the Hyper-V Virtual Machine Management (vmms.exe) service on one or both hosts sorts this particular issue out.
Rule: If Hyper-V cannot parse the XML file, the virtual machine might as well not exist.
If the XML file is present but damaged to the point that Hyper-V’s parser can’t untangle it, the virtual machine won’t appear in Hyper-V Manager or any other Hyper-V tool. However, you’ll find error event 16030 recorded in the Hyper-V-VMMS log: “Cannot load a virtual machine configuration because it is corrupt. (Virtual machine ID GUID) Delete the virtual machine configuration file (.XML file) and recreate the virtual machine”. If this happens to an XML file that is represented by a symbolic link, be aware that Hyper-V will delete the symbolic link. If you are able to repair the target XML file, you can use Hyper-V’s import feature to register the fixed XML in place and the virtual machine will be ready to use.
Rule: If Hyper-V’s local folder contains a symbolic link to a location that does not exist, the virtual machine will be set to a “Critical” state.
Most of us that use Hyper-V with remote storage have encountered this issue at least once. If the target location for a virtual machine’s symbolic link goes dead, Hyper-V will retain a memory of its name and state but will lose everything else.
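The three rules can be summed up in a few lines of Python. This is only a sketch of the rules as stated in this article, not a description of how VMMS is implemented. The function name and the returned labels are invented for illustration, and the parse check only tests for well-formed XML.

```python
import os
import xml.etree.ElementTree as ET


def classify_vm_entry(path):
    """Classify one GUID.xml entry according to the three rules.

    Returns one of:
      "missing"     - no file or link at this path (the VM does not exist)
      "critical"    - a symbolic link whose target cannot be reached
      "unparseable" - the file is there but is not well-formed XML
      "ok"          - the file or link target parses
    """
    # lexists is True even for a link whose target is gone.
    if not os.path.lexists(path):
        return "missing"
    # exists follows the link, so it is False for a dangling link.
    if os.path.islink(path) and not os.path.exists(path):
        return "critical"
    try:
        ET.parse(path)  # read-only; never modifies the file
    except (ET.ParseError, OSError):
        return "unparseable"
    return "ok"
```

Running this against each entry in a copy of the Virtual Machines folder is one way to spot dangling links before trying any of the recovery steps.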
Recovering from Critical States
In the previous screen shot, the recovery method was very simple: I took the CSV out of maintenance mode and waited for VMMS to catch on that something had changed. That worked out well for me because I already knew how Hyper-V would respond to a CSV going into maintenance mode and how it would act when maintenance mode ended. What you need to do to fix a virtual machine in a critical state depends on how it got there in the first place.
- Temporarily Unreachable Storage
If a VM is in a critical state because the back-end storage is offline for a short time, you have two options. The first is to simply wait. VMMS will periodically check on VMs in a critical state and if the storage becomes reachable, it will take appropriate steps. I prefer this option because it requires no effort on the part of the administrator. If the VM was previously running, it will be started automatically. This is what happens to me when I bring my entire test cluster up from an off state and don’t give my storage host sufficient time to start before turning on my Hyper-V hosts. The second option is to just start interacting with the VM(s). That could mean manually turning them on, refreshing the Hyper-V Manager screen, or even restarting the VMMS service. Changing their state in Failover Cluster Manager often sorts a lot of issues out with VMMS.
- Collision Due to Clustering
Every once in a while, clustering causes an issue (or shows symptoms due to some other underlying issue). A virtual machine might continue to appear on one node, in a critical state, when it’s not really there. First, check for any underlying issue that you can address. A common cause is group policies that affect user rights assignments, for instance, a policy that controls the “Create symbolic links” right. If you’re reasonably certain everything is OK, attempt to Quick Migrate or Live Migrate the actual instance of the problematic machine to any other cluster node. This action should cause all the nodes to sync up and remove any invalid registrations.
- Permanently Unreachable Storage
If the storage has crashed and you’re replacing it and restoring the data at the file level (as opposed to performing a complete virtual machine restore), how you proceed depends on whether the storage location has changed. If the name of the target storage volume is the same, then the wait method, interaction method, or Quick/Live Migration method mentioned in the prior two bullet points should sort you out. If one technique doesn’t work, try another. If you changed something about the storage volume, then your best bet is to delete the item in a Critical state and import the virtual machine, choosing to register it in place. The nice thing is, if Failover Clustering had marked a virtual machine as one of its own, re-creating the virtual machine with the same GUID via import will allow Failover Clustering to automatically recognize the virtual machine.
Leave the XML Alone
The point of this article was to demystify the XML file. It’s not to give you a false sense of security. I’ve taken you through the usage of the XML file and how to keep it in Hyper-V’s good graces, but I didn’t talk about the contents. That’s because it’s really not a place you should ever need to be. Even though I don’t like that the “binary” format is replacing the friendly XML format, I do like that it will discourage tinkering. If you absolutely must go into the file, do so in a read-only fashion.