How to terminate unresponsive virtual machines in vSphere

I’ve put this post together on the spur of the moment after coming across an issue I had not seen in a while: an unresponsive VM in vSphere. Yep, it does happen from time to time. By unresponsive, I mean a guest OS that categorically ignores any remote call made to it and a VM that is completely oblivious to anything triggered from a vSphere client.

Thankfully, there are a number of methods you can use to kill the offending VM. The first port of call is PowerCLI. If this fails, you still have a number of options left at your disposal and they all involve running commands directly on an ESXi host.

The alternative to this, of course, is to reboot the ESXi host, assuming you are able to migrate the other VMs elsewhere first. Hopefully you will never have to resort to this act of desperation.

Let’s start off with the easy one.

The PowerCLI Way


Stop-VM is the cmdlet used to kill a VM. Its normal job is to power off a VM, but adding the -Kill switch terminates the VM’s corresponding processes on ESXi, effectively killing it. In the following example, I’ll be terminating a VM called Centos 7 Web Server.

Stop-VM -VM "Centos 7 Web Server" -Kill -Confirm:$false

Killing a VM with PowerCLI

The ESXi Way


There are three command line methods you can use on ESXi to terminate unresponsive VMs: esxcli, vim-cmd and plain old Unix-style commands, though technically ESXi doesn’t do Unix. You first need to enable SSH and/or the ESXi Shell on the host, after which you can use something like PuTTY or, if you prefer, work directly from the console. The esxtop tool presents itself as a fourth option.

Using esxcli

Esxcli is a command line interface (CLI) framework that gives you access to a number of namespaces such as vm, vsan, network and software, allowing you to execute tasks, change settings and so on.

So, if I had to kill an unresponsive VM using esxcli, I would use the vm namespace pretty much like this:

1) Get a list of the VMs hosted on ESXi.

esxcli vm process list

Retrieving the World ID of a VM

2) Copy the World ID value and run:

esxcli vm process kill --type=soft --world-id=796791

The command does not provide any feedback unless it fails to locate the VM or you specify an invalid parameter.

Killing a VM using its World ID

The type parameter takes one of three values:

  • soft – Allows the VMX process to shut down gracefully, similar to kill -SIGTERM.
  • hard – Immediately kills the VMX process, similar to kill -9 or kill -SIGKILL.
  • force – Stops the VMX process when both the soft and hard options fail. Use it as a last resort.
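If you would rather script the escalation than type each command by hand, it could look roughly like the sketch below. Treat it as a starting point and not a tested tool: world_id_for and kill_vm_escalating are names I made up, WAIT_SECS is an assumed grace period, and the parser assumes esxcli vm process list prints each VM as an unindented name line followed by indented "Key: Value" lines.

```sh
#!/bin/sh
# Sketch only: world_id_for and kill_vm_escalating are hypothetical helper names.

# Print the World ID of the running VM whose display name is $1 (nothing if absent).
world_id_for() {
    esxcli vm process list | awk -v name="$1" '
        /^[^ \t]/ { current = $0 }                     # unindented line starts a VM block
        /^[ \t]+World ID:/ && current == name && !found {
            print $NF                                  # last field is the numeric ID
            found = 1
        }'
}

# Try soft, then hard, then force, stopping as soon as the VM is gone.
kill_vm_escalating() {
    name=$1
    for type in soft hard force; do
        wid=$(world_id_for "$name")
        if [ -z "$wid" ]; then
            return 0                                   # no world left, the VM is gone
        fi
        echo "Trying --type=$type on world $wid"
        esxcli vm process kill --type="$type" --world-id="$wid" ||
            echo "kill --type=$type reported an error" >&2
        sleep "${WAIT_SECS:-10}"                       # assumed grace period
    done
    [ -z "$(world_id_for "$name")" ]                   # succeed only if the VM is gone
}
```

On the host you would call kill_vm_escalating "Centos 7 Web Server". Looking the World ID up again before every attempt means the function stops escalating the moment the VM drops off the process list.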

 

Using vim-cmd

Vim-cmd is another command line utility, very similar to esxcli but with a somewhat different syntax. It too can be used to manage VMs as well as other resources. Keeping in line with the subject matter, here’s how vim-cmd is used to terminate a VM.

1. List all the VMs on the currently logged on host and note down the vmid (1st column) of the unresponsive VM.

vim-cmd vmsvc/getallvms

2. Get the power state of the VM using its vmid. This step is optional since you already know that the VM is toast.

vim-cmd vmsvc/power.getstate 36

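3. Try the power.shutdown option first. It asks the guest OS to shut down through VMware Tools, so it may well fail on a guest that is truly hung.

vim-cmd vmsvc/power.shutdown 36

4. If, for whatever reason, the VM fails to shut down, try using power.off instead.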

vim-cmd vmsvc/power.off 36

The following screenshot provides an example of how the above commands are used to kill a VM.

Using vim-cmd to terminate a VM
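The four vim-cmd steps can also be chained in a small script. What follows is a rough sketch under a few assumptions of mine: the helper names are invented, VIM_CMD is a hypothetical override for dry runs, each getallvms row is taken to start with the vmid followed by the name and then a [datastore] path, power.getstate is taken to print "Powered on" for a running VM, and WAIT_SECS is an assumed grace period.

```sh
#!/bin/sh
# Sketch only: vmid_for, is_powered_on and power_off_vm are hypothetical helpers.
VIM_CMD=${VIM_CMD:-vim-cmd}    # hypothetical override, handy for dry runs

# Print the vmid of the VM named $1.
# Assumes each getallvms row is: <vmid> <name> [datastore] path ...
vmid_for() {
    "$VIM_CMD" vmsvc/getallvms | awk -v name="$1" '
        /^[0-9]+[ \t]/ {
            id = $1
            row = $0
            sub(/^[0-9]+[ \t]+/, "", row)   # strip the Vmid column
            sub(/[ \t]+\[.*$/, "", row)     # strip the [datastore] column onwards
            if (row == name) print id
        }'
}

# Succeed if power.getstate reports the VM as powered on.
is_powered_on() {
    "$VIM_CMD" vmsvc/power.getstate "$1" | grep -q "Powered on"
}

# Try a guest shutdown first and fall back to power.off if the VM stays up.
power_off_vm() {
    vmid=$(vmid_for "$1")
    if [ -z "$vmid" ]; then
        echo "No VM named '$1' found on this host" >&2
        return 1
    fi
    is_powered_on "$vmid" || return 0              # already off, nothing to do
    "$VIM_CMD" vmsvc/power.shutdown "$vmid" ||
        echo "power.shutdown failed, will fall back to power.off" >&2
    sleep "${WAIT_SECS:-30}"                       # assumed grace period
    is_powered_on "$vmid" || return 0
    "$VIM_CMD" vmsvc/power.off "$vmid"
}
```

Running power_off_vm "Centos 7 Web Server" on the host would then walk through the same sequence shown above. Matching on the full name keeps it from picking up a VM whose name merely starts the same way.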

 

Using ESXTOP

ESXTOP is a great utility which provides information on how ESXi is using system resources. It can be used for troubleshooting, performance statistics and more. Keep this chart handy as it gives you a quick overview on how the tool may be used.

We’re going to use it to kill a hypothetically unresponsive VM. Here’s how.

  • Press Shift+v to change the view to virtual machines.

Virtual machine view in ESXTOP

 

  • Press f to display the list of fields, then press c to add the Leader World ID (LWID) column to the view. Press any key to return to the VM view.

Adding the LWID column to the virtual machine view

  • Locate the unresponsive VM under the Name column and note down its LWID.
  • Press k and type in the LWID value at the World to kill (WID) prompt. Press Enter.

Using the LWID value to terminate a VM

Just to be doubly sure, wait for 30 seconds and then check that the VM is no longer listed. If it still is, try again; failing that, you will probably need to reboot the host.
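The wait-and-check step is easy to automate from a second shell session. Here is a minimal sketch, assuming esxcli vm process list prints a "Display Name:" line for every running VM; vm_is_listed, wait_until_gone and the POLL_SECS interval are my own inventions.

```sh
#!/bin/sh
# Sketch only: vm_is_listed and wait_until_gone are hypothetical helper names.

# Succeed if some running VM reports "Display Name: $1".
vm_is_listed() {
    esxcli vm process list | awk -v name="$1" '
        /^[ \t]*Display Name:/ {
            sub(/^[ \t]*Display Name:[ \t]*/, "")   # keep only the name itself
            if ($0 == name) found = 1
        }
        END { exit !found }'
}

# Poll until the VM is no longer listed. $2 is the timeout in seconds
# (default 30, matching the wait suggested above); POLL_SECS is an assumed interval.
wait_until_gone() {
    name=$1
    remaining=${2:-30}
    step=${POLL_SECS:-5}
    while vm_is_listed "$name"; do
        [ "$remaining" -gt 0 ] || return 1          # still listed after the timeout
        sleep "$step"
        remaining=$((remaining - step))
    done
    return 0
}
```

For example, wait_until_gone "Centos 7 Web Server" 30 || echo "Still running, try again" gives you a clear signal on whether the kill took effect.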

 

The brute force is with you

This is the crudest method used to terminate a VM I can think of. It’s actually documented on the VMware site albeit slightly differently. A VM is invariably a series of processes running on ESXi. Using the ps command below, I can list the processes associated, say, with a VM called Centos 7 Web Server.

ps | grep "Centos 7 Web Server"

Listing the processes associated with a VM

If you look closely, you’ll see that the value in the second column is identical for all the processes listed. This is the VMX Cartel ID which, although different from the World ID, can still be targeted to terminate a VM as follows:

kill 797300

If the processes persist in running, try kill -9 <process id>.

kill -9 797300

Disclaimer: Use this at your own peril. It seems to work on any version of ESXi; I’ve actually used it on ESXi 6.5 while working on this post without any apparent side effects. My main concern with this method is that you risk bringing ESXi to its knees if you inadvertently kill the wrong process.
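One way to lower the odds of killing the wrong process is to let a script refuse to act unless the VM name maps to exactly one cartel ID. The sketch below is only that, a sketch: cartel_ids_for and kill_vm_by_cartel are names I made up, WAIT_SECS is an assumed grace period, and it relies on the cartel ID sitting in the second column of the ps output as described above.

```sh
#!/bin/sh
# Sketch only: cartel_ids_for and kill_vm_by_cartel are hypothetical helper names.

# Print the distinct second-column values of ps lines that mention the VM name $1.
cartel_ids_for() {
    ps | grep -F -- "$1" | grep -v grep | awk '{ print $2 }' | sort -u
}

kill_vm_by_cartel() {
    ids=$(cartel_ids_for "$1")
    count=$(printf '%s\n' "$ids" | awk 'NF { n++ } END { print n + 0 }')
    if [ "$count" -ne 1 ]; then
        # Zero or several IDs means the name is wrong or ambiguous: bail out.
        echo "Expected exactly one cartel ID for '$1', found $count; nothing killed" >&2
        return 1
    fi
    kill "$ids"
    sleep "${WAIT_SECS:-10}"                   # assumed grace period
    if [ -n "$(cartel_ids_for "$1")" ]; then
        kill -9 "$ids"                         # processes persist, so force it
    fi
}
```

Calling kill_vm_by_cartel "Centos 7 Web Server" sends a plain kill first and only resorts to kill -9 if the VM’s processes are still listed afterwards. A partial name such as "Centos 7" that matches more than one VM makes it stop before touching anything.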

 

Conclusion

There are times when you’ll end up with unresponsive VMs. When that happens, try PowerCLI or any of the ESXi command line tools covered above to kill the buggers. When everything else fails, your best option is to migrate the well-behaved VMs to another ESXi host and reboot the one hosting the unresponsive VMs. There is a chance that you’ll need to power off the host if the frozen VMs are preventing you from putting it into maintenance mode. If you still have issues after this, I suggest giving VMware support a call!

7 thoughts on "How to terminate unresponsive virtual machines in vSphere"

  • Darren says:

    Thanks for the post – unfortunately none of the above processes kill my running VM. When I run kill -9, I get “No such process”, yet I can confirm the VM is on this host and is running. ESXTOP also confirms this. I cannot open a console to the machine, yet it is running. Please assist to power off this VM. Thanks
