Gracefully turn off an ESXi host and VMs
It’s been a long time… well, here we go.
We have a virtualized server in the office, running the free VMware ESXi 4.1. The machine is also connected to a UPS.
Unfortunately, the office power lines have already failed us, so we wanted to configure the server to gracefully shut down all the VMs and then the host itself.
These are the steps we took:
Install VMware Tools on all the VMs. Using the vSphere Client, that’s easy enough.
Create a new VM (once again, using the vSphere Client) with minimal disk space and RAM, just enough to install the UPS drivers. This VM’s only job is to listen to the UPS connected to the host. In the VM’s properties, make sure you add a serial port mapped directly to the physical serial port of the host.
Now comes the nice part: the idea is to have this VM connect to the host OS and run the appropriate commands to begin the shutdown process. SSH to the rescue… on that VM, just create a pair of SSH keys (ssh-keygen -t rsa, for example) WITHOUT A PASSPHRASE, and add the public one to root’s authorized_keys on the host.
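The key setup can be sketched roughly like this, run on the UPS VM. The host address and the key file name are placeholders, and the /.ssh/authorized_keys location is an assumption based on root’s home being / on this ESXi version (the same /.ssh directory gets packed up later in this post).

```shell
#!/bin/sh
# Sketch only: ESXI_HOST and KEY_FILE are placeholder values,
# not taken from the original setup.
ESXI_HOST="${ESXI_HOST:-192.0.2.10}"
KEY_FILE="${KEY_FILE:-$HOME/.ssh/esxi_shutdown}"

# Generate an RSA key pair with an empty passphrase (-N "").
make_key() {
    ssh-keygen -t rsa -N "" -f "$KEY_FILE"
}

# Append the public key to root's authorized_keys on the host.
# Assumption: root's home is / on the host, so the file is /.ssh/authorized_keys.
push_key() {
    cat "$KEY_FILE.pub" |
        ssh "root@$ESXI_HOST" 'mkdir -p /.ssh && cat >> /.ssh/authorized_keys'
}
```

Running make_key and then push_key should ask for the root password one last time; after that, the key alone should be enough.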
Done, you should now have access to the ESXi host without having to enter a password. This means you can automate shutting it down when the UPS starts running on battery, by running an arbitrary command through SSH. On that note, here is the script we used on the ESXi host (saved as /home/shutdownVMs.sh) to shut everything down (this came from a blog whose link I can’t remember. I’m truly sorry. The credit for this should not go to me!)
## get all the VM identifiers
VMID=$(/usr/bin/vim-cmd vmsvc/getallvms | grep -v Vmid | awk '{print $1}')

## loop through all the VMs
for i in $VMID
do
    ## get their state (turned on, off, whatever)
    STATE=$(/usr/bin/vim-cmd vmsvc/power.getstate $i | tail -1 | awk '{print $2}')
    ## if they are running, turn them off (only works correctly if
    ## vmware tools are installed on the VMs)
    if [ "$STATE" = "on" ]
    then
        /usr/bin/vim-cmd vmsvc/power.shutdown $i
    fi
done

## shutdown the host itself
sleep 30
/sbin/shutdown.sh
/sbin/poweroff
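The fixed 30 second pause in the script is a guess at how long the guests need. A possible variation, sketched here and not part of the script we actually used, is to poll the power state until every VM is off or a timeout runs out. The function names, the VIM_CMD override and the 120 second default are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: VIM_CMD can be overridden so the logic can be tried off-host;
# the 120 second default timeout is an arbitrary assumption.
VIM_CMD="${VIM_CMD:-/usr/bin/vim-cmd}"

# Print the IDs of all VMs that are currently powered on.
running_vms() {
    for id in $($VIM_CMD vmsvc/getallvms | grep -v Vmid | awk '{print $1}'); do
        state=$($VIM_CMD vmsvc/power.getstate "$id" | tail -1 | awk '{print $2}')
        [ "$state" = "on" ] && echo "$id"
    done
    return 0
}

# Poll every 5 seconds until no VM is powered on, or give up after $1 seconds.
# Returns 0 if all VMs are off, 1 if some are still running.
wait_for_vms() {
    remaining="${1:-120}"
    while [ -n "$(running_vms)" ] && [ "$remaining" -gt 0 ]; do
        sleep 5
        remaining=$((remaining - 5))
    done
    [ -z "$(running_vms)" ]
}
```

With these functions pasted into the script, the sleep 30 line could become wait_for_vms 120, followed by the same /sbin/shutdown.sh and /sbin/poweroff calls.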
There’s a small problem here. I’m quite new to VMware, but from my understanding, the host wipes out all the files you’ve just created when it reboots: everything you see under / on the filesystem is lost. So a few more steps are needed to make the changes to authorized_keys and the shutdown script inside /home permanent. These steps are packing up the directories and files you’ve created and then editing a boot file (I saw this on a blog I can’t remember… sorry):
## pack your files
tar -C / -czf "/bootbank/home.tgz" /.ssh /home

## edit the file /bootbank/boot.cfg and add the new compressed file. In our case,
## the file reads the following:
kernel=b.z
kernelopt=
modules=k.z --- s.z --- c.z --- oem.tgz --- license.tgz --- m.z --- state.tgz --- home.tgz
build=4.1.0-348481
updated=1
bootstate=0
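The boot.cfg edit above can also be scripted. What follows is only a sketch: the add_module function and the BOOT_CFG and ARCHIVE variables are names made up here, and it is probably wise to try it on a copy of boot.cfg before touching the real one.

```shell
#!/bin/sh
# Sketch only: BOOT_CFG and ARCHIVE can be overridden so this can be
# tried on a copy of the file first.
BOOT_CFG="${BOOT_CFG:-/bootbank/boot.cfg}"
ARCHIVE="${ARCHIVE:-home.tgz}"

# Append the archive to the modules= line unless it is already listed there.
add_module() {
    if grep -q "^modules=.*$ARCHIVE" "$BOOT_CFG"; then
        return 0
    fi
    # Write to a temporary file first, then replace the original.
    sed "s/^modules=.*/& --- $ARCHIVE/" "$BOOT_CFG" > "$BOOT_CFG.new" &&
        mv "$BOOT_CFG.new" "$BOOT_CFG"
}
```

Calling add_module a second time leaves the file unchanged. The tar command still has to be re-run whenever authorized_keys or the shutdown script changes, since the archive is a snapshot of those files.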
Finally, on the UPS VM, this is the command to run when the UPS goes on battery:
ssh -i *my_private_key* root@*esxi_host_ip* 'sh /home/shutdownVMs.sh'
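How this command gets called depends on the UPS software, which will have its own way of running a script on power events. As a rough sketch, a hook script could look like the one below. The event name "onbattery", the function name and the ESXI_HOST and KEY_FILE values are hypothetical, so substitute whatever your UPS software actually passes to its scripts.

```shell
#!/bin/sh
# Sketch only: the event name "onbattery", ESXI_HOST and KEY_FILE are
# hypothetical placeholders, not taken from any particular UPS software.
ESXI_HOST="${ESXI_HOST:-192.0.2.10}"
KEY_FILE="${KEY_FILE:-$HOME/.ssh/esxi_shutdown}"

# Trigger the host-side shutdown script only for the on-battery event.
handle_ups_event() {
    case "$1" in
        onbattery)
            # BatchMode makes ssh fail instead of prompting if the key is rejected.
            ssh -i "$KEY_FILE" -o BatchMode=yes "root@$ESXI_HOST" 'sh /home/shutdownVMs.sh'
            ;;
        *)
            echo "ignoring UPS event: $1"
            ;;
    esac
}
```

In a real hook script, the last line would be something like handle_ups_event "$1", so that the event name given by the UPS software decides whether the host gets shut down.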
Conclusion
So, with this, you get a VM listening to your UPS state and connecting to the virtualization host when the AC fails. The graceful shutdown of all the VMs is only possible thanks to VMware Tools being installed on each one of them.