KVM - Not all VMs start automatically on boot

I have two VMs, and when my machine boots, only one comes up. If I manually start the second one, it works fine; it just doesn't come up automatically on boot. It is set to start automatically in the Cloudmin interface.

Any ideas where I should look to get more useful info?

Status: Closed (fixed)

Comments

If you look in the file "/etc/init.d/cloudmin-kvm", you should see a list of all the VMs that are set to start at boot (their lines start with "cgcreate").

Do you see the VM that isn't starting listed in there?

Yes, the VM is listed there.

Try looking in the /kvm directory on the host system for a file whose name ends with .console and contains the VM name. That should contain any error messages about why it failed to start at boot.
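As a rough sketch of that lookup, something like the following could list the matching console files and print the end of each. The function name show_console_logs and the KVM_DIR variable are made up for illustration; only the /kvm location and the .console suffix come from the comment above.

```shell
# Hypothetical helper: print the tail of every console log whose name
# contains the given VM name. KVM_DIR is an assumed override and
# defaults to /kvm, the directory mentioned above.
show_console_logs() {
    vm="$1"
    dir="${KVM_DIR:-/kvm}"
    for f in "$dir"/*"$vm"*.console; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || continue
        echo "== $f"
        tail -n 20 "$f"
    done
}
```

Running "show_console_logs myvm" as root right after a reboot should show whatever the failed start attempt wrote, where myvm is a placeholder for the real VM name.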

This is what the relevant file contains:

cgcreate: libcgroup initialization failed: Cgroup is not mounted
cgroup change of group failed
Failed to connect : Connection refused at /usr/bin/nc.pl line 19.

Perhaps the service it needs hasn't fully started by the time the VM tries to come up?

The other VM always starts just fine, however, and I can manually start this VM just fine after the system's fully booted.

One thing you could try is running the command that is in /etc/init.d/cloudmin-kvm for the VM as root via SSH, and see if it starts properly. That would reveal if the cause is some dependency issue at boot.

Copying the command from the file works fine.

I simply added "sleep 10" before all the other commands in the init script and it seems to work fine now. Definitely a dependency issue during boot.

It would be interesting to know what that dependency is, so that a proper fix could be devised.

My hunch is simply that cgconfig does not start in time, but I do not have any proof.
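One way to gather evidence for that hunch would be to record, at the very top of the init script, whether any cgroup filesystem is mounted yet. The sketch below is only an illustration: the function name cgroup_mounted and the log path in the comment are assumptions, not part of Cloudmin.

```shell
# Return success if a cgroup filesystem appears in the mounts table.
# $1 is the mounts file to read; it defaults to /proc/mounts.
cgroup_mounted() {
    mounts="${1:-/proc/mounts}"
    # Field 3 of each /proc/mounts line is the filesystem type.
    awk '$3 == "cgroup" || $3 == "cgroup2" { found = 1 }
         END { exit !found }' "$mounts"
}

# Possible use at the top of the init script (log path is hypothetical):
# if cgroup_mounted; then echo "cgroup mounted"; else echo "cgroup NOT mounted"; fi >> /tmp/cloudmin-kvm-boot.log
```

If the log says "cgroup NOT mounted" on a boot where the VM fails, that would support the idea that the cgroup setup finishes too late.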

It's odd that one of the VMs starts correctly though. Does its .console file have the same error immediately after booting the host?

The VM that starts correctly shows no errors. It could simply be that it starts after the one that fails, and the couple of static "sleep" delays already in the script might give it enough of a delay.

It looks like the issue is that the /cgroups directory may not be mounted at the time the Cloudmin init script runs. I will add a delay in future to handle that case, but it will only delay if /cgroups doesn't exist.
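For anyone who wants a stopgap in the meantime, a conditional wait along these lines could replace a fixed "sleep 10". This is a minimal sketch and not the actual Cloudmin change: the function name wait_for_dir and the 10-second default are assumptions.

```shell
# Wait until a directory exists, polling once per second.
# $1 = directory, $2 = timeout in seconds (defaults to 10).
# Returns 0 as soon as the directory exists, 1 if the timeout expires.
wait_for_dir() {
    dir="$1"
    timeout="${2:-10}"
    waited=0
    while [ ! -d "$dir" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use before the cgcreate lines in the init script:
# wait_for_dir /cgroups 10 || echo "warning: /cgroups still missing" >&2
```

Unlike an unconditional sleep, this adds no delay when /cgroups is already present. It only checks that the directory exists; if the directory can exist before the cgroup filesystem is mounted on it, testing with "mountpoint -q /cgroups" instead of "[ -d ... ]" may be a closer check.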

Automatically closed -- issue fixed for 2 weeks with no activity.