VM Startup Issues - QEMU-KVM Failed to start VNC server

Good morning Cloudmin support,

We have started seeing this issue when trying to boot up KVM guests on any of our 7 KVM hosts. It seems that there are addresses in use for VNC that aren't actually in use per se...at least none that we could find?

I have attached a screenshot of the boot message we see. This has become a serious issue for us, as we're trying to deploy new VMs via cloning. On one KVM host, we powered down all of its VMs, and restarted them all in parallel. 2 of 4 VMs came back up to Webmin status without an issue, and the other two would not start. We manually started them individually, and 1 of them started, and the other did not, and still won't.

Does Cloudmin track VNC port assignments somewhere within a file, and that is causing the issue?

In the past, a full-on reboot of servers would seemingly fix the issue...but that is not a viable option given the operational use of our production systems.

Our Cloudmin master server is running Ubuntu 12.04LTS and our KVM guests are running CentOS 7.

Please let us know how we can troubleshoot this, or if perhaps it is a bug that needs a patch?

Status: 
Closed (fixed)

Comments

Does it make any difference if you shut down the VMs using the Cloudmin GUI and start them back up again? The VNC port used at boot time may differ (and be wrong) from the port used from the UI.

Submitted by chadwick89 on Tue, 07/03/2018 - 15:49

JamieCameron

Prior to submitting this support post, that is what we tried. The VM we originally wanted to boot up would not start, so we shut down all the VMs on its KVM host and started them all in parallel. A total of 4 VMs were started, and only 3 of them came back up. The 4th would hang and return the boot message captured in the screenshot attached to this post.

We have another VM we use as a cloning template that also will not boot (hosted on a different KVM host) and it's giving the same message. At this point we cannot properly clone to create new VMs as they don't start.

OK, as a workaround you can force Cloudmin to select a new VNC port for a VM as follows:

  1. Find the VM's unique ID by SSHing in as root and running cloudmin list-systems --host your-vm-name --id-only
  2. Edit the file under /etc/webmin/servers whose name starts with that ID.
  3. Remove the kvm_vnc_display line, and save the file.
  4. Shut down and restart the VM.
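
Steps 2 and 3 above can be scripted if many VMs are affected. The following is only a rough sketch, not a Cloudmin tool: it assumes the file holds one key=value setting per line, and the function names and the backup_dir default are made up for illustration. The system ID is the value printed by the cloudmin list-systems command in step 1.

```python
import glob
import os
import shutil

SERVERS_DIR = "/etc/webmin/servers"  # directory named in step 2
KEY = "kvm_vnc_display"              # setting named in step 3


def drop_vnc_display(text):
    """Return text without any line that sets kvm_vnc_display."""
    kept = [
        line for line in text.splitlines(keepends=True)
        # Assumption: key=value lines. Compare the whole key so that a
        # longer key sharing the same prefix is left untouched.
        if line.split("=", 1)[0].strip() != KEY
    ]
    return "".join(kept)


def clear_vnc_display(system_id, servers_dir=SERVERS_DIR, backup_dir="/root"):
    """Edit the one file whose name starts with system_id, after a backup."""
    pattern = os.path.join(servers_dir, glob.escape(system_id) + "*")
    matches = glob.glob(pattern)
    if len(matches) != 1:
        raise RuntimeError(
            f"expected exactly one file for {system_id}, found {len(matches)}"
        )
    path = matches[0]
    # Keep the backup outside servers_dir so it is not mistaken for a system.
    shutil.copy2(path, os.path.join(backup_dir, os.path.basename(path)))
    with open(path) as fh:
        original = fh.read()
    with open(path, "w") as fh:
        fh.write(drop_vnc_display(original))
    return path
```

The VM still has to be shut down and started again afterwards (step 4) before a new VNC port is selected.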
Submitted by chadwick89 on Fri, 07/06/2018 - 19:14

JamieCameron

Thank you so much for this, we have documented it for future reference. This workaround did solve our issues with VMs that could not boot due to VNC port conflicts.

Is this ultimately a Cloudmin bug? Will there be a patch or permanent fix for this?

It's hard to say if this is a Cloudmin bug, as that would depend on what other process is using the same port.

However, in the next release we will add an extra check to prevent these port conflicts at VM startup time.
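
For anyone who wants to look for a conflict by hand in the meantime, here is a minimal sketch of such a check. It is not the check Cloudmin will ship. It relies on the standard VNC convention that display N listens on TCP port 5900 + N, and it assumes the kvm_vnc_display value is that display number; the function names are made up for illustration.

```python
import socket

VNC_BASE_PORT = 5900  # VNC convention: display N listens on TCP 5900 + N


def vnc_port(display):
    """Map a VNC display number to its TCP port."""
    return VNC_BASE_PORT + display


def port_is_free(port, host="0.0.0.0"):
    """Try to bind the port; a failure means something already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_display(start=0, limit=100):
    """Return the first display in the range whose port can be bound."""
    for display in range(start, start + limit):
        if port_is_free(vnc_port(display)):
            return display
    return None
```

Run on the KVM host, port_is_free(vnc_port(3)) reports whether anything already holds port 5903. The bind test is deliberately conservative: it can also report a port as busy while old connections on it are still winding down, so a False result is a reason to look closer (for example with ss or netstat), not proof of a second listener.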

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.