VM working fine then for no appearent reason refuses to boot giving only a cryptic error message.


inet_listen_opts: bind(ipv4,127.0.0.1,40000): Address already in use
inet_listen_opts: FAILED
chardev: opening backend "socket" failed: Address already in use

The above message does not really make sense since 127.0.0.1 is on every machine how can it be a problem.
Anyhow this is a critical problem, and there appears no way to debug or troubleshoot it in the VM via cloumin/virtualmin.

Status: 
Closed (fixed)

Comments

Howdy -- that error is from the standpoint of the host. When KVM launches, it's setup to listen on port 40000 on your KVM host (and the next KVM instance 40001, then 40002, etc).

It sounds like, in your case, that KVM may already be running and using that port.

What does this command output:

ps auxw | grep kvm

Thanks for your prompt response, Yes That makes WAY more sense, here is the PS output:

root      1200  0.0  0.0      0     0 ?        S<    2012   0:00 [kvm-irqfd-clean]
root      1626  9.2  5.6 3503340 1160616 ?     Sl    2012 5320:41 /usr/bin/kvm -name basesys.localhost -m 1024 -drive file=/kvm/basesys.localhost.cd,media=cdrom,index=2 -drive index=0,media=disk,file=/dev/lg_storage/lv_postgresql,if=virtio -boot c -net tap,vlan=0,script=/kvm/basesys.localhost-eth0.sh -net nic,vlan=0,macaddr=54:52:00:1a:fb:24,model=virtio -vnc :1,password -usbdevice tablet -monitor tcp:127.0.0.1:40000,server -smp 1
root      1629  0.0  0.0      0     0 ?        S<    2012   0:00 [kvm-pit-wq]
1000      2818  0.0  0.0   9380   924 pts/1    S+   14:34   0:00 grep --color=auto kvm
root     20743  1.3  7.2 4539368 1500276 ?     Sl   11:09   2:40 /usr/bin/kvm -name sparcs-tng.localhost -m 2048 -drive index=0,media=disk,file=/dev/lg_storage/lv_sparcs_tng,if=virtio -boot c -net tap,vlan=0,script=/kvm/sparcs-tng.localhost-eth0.sh -net nic,vlan=0,macaddr=52:54:00:4e:4b:56,model=virtio -net tap,vlan=1,script=/kvm/sparcs-tng.localhost-eth1.sh -net nic,vlan=1,macaddr=02:54:00:51:3D:FE,model=virtio -vnc :6,password -usbdevice tablet -monitor tcp:127.0.0.1:40002,server -smp 1
root     20749  0.0  0.0      0     0 ?        S<   11:09   0:00 [kvm-pit-wq]
root     29990  8.4  3.6 8893632 759260 ?      Sl   13:26   5:46 /usr/bin/kvm -name sparcsvm.localhost -m 8192 -drive index=0,media=disk,file=/dev/lg_storage/lv_sparcs,if=ide -boot c -net tap,vlan=0,script=/kvm/sparcsvm.localhost-eth0.sh -net nic,vlan=0,macaddr=02:54:00:6F:D7:A5,model=virtio -net tap,vlan=1,script=/kvm/sparcsvm.localhost-eth1.sh -net nic,vlan=1,macaddr=02:54:00:6F:D8:5A,model=virtio -vnc :2,password -usbdevice tablet -monitor tcp:127.0.0.1:40001,server -smp 2
root     29996  0.0  0.0      0     0 ?        S<   13:26   0:00 [kvm-pit-wq]
sparcsadmin@smvmhost:~$

It looks like that basesys vm is orphaned, as it is not in the GUI but obviously in the background using up the resource, is there a safe way to shut it down without the GUI, ie send a shutdown to it from the terminal? Franco

I might try this... first, what is the output of this command:

cloudmin list-systems

Do you see basesys.localhost listed in that?

If so, try running this command:

cloudmin shutdown-system --host basesys.localhost

Followed by this one:

cloudmin startup-system --host basesys.localhost

Does that resolve your issue?

And if not, what output do you receive when doing all that?

My other post with results of the ps command seemed to get lost in moderation, I have rebooted the server and the mysterious vm (Seems to be a copy/version of the VM I am trying to start always starts up in the background. I tried to get control by adding the instance to the GUI but it always refuses the hostname. I did manage to kill the rogue VM and start the proper one, and noticed this was happening to multiple VMs so I killed all the rogue VMs and started the correct ones from the GUI. Once everything was working correctly I rebooted the server, and came back to the exact same trouble situation.

So, After poking around further, I think thee rogue VMs are coming from old files in the /kvm/ folder since thats the only place I can find these outdated versions of my VMs.

Sorry. I will try using your approaches above to see where I get... Franco

ok, after I groom the system to be working correctly,

when I run list-systems I get this:

216.108.146.12                 Physical        KVM resource server #2       
216.108.146.32                 Physical        KVM resource Server #3       
comms.localhost                KVM             comm. shop Info portal (D7)  
geoserver.localhost            KVM             GeoServer (SPARCS)           
mapserver.localhost            KVM             Mapserver (SPARCS)           
pg-vm.localhost                KVM             PostGreSQL (SPARCS RDBMS)    
smvmhost                       Master          This system                  
sparcs-fs.localhost            KVM             SPARCS File Server           
sparcs-tng.localhost           KVM             Dev.Front End (Drupal7) SPARCS
sparcsdown.localhost           KVM             Offline Front End (Apache) SPA
sparcsvm.localhost             KVM             Prod. Front End (Drupal6) SPAR

in the ps I get this:

root      1215  0.0  0.0      0     0 ?        S<   16:10   0:00 [kvm-irqfd-clean]
root      1232  8.1  3.9 8893632 808784 ?      Rl   16:10   3:03 /usr/bin/kvm -name sparcsvm.localhost -m 8192 -drive index=0,media=disk,file=/dev/lg_storage/lv_sparcs,if=ide -boot c -net tap,vlan=0,script=/kvm/sparcsvm.localhost-eth0.sh -net nic,vlan=0,macaddr=02:54:00:6F:D7:A5,model=virtio -net tap,vlan=1,script=/kvm/sparcsvm.localhost-eth1.sh -net nic,vlan=1,macaddr=02:54:00:6F:D8:5A,model=virtio -vnc :2,password -usbdevice tablet -monitor tcp:127.0.0.1:40001,server -smp 2
root      1324  0.0  0.0      0     0 ?        S<   16:10   0:00 [kvm-pit-wq]
root      1404  2.3  6.0 4536568 1242404 ?     Sl   16:10   0:51 /usr/bin/kvm -name sparcs-tng.localhost -m 2048 -drive index=0,media=disk,file=/dev/lg_storage/lv_sparcs_tng,if=virtio -boot c -net tap,vlan=0,script=/kvm/sparcs-tng.localhost-eth0.sh -net nic,vlan=0,macaddr=52:54:00:4e:4b:56,model=virtio -vnc :9,password -usbdevice tablet -monitor tcp:127.0.0.1:40002,server -smp 1
root      1408  0.0  0.0      0     0 ?        S<   16:10   0:00 [kvm-pit-wq]
root      5411 10.4  2.0 3465876 430940 ?      Sl   16:39   0:50 /usr/bin/kvm -name pg-vm.localhost -m 1024 -drive file=/kvm/basesys.localhost.cd,media=cdrom,index=2 -drive index=0,media=disk,file=/dev/lg_storage/lv_postgresql,if=virtio -boot c -net tap,vlan=0,script=/kvm/basesys.localhost-eth0.sh -net nic,vlan=0,macaddr=54:52:00:1a:fb:24,model=virtio -net tap,vlan=1,script=/kvm/pg-vm.localhost-eth1.sh -net nic,vlan=1,macaddr=02:54:00:07:B7:67,model=virtio -net tap,vlan=2,script=/kvm/pg-vm.localhost-eth2.sh -net nic,vlan=2,macaddr=02:54:00:4D:9A:C7,model=virtio -net tap,vlan=3,script=/kvm/pg-vm.localhost-eth3.sh -net nic,vlan=3,macaddr=02:54:00:4D:9A:DA,model=virtio -vnc :1,password -usbdevice tablet -monitor tcp:127.0.0.1:40000,server -smp 1
root      5414  0.0  0.0      0     0 ?        S<   16:39   0:00 [kvm-pit-wq]
1000      6592  0.0  0.0   9384   928 pts/0    S+   16:47   0:00 grep --color=auto kvm

Until the next reboot, which gives me this in list-systems

216.108.146.12                 Physical        KVM resource server #2       
216.108.146.32                 Physical        KVM resource Server #3       
comms.localhost                KVM             comm. shop Info portal (D7)  
geoserver.localhost            KVM             GeoServer (SPARCS)           
mapserver.localhost            KVM             Mapserver (SPARCS)           
pg-vm.localhost                KVM             PostGreSQL (SPARCS RDBMS)    
smvmhost                       Master          This system                  
sparcs-fs.localhost            KVM             SPARCS File Server           
sparcs-tng.localhost           KVM             Dev.Front End (Drupal7) SPARCS
sparcsdown.localhost           KVM             Offline Front End (Apache) SPA
sparcsvm.localhost             KVM             Prod. Front End (Drupal6) SPAR

and this in ps

root      1181  0.0  0.0      0     0 ?        S<   16:52   0:00 [kvm-irqfd-clean]
root      1203 21.3  2.9 8893632 614892 ?      Sl   16:52   0:23 /usr/bin/kvm -name sparcsvm.localhost -m 8192 -drive index=0,media=disk,file=/dev/lg_storage/lv_sparcs,if=ide -boot c -net tap,vlan=0,script=/kvm/sparcsvm.localhost-eth0.sh -net nic,vlan=0,macaddr=02:54:00:6F:D7:A5,model=virtio -net tap,vlan=1,script=/kvm/sparcsvm.localhost-eth1.sh -net nic,vlan=1,macaddr=02:54:00:6F:D8:5A,model=virtio -vnc :2,password -usbdevice tablet -monitor tcp:127.0.0.1:40001,server -smp 2
root      1307  0.0  0.0      0     0 ?        S<   16:52   0:00 [kvm-pit-wq]
root      1387 18.9  4.0 4619636 838360 ?      Sl   16:52   0:18 /usr/bin/kvm -name sparcs-tng.localhost -m 2048 -drive index=0,media=disk,file=/dev/lg_storage/lv_sparcs_tng,if=virtio -boot c -net tap,vlan=0,script=/kvm/sparcs-tng.localhost-eth0.sh -net nic,vlan=0,macaddr=52:54:00:4e:4b:56,model=virtio -vnc :9,password -usbdevice tablet -monitor tcp:127.0.0.1:40002,server -smp 1
root      1390  0.0  0.0      0     0 ?        S<   16:52   0:00 [kvm-pit-wq]
root      1451 16.3  1.6 3502268 330296 ?      Sl   16:52   0:14 /usr/bin/kvm -name sparcsdown -m 1024 -drive index=1,media=disk,file=/home/sparcsadmin/sparcs-down.raw.img,if=virtio -boot c -net tap,vlan=0,script=/kvm/sparcsdown.localhost-eth0.sh -net nic,vlan=0,macaddr=02:54:00:B3:7A:CD,model=virtio -net tap,vlan=1,script=/kvm/sparcsdown.localhost-eth1.sh -net nic,vlan=1,macaddr=02:54:00:B3:7C:0C,model=virtio -vnc :6,password -usbdevice tablet -monitor tcp:127.0.0.1:40003,server -smp 1
root      1564  0.0  0.0      0     0 ?        S<   16:52   0:00 [kvm-pit-wq]
root      1704 18.4  1.0 3468364 226080 ?      Sl   16:52   0:14 /usr/bin/kvm -name basesys.localhost -m 1024 -drive file=/kvm/basesys.localhost.cd,media=cdrom,index=2 -drive index=0,media=disk,file=/dev/lg_storage/lv_postgresql,if=virtio -boot c -net tap,vlan=0,script=/kvm/basesys.localhost-eth0.sh -net nic,vlan=0,macaddr=54:52:00:1a:fb:24,model=virtio -vnc :1,password -usbdevice tablet -monitor tcp:127.0.0.1:40000,server -smp 1
root      1708  0.0  0.0      0     0 ?        S<   16:52   0:00 [kvm-pit-wq]
1000      2288  0.0  0.0   9384   924 pts/1    S+   16:54   0:00 grep --color=auto kvm

These are the files in the /kvm/ folder

-rw-r--r--  1 root root 31457280 Nov  7 16:27 basesys.localhost.cd
-rw-r--r--  1 root root     1112 Jan  7 16:53 basesys.localhost.console
-rwxr-xr-x  1 root root      222 Jan  7 16:39 basesys.localhost-eth0.sh*
-rw-r--r--  1 root root       28 Jan  7 16:39 basesys.localhost.monitor
-rw-r--r--  1 root root      222 Jan  7 15:31 pg-vm.localhost-eth0.sh
-rw-r--r--  1 root root        5 Jan  7 16:52 pg-vm.localhost-eth0.tap
-rwxr-xr-x  1 root root      222 Jan  7 16:39 pg-vm.localhost-eth1.sh*
-rw-r--r--  1 root root        5 Jan  7 16:39 pg-vm.localhost-eth1.tap
-rwxr-xr-x  1 root root      222 Jan  7 16:39 pg-vm.localhost-eth2.sh*
-rw-r--r--  1 root root        5 Jan  7 16:39 pg-vm.localhost-eth2.tap
-rwxr-xr-x  1 root root      222 Jan  7 16:39 pg-vm.localhost-eth3.sh*
-rw-r--r--  1 root root        5 Jan  7 16:39 pg-vm.localhost-eth3.tap
-rw-r--r--  1 root root        5 Nov 15 14:58 postgres.localhost-eth0.tap
-rw-r--r--  1 root root        5 Nov 15 14:58 postgres.localhost-eth1.tap
-rw-r--r--  1 root root        0 Nov 13 14:32 postgresql.localhost.console
-rwxr-xr-x  1 root root      227 Nov 15 15:16 postgresql.localhost-eth0.sh*
-rw-r--r--  1 root root        5 Nov 13 14:32 postgresql.localhost-eth0.tap
-rw-r--r--  1 root root        5 Nov 13 14:32 postgresql.localhost-eth1.tap
-rw-r--r--  1 root root       28 Nov 15 14:22 postgresql.localhost.monitor
-rw-r--r--  1 root root      953 Jan  7 16:52 sparcsdown.localhost.console
-rwxr-xr-x  1 root root      227 Jan  7 11:54 sparcsdown.localhost-eth0.sh*
-rw-r--r--  1 root root        5 Jan  7 16:52 sparcsdown.localhost-eth0.tap
-rwxr-xr-x  1 root root      227 Jan  7 11:54 sparcsdown.localhost-eth1.sh*
-rw-r--r--  1 root root        5 Jan  7 16:52 sparcsdown.localhost-eth1.tap
-rw-r--r--  1 root root       28 Jan  7 16:31 sparcsdown.localhost.monitor
-rw-r--r--  1 root root     1105 Jan  7 16:52 sparcs-tng.localhost.console
-rwxr-xr-x  1 root root      227 Jan  7 11:09 sparcs-tng.localhost-eth0.sh*
-rw-r--r--  1 root root        5 Jan  7 16:52 sparcs-tng.localhost-eth0.tap
-rwxr-xr-x  1 root root      227 Jan  7 11:09 sparcs-tng.localhost-eth1.sh*
-rw-r--r--  1 root root        5 Jan  7 11:09 sparcs-tng.localhost-eth1.tap
-rw-r--r--  1 root root       32 Jan  7 16:31 sparcs-tng.localhost.monitor
-rw-r--r--  1 root root      953 Jan  7 16:52 sparcsvm.localhost.console
-rwxr-xr-x  1 root root      225 Jan  7 13:26 sparcsvm.localhost-eth0.sh*
-rw-r--r--  1 root root        5 Jan  7 16:52 sparcsvm.localhost-eth0.tap
-rwxr-xr-x  1 root root      225 Jan  7 13:26 sparcsvm.localhost-eth1.sh*
-rw-r--r--  1 root root        5 Jan  7 16:52 sparcsvm.localhost-eth1.tap
-rw-r--r--  1 root root       28 Jan  7 16:32 sparcsvm.localhost.monitor
-rw-r--r--  1 root root 31457280 Nov  9 08:52 zoneminder.localhost.cd
-rw-r--r--  1 root root        5 Nov  9 10:30 zoneminder.localhost-eth0.tap

I have to quit for the night so I will groom it, and leave them until tomorrow then I will come back and continue... Thanks for all the help guys! Franco

Hmm, so it looks like the difference in running KVM instances is that the "basesys.localhost" is running after you reboot?

Are you receiving the error you mentioned when that occurs?

And then, after stopping that KVM instance, does your error go away?

Good morning Eric, Yes, killing the basesys.localhost allows me to properly boot the pg-vm.localhost and the "inet_listen_opts: bind(ipv4,127.0.0.1,40000): Address already in use" message does not appear.

However, the problem seems more complicated than that, here is everything that is happening.

On server boot, the basesys.localhost starts (it should not even exist as far as I can tell) this is using the same harddrive as pg-vm.localhost (and KVM port 40000 too) so when I try and start pg-vm, it errors out.

Further, there are two versions of the main webserver VM: sparcsvm.localhost - the production webserver which should boot with the server (And is set as such in the GUI) and sparcsdown.localhost - which is a placeholder webserver set to NOT boot with the server (And is set as such in the GUI). this server is manually started when the the production server is taken offline.

However, the server tries to boot both of them at server boot time, and since they have the same IP/mac parameters, it causes problems.

So, to summarize the problem here, 1) An old unwanted VM keeps instantiating itself at server boot time 2) The VMs are not respecting the BOOT/NO BOOT at server boot time.

I realize these are two separate issues, so if you prefer we can split this off into another thread if you prefer. Franco

You should check the init script /etc/init.d/cloudmin-kvm , which contains commands to start VMs at system boot time. It seems likely that a VM was deleted or renamed somehow, but this script wasn't updated..

Both VMS that should not start were in this script, so I manually removed them from the script, but I suspect I should have the script rebuilt by the software, how do I trigger that? F

The lines in that script get added when you change the "Start at host boot time" option for a VM on the Edit System page.

Nice, these workarounds have resovled the two issues and I can confirm that the GUI is having the expected effect on the cloudmin-kvm script now. Thanks Eric & Jamie :) Franco