Virtualmin crashes when I try to create a new site [#31591]

Submitted by cjcollins on Sat, 12/07/2013 - 22:22 Comment #1

Submitted by andreychek on Sat, 12/07/2013 - 22:34 Comment #2

Howdy -- are the authentication details of your slave DNS server correct?

The error I see in your error logs shows this:

Login to RPC server as root rejected

That most commonly occurs when the root password or the IP address is incorrect.

Submitted by cjcollins on Sat, 12/07/2013 - 22:37 Comment #3

Before, my second NS2 Virtualmin (192.168.7.6) had a different password than my main NS1 Virtualmin (192.168.7.5). Now that both IPs point to the same Virtualmin, the login password for NS2 should be the same as for NS1. How and where do I set that? I think that is the problem.

Submitted by cjcollins on Sat, 12/07/2013 - 23:15 Comment #4

The same error, Login to RPC server as root rejected

is generated when I go to Webmin > Cluster Webmin Servers and try to add 192.168.7.6 as a second server.

Submitted by cjcollins on Sun, 12/08/2013 - 20:36 Comment #5

Here's more information. When I try to create a new virtual site, all the other sites go down, but I can still SSH into the Ubuntu server. When I run the "top" command, I see this:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6781 www-data  20   0  399m  11m  632 R  100  0.1   2:38.80 apache2

I can see that apache2 is using 100% of the CPU. I can see which process it is by running this command:

# ps ax | grep -v grep | grep apache
 6781 ?        R      4:44 /usr/sbin/apache2 -k start

So basically apache2 is trying to start and sits at 100% CPU, taking down all the other sites.
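To pick out the runaway process without reading the whole list, the ps output can be filtered by name and CPU usage. This is a minimal sketch, not something from the thread: the function name busy_procs and the 90% threshold are illustrative assumptions. Note that ps reports CPU averaged over the life of the process, so its number can differ from what top shows.

```shell
#!/bin/bash
# busy_procs NAME THRESHOLD
# Reads "PID %CPU COMMAND" rows on stdin and prints the PID and %CPU of
# every process named NAME whose CPU usage is at or above THRESHOLD.
busy_procs() {
  awk -v name="$1" -v min="$2" '$3 == name && $2 + 0 >= min + 0 { print $1, $2 }'
}

# Typical use (the trailing "=" on each column suppresses the ps header row):
#   ps -eo pid=,pcpu=,comm= | busy_procs apache2 90
```

Reading from stdin keeps the filter separate from ps, so it can be tried on saved output as well as on a live system.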

Submitted by JamieCameron on Sat, 12/07/2013 - 23:41 Comment #6

So if NS1 and NS2 are the same machine, you don't actually need a cluster slave setup - otherwise Virtualmin will try to create the slave DNS zone on the same system as the master.

Am I correct that you actually only have a single physical system?

Submitted by cjcollins on Sun, 12/08/2013 - 00:04 Comment #7

Yes, that's right, I now have a single physical system. I went ahead and deleted all traces of the second server (192.168.7.6) from Webmin > Webmin Servers Index, Cluster Webmin Servers, and Cluster Usermin Servers. After doing that I tried creating the virtual site again and got a different output (see attached). The original error "Login to RPC server as root rejected" went away, but I still have the problem with the process "apache2 -k start" maxing out the CPU and taking down all the other sites.

Previously I really did have a second server (192.168.7.6), and I forgot to remove it from the cluster when I decided to just add a second NIC to my main server and assign it 192.168.7.6.

Submitted by JamieCameron on Sun, 12/08/2013 - 12:30 Comment #8

So if you try to start Apache with /etc/init.d/apache2 start, does it actually start up, or does it just crash? If the latter, what gets logged to the Apache error log?

Submitted by cjcollins on Sun, 12/08/2013 - 20:40 Comment #9

I tried killing the apache2 process running at 100% CPU and starting it back up:

# /etc/init.d/apache2 start
 * Starting web server apache2
[Sun Dec 08 13:02:28 2013] [warn] NameVirtualHost 173.165.128.42:80 has no VirtualHosts

I have attached the error log.

Submitted by andreychek on Sun, 12/08/2013 - 14:30 Comment #10

When you killed Apache and started it on the command line, that appears to have worked properly; what you saw is just a warning. Are your websites working for you at that point?

Regarding the CPU issue -- the logs show some unusual errors that Apache is generating.

Do you have any unusual modules enabled, perhaps modules from third party repositories?

It's possible that an Apache module is misbehaving.

What output do you receive if you run this command:

ls /etc/apache2/mods-enabled

Also, what output does this show:

dpkg -l apache2

Submitted by cjcollins on Sun, 12/08/2013 - 14:35 Comment #11

The websites work when I reboot the system. I can recreate the problem by just trying to "create a virtual server".

I just rebooted and ran both of those commands (see attached).

Submitted by cjcollins on Sun, 12/08/2013 - 20:41 Comment #12

Submitted by andreychek on Sun, 12/08/2013 - 15:19 Comment #13

Hmm, I don't see any unusual modules enabled there.

Do you see an influx of bandwidth that corresponds with the Apache CPU load you're seeing? I'm curious if that's related to traffic, rather than a misbehaving module.

Submitted by cjcollins on Sun, 12/08/2013 - 15:24 Comment #14

I'm using Monitis to monitor the server. See the attached picture of today's graph. The top graph is pings, so the red dots are when the server appeared to be down. They line up directly with when the CPU load shot up to 100%.

Submitted by cjcollins on Sun, 12/08/2013 - 16:11 Comment #15

I just realized I can bring the sites back online by killing the apache2 process and starting it back up again. I made a short video showing the problem:

https://www.dropbox.com/s/v5esb1vxaosan1p/virtualmin_crash.mp4

Submitted by andreychek on Sun, 12/08/2013 - 16:37 Comment #16

Thanks for the video! Yeah, I understand what's occurring, I just don't know why that might happen.

There's nothing Virtualmin does that should cause that sort of behavior... Virtualmin just adds VirtualHost content for the new domain, and then restarts Apache.

What if you run this command on that Apache process:

strace -p PID > apache_strace.txt 2>&1

Substitute the Apache process's process ID in place of "PID" above.

After 5-10 seconds, kill that process if it doesn't end automatically.

Could you attach the resulting file (apache_strace.txt)?
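The capture step above could also be time-limited so nothing has to be killed by hand. A minimal sketch, assuming GNU coreutils timeout is installed; the helper name trace_for and the APACHE_PID variable in the usage comment are made up for illustration. It uses strace's -o option to write the trace to a file instead of redirecting stderr:

```shell
#!/bin/bash
# trace_for SECONDS PID OUTFILE
# Attaches strace to PID, writes the trace to OUTFILE, and lets timeout
# stop strace after SECONDS so the capture ends on its own.
trace_for() {
  timeout "$1" strace -p "$2" -o "$3"
}

# Typical use, run as root, with APACHE_PID set to the busy apache2 PID:
#   trace_for 10 "$APACHE_PID" apache_strace.txt
```

When the time limit is what ends the capture, timeout exits with status 124, so a non-zero exit status here does not by itself mean the trace failed.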

Submitted by cjcollins on Sun, 12/08/2013 - 17:35 Comment #17

Sure. It's taking too long to upload that, so here's a link from my Dropbox:

https://www.dropbox.com/s/u59fgh3kbbwqurh/apache_strace.txt

Submitted by andreychek on Sun, 12/08/2013 - 18:09 Comment #18

Thanks! I've sent the relevant bits over to Jamie, let's see what he can make of it.

I'm going to post it below for future reference -- the messages below repeat throughout the entire file:

Process 22466 attached - interrupt to quit
gettimeofday({1386543157, 496531}, NULL) = 0
gettimeofday({1386543157, 496717}, NULL) = 0
gettimeofday({1386543157, 496886}, NULL) = 0
poll([{fd=67, events=POLLIN}], 1, 3000) = 1 ([{fd=67, revents=POLLHUP}])
read(67, "", 13160)                     = 0
gettimeofday({1386543157, 497414}, NULL) = 0
gettimeofday({1386543157, 497568}, NULL) = 0
gettimeofday({1386543157, 497725}, NULL) = 0
gettimeofday({1386543157, 497869}, NULL) = 0
poll([{fd=67, events=POLLIN}], 1, 3000) = 1 ([{fd=67, revents=POLLHUP}])
read(67, "", 13160)                     = 0
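
In the excerpt above, poll() on fd 67 returns immediately with POLLHUP and read() returns 0 (end of file), after which the same descriptor is polled again. That pattern would be consistent with a busy loop and the 100% CPU usage. To check whether a pattern like this really dominates the full file, the calls can be counted. This is a rough sketch; the function name syscall_counts is an assumption, and it only looks at lines that start with a syscall name:

```shell
#!/bin/bash
# syscall_counts FILE
# Prints "count name" for each syscall found in an strace log,
# most frequent first. Lines such as "Process ... attached" are skipped
# because they do not start with a lowercase name followed by "(".
syscall_counts() {
  grep -oE '^[a-z_0-9]+\(' "$1" | tr -d '(' | sort | uniq -c | sort -rn |
    awk '{ print $1, $2 }'
}

# Typical use:
#   syscall_counts apache_strace.txt
```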

Submitted by JamieCameron on Sun, 12/08/2013 - 22:50 Comment #19

So I had a look, and it seems that just running apache2ctl graceful is enough to trigger this problem ... which suggests it is actually some kind of Apache bug.

As a workaround, I configured Virtualmin to not use that command. Instead, it restarts Apache to apply config changes.

Submitted by cjcollins on Mon, 12/09/2013 - 00:15 Comment #20

I just confirmed the problem is fixed. I can add/delete sites without apache crashing the other sites. Thanks Jamie!

Submitted by JamieCameron on Mon, 12/09/2013 - 12:03 Comment #21

Great! Now as to why apache2ctl graceful causes Apache to hang, I don't know.

Submitted by cjcollins on Sat, 01/04/2014 - 00:31 Comment #22

Submitted by cjcollins on Wed, 01/15/2014 - 05:20 Comment #23

I still have the issue with apache2 crashing. Now it's just random and I don't know what triggers it. Here's my quick fix for the problem: I wrote a script that runs every 20 minutes and checks whether Apache has crashed. Basically, I know it crashed if there's only one apache2 process running. Here's my script:

#!/bin/bash
# Restart Apache if only a single apache2 process is left running,
# which is how the hang shows up on this server.
LOG=/home/chris/scripts/apache_crash_log

# List the apache2 processes, excluding the grep itself.
PROCS=$(ps aux | grep -v grep | grep apache2)
echo "$PROCS"
BROKE=$(echo "$PROCS" | grep -c apache2)
# ps pads its columns with a variable number of spaces, so use awk
# rather than cut to pick out the PID column.
PID=$(echo "$PROCS" | tail -n 1 | awk '{print $2}')
echo "$BROKE"
echo "$PID"

if [ "$BROKE" -eq 1 ]
then
  date >> "$LOG"
  echo "PID=$PID" >> "$LOG"
  echo "server crashed...sites down!"
  kill "$PID"
  /etc/init.d/apache2 stop
  sleep 2
  /etc/init.d/apache2 start
  echo "sites should be back soon..."
  # Give Apache a few seconds to spawn its child processes.
  sleep 5
  ps aux | grep -v grep | grep apache2
  sleep 1
  # Broadcast the crash log to logged-in users.
  wall "$LOG"
  echo "--------------------------------" >> "$LOG"
else
  echo "everything looks good!"
fi

I tested this in a real situation and it fixed the problem. I went to Webmin > Cluster > Cluster Cron Jobs and added this script as a cron job that runs every 20 minutes. I don't know what else to do, but if it works I'm happy. What do you think?
