Submitted by jims45 on Sun, 05/07/2017 - 09:56
From time to time my server crashes and needs a reboot. The server is hosted on a Digital Ocean droplet and has plenty of reserve memory. How can I diagnose this issue?
Status:
Active
Comments
Submitted by jims45 on Sun, 05/07/2017 - 09:57 Comment #1
Submitted by andreychek on Sun, 05/07/2017 - 11:14 Comment #2
Howdy -- thanks for contacting us!
It's rare to see Webmin crash for a reason other than a resource issue, though we'll certainly help look around to determine what's going on.
The next time Webmin crashes, before restarting Webmin or rebooting, can you run this command:
dmesg | tail -30
Also, do you see any errors in /var/webmin/miniserv.error?
Lastly, do you have a /proc/user_beancounters file? If so, can you paste in its contents?
Thanks!
Submitted by jims45 on Mon, 05/08/2017 - 11:07 Comment #3
Subroutine arpa_to_ip redefined at /usr/share/webmin/bind8/records-lib.pl line 591.
Subroutine ip_to_arpa redefined at /usr/share/webmin/bind8/records-lib.pl line 601.
Subroutine ip6int_to_net redefined at /usr/share/webmin/bind8/records-lib.pl line 611.
Subroutine net_to_ip6int redefined at /usr/share/webmin/bind8/records-lib.pl line 638.
Subroutine valdnsname redefined at /usr/share/webmin/bind8/records-lib.pl line 657.
Subroutine valemail redefined at /usr/share/webmin/bind8/records-lib.pl line 685.
Subroutine absolute_path redefined at /usr/share/webmin/bind8/records-lib.pl line 696.
Subroutine parse_spf redefined at /usr/share/webmin/bind8/records-lib.pl line 704.
Subroutine join_spf redefined at /usr/share/webmin/bind8/records-lib.pl line 750.
Subroutine parse_dmarc redefined at /usr/share/webmin/bind8/records-lib.pl line 794.
Subroutine join_dmarc redefined at /usr/share/webmin/bind8/records-lib.pl line 818.
Subroutine join_record_values redefined at /usr/share/webmin/bind8/records-lib.pl line 848.
Subroutine compute_serial redefined at /usr/share/webmin/bind8/records-lib.pl line 869.
Subroutine convert_to_absolute redefined at /usr/share/webmin/bind8/records-lib.pl line 904.
Subroutine get_zone_file redefined at /usr/share/webmin/bind8/records-lib.pl line 923.
Subroutine get_dnskey_record redefined at /usr/share/webmin/bind8/records-lib.pl line 947.
Subroutine record_id redefined at /usr/share/webmin/bind8/records-lib.pl line 969.
Subroutine find_record_by_id redefined at /usr/share/webmin/bind8/records-lib.pl line 979.
Subroutine get_dnskey_rrset redefined at /usr/share/webmin/bind8/records-lib.pl line 998.
Subroutine is_raw_format_records redefined at /usr/share/webmin/bind8/records-lib.pl line 1020.
No errors or data in /proc
Webmin has not crashed again yet.
Thanks
Submitted by andreychek on Mon, 05/08/2017 - 11:57 Comment #4
Hmm, don't see anything too unusual in your error log there. Those are just some notices that are safe to ignore.
What I'd suggest is to keep an eye out for the next time it happens, and if/when it does, take a look at the "dmesg" output. It may contain some helpful info for debugging what you're seeing.
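When the dmesg buffer is long, the memory-related lines can be easy to miss. Here is a minimal sketch that filters for them. The function name oom_events is made up for illustration, and the patterns are ones the Linux kernel commonly logs when it runs out of memory, so other kernel versions may word them differently:

```shell
# Hypothetical helper: keep only the kernel log lines that mention the
# out-of-memory killer. Reads kernel log text on stdin.
oom_events() {
    grep -iE 'out of memory|killed process|oom-killer'
}

# Typical use on the affected server (run before rebooting):
#   dmesg | oom_events
```

No output means none of those phrases appear in the current kernel log, which would point away from a memory shortage.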
Submitted by jims45 on Mon, 05/08/2017 - 11:59 Comment #5
Will do!
Thank you.
Submitted by jims45 on Tue, 05/16/2017 - 11:32 Comment #6
Hello. It crashed again; here is the result of "dmesg":
[617108.408845] [ 9984] 33 9984 116766 2092 148 4 0 0 apache2
[617108.408849] [ 9985] 33 9985 116766 2092 148 4 0 0 apache2
[617108.408852] [ 9986] 1023 9986 90066 10484 119 3 0 0 php5-cgi
[617108.408856] [ 9988] 33 9988 116751 2109 147 4 0 0 apache2
[617108.408860] [ 9989] 33 9989 116751 2109 147 4 0 0 apache2
[617108.408864] [ 9990] 33 9990 116751 2109 147 4 0 0 apache2
[617108.408868] [ 9991] 33 9991 116766 2092 148 4 0 0 apache2
[617108.408872] [ 9993] 1023 9993 89457 6759 114 3 0 0 php5-cgi
[617108.408876] [ 9994] 1023 9994 89297 6304 112 3 0 0 php5-cgi
[617108.408879] [ 9998] 33 9998 116720 2053 145 4 0 0 apache2
[617108.408883] [ 9999] 33 9999 116720 2053 145 4 0 0 apache2
[617108.408887] [10000] 33 10000 116748 2082 147 4 0 0 apache2
[617108.408891] [10001] 33 10001 116748 2082 147 4 0 0 apache2
[617108.408895] [10002] 33 10002 116744 2060 146 4 0 0 apache2
[617108.408898] [10003] 33 10003 116720 2053 145 4 0 0 apache2
[617108.408902] [10004] 33 10004 116720 2053 145 4 0 0 apache2
[617108.408906] [10005] 33 10005 116748 2082 147 4 0 0 apache2
[617108.408910] [10007] 1023 10007 88480 4271 109 3 0 0 php5-cgi
[617108.408914] [10008] 1023 10008 89169 5040 109 3 0 0 php5-cgi
[617108.408918] Out of memory: Kill process 1458 (mysqld) score 58 or sacrifice child
[617108.409049] Killed process 1458 (mysqld) total-vm:1084176kB, anon-rss:120260kB, file-rss:0kB
[617108.456379] init: mysql main process (1458) killed by KILL signal
[617108.456413] init: mysql main process ended, respawning
[617108.543094] audit: type=1400 audit(1494786641.753:17): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/sbin/mysqld" pid=10020 comm="apparmor_parser"
[617108.900986] init: mysql main process (10035) terminated with status 1
[617108.900999] init: mysql main process ended, respawning
[617109.786239] init: mysql post-start process (10036) terminated with status 1
[617109.802487] audit: type=1400 audit(1494786643.013:18): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/sbin/mysqld" pid=10060 comm="apparmor_parser"
[617109.857225] init: mysql main process (10072) terminated with status 1
[617109.857239] init: mysql respawning too fast, stopped
Thanks
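The per-process rows in a dump like this can be totalled by process name to see what was resident in memory when the kernel stepped in. This is a rough sketch with two assumptions: the column layout matches this output (resident pages is the sixth field from the end, which varies between kernel versions) and pages are 4 KiB. The helper name oom_rss_by_name is hypothetical, and it expects one kernel message per line, as dmesg prints them:

```shell
# Hypothetical helper: sum the resident-page column of the OOM killer's
# task table by process name. Reads dmesg text on stdin.
# $1 = page size in KiB (assumed to be 4 when omitted).
# Output columns: name, process count, resident MiB; largest total first.
oom_rss_by_name() {
    awk -v page_kb="${1:-4}" '
        # Task rows contain "[ pid] uid tgid ..."; the timestamp bracket has
        # a dot in it, so it does not match this pattern.
        /\[ *[0-9]+\] +[0-9]+ +[0-9]+ / && NF >= 8 && $(NF-5) ~ /^[0-9]+$/ {
            pages[$NF] += $(NF-5)
            count[$NF]++
        }
        END {
            for (name in pages)
                printf "%s %d %.1f\n", name, count[name], pages[name] * page_kb / 1024
        }
    ' | LC_ALL=C sort -k3,3nr
}

# Typical use:
#   dmesg | oom_rss_by_name
```

Process names containing spaces would be misread by this sketch, since it takes the last field as the name.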
Submitted by andreychek on Tue, 05/16/2017 - 12:32 Comment #7
Okay, it does look like you're running into memory problems there.
The Linux kernel is killing off processes in order to keep the server up and running.
You may need to review all the running processes, and ensure that none are using up a large amount of memory. Sometimes MySQL can get particularly large, for example.
Also, you may want to configure Apache's maximum connections to reduce how many are allowed at once. It's possible that you're receiving bursts of Apache traffic that are causing problems.
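One way to review the running processes is to total their resident memory per command name, so that many small Apache or PHP processes show up as a single figure. This is a minimal sketch, assuming the procps version of ps found on most Linux systems; the function name mem_by_command is made up for illustration:

```shell
# Hypothetical helper: sum resident memory (RSS, in KiB) per command name.
# Expects "rss command" pairs on stdin, one process per line.
# Output columns: command, total resident MiB; largest first.
mem_by_command() {
    awk '
        NF >= 2 { rss[$2] += $1 }
        END { for (cmd in rss) printf "%s %.1f\n", cmd, rss[cmd] / 1024 }
    ' | LC_ALL=C sort -k2,2nr
}

# Typical use (top ten memory consumers):
#   ps -eo rss=,comm= | mem_by_command | head
```

RSS counts shared pages once per process, so the totals for forked servers such as Apache tend to overstate real usage; they are still useful for spotting which service dominates.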
Submitted by andreychek on Tue, 05/16/2017 - 12:32 Comment #8
Also, what is the output of the command "free -m"?
Submitted by jims45 on Tue, 05/16/2017 - 12:44 Comment #9
This is a bit beyond my skillset! I have 171 running processes.
Real memory: 1.95 GB total / 1.24 GB free / 1.06 GB cached
Swap space: 0 bytes total / 0 bytes free
1464 mysql 866.28 MB /usr/sbin/mysqld
Not sure what I should do next, maybe go back to using NGINX?
Submitted by jims45 on Tue, 05/16/2017 - 12:51 Comment #10
free -m
                    total   used   free   shared   buffers   cached
Mem:                 2000   1495    505      513        38      759
-/+ buffers/cache:           697   1302
Swap:
Submitted by jims45 on Tue, 05/16/2017 - 12:54 Comment #11
Submitted by andreychek on Tue, 05/16/2017 - 14:06 Comment #12
I wouldn't switch to Nginx. I'd recommend continuing to use Apache, and we can just make a few tweaks to your setup there.
First, do you have the ability to add in swap space?
512MB of swap can go a long way.
Second, what is the output of this command:
grep MaxClients /etc/apache2/apache2.conf
Submitted by jims45 on Tue, 05/16/2017 - 14:11 Comment #13
I can add swap space and report back np.
grep MaxClients /etc/apache2/apache2.conf = NO OUTPUT
Submitted by andreychek on Tue, 05/16/2017 - 14:32 Comment #14
Hmm, try adding a "-i" to that, so it'd look like this:
grep -i MaxClients /etc/apache2/apache2.conf
Submitted by jims45 on Tue, 05/16/2017 - 14:33 Comment #15
Swap created, will monitor. Thanks
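For anyone following along, a common way to add a swap file on Linux is sketched below. The function only prints the commands so they can be reviewed before being run as root. The name swapfile_commands, the 512 MB size and the /swapfile path are illustrative assumptions, not details taken from this server:

```shell
# Hypothetical helper: print (not run) the usual commands for creating and
# enabling a swap file. $1 = size in MB, $2 = path of the swap file.
swapfile_commands() {
    size_mb=${1:-512}
    path=${2:-/swapfile}
    printf 'dd if=/dev/zero of=%s bs=1M count=%s\n' "$path" "$size_mb"  # allocate the file
    printf 'chmod 600 %s\n' "$path"                                     # root-only access
    printf 'mkswap %s\n' "$path"                                        # format as swap
    printf 'swapon %s\n' "$path"                                        # enable immediately
}

# Typical use: review the output, then run each command as root.
#   swapfile_commands 512 /swapfile
```

Swap enabled this way lasts until the next reboot. To keep it, the usual approach is an /etc/fstab entry such as "/swapfile none swap sw 0 0", and "free -m" should then show a non-zero Swap line.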
Submitted by jims45 on Tue, 05/16/2017 - 14:35 Comment #16
grep -i MaxClients /etc/apache2/apache2.conf still = NO OUTPUT
Submitted by jims45 on Tue, 05/16/2017 - 14:48 Comment #17
Submitted by andreychek on Tue, 05/16/2017 - 15:36 Comment #18
Hmm, that may be located in a different file on your system there.
How about this command? It'll search all the Apache configs:
find /etc/apache2 -type f | xargs grep -i maxclients
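A variation on that search looks for more than one directive name at once and skips commented-out lines. This is a minimal sketch; the function name find_apache_directive is made up for illustration. It can help because Apache 2.4 renamed MaxClients to MaxRequestWorkers, so either name may appear depending on how the system was installed:

```shell
# Hypothetical helper: recursively search a config directory for lines that
# set any of the given directives. $1 = directory, remaining args = names.
# Matching is case-insensitive, and lines starting with "#" are not matched.
find_apache_directive() {
    dir=$1
    shift
    pattern=$(printf '%s|' "$@")   # e.g. "MaxClients|MaxRequestWorkers|"
    pattern=${pattern%|}           # drop the trailing "|"
    grep -rniE "^[[:space:]]*(${pattern})[[:space:]]" "$dir"
}

# Typical use:
#   find_apache_directive /etc/apache2 MaxClients MaxRequestWorkers
```

Each match is printed as file:line:content, which also shows which file needs editing.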
Submitted by jims45 on Wed, 05/17/2017 - 09:55 Comment #19
Still no output from that command
Submitted by jims45 on Wed, 05/17/2017 - 10:02 Comment #20
prefork.conf
Submitted by jims45 on Wed, 05/17/2017 - 10:12 Comment #21
This is the nearest config I can find but no directive for max clients.
Submitted by andreychek on Wed, 05/17/2017 - 10:13 Comment #22
Darnit, I'm sorry, it looks like new installs of Ubuntu 14.04 change the setup a bit, and don't use MaxClients.
I have "MaxClients" on my test system here, but only because it was upgraded from an older system.
Here is what I'd do: edit the config file you shared above, as that's exactly what we need to tweak. In that file, change MaxRequestWorkers from 150 to 75, and then restart Apache.
Having that set to 150 can be too much, and can allow a server to become overloaded if it doesn't have enough RAM to handle 150 requests at once.
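A rough way to sanity-check a worker limit is to divide the memory you are willing to give Apache by the average size of one Apache process. The sketch below is a rule-of-thumb calculation, not an official formula; the function name estimate_max_workers and the 750 MB budget in the usage line are illustrative assumptions:

```shell
# Hypothetical helper: estimate how many worker processes fit in a memory
# budget. $1 = budget in MB; stdin = one RSS value (KiB) per line, one line
# per running worker. Prints budget / average RSS, rounded down.
estimate_max_workers() {
    awk -v budget="$1" '
        NF { total += $1; n++ }
        END {
            if (n == 0 || total == 0) { print 0; exit }
            avg_mb = total / n / 1024
            printf "%d\n", budget / avg_mb
        }
    '
}

# Typical use, with a budget of 750 MB for Apache:
#   ps -o rss= -C apache2 | estimate_max_workers 750
```

The result is only a starting point: RSS includes memory shared between workers, and it ignores whatever PHP, MySQL and mail scanning need alongside Apache.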
Submitted by jims45 on Wed, 05/17/2017 - 10:21 Comment #23
Done that, will monitor and see what happens. Thanks for your help!
Submitted by jims45 on Mon, 05/22/2017 - 11:31 Comment #24
Hello. It crashed again; attached is the result of the dmesg command. Thanks
Submitted by andreychek on Mon, 05/22/2017 - 12:06 Comment #25
I have a few ideas that should help, but one quick thing -- how many emails are in your mail queue right now?
You can determine that by running this command:
mailq | tail -1
Submitted by jims45 on Mon, 05/22/2017 - 12:48 Comment #26
This is the output:
-- 524 Kbytes in 7 Requests.
Submitted by andreychek on Mon, 05/22/2017 - 13:23 Comment #27
Okay, it doesn't look like there are many.
Is the ClamAV service being hosted on a different server in your setup there?
Submitted by jims45 on Mon, 05/22/2017 - 13:29 Comment #28
See screenshot, it looks like ClamAV is not activated?
Submitted by andreychek on Mon, 05/22/2017 - 13:33 Comment #29
Thanks, from the output there it looks like it's set to use the standalone scanner.
That's okay, though during a burst of email it could end up using a lot of RAM, which could be part of the issue you're experiencing.
My suggestion would be to change the "Virus Scanning Program" option from "Standalone" to "Server Scanner".
That should cut down on a lot of RAM usage.
Submitted by jims45 on Mon, 05/22/2017 - 14:01 Comment #30
Getting an error when I try to activate it:
"The server virus scanner cannot be selected unless the clamd virus scanning server is running"
Update: I enabled the ClamAV server and rebooted, then the error disappeared. I will try this setting and see what happens. Thanks
Submitted by andreychek on Mon, 05/22/2017 - 14:01 Comment #31
You may want to ensure the ClamAV service is running. You can do that in Webmin -> Servers -> Bootup and Shutdown.
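From a shell, a quick check is to look for the daemon in the process list. This is a minimal sketch, assuming the daemon's process name is clamd (the name used in the error message quoted earlier in this thread); the function name has_process is made up for illustration:

```shell
# Hypothetical helper: succeed if a process with exactly this name appears
# in a list of command names read from stdin (one name per line).
has_process() {
    grep -qxF -- "$1"
}

# Typical use:
#   ps -eo comm= | has_process clamd && echo "clamd is running"
```

The exact-match option means a one-off clamscan or clamdscan run is not mistaken for the resident daemon.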
Submitted by Vipul.K on Wed, 07/24/2019 - 02:26 Comment #32
@andreychek I'm facing the exact same problem. Can you elaborate on how to change the "Virus Scanning Program" option from "Standalone" to "Server Scanner"?