Hello,
We have seen that in some cases (we do not know exactly when) lookup-domain.pl is causing a lot of I/O. Probably it is when it is processing email messages.
When this happens we see a few (no more than 5-6) groups of processes like this one:
\_ /usr/bin/procmail-wrapper -o -a s3.trafficplanethosting.com -d user.name
21778 ? S 0:00 | \_ /usr/bin/procmail-wrapper -o -a s3.trafficplanethosting.com -d user.name
21779 ? RN 0:00 | \_ /usr/bin/perl /usr/libexec/webmin/virtual-server/lookup-domain.pl --exitcode 73 user.name
At this time the I/O is over 50%. When these processes have done their work, the I/O falls back to normal and the load of the server also returns to normal.
I tried to restart lookup-domain-daemon.pl, but when I execute /etc/init.d/lookup-domain restart it returns: Failed to bind to localhost port 11000 at /usr/libexec/webmin/virtual-server/lookup-domain-daemon.pl line 49.
I then tried to stop it with /etc/init.d/lookup-domain stop and checked whether it had actually stopped, but it still appears to be active:
/etc/init.d/lookup-domain stop
[root@s3 ~]# ps fax |grep lookup-domain
9754 pts/1 S+ 0:00 | \_ grep lookup-domain
26657 ? SNs 0:10 /usr/libexec/webmin/virtual-server/lookup-domain-daemon.pl
That is probably the cause of the error, but why is it not stopping?
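For anyone hitting the same "Failed to bind" message, a quick way to confirm that something is still holding the port is to try binding to it yourself. This is only a minimal sketch, assuming the daemon listens on a TCP socket on 127.0.0.1; the function name port_is_free is made up for illustration and is not part of Webmin or Virtualmin.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now.

    Assumption: the daemon uses a plain TCP listener on localhost.
    A False result means some process (possibly the old daemon that
    did not stop) still holds the port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: the port is taken by another socket.
            return False
        return True


# Example (11000 is the port from the error message above):
# print(port_is_free(11000))
```

If this reports that the port is taken while the init script claims the daemon has stopped, the old process is most likely still alive, which would match the ps output above.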
In /var/webmin/lookup-domain-daemon.log the last thousand errors are:
Too many child processes are running already
And it is continuing to fill the log with the same message.
This is already happening on 2 servers.
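To put a number on how often the daemon logs that message, something like the following could be run against the log. It is a rough sketch: the exact log line format is not shown in this thread, so it only checks whether the message text appears anywhere in a line, and the helper name count_error is hypothetical.

```python
MESSAGE = "Too many child processes are running already"


def count_error(lines, message=MESSAGE):
    """Count the lines that contain the given message text.

    Assumption: each log entry is one line; any timestamp or prefix
    around the message is ignored because only a substring match is used.
    """
    return sum(1 for line in lines if message in line)


# Example against the log file named in the post:
# with open("/var/webmin/lookup-domain-daemon.log") as log:
#     print(count_error(log))
```

Running it a few minutes apart would show whether the count is still growing.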
Comments
Submitted by andreychek on Tue, 10/27/2015 - 10:28 Comment #1
Howdy -- we may need Jamie's input to understand why you're receiving the "Too many child processes are running already" error.
However, how many emails are you receiving at a time?
If it's not a huge amount of email, you could always try going into System Settings -> Virtualmin Config -> Spam Filtering, and there, set "Lookup domain for incoming email" to "One at a time".
Submitted by JamieCameron on Tue, 10/27/2015 - 13:34 Comment #2
Is it the lookup-domain.pl process that is using a lot of disk IO, or lookup-domain-daemon? Because if the daemon process is running, most work should be offloaded to it.
Submitted by george.asenov on Wed, 10/28/2015 - 09:17 Comment #3
Hi Jamie,
The thing is that the daemon was throwing the error I mentioned and probably wasn't working, so all the load was on lookup-domain.pl.
After I tried to restart the daemon and it refused to stop, I killed the process and started it again. It appears to be working now; we still sometimes see lookup-domain.pl "jumping" to the top, but it is much better than before.
But how did we reach the point where the daemon starts throwing this error:
Too many child processes are running already
This happened on only two of our servers with the same setup; all the others seem to work as they should.
Submitted by aplima on Tue, 07/26/2016 - 11:22 Comment #4
Hello,
Not sure if I'm posting this in the correct place. I've been struggling with bursts of lookup-domain processes all firing at the same time, after which my server starts killing every process... I have already changed the option to "check one at once" to see how it behaves. No, I don't see any abnormal amount of incoming/outgoing email when this happens...
I had to remove execution permission on the related files, to get back control of my server. Some details:
Linux 2.6.32-642.3.1.el6.x86_64 #1 SMP Tue Jul 12 18:30:56 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Operating system: CentOS Linux 6.8
Webmin version: 1.801
Virtualmin version: 5.03
Kernel and CPU: Linux 2.6.32-642.3.1.el6.x86_64 on x86_64
Processor information: Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz, 1 cores (Yes, it is a VPS)
I gave back execution permissions on the files:
/etc/init.d/lookup-domain
/usr/libexec/webmin/virtual-server/lookup-domain-daemon.pl
/usr/libexec/webmin/virtual-server/lookup-domain.pl
I took a screenshot of the running processes, and I'm willing to provide whatever log details could help find what is causing this.
I have other VPS servers running. The difference is that this is the first one with an x86_64 install; all the others are still running CentOS 5.8 i686. Could that be it? Could 1 GB of RAM be insufficient on x86_64? It's strange, because this server has been running for only a couple of months, without much traffic (one web page only) and with 4 domains using email...
Thanks for any help,
António Lima
Submitted by andreychek on Tue, 07/26/2016 - 11:46 Comment #5
If that occurs, chances are that either your VPS is low on resources at the time, or you're receiving a large influx of email. You may want to review the email logs at the time it's occurring to see if you can see any signs of that occurring.
You could also attempt to disable ClamAV and/or SpamAssassin temporarily to see if that helps with the resource usage, just as a troubleshooting measure.
Lastly, we'd suggest reviewing the output of the command "mailq", as if there are a lot of emails in the queue, that can indicate a problem.
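As a rough illustration of that last check, the summary line of "mailq" can be read programmatically. This sketch assumes Postfix is the mail server (the thread does not say which MTA is in use) and that the output ends with Postfix's usual "-- N Kbytes in M Requests." line or says "Mail queue is empty"; the function name queued_requests is made up for this example.

```python
import re


def queued_requests(mailq_output):
    """Return the number of queued messages reported by Postfix's mailq.

    Assumption: Postfix-style output. Returns 0 for an empty queue and
    None if no recognizable summary line is found (for example, with a
    different mail server).
    """
    if "Mail queue is empty" in mailq_output:
        return 0
    match = re.search(
        r"^-- \d+ Kbytes in (\d+) Requests?\.", mailq_output, re.MULTILINE
    )
    return int(match.group(1)) if match else None


# Example, feeding it the real command output:
# import subprocess
# out = subprocess.run(["mailq"], capture_output=True, text=True).stdout
# print(queued_requests(out))
```

What counts as "a lot" depends on the server; on a small VPS, a count that keeps growing between runs is the thing to watch for.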
If you continue to see issues there, since it appears that you're using Virtualmin GPL there, we'd encourage you to start a new Forum thread with details on the problem you're experiencing, and there we can troubleshoot it a bit more.
Submitted by aplima on Tue, 08/02/2016 - 18:16 Comment #6
Eric, thanks for your reply. The server is stable now, after I changed the setting to "check one at once"... With 1 GB of RAM, I will stick to the x86 version of CentOS, because that's the only difference between this server and the others I have. Thanks again, and my apologies for using this thread.
Keep up the great work guys.
Submitted by andreychek on Tue, 08/02/2016 - 18:33 Comment #7
Great, glad to hear things are working better for you!