I recently added curl to our site with: apt-get install php5-curl
It seemed to work; however, the site (all of the virtual sites) now seems to get 500 server errors all the time. I'm not sure what is causing it, and it is very bad for business. I'm not sure whether this has to do with the curl installation or not.
The /var/log/apache2/error.log contains lines like these when the pages go "slow" and eventually show the 500 internal error. These are pages that were lightning fast previously.
[warn] mod_fcgid: process 7607 graceful kill fail, sending SIGKILL
[notice] mod_fcgid: process /home/thesite/public_html/wiki/index.php(5989) exit(communication error), terminated by calling exit(), return code: 0
Is there anywhere else I can look to see what is causing this? Perhaps the error is not with mod_fcgid itself but with something else that causes the scripts to time out and fail.
How should I proceed with investigating this?
I've googled around and found this:
exit(communication error), terminated by calling exit(), return code: 0
Means that it has taken more than IPCCommTimeout seconds for mod_fcgid to either write or read from the socket connecting to the child process. We found this mostly happens if the script runs too long waiting for external IO (e.g. database or ldap). IPCCommTimeout is 5 seconds by default.
It seems like the pages that hang and produce the 500 server errors are the ones that fetch data from the MySQL databases. Could something be messed up with our database? And if so, how do we check/fix this?
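To see which scripts hit this timeout most often, the error log can be summarized per script. Here is a minimal Python sketch, assuming the log lines look like the two samples above. The names count_comm_errors and print_summary and the regular expression are illustrative only; they are not part of mod_fcgid or Apache.

```python
import re
from collections import Counter

# Matches lines like:
# [notice] mod_fcgid: process /path/index.php(5989) exit(communication error), ...
COMM_ERROR = re.compile(
    r"mod_fcgid: process (?P<script>\S+)\((?P<pid>\d+)\) exit\(communication error\)"
)


def count_comm_errors(lines):
    """Count mod_fcgid 'communication error' exits per script path."""
    counts = Counter()
    for line in lines:
        match = COMM_ERROR.search(line)
        if match:
            counts[match.group("script")] += 1
    return counts


def print_summary(path="/var/log/apache2/error.log"):
    """Print the scripts with the most communication errors first.

    The default path is the one mentioned in this thread; adjust as needed.
    """
    with open(path, errors="replace") as log:
        for script, hits in count_comm_errors(log).most_common():
            print(hits, script)
```

Calling print_summary() as a user who can read the log lists one line per script. If nearly every script shows up, the cause is more likely something shared (database, disk, network) than one particular page.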
Howdy,
One quick thing you could try that would rule out a problem with FCGID is to change your domains to use CGI.
To do that, go into Server Configuration -> Website Options, and change the PHP Execution Mode there from FCGID to CGI.
As far as any MySQL problems go -- is this problem occurring on all domains, or just one or two? You could always verify that you're able to connect to MySQL from the command line, using "mysql -p".
-Eric
Hello!
This problem is occurring on all domains. Also, sometimes a page shows up fast, and sometimes it takes ages and times out with a 500 server error. The load on the server is almost nonexistent, so pages should really be showing very fast (and they did previously).
I tried changing to CGI instead of FCGID. The server error seems to be gone; however, the pages are still VERY slow. Something is very wrong :(
Connecting to mysql doesn't seem to be a problem.
I went into mysql and ran "show processlist" while two pages seemed to "hang".
It shows something like this:
+------+--------------+-----------+--------------+---------+------+---------------+-------------------------------------------------------------------------------+
| Id   | User         | Host      | db           | Command | Time | State         | Info                                                                          |
+------+--------------+-----------+--------------+---------+------+---------------+-------------------------------------------------------------------------------+
| 1228 | ic           | localhost | ic           | Sleep   |  365 |               | NULL                                                                          |
| 1385 | ic           | localhost | ic           | Sleep   |  116 |               | NULL                                                                          |
| 1386 | root         | localhost | NULL         | Query   |    0 | NULL          | show processlist                                                              |
| 1389 | test         | localhost | test         | Query   |   24 | freeing items | UPDATE BLABLA SET views='471' WHERE ID='6451e674-d983-11e0-9bf1-e41f1331d16c' |
| 1390 | wiki         | localhost | wiki         | Query   |   43 | NULL          | COMMIT                                                                        |
| 1391 | wiki         | localhost | wiki         | Sleep   |   75 |               | NULL                                                                          |
| 1392 | wiki         | localhost | wiki         | Query   |   43 | NULL          | COMMIT                                                                        |
| 1393 | wiki         | localhost | wiki         | Sleep   |   74 |               | NULL                                                                          |
| 1399 | wiki         | localhost | wiki         | Query   |   24 | NULL          | COMMIT                                                                        |
| 1400 | wiki         | localhost | wiki         | Sleep   |   54 |               | NULL                                                                          |
+------+--------------+-----------+--------------+---------+------+---------------+-------------------------------------------------------------------------------+
10 rows in set (0.00 sec)
Look at the Time column on all the processes: it just goes up and up. I'm not sure what is happening there. Could mysql be broken somehow?
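In output like this, the Sleep rows are idle connections; the rows worth a closer look are the ones where Command is something else and Time keeps growing, such as the COMMIT entries. As a rough sketch, the table printed by the mysql client could be filtered with a few lines of Python. The function names and the 10-second threshold are assumptions made for illustration, and the parser assumes no "|" character appears inside the Info column.

```python
def parse_processlist(text):
    """Parse the ASCII table printed by the mysql client into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Data and header rows start with "|"; border rows start with "+".
        if line.startswith("|"):
            rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


def long_running(rows, min_seconds=10):
    """Return non-idle connections whose Time is at least min_seconds.

    min_seconds=10 is an arbitrary illustrative threshold.
    """
    return [
        row for row in rows
        if row["Command"] != "Sleep" and int(row["Time"]) >= min_seconds
    ]
```

Feeding it the saved output of show processlist and printing the Id, Time and Info of each returned row gives a quick list of statements that are stuck rather than merely idle.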
HMMM...
I did a ps -u root showing this:
  PID TTY          TIME CMD
    1 ?        00:00:00 init
 1161 ?        00:00:00 cron
 1332 ?        00:00:00 sshd
 1361 ?        00:00:00 saslauthd
 1391 ?        00:00:00 saslauthd
 1392 ?        00:00:00 saslauthd
 1393 ?        00:00:00 saslauthd
 1394 ?        00:00:00 saslauthd
 1484 ?        00:00:00 apache2
 1864 ?        00:00:00 miniserv.pl
 3157 ?        00:00:00 miniserv.pl
 3259 ?        00:00:00 master
 3369 ?        00:00:00 xinetd
 3374 ?        00:00:00 dovecot
 3375 ?        00:00:00 dovecot-auth
 3382 ?        00:00:00 dovecot-auth
 7446 ?        00:00:00 sshd
 7457 pts/0    00:00:00 bash
11531 ?        00:00:00 cron
11533 ?        00:00:00 sh
11534 ?        00:00:00 monitor.pl
11546 pts/0    00:00:00 ps
It seems miniserv.pl is started twice? Could this be a problem? I'm not sure how that happened. It is for Webmin, right?
I killed one of the processes and, I'll be damned, I wonder if that maybe solved it. The pages seem to respond better now...
Will post more info if something is still wrong.
Ugh...
By killing that process, I can no longer access webmin/virtualmin at myhost:10000
How do I get that working?
I seem to answer myself :P
/etc/init.d/webmin restart
Howdy,
There would indeed be two miniserv processes -- one is for Webmin, and one is for Usermin.
You can see that in more detail by running this command:
ps auxw | grep miniserv
That'll show the full path to the miniserv command, which will show either webmin or usermin.
-Eric
Hmm weird... killing one of the processes and restarting webmin made all my problems go away.