monitor.pl and collectinfo.pl don't get along with procps 3.2.7 on CentOS 5.6 [#17823]

Submitted by Myke on Mon, 04/11/2011 - 09:32

The cron'd monitor.pl and collectinfo.pl scripts are causing something to complain about syntax. This generates quite a bit of email, as they run every 5 minutes!

Since there's no line number reported, this is not an easy bug to find.

[root@Virtualmin ~]# /etc/webmin/virtual-server/collectinfo.pl
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ
[root@Virtualmin ~]# /etc/webmin/status/monitor.pl
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ

[root@Virtualmin ~]# crontab -l | egrep monitor.pl\|collectinfo.pl
4,9,14,19,24,29,34,39,44,49,54,59 * * * * /etc/webmin/virtual-server/collectinfo.pl
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /etc/webmin/status/monitor.pl
[root@Virtualmin ~]# 

Operating system    CentOS Linux 5.6
Webmin version    1.540
Virtualmin version   3.84.gpl GPL
Kernel and CPU    Linux 2.6.24-12-pve on i686

Status: Closed (fixed)

Comments

Submitted by JamieCameron on Mon, 04/11/2011 - 12:19 Comment #1

Sounds like the parameters accepted by ps have changed.

What output do you get on your system if you run :

ps V
ps --cols 2048 -eo user,ruser,group$width,rgroup,pid,ppid,pgid,pcpu,vsz,nice,etime,time,stime,tty,args

The workaround to suppress these messages is to SSH in as root and run:

crontab -e

and then add 2>/dev/null to the end of the collectinfo.pl and monitor.pl commands.
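
As a rough sketch of that workaround, the filter below appends 2>/dev/null to the collectinfo.pl and monitor.pl entries in crontab text read from standard input. The function name quiet_cron_lines is made up for illustration; it is not part of Webmin or Virtualmin.

```shell
# quiet_cron_lines (hypothetical helper): read crontab text on stdin and
# append " 2>/dev/null" to the collectinfo.pl and monitor.pl entries.
# Comment lines and entries that already end in 2>/dev/null are left alone.
quiet_cron_lines() {
  awk '
    /(collectinfo|monitor)\.pl/ && !/^[ \t]*#/ && !/2>\/dev\/null[ \t]*$/ {
      sub(/[ \t]+$/, "")          # drop trailing whitespace first
      print $0 " 2>/dev/null"
      next
    }
    { print }
  '
}

# Preview the result without changing anything:
#   crontab -l | quiet_cron_lines
# Install it once the preview looks right:
#   crontab -l | quiet_cron_lines | crontab -
```

Note that this hides everything the two scripts write to stderr, including genuine errors, so it silences the email rather than fixing the cause.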

Submitted by Myke on Mon, 04/11/2011 - 14:01 Comment #3

Not going to give you the full output, as I don't want you to judge me for having Railo running on the same VM ;) (And the stdout to null shows that your new arguments don't output to stderr)

[root@Virtualmin ~]# ps --cols 2048 -eo user,ruser,group$width,rgroup,pid,ppid,pgid,pcpu,vsz,nice,etime,time,stime,tty,args > /dev/null 
[root@Virtualmin ~]# ps --cols 2048 -eo user,ruser,group$width,rgroup,pid,ppid,pgid,pcpu,vsz,nice,etime,time,stime,tty,args | head
USER     RUSER    GROUP    RGROUP     PID  PPID  PGID %CPU    VSZ  NI     ELAPSED     TIME STIME TT       COMMAND
root     root     root     root         1     0     1  0.0   2164   0    11:22:44 00:00:00 03:34 ?        init [3]      
root     root     root     root        53     1     1  0.0    104   0    11:22:44 00:00:00 03:34 ?        [init-logger]
root     root     root     root       101     1   101  0.0   2268  -4    11:22:44 00:00:00 03:34 ?        /sbin/udevd -d
root     root     root     root       194     1   194  0.0  18360   0    11:22:44 00:00:00 03:34 ?        brcm_iscsiuio
root     root     root     root       382     1   382  0.0   1820   0    11:22:43 00:00:00 03:34 ?        syslogd -m 0
named    named    named    named      422     1   422  0.0  60300   0    11:22:43 00:00:06 03:34 ?        /usr/sbin/named -u named
root     root     root     root       462     1   461  0.0  26700   0    11:22:42 00:00:04 03:34 ?        /usr/sbin/snmpd -Lf /dev/null -p /var/run/snmpd.pid -a
root     root     root     root       474     1   474  0.0   7228   0    11:22:42 00:00:00 03:34 ?        /usr/sbin/sshd
root     root     root     root       486     1   486  0.0   2840   0    11:22:42 00:00:00 03:34 ?        xinetd -stayalive -pidfile /var/run/xinetd.pid
[root@Virtualmin ~]# ps V
procps version 3.2.7
[root@Virtualmin ~]#

Way ahead of you on the suppression though - thanks for the tip!

Submitted by JamieCameron on Mon, 04/11/2011 - 14:32 Comment #4

Ok, so it looks like neither of those printed the message you saw ...

Did this start happening when you upgraded to CentOS 5.6?

Submitted by Myke on Mon, 04/11/2011 - 14:38 Comment #5

About that time, yes.

Submitted by JamieCameron on Mon, 04/11/2011 - 16:50 Comment #6

I just updated a test system to CentOS 5.6 and procps 3.2.7, but am not seeing those errors ..

Submitted by Myke on Mon, 04/11/2011 - 17:05 Comment #7

Could you posit what file or line is causing the error? I can do some digging myself too...

Otherwise, what other info can I provide to help find the source?
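
One way to do that digging is to look at only the stderr of each candidate command, which is the same check as redirecting stdout to /dev/null in comment #3. The two helpers below are a minimal sketch; the names stderr_of and warns_bad_syntax are invented for this example.

```shell
# stderr_of (hypothetical helper): run a command, discard its stdout,
# and print only what it wrote to stderr.
stderr_of() {
  "$@" 2>&1 >/dev/null
}

# warns_bad_syntax (hypothetical helper): succeed if the command's stderr
# contains the "bad syntax" text from the warning reported above.
warns_bad_syntax() {
  stderr_of "$@" | grep -q "bad syntax"
}

# Example use:
#   warns_bad_syntax /etc/webmin/status/monitor.pl && echo "monitor.pl triggers it"
#   warns_bad_syntax ps --cols 2048 -eo user,pid,args || echo "this ps call is clean"
```

If strace happens to be installed, running a script under strace -f -e trace=execve -o /tmp/trace.out and searching the output file for ps should show the exact command line the script runs.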

Submitted by JamieCameron on Mon, 04/11/2011 - 17:38 Comment #8

Any chance I could log in to your system myself and see what is going wrong here?

Submitted by Myke on Mon, 04/11/2011 - 19:14 Comment #9

Paste your public SSH key here and shoot me an email?

Submitted by JamieCameron on Tue, 04/12/2011 - 00:03 Comment #10

Ok, I have sent you an email with my public key ..

Submitted by JamieCameron on Tue, 04/12/2011 - 13:34 Comment #11

Ok, I see the bug - you had your system configured to use the FreeBSD-style ps command instead of Linux, at Webmin -> System -> Running Processes -> Module Config. I switched back to Linux and all is OK now.
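
For anyone who wants to check this setting from a shell instead of the Webmin UI, here is a sketch that reads it from the module's key=value config file. Both the file path (/etc/webmin/proc/config) and the key name (ps_style) are assumptions about how the Running Processes module stores this option; verify them on your own system before relying on this.

```shell
# ps_style_of (hypothetical helper): print the value of the ps_style key
# from a key=value config file. The key name ps_style is an assumption.
ps_style_of() {
  sed -n 's/^ps_style=//p' "$1"
}

# Example use (the path is an assumption, not confirmed in this thread):
#   ps_style_of /etc/webmin/proc/config
# On a Linux host, a value such as "freebsd" here would match the
# misconfiguration described above.
```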

Submitted by Myke on Tue, 04/12/2011 - 15:59 Comment #12

Never saw that screen before. Thanks.

Submitted by Issues on Tue, 04/26/2011 - 17:20 Comment #13

Automatically closed -- issue fixed for 2 weeks with no activity.