monitor.pl and collectinfo.pl don't get along with procps 3.2.7 on CentOS 5.6

The cron'd monitor.pl and collectinfo.pl scripts are causing something to complain about syntax. This generates quite a bit of email, as each of them runs every 5 minutes!

Since there's no line number reported, this is not an easy bug to find.

[root@Virtualmin ~]# /etc/webmin/virtual-server/collectinfo.pl
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ
[root@Virtualmin ~]# /etc/webmin/status/monitor.pl
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ

[root@Virtualmin ~]# crontab -l | egrep monitor.pl\|collectinfo.pl
4,9,14,19,24,29,34,39,44,49,54,59 * * * * /etc/webmin/virtual-server/collectinfo.pl
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /etc/webmin/status/monitor.pl
[root@Virtualmin ~]#

Operating system CentOS Linux 5.6
Webmin version 1.540
Virtualmin version 3.84.gpl GPL
Kernel and CPU Linux 2.6.24-12-pve on i686
Status: Closed (fixed)

Comments

Sounds like the parameters accepted by ps have changed.

What output do you get on your system if you run:

ps V
ps --cols 2048 -eo user,ruser,group$width,rgroup,pid,ppid,pgid,pcpu,vsz,nice,etime,time,stime,tty,args

The work-around to suppress these messages is to SSH in as root and run:

crontab -e

and then add 2>/dev/null to the end of the collectinfo.pl and monitor.pl commands.
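
For anyone who prefers not to edit the lines by hand, here is a minimal sketch of the same work-around. The function name silence_webmin_jobs is made up for this example; it only uses awk, and it assumes the two jobs appear in root's crontab as shown in the report above. Note that this hides everything the scripts write to stderr, not just the ps warning.

```shell
# Hypothetical helper: read a crontab on stdin and append "2>/dev/null"
# to every active line that runs collectinfo.pl or monitor.pl.
# Comment lines and lines that already discard stderr are left alone.
silence_webmin_jobs() {
  awk '
    /(collectinfo|monitor)\.pl/ && !/^[ \t]*#/ && !/2>\/dev\/null/ {
      sub(/[ \t]*$/, " 2>/dev/null")
    }
    { print }
  '
}

# Preview the result without changing anything:
#   crontab -l | silence_webmin_jobs
# Install it once the preview looks right:
#   crontab -l | silence_webmin_jobs | crontab -
```

Running the preview first is deliberate, since piping straight into crontab - replaces the whole crontab.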

I'm not going to give you the full output, as I don't want you to judge me for having Railo running on the same VM ;) (Sending stdout to /dev/null shows that your new arguments don't print anything to stderr.)

[root@Virtualmin ~]# ps --cols 2048 -eo user,ruser,group$width,rgroup,pid,ppid,pgid,pcpu,vsz,nice,etime,time,stime,tty,args > /dev/null
[root@Virtualmin ~]# ps --cols 2048 -eo user,ruser,group$width,rgroup,pid,ppid,pgid,pcpu,vsz,nice,etime,time,stime,tty,args | head
USER     RUSER    GROUP    RGROUP     PID  PPID  PGID %CPU    VSZ  NI     ELAPSED     TIME STIME TT       COMMAND
root     root     root     root         1     0     1  0.0   2164   0    11:22:44 00:00:00 03:34 ?        init [3]     
root     root     root     root        53     1     1  0.0    104   0    11:22:44 00:00:00 03:34 ?        [init-logger]
root     root     root     root       101     1   101  0.0   2268  -4    11:22:44 00:00:00 03:34 ?        /sbin/udevd -d
root     root     root     root       194     1   194  0.0  18360   0    11:22:44 00:00:00 03:34 ?        brcm_iscsiuio
root     root     root     root       382     1   382  0.0   1820   0    11:22:43 00:00:00 03:34 ?        syslogd -m 0
named    named    named    named      422     1   422  0.0  60300   0    11:22:43 00:00:06 03:34 ?        /usr/sbin/named -u named
root     root     root     root       462     1   461  0.0  26700   0    11:22:42 00:00:04 03:34 ?        /usr/sbin/snmpd -Lf /dev/null -p /var/run/snmpd.pid -a
root     root     root     root       474     1   474  0.0   7228   0    11:22:42 00:00:00 03:34 ?        /usr/sbin/sshd
root     root     root     root       486     1   486  0.0   2840   0    11:22:42 00:00:00 03:34 ?        xinetd -stayalive -pidfile /var/run/xinetd.pid
[root@Virtualmin ~]# ps V
procps version 3.2.7
[root@Virtualmin ~]#

Way ahead of you on the suppression though - thanks for the tip!

Ok, so it looks like neither of those printed the message you saw ...

Did this start happening when you upgraded to CentOS 5.6?

I just updated a test system to CentOS 5.6 and procps 3.2.7, but I am not seeing those errors.

Could you posit what file or line is causing the error? I can do some digging myself too...

Otherwise, what other info can I provide to help find the source?
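
One way to dig for the source without a line number is to trace the script and list every ps command it launches. This is only a sketch: it assumes strace is installed and that the warning comes from a ps call, and the name trace_ps_calls is made up for the example.

```shell
# Hypothetical helper: trace a script and list every ps invocation it makes.
# Assumes strace is installed and that you run this as root.
trace_ps_calls() {
  trace_file=$(mktemp) || return 1
  # -f follows child processes, -e trace=execve logs only program launches,
  # -s 512 keeps long argument strings from being cut short.
  strace -f -e trace=execve -s 512 -o "$trace_file" "$@" >/dev/null 2>&1
  # Keep only the launches whose program path ends in /ps.
  # Failed PATH lookups (ENOENT) for ps are listed as well.
  grep -E 'execve\("[^"]*/ps"' "$trace_file"
  rm -f "$trace_file"
}

# Example (not run here):
#   trace_ps_calls /etc/webmin/status/monitor.pl
```

The argument list printed for each match can then be run by hand to see which one prints the warning.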

Any chance I could login to your system myself and see what is going wrong here?

Paste your public SSH key here and shoot me an email?

Ok, I have sent you an email with my public key.

Ok, I see the bug - you had your system configured to use the FreeBSD-style ps command instead of the Linux one, at Webmin -> System -> Running Processes -> Module Config. I switched it back to Linux and all is OK now.
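
To confirm from the shell that a script is quiet after a change like this, you can discard stdout and look only at stderr, which is what cron mails out. A minimal sketch, with helper names (stderr_only, emits_ps_warning) invented for the example and the matched text taken from the warning quoted at the top of this thread:

```shell
# Print only what a command writes to stderr; stdout is discarded.
# The order matters: stderr is pointed at the current stdout first,
# then stdout itself is sent to /dev/null.
stderr_only() {
  "$@" 2>&1 >/dev/null
}

# Succeeds (exit status 0) if the command still emits the
# procps "bad syntax" warning on stderr.
emits_ps_warning() {
  stderr_only "$@" | grep -q 'bad syntax'
}

# Example (not run here):
#   if emits_ps_warning /etc/webmin/status/monitor.pl; then
#     echo "monitor.pl still triggers the ps warning"
#   fi
```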

Never saw that screen before. Thanks.

Automatically closed -- issue fixed for 2 weeks with no activity.