I love that Cloudmin includes some basic system monitoring. However, now that I have come to use the Cloudmin interface for 90% of my system management, I would like to see some more effective monitoring variables.
Today I ran into a problem where a host failed because its root partition was full, but the disk space alert was never triggered, and I figured out why.
The "disk space free/used" variable can only effectively monitor the total disk space used or free. If you have an advanced partitioning scheme, or simply mount a second partition for storage, the variable instantly becomes ineffective, and unless you understand this weakness you may make the same mistake I did and think you are monitoring the free space on each of your partitions. In my case, the alert was set to go off if ANY system had less than 5 GB of free space. But the backup partition on my host had about 200 GB free, so the metric reported roughly 200 GB free for the whole system, which never triggers the alert. Meanwhile the root partition filled up and the host ground to a halt.
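To make the failure mode concrete, here is a minimal Python sketch. It assumes the variable behaves like a sum of free space across all filesystems, which is how it appeared to behave in my case; the function names and the sample numbers are made up for illustration and are not anything from Cloudmin itself.

```python
GB = 1024 ** 3

def aggregate_alert(free_by_mount, threshold_bytes):
    """Alert only if the summed free space across all mounts is below the threshold."""
    return sum(free_by_mount.values()) < threshold_bytes

def per_mount_alerts(free_by_mount, threshold_bytes):
    """Return the mount points whose own free space is below the threshold."""
    return sorted(m for m, free in free_by_mount.items() if free < threshold_bytes)

# Hypothetical host: full root partition, mostly empty backup partition.
free_by_mount = {"/": 0, "/backup": 200 * GB}
threshold = 5 * GB

print(aggregate_alert(free_by_mount, threshold))   # False: the total hides the problem
print(per_mount_alerts(free_by_mount, threshold))  # ['/']: a per-mount check catches it
```

The aggregate check stays silent because 200 GB is well above 5 GB, while the per-mount check flags the root partition straight away.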
I would like to see a new, more comprehensive set of disk space variables that let you monitor free/used space on mounts, volumes, partitions, drives or any other aspect of disk space, so that you can catch a problem before it crashes the host or the VM.
Comments
Submitted by JamieCameron on Tue, 01/29/2013 - 19:50 Comment #1
Adding the ability to alert on a per-filesystem basis wouldn't fit in too well with Cloudmin's current alerting system .. but I can certainly add collection for free space on the root filesystem specifically.
If you want more flexibility and have Webmin installed on each system, you can set up more flexible alerts at Others -> System and Server Status, including per-filesystem thresholds.
Submitted by Franco Nogarin on Wed, 01/30/2013 - 10:56 Pro Licensee Comment #2
That's a great workaround I was not aware of. I will work with Webmin right away for my edge cases and advanced partition needs. And yes, a root partition variable should go a long way towards protecting the stability of the VM; I would be very happy with that in Cloudmin.
Thanks Jamie!!
Franco
Submitted by JamieCameron on Wed, 01/30/2013 - 11:01 Comment #3
Submitted by Franco Nogarin on Wed, 01/30/2013 - 17:08 Pro Licensee Comment #4
Just a note for others who are confused by what I requested and what Jamie suggested.
Cloudmin cannot do what I am asking, but Webmin on the Cloudmin master is capable of doing exactly what I am asking.
I was going to each VM and host and configuring a root partition monitor in Webmin directly on each one, which is tedious. However, you can do this once, for all systems or for selected systems, directly from Webmin -> Others -> System and Server Status on the Cloudmin master.
The only edge cases where this won't suffice are when you are trying to monitor a VM that does not have Webmin installed, or an external system that cannot run Webmin, such as a ReadyNAS device. But in those cases, perhaps you are trying to manage these systems with the wrong tool.
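For a box without Webmin, a per-filesystem check could in principle be scripted by hand and run from cron. This is only a rough sketch, assuming Python 3 is available on the system (it may well not be on an appliance); the `low_space_mounts` helper and the 5 GB threshold are my own illustration, not anything Webmin or Cloudmin provides.

```python
import shutil

GB = 1024 ** 3

def low_space_mounts(thresholds, usage=shutil.disk_usage):
    """Return {mount: free_bytes} for every mount below its own threshold.

    `thresholds` maps a mount point to the minimum free bytes it should have.
    `usage` is injectable so the logic can be exercised without real disks.
    """
    low = {}
    for mount, min_free in thresholds.items():
        free = usage(mount).free
        if free < min_free:
            low[mount] = free
    return low

if __name__ == "__main__":
    # Hypothetical thresholds; adjust the mount points to match the system.
    thresholds = {"/": 5 * GB}
    for mount, free in low_space_mounts(thresholds).items():
        print(f"WARNING: {mount} has only {free / GB:.1f} GB free")
```

Each mount point gets its own threshold, so a roomy backup partition can no longer mask a full root partition.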
Thanks so much Jamie! Franco