Backup bandwidth not counted towards virtual server bandwidth usage?

Hi,

Nothing highly urgent, but I still wish to address this issue: one of our customers elected to do hourly, daily, weekly and monthly scheduled full backups of his site to S3, using the "Scheduled backups" feature of Virtualmin Pro. At 150 MB per backup and 25 runs per day (24 hourly plus 1 daily), that is 3.75 GB per day, or roughly 116 GB per month...

Quite a few issues here:

1) E.g. the graph shows: Mail transferred (451.10 MB), Website traffic (4.57 GB), FTP transfers (1.53 MB). His allocation is 10 GB per month, so his site traffic is within limits. Unfortunately, backup bandwidth is not counted at all, so he doesn't pay for it and isn't aware of how much of the server it uses.

2) His hourly backups run at :00, the same minute as all the other on-the-hour cron jobs, and the daily one runs at midnight, the worst time of day for the server. The same goes for the weekly and monthly ones.

3) The daily and hourly backups execute at the same time (OK, with 3.75 they shouldn't anymore), but there is still no need for both.

4) If 2-3 other customers elect to do the same, the disks will do nothing but seek, which isn't sane.

5) The backup process uses lots of CPU (70% of 1 CPU), RAM (>250 MB) and disk seeks (data is read, written, read, rewritten, re-read, rewritten and re-read before finally being sent out), because of the multiple steps taken one after the other. That's a big one. We had to add around 30% more server resources to cover the backup peaks!

So we have to disable the scheduled backups facility for customers, to avoid abuse and resource misuse, until things are solved (accounting for bandwidth, and scheduling site backups away from round hours, at least when customers schedule the backups themselves).

Any resolution of items 1-5 above (not interfering with our own backups, of course) would be welcome.

Status: Closed (fixed)

Comments

This one is more urgent:

I'm trying to find where to disable scheduled backups server-wide for everyone except the root user.

I think I saw it somewhere but can't find where, and a search on "schedule backups" gives only the reseller setting.

You can control this on a per-domain basis in Virtualmin at Administration Options -> Edit Owner Limits -> Edit capabilities for virtual servers -> Can make backups.

This can also be disabled at the plan level.

I will take a look at the underlying issue of backups not being counted towards bandwidth use.

On the edit page of the Default plan I see:

Default available features: "Automatic, based on initial features" is selected

Default editing capabilities: "Automatic, based on other limits" is selected

In each of our other plan types we have different settings.

I'm trying to understand this, but it confuses me quite a bit, especially this help text:

Default available features

When this option is set the Automatic (as it is by default), new top-level virtual servers will have their allowed features set based on those initially enabled when the server is created. If you want to give the server owner access to additional features by default, select the Selected below radio button and check the boxes for the features you want to grant.

Also, this seems an amazingly broad set of features allowed by default for users who can create sub-servers:

Default editing capabilities

These checkboxes determine which capabilities of their virtual servers domain owners can edit by default. Once a server is created, these can be changed on the Edit Owner Limits page.

If the Automatic option is selected, limits are determined based on whether the virtual server owner is allowed to create sub-servers or not. If so, he will have access to all capabilities. Otherwise, he can only manage users, aliases and edit web pages.

All capabilities is really an awful lot, definitely more than you would expect a normal user to be able to handle without messing up.

Now, if in the Default plan I change "Default editing capabilities" from "Automatic, based on other limits" to "Selected below", how will that be overridden or inherited in the other plans, where I already have "Selected below"?

Also, I'm tempted to press "Save and apply" to apply this to all existing virtual servers, but I'm too scared that other settings would be applied as well (e.g. a plan with no reseller owning the domain: would it remove the reseller from a domain?). So "Save and apply" should probably be more fine-grained (what to apply: everything, or only parts?).

So have you been setting limits for individual domains up till now, or using plans?

If you've always been using plans, you can safely make and apply a plan-level change.

Alternatively, you can take away the ability to do backups from all domains using a script like:

for dom in $(virtualmin list-domains --name-only); do
  virtualmin modify-limits --domain "$dom" --cannot-backup
done

So we can certainly include bandwidth used by backups in Virtualmin - I am working on this for the 3.76 release.

Limiting concurrent backups is tougher though. An alternative is to lower their IO or CPU priority so they don't hurt other processes, which can be done at Webmin -> Webmin Configuration -> Advanced Options.

Cool, thanks.

BTW, counting the bandwidth of backups should probably be optional for user-initiated backups, and admin/root-initiated backups should probably not be counted at all.

I believe something could be done about the classic daily/weekly/monthly scenario, where all 3 can execute at the same time, which doesn't make sense. Also, could at least the minute of the hour be randomized mandatorily for user-initiated scheduled backups? Or maybe they could be added to anacron, which runs its jobs sequentially? Having all backups start at :00 is just killing servers.
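As a rough sketch of the randomized-minute idea above: instead of a truly random value, a stable minute can be derived from the domain name, so each domain keeps the same slot from run to run. This is only an illustration, not how Virtualmin schedules backups; backup_minute, backup_cron_line and /path/to/backup-command are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper (not a Virtualmin command): derive a stable minute
# in the range 0-59 from a domain name, so that each domain's scheduled
# backup starts at a different minute instead of :00.
backup_minute() {
    # cksum prints "<checksum> <byte count>"; keep only the checksum.
    sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $((sum % 60))
}

# Print an hourly crontab line for a domain.
# /path/to/backup-command is a placeholder, not a real program.
backup_cron_line() {
    echo "$(backup_minute "$1") * * * * /path/to/backup-command $1"
}

backup_cron_line example.com
```

Two domains can still land on the same minute, so this spreads the load rather than guaranteeing that backups never overlap.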

I've just implemented a feature to include network traffic resulting from backups created by domain owners in their bandwidth usage .. this will be in the 3.76 release.

Regarding concurrent backups, how about if there was a limit on the number that could be running at the same time?

That would be perfect in the medium term, with SSDs coming. For hard disks, that limit should be 1, or smarter, 1 per mounted drive. Having 2 backups seeking on the same rotating disk makes no real sense, since the backups go 5 times slower due to seek thrashing and will surely kill websites, most probably by kicking the server into swap and winding down the webserver.

From what I have seen on our webservers, a Virtualmin backup uses lots of resources, especially disk I/O (seeks, as long as we're not on SSDs).
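To illustrate what a limit of 1 could look like outside Virtualmin, here is a minimal sketch of a lock wrapper. It is an assumption for illustration only and says nothing about how Virtualmin implements its own limit; run_exclusive and the lock path are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command only if no other run holds the lock.
# Uses mkdir as the lock primitive because directory creation is atomic.
run_exclusive() {
    lock_dir=$1
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "another backup is running, skipping" >&2
        return 75
    fi
    # Run the wrapped command, then release the lock and pass on its status.
    "$@"
    status=$?
    rmdir "$lock_dir"
    return $status
}

# Example with a hypothetical lock path and placeholder command:
# run_exclusive /var/lock/site-backup.lock /path/to/backup-command example.com
```

Using one lock directory per mounted drive would give the "1 per drive" variant. A crashed run leaves the lock behind, so a real setup would need stale-lock handling.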

On a not-completely-related note:

Also, what I have seen is that if there are lots of files (e.g. redmine leaves 250'000 tiny leftover session files in public_html/tmp/sessions (what a strange place, btw), or a customer has 300'000 cache files), the backup takes 1+ hour just to scan and read the files, whereas a "find -atime +30 -delete" takes only a few seconds. Not sure what's so inefficient (tar?), but it's very slow.

A solution that tracks changed disk sectors at the disk level (such as r1soft (it would be nice to have Virtualmin support listed there, btw) or Time Machine-like ones; an open-source one is yet to be seen...) would make LOTS of sense here.

One more idea for the performance issue: one solution avoiding LOTS of unneeded seeks would be to skip the intermediate file when it isn't needed (e.g. with 1 file per site) when backing up to a remote host over ssh.

In that case, the whole backup chain could be just a pipe into ssh:

backup-using-pipes.pl | tar | zip | ssh

That would make the disk seek only to read the data, not to write intermediate files, for a gain in performance and system load of probably a factor of 5 to 10!
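A minimal sketch of such a pipe, using plain tar and gzip rather than anything Virtualmin-specific. stream_backup, the host and the paths are hypothetical, and it assumes the remote account is allowed to run a command such as cat:

```shell
#!/bin/sh
# Hypothetical helper: stream a directory as a gzip-compressed tar archive
# into a sink command, without writing an intermediate archive locally.
stream_backup() {
    src_dir=$1
    shift
    # tar writes the archive to stdout, gzip compresses the stream, and
    # the remaining arguments are run as the command that consumes it.
    tar -cf - -C "$src_dir" . | gzip -c | "$@"
}

# Example with a hypothetical host and paths:
# stream_backup /home/example ssh backup@backup.example.com 'cat > /backups/example.tar.gz'
```

Note that in plain sh the pipeline's exit status is that of the last command only, so a tar failure would go unnoticed without extra checks.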

Using a pipeline into ssh for backups is a little tricky, as current Virtualmin uses scp which doesn't accept stdin input as far as I know.

However, an option to limit concurrent backups is more do-able.. I'll work on this.

That's piping into ssh though .. not scp. Many users backup to accounts that only allow scp via a scponly shell, so piping into ssh running cat won't work.

BTW, I've just implemented support for concurrent backup limits, for inclusion in the next Virtualmin release.

Hi Jamie,

You're right on the scponly shells.

Great news on the concurrent backups limits.

Best Regards, Beat

Ok .. were there any other open issues for this bug?

No, closing it.

Many Thanks, and Happy Easter ! :-)