Critical multi destination backup corruption

Since one of the latest Virtualmin updates, multi-destination backup no longer works correctly.

If destination 1 is a local folder and destination 2 is an FTP server, the backup is created on destination 1, transferred to the FTP server, and then deleted from destination 1. The backup log reports success without indicating this behavior. That is, the backup on destination 1 is silently deleted.

The bug was probably introduced in 3.95gpl or 3.96gpl.

In my case this was a scary discovery. My long-term backup is on destination 1 (a NAS) and my alternative "fail safe" one-day backup is on an FTP server. Please fix this critical bug ASAP.

see http://www.virtualmin.com/node/24414#comment-109803

Thank you!

Status: 
Closed (fixed)

Comments

That's a pretty bad failure mode :-(

Are you backing up each domain into a separate file, or do you have one large file for all domains?

FYI, I have been unable to reproduce this bug..

Could you also post the backup process output from Virtualmin?

Hi Jamie,

The backup options are:

  • All virtual servers with all options
  • Do strftime-style time substitutions on file or directory name
  • Transfer each virtual server after it is backed up
  • One file per server
  • Continue with other features and servers
  • Full (all files)
  • Command to run before backup: mount...
  • Command to run after backup: umount...

(I tried to attach the log file but it does not work: I get a JavaScript error with Opera and can't log in with Firefox using my freshly reset password...) Here is the log file, anonymized and trimmed to the backup log of only 2 domains plus the beginning and end:

LOG START

Backup is complete. Final size was 7.07 GB. Total backup time was 45 minutes, 39 seconds.

Sent by Virtualmin at: https://webch.somedomain.net:10000

Running pre-backup command .. .. done

Creating backup for virtual server ch2-fr.somedomain.net .. Copying virtual server configuration .. .. done

Copying Apache aliases ..
.. done

Creating TAR file of home directory ..
.. done

Uploading archive to FTP server 192.168.2.33 ..
.. done

.. completed in 1 seconds

Creating backup for virtual server somedomain.jp .. Copying virtual server configuration .. .. done

Backing up Cron jobs ..
.. none defined.

Saving mail aliases ..
.. done

Saving mail and FTP users ..
.. done

Backing up mail and FTP user Cron jobs ..
.. none to backup

Copying Apache virtual host configuration ..
.. done

Copying Logrotate configuration ..
.. done

Dumping MySQL database jp_domaliasname_jp ..
.. done

Backing up Webmin ACL files ..
.. done

Creating TAR file of home directory ..
.. done

Uploading archive to FTP server 192.168.2.33 ..
.. done

.. completed in 4 minutes, 51 seconds

..................... ... other domains ... .....................

Saving Virtualmin configuration ..
.. done

Saving templates and plans ..
.. done

Saving email templates ..
.. done

Saving custom fields, links, categories and shells ..
.. done

Saving custom script installers ..
.. done

Saving scheduled backups ..
.. done

Saving FTP directory restrictions ..
.. done

Saving DKIM settings ..
.. not installed

Saving greylisting settings ..
.. not installed

Saving mail server configuration ..
.. done

.. done

Uploading archive to FTP server 192.168.2.33 .. .. done

37 servers backed up successfully, 0 had errors. 8 Virtualmin configuration settings backed up successfully. Running post-backup command .. .. done

LOG END

That all looks OK to me..

Are you using pre- or post-backup commands to mount and un-mount a network filesystem for the local destination?

If anyone who is seeing this could grant me remote access to their system to debug it, that would greatly speed up finding the underlying cause ..

Unfortunately I can't grant you access (it does not depend on me).

I see the problem on two Virtualmin systems, one pure web, the other pure mail. Both were recently migrated from CentOS 5 to CentOS 6 using Virtualmin backups.

And yes, I am mounting a NAS before the backup and unmounting it after.

I have managed to trace when the problem first appeared on both machines. The dates differ, but fortunately both are recent: 5th December for one and 28th November for the other. In both cases it was the day after updating to "wbm-virtual-server.noarch 0:3.95.gpl-1". On the web machine it was the only update applied; on the mail machine it was applied along with many other system updates.

I checked all the update logs and it seems that we never updated to Virtualmin 3.96, so the bug may have been introduced in 3.95 and not necessarily in 3.96 (as bleck supposed in the forum post).

You might also want to try upgrading to 3.97, which we just released.

Updated to 3.97 but it does not fix the problem.

In the backup config page, the "Backup level" option has changed: there are radio buttons to choose from, with a label to their left reading "Neither (all files, and don't update incremental state)". It's really not clear what each is for.

For my existing backup, neither was checked. Thinking this might be the cause of the problem, I checked the first radio button (which I suppose means full backup), but it did not make a difference.

Just so we can narrow this down - can you trigger the same behavior by doing a backup from the command line? You can use a command like :

virtualmin backup-domains --domain whatever.com --all-features --newformat --dest /backup --dest ssh://user:pass@host:/backup

Once I know exactly what shell API command triggers this issue, I can better reproduce it.

The backups level issue is a separate problem in Virtualmin 3.97 - see http://virtualmin.com/node/24505 for a fix.

It works, except that I used ftp instead of ssh and there was a typo in "backup-domains": there should be no "s" at the end (for anyone else who may try the command).

The backup was created in the folder and then transferred; however, no Virtualmin configuration file was made, only:

mydomain.com.tar.gz mydomain.com.tar.gz.dom mydomain.com.tar.gz.info

LOG START

Creating backup for virtual server mydomain.com .. Copying virtual server configuration .. .. done

Backing up Cron jobs ..
.. none defined.

Saving mail aliases ..
.. done

Saving mail and FTP users ..
.. done

Backing up mail and FTP user Cron jobs ..
.. none to backup

Backing up Webmin ACL files ..
.. done

Creating TAR file of home directory ..
.. done

.. completed in 1 seconds

Uploading archive to FTP server 192.16.13.1 .. .. done

1 servers backed up successfully, 0 had errors.

Backup completed successfully. Final size was 18.69 kB

LOG END
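A quick way to confirm that the local destination still holds the full per-domain set after a run is to test for the three files listed above. This is only a minimal sketch and not part of Virtualmin: the function name check_domain_backup_set is made up for illustration, and it assumes the one-file-per-server layout with the .tar.gz, .tar.gz.dom and .tar.gz.info names shown in this comment.

```shell
#!/bin/sh
# Hypothetical helper (not a Virtualmin command): verify that the three
# files seen in this thread exist for one domain in a local destination.
# $1 = local destination directory, $2 = domain name
check_domain_backup_set() {
    dir=$1
    domain=$2
    missing=0
    for suffix in tar.gz tar.gz.dom tar.gz.info; do
        if [ ! -f "$dir/$domain.$suffix" ]; then
            # Report each missing file on stderr and remember the failure
            echo "missing: $dir/$domain.$suffix" >&2
            missing=1
        fi
    done
    return $missing
}

# Example (paths are placeholders):
#   check_domain_backup_set /backup mydomain.com && echo "local copy intact"
```

The function returns non-zero if any of the three files is absent, so it can be chained with && or used in an if statement.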

You can use the --all-virtualmin flag to have Virtualmin settings included in the backup.

So were you unable to trigger the bug at all when doing command-line backups?

I also tried --all-virtualmin, but can't reproduce the bug that way. Running the scheduled backup job still triggers the bug.

Running this command in a terminal as root returns : "Command backup-domains.pl was not found".

That was a typo... option should be just "backup-domain", without the 's' at the end.
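For anyone copying the command from earlier in the thread, the corrected form can be assembled as in the sketch below. It only prints the command for review rather than running it. The wrapper name backup_cmd is made up for illustration, the domain, path and credentials are the placeholders from the original example, and --all-virtualmin is the flag mentioned above for including Virtualmin settings.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints the corrected command line
# ("backup-domain", no trailing "s") instead of executing it.
# $1 = domain, $2 = local destination, $3 = remote destination URL
backup_cmd() {
    echo virtualmin backup-domain --domain "$1" \
        --all-features --all-virtualmin --newformat \
        --dest "$2" --dest "$3"
}

# Example with the placeholder values from the original command:
#   backup_cmd whatever.com /backup ssh://user:pass@host:/backup
```

Once the printed line looks right, it can be pasted into a root shell to run the actual backup.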

I just saw the change log from Version 3.95

Version 3.95 (18th October 2012) When running a scheduled backup from within the Virtualmin UI, pre and post backup commands are now run, and old backups purged if configured.

Something tells me that this may be the cause of the problem! It seems to purge more than just old backups. Where and how do you configure that option?

That might explain it - on the backup form, do you have "Delete old backups" set to anything for one or both of your destinations?

Yes, in the UI form, "Delete old backups" is set for both backups, with the same value.

This "time to keep" setting may be misinterpreted by the code that deals with "local files" backups, but only when multiple destinations are set. Local files backups work fine when the UI form is set for a single backup.

Delete old backups is set to NEVER for both of my destinations.

Ok, I found the cause of this now - the bug only happens when the "Transfer each virtual server after it is backed up" box is checked, which is why we didn't see it from a command-line backup.

The work-around until we release a fix is to un-check that box for backups to multiple destinations.
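Because the backup log reports success even when the local copy has been removed, a check of the local destination after each run may help until the fix is released. The following is a rough sketch, not part of Virtualmin: both function names are made up, the .tar.gz pattern is taken from the file names seen in this thread, and -maxdepth and -mmin assume GNU or BSD find. If the destination is a mounted NAS, the check would have to run before the unmount step.

```shell
#!/bin/sh
# Hypothetical helpers (not Virtualmin commands).

# Count archives in a directory modified less than N minutes ago.
# $1 = local destination directory, $2 = age limit in minutes
count_recent_archives() {
    find "$1" -maxdepth 1 -type f -name '*.tar.gz' -mmin "-$2" | wc -l | tr -d ' '
}

# Return non-zero, with a warning on stderr, if no fresh archive exists.
verify_local_destination() {
    count=$(count_recent_archives "$1" "$2")
    if [ "$count" -eq 0 ]; then
        echo "WARNING: no archives newer than $2 minutes in $1" >&2
        return 1
    fi
    echo "OK: $count recent archive(s) in $1"
}

# Example (path and age limit are placeholders):
#   verify_local_destination /backup 120 || echo "local backup missing"
```

The age limit should be somewhat longer than the expected backup duration so that archives written early in the run still count.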

Thanks Jamie, will wait for next release.

Automatically closed -- issue fixed for 2 weeks with no activity.

Hi Jamie,

Since your "not yet" answer, version 3.98 was released. Does it fix this bug? I didn't see it in the changelog.

Yes, the 3.98 release fixes this bug.