Hi,
Given the recent news regarding CentOS, I am migrating all of our Linode servers from CentOS to Ubuntu, starting with our staging server that runs Virtualmin Professional. I've successfully set up a new Ubuntu 20.04.1 LTS Linode and installed Virtualmin using the install script downloaded from the Software Licenses page in my account. Then, I did a full backup of the old CentOS server using the ZIP file format (one file per server, new format) and the new Ubuntu server as the SSH destination for the backup.
Everything worked properly; the backup took about an hour and ten minutes and resulted in about 28 GB worth of files. However, restoring that backup on the Ubuntu machine resulted in something extremely odd.
I started the restore at around 10 PM EST last night (Sunday, January 3) and it seemed to hang on the "Extracting backup archive files" step. I checked it around 1:30 AM this morning (Monday, January 4) and it was still on that step. Finally, when I checked it again around 6 AM the restore process had completed. But what I'm wondering is, why would it take at least 4 hours (if not potentially longer) to restore a 28 GB backup of about 184 virtual servers that only took an hour to actually back up?
I have done restores of backups of similar sizes and contents on other servers (running CentOS though instead of Ubuntu) and they usually complete quite quickly. This is the CLI command I ran to do the restore:
virtualmin restore-domain --source /vm/ --all-domains --all-features --all-virtualmin
Something that might help - I inspected the size of /tmp every time I checked the restoration process. A few minutes after I started the restore last night, when I first checked and found it stuck on the extracting step, the /tmp folder was less than 1 GB in size. When I checked it again around 1 AM and it was still extracting archives, the /tmp folder was still less than 1 GB in size and had barely increased.
The restore was successful according to Virtualmin but some things are still broken (e.g., Nginx throws 'no path in the unix domain socket in upstream "unix:"'), so I may have to try the restore again but wanted to see if there's any way it could be sped up first LOL. This is a fresh Ubuntu 20.04.1 LTS system with 8 GB RAM and 160 GB SSD storage so I'm not quite sure what could be going on, but I am very grateful in advance for any assistance provided!
Comments
Submitted by JamieCameron on Mon, 01/04/2021 - 17:50 Comment #1
When the restore is running, is the disk space in use on the filesystem that contains /home increasing? That would be one way to check that the restore is actually proceeding.
Submitted by JEMEDIACORP on Tue, 01/05/2021 - 15:25 Pro Licensee Comment #2
I'm not sure how Ubuntu partitions its filesystems, but I was noticing that the size of /tmp was very, very slowly increasing as Virtualmin was "extracting backup archive files." This was the step that took the longest, over 4 hours from what I noticed even though the backup I was restoring from took just 1 hour and 10 minutes to be created and was only 28 GB in size. The /tmp directory was increasing by about 100 MB every 20 minutes or so (approximately). I didn't look at /home though, and like I said the rest of the restore process (once extraction was finished) was extremely quick.
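For anyone wanting to run the check discussed above, here is a minimal sketch that samples both the used space on the filesystem holding /home and the size of /tmp at a fixed interval. The function names (fs_used_kb, dir_size_kb, watch_restore) are made up for illustration and are not part of Virtualmin; it only assumes the standard df, du and awk tools.

```shell
#!/bin/sh
# Print the used space (in KiB) of the filesystem that holds a path.
fs_used_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# Print the size (in KiB) of a directory tree, without crossing mount points.
dir_size_kb() {
    du -skx "$1" 2>/dev/null | awk '{ print $1 }'
}

# Sample both values COUNT times, INTERVAL seconds apart.
# Hypothetical helper: watch_restore [INTERVAL] [COUNT]
watch_restore() {
    interval=${1:-60}
    count=${2:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s home_fs_used_kb=%s tmp_kb=%s\n' \
            "$(date '+%H:%M:%S')" "$(fs_used_kb /home)" "$(dir_size_kb /tmp)"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Running `watch_restore 300 48` in a second terminal while the restore is in progress would log one line every five minutes for four hours. If neither number moves between samples, the restore is probably stalled rather than just slow.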
What if you went to Webmin > Webmin Configuration > Advanced Options and set up some other tmp directory to be used globally, or for the Virtualmin Virtual Servers module in particular? Would that speed the restore process up in any way?
Submitted by JEMEDIACORP on Tue, 01/05/2021 - 16:21 Pro Licensee Comment #4
Perhaps, though it raises the question of why /tmp in particular slowed the restore down, if that is indeed the cause. I am running this on a Linode server so it of course features an SSD, and the latest version of Ubuntu 20.04.1 LTS x64. Does Ubuntu have any limitations on /tmp? I'm not quite sure. But I can try to change the directory and re-run the restore and then see if anything changes.
Submitted by JamieCameron on Wed, 01/06/2021 - 22:14 Comment #5
Is /tmp mounted on a different filesystem perhaps? You can check by running the mount command.
Submitted by JEMEDIACORP on Wed, 01/06/2021 - 22:44 Pro Licensee Comment #6
Not that I can see; it looks like /tmp is mounted via / which I think is the only actual disk filesystem on the server. But I am not that great at reading the output of the mount command. I've pasted a snippet of the output of mount below:
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,noexec,relatime,size=4032268k,nr_inodes=1008067,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,size=815284k,mode=755)
/dev/sda on / type ext4 (rw,relatime,quota,usrquota,grpquota,errors=remount-ro)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
none on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
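Reading the full mount output by eye is error-prone, so a quicker way to answer the "is /tmp a separate filesystem" question may be to compare device numbers directly. The sketch below assumes GNU coreutils stat (as shipped with Ubuntu), where the %d format prints the device number; same_filesystem is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed when both paths live on the same mounted filesystem.
# Assumes GNU coreutils stat, where %d is the device number.
same_filesystem() {
    a=$(stat -c '%d' "$1") || return 2
    b=$(stat -c '%d' "$2") || return 2
    [ "$a" = "$b" ]
}

if same_filesystem /tmp /; then
    echo "/tmp is on the same filesystem as /"
else
    echo "/tmp looks like a separate mount (or could not be checked):"
    df -P /tmp
fi
```

On a system with only one disk filesystem mounted on /, this should print the first message, which would rule out a small tmpfs-backed /tmp as the cause.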
Submitted by JamieCameron on Thu, 01/07/2021 - 23:12 Comment #7
Yeah, looks like /tmp is on the same filesystem as /
When you're running a restore and it's taking a long time, do you see any unzip or tar process running with ps? If so, what's the full command line of that process?
Submitted by JEMEDIACORP on Sun, 01/10/2021 - 17:45 Pro Licensee Comment #8
When I ran the restore that started this whole thread, I did see some tar commands in Virtualmin's "Running Processes" page. It didn't look like their CPU usage was very high and unfortunately I don't remember the full command line that was used. However, I have to do another restore of our production cluster in a few days so when I run that restore I'll keep an eye out for those processes, and also note if it takes a long time to restore the backup like it did before or if things go more smoothly. I will once again be restoring to an Ubuntu 20.04.1 LTS 64-bit system.
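To capture the full command line during the next restore rather than relying on memory, something like the following could be used. It is only a sketch: filter_archivers is a hypothetical helper, and the list of command names (tar, unzip, zip, gzip, gunzip, bzip2) is a guess at what a restore might spawn, not a statement of what Virtualmin actually runs.

```shell
#!/bin/sh
# Keep only lines whose command (third field, basename) is an archiver.
# Expected input format: PID ELAPSED COMMAND [ARGS...]
filter_archivers() {
    awk '{
        n = split($3, parts, "/")
        cmd = parts[n]
        if (cmd ~ /^(tar|unzip|zip|gzip|gunzip|bzip2)$/) print
    }'
}

# Usage on a live system (procps ps, as shipped with Ubuntu):
#   ps -eo pid=,etime=,args= | filter_archivers
```

Redirecting that output to a file every few minutes keeps a record of each process ID, how long it has been running, and its complete arguments, which is the information requested above.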
Submitted by JamieCameron on Sun, 01/10/2021 - 23:13 Comment #9
Thanks, that would be useful. Also, if the zip command isn't running, check for other processes that are using up a lot of CPU time.
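A rough way to do that check from the shell, assuming the procps ps found on Ubuntu, is to sort processes by their CPU percentage and keep the top few. The top_cpu name is hypothetical, and this is a sketch rather than a recommended procedure.

```shell
#!/bin/sh
# Sort "PID %CPU COMMAND..." lines by the %CPU column, highest first,
# and keep the first N lines (default 10).
top_cpu() {
    LC_ALL=C sort -k2,2nr | head -n "${1:-10}"
}

# Usage on a live system:
#   ps -eo pid=,pcpu=,args= | top_cpu 10
```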