Cloudmin process of moving from disk image to LVM

The process is very simple on the Xen level (not sure about KVM but I assume it is the same). I propose a documentation page be written on the topic.

Jamie or Andrey, could you shed some light on what Cloudmin-specific files need to be edited so that Cloudmin is fully aware that the VM has been moved to LVM?

Anyway, here are the steps:

  1. Shut down the VM.

  2. Create logical volumes at least as large as the disk image files you currently use (example commands after these steps).

  3. Copy the contents over with dd:

    dd if=/servers/server1.img of=/dev/vg0/server1_root bs=1M

    dd if=/servers/server1.swap of=/dev/vg0/server1_swap bs=1M

  4. Edit the server's .cfg file, changing the disk line from:

    disk = ['file:/servers/server1.img,xvda1,w','file:/servers/server1.swap,xvda2,w']

    To:

    disk = ['phy:/dev/vg0/server1_root,xvda1,w','phy:/dev/vg0/server1_swap,xvda2,w']

  5. That's it on the Xen side. If you created LVs larger than the original disk images, you'll also need to do an online resize (expand) of the ext4 filesystem to make the extra space usable; online expansion is supported on modern kernels (2.6 and later, I believe).
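For reference, here is a rough sketch of what steps 2 and 5 look like on the command line. The volume group, LV names and sizes are only the examples used above, so adjust them to your setup:

    # Step 2: create LVs at least as large as the existing disk images (example sizes).
    lvcreate -L 100G -n server1_root vg0
    lvcreate -L 1G -n server1_swap vg0

    # Step 5 (only if the new LV is larger than the old image): grow the ext4 filesystem
    # online, from inside the running VM, while it is mounted. With no size argument,
    # resize2fs expands the filesystem to fill the whole device.
    resize2fs /dev/xvda1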

I'm not as familiar with the Cloudmin-side steps after that. You'd definitely have to delete and re-add any Scheduled Backups in Cloudmin (and delete the associated /etc/webmin/server-manager/backuplogs entries).

You probably also have to edit some config files manually in Cloudmin.

Status: 
Closed (fixed)

Comments

The benefit of moving from disk images to LVM is about 50% faster disk I/O, by the way. You completely avoid fragmentation of the host filesystem as well.

You will also need to edit the VM's file in /etc/webmin/servers and add the lines:

xen_filesystem_vg=vg0
xen_filesystem_lv=server1_root
xen_swapfile_vg=vg0
xen_swapfile_lv=server1_swap

Actually, Cloudmin should support this kind of move natively. I will look into implementing this for the next release.

Beautiful, thanks for the fast reply!

Saves me a lot of time. I was going on a grep-hunt through the various hidden Cloudmin config files myself, but I knew I'd never be as certain as the creator of the software himself.

Thank you very much for that. :)

I'll finish the process now and report back.

So the steps for KVM and Xen are:

  1. Copy the data to an LVM logical volume.

  2. Edit the Xen/KVM config files as I described in my first post.

  3. Edit the Cloudmin config as you just described.

One day, when you or Andrey have some spare time, it would be nice to collect all this info into a documentation page, since it's probably a common process once a server's performance needs outgrow disk images.

Update: Just saw your 2nd reply: "Actually, Cloudmin should support this kind of move natively. I will look into implementing this for the next release."

Sounds like a nice idea: point Cloudmin at a Xen/KVM system, tell it to move the disks to LVM, and have it carry out all these steps automatically.

Having it in Cloudmin would be very beneficial to everyone.

Also have to throw in a very important warning:

If the user moves from a disk image to a Logical Volume that is LARGER than the disk image was, then they need to expand the filesystem.

The ONLY, and I mean ONLY safe way to expand the filesystem is using an ONLINE resize WHILE it is mounted.

This is because older versions of e2fsprogs had bugs that would cause data corruption during offline expansion (more details: forums.whirlpool.net.au/archive/2060328 - I'm glad that guy ran into it first, so we were spared that nightmare).

That's nasty - although I haven't seen that bug, or heard other users report it. Cloudmin has been doing offline filesystem expansion for a while now, such as when you resize a disk or create a VM whose root disk is larger than the image.

Well, it only happens with old versions of e2fsprogs ("old" as in anything more than a few months old; the bug was fixed in 1.42.7, which isn't even in most distros yet).

The final replies in the thread I linked show that the author of e2fsprogs/resize2fs confirms the serious bug and recommends either the latest e2fsprogs version or only ever doing online resizes. He also notes that online expansion is always the safest option. (Shrinking, on the other hand, can of course only be done offline.)

The good news is that online resizes are very safe. In that case, resize2fs does almost no work of its own - it just tells the kernel to "expand this filesystem", and the kernel modifies a small amount of metadata to make the extra space available, possibly also switching certain structures from 32-bit to 64-bit addressing (only if the new filesystem is so large that it needs 64-bit addresses, which for most expansions will never be the case).

Both those operations are safe to do online and are best left to the kernel precisely because of the risk of bugs like this in userland tools.

In offline mode, we are trusting a 3rd party tool (resize2fs) to do everything right.

Whereas in online mode we are relying on the code that knows the ext4 filesystem best: the kernel itself.

Anyway, I definitely had to warn you about it, because it's a very real and severe bug in all but the most recent versions of e2fsprogs. That thread is only a few months old (ouch).

Oh, and for completeness' sake, here are the reasons for choosing LVM in the first place:

It makes resizing disks easier, avoids fragmentation, and it's faster - as long as you avoid snapshots.

To quote another sysadmin's write speeds on LVM:

"no snapshot = ~11sec (727MB/sec) 1 snapshot = ~102sec (78MB/sec) 2 snapshots = ~144sec (55MB/sec) 3 snapshots = ~313sec (25MB/sec) 4 snapshots = ~607sec (15MB/sec)"

Your write performance slows down 10x if you enable a single snapshot. Unfortunately the LVM developers can't do anything about it (redhat.com/archives/linux-lvm/2010-November/msg00023.html), because LVM itself is just a helpful, newbie-friendly wrapper on top of "DM" (the device mapper) in the Linux kernel, which is itself very inefficient at making snapshots.

The whole problem stems from the fact that DM (the device mapper) makes snapshots at the block level, not the file level. So any time a single bit in a block changes, the entire original block must be read and written to a safe place before the new data can be written, along with a few extra metadata writes.

Hence the 10x slowdown of writes when you enable even a single snapshot.
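If you want to reproduce the effect yourself, something along these lines on a scratch LV (hypothetical names; the exact numbers will vary with your hardware) shows the drop:

    # Baseline write speed with no snapshot.
    dd if=/dev/zero of=/dev/vg0/scratch bs=1M count=1024 oflag=direct

    # Take a classic (non-thin) snapshot, repeat the write, and watch the throughput
    # collapse while every overwritten chunk is first copied into the snapshot's space.
    lvcreate -s -L 2G -n scratch_snap /dev/vg0/scratch
    dd if=/dev/zero of=/dev/vg0/scratch bs=1M count=1024 oflag=direct

    # Drop the snapshot to get normal write speed back.
    lvremove -f vg0/scratch_snap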

There's a new "dm-thin" device mapper module out, which has smaller block sizes and therefore less copy-on-write overhead and a smaller performance drop, but it isn't used by LVM, and using it would require reformatting every logical volume anyway, so it's not an option either.

So, in short: snapshots on LVM? Don't do it unless you really don't care about write performance.

Thanks for the info - for the record, we strongly recommend using LVM over regular files for VM disks.

Cloudmin only creates snapshots in two cases:

  1. When doing a backup of a running VM with disks on LVM, so that filesystem state can be frozen during the copy without requiring a shutdown.

  2. For VM snapshots that allow a rollback to a previous state.

In both cases, snapshots only exist for a short period.
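In essence, case 1 boils down to the standard LVM snapshot-backup pattern. A simplified sketch with hypothetical paths (not the actual Cloudmin code):

    # Freeze a point-in-time view of the root LV, copy it out, then drop the snapshot
    # as soon as the copy finishes so the copy-on-write penalty only lasts for the backup.
    lvcreate -s -L 5G -n server1_backup /dev/vg0/server1_root
    dd if=/dev/vg0/server1_backup of=/backup/server1_root.img bs=1M
    lvremove -f vg0/server1_backup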

Good, so Cloudmin would only ever make snapshots if we enabled backups inside Cloudmin, which we won't. We'll rely on Virtualmin inside the VM to tarball up the relevant per-site data instead, which is then easy to restore to any system, since we use the same base Virtualmin image for every VM.

During this switch to LVM, I came across the option "Xen Host Settings: Base directory for virtual systems", which I had previously set to /servers, causing each server's files (.img, .swap and .cfg) to be created under that path. But now that storage is on LVM, I want all new configs to be created in /etc/xen.

Is it correct to assume that "Base directory for virtual systems" should be set to /etc/xen now, to get the desired behavior of all per-VM .cfg files being placed there from now on?

The systems are now all online on top of LVM instead. Beautiful.

I've noticed one more thing in /etc/webmin/servers:

xen_disksize=40960
xen_disktotal=47244640256

These are the old 40 GB values. Cloudmin's web GUI is actually ignoring these lines and showing the new 100 GB size of the LV correctly, so things are working and it doesn't look like Cloudmin uses these lines anymore.

So the question is: what role do these two lines play once the VM has been moved to LVM? It looks like they are only relevant when the VM is disk image-based, and that they can safely be deleted.

Those lines are cached information about total disk usage by the VM. Since you re-sized the disks outside of Cloudmin, you should remove the xen_disktotal line and adjust the xen_disksize line to be the new root disk size in MB.

Yeah, I figured they were cached information. Thanks. I've removed xen_disktotal and changed xen_disksize to match the output of lvdisplay --units m (which returned 102400.00 MiB), in other words "xen_disksize=102400".
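For anyone else doing this, a one-liner along these lines (the LV path is the example used above) spits out exactly the number to put into xen_disksize:

    # Root LV size in whole MiB, with headings and the unit suffix stripped -> 102400
    lvs --noheadings --nosuffix --units m -o lv_size /dev/vg0/server1_root | tr -d ' ' | cut -d. -f1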

Looks like everything is perfectly migrated now.

Good luck with making this migration a built-in feature. If you still want to do offline resizes but avoid the serious bug I mentioned, here's an idea for a safety check:

  1. Use dumpe2fs /dev/vg0/server1_root to see if "Filesystem features:" contains "resize_inode" (i.e. reserved GDT blocks that allow online resizing). If this ext4 feature is present, the filesystem is ready to be expanded while the system is running. However, this is also the feature that outdated resize2fs versions mishandled during OFFLINE resizes, leading to heavy corruption.

  2. If that feature is NOT enabled, it's safe to use resize2fs without any other checks.

  3. If that feature IS enabled, check the version of the resize2fs tool to make sure it is >= 1.42.7, OTHERWISE the filesystem WILL become corrupted. Note that 1.42.7 is so new that most distros don't even ship it yet. But it's better to tell the user "you can't resize this filesystem until you upgrade e2fsprogs" than to let them go through with it and lose all their data. (A rough sketch of this check follows below.)
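A minimal shell sketch of that check, assuming the example LV path from above and that parsing dumpe2fs -V is an acceptable way to read the installed e2fsprogs version (resize2fs ships in the same package):

    #!/bin/sh
    # Pre-resize safety check (illustration only, not Cloudmin code).
    LV=/dev/vg0/server1_root
    MIN=1.42.7

    if dumpe2fs -h "$LV" 2>/dev/null | grep '^Filesystem features:' | grep -q resize_inode; then
        # resize_inode is present: an offline resize2fs is only safe with e2fsprogs >= 1.42.7.
        VER=$(dumpe2fs -V 2>&1 | awk 'NR==1 {print $2}')
        if [ "$(printf '%s\n%s\n' "$MIN" "$VER" | sort -V | head -n 1)" = "$MIN" ]; then
            echo "e2fsprogs $VER is new enough; an offline resize of $LV should be safe."
        else
            echo "e2fsprogs $VER is too old; upgrade it (or resize online) before expanding $LV." >&2
            exit 1
        fi
    else
        # No resize_inode feature, so the offline-resize bug described above does not apply.
        echo "resize_inode not set; an offline resize of $LV is safe without further checks."
    fi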

This extra piece of info was the least I could do as thanks for your help.

I guess the reason why we haven't seen this is that all Cloudmin pre-built images use ext3, primarily because some older distros (like CentOS 5) that are commonly used on host systems don't support ext4.

Ahh, that is definitely why you haven't seen it. I am sure most users use the prebuilt images.

Ext4 with extents is quite a bit faster than Ext3, by the way. Consider moving over to it, even if that means dropping pre-built image support for some pre-2008 host OSes. In a VM, I/O is always going to suffer anyway, so it's good to avoid extra slowdowns wherever possible. (A couple of example commands are below.)

See: en.community.dell.com/techcenter/high-performance-computing/w/wiki/2290.aspx
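If it helps, checking for and creating ext4 is trivial on any host that supports it (hypothetical LV names again):

    # Extents are enabled by default when formatting a filesystem as ext4.
    mkfs.ext4 -L root /dev/vg0/newvm_root

    # To see whether an existing filesystem already uses extents:
    dumpe2fs -h /dev/vg0/server1_root | grep '^Filesystem features:'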

FYI, support for moving VM disks between files, LVM and iSCSI has been implemented, and will be included in the next Cloudmin release.

That's very cool, I am glad the feature now exists so that other users don't have to go through the long manual process. :-)

Automatically closed -- issue fixed for 2 weeks with no activity.
