Ubuntu 13.10 (64-bit Desktop), Cloudmin 4.04.gpl GPL
When trying to revert a snapshot of a cloudmin guest disk, there are only errors like this:
Reverting system to snapshot 20140218-winupdate ..
.. failed :
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 42949607424: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 42949664768: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 0: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 4096: Input/output error
Unable to merge invalidated snapshot LV "win-v1-tantalus.localhost_0_20140218-winupdate_snap"
Any idea what's going on here? I'm not sure whether this is a Cloudmin issue or just a Linux problem. The snapshot was created using Cloudmin, either from the browser or via the command line.
Creating guests using Cloudmin + LVM works without problems.
Some output from LVM commands:
root@lx-d0-midas:~# vgs
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 42949607424: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 42949664768: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 0: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 4096: Input/output error
VG #PV #LV #SN Attr VSize VFree
vg0 1 7 1 wz--n- 237,41g 7,94g
volg0 2 11 0 wz--n- 279,45g 74,59g
root@lx-d0-midas:~# pvs
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 42949607424: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 42949664768: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 0: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 4096: Input/output error
PV VG Fmt Attr PSize PFree
/dev/md1 vg0 lvm2 a-- 237,41g 7,94g
/dev/md127 volg0 lvm2 a-- 74,53g 74,53g
/dev/md2 volg0 lvm2 a-- 204,92g 64,00m
root@lx-d0-midas:~# lvs
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 42949607424: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 42949664768: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 0: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 4096: Input/output error
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert
home vg0 -wi-ao--- 74,50g
kvm2 vg0 -wi-ao--- 70,00g
root vg0 -wi-ao--- 23,28g
swap vg0 -wi-ao--- 3,72g
var vg0 -wi-ao--- 13,97g
win-v1-tantalus.localhost_0_20140218-winupdate_snap vg0 swi-I-s-- 4,00g win-v1-tantalus_localhost_img 100.00
win-v1-tantalus_localhost_img vg0 owi-a-s-- 40,00g
debian3_localhost_img volg0 -wi-a---- 2,00g
debian60_localhost_img volg0 -wi-a---- 30,00g
...
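For reference, the capital "I" in the snapshot's attr string above (swi-I-s--) is LVM's invalid-snapshot flag. A minimal sketch for spotting such snapshots (the sample line is embedded from the output above so the sketch runs anywhere; on a real host you would feed it live `lvs` output instead):

```shell
# Sample line from the `lvs` output above; on a real host use:
#   lvs --noheadings -o lv_name,lv_attr,snap_percent
lvs_output='win-v1-tantalus.localhost_0_20140218-winupdate_snap swi-I-s-- 100.00'

# The 5th character of the attr string is the LV state; a capital 'I'
# marks an invalidated snapshot that can no longer be merged or reverted.
echo "$lvs_output" | awk '{
  if (substr($2, 5, 1) == "I")
    print $1 " is INVALID (" $3 "% used)"
}'
```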
cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sdd2[1] sdb2[0]
214878799 blocks super 1.2 [2/2] [UU]
md127 : active raid1 sdd1[1] sdb1[0]
78156096 blocks [2/2] [UU]
md1 : active raid1 sdc2[1] sda2[0]
248950592 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdc1[1] sda1[0]
975296 blocks super 1.2 [2/2] [UU]
Thanks in advance, Falko
Comments
Submitted by lulatsch66 on Thu, 02/27/2014 - 10:41 Comment #1
It seems I have found the reason for the problem.
When creating snapshots via the Cloudmin GUI, there is a form like this:
New snapshot details
Snapshot ID: ....
Percent of virtual disk to allocate: XX % out of YY GB
I can easily overallocate the snapshot by requesting more than the available free space in the volume group.
Having created an LVM snapshot that claims non-existent VG space, the snapshot isn't usable, and the errors mentioned above occur.
So perhaps, when creating snapshots, the remaining free space in the corresponding volume group should be taken into account, and creating snapshots bigger than the remaining free space should not be possible.
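A pre-check along these lines could be sketched as follows (a minimal sketch: the vg0 free-space figure is taken from the `pvs` output above, and the 10 G request is a hypothetical value for illustration):

```shell
# Hypothetical pre-check before creating a snapshot: compare the requested
# size against the free space in the volume group. On a real host the free
# space would come from:  vgs --noheadings --units g -o vg_free vg0
vg_free_gb=7.94          # free space in vg0 (from the pvs output above)
requested_gb=10          # hypothetical snapshot size the user asked for

# Refuse to create a snapshot larger than the remaining free space.
if awk -v req="$requested_gb" -v free="$vg_free_gb" 'BEGIN { exit !(req > free) }'; then
  echo "error: requested ${requested_gb}G exceeds ${vg_free_gb}G free in vg0"
else
  echo "ok: enough free space for a ${requested_gb}G snapshot"
fi
```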
Thanks, Falko
Submitted by JamieCameron on Thu, 02/27/2014 - 13:02 Comment #2
I'm actually surprised that the snapshot LV creation didn't fail in that case. Are you sure that you actually allocated more space than was left in the VG?
Submitted by lulatsch66 on Fri, 02/28/2014 - 05:42 Comment #3
Hello Jamie,
Well, you're right, and I'm sorry I didn't try to reproduce the overallocation. I had noticed that the "Maximum usage" field never showed anything other than 0 %, which led me to that wrong conclusion.
The logs above (from creation date of this issue) show:
PV VG Fmt Attr PSize PFree
/dev/md1 vg0 lvm2 a-- 237,41g 7,94g
root@lx-d0-midas:~# lvs
...
/dev/vg0/win-v1-tantalus.localhost_0_20140218-winupdate_snap: read failed after 0 of 4096 at 4096: Input/output error
...
win-v1-tantalus.localhost_0_20140218-winupdate_snap vg0 swi-I-s-- 4,00g win-v1-tantalus_localhost_img 100.00
So this means 100 % snapshot usage... maybe this is the reason for the error?
Now, after deleting the 20140218-winupdate snapshot and creating a new "winupdate" snapshot using the LVM module in Webmin, these sizes are shown:
vgs
VG #PV #LV #SN Attr VSize VFree
vg0 1 8 2 wz--n- 237,41g 2,94g
win-v1-tantalus.localhost_0_dailytests_snap vg0 swi-a-s-- 4,00g win-v1-tantalus_localhost_img 49,27
win-v1-tantalus_localhost_img vg0 owi-a-s-- 40,00g
win-v1-tantalus_winupdates vg0 swi-a-s-- 5,00g win-v1-tantalus_localhost_img 39,41
At the moment I have 5 G + 4 G of snapshots and 2,94 G free.
So when I try to reproduce this by creating a snapshot using more than the free space, Virtualmin correctly shows:
Creating snapshot of system win-v1-tantalus.localhost .. .. failed : Volume group "vg0" has insufficient free space (752 extents): 1024 required.
In the lvm module of webmin I see:
win-v1-tantalus.localhost_0_dailytests_snap 40 GB
win-v1-tantalus_localhost_img 40 GB
win-v1-tantalus_winupdates 40 GB
The last snapshot shows size 40 GB, physical volumes allocated 5 GB, snapshot use percentage 39.41 %, which corresponds to the console output. The dailytests snapshot shows size 40 GB, physical volumes allocated 4 GB, snapshot use percentage 49.27 %.
So I tried to reproduce the 100 % usage by installing something like Office inside the VM...
After a short time, both snapshots now show 100.00 % usage, and the errors are:
lvs|grep win-v1
/dev/vg0/win-v1-tantalus_winupdates: read failed after 0 of 4096 at 42949607424: Input/output error
/dev/vg0/win-v1-tantalus_winupdates: read failed after 0 of 4096 at 42949664768: Input/output error
/dev/vg0/win-v1-tantalus_winupdates: read failed after 0 of 4096 at 0: Input/output error
/dev/vg0/win-v1-tantalus_winupdates: read failed after 0 of 4096 at 4096: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_dailytests_snap: read failed after 0 of 4096 at 42949607424: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_dailytests_snap: read failed after 0 of 4096 at 42949664768: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_dailytests_snap: read failed after 0 of 4096 at 0: Input/output error
/dev/vg0/win-v1-tantalus.localhost_0_dailytests_snap: read failed after 0 of 4096 at 4096: Input/output error
win-v1-tantalus.localhost_0_dailytests_snap vg0 swi-I-s-- 4,00g win-v1-tantalus_localhost_img 100.00
win-v1-tantalus_localhost_img vg0 owi-aos-- 40,00g
win-v1-tantalus_winupdates vg0 swi-I-s-- 5,00g win-v1-tantalus_localhost_img 100.00
In Cloudmin -> Disk Snapshots, the "Maximum usage" field shows 0 instead of 100 %. In Webmin -> LVM -> logical volume details, it shows: Current status: Not in use.
IMHO both are wrong...
Maybe a warning about full (and therefore destroyed) snapshots on the "System Information" or "List Managed Systems" pages would make sense?
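Such a warning could be driven by a simple usage check, for example (a sketch with sample data embedded so it runs anywhere; the 80 % threshold and the 89.50 % figure for the second snapshot are made up for illustration):

```shell
# Sketch of a cron-able snapshot-usage warning. On a real host the data
# would come from something like:
#   lvs --noheadings -o lv_name,snap_percent vg0
sample='win-v1-tantalus.localhost_0_dailytests_snap 49.27
win-v1-tantalus_winupdates 89.50'

threshold=80   # warn before the snapshot fills up and is invalidated
echo "$sample" | awk -v t="$threshold" '$2+0 >= t {
  print "WARNING: snapshot " $1 " at " $2 "% usage"
}'
```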
Best regards, Falko
Submitted by JamieCameron on Fri, 02/28/2014 - 11:36 Comment #4
Ok, it sounds like the snapshot got full and could not be restored. These should appear in red on the "List Snapshots" page already.
The percent size of a snapshot has to be large enough to store all the changes to the VM disk between when it was created and when it is restored. So if the snapshot is only 10% of the VM size and more than 10% of data on disk changes, it will not be restorable.
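As a worked example of this sizing rule (the 40 G origin size is taken from the LV above; the 10 % allocation is illustrative):

```shell
# A snapshot must hold every origin block that changes between snapshot
# creation and restore. Compute the change budget for a given percentage.
vm_disk_gb=40        # origin LV size (win-v1-tantalus_localhost_img)
snap_percent=10      # snapshot allocated as a percentage of the origin

snap_gb=$(awk -v d="$vm_disk_gb" -v p="$snap_percent" 'BEGIN { print d * p / 100 }')
echo "A ${snap_percent}% snapshot of a ${vm_disk_gb}G disk holds ${snap_gb}G of changes"
# If more than that is written to the origin, the snapshot fills up, is
# marked invalid ('I' in the lvs attr string), and can no longer be restored.
```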