Ubuntu
linux package

Bug #1292234
Activity log

Activity log for bug #1292234

Date	Who	What changed	Old value	New value	Message
2014-03-13 21:32:21	Jamie Strandboge	bug			added bug
2014-03-13 21:33:21	Jamie Strandboge	summary	qcow2 image corruption in trusty (qemu 1.7)	qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
2014-03-13 22:12:28	Serge Hallyn	qemu (Ubuntu): importance	Undecided	High
2014-05-01 21:26:32	Launchpad Janitor	qemu (Ubuntu): status	New	Confirmed
2014-05-01 21:26:37	Seth Arnold	bug			added subscriber Seth Arnold
2014-05-01 21:30:17	Jamie Strandboge	description	The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.3 from Ubuntu 13.10.	The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.
2014-05-19 20:33:48	Serge Hallyn	qemu (Ubuntu): assignee		Serge Hallyn (serge-hallyn)
2014-08-07 15:36:25	Serge Hallyn	tags		qcow2
2014-08-29 18:20:24	Ryan Harper	bug			added subscriber Ryan Harper
2014-11-26 04:34:07	Chris J Arges	qemu (Ubuntu): assignee	Serge Hallyn (serge-hallyn)	Chris J Arges (arges)
2014-12-01 20:19:55	Chris J Arges	attachment added		lp1292234-repro.sh https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1292234/+attachment/4272431/+files/lp1292234-repro.sh
2014-12-01 20:21:16	Chris J Arges	bug task added		qemu
2014-12-15 20:17:09	Chris J Arges	qemu (Ubuntu): status	Confirmed	In Progress
2014-12-23 03:38:16	Tony Breeds	bug			added subscriber Tony Breeds
2015-01-16 20:44:39	Chris J Arges	summary	qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)	qcow2 image corruption on non-extent filesystems (ext3)
2015-01-28 19:37:11	Chris J Arges	bug task deleted	qemu
2015-01-30 16:04:07	Chris J Arges	bug task added		linux (Ubuntu)
2015-01-30 16:04:52	Chris J Arges	linux (Ubuntu): assignee		Chris J Arges (arges)
2015-01-30 16:04:55	Chris J Arges	linux (Ubuntu): importance	Undecided	High
2015-01-30 16:04:57	Chris J Arges	linux (Ubuntu): status	New	In Progress
2015-01-30 16:05:06	Chris J Arges	qemu (Ubuntu): status	In Progress	Invalid
2015-01-30 16:05:09	Chris J Arges	qemu (Ubuntu): assignee	Chris J Arges (arges)
2015-01-30 16:05:10	Chris J Arges	qemu (Ubuntu): importance	High	Undecided
2015-01-30 18:46:49	Chris J Arges	description	The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.	[Impact] Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE flags. This seems to be a regression in ext4_ind_remove_space introduced in 4f579ae7, whereas commit 77ea2a4b passes the following test case. [Test Case] 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y 2) Create and install a VM using a qcow2 image and store the file on the filesystem 3) Snapshot the image with qemu-img 4) Boot the image and do some disk operations (fio,etc) 5) Shutdown image and delete snapshot 6) Repeat 3-5 until VM no longer boots due to image corruption, generally this takes a few iterations depending on disk operations. -- The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.
2015-02-03 11:28:17	Josep M. Perez	bug			added subscriber Josep M. Perez
2015-02-03 18:12:05	Chris J Arges	description	[Impact] Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE flags. This seems to be a regression in ext4_ind_remove_space introduced in 4f579ae7, whereas commit 77ea2a4b passes the following test case. [Test Case] 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y 2) Create and install a VM using a qcow2 image and store the file on the filesystem 3) Snapshot the image with qemu-img 4) Boot the image and do some disk operations (fio,etc) 5) Shutdown image and delete snapshot 6) Repeat 3-5 until VM no longer boots due to image corruption, generally this takes a few iterations depending on disk operations. -- The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.	[Impact] Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE flags. This seems to be a regression in ext4_ind_remove_space introduced in 4f579ae7, whereas commit 77ea2a4b passes the following test case. [Test Case] 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y 2) Create and install a VM using a qcow2 image and store the file on the filesystem 3) Snapshot the image with qemu-img 4) Boot the image and do some disk operations (fio,etc) 5) Shutdown image and delete snapshot 6) Repeat 3-5 until VM no longer boots due to image corruption, generally this takes a few iterations depending on disk operations. [Fix] This is being discussed upstream here: http://marc.info/?l=linux-fsdevel&m=142264422605440&w=2 A temporary fix would be to disable punch_hole for non-extent filesystem. This is how the normal ext3 module handles this and it is up to userspace to handle the failure. diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5653fa4..e14cdfe 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3367,6 +3367,10 @@ int ext4_punch_hole(struct inode inode, loff_t offset, loff_t length) unsigned int credits; int ret = 0; + / EXTENTS required */ + if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) + return -EOPNOTSUPP; + if (!S_ISREG(inode->i_mode)) return -EOPNOTSUPP; -- The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.
2015-02-03 18:16:11	Chris J Arges	description	[Impact] Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE flags. This seems to be a regression in ext4_ind_remove_space introduced in 4f579ae7, whereas commit 77ea2a4b passes the following test case. [Test Case] 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y 2) Create and install a VM using a qcow2 image and store the file on the filesystem 3) Snapshot the image with qemu-img 4) Boot the image and do some disk operations (fio,etc) 5) Shutdown image and delete snapshot 6) Repeat 3-5 until VM no longer boots due to image corruption, generally this takes a few iterations depending on disk operations. [Fix] This is being discussed upstream here: http://marc.info/?l=linux-fsdevel&m=142264422605440&w=2 A temporary fix would be to disable punch_hole for non-extent filesystem. This is how the normal ext3 module handles this and it is up to userspace to handle the failure. diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5653fa4..e14cdfe 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3367,6 +3367,10 @@ int ext4_punch_hole(struct inode inode, loff_t offset, loff_t length) unsigned int credits; int ret = 0; + / EXTENTS required */ + if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) + return -EOPNOTSUPP; + if (!S_ISREG(inode->i_mode)) return -EOPNOTSUPP; -- The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.	[Impact] Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE flags. [Test Case] 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y 2) Create and install a VM using a qcow2 image and store the file on the filesystem 3) Snapshot the image with qemu-img 4) Boot the image and do some disk operations (fio,etc) 5) Shutdown image and delete snapshot 6) Repeat 3-5 until VM no longer boots due to image corruption, generally this takes a few iterations depending on disk operations. [Fix] This is being discussed upstream here: http://marc.info/?l=linux-fsdevel&m=142264422605440&w=2 A temporary fix would be to disable punch_hole for non-extent filesystem. This is how the normal ext3 module handles this and it is up to userspace to handle the failure. diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5653fa4..e14cdfe 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3367,6 +3367,10 @@ int ext4_punch_hole(struct inode inode, loff_t offset, loff_t length) unsigned int credits; int ret = 0; + / EXTENTS required */ + if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) + return -EOPNOTSUPP; + if (!S_ISREG(inode->i_mode)) return -EOPNOTSUPP; -- The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.
2015-02-03 18:17:44	Chris J Arges	description	[Impact] Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE flags. [Test Case] 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y 2) Create and install a VM using a qcow2 image and store the file on the filesystem 3) Snapshot the image with qemu-img 4) Boot the image and do some disk operations (fio,etc) 5) Shutdown image and delete snapshot 6) Repeat 3-5 until VM no longer boots due to image corruption, generally this takes a few iterations depending on disk operations. [Fix] This is being discussed upstream here: http://marc.info/?l=linux-fsdevel&m=142264422605440&w=2 A temporary fix would be to disable punch_hole for non-extent filesystem. This is how the normal ext3 module handles this and it is up to userspace to handle the failure. diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5653fa4..e14cdfe 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3367,6 +3367,10 @@ int ext4_punch_hole(struct inode inode, loff_t offset, loff_t length) unsigned int credits; int ret = 0; + / EXTENTS required */ + if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) + return -EOPNOTSUPP; + if (!S_ISREG(inode->i_mode)) return -EOPNOTSUPP; -- The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.	[Impact] Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE flags. [Test Case] 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y 2) Create and install a VM using a qcow2 image and store the file on the filesystem 3) Snapshot the image with qemu-img 4) Boot the image and do some disk operations (fio,etc) 5) Shutdown image and delete snapshot 6) Repeat 3-5 until VM no longer boots due to image corruption, generally this takes a few iterations depending on disk operations. [Fix] This is being discussed upstream here: http://marc.info/?l=linux-fsdevel&m=142264422605440&w=2 A temporary fix would be to disable punch_hole for non-extent filesystem. This is how the normal ext3 module handles this and it is up to userspace to handle the failure. I've run this with the test case and was able to run for 600 iterations over 3 days where most failures occur within the first 2-20 iterations. diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5653fa4..e14cdfe 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3367,6 +3367,10 @@ int ext4_punch_hole(struct inode inode, loff_t offset, loff_t length) unsigned int credits; int ret = 0; + / EXTENTS required */ + if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) + return -EOPNOTSUPP; + if (!S_ISREG(inode->i_mode)) return -EOPNOTSUPP; -- The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.
2015-02-09 21:40:20	Launchpad Janitor	linux (Ubuntu): status	In Progress	Fix Released
2015-03-20 15:25:04	Chris J Arges	description	[Impact] Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE flags. [Test Case] 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y 2) Create and install a VM using a qcow2 image and store the file on the filesystem 3) Snapshot the image with qemu-img 4) Boot the image and do some disk operations (fio,etc) 5) Shutdown image and delete snapshot 6) Repeat 3-5 until VM no longer boots due to image corruption, generally this takes a few iterations depending on disk operations. [Fix] This is being discussed upstream here: http://marc.info/?l=linux-fsdevel&m=142264422605440&w=2 A temporary fix would be to disable punch_hole for non-extent filesystem. This is how the normal ext3 module handles this and it is up to userspace to handle the failure. I've run this with the test case and was able to run for 600 iterations over 3 days where most failures occur within the first 2-20 iterations. diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5653fa4..e14cdfe 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3367,6 +3367,10 @@ int ext4_punch_hole(struct inode inode, loff_t offset, loff_t length) unsigned int credits; int ret = 0; + / EXTENTS required */ + if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) + return -EOPNOTSUPP; + if (!S_ISREG(inode->i_mode)) return -EOPNOTSUPP; -- The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.	[Impact] Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE \| FALLOC_FL_KEEP_SIZE flags. [Test Case] 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y 2) Create and install a VM using a qcow2 image and store the file on the filesystem 3) Snapshot the image with qemu-img 4) Boot the image and do some disk operations (fio,etc) 5) Shutdown image and delete snapshot 6) Repeat 3-5 until VM no longer boots due to image corruption, generally this takes a few iterations depending on disk operations. [Fix] commit 6f30b7e37a8239f9d27db626a1d3427bc7951908 upstream This has been discussed upstream here: http://marc.info/?l=linux-fsdevel&m=142264422605440&w=2 A temporary fix would be to disable punch_hole for non-extent filesystem. This is how the normal ext3 module handles this and it is up to userspace to handle the failure. I've run this with the test case and was able to run for 600 iterations over 3 days where most failures occur within the first 2-20 iterations. diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5653fa4..e14cdfe 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3367,6 +3367,10 @@ int ext4_punch_hole(struct inode inode, loff_t offset, loff_t length) unsigned int credits; int ret = 0; + / EXTENTS required */ + if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) + return -EOPNOTSUPP; + if (!S_ISREG(inode->i_mode)) return -EOPNOTSUPP; -- The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off: qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2 $ cat /proc/version_signature Ubuntu 3.13.0-16.36-generic 3.13.5 $ qemu-img info ./forhallyn-trusty-amd64.img image: ./forhallyn-trusty-amd64.img file format: qcow2 virtual size: 8.0G (8589934592 bytes) disk size: 4.0G cluster_size: 65536 Format specific information: compat: 0.10 Steps to reproduce: 1. create a virtual machine. For a simplified reproducer, I used virt-manager with: OS type: Linux Version: Ubuntu 14.04 Memory: 768 CPUs: 1 Select managed or existing (Browse, new volume) Create a new storage volume: qcow2 Max capacity: 8192 Allocation: 0 Advanced: NAT kvm x86_64 firmware: default 2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then cleanly shutdown 3. Backup the image file somewhere since steps 1 and 2 take a while :) 4. Execute the following commands which are based on what our uvt tool does: $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh snapshot-current --name forhallyn-trusty-amd64 pristine $ virsh start forhallyn-trusty-amd64 $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5 in guest: sudo apt-get update sudo apt-get dist-upgrade 780 upgraded... shutdown -h now $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot" $ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM. After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc. This does not seem to be related to the machine type used. Ie, pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5 works fine with qemu 1.5. Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.
2015-03-31 20:21:23	Chris J Arges	linux (Ubuntu): status	Fix Released	Confirmed
2015-04-03 16:26:51	Andy Whitcroft	linux (Ubuntu): status	Confirmed	Fix Committed
2015-04-03 22:21:30	Launchpad Janitor	linux (Ubuntu): status	Fix Committed	Fix Released
2015-08-10 19:42:32	Jeffrey	bug			added subscriber Jeffrey
2015-10-21 01:11:07	Chris J Arges	nominated for series		Ubuntu Vivid
2015-10-21 01:11:07	Chris J Arges	bug task added		qemu (Ubuntu Vivid)
2015-10-21 01:11:07	Chris J Arges	bug task added		linux (Ubuntu Vivid)
2015-10-21 01:11:07	Chris J Arges	nominated for series		Ubuntu Trusty
2015-10-21 01:11:07	Chris J Arges	bug task added		qemu (Ubuntu Trusty)
2015-10-21 01:11:07	Chris J Arges	bug task added		linux (Ubuntu Trusty)
2015-10-21 01:11:25	Chris J Arges	bug task deleted	qemu (Ubuntu)
2015-10-21 01:16:43	Chris J Arges	bug task deleted	qemu (Ubuntu Trusty)
2015-10-21 01:16:48	Chris J Arges	bug task deleted	qemu (Ubuntu Vivid)
2015-10-21 01:16:56	Chris J Arges	bug task added		linux-lts-utopic (Ubuntu)
2015-10-21 01:18:36	Chris J Arges	linux-lts-utopic (Ubuntu): status	New	Invalid
2015-10-21 01:18:41	Chris J Arges	linux (Ubuntu Trusty): assignee		Chris J Arges (arges)
2015-10-21 01:18:43	Chris J Arges	linux (Ubuntu Vivid): assignee		Chris J Arges (arges)
2015-10-21 01:18:44	Chris J Arges	linux-lts-utopic (Ubuntu Trusty): assignee		Chris J Arges (arges)
2015-10-21 01:18:47	Chris J Arges	linux (Ubuntu Trusty): importance	Undecided	High
2015-10-21 01:18:49	Chris J Arges	linux (Ubuntu Vivid): importance	Undecided	High
2015-10-21 01:18:51	Chris J Arges	linux-lts-utopic (Ubuntu Trusty): importance	Undecided	High
2015-10-21 01:18:53	Chris J Arges	linux (Ubuntu Trusty): status	New	In Progress
2015-10-21 01:18:55	Chris J Arges	linux (Ubuntu Vivid): status	New	In Progress
2015-10-21 01:18:57	Chris J Arges	linux-lts-utopic (Ubuntu Trusty): status	New	In Progress
2015-10-21 16:08:23	Chris J Arges	linux (Ubuntu Vivid): status	In Progress	Fix Released
2015-10-21 16:12:12	Chris J Arges	linux-lts-utopic (Ubuntu Trusty): status	In Progress	Fix Released
2015-10-21 16:14:36	Chris J Arges	linux (Ubuntu Vivid): assignee	Chris J Arges (arges)
2015-10-21 16:14:38	Chris J Arges	linux-lts-utopic (Ubuntu Trusty): assignee	Chris J Arges (arges)
2015-10-26 20:46:05	Brad Figg	linux (Ubuntu Trusty): status	In Progress	Fix Committed
2015-11-16 10:23:59	Luis Henriques	tags	qcow2	qcow2 verification-needed-trusty
2015-11-21 02:14:56	Seth Arnold	tags	qcow2 verification-needed-trusty	qcow2 verification-failed
2015-11-23 15:00:25	Brad Figg	tags	qcow2 verification-failed	qcow2 verification-failed-trusty
2015-11-24 03:49:29	Seth Arnold	tags	qcow2 verification-failed-trusty	qcow2 verification-needed-trusty
2015-11-25 05:38:20	Seth Arnold	tags	qcow2 verification-needed-trusty	qcow2 verification-done-trusty
2015-11-30 21:02:37	Launchpad Janitor	linux (Ubuntu Trusty): status	Fix Committed	Fix Released
2015-11-30 21:02:36	Launchpad Janitor	linux (Ubuntu Trusty): status	Fix Committed	Fix Released

Ubuntulinux package

Activity log for bug #1292234

Ubuntu
linux package