qcow2 image corruption on non-extent filesystems (ext3)

Bug #1292234 reported by Jamie Strandboge
This bug affects 3 people
Affects                    Status        Importance  Assigned to    Milestone
linux (Ubuntu)             Fix Released  High        Chris J Arges
  Trusty                   Fix Released  High        Chris J Arges
  Vivid                    Fix Released  High        Unassigned
linux-lts-utopic (Ubuntu)  Invalid       Undecided   Unassigned
  Trusty                   Fix Released  High        Unassigned

Bug Description

[Impact]
Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/ CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using fallocate with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE flags.
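
To make the operation concrete: the flag pair above is what util-linux `fallocate --punch-hole` requests (that option implies `--keep-size`). The following is a minimal sketch, not a reproducer for this bug; the function name, scratch file name and sizes are illustrative assumptions. It punches a hole at the start of a file and checks that the data after the hole is unchanged.

```sh
# check_punch_hole DIR
# Returns 0 if data outside the punched range is unchanged, 1 if it changed,
# 2 if the check could not run (e.g. punch hole is unsupported here).
check_punch_hole() {
    f="$1/punch-test.bin"
    # 16 blocks of 64 KiB of random data, so a stray hole would be visible.
    dd if=/dev/urandom of="$f" bs=65536 count=16 2>/dev/null || return 2
    before=$(dd if="$f" bs=65536 skip=2 2>/dev/null | cksum)
    # Deallocate the first two blocks; --punch-hole implies --keep-size.
    if ! fallocate --punch-hole --offset 0 --length 131072 "$f" 2>/dev/null; then
        rm -f "$f"
        return 2
    fi
    after=$(dd if="$f" bs=65536 skip=2 2>/dev/null | cksum)
    rm -f "$f"
    [ "$before" = "$after" ]
}
```

Because the read-back may be served from the page cache, a more rigorous check would remount the filesystem or drop caches before comparing.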

[Test Case]
1) Set up an ext4 ^extents filesystem, or an ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y
2) Create and install a VM using a qcow2 image, and store the image file on that filesystem
3) Snapshot the image with qemu-img
4) Boot the image and do some disk operations (fio, etc.)
5) Shut down the image and delete the snapshot
6) Repeat 3-5 until the VM no longer boots due to image corruption. This generally takes a few iterations, depending on the disk operations.
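
Steps 3 to 5 can be sketched as a loop around `qemu-img snapshot`. This is an illustrative outline, not the reproducer attached to this bug: the image path, the snapshot name and the `run_guest_workload` hook are hypothetical, and `QEMU_IMG` is a variable introduced here only so the commands can be dry-run with `echo`. The final `qemu-img check` is added as a cheap signal; the test case itself detects failure by the VM no longer booting.

```sh
QEMU_IMG=${QEMU_IMG:-qemu-img}

# Hypothetical hook: boot the VM, run disk I/O (fio, bonnie++, ...), shut down.
run_guest_workload() { :; }

# snapshot_cycle IMAGE SNAPSHOT_NAME
# One iteration of steps 3-5, followed by a consistency check of the image.
snapshot_cycle() {
    $QEMU_IMG snapshot -c "$2" "$1" || return 1   # step 3: create internal snapshot
    run_guest_workload "$1" || return 1           # step 4: guest disk activity
    $QEMU_IMG snapshot -d "$2" "$1" || return 1   # step 5: delete the snapshot
    $QEMU_IMG check "$1"                          # non-zero exit if the check finds problems
}
```

A driver could call `snapshot_cycle /path/to/image.qcow2 pristine` repeatedly and stop at the first non-zero exit status.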

[Fix]
commit 6f30b7e37a8239f9d27db626a1d3427bc7951908 upstream

This has been discussed upstream here:
http://marc.info/?l=linux-fsdevel&m=142264422605440&w=2

A temporary fix would be to disable punch_hole for non-extent filesystems. This is how the normal ext3 module handles it, leaving userspace to handle the failure. I ran this with the test case for 600 iterations over 3 days without a failure, whereas failures otherwise mostly occur within the first 2-20 iterations.

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5653fa4..e14cdfe 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3367,6 +3367,10 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length)
 	unsigned int credits;
 	int ret = 0;
 
+	/* EXTENTS required */
+	if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)))
+		return -EOPNOTSUPP;
+
 	if (!S_ISREG(inode->i_mode))
 		return -EOPNOTSUPP;

--
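
With this temporary patch, a punch-hole request on an indirect-mapped file returns EOPNOTSUPP and the caller has to cope. As an illustration of what a userspace fallback can look like (a sketch with a hypothetical helper name, not taken from qemu), the helper below tries `fallocate --punch-hole` and, if that fails for any reason, overwrites the range with zeros instead. It assumes offsets and lengths that are multiples of 4096.

```sh
# punch_or_zero FILE OFFSET LENGTH   (OFFSET and LENGTH in bytes, multiples of 4096)
punch_or_zero() {
    # Preferred path: deallocate the range; the file size is kept.
    fallocate --punch-hole --offset "$2" --length "$3" "$1" 2>/dev/null && return 0
    # Fallback: the range reads back as zeros, but stays allocated on disk.
    dd if=/dev/zero of="$1" bs=4096 seek=$(( $2 / 4096 )) \
       count=$(( $3 / 4096 )) conv=notrunc 2>/dev/null
}
```

Either way the range reads back as zeros and the file length is unchanged; only the on-disk allocation differs.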

The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses libvirt snapshots quite a bit. I noticed after upgrading to trusty some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa) has had stability problems such that the disk/partition table seems to be corrupted after removing a libvirt snapshot and then creating another with the same name. I don't have a very simple reproducer, but had enough that hallyn suggested I file a bug. First off:

qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2

$ cat /proc/version_signature
Ubuntu 3.13.0-16.36-generic 3.13.5

$ qemu-img info ./forhallyn-trusty-amd64.img
image: ./forhallyn-trusty-amd64.img
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 4.0G
cluster_size: 65536
Format specific information:
    compat: 0.10

Steps to reproduce:
1. create a virtual machine. For a simplified reproducer, I used virt-manager with:
  OS type: Linux
  Version: Ubuntu 14.04
  Memory: 768
  CPUs: 1

  Select managed or existing (Browse, new volume)
    Create a new storage volume:
      qcow2
      Max capacity: 8192
      Allocation: 0

  Advanced:
    NAT
    kvm
    x86_64
    firmware: default

2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it seems like I can hit the bug more reliably if I have lots of updates in a dist-upgrade. I have seen this with lucid-trusty guests that are i386 and amd64. After the install, reboot and then shut down cleanly.

3. Backup the image file somewhere since steps 1 and 2 take a while :)

4. Execute the following commands which are based on what our uvt tool does:

$ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"
$ virsh snapshot-current --name forhallyn-trusty-amd64
pristine
$ virsh start forhallyn-trusty-amd64
$ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5

in guest:
sudo apt-get update
sudo apt-get dist-upgrade
780 upgraded...
shutdown -h now

$ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children
$ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"

$ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption

The idea behind the above is to create a new VM with a pristine snapshot that we could revert later if we wanted. Instead, we boot the VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old 'pristine' snapshot and create a new 'pristine' snapshot. The intention is to update the VM and the pristine snapshot so that when we boot the next time, we boot from the updated VM and can revert back to the updated VM.

After running 'virsh start' after doing snapshot-delete/snapshot-create-as, the disk may be corrupted. This can be seen with grub failing to find .mod files, the kernel not booting, init failing, etc.

This does not seem to be related to the machine type used. I.e., pc-i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0, pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7, and pc-i440fx-1.5 works fine with qemu 1.5.

The only workaround I know of is to downgrade qemu to 1.5.0+dfsg-3ubuntu5.4 from Ubuntu 13.10.

summary: - qcow2 image corruption in trusty (qemu 1.7)
+ qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
Changed in qemu (Ubuntu):
importance: Undecided → High
Serge Hallyn (serge-hallyn) wrote : Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Have not yet been able to reproduce this. I'm considering adding an upstart job to your image which updates and shuts down, so I can test this in a loop.

Do you know whether (a) the --children option to snapshot delete or (b) using the same name for the new snapshot as the one you just delete are crucial to reproducing this?

Jamie Strandboge (jdstrand) wrote :

I don't, I just used the options that our uvt command uses. I downgraded to saucy's qemu in the meantime so I can do my work. Do you need me to try some new test?

I'm not sure it makes any difference, but note that I am using a trusty host and kernel.

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Quoting Jamie Strandboge (<email address hidden>):
> I don't, I just used the options that our uvt command uses. I downgraded
> to saucy's qemu in the meantime so I can do my work. Do you need me to
> try some new test?

sigh, maybe.

I will keep trying.

> I'm not sure it makes any difference, but note that I am using a trusty
> host and kernel.

Right, that's what I'm using.

Have others on your team (who are not on the same thinkpad model :) seen
this as well? Have you seen it on different types of machines? Does
it happen more often if the machine is already working hard?

I wonder if I can reproduce it manually with qemu-img and qemu-nbd.

Jamie Strandboge (jdstrand) wrote : Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Did you try with the image on https://chinstrap.canonical.com/~jamie/lp1292234/? I was only able to trigger it by using an old image, creating the snapshot, starting it, apt-get dist-upgrading, cleanly shutting down, then deleting the snapshot and creating another with the same name. Using a fresh install or a too new image doesn't do it for me (I guess enough has to happen in the guest to trigger it).

Ie:
$ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"
$ virsh snapshot-current --name forhallyn-trusty-amd64
pristine
$ virsh start forhallyn-trusty-amd64
$ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff after start, this might be different with qemu 1.5

in guest:
sudo apt-get update
sudo apt-get dist-upgrade
780 upgraded...
shutdown -h now

$ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children
$ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"

$ virsh start forhallyn-trusty-amd64 # this command works, but there is often disk corruption

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Quoting Jamie Strandboge (<email address hidden>):
> Did you try with the image on
> https://chinstrap.canonical.com/~jamie/lp1292234/? I was only able to

Yup! I wget that, create the snapshot, upgrade, remove and create
the snapshot, then start the vm. The upgrades take a long time
so I've only tested it 3 times so far. How likely is the failure?
Should I just keep going?

Serge Hallyn (serge-hallyn) wrote : Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

I've not yet been able to definitively reproduce this. (On a bad nested qemu setup I had some issues which I think were unrelated.) I've tried on a trusty laptop, and on a faster machine with a trusty container on a trusty kernel, starting with the images you posted for me each time.

Seth Arnold (seth-arnold) wrote :

I believe I just tripped this bug; I compressed some qcow2 images using this:

for f in sec-{lucid,precise,quantal,saucy,trusty}-{amd64,i386}; do
  echo "$f"
  qemu-img convert -s pristine -p -f qcow2 -O qcow2 "$f.qcow2" reclaimed.qcow2
  mv reclaimed.qcow2 "$f.qcow2"
  virsh snapshot-delete "$f" --snapshotname pristine
  uvt snapshot "$f"
done

The 'uvt snapshot' command makes a snapshot named 'pristine'.

AMD64 guests:
sec-lucid-amd64 booted without trouble.

sec-precise-amd64 reports:
Booting from Hard Disk...
Boot failed: not a bootable disk

No bootable device.

sec-quantal-amd64 reports:
Booting from Hard Disk...
error: file `/boot/grub/i386-pc/normal.mod' not found.
grub rescue>

sec-saucy-amd64 reports:
Booting from Hard Disk...
error: file `/boot/grub/i386-pc/normal.mod' not found.
Entering rescue mode...
grub rescue>

sec-trusty-amd64 reports:
Booting from Hard Disk...
Boot failed: not a bootable disk

No bootable device.

i386 guests:

sec-lucid-i386, sec-precise-i386, sec-quantal-i386, sec-saucy-i386 all booted fine.

sec-trusty-i386 reports:
Booting from Hard Disk...
Boot failed: not a bootable disk

No bootable device.

I use the i386 VMs significantly less often than the amd64 VMs.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in qemu (Ubuntu):
status: New → Confirmed
Jamie Strandboge (jdstrand) wrote :

FYI, I periodically use and follow the same procedure that Seth described (in fact, I did it yesterday) and had no problems with qemu 1.5.0+dfsg-3ubuntu5.4 (which I've apt pinned since reporting this bug).

description: updated
Changed in qemu (Ubuntu):
assignee: nobody → Serge Hallyn (serge-hallyn)
Serge Hallyn (serge-hallyn) wrote :

I have a clean install of trusty on an intel laptop. I added the following upstart job in the forhallyn-trusty-amd64.img root partition:

####################################################################
description "update and shutdown"
author "Serge Hallyn <email address hidden>"

start on runlevel [2345]

script
 sleep 5s
 apt-get update
 DEBIAN_FRONTEND='noninteractive' apt-get -y dist-upgrade
 sleep 5s
 shutdown -h now
end script
####################################################################

Then on the host I run this script:

####################################################################
#!/bin/bash

cp orig-with-upstart/forhallyn-trusty-amd64.img .

virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"
virsh start forhallyn-trusty-amd64
sleep 20s
while [ 1 ]; do
 virsh list | grep -q forhallyn || break
 sleep 20s
done

# guest has updated. check the image file and fs here
qemu-img check forhallyn-trusty-amd64.img
if [ $? -ne 0 ]; then
    echo "image check failed after shutdown"
    exit 1
fi
qemu-nbd -c /dev/nbd0 forhallyn-trusty-amd64.img
fsck -a /dev/nbd0p1
if [ $? -ne 0 ]; then
    echo "fs bad after shutdown"
    qemu-nbd -d /dev/nbd0
    exit 1
fi
qemu-nbd -d /dev/nbd0

# now tweak the snapshots
virsh snapshot-delete forhallyn-trusty-amd64 pristine --children
virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"

# and check the image file and fs again
qemu-img check forhallyn-trusty-amd64.img
if [ $? -ne 0 ]; then
    echo "image check failed after snapshot remove/create"
    exit 1
fi
qemu-nbd -c /dev/nbd0 forhallyn-trusty-amd64.img
fsck -a /dev/nbd0p1
if [ $? -ne 0 ]; then
    echo "fs bad after snapshot remove/create"
    qemu-nbd -d /dev/nbd0
    exit 1
fi
qemu-nbd -d /dev/nbd0

# all seems well
exit 0
####################################################################

I'll run that in a loop and see if it fails after 10 tries.

If you see anything there that I am NOT doing which would help to reproduce,
please let me know.

tags: added: qcow2
Serge Hallyn (serge-hallyn) wrote :

As far as I know, everyone who has experienced this has been using a
thinkpad. I first experienced this myself last week, on a new
thinkpad running utopic.

Two curious things I noticed, besides this being a thinkpad:

1. I could not start the VM with the bad image at all. Until I rebooted.
Then the image was fine, and fsck-clean. This suggests a possible problem
with the page cache on the host.

2. I then disabled KSM. I have not seen this problem since then, however I
also have not hit a vm quite as hard yet. Will have to see whether a series
of package builds manages to make this happen again with KSM disabled.

Jamie Strandboge (jdstrand) wrote :

On utopic amd64, I tried the new qemu 2.1 packages and disabled KSM. They seemed to be ok for a while, but after using 'uvt update' today (which under the hood does what is described in the bug description), I lost 6 VMs to this bug. A reboot did not solve it. I've downgraded to saucy again. Unfortunately, the saucy packages are no longer supported and have stopped getting security updates. This is getting rather dire for me....

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Hi Jamie,

just to make sure, did you permanently disable ksm? Does

cat /sys/kernel/mm/ksm/run

still show 0?
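
For reference, the value in that file is 0 when KSM is stopped, 1 when it is running, and 2 when it is stopped with all merged pages unmerged. A small sketch that reports the state; the function name and the optional path argument are illustrative, the latter added so the function can be pointed at a test file:

```sh
# ksm_state [RUN_FILE]: print whether KSM is running, based on the sysfs knob.
ksm_state() {
    run_file=${1:-/sys/kernel/mm/ksm/run}
    [ -r "$run_file" ] || { echo "unknown"; return 1; }
    case "$(cat "$run_file")" in
        0) echo "stopped" ;;
        1) echo "running" ;;
        2) echo "stopped, pages unmerged" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

To stop KSM until something re-enables it, `echo 0 | sudo tee /sys/kernel/mm/ksm/run` writes the knob directly.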

I've so far never seen a case where a reboot did not fix the issue,
nor have I seen an issue (other than suspending the host sometimes
causing the VM to hang so that I have to destroy it) with ksm
disabled.

I had hoped to do some large parallel upgrade tests this week, but
network at linuxcon is not up to the task (even with apt-cacher-ng!)
If I can find a better room I'll see about trying there.

Jamie Strandboge (jdstrand) wrote : Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

I disabled KSM by setting /etc/default/qemu-kvm to have:
KSM_ENABLED=0

and did 'sudo restart qemu-kvm'. I also rebooted before seeing the problem. Since then, I downgraded to saucy's qemu-kvm which reset KSM_ENABLED=1. I didn't specifically check /sys/kernel/mm/ksm/run and of course now this is set to '1'.

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Ok - thanks Jamie.

Ryan Harper (raharper) wrote : Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

For the reproducers, something worth trying is external snapshots (instead of the internal ones that snapshot-create-as creates when run without flags).

instead run: snapshot-create-as --disk-only

which will basically do qemu-img create -b your_original_qcow2 -f qcow2 pristine

And store the snapshot delta in a separate file.
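
As a sketch of that suggestion (the domain and snapshot names are placeholders, and `VIRSH` is a variable introduced here only so the command can be dry-run with `echo`):

```sh
VIRSH=${VIRSH:-virsh}

# external_snapshot DOMAIN NAME
# Create a disk-only (external) snapshot: new writes go to a separate overlay
# file and the original image becomes its backing file.
external_snapshot() {
    $VIRSH snapshot-create-as "$1" "$2" --disk-only
}
```

Removing an external snapshot is a different workflow from snapshot-delete on an internal one, so the rest of the uvt sequence would need adapting.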

Ryan Harper (raharper) wrote :

I've been running the scripts from comment #10. I have two VMs each running simultaneously; I've completed 24 hours of this sequence, about 50 total cycles with zero errors in the qcow2 images.

We're missing something; possibly hardware specific?

Host machine is an Intel NUC on Trusty.
Linux kriek 3.13.0-34-generic #60-Ubuntu SMP Wed Aug 13 15:45:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

I'll see about increasing concurrency next.

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

There are 69 commits to block/qcow* between 1.5.0 and 1.7.0.
I have compiled binaries of qemu-system-x86_64 and qemu-img
at each of those commits and pushed them to

http://people.canonical.com/~serge/binaries.0
through
http://people.canonical.com/~serge/binaries.68

Note that binaries.0 is the *latest* commit.

So to bisect with these you could start with binaries.34, then
if that shows corruption, try binaries.51, or if it does not,
try binaries.17 etc. 6 steps should get us to a single commit.
It's not certain that one of these commits caused the
regression, but it seems a reasonable place to start.
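
The index arithmetic can be sketched as follows. The helper names are made up for illustration; the first argument is the newest index still in play and the second the oldest, matching the numbering above where binaries.0 is the latest commit.

```sh
# bisect_mid LO HI: index of the binaries.N directory to test next.
bisect_mid() {
    echo $(( ($1 + $2) / 2 ))
}

# bisect_narrow LO HI MID OUTCOME: print the remaining "LO HI" range.
# OUTCOME is "bad" if binaries.MID showed corruption, anything else if not.
bisect_narrow() {
    if [ "$4" = "bad" ]; then
        echo "$3 $2"   # regression is at MID or in an older commit (higher index)
    else
        echo "$1 $3"   # regression is newer than MID (lower index)
    fi
}
```

Starting from the range 0 68, `bisect_mid` gives 34; a bad result narrows the range to 34 68 (next test 51) and a good result narrows it to 0 34 (next test 17).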

Ryan Harper (raharper) wrote : Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

I'm also starting work on updating uvt to use external snapshots instead; this would be an alternative to use while chasing down the bug in internal snapshots.

Jamie Strandboge (jdstrand) wrote :

I tried to reproduce this many different ways with 2.1+dfsg-3ubuntu3 over the weekend and could not trigger the issue (with ksm enabled too). I don't know what version I had in comment #12. 2.1+dfsg-3ubuntu2 is plausible based on the date of the comment and the publication of this version, though I can't guarantee it wasn't 2.1+dfsg-2ubuntu2 or even 2.1+dfsg-2ubuntu1 though I did specifically mention I used 2.1. I don't see anything in the changes that jumps out that qcow2 corruption bugs were fixed since my comment, so I'm worried I just haven't been able to reproduce....

Jamie Strandboge (jdstrand) wrote :

I just had this happen to me with 2.1+dfsg-3ubuntu3 on utopic. I had a VM I had been using for a few days, then did a 'uvt stop -rf ...' followed by 'uvt update sec-utopic-amd64' and I was dropped to a grub rescue. :\

I'll downgrade again and regenerate the VM.

Jamie Strandboge (jdstrand) wrote :

This happened again with an important VM. I still don't have a reproducer for testing the bisect packages.... :(

Serge Hallyn (serge-hallyn) wrote :

On my main server (3.13.0-32-generic with precise userspace) I installed a trusty container with ext3 (LVM) backing store. There I installed uvt and created 4 VMs, 2 precise amd64 and 2 precise i386. Several times I did:

ubuntu@uvttest:~$ cat list
p-precise-server-amd64
p-precise-server-i386
q-precise-server-i386
q-precise-server-amd64
ubuntu@uvttest:~$ for n in `cat list`; do uvt start -fr $n; done
ubuntu@uvttest:~$ for n in `cat list`; do tmux splitw -p 25 -t $TMUX_PANE "expect vmupgrade.expect $n"; done

where vmupgrade.expect is:
=================================================================
#!/usr/bin/expect

set container [lrange $argv 0 0]
spawn ssh $container
#expect "assword:"
#send -- "ubuntu\r"

expect "$container:~$"
send -- "export DEBIAN_FRONTEND=noninteractive\r"
send -- "sudo sed -i 's/never/lts/' /etc/update-manager/release-upgrades\r"
expect "assword for ubuntu:"
send -- "ubuntu\r"

expect "$container:~$"
send -- "sudo apt-get update\r"
expect "$container:~$"
send -- "sudo do-release-upgrade -f DistUpgradeViewNonInteractive\r"
set timeout 11000
expect "$container:~$"
send -- "sudo reboot\r"
=================================================================

Then I find /lib -name xxx; sudo reboot; find /lib -name xxx; and look
through dmesg for errors, then do

ubuntu@uvttest:~$ for n in `cat list`; do uvt stop -fr $n; done

Alas I've seen no corruption yet. The goal here isn't just to reproduce
it, but to do so reliably enough to be able to bisect - this isn't it :(

Jamie Strandboge (jdstrand) wrote :

FYI, I was able to reproduce this last night and uploaded forhallyn-trusty-amd64.img.corrupted.gz to https://chinstrap.canonical.com/~jamie/lp1292234/ for comparison with forhallyn-trusty-amd64.img.gz.

Chris J Arges (arges) wrote :

$ od -x -N 72 forhallyn-trusty-amd64.img.corrupted | grep '[1-9]*'
refcount_table_cluster 0000 0100
0000000 4651 fb49 0000 0200 0000 0000 0000 0000
0000020 0000 0000 0000 1000 0000 0200 0000 0000
0000040 0000 0000 0000 1000 0000 0000 0300 0000
0000060 0000 0000 0100 0000 0000 0100 0000 0100
0000100 0000 0000 0500 0000

nb_snapshots = 0000 0100
snapshots_offset = 0000 0000 0500 0000

$ od -x -N 72 forhallyn-trusty-amd64.img | grep '[1-9]*'
0000000 4651 fb49 0000 0200 0000 0000 0000 0000
0000020 0000 0000 0000 1000 0000 0200 0000 0000
0000040 0000 0000 0000 1000 0000 0000 0300 0000
0000060 0000 0000 0100 0000 0000 0100 0000 0000
0000100 0000 0000 0000 0000

nb_snapshots = 0000 0000
snapshots_offset = 0000 0000 0000 0000

Looking at just the QCowHeader (without converting from big-endian), the differences are the ones annotated above: nb_snapshots and snapshots_offset. I think this looks 'ok', but I'll need to examine the rest of the file.
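
The header fields are stored big-endian, while `od -x` prints 16-bit words in host byte order, which is why the dumps above look scrambled on x86. A sketch that decodes the two fields that differ, assuming the version 2 header layout in which nb_snapshots is a 32-bit field at byte offset 60 and snapshots_offset a 64-bit field at offset 64 (the function names are illustrative):

```sh
# be_field FILE OFFSET NBYTES: print a big-endian unsigned integer field.
be_field() {
    # od -t x1 prints single bytes in file order, so joining them gives
    # the big-endian hex value directly.
    hex=$(od -A n -t x1 -j "$2" -N "$3" "$1" | tr -d ' \n')
    printf '%d\n' "0x$hex"
}

# qcow2_snapshot_fields IMAGE: print the snapshot-related header fields.
qcow2_snapshot_fields() {
    echo "nb_snapshots=$(be_field "$1" 60 4)"
    echo "snapshots_offset=$(be_field "$1" 64 8)"
}
```

Read this way, the corrupted image's header bytes correspond to nb_snapshots=1 and snapshots_offset=327680 (0x50000), and the original image's to 0 for both.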

Chris J Arges (arges)
Changed in qemu (Ubuntu):
assignee: Serge Hallyn (serge-hallyn) → Chris J Arges (arges)
Chris J Arges (arges) wrote :

OK, I think I can reproduce this: after running some disk operations (bonnie++ and splitting a 100MB file), if I shut down and try to boot the VM, the disk cannot be booted and I'm presented with the grub menu.

However this reproducer is not yet 100% reliable. Next week I'll work on bisecting it down after testing latest upstream.

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Awesome - thank you Chris.

Ryan Harper (raharper) wrote : Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Can we confirm which filesystems and options are enabled when reproducing (i.e., ext4 with extent mapping)[1]? Bug 1368815 sounds very much like this. If the reproducing systems have ext4 extent mapping enabled, one could create an ext4 fs without extent mapping[2] and see if this still reproduces.

If it is related to the ext4 extents, the rate of memory pressure and speed of the underlying device would determine whether or not the file ends up being corrupt which might explain the difficulty of reproducing.

1. % sudo tune2fs -l /dev/disk/by-id/dm-name-kriek--vg-root | grep -i features
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
2. mke2fs -t ext4 -O ^extent /dev/<device>
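
A small sketch for scripting that check: it reads `tune2fs -l` output on stdin and succeeds only if the `extent` feature is listed. The function name is made up here. Note that this is the filesystem-wide feature flag, whereas the kernel decides per inode, so files created before the feature was enabled can still be indirect-mapped.

```sh
# has_extent_feature: read `tune2fs -l` output on stdin.
# Exit status 0 if the "extent" feature is listed, non-zero otherwise.
has_extent_feature() {
    grep -i '^Filesystem features:' | tr -s ' ' '\n' | grep -qx 'extent'
}
```

For example, `sudo tune2fs -l /dev/<device> | has_extent_feature || echo "no extent feature"` flags a filesystem without extent mapping.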

Chris J Arges (arges) wrote :

Ryan,

The host's root filesystem is ext3/LVM (per Jamie's original configuration):

sudo tune2fs -l /dev/disk/by-id/dm-name-ubuntu--vg-root | grep -i features
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file

Jamie Strandboge (jdstrand) wrote :

Actually, for me it is just ext3 without LVM.
$ sudo tune2fs -l /dev/sda3 | grep -i features
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file

Chris J Arges (arges) wrote :

Attached is a reproducer for this issue. Here is what needs to be done to set it up:
1) The host machine's filesystem needs to be ext3
2) Install a VM (via virsh) and use a qcow2 disk
3) Ensure you can ssh without a password and the VM has bonnie++ installed
4) Adjust the variables in the script before running
5) Run the script a couple of times

While this doesn't reproduce 100% of the time, I can usually get a failure within 1-3 trials. However, on an ext4 host filesystem I've been unable to reproduce this issue.

Chris J Arges (arges) wrote :

Also I've been able to reproduce this with the latest master in qemu, and even with the latest daily 3.18-rcX kernel on the host.

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1292234] Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Excellent!

Any chance you can start bisecting with http://people.canonical.com/~serge/binaries.{0..68}/{qemu-img,qemu-system-x86_64} ?

Chris J Arges (arges) wrote : Re: qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)

Serge,
So I was able to just compile my own qemu and test with that.

I did attempt a reverse bisect, and was able to reproduce as early as v1.1 and also reproduce on master HEAD.
v1.0 was inconclusive because the qcow2 image I made with the newer binary seemed to be incompatible with v1.0; however, from Jamie's testing this seems to be a working version, so I'd say the original change that enabled this issue lies somewhere between v1.0.0 and v1.1.0. As I've been unable to reproduce this without virsh, reverse bisecting and using older qemu versions is a bit challenging as machine types change, features virsh wants to use aren't available, etc.

Another interesting result from today: I was able to reproduce with ext4 with extents disabled; maybe that gives more clues. Just to make sure I wasn't crazy, I mkfs'd the partition to vanilla ext4 and iterated for most of the afternoon with no failures.

My next steps are going to be enabling verbose output for qcow2, looking more deeply into what gets corrupted in the file, and turning on host filesystem debugging.

--chris

Chris J Arges (arges)
Changed in qemu (Ubuntu):
status: Confirmed → In Progress
Chris J Arges (arges) wrote :

FWIW, just re-reproduced this with latest upstream kernel / qemu / fresh qcow2 image.

Chris J Arges (arges)
summary: - qcow2 image corruption in trusty (qemu 1.7 and 2.0 candidate)
+ qcow2 image corruption on non-extent filesystems (ext3)
Chris J Arges (arges)
no longer affects: qemu
Chris J Arges (arges)
Changed in linux (Ubuntu):
assignee: nobody → Chris J Arges (arges)
importance: Undecided → High
status: New → In Progress
Changed in qemu (Ubuntu):
status: In Progress → Invalid
assignee: Chris J Arges (arges) → nobody
importance: High → Undecided
Chris J Arges (arges)
description: updated
Chris J Arges (arges) wrote :

Sent e-mail upstream about this issue: http://marc.info/?l=linux-fsdevel&m=142264422605440&w=2

Josep M. Perez (josep-m-perez) wrote :

Apparently this bug is also present in Debian. In my case the corrupted image was a windows one. When I run qemu-img check over it it will complain about lots of clusters, and if I pass it the repair flag, then it will end up crashing with the following message:

$ qemu-img check -r all windows.img
Repairing cluster 0 refcount=0 reference=1
Repairing cluster 1 refcount=0 reference=1
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
Repairing cluster 2 refcount=0 reference=1
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
Repairing cluster 3 refcount=0 reference=1
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
Repairing cluster 4 refcount=0 reference=1
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
Repairing cluster 5 refcount=0 reference=1
qcow2: Preventing invalid write on metadata (overlaps with active L1 table); image marked as corrupt.
Repairing cluster 6 refcount=0 reference=1
...
Repairing OFLAG_COPIED data cluster: l2_entry=8000000397a59000 refcount=0
Repairing OFLAG_COPIED data cluster: l2_entry=8000000397a5a000 refcount=0
Repairing OFLAG_COPIED data cluster: l2_entry=800000000001b000 refcount=0
The following inconsistencies were found and repaired:

    0 leaked clusters
    97850 corruptions

Double checking the fixed image now...
[1] 27716 segmentation fault (core dumped) qemu-img check -r all windows.img

Has anyone else tried this over a copy of the corrupted image?

Chris J Arges (arges)
description: updated
description: updated
Chris J Arges (arges)
description: updated
Chris J Arges (arges) wrote :

@josep-m-perez
Yes, this is an upstream bug. So it affects anyone using the right filesystem and CONFIGs. Once we fix this upstream, then it will be submitted as a stable kernel update and make its way into all stable kernels as applicable.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.18.0-13.14

---------------
linux (3.18.0-13.14) vivid; urgency=low

  [ Andy Whitcroft ]

  * hyper-v -- fix comment handing in /etc/network/interfaces
    - LP: #1413020

  [ Chris J Arges ]

  * [Config] Add ibmvfc to d-i
    - LP: #1416001
  * SAUCE: ext4: disable ext4_punch_hole for indirect filesystems
    - LP: #1292234

  [ Leann Ogasawara ]

  * rebase to v3.18.5
  * [Config] CONFIG_X86_UP_APIC_MSI=y
  * Release Tracking Bug
    - LP: #1417475
 -- Leann Ogasawara <email address hidden> Thu, 05 Feb 2015 09:58:20 +0200

Changed in linux (Ubuntu):
status: In Progress → Fix Released
Chris J Arges (arges) wrote :

Note there currently is a patch upstream:
https://lkml.org/lkml/2015/2/10/520

This fixes the original bug correctly without having to disable ext4_punch_hole for indirect filesystems. Once this lands in Linus' tree, I'll file an SRU to get this fixed across the board.

Jamie Strandboge (jdstrand) wrote :

Woohoo! *Huge* thanks. This was a tricky one :)

Chris J Arges (arges)
description: updated
Chris J Arges (arges) wrote :

Sent email to upstream stable to apply this fix to affected kernels.

Chris J Arges (arges)
Changed in linux (Ubuntu):
status: Fix Released → Confirmed
Andy Whitcroft (apw)
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.19.0-12.12

---------------
linux (3.19.0-12.12) vivid; urgency=low

  [ Andy Whitcroft ]

  * [Packaging] do_common_tools should always be on
  * [Packaging] Provides: virtualbox-guest-modules when appropriate
    - LP: #1434579

  [ Chris J Arges ]

  * Revert "SAUCE: ext4: disable ext4_punch_hole for indirect filesystems"
    - LP: #1292234

  [ Leann Ogasawara ]

  * Release Tracking Bug
    - LP: #1439803

  [ Timo Aaltonen ]

  * SAUCE: i915_bpo: Provide a backport driver for Skylake & Cherryview
    graphics
    - LP: #1420774
  * SAUCE: i915_bpo: Update intel_ips.h file location
    - LP: #1420774
  * SAUCE: i915_bpo: Only support Skylake and Cherryview with the backport
    driver
    - LP: #1420774
  * SAUCE: i915_bpo: Rename the backport driver to i915_bpo
    - LP: #1420774
  * i915_bpo: [Config] Enable CONFIG_DRM_I915_BPO=m
    - LP: #1420774
  * SAUCE: i915_bpo: Add i915_bpo_*() calls for ubuntu/i915
    - LP: #1420774
  * SAUCE: i915_bpo: Revert "drm/i915: remove unused
    power_well/get_cdclk_freq api"
    - LP: #1420774
  * SAUCE: i915_bpo: Add i915_bpo specific power well calls
    - LP: #1420774
  * SAUCE: Backport I915_PARAM_MMAP_VERSION and I915_MMAP_WC
    - LP: #1420774
  * SAUCE: Partial backport of drm/i915: Add ioctl to set per-context
    parameters
    - LP: #1420774
  * SAUCE: drm/i915: Specify bsd rings through exec flag
    - LP: #1420774
  * SAUCE: drm/i915: add I915_PARAM_HAS_BSD2 to i915_getparam
    - LP: #1420774
  * SAUCE: drm/i915: add component support
    - LP: #1420774
  * SAUCE: drm/i915: Add tiled framebuffer modifiers
    - LP: #1420774
  * SAUCE: Backport new displayable tiling formats
    - LP: #1420774
  * SAUCE: Backport drm_crtc_vblank_reset() helper
    - LP: #1420774
  * SAUCE: drm/i915: Add I915_PARAM_REVISION
    - LP: #1420774
  * SAUCE: drm/i915: Export total subslice and EU counts
    - LP: #1420774
  * SAUCE: i915_bpo: Revert drm/mm: Support 4 GiB and larger ranges
    - LP: #1420774

  [ Upstream Kernel Changes ]

  * drm/i915/skl: Split the SKL PCI ids by GT
    - LP: #1420774
  * drm: Reorganize probed mode validation
    - LP: #1420774
  * drm: Perform basic sanity checks on probed modes
    - LP: #1420774
  * drm: Do basic sanity checks for user modes
    - LP: #1420774
  * drm/atomic-helper: Export both plane and modeset check helpers
    - LP: #1420774
  * drm/atomic-helper: Again check modeset *before* plane states
    - LP: #1420774
  * drm/atomic: Introduce state->obj backpointers
    - LP: #1420774
  * drm: allow property validation for refcnted props
    - LP: #1420774
  * drm: store property instead of id in obj attachment
    - LP: #1420774
  * drm: get rid of direct property value access
    - LP: #1420774
  * drm: add atomic_set_property wrappers
    - LP: #1420774
  * drm: tweak getconnector locking
    - LP: #1420774
  * drm: add atomic_get_property
    - LP: #1420774
  * drm: Remove unneeded braces for single statement blocks
    - LP: #1420774
  * drm: refactor getproperties/getconnector
    - LP: #1420774
  * drm: add atomic properties
    - LP: #1420774
  * drm/atomic: atomic_check functions
    - LP: #1420774
  * drm: s...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Seth Arnold (seth-arnold) wrote :

Is this still open against the 14.04.1 LTS kernel?

Thanks

Chris J Arges (arges) wrote :

The fix is the following:
$ git describe --contains 6f30b7e37a8239f9d27db626a1d3427bc7951908
v4.0-rc1~1^2

I thought this was going to be queued up for stable, but it doesn't look like that happened.
If this still affects you in 3.13 or 3.16, I can backport this patch. Let me know.

Seth Arnold (seth-arnold) wrote :

Chris, please do, I just recreated the issue with the "uvt update -rf" recipe from earlier; four of six VMs couldn't boot to a login: prompt, presumably from this bug.

Linux hunt 3.13.0-65-generic #106-Ubuntu SMP Fri Oct 2 22:08:27 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

(I know, it misses this week's update. I can't keep up on this treadmill...)

Thanks

Chris J Arges (arges)
no longer affects: qemu (Ubuntu)
Chris J Arges (arges)
no longer affects: qemu (Ubuntu Trusty)
no longer affects: qemu (Ubuntu Vivid)
Changed in linux-lts-utopic (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu Trusty):
assignee: nobody → Chris J Arges (arges)
Changed in linux (Ubuntu Vivid):
assignee: nobody → Chris J Arges (arges)
Changed in linux-lts-utopic (Ubuntu Trusty):
assignee: nobody → Chris J Arges (arges)
Changed in linux (Ubuntu Trusty):
importance: Undecided → High
Changed in linux (Ubuntu Vivid):
importance: Undecided → High
Changed in linux-lts-utopic (Ubuntu Trusty):
importance: Undecided → High
Changed in linux (Ubuntu Trusty):
status: New → In Progress
Changed in linux (Ubuntu Vivid):
status: New → In Progress
Changed in linux-lts-utopic (Ubuntu Trusty):
status: New → In Progress
Chris J Arges (arges) wrote :

OK, verified that this fix is in the 3.16 and 3.19+ kernels. Sent the Trusty backport to the mailing list.

Changed in linux (Ubuntu Vivid):
status: In Progress → Fix Released
Changed in linux-lts-utopic (Ubuntu Trusty):
status: In Progress → Fix Released
Chris J Arges (arges)
Changed in linux (Ubuntu Vivid):
assignee: Chris J Arges (arges) → nobody
Changed in linux-lts-utopic (Ubuntu Trusty):
assignee: Chris J Arges (arges) → nobody
Brad Figg (brad-figg)
Changed in linux (Ubuntu Trusty):
status: In Progress → Fix Committed
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
tags: added: verification-failed
removed: verification-needed-trusty
Seth Arnold (seth-arnold) wrote :

I was unable to test this specific modification due to significant regressions in the proposed kernel: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518509

Brad Figg (brad-figg)
tags: added: verification-failed-trusty
removed: verification-failed
Seth Arnold (seth-arnold) wrote :

Henrix pointed out that I also needed the linux-image-extras package. I'm now able to test this, and will report back when I've had a chance to create the VM images.

Thanks

tags: added: verification-needed-trusty
removed: verification-failed-trusty
Seth Arnold (seth-arnold) wrote :

I've tested several dozen VM snapshot and revert operations; previously, I'd have expected all my VMs to be dead by this time. This update makes libvirt / qemu with qcow2 images usable again for me. Thanks!

tags: added: verification-done-trusty
removed: verification-needed-trusty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-70.113

---------------
linux (3.13.0-70.113) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1516733

  [ Upstream Kernel Changes ]

  * arm64: errata: use KBUILD_CFLAGS_MODULE for erratum #843419
    - LP: #1516682

linux (3.13.0-69.112) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1514858

  [ Joseph Salisbury ]

  * SAUCE: storvsc: use small sg_tablesize on x86
    - LP: #1495983

  [ Luis Henriques ]

  * [Config] updateconfigs after 3.13.11-ckt28 and 3.13.11-ckt29 stable
    updates

  [ Upstream Kernel Changes ]

  * ext4: fix indirect punch hole corruption
    - LP: #1292234
  * x86/hyperv: Mark the Hyper-V TSC as unstable
    - LP: #1498206
  * namei: permit linking with CAP_FOWNER in userns
    - LP: #1498162
  * iwlwifi: pci: add a few more PCI subvendor IDs for the 7265 series
    - LP: #1510616
  * Drivers: hv: vmbus: Increase the limit on the number of pfns we can
    handle
    - LP: #1495983
  * sctp: fix race on protocol/netns initialization
    - LP: #1514832
  * [media] v4l: omap3isp: Fix sub-device power management code
    - LP: #1514832
  * [media] rc-core: fix remove uevent generation
    - LP: #1514832
  * xtensa: fix threadptr reload on return to userspace
    - LP: #1514832
  * ARM: OMAP2+: DRA7: clockdomain: change l4per2_7xx_clkdm to SW_WKUP
    - LP: #1514832
  * mac80211: enable assoc check for mesh interfaces
    - LP: #1514832
  * PCI: Add dev_flags bit to access VPD through function 0
    - LP: #1514832
  * PCI: Add VPD function 0 quirk for Intel Ethernet devices
    - LP: #1514832
  * usb: dwc3: ep0: Fix mem corruption on OUT transfers of more than 512
    bytes
    - LP: #1514832
  * serial: 8250_pci: Add support for Pericom PI7C9X795[1248]
    - LP: #1514832
  * KVM: MMU: fix validation of mmio page fault
    - LP: #1514832
  * auxdisplay: ks0108: fix refcount
    - LP: #1514832
  * devres: fix devres_get()
    - LP: #1514832
  * iio: adis16400: Fix adis16448 gyroscope scale
    - LP: #1514832
  * iio: Add inverse unit conversion macros
    - LP: #1514832
  * iio: adis16480: Fix scale factors
    - LP: #1514832
  * iio: industrialio-buffer: Fix iio_buffer_poll return value
    - LP: #1514832
  * iio: event: Remove negative error code from iio_event_poll
    - LP: #1514832
  * NFSv4: don't set SETATTR for O_RDONLY|O_EXCL
    - LP: #1514832
  * unshare: Unsharing a thread does not require unsharing a vm
    - LP: #1514832
  * ASoC: adav80x: Remove .read_flag_mask setting from
    adav80x_regmap_config
    - LP: #1514832
  * drivers: usb :fsl: Implement Workaround for USB Erratum A007792
    - LP: #1514832
  * drivers: usb: fsl: Workaround for USB erratum-A005275
    - LP: #1514832
  * serial: 8250: don't bind to SMSC IrCC IR port
    - LP: #1514832
  * staging: comedi: adl_pci7x3x: fix digital output on PCI-7230
    - LP: #1514832
  * blk-mq: fix buffer overflow when reading sysfs file of 'pending'
    - LP: #1514832
  * xtensa: fix kernel register spilling
    - LP: #1514832
  * NFS: nfs_set_pgio_error sometimes misses errors
    - LP: #1514832
  * NFS: Fix a NULL pointer dereference of migration...

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released