kvm + virtio disk corrupts large volumes (>1TB).

Bug #574665 reported by Curt Sampson
This bug affects 12 people
Affects              Status        Importance  Assigned to   Milestone
KVM                  Unknown       Unknown
qemu-kvm (Ubuntu)    Fix Released  High        Serge Hallyn
Lucid                Fix Released  High        Serge Hallyn
Maverick             Fix Released  High        Serge Hallyn

Bug Description

Binary package hint: qemu-kvm

See this bug:

  http://sourceforge.net/tracker/index.php?func=detail&aid=2933400&group_id=180599&atid=893831

I have confirmed that this happens with 10.04 host and guest. Giving kvm a logical virtio volume of 1024 GB works; a logical volume of 1048 GB fails to install and produces disk errors on the guest (but not on the host). Changing the kvm configuration to use an IDE volume instead works fine even with a 1.7TB volume.
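A rough check of why the boundary might sit right at 1024 GB. This is only a sketch under assumptions that the report does not state: 512-byte sectors, "GB" meaning GiB (as LVM reports sizes), and a sector difference that ends up in a 32-bit signed int somewhere in the virtio write path. The names below are made up for illustration.

```python
SECTOR_BYTES = 512        # assumption: 512-byte sectors
INT32_MAX = 2**31 - 1     # assumption: the sector difference is held in a 32-bit signed int
GIB = 2**30               # assumption: the sizes in the report are GiB


def max_sector_distance(volume_gib):
    """Largest possible difference between two sector numbers on the volume."""
    sectors = volume_gib * GIB // SECTOR_BYTES
    return sectors - 1


def distance_fits_in_int32(volume_gib):
    """True if no pair of sectors on the volume can overflow a 32-bit signed int."""
    return max_sector_distance(volume_gib) <= INT32_MAX


for size_gib in (1024, 1048):
    print(size_gib, max_sector_distance(size_gib), distance_fits_in_int32(size_gib))
```

Under these assumptions a 1024 GB volume is the largest size where every sector difference still fits, and 1048 GB is past it. Exceeding the limit only makes an overflow possible, not certain, which would be consistent with corruption showing up at different times for different people.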

SRU Justification:

1. Impact: Use of a > 1 TB virtio disk in kvm will almost certainly cause data corruption.
2. How bug addressed: A minimal patch was cherry-picked from upstream. It addresses the fact that the difference between the start sectors of two requests can exceed the range of an int.
3. Patch: See lp:~serge-hallyn/ubuntu/lucid/qemu-kvm/virtio-corrupt, in particular the patch block-fix-sector-comparison.patch.
4. To reproduce: Create a 1.3 TB qemu disk image or LVM partition. Run kvm with this as the virtio hard drive and a natty installer as cdrom. Select the default drive configuration (use whole disk with LVM). The installer will hang at this step.
 qemu-img create -f raw test.img 1300G
 kvm -cdrom natty-server-amd64.iso -drive file=test.img,if=virtio,index=0 -boot d -m 2G
 Re-test with the packages from the attached qemu-virtio-tst1.tgz tarball. The VM should now successfully install and be fully, reliably useful.
5. Regression potential: The patch is taken from upstream. If there is a regression, it can only impact qemu-kvm itself.
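To illustrate the kind of bug item 2 describes, here is a small Python model. It is not the upstream patch, which is C code in qemu. It assumes the comparison was done by subtracting two 64-bit sector numbers and returning the result as a 32-bit int, which is my reading of the patch name block-fix-sector-comparison.patch and is not confirmed in this report. All function names and sector values are hypothetical.

```python
from functools import cmp_to_key


def to_int32(value):
    """Wrap an integer to a signed 32-bit value, as a typical C int conversion would."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def compare_by_subtraction(a, b):
    """Hypothetical buggy comparator: the 64-bit difference is truncated to 32 bits."""
    return to_int32(a - b)


def compare_explicitly(a, b):
    """Comparator that only reports the sign, so the result cannot overflow."""
    return (a > b) - (a < b)


# Start sectors of two hypothetical requests that are more than 2**31 sectors apart.
requests = [3_000_000_000, 8]

sorted_buggy = sorted(requests, key=cmp_to_key(compare_by_subtraction))
sorted_fixed = sorted(requests, key=cmp_to_key(compare_explicitly))

print("subtraction:", sorted_buggy)
print("explicit:   ", sorted_fixed)
```

With the subtraction comparator the two requests stay in the wrong order, because the truncated difference comes out negative. Any code that relies on requests being sorted by start sector could then treat them incorrectly, which fits the symptom of corruption only on volumes large enough to have sectors that far apart.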

Revision history for this message
C de-Avillez (hggdh2) wrote :

Marking Confirmed/High, added upstream bug.

Changed in qemu-kvm (Ubuntu):
importance: Undecided → High
status: New → Confirmed
Revision history for this message
masc (masc) wrote :

Even though there's no corruption, ide cannot be considered an acceptable workaround with kvm.

ATA exceptions will occur, mostly when writing large files (e.g. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/279693). Proposed workarounds like putting a CD in the drive don't work with kvm. All processes that have I/O requests pending will freeze for ~10 seconds when this happens. And it does, a lot.

Revision history for this message
Curt Sampson (cjs-cynic) wrote :

masc, the bug you pointed out appears to be when using a particular hardware SATA chipset; KVM isn't mentioned anywhere there. Are you saying that this happens with KVM guests when running on hosts without this problem?

Using dd, I wrote a new 60GB file (writing zeros) on a guest using the KVM IDE emulation, and saw no errors on either guest or host.

Revision history for this message
masc (masc) wrote :

yes, writing is not enough though. try copying 60 GB from one ide disk to another.

there's plenty of those ata exception bugs floating around, I just picked the closest example.
failures during WRITE_DMA_EXT are most common.

Revision history for this message
zyro (zyro23) wrote :

i can confirm this issue. virtio plus guest volumes (qcow2/raw seems irrelevant, tried both with errors) >1T causes image corruption. the used disk space in the guests did not matter in my case. io errors showed sometimes at 10gb, sometimes later when already ~750gb were copied onto the virtual disk.

additionally, lucid 10.04 server partitioner freezes while trying to create ext4 fs on virtual disk > 1T with virtio. only workaround here was to run setup with ide driver, switching to virtio afterwards. but again, that caused disk corruption after some time (varied between minutes and days).

atm using one 250G virtio raw image and one 2T ide raw image stable without any flaws.

Revision history for this message
Michael Tokarev (mjt+launchpad-tls) wrote :
Revision history for this message
Dustin Kirkland  (kirkland) wrote :

Thanks for the pointer, Michael. I don't see where this has landed in either the upstream qemu or qemu-kvm git yet. As soon as it does, I'll cherry pick the fix for Ubuntu 10.04. Thanks.

Changed in qemu-kvm (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Dustin Kirkland (kirkland)
status: In Progress → Triaged
Changed in qemu-kvm (Ubuntu Lucid):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Dustin Kirkland (kirkland)
milestone: none → lucid-updates
Revision history for this message
David Weber (wb-munzinger) wrote :
Revision history for this message
Kari Kallioinen (karbas) wrote :

I patched qemu-kvm (0.12.3+noroms-0ubuntu9) with the patch at http://marc.info/?l=qemu-devel&m=127436114712437 and recompiled, but my 4TB partition still gets corrupted.

I have a setup where a 4TB partition (from LVM) is a virtio device in qemu-kvm. I also have LVM inside the virtual machine, with a 4GB partition as root and the rest of the 4TB used as a data partition:
<-->
Filesystem                  1K-blocks       Used  Available Use% Mounted on
/dev/mapper/rootvg-root       9611492    3144132    5979120  35% /
none                           505832        172     505660   1% /dev
none                           511844          0     511844   0% /dev/shm
none                           511844         92     511752   1% /var/run
none                           511844          0     511844   0% /var/lock
none                           511844          0     511844   0% /lib/init/rw
/dev/mapper/rootvg-storage 4216750656 1325906868 2890843788  32% /storage
/dev/vda3                      217347      21434     184691  11% /boot
<-->

My virtual machine still seems to corrupt its virtual disk and logs the following errors in dmesg:

[ 325.052492] EXT4-fs error (device dm-2): ext4_mb_generate_buddy: EXT4-fs: group 9708: 27488 blocks in bitmap, 32768 in gd
[ 325.273227] Aborting journal on device dm-2-8.
[ 325.457530] EXT4-fs error (device dm-2): ext4_journal_start_sb: Detected aborted journal
[ 325.704655] EXT4-fs (dm-2): Remounting filesystem read-only
[ 325.854758] EXT4-fs (dm-2): Remounting filesystem read-only
[ 326.057791] EXT4-fs (dm-2): delayed block allocation failed for inode 81264677 at logical offset 102400 with max blocks 19767 with error -30
[ 326.385331]
[ 326.385358] This should not happen!! Data will be lost
[ 326.575336] EXT4-fs (dm-2): ext4_da_writepages: jbd2_start: 19767 pages, ino 81264677; err -30

Revision history for this message
masc (masc) wrote :

Try patching qemu-kvm 0.12.4; it seemed to me there were other changes in 0.12.4 contributing to this fix.
I verified that it works on Gentoo (http://bugs.gentoo.org/show_bug.cgi?id=321005), at least with a 1.5TB disk.

Revision history for this message
Kari Kallioinen (karbas) wrote :

Yes. That seemed to help. 0.12.4 + patch.

Revision history for this message
David Weber (wb-munzinger) wrote :

Ubuntu Developers: Is there any chance to bump qemu-kvm to 0.12.4 in lucid? This would also fix a few other bugs.

If not, we would have to backport a few other commits that are possibly related to this bug.

Revision history for this message
Steven Wagner (stevenwagner) wrote :

As mentioned in comment #3, could this be related to a specific SATA chipset? I have an Intel 82801JI ICH10 SATA AHCI Controller, and have experienced random disk corruption on guests. Guests are both Linux and Win2k3. High usage on the SATA controller seemed to be the trigger. I've noticed no difference whether the guest is using virtio or IDE mode. Guests will become unresponsive for >10 seconds when there is high disk usage on the node.

Revision history for this message
Curt Sampson (cjs-cynic) wrote : Re: [Bug 574665] Re: kvm + virtio disk corrupts large volumes (>1TB).

On 2010-06-24 22:24 -0000 (Thu), Steven Wagner wrote:

> As mentioned in comment #3, could this be related to a specific SATA
> chipset?...Ive noticed no difference for if the guest is using virtio
> or ide mode.

For me the guests work fine in IDE mode; it's only virtio that has a
problem. So, in a word, no.

Revision history for this message
masc (masc) wrote :

@curt: did you try copying 60gb from one raw ide disk to another within ubuntu guest?

Revision history for this message
masc (masc) wrote :

@steven: what you're describing seems to be a different issue though, sounds rather host related.
Anyway, the whole IDE guest story is a bit off-topic here; I'll open a new bug for it as soon as I have time to do more tests.

Revision history for this message
Dan Bonelli (mccroskey42) wrote :

I'm experiencing a similar issue, albeit with slightly different hardware. I have several VMs backed by SAN storage, accessed through a QLogic 2560 controller. All of my volumes that are <1TB seem to function without issue. On the larger volumes (one in particular is >7TB with an XFS filesystem), I get near constant corruption whenever I copy a large volume of data (50-100GB). I switched the VM guest to use IDE instead of virtio, and the corruption was less frequent but didn't disappear altogether.

I've mounted the SAN volumes on the host and done 100GB+ copies without going through KVM and they worked flawlessly, so I can rule out the SAN setup as being problematic.

Revision history for this message
masc (masc) wrote :

Seems more tricky to patch than expected, more things contributing to this defect.
There was some regression on gentoo as well, patched/fixed in -r1 and re-introduced in -r2 although the previous patch was in.

Most efficient solution will be not to patch. Update to 0.12.5 which fixes the problem properly.

Revision history for this message
Thierry Carrez (ttx) wrote :

Dustin, any progress on this one ?
If you don't have the time to work on it, maybe reassign to Serge.

Changed in qemu-kvm (Ubuntu Lucid):
assignee: Dustin Kirkland (kirkland) → Serge Hallyn (serge-hallyn)
Changed in qemu-kvm (Ubuntu Maverick):
assignee: Dustin Kirkland (kirkland) → Serge Hallyn (serge-hallyn)
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

In the next week, I intend to push 0.12.5 to lucid-backports. Will update when
the merge has been proposed.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Note that this is fixed in 0.12.5 which is currently in Maverick.

Changed in qemu-kvm (Ubuntu Maverick):
status: Triaged → Fix Released
Revision history for this message
Kendall Gifford (zettabyte) wrote :

Is there any chance this bug can still affect an ext4 filesystem on a partition that is just slightly under 1TB? I originally experienced this bug and, to work around it, re-partitioned my 1.5 TB drive into two partitions. Since then I had not experienced any issues until today, when I got some corruption out of the blue. I'm wondering if it is possible that the bug still exists with smaller filesystems but occurs much more infrequently.

I posted the full details of my "story" here: http://ubuntuforums.org/showthread.php?t=1570149

Either way, looking forward to the roll-out of 0.12.5 into lucid-backports so that, if nothing else, I can (somewhat) eliminate the kvm/virtio layer as the source of my problems.

Revision history for this message
Marti (intgr) wrote :

@Kendall: I wouldn't be surprised, there have been lots of disk corruption bugs in the 0.12 series and the qemu-kvm project obviously has problems with project management (bug reports being ignored, poor QA etc).
Anyway, see:
http://www.linux-kvm.com/content/qemu-kvm-0125-released-bugfixes
http://www.linux-kvm.com/content/qemu-kvm-0124-released

This is the reason why I've postponed the upgrade of my VM servers from Karmic until now. Hopefully 0.12.5 has fixed all of the block device corruption bugs.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

(backport requested with LP: #632528)

Revision history for this message
Unlogic (unlogic-unlogic) wrote :

Is there any way to speed up the backports process? The kvm 0.12.3 that currently ships with Lucid has a few serious bugs which are fixed in 0.12.5.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

I've taken the proposed patch from http://marc.info/?l=qemu-devel&m=127436114712437 and applied it to lucid's qemu-kvm. The resulting source is at https://code.launchpad.net/~serge-hallyn/ubuntu/lucid/qemu-kvm/virtio-corrupt and the package is building in ppa:serge-hallyn/virt.

Could you please test the package in ppa:serge-hallyn/virt and let us know if it fixes your problem? It may not be sufficient since we're not starting from 0.12.4, but hopefully this patch suffices.

Revision history for this message
Curt Sampson (cjs-cynic) wrote : Re: [Bug 574665] Re: kvm + virtio disk corrupts large volumes (>1TB).

Unfortunately, I no longer have that particular configuration available
for testing. (It's now in production, and I can no longer risk destroying
the data on the disks.)

Revision history for this message
masc (masc) wrote :

just out of curiosity, 0.12.5 works well, why not just use it?

even dist-upgrade to 10.10 which has 0.12.5 would probably be less painful than having to test/deal with this bug.

Revision history for this message
Unlogic (unlogic-unlogic) wrote :

Good point!

Why keep struggling with patches for 0.12.4 instead of just moving on to 0.12.5 as in 10.10?

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Quoting Unlogic (<email address hidden>):
> Good point!
>
> Why keep struggling with patches for 0.12.4 instead of just moving on to
> 0.12.5 as in 10.10?

10.10 has its own set of bugs, pertaining to 32-bit guests and hosts,
which I've not quite had the hardware to faithfully reproduce.

Changed in qemu-kvm (Ubuntu Lucid):
status: Triaged → In Progress
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

My first few tests appear to verify that (a) stock lucid qemu-kvm succeeds with a 1.1T virtio disk (on a raw LVM partition) but fails with a 1.3T virtio disk, and (b) the qemu-kvm from the attached qemu-virtio-tst1.tgz tarball (which is compiled from lp:~serge-hallyn/ubuntu/lucid/qemu-kvm/virtio-corrupt) succeeds with a 1.3T virtio disk.

Continuing to test.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Several more tests have been 100% successful.

description: updated
tags: added: verification-needed
description: updated
Revision history for this message
Brian Roberg (robergb) wrote :

Thank you for backporting this patch, Serge! What will it take to get things rolling again for this to be included in lucid-updates? Are you just waiting for someone else to verify that the patch works in their setup? If so, we could probably give it a shot.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks, Brian, yes I need someone other than myself to confirm the fix. Greatly appreciated if you could!

Revision history for this message
Brian Roberg (robergb) wrote :

OK, a fellow sysadmin here is going to test the patch in our setup. He'll post the result here. There's a good chance he'll be able to do it this week, or perhaps next week.

Revision history for this message
Richard Neuböck (hawk-tbi) wrote :

Thanks for this patch! I can confirm that it works. I'm using a (host) -> hardware raid 1 -> lvm -> drbd [2TB] -> (virtual machine) setup that fails (corrupts filesystems) repeatedly using the stock qemu-kvm implementation in lucid. After replacing the original qemu packages with the patched ones, everything works fine, even after heavy read/write tests that have been running for several days now.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks very much, Richard!

tags: added: verification-done
removed: verification-needed
Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Removed verification-done .. as the package hasn't been tested from lucid-proposed yet.

tags: removed: verification-done
Revision history for this message
Clint Byrum (clint-fewbar) wrote :

APPROVED: qemu-kvm version 0.12.3+noroms-0ubuntu9.5 , uploaded to lucid-proposed, should be accepted.

Revision history for this message
Martin Pitt (pitti) wrote : Please test proposed package

Accepted qemu-kvm into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in qemu-kvm (Ubuntu Lucid):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Richard Neuböck (hawk-tbi) wrote :

I've tested the packages (especially qemu*) from lucid-proposed successfully. This time on a different machine but same hard- and software setup as before. Intense read and write tests on a 2TB filesystem ran for 24h without corrupting the filesystem.

Martin Pitt (pitti)
tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu-kvm - 0.12.3+noroms-0ubuntu9.5

---------------
qemu-kvm (0.12.3+noroms-0ubuntu9.5) lucid-proposed; urgency=low

  * debian/patches/block-fix-sector-comparison.patch: Fix virtio disk
    corruption with large (>1Tb) volumes (LP: #574665)
 -- Serge Hallyn <email address hidden> Fri, 28 Jan 2011 13:17:30 -0600

Changed in qemu-kvm (Ubuntu Lucid):
status: Fix Committed → Fix Released