virtio block read errors

Bug #660060 reported by Lionel Bouton
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Fix Released
Undecided
Unassigned

Bug Description

Context :
- Gentoo Linux distribution on host and guests.
- qemu-kvm-0.12.5-r1
- 2.6.34-gentoo-r11 host kernel
- 2.6.29-gentoo-r5 guest kernels
- VM boots from and uses a single virtio block device.

On the old kvm bugtracker there was a discussion about a bug with virtio block devices :
http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2933400&group_id=180599
I was affected (user gyver in the above discussion) and believed that the problem was fully solved : we had the read error problems on 4 physical hosts . I migrated 3 of the 4 hosts to Gentoo's qemu-kvm-0.12.5-r1 which fixed the problems and allowed us to use virtio block devices instead of emulated PIIX.

It seems there's a corner case left or another bug with similar consequences.

I just used a maintenance window on the last physical host (one hard disk switched for repair in a RAID1 array) to migrate the ancient kvm-85 (which worked for us with virtio) to 0.12.5. The read errors in virtio block mode came back instantly.

We have 3 VMs on this 4th host, 2 are x86, 1 is x86_64. All of them try to boot from a virtio block device and fail to do so with Gentoo's qemu-kvm-0.12.5-r1. They report read errors on /dev/vda and remount the root fs read-only. Reconfiguring them to use emulated PIIX works. There's something interesting about PIIX mode that I'm not sure I've seen before though: there are errors reported by the ATA stack during the boot and the guest kernels switch to PIO after resetting the ide0 interface. More on that later.

Booting all these VMs works properly with Gentoo's 0.11.1-r1 with virtio block.

Two details that might help :
1/
We use DRBD devices for all our virtual disks (on all 4 physical hosts),

2/
The "failing" host has different hardware, main differences :
- Core2 Duo architecture instead of Core i7,
- hardware RAID controller: a 3ware 8006-2LP with two SATA disks in RAID-1 mode instead of plain AHCI SATA controllers and software raid 1.
Currently the controller on the "failing" host is rebuilding the array after we switched a failing disk with a brand new one. Although there's no read error on the physical host as far as its kernel is concerned, read performance is suffering : 5MB/s top from the guest point of view with 0.11.1-r1 and virtio block with a dd if=/dev/vda (only one VM running and host idle to avoid interferences other than the RAID rebuild), ...

This poor read performance might explain the problem we saw in the guest kernel with PIIX emulation on qemu-kvm-0.12.5-r1 (slow reads might be confused with buggy DMA transfers explaining the PIO fallback). We didn't have time to test PIIX emulation after the RAID array was fully synchronized but can do on request.

Tags: block virtio
Revision history for this message
Thomas Huth (th-huth) wrote :

Triaging old bug tickets ... can you still recreate this issue with the latest version of QEMU, or can we close this bug nowadays?

Changed in qemu:
status: New → Incomplete
Revision history for this message
Lionel Bouton (lionel-subscription) wrote :

You can close this bug: it has been fixed long ago.

Revision history for this message
Thomas Huth (th-huth) wrote :

Thanks for your response! So I'm closing this ticket now...

Changed in qemu:
status: Incomplete → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.