Under heavy load qemu hits bdrv_error_action: Assertion `error >= 0' failed

Bug #1655225 reported by Dave Chiluk
This bug affects 2 people
Affects          Status         Importance   Assigned to    Milestone
qemu (Ubuntu)    Fix Released   Medium       Dave Chiluk
  Trusty         Fix Released   Undecided    Unassigned

Bug Description

[Impact]

 * A VM running in QEMU crashes with this message in the qemu log:
qemu-system-x86_64: /build/qemu-_D3HGx/qemu-2.0.0+dfsg/block.c:3491: bdrv_error_action: Assertion `error >= 0' failed.

 * This results in the VM needing to be restarted.
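
The failing assertion has the shape of a precondition check: the handler expects the error code as a non-negative errno value. The following is a minimal sketch of that convention only. The names `report_io_error` and `complete_request` are hypothetical stand-ins, not QEMU's actual functions, and the sketch does not reproduce the bug.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for an error handler that expects a
 * non-negative errno value (e.g. EIO), never a negative return code. */
static int report_io_error(int error)
{
    assert(error >= 0);   /* same shape as the assertion in the log above */
    return error;
}

/* Hypothetical completion path: ret is 0 on success or -errno on failure. */
static int complete_request(int ret)
{
    if (ret < 0) {
        /* Negate so the handler sees a positive errno value. */
        return report_io_error(-ret);
    }
    return 0;
}
```

Under this convention the assertion can only fire if the handler receives a value that did not come from a negative return code, for example one that was read inconsistently under a race. That reading is an assumption; this report does not spell out the exact path.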

[Test Case]

 * This only reproduces under extremely high I/O load and very infrequently:
   once per month on major clouds.

[Regression Potential]

 * Commit 3bbf572345c is included in 2.4.0-rc4 and newer, and has not been removed in the latest development. I have not found any commits referencing it as a regressor.

 * Patch adds memory barriers so regressions may show up in the form of
   performance issues.
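
Going by the patch file name listed under [Other Info] ("atomics-add-explicit-compiler-fence-in-__atomic-memo"), the change pairs compiler fences with `__atomic` memory barriers. The sketch below shows what such a pairing can look like with GCC builtins. It is an illustration under that assumption, not the patch itself: `compiler_barrier`, `full_barrier`, `publish` and `consume` are hypothetical names.

```c
#include <assert.h>

/* Compiler-only fence: emits no instruction, but stops the compiler
 * from moving memory accesses across this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical full barrier: a hardware fence bracketed by explicit
 * compiler fences on both sides. */
#define full_barrier()                               \
    do {                                             \
        compiler_barrier();                          \
        __atomic_thread_fence(__ATOMIC_SEQ_CST);     \
        compiler_barrier();                          \
    } while (0)

static int payload;   /* data handed from producer to consumer */
static int ready;     /* flag, accessed only through __atomic builtins */

/* Producer: write the payload, then raise the flag. */
static void publish(int value)
{
    payload = value;
    full_barrier();   /* payload store is ordered before the flag store */
    __atomic_store_n(&ready, 1, __ATOMIC_RELAXED);
}

/* Consumer: returns the payload, or -1 if nothing was published yet. */
static int consume(void)
{
    if (!__atomic_load_n(&ready, __ATOMIC_RELAXED)) {
        return -1;
    }
    full_barrier();   /* flag load is ordered before the payload load */
    return payload;
}
```

Calling `publish(42)` on one thread and `consume()` on another returns either -1 or 42, never a stale payload. The extra fences cost little at runtime, which fits the expectation that any regression would show up as a performance change rather than a functional one.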

[Other Info]

 * Red Hat identified this commit as the fix for the issue here:
   https://access.redhat.com/solutions/1459913
   https://bugzilla.redhat.com/show_bug.cgi?id=1142857
   https://git.centos.org/blob/rpms!qemu-kvm/05bba06e575829071bce813e12709f9ec477f120/SOURCES!kvm-atomics-add-explicit-compiler-fence-in-__atomic-memo.patch
 * Only apply to Trusty, as Xenial and newer already have the fix. Ignoring
   Precise because there are no currently reported cases.

Revision history for this message
Dave Chiluk (chiluk) wrote :

Here's the proposed debdiff. I'm currently waiting on confirmation from the customer of resolution, but that may never come, as it is so difficult to reproduce this issue.

Also need to double check if this applies to the cloud archives.

description: updated
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "lp1655225.trusty.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Mathew Hodson (mhodson)
Changed in qemu (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Dave Chiluk (chiluk) wrote :

So far the fix appears to be safe, and I'm comfortable proceeding with the SRU.

description: updated
Revision history for this message
Dave Chiluk (chiluk) wrote :

Debdiff with updated version string now that I'm comfortable with the upload.

description: updated
Revision history for this message
Marc Deslauriers (mdeslaur) wrote :

ACK on the debdiff in comment #4, looks good. Waiting for qemu package currently in trusty-proposed to be removed or released before uploading. Thanks!

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I did some further checks and think the fix is fine.
Yet given the details Dave outlined above, it will be next to impossible to verify. That has to be considered when it is time for proposed verification.

I threw some extra upgrade and migration tests at it, but it was fine as expected.

That said, I can't upload it to the queue yet: the former SRU is still waiting for verification (bugs 1587039 and 1640382).

I pinged seyeongkim who owns the other issues to make him aware.
@chiluk - I highlighted you so you should find it in the IRC log.

Some test references:
- https://bileto.ubuntu.com/excuses/2423/trusty.html
- http://paste.ubuntu.com/23899762/

I didn't run the QA tests from the security Team as I'm currently debugging an issue in there.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

qemu_2.0.0+dfsg-2ubuntu1.31 migrated tonight, and as I pointed out this is already through review and all the tests that came to my mind on a ppa base.

Since Chiluk is still in the application process to be able to SRU upload, I'll sponsor for now - I hope I did not miss that you got approved and wanted to do it on your own.

It is now in the unapproved queue for the SRU Team to take a look at.

@Chiluk - as you pointed out this is not really explicitly testable. But if you could - once in proposed - run all sorts of stress against it that would be great as a minimal verification-done.

Revision history for this message
Robie Basak (racb) wrote :

Only apply to Trusty as Xenial and newer have fix already.

Changed in qemu (Ubuntu):
status: New → Fix Released
Revision history for this message
Robie Basak (racb) wrote : Please test proposed package

Hello Dave, or anyone else affected,

Accepted qemu into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/2.0.0+dfsg-2ubuntu1.32 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in qemu (Ubuntu Trusty):
status: New → Fix Committed
tags: added: verification-needed
Revision history for this message
Dave Chiluk (chiluk) wrote :

I ran fio 4k randread and 4k randwrite tests from within the vm against a qcow2. I was expecting performance degradation, but in every way the patch seems to have made things slightly faster.

pre1655225.output = 2.0.0+dfsg-2ubuntu1.31
16655225.output = 2.0.0+dfsg-2ubuntu1.32

Revision history for this message
Dave Chiluk (chiluk) wrote :

As I'm not sure what other testing I might be able to do to exercise this code, I'm marking this verification-done.

tags: added: verification-done
removed: verification-needed
Rob Roschewsk (rr2976)
Changed in qemu (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

2.0.0+dfsg-2ubuntu1.32 is still in proposed - should stay at Fix Committed IMO.

@Rob - can you share why you changed it?
@Chiluk - do you have any background?

Revision history for this message
Dave Chiluk (chiluk) wrote :

@paelzer
AFAIK it should still be in Fix Committed. Moving it back there.

@Rob
It seems as if you are new to Launchpad. Welcome. Bugs move to Fix Committed when the fix lands in the -proposed archives. This is essentially the testing pocket, which gives everyone a chance to help test and make sure nothing has broken. Once it bakes in -proposed for a few weeks it will get promoted to -updates. The bug is only marked Fix Released after it hits the -updates archive. This is because sometimes security fixes will pre-empt the changes in -proposed, and the SRU will need to be restarted.

Changed in qemu (Ubuntu):
status: Fix Released → Fix Committed
Changed in qemu (Ubuntu Trusty):
status: Fix Released → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 2.0.0+dfsg-2ubuntu1.32

---------------
qemu (2.0.0+dfsg-2ubuntu1.32) trusty; urgency=medium

  [ Dave Chiluk ]
  * Qemu VM crash with error
    "bdrv_error_action: Assertion `error >= 0' failed"
    (LP: #1655225)

 -- Christian Ehrhardt <email address hidden> Tue, 31 Jan 2017 11:26:19 +0100

Changed in qemu (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Changed in qemu (Ubuntu):
status: Fix Committed → Fix Released