libvirt driver stuck deleting online snapshot

Bug #1627134 reported by Evgeny Antyshev
Affects                            Status        Importance  Assigned to     Milestone
OpenStack Compute (nova)           Fix Released  High        Matthew Booth
OpenStack Compute (nova) / Newton  Fix Released  High        Lee Yarwood

Bug Description

There is a problem in nova code in nova/virt/libvirt/driver.py:
            dev = guest.get_block_device(rebase_disk)
            if guest.is_active():
                result = dev.rebase(rebase_base, relative=relative)
                if result == 0:
                    LOG.debug('blockRebase started successfully',
                              instance=instance)

                while dev.wait_for_job(abort_on_error=True):
                    LOG.debug('waiting for blockRebase job completion',
                              instance=instance)
                    time.sleep(0.5)

It expects the libvirt block job to remain for some period in a 'cur == end' state with end != 0 (the wait_for_job logic). In fact, at least with libvirt 1.3.3.2 and libvirt-python 1.2.17, we are not guaranteed to catch the job in such a state before it disappears and the libvirt call returns an empty result, which get_job_info() represents as BlockDeviceJobInfo(job=0, bandwidth=0, cur=0, end=0).
Such a result doesn't match the wait_for_job finish criteria (effective since I45ac06eae0b1949f746dae305469718649bfcf23 was merged).
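
To make the failure mode concrete, here is a minimal sketch of the loop described above. This is a simplified illustration rather than the actual nova code: the BlockDeviceJobInfo fields follow the namedtuple named in this report, and the finish check is an assumption based on the "cur == end, with end != 0" description.

    import collections

    # Simplified stand-in for nova's BlockDeviceJobInfo namedtuple.
    BlockDeviceJobInfo = collections.namedtuple(
        'BlockDeviceJobInfo', ['job', 'bandwidth', 'cur', 'end'])

    def get_job_info(domain, disk):
        # virDomainGetBlockJobInfo returns {} once the job has been
        # reaped, which here collapses into an all-zero result.
        status = domain.blockJobInfo(disk, 0)
        return BlockDeviceJobInfo(job=status.get('type', 0),
                                  bandwidth=status.get('bandwidth', 0),
                                  cur=status.get('cur', 0),
                                  end=status.get('end', 0))

    def wait_for_job(domain, disk):
        """Return True while the job is considered still running."""
        status = get_job_info(domain, disk)
        # Assumed finish criteria, per the description above: the job is
        # only treated as done when cur == end with a non-zero end.  The
        # all-zero result produced after the job disappears never matches,
        # so the caller's "while wait_for_job(...)" loop spins forever.
        job_ended = status.end != 0 and status.cur == status.end
        return not job_ended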

This bug started to occur in our third-party CI:
http://openstack-3rd-party-storage-ci-logs.virtuozzo.com/28/314928/13/check/dsvm-tempest-kvm/5aae7aa

n-cpu.log:
2016-08-17 15:47:04.856 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] volume_snapshot_delete: delete_info: {u'type': u'qcow2', u'merge_target_file': None, u'file_to_merge': None, u'volume_id': u'3e64cef0-03e3-407e-b6c5-fac873a7c98a'} _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2054
2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] found device at vda _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2098
2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] disk: vda, base: None, bw: 0, relative: False _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2171
2016-08-17 15:47:04.868 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] blockRebase started successfully _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2178
2016-08-17 15:47:04.889 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:05.396 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:05.951 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:06.456 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:06.968 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:07.594 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182

BTW, I didn't find any tests checking this in the gate.

Tags: libvirt
Revision history for this message
Matt Riedemann (mriedem) wrote :
Changed in nova:
status: New → In Progress
assignee: nobody → Evgeny Antyshev (eantyshev)
tags: added: libvirt
Changed in nova:
importance: Undecided → High
Revision history for this message
Matt Riedemann (mriedem) wrote :

I think this probably only hits certain volume backends; someone else was reporting the same thing with NFS, and I think Virtuozzo uses a RemoteFS-style driver too.

Revision history for this message
Slawek Kaplonski (slaweq) wrote :

I think the libvirt method virDomainGetBlockJobInfo shouldn't return an empty dict. Please check my patch, https://review.openstack.org/#/c/365756/, which tries to clean up this code a little bit. Maybe it would help somehow?

Revision history for this message
Evgeny Antyshev (eantyshev) wrote :

That code won't work: blockJobInfo() returns {} after the rebase job is complete, and the proposed code will fail any matching in wait_for_job() when {} is passed.

I've checked this on my third-party CI (for the Virtuozzo Storage volume backend). I also found that this code isn't exercised in the upstream gate, only by unit tests.
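
The behaviour is easy to see directly from libvirt-python. A hypothetical probe (the domain name and disk are examples only): while a block job is running, blockJobInfo() returns a populated dict; once the job has finished and been reaped by libvirt it returns {}, which the pre-fix wait_for_job() never treats as "complete".

    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')  # example domain

    info = dom.blockJobInfo('vda', 0)
    if info:
        # e.g. {'type': 1, 'bandwidth': 0, 'cur': 613941248, 'end': 1073741824}
        print('block job in progress: %(cur)d / %(end)d' % info)
    else:
        # {} -- no job record; the driver should treat this as finished
        print('no block job reported')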

Revision history for this message
Kashyap Chamarthy (kashyapc) wrote :

Currently the Virtuozzo CI doesn't enable libvirt debug log filters in its libvirtd log:
http://openstack-3rd-party-storage-ci-logs.virtuozzo.com/28/314928/13/check/dsvm-tempest-kvm/5aae7aa/logs/libvirt/libvirtd.txt.gz

Can you also please add libvirt debug logs with log filters enabled, so libvirt folks could take a look:

(1) In /etc/libvirt/libvirtd.conf:

  log_filters="1:libvirt.c 1:qemu 1:conf 1:security 3:object 3:event 3:json 3:file 1:util 1:cpu"
  log_outputs="1:file:/var/log/libvirt/libvirtd.log"

(2) Restart libvirtd

(3) Repeat the test

(4) Attach the libvirtd.log file here.

Revision history for this message
Evgeny Antyshev (eantyshev) wrote :

Here's a libvirtd log covering the period from the rebase job's creation until its termination:

http://paste.openstack.org/show/583277/

Revision history for this message
Kashyap Chamarthy (kashyapc) wrote :

Looking at the log...

Here we can see that the libvirt blockRebase() (QEMU 'block-stream') job has started:

-----
Sep 20 15:10:39 localhost libvirtd: 85852: info : qemuMonitorSend:1007 : QEMU_MONITOR_SEND_MSG: mon=0x7f64f8010bc0 msg={"execute":"block-stream","arguments":{"device":"drive-virtio-disk1","base":"/opt/stack/data/nova/mnt/530180c6e38955fc032f76c2c0b8b443/volume-0a1fd4cf-643b-4077-943d-69a6647ccf9b.8ffec8b3-fba4-4b41-969f-640dc7ea792a","backing-file":"volume-0a1fd4cf-643b-4077-943d-69a6647ccf9b.8ffec8b3-fba4-4b41-969f-640dc7ea792a"},"id":"libvirt-109"}#015#012 fd=-1
-----

libvirt queries QEMU for block job status:

-----
Sep 20 15:10:39 localhost libvirtd: 85849: info : qemuMonitorIOWrite:531 : QEMU_MONITOR_IO_WRITE: mon=0x7f64f8010bc0 buf={"execute":"query-block-jobs","id":"libvirt-110"}#015#012 len=51 ret=51 errno=11
-----

QEMU responds with the current job status (and you can see that len != offset, so it is still copying):

-----
Sep 20 15:10:39 localhost libvirtd: 85849: info : qemuMonitorJSONIOProcessLine:209 : QEMU_MONITOR_RECV_REPLY: mon=0x7f64f8010bc0 reply={"return": [{"io-status": "ok", "device": "drive-virtio-disk1", "busy": false, "len": 1073741824, "offset": 613941248, "paused": false, "speed": 0, "ready": false, "type": "stream"}], "id": "libvirt-110"}
-----

Shortly after that, we can see that len == offset, meaning the copy job (type "stream") is finished, and QEMU has emitted the BLOCK_JOB_COMPLETED event:

-----
Sep 20 15:10:39 localhost libvirtd: 85849: info : qemuMonitorEmitEvent:1272 : mon=0x7f64f8010bc0 event=BLOCK_JOB_COMPLETED data={"device":"drive-virtio-disk1","len":1073741824,"offset":1073741824,"speed":0,"type":"stream"}
-----

Now libvirt queries QEMU for the job status once again (line 148 of the paste):

-----
Sep 20 15:10:39 localhost libvirtd: 85854: info : qemuMonitorSend:1007 : QEMU_MONITOR_SEND_MSG: mon=0x7f64f8010bc0 msg={"execute":"query-block-jobs","id":"libvirt-111"}#015#012 fd=-1
-----

The response (notice the ID 'libvirt-111' matches the query above) contains an empty job list:

-----
Sep 20 15:10:39 localhost libvirtd: 85849: info : qemuMonitorIOProcess:426 : QEMU_MONITOR_IO_PROCESS: mon=0x7f64f8010bc0 buf={"return": [], "id": "libvirt-111"}#015#012 len=37
-----

The above response from QEMU means that no jobs are active, so with current libvirt versions libvirt will _not_ report any job status.

Upstream libvirt says that a future libvirt might track that a job has finished and provide state even when QEMU doesn't report any. (I'll file an upstream libvirt bug to track this.)
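
For reference, a small illustration of how the QMP replies quoted above map onto job state. The dictionaries mirror the log excerpts; this is an interpretation aid, not QEMU or libvirt code.

    running_reply = {
        "return": [{"device": "drive-virtio-disk1", "busy": False,
                    "len": 1073741824, "offset": 613941248,
                    "ready": False, "type": "stream"}],
        "id": "libvirt-110",
    }
    finished_reply = {"return": [], "id": "libvirt-111"}

    def describe(reply):
        jobs = reply["return"]
        if not jobs:
            # Empty list: QEMU no longer tracks the job, so
            # virDomainGetBlockJobInfo has nothing to report either.
            return "no active block jobs (job completed and reaped)"
        job = jobs[0]
        if job["offset"] == job["len"]:
            return "job finished, completion event pending"
        return "job in progress: %d of %d bytes" % (job["offset"], job["len"])

    print(describe(running_reply))   # job in progress: 613941248 of 1073741824 bytes
    print(describe(finished_reply))  # no active block jobs (job completed and reaped)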

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/378746

Changed in nova:
assignee: Evgeny Antyshev (eantyshev) → Matthew Booth (mbooth-9)
Changed in nova:
assignee: Matthew Booth (mbooth-9) → Evgeny Antyshev (eantyshev)
Revision history for this message
Erlon R. Cruz (sombrafam) wrote :

I did some tests using NFS snapshots around 30 days ago and this issue was not happening. I ran the same tests again and indeed the snapshots are getting stuck. Do you know what introduced the problem?

Changed in nova:
assignee: Evgeny Antyshev (eantyshev) → Matthew Booth (mbooth-9)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/378746
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0f4bd241665c287e49f2d30ca79be96298217b7e
Submitter: Jenkins
Branch: master

commit 0f4bd241665c287e49f2d30ca79be96298217b7e
Author: Matthew Booth <email address hidden>
Date: Wed Sep 28 16:44:41 2016 +0100

    libvirt: Fix BlockDevice.wait_for_job when qemu reports no job

    We were misinterpreting the return value of blockJobInfo. Most
    immediately we were expecting it to return an integer, which has never
    been the case. blockJobInfo also raises an exception on error. Note
    that the implementation of abort_on_error has always expected an
    integer return value, and exceptions have never been handled, which
    means that abort_on_error has always been a no-op, and exceptions have
    never been swallowed. As this is also the most intuitive behaviour, we
    make it explicit by removing abort_on_error. Any exception raised by
    blockJobInfo will continue to propagate unhandled.

    We were obfuscating the return value indicating that the job did not
    exist, {}, by populating a BlockDeviceJobInfo with fake values. We
    de-obfuscate this by returning None instead, which is unambiguous.

    wait_for_job() was misnamed, as it does not wait. This is renamed to
    is_job_complete() to be less confusing. Note that the logic is
    reversed.

    After discussion with Eric Blake of the libvirt team (see review
    comments: https://review.openstack.org/#/c/375652/), we are now
    confident asserting that if no job exists then it has completed
    (although we are still not sure that it succeeded). Consequently we
    remove the wait_for_job_clean parameter, and always assume that no job
    means it has completed. Previously this was implicit because no job
    meant a defaulted BlockDeviceJobInfo.job value of 0.

    Co-authored-by: Sławek Kapłoński <email address hidden>
    Closes-Bug: #1627134
    Change-Id: I2d0daa32b1d37fa60412ad7a374ee38cebdeb579
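
For context, here is a minimal sketch of the behaviour the commit message describes: get_job_info() returning None when libvirt reports no job, and is_job_complete() treating that as completion. This is a simplified, free-standing illustration under those assumptions, not the merged nova code.

    import collections

    BlockDeviceJobInfo = collections.namedtuple(
        'BlockDeviceJobInfo', ['job', 'bandwidth', 'cur', 'end'])

    def get_job_info(domain, disk):
        """Return a BlockDeviceJobInfo, or None if no job exists.

        blockJobInfo() raises libvirt.libvirtError on failure; as the
        commit message notes, such exceptions are left to propagate.
        """
        status = domain.blockJobInfo(disk, 0)
        if not status:
            return None  # {} from libvirt: the job record is gone
        return BlockDeviceJobInfo(job=status['type'],
                                  bandwidth=status['bandwidth'],
                                  cur=status['cur'],
                                  end=status['end'])

    def is_job_complete(domain, disk):
        """True if the job has completed (not necessarily succeeded)."""
        status = get_job_info(domain, disk)
        if status is None:
            # No job exists, so per the discussion above it has completed.
            return True
        # Assumed completion check: cur has reached a non-zero end, so a
        # just-created job still reporting end == 0 is not counted as done.
        return status.end != 0 and status.cur == status.end

With this shape the caller's loop becomes "while not is_job_complete(...): time.sleep(0.5)", which exits as soon as the job record disappears; this is the reversed logic the commit message mentions.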

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Evgeny Antyshev (<email address hidden>) on branch: master
Review: https://review.openstack.org/375652

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/387292

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.0.0b1

This issue was fixed in the openstack/nova 15.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/387292
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a021a72ddaea0e1eb6e3f4b934870fd5f2a05485
Submitter: Jenkins
Branch: stable/newton

commit a021a72ddaea0e1eb6e3f4b934870fd5f2a05485
Author: Matthew Booth <email address hidden>
Date: Wed Sep 28 16:44:41 2016 +0100

    libvirt: Fix BlockDevice.wait_for_job when qemu reports no job

    We were misinterpreting the return value of blockJobInfo. Most
    immediately we were expecting it to return an integer, which has never
    been the case. blockJobInfo also raises an exception on error. Note
    that the implementation of abort_on_error has always expected an
    integer return value, and exceptions have never been handled, which
    means that abort_on_error has always been a no-op, and exceptions have
    never been swallowed. As this is also the most intuitive behaviour, we
    make it explicit by removing abort_on_error. Any exception raised by
    blockJobInfo will continue to propagate unhandled.

    We were obfuscating the return value indicating that the job did not
    exist, {}, by populating a BlockDeviceJobInfo with fake values. We
    de-obfuscate this by returning None instead, which is unambiguous.

    wait_for_job() was misnamed, as it does not wait. This is renamed to
    is_job_complete() to be less confusing. Note that the logic is
    reversed.

    After discussion with Eric Blake of the libvirt team (see review
    comments: https://review.openstack.org/#/c/375652/), we are now
    confident asserting that if no job exists then it has completed
    (although we are still not sure that it succeeded). Consequently we
    remove the wait_for_job_clean parameter, and always assume that no job
    means it has completed. Previously this was implicit because no job
    meant a defaulted BlockDeviceJobInfo.job value of 0.

    Co-authored-by: Sławek Kapłoński <email address hidden>
    Closes-Bug: #1627134
    Change-Id: I2d0daa32b1d37fa60412ad7a374ee38cebdeb579
    (cherry picked from commit 0f4bd241665c287e49f2d30ca79be96298217b7e)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/433772

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.4

This issue was fixed in the openstack/nova 14.0.4 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/mitaka)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/433772
