Instance snapshot fails with rbd backend

Bug #1803717 reported by Dr. Jens Harbott on 2018-11-16
This bug affects 1 person
Affects                   Status  Importance  Assigned to        Milestone
OpenStack Compute (nova)          High        Dr. Jens Harbott
Queens                            High        Christian Berendt
Rocky                             High        Christian Berendt

Bug Description

http://logs.openstack.org/85/617985/1/check/devstack-plugin-ceph-tempest/58fe872/controller/logs/screen-n-cpu.txt.gz#_Nov_16_07_59_55_423217

Nov 16 08:07:14.891163 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: DEBUG nova.virt.libvirt.storage.rbd_utils [None req-3005471d-96d3-4fdd-a042-0b9e6025ccf4 tempest-ServerActionsTestJSON-406716108 tempest-ServerActionsTestJSON-406716108] creating snapshot(snap) on rbd image(0ef68017-c94d-43b4-8bb9-78f4d77cf928) {{(pid=3629) create_snap /opt/stack/nova/nova/virt/libvirt/storage/rbd_utils.py:383}}
Nov 16 08:07:16.213304 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: DEBUG oslo_service.periodic_task [None req-898d2dca-37a7-403f-b578-5ca2ae90e329 None None] Running periodic task ComputeManager._cleanup_expired_console_auth_tokens {{(pid=3629) run_periodic_tasks /usr/local/lib/python2.7/dist-packages/oslo_service/periodic_task.py:219}}
Nov 16 08:07:16.322727 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver [None req-3005471d-96d3-4fdd-a042-0b9e6025ccf4 tempest-ServerActionsTestJSON-406716108 tempest-ServerActionsTestJSON-406716108] Failed to snapshot image: TypeError: add_location() takes exactly 4 arguments (3 given)
Nov 16 08:07:16.322893 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver Traceback (most recent call last):
Nov 16 08:07:16.323039 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 1908, in snapshot
Nov 16 08:07:16.323192 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver purge_props=False)
Nov 16 08:07:16.323326 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver File "/opt/stack/nova/nova/image/api.py", line 142, in update
Nov 16 08:07:16.323460 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver purge_props=purge_props)
Nov 16 08:07:16.323604 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver File "/opt/stack/nova/nova/image/glance.py", line 588, in update
Nov 16 08:07:16.323801 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver _reraise_translated_image_exception(image_id)
Nov 16 08:07:16.324000 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver File "/opt/stack/nova/nova/image/glance.py", line 908, in _reraise_translated_image_exception
Nov 16 08:07:16.324179 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver six.reraise(type(new_exc), new_exc, exc_trace)
Nov 16 08:07:16.324362 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver File "/opt/stack/nova/nova/image/glance.py", line 586, in update
Nov 16 08:07:16.324511 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver image = self._update_v2(context, sent_service_image_meta, data)
Nov 16 08:07:16.324655 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver File "/opt/stack/nova/nova/image/glance.py", line 600, in _update_v2
Nov 16 08:07:16.324802 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver image = self._add_location(context, image_id, location)
Nov 16 08:07:16.324948 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver File "/opt/stack/nova/nova/image/glance.py", line 485, in _add_location
Nov 16 08:07:16.325110 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver context, 2, 'add_location', args=(image_id, location))
Nov 16 08:07:16.325263 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver File "/opt/stack/nova/nova/image/glance.py", line 193, in call
Nov 16 08:07:16.325421 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver result = getattr(controller, method)(*args, **kwargs)
Nov 16 08:07:16.325557 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver TypeError: add_location() takes exactly 4 arguments (3 given)
Nov 16 08:07:16.325747 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: ERROR nova.virt.libvirt.driver
Nov 16 08:07:16.432786 ubuntu-xenial-rax-iad-0000536097 nova-compute[3629]: DEBUG nova.virt.libvirt.storage.rbd_utils [None req-3005471d-96d3-4fdd-a042-0b9e6025ccf4 tempest-ServerActionsTestJSON-406716108 tempest-ServerActionsTestJSON-406716108] removing snapshot(snap) on rbd image(0ef68017-c94d-43b4-8bb9-78f4d77cf928) {{(pid=3629) remove_snap /opt/stack/nova/nova/virt/libvirt/storage/rbd_utils.py:410}}

This error may have been introduced up to three weeks ago without getting noticed because the ceph job has been broken.

Dr. Jens Harbott (j-harbott) wrote :

Seems this is a regression caused by https://review.openstack.org/614351

Changed in nova:
assignee: nobody → Dr. Jens Harbott (j-harbott)
sean mooney (sean-k-mooney) wrote :

I was debating whether this should be High or Medium. I was going to mark it as Medium, but since people are already working on it I'll set it to High.

The error is pretty clear:

add_location() takes exactly 4 arguments (3 given)

But I am wondering whether the rbd backend has inadvertently changed the signature of add_location to require 4 arguments when its base class only requires 3.

If so, the fix might be to default the 4th parameter in the RBD version rather than change the call site at https://review.openstack.org/#/c/614351/3/nova/image/glance.py@485

Changed in nova:
status: New → Triaged
importance: Undecided → High

Fix proposed to branch: master
Review: https://review.openstack.org/618534

Changed in nova:
status: Triaged → In Progress
sean mooney (sean-k-mooney) wrote :

Based on our IRC conversation and the docs you found, I think the proposed patch is correct.

https://docs.openstack.org/python-glanceclient/latest/reference/api/glanceclient.v2.images.html#glanceclient.v2.images.Controller
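
For readers following along, here is a minimal sketch of why the call fails and what a corrected call could look like. It does not use the real glance client: FakeImagesController and call() are hypothetical stand-ins that only mimic the documented parameter list (image_id, url, metadata) and the getattr() dispatch seen in the traceback. The location URL is a placeholder, and passing an empty dict as metadata is an assumption about what the fix supplies. On Python 3 the TypeError wording differs from the Python 2.7 message in the log.

```python
class FakeImagesController(object):
    """Hypothetical stand-in for the glance client's v2 images controller."""

    def add_location(self, image_id, url, metadata):
        # All three parameters are required; with self that makes four,
        # which matches "takes exactly 4 arguments" in the log.
        return {"id": image_id,
                "locations": [{"url": url, "metadata": metadata}]}


def call(controller, method, args=(), kwargs=None):
    """Dispatch by method name, like the getattr() line in the traceback."""
    return getattr(controller, method)(*args, **(kwargs or {}))


controller = FakeImagesController()
image_id = "image-uuid"              # placeholder
location = "rbd://placeholder-url"   # placeholder

# Regressed call shape: only image_id and location are passed.
try:
    call(controller, "add_location", args=(image_id, location))
except TypeError as exc:
    print("regressed call failed: %s" % exc)

# Corrected call shape: the metadata argument is supplied again
# (an empty dict is assumed here for illustration).
image = call(controller, "add_location", args=(image_id, location, {}))
print(image["locations"])
```

Because the method is looked up by name and the arguments are forwarded as a tuple, nothing checks the argument count until the call actually runs, which may explain why the regression only surfaced in the ceph job.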

tags: added: ceph snapshot
Jay Pipes (jaypipes) wrote :

My patch broke all existing RBD deployments. I think that warrants a Critical priority :)

Changed in nova:
importance: High → Critical

Reviewed: https://review.openstack.org/618534
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fd540e2135c26d8c297695a3fa73d993655f0ad8
Submitter: Zuul
Branch: master

commit fd540e2135c26d8c297695a3fa73d993655f0ad8
Author: Jens Harbott <email address hidden>
Date: Fri Nov 16 14:50:41 2018 +0000

    Fix regression in glance client call

    In [0] the way parameters are passed to the glance client was changed.
    Sadly one required argument was dropped during this, we need to insert
    it again in order to fix e.g. rbd backend usage.

    [0] https://review.openstack.org/614351

    Change-Id: I5a4cfb3c9b8125eca4f6c9561d3023537e606a93
    Closes-Bug: 1803717

Changed in nova:
status: In Progress → Fix Released

This issue was fixed in the openstack/nova 19.0.0.0rc1 release candidate.

Christian Berendt (berendt) wrote :

At the moment the snapshot feature is broken again on the Rocky stable branch.

With the merge of https://review.opendev.org/#/c/650064/ this bug was introduced on the Rocky stable branch two weeks ago.

Because of this I think this bug report has to be re-opened until the merge of https://review.opendev.org/#/c/655167/.

Matt Riedemann (mriedem) wrote :

> With the merge of https://review.opendev.org/#/c/650064/ this bug was introduced on the Rocky stable branch two weeks ago.

Hmm, looks like you're right because the ceph job in that change is failing snapshot tests:

http://logs.openstack.org/64/650064/2/check/devstack-plugin-ceph-tempest/291529b/testr_results.html.gz

Matt Riedemann (mriedem) wrote :

https://review.opendev.org/#/q/I3ed3303309fe2a25c0043fd206f36bada4b3b8f9 was also backported to stable/queens and the ceph job is failing there as well.

Changed in nova:
importance: Critical → High

Reviewed: https://review.opendev.org/655167
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=585e62e24481ce3e8c1d02a4d0b6f99b4f9df8c0
Submitter: Zuul
Branch: stable/rocky

commit 585e62e24481ce3e8c1d02a4d0b6f99b4f9df8c0
Author: Jens Harbott <email address hidden>
Date: Fri Nov 16 14:50:41 2018 +0000

    Fix regression in glance client call

    In [0] the way parameters are passed to the glance client was changed.
    Sadly one required argument was dropped during this, we need to insert
    it again in order to fix e.g. rbd backend usage.

    [0] https://review.openstack.org/614351

    Change-Id: I5a4cfb3c9b8125eca4f6c9561d3023537e606a93
    Closes-Bug: 1803717
    (cherry picked from commit fd540e2135c26d8c297695a3fa73d993655f0ad8)

Reviewed: https://review.opendev.org/655186
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d4dbbb645af4b4a8220cd3496cfb615f8904ec91
Submitter: Zuul
Branch: stable/queens

commit d4dbbb645af4b4a8220cd3496cfb615f8904ec91
Author: Jens Harbott <email address hidden>
Date: Fri Nov 16 14:50:41 2018 +0000

    Fix regression in glance client call

    In [0] the way parameters are passed to the glance client was changed.
    Sadly one required argument was dropped during this, we need to insert
    it again in order to fix e.g. rbd backend usage.

    [0] https://review.openstack.org/614351

    Change-Id: I5a4cfb3c9b8125eca4f6c9561d3023537e606a93
    Closes-Bug: 1803717
    (cherry picked from commit fd540e2135c26d8c297695a3fa73d993655f0ad8)

This issue was fixed in the openstack/nova 18.2.1 release.

This issue was fixed in the openstack/nova 17.0.11 release.
