nova instance snapshots fail when using rbd

Bug #1922251 reported by Lucian Petrut
This bug affects 3 people
Affects                   Status  Importance  Assigned to  Milestone
Glance                    New     Undecided   Unassigned
OpenStack Compute (nova)  New     Undecided   Unassigned

Bug Description

Snapshotting instances that use the RBD Glance image backend fails if Glance is configured to use multiple stores.

Trace: http://paste.openstack.org/raw/804113/

The reason is that the Nova Libvirt driver creates the RBD snapshot directly and then updates the Glance image location. However, Nova isn't aware of the Glance store, so this information won't be included.

Glance will error out when trying to add a location that doesn't include the store name when multiple stores are enabled, or even if there's a single one passed through the "enabled_backends" glance option.
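To make the failure mode concrete, here is a toy model of the check described above. It is a simplified sketch and not the actual Glance code: `check_location`, the local `UnknownScheme` class, the backend names, and the placeholder RBD URL are all made up for illustration, and it assumes Nova sends the location with empty metadata.

```python
class UnknownScheme(Exception):
    """Local stand-in for the exception Glance raises; not the real class."""


def check_location(location, enabled_backends):
    """Toy model of the store check (hypothetical, not the Glance code).

    With multiple stores enabled, a location whose metadata lacks a
    "store" key is rejected. With no enabled_backends it is accepted.
    """
    metadata = location.get("metadata") or {}
    if enabled_backends and "store" not in metadata:
        raise UnknownScheme("no store given for %s" % location["url"])
    return metadata.get("store")


# Assumed shape of what Nova sends after a direct RBD snapshot:
# a URL (placeholder value here) and empty metadata.
nova_location = {"url": "rbd://fsid/pool/image/snap", "metadata": {}}

# Legacy single-store setup: the location is accepted.
check_location(nova_location, enabled_backends={})

# Multistore setup: the location is rejected because "store" is missing.
try:
    check_location(nova_location, enabled_backends={"ceph": "rbd", "local": "file"})
except UnknownScheme as exc:
    print("rejected:", exc)
```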

https://github.com/openstack/nova/blob/68af588d5c7b5c9472cbc2731fee2956c86206ea/nova/virt/libvirt/imagebackend.py#L1144-L1178
https://github.com/openstack/nova/blob/68af588d5c7b5c9472cbc2731fee2956c86206ea/nova/image/glance.py#L701-L702
https://github.com/openstack/nova/blob/68af588d5c7b5c9472cbc2731fee2956c86206ea/nova/image/glance.py#L555-L561
https://github.com/openstack/python-glanceclient/blob/3.3.0/glanceclient/v2/images.py#L472
https://github.com/openstack/glance/blob/b5437773b20db3d6ef20d449a8a43171c8fc7f69/glance/location.py#L122-L129
https://github.com/openstack/glance_store/blob/ae9022cd3639bf3d0f482921d03b2b751f757399/glance_store/location.py#L83-L113

tags: added: ceph
summary: - instance snapshots fail when using rbd
+ nova instance snapshots fail when using rbd
description: updated
Revision history for this message
Abhishek Kekane (abhishek-kekane) wrote :

Could you please put your nova and glance rbd store configuration here for reference?

Lucian Petrut (petrutlucian94) wrote :

Sure, here are the relevant config sections: http://paste.openstack.org/raw/804130/

In our case, we avoided the issue by just disabling the file store since we weren't really using it. Still, I think it's a concern.

tags: added: multistore
Lee Yarwood (lyarwood) wrote :

Can you include g-api logs showing why the request was rejected?

https://docs.openstack.org/api-ref/image/v2/index.html?expanded=update-image-detail

I can't see a way for n-cpu to provide the store name even if it could work that out as g-api specifically asks for the following when updating the locations associated with an image:

{
[..]
  'locations': [
    {
      'url': [..],
      'metadata': {
        'key': 'value',
        [..]
      }
    },
  ]
}

This smells like a config issue and/or bug in g-api is causing it to reject a valid request from n-cpu.

Lucian Petrut (petrutlucian94) wrote :

The glance logs weren't very useful as the initial exception was being suppressed. Here are the key bits from Glance:

https://github.com/openstack/glance_store/blob/ae9022cd3639bf3d0f482921d03b2b751f757399/glance_store/location.py#L83-L113
https://github.com/openstack/glance/blob/b5437773b20db3d6ef20d449a8a43171c8fc7f69/glance/location.py#L122-L129

Basically Glance is throwing an UnknownScheme exception if the "store" metadata field is missing from the update request.

Dan Smith (danms) wrote :

Lee, for the copying thing I had to add a config option to tell nova what its store name in glance is, because we needed it for that: images_rbd_glance_store_name. So, just FYI, we have that information now (re: your comment of "even if it could work that out") if we were to need to supply it to glance for some reason.
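For illustration only, supplying the store name could look roughly like the sketch below. This is hypothetical: nothing in this thread says Nova builds locations this way. `build_location` is a made-up helper, the "store" metadata key is the one Lucian mentioned above, and the URL and store name are placeholders; `store_name` stands in for whatever images_rbd_glance_store_name is set to.

```python
def build_location(url, store_name=None):
    """Build a location entry, tagging it with a store name if one is known.

    Hypothetical helper, not Nova code. store_name stands in for the value
    of the images_rbd_glance_store_name option.
    """
    metadata = {}
    if store_name:
        # The "store" key is assumed from the discussion above.
        metadata["store"] = store_name
    return {"url": url, "metadata": metadata}


# Without a configured store name the metadata stays empty.
print(build_location("rbd://fsid/pool/image/snap"))

# With a configured store name the entry carries it along.
print(build_location("rbd://fsid/pool/image/snap", store_name="ceph"))
```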

However, AFAIK, we're doing snapshots in the nova-ceph-multistore job, which has a file and rbd store set, so I would think we'd be hitting this there if it was something fundamental.

Dan Smith (danms) wrote :

Lucian, that doesn't look like a multistore glance config to me, unless you snipped a whole lot out. There should be sections for each store, by name. Here's an example (template) from our job:

https://github.com/openstack/nova/blob/master/.zuul.yaml#L412-L428

You should have a section for your ceph and your file backends, for example, separate from your staging store. I'm not too familiar with the internals, but since glance supports (what I think is) old and new config arrangements with single and multiple stores, I wonder if your glance config isn't quite set up right, such that it's failing to make some assumptions about them and thus the bug?

Lucian Petrut (petrutlucian94) wrote :

Dan, here's the full config: http://paste.openstack.org/raw/804210/.

The config looks good to me, the glance store sections are there. I tried to keep it simple with the first paste, sorry if that caused some confusion.

Dan Smith (danms) wrote :

Okay, that does look better, but there are still some things in glance_store that we don't have in our config. I can't say why those would matter without looking deeper, but it might be something to look at. At the end of the day, it seems like our test job is able to do the snapshotting without a problem (test_create_delete_image passes), so I would tend to iterate towards our exact config until you find out what the problem is.

Dan Smith (danms) wrote :

Yeah, so it looks to me that glance tries to determine the store from the URI itself, as it calls this with nothing more than the URI provided:

https://github.com/openstack/glance/blob/922e544ca2556994450e6972403ba4313318c5e0/glance/common/store_utils.py#L159-L178

Which, as Lee suggested, goes along with not having a parameter (that I know of either) to provide the store name in the location update.

That function can fail if the scheme map isn't working, which makes me wonder about your stores= config:

stores = glance.store.filesystem.Store, glance.store.http.Store,glance.store.rbd.Store

which is different from how it's specified in our job:

stores = file, http, rbd

Could that be related?
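As a rough illustration of the idea (not the linked Glance function), resolving a store from nothing but the URI comes down to a scheme lookup. Everything here is a hypothetical sketch: `store_for_uri`, the scheme map contents, the store names, and the placeholder URL are assumptions.

```python
from urllib.parse import urlparse


def store_for_uri(uri, scheme_map):
    """Return the store name registered for the URI's scheme, or None.

    Hypothetical sketch: scheme_map maps a URI scheme such as "rbd" to a
    configured store name.
    """
    scheme = urlparse(uri).scheme
    return scheme_map.get(scheme)


uri = "rbd://fsid/pool/image/snap"  # placeholder location URL

# A populated scheme map resolves the store from the URI alone.
print(store_for_uri(uri, {"rbd": "ceph", "file": "local"}))

# If the scheme map was never populated, the lookup comes back empty.
print(store_for_uri(uri, {}))
```

The second call shows why a scheme map that isn't working would leave the location without a store.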

Lucian Petrut (petrutlucian94) wrote :

I should've mentioned that this issue occurred on Stein. I'll double check with a devstack environment to see if the issue persists.

FWIW, this method didn't exist in Stein: https://github.com/openstack/glance/blob/922e544ca2556994450e6972403ba4313318c5e0/glance/common/store_utils.py#L159-L178

Thanks a lot for the feedback.

Lucian Petrut (petrutlucian94) wrote :

Ok, so I think this is a duplicate of https://bugs.launchpad.net/glance/+bug/1802587, which was fixed here: https://review.opendev.org/c/openstack/glance/+/617229.

Dan, thanks again for mentioning that function.
