XenAPI: failed to create image from volume backed instance with glance v2

Bug #1616938 reported by Jianghua Wang
This bug affects 1 person
Affects                    Status          Importance   Assigned to      Milestone
OpenStack Compute (nova)   Fix Released    High         Matt Riedemann
Newton                     Fix Committed   High         Matt Riedemann

Bug Description

With XenServer and the XenAPI driver, creating an image from a volume-backed instance always fails.

Tempest test: tempest.scenario.test_volume_boot_pattern.TestVolumeBootPatternV2.test_create_ebs_image_and_check_boot

2016-08-24 08:13:24.636 27347 DEBUG nova.compute.api [req-fd7afac4-a2ee-41cb-b785-02766806db26 tempest-TestVolumeBootPatternV2-1645487335 tempest-TestVolumeBootPatternV2-1645487335] [instance: 5d5a8b10-655c-457e-8ad9-edbfb6ecd278] Creating snapshot from volume 9953359c-327a-41bf-abec-f3c3da416390. snapshot_volume_backed /opt/stack/new/nova/nova/compute/api.py:2445
2016-08-24 08:13:25.964 27347 INFO os_vif [req-fd7afac4-a2ee-41cb-b785-02766806db26 tempest-TestVolumeBootPatternV2-1645487335 tempest-TestVolumeBootPatternV2-1645487335] Loaded VIF plugin class '<class 'vif_plug_ovs.ovs.OvsPlugin'>' with name 'ovs'
2016-08-24 08:13:25.965 27347 INFO os_vif [req-fd7afac4-a2ee-41cb-b785-02766806db26 tempest-TestVolumeBootPatternV2-1645487335 tempest-TestVolumeBootPatternV2-1645487335] Loaded VIF plugin class '<class 'vif_plug_linux_bridge.linux_bridge.LinuxBridgePlugin'>' with name 'linux_bridge'
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions [req-fd7afac4-a2ee-41cb-b785-02766806db26 tempest-TestVolumeBootPatternV2-1645487335 tempest-TestVolumeBootPatternV2-1645487335] Unexpected exception in API method
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions Traceback (most recent call last):
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/extensions.py", line 338, in wrapped
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions return f(*args, **kwargs)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/common.py", line 372, in inner
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions return f(*args, **kwargs)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/validation/__init__.py", line 73, in wrapper
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions return func(*args, **kwargs)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/validation/__init__.py", line 73, in wrapper
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions return func(*args, **kwargs)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/api/openstack/compute/servers.py", line 1072, in _action_create_image
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions metadata)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/compute/api.py", line 146, in inner
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions return f(self, context, instance, *args, **kw)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/compute/api.py", line 2463, in snapshot_volume_backed
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions return self.image_api.create(context, image_meta)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/image/api.py", line 106, in create
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions return session.create(context, image_info, data=data)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/image/glance.py", line 626, in create
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions data, force_activate)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/image/glance.py", line 658, in _create_v2
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions context, 2, 'create', **sent_service_image_meta)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/opt/stack/new/nova/nova/image/glance.py", line 174, in call
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions result = getattr(client.images, method)(*args, **kwargs)
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions File "/usr/local/lib/python2.7/dist-packages/glanceclient/v2/images.py", line 233, in create
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions raise TypeError(encodeutils.exception_to_unicode(e))
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions TypeError: Unable to set 'disk_format' to 'qcow2'. Reason: 'qcow2' is not one of [None, u'ami', u'ari', u'aki', u'vhd', u'raw', u'iso']
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions Failed validating u'enum' in schema[u'properties'][u'disk_format']:
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions {u'description': u'Format of the disk',
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions u'enum': [None, u'ami', u'ari', u'aki', u'vhd', u'raw', u'iso'],
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions u'type': [u'null', u'string']}
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions On instance[u'disk_format']:
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions 'qcow2'
2016-08-24 08:13:26.049 27347 ERROR nova.api.openstack.extensions

Per my analysis, this is a bug caused by hardcoding the disk format as qcow2.
See: https://github.com/openstack/nova/blob/master/nova/compute/api.py#L2471
Here, disk_format and container_format are cleared from the image metadata.

Then:

https://github.com/openstack/nova/blob/master//nova/image/glance.py#L651
Here, disk_format is hardcoded as qcow2, which XenAPI does not support.

nova/nova/image/glance.py
    def _create_v2(self, context, sent_service_image_meta, data=None,
                   force_activate=False):
        # Glance v1 allows image activation without setting disk and
        # container formats, v2 doesn't. It leads to the dirtiest workaround
        # where we have to hardcode this parameters.
        if force_activate:
            data = ''
            if 'disk_format' not in sent_service_image_meta:
                sent_service_image_meta['disk_format'] = 'qcow2'
            if 'container_format' not in sent_service_image_meta:
                sent_service_image_meta['container_format'] = 'bare'

tags: added: xenserver
tags: added: os-vif
summary: - XenAPI: failed to create image from volume
+ XenAPI: failed to create image from volume backed instance
Revision history for this message
Jianghua Wang (wjh-fresh) wrote : Re: XenAPI: failed to create image from volume backed instance

When creating an image from an instance that is not volume backed, the hypervisor's driver corrects the disk_format (and container_format) when it uploads data to the image, so there is no problem.
But for a volume-backed instance, creating an image takes a volume snapshot in the volume back-end. In this case, the image is simply a bucket of properties without any image data.
     https://github.com/openstack/nova/blob/master/nova/compute/api.py#L2545
There is no image data to upload, so the hypervisor has no chance to correct disk_format. If the hypervisor driver (e.g. XenAPI) doesn't support the default hardcoded disk_format, qcow2, then creating the image fails, as does booting an instance from that image.
On the other hand, the real data is in the volume snapshot, so there is no need to process data based on the image's format. I think a reasonable solution is to use a common disk_format supported by all hypervisors, e.g. 'raw'.

Revision history for this message
Matt Riedemann (mriedem) wrote :

So basically it sounds like we have a regression in newton for the xenserver backend: the glance v2 code was added in newton, and that's what hardcodes the format to qcow2, while glance v1 usage is deprecated now. So you can't use glance v2 + xen and take snapshots of volume-backed instances.

tags: removed: os-vif
Changed in nova:
status: New → Triaged
importance: Undecided → High
tags: added: newton-rc-potential
Revision history for this message
Matt Riedemann (mriedem) wrote :

What is the glance image backend that you're using here? It's not that the xen api virt driver is failing, glance is rejecting the image create request - but what's the image backend in this case?

Changed in nova:
status: Triaged → Incomplete
summary: - XenAPI: failed to create image from volume backed instance
+ XenAPI: failed to create image from volume backed instance with glance
+ v2
Revision history for this message
Matt Riedemann (mriedem) wrote :

I wonder if an alternative in nova's glance v2 image service when creating the image is to check the CONF.snapshot_image_format (which is currently only a libvirt option), and if not set then default to qcow2. That seems pretty hacky though.

I don't really understand how xenapi works with glance though given the xen plugins code for uploading vhds here:

https://github.com/openstack/nova/blob/master/plugins/xenserver/xenapi/etc/xapi.d/plugins/glance.py#L612

Revision history for this message
Nikhil Komawar (nikhil-komawar) wrote :
Changed in glance:
status: New → Invalid
Revision history for this message
Nikhil Komawar (nikhil-komawar) wrote :

This issue seems to happen well before the code reaches the xen driver, in nova.image.glance, so the schema check needs to be added here too: https://github.com/openstack/nova/blob/65b72f2d9d829fa6e0f8ed7bea2fb2831d9ebb8b/nova/image/glance.py#L651 (this schema validation is new in v2, so this is a v2 issue)

besides above:

It's not a glance issue (as per current description) so I marked it incomplete but if more issues are found we can discuss.

From the current looks of it, the assumption during the v1->v2 port was that xen was sparsely deployed, so they went with what made sense to pass the gate tests.

So I think the first thing the reporter needs to do is add a test in the gate to cover this use case. (This will avoid regressions.)

Then, to solve this, we can add a check that fetches the schema from glance-api, verifies that the format being uploaded is allowed, and logs an appropriate message if it is not. (At the xen plugin, we can't reuse the code that will be added as prescribed above.)

Also, I checked the nova xen driver, and it looks like it supports vhd only, for both v1 and v2:
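A minimal sketch of such a check, assuming the image schema has already been fetched from glance-api and is available as a plain dict. The function name `disk_format_allowed` is hypothetical and is not part of nova or glanceclient; the enum values are the ones shown in the traceback above.

```python
import logging

LOG = logging.getLogger(__name__)


def disk_format_allowed(image_schema, disk_format):
    """Return True if disk_format is listed in the schema's enum.

    image_schema is assumed to be the glance v2 image schema as a dict;
    how it is fetched is out of scope for this sketch.
    """
    allowed = image_schema["properties"]["disk_format"]["enum"]
    if disk_format in allowed:
        return True
    # Log instead of letting the client raise a TypeError later.
    LOG.warning("disk_format %r is not one of %s", disk_format, allowed)
    return False


# Schema fragment copied from the failure in this report.
schema = {
    "properties": {
        "disk_format": {
            "enum": [None, "ami", "ari", "aki", "vhd", "raw", "iso"],
        }
    }
}

print(disk_format_allowed(schema, "qcow2"))
print(disk_format_allowed(schema, "vhd"))
```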

https://github.com/openstack/nova/blob/15a159b3fcedc1b360367641a41413966c854a9a/plugins/xenserver/xenapi/etc/xapi.d/plugins/glance.py#L251

https://github.com/openstack/nova/blob/15a159b3fcedc1b360367641a41413966c854a9a/plugins/xenserver/xenapi/etc/xapi.d/plugins/glance.py#L316

Revision history for this message
Matt Riedemann (mriedem) wrote :

Thanks for the investigation, Nikhil. I agree it looks like the nova.image.glance v2 API code will need to get the schema first and then pick appropriate disk_format and container_format values.

Changed in nova:
status: Incomplete → Confirmed
Revision history for this message
Matt Riedemann (mriedem) wrote :

The good news is we have a gate test that triggers this:

tempest.scenario.test_volume_boot_pattern.TestVolumeBootPatternV2.test_create_ebs_image_and_check_boot

It's just that the xenserver CI probably has to skip that test because of this bug.

Matt Riedemann (mriedem)
Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Confirmed → In Progress
no longer affects: glance
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Jianghua Wang (<email address hidden>) on branch: master
Review: https://review.openstack.org/366825
Reason: this bug will be fixed by https://review.openstack.org/#/c/375875

Revision history for this message
Matt Riedemann (mriedem) wrote :

My proposed fix is here: https://review.openstack.org/#/c/375875/

I hadn't seen https://review.openstack.org/366825 because it wasn't linked into this bug report and the status was still New.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/375875
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2fe5daeb5fcd2750b0eb083fc5c8bfd3fdb992e2
Submitter: Jenkins
Branch: master

commit 2fe5daeb5fcd2750b0eb083fc5c8bfd3fdb992e2
Author: Matt Riedemann <email address hidden>
Date: Sat Sep 24 11:24:40 2016 -0400

    Determine disk_format for volume-backed snapshot from schema

    Xen deployments don't support qcow2 images which is what the
    glance v2 API code in nova defaults to, so basically you can't
    create a snapshot of a volume-backed instance with glance v2 and
    Xen right now.

    This change uses the glance v2 image schema to determine the
    disk_format to use based on some rules:

    1. Look for a preferred disk_format using an ordered list.
    2. If we still can't figure it out, just use the first
       supported disk_format available.

    Change-Id: Ifaa150fda393e2b49114e30dd5e30e5bf52b4ed1
    Closes-Bug: #1616938
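
The two rules in the commit message can be sketched roughly as follows. This is an illustration only, not the code from the merged change: the function name `pick_disk_format`, the order of `PREFERRED_DISK_FORMATS`, and passing the schema as a plain dict are all assumptions. The enum values come from the traceback in the bug description.

```python
# Hypothetical preference order; the list used by the actual fix may differ.
PREFERRED_DISK_FORMATS = ("qcow2", "vhd", "vmdk", "raw")


def pick_disk_format(image_schema, preferred=PREFERRED_DISK_FORMATS):
    """Choose a disk_format that the image schema allows.

    Rule 1: return the first preferred format the schema supports.
    Rule 2: otherwise return the first supported format, or None if
    the schema lists no usable format at all.
    """
    enum = (image_schema.get("properties", {})
                        .get("disk_format", {})
                        .get("enum", []))
    # The enum may contain None (format unset); skip it.
    supported = [fmt for fmt in enum if fmt is not None]
    for fmt in preferred:
        if fmt in supported:
            return fmt
    return supported[0] if supported else None


# Schema fragment from the Xen deployment in this report: no qcow2.
xen_schema = {
    "properties": {
        "disk_format": {
            "enum": [None, "ami", "ari", "aki", "vhd", "raw", "iso"],
        }
    }
}

print(pick_disk_format(xen_schema))  # prints "vhd"
```

With this ordering, a deployment whose schema still lists qcow2 keeps the old default, while the Xen schema above falls through to vhd.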

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/376999
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7c4a5b1ee729a4e9bb6add2a6b6423f9bbd1c0d0
Submitter: Jenkins
Branch: stable/newton

commit 7c4a5b1ee729a4e9bb6add2a6b6423f9bbd1c0d0
Author: Matt Riedemann <email address hidden>
Date: Sat Sep 24 11:24:40 2016 -0400

    Determine disk_format for volume-backed snapshot from schema

    Xen deployments don't support qcow2 images which is what the
    glance v2 API code in nova defaults to, so basically you can't
    create a snapshot of a volume-backed instance with glance v2 and
    Xen right now.

    This change uses the glance v2 image schema to determine the
    disk_format to use based on some rules:

    1. Look for a preferred disk_format using an ordered list.
    2. If we still can't figure it out, just use the first
       supported disk_format available.

    Change-Id: Ifaa150fda393e2b49114e30dd5e30e5bf52b4ed1
    Closes-Bug: #1616938
    (cherry picked from commit 2fe5daeb5fcd2750b0eb083fc5c8bfd3fdb992e2)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.0.0rc2

This issue was fixed in the openstack/nova 14.0.0.0rc2 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.0.0b1

This issue was fixed in the openstack/nova 15.0.0.0b1 development milestone.

Changed in nova:
status: Fix Released → Confirmed
Changed in nova:
status: Confirmed → Fix Released