Openstack Pike + LXD with ZFS - cannot convert image to raw

Bug #1710994 reported by Andrew McLeod
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
OpenStack Charm Test Infra | Invalid | Low | Unassigned |
OpenStack Nova Compute Charm | Fix Released | Medium | Andrew McLeod |

Bug Description

Deploying Pike on arm64 - during instance creation nova reports the following error:

fault | {"message": "Build of instance 3460a1bc-72ac-4d43-a232-d150161c2447 aborted: Image b81b08c8-2487-4fea-b39a-d9b07b75b2f0 is unacceptable: Unable to convert image to raw: Image /var/lib/nova/instances/_base/43dbc7beed2459ae424c6b889caf7854f26321e5.part is unacceptable: ", "code": 500, "details": " File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 1785, in _do_build_and_run_instance |
| | filter_properties) |
| | File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2007, in _build_and_run_instance |

Further investigation of nova-compute logs reveals:

Unable to convert image to raw: Unexpected error while running command.
Command: qemu-img convert -t none -O raw -f qcow2 /var/lib/nova/instances/_base/f8488b6e96eb7e752a6d2aa046a8de693c06a5fd.part /var/lib/nova/instances/_base/f8488b6e96eb7e752a6d2aa046a8de693c06a5fd.converted
Exit code: 1
Stdout: u''
Stderr: u"qemu-img: file system may not support O_DIRECT\nqemu-img: Could not open '/var/lib/nova/instances/_base/f8488b6e96eb7e752a6d2aa046a8de693c06a5fd.converted': Could not open '/var/lib/nova/instances/_base/f8488b6e96eb7e752a6d2aa046a8de693c06a5fd.converted': Invalid argument\n"

Full log:

https://pastebin.canonical.com/195965/

tags: added: openstack-version.ocata
removed: ocata
Ryan Beisner (1chb1n)
tags: added: arm64
Andrew McLeod (admcleod) wrote :

Additionally, this does not affect s390x (only arm64)

dann frazier (dannf) wrote :

Can you share what host kernel is running on the impacted ARM server?

Andrew McLeod (admcleod) wrote :

ubuntu@s4lpa:~/openstack-on-lxd> uname -a
Linux s4lpa 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:56:43 UTC 2017 s390x s390x s390x GNU/Linux

Additional information:

This is with LXD deployment, using ZFS backend storage.

If I do the same deployment with directory backend storage, the following bug appears:

https://bugs.launchpad.net/nova/+bug/1711213

dann frazier (dannf) wrote :

Thanks. I was able to reproduce:

root@qemu-img:~# strace -o /tmp/strace.out qemu-img convert -t none -O raw -f qcow2 zesty-server-cloudimg-arm64.img zesty-server-cloudimg-arm64.img.raw
qemu-img: file system may not support O_DIRECT
qemu-img: Could not open 'zesty-server-cloudimg-arm64.img.raw': Could not open 'zesty-server-cloudimg-arm64.img.raw': Invalid argument
root@qemu-img:~# grep DIRECT /tmp/strace.out
openat(AT_FDCWD, "zesty-server-cloudimg-arm64.img.raw", O_RDWR|O_DIRECT|O_CLOEXEC) = -1 EINVAL (Invalid argument)

Indeed, it does appear that O_DIRECT is not supported on this kernel.

A workaround for this would be to drop the "-t none" argument:

root@qemu-img:~# qemu-img convert -O raw -f qcow2 zesty-server-cloudimg-arm64.img zesty-server-cloudimg-arm64.img.raw
root@qemu-img:~#
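
A minimal Python sketch of the same idea: probe whether the target directory accepts O_DIRECT, and only pass "-t none" when it does. The helper names (supports_direct_io, build_convert_cmd) are hypothetical and are not taken from nova; the qemu-img arguments are the ones shown in this report.

```python
import errno
import os
import tempfile


def supports_direct_io(dirpath):
    """Return True if a file inside dirpath can be opened with O_DIRECT.

    Hypothetical helper for illustration; it mirrors the failing openat()
    call seen in the strace output above.
    """
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        # Platform does not define O_DIRECT at all.
        return False
    fd, path = tempfile.mkstemp(dir=dirpath, prefix=".directio.")
    os.close(fd)
    try:
        fd = os.open(path, os.O_RDWR | o_direct)
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # Same error as in the report: the file system rejects O_DIRECT.
            return False
        raise
    else:
        os.close(fd)
        return True
    finally:
        os.unlink(path)


def build_convert_cmd(src, dst, direct_io):
    """Build the qemu-img command, adding '-t none' only if direct I/O works."""
    cmd = ["qemu-img", "convert"]
    if direct_io:
        cmd += ["-t", "none"]
    cmd += ["-O", "raw", "-f", "qcow2", src, dst]
    return cmd
```

A caller would run something like build_convert_cmd(src, dst, supports_direct_io(os.path.dirname(dst))) and pass the result to subprocess. Note that dropping "-t none" also drops the cache bypass that the option was presumably added for, so this is a trade-off rather than a fix.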

Andrew McLeod (admcleod) wrote :

Due to a mistake I made with my deployment, I erroneously reported that this was a problem with ocata - however, the bundle was deploying pike.

It would seem that ocata openstack is not trying to do the conversion, but pike is for some reason.

dann frazier (dannf) wrote :

This does not appear to be arm64-specific - here's the same reproduction on x86:

root@qemu-img:~# uname -a
Linux qemu-img 4.4.0-92-generic #115-Ubuntu SMP Thu Aug 10 09:04:33 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@qemu-img:~# qemu-img convert -t none -O raw -f qcow2 zesty-server-cloudimg-arm64.img zesty-server-cloudimg-arm64.img.raw
qemu-img: file system may not support O_DIRECT
qemu-img: Could not open 'zesty-server-cloudimg-arm64.img.raw': Could not open 'zesty-server-cloudimg-arm64.img.raw': Invalid argument

And here's an upstream bug describing the (arch-independent) issue:
  https://github.com/zfsonlinux/zfs/issues/224

So, if this is only appearing on arm64 in our testing, the trigger is probably at a higher level.

dann frazier (dannf) wrote :

I suspect the causal commit is this one in nova:

commit 1301368bf2352eddcc664202d7f159f523f681e2
Author: Matthew Booth <email address hidden>
Date: Wed Mar 8 16:38:49 2017 +0000

    Ensure image conversion flushes output data to disk

My suspicion is that it was just never tested on zfs.

Raghuram Kota (rkota) wrote : Re: arm64 - Pike- cannot convert image to raw

Changed the title and description to state "Pike" per comment #5

summary: - arm64 - cannot convert image to raw
+ arm64 - Pike- cannot convert image to raw
description: updated
Raghuram Kota (rkota) wrote :

@Andrew: based on comment #5, is it OK to conclude that this bug does *not* impact Ocata?

Ryan Beisner (1chb1n) wrote :

I suspect this is not limited to aarch64. We need to retest Pike with openstack-on-lxd on x86 and ppc64el.

dann frazier (dannf)
Changed in nova:
status: New → Confirmed
Raghuram Kota (rkota) wrote :

@Andrew @Ryan: Do we know if this bug impacts *Ocata* on ARM64? Per comment #5 it does not seem so, but I wanted to check.

Andrew McLeod (admcleod) wrote :

I must apologise for improperly reporting this bug initially. I have updated the title and tags. Also, my comment #1 is invalid, so to clarify:

This does not impact OpenStack Ocata; it first becomes apparent in OpenStack Pike.

This is not an arm64-specific issue. It affects OpenStack on LXD when using ZFS as a backing store. The scenario where the bug was noticed was:

Openstack on top of LXD with a ZFS backing store.

ZFS does not handle O_DIRECT calls (see note #6).

summary: - arm64 - Pike- cannot convert image to raw
+ Openstack Pike + LXD with ZFS - cannot convert image to raw
tags: added: openstack-version.pike
removed: aarch64 arm64 openstack-version.ocata
tags: added: openstack-on-lxd
James Page (james-page) wrote :

Snippet from qemu:

static void raw_parse_flags(int bdrv_flags, int *open_flags)
{
    assert(open_flags != NULL);

    *open_flags |= O_BINARY;
    *open_flags &= ~O_ACCMODE;
    if (bdrv_flags & BDRV_O_RDWR) {
        *open_flags |= O_RDWR;
    } else {
        *open_flags |= O_RDONLY;
    }

    /* Use O_DSYNC for write-through caching, no flags for write-back caching,
     * and O_DIRECT for no caching. */
    if ((bdrv_flags & BDRV_O_NOCACHE)) {
        *open_flags |= O_DIRECT;
    }
}

no-cache == use O_DIRECT.
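
As a rough Python illustration of the mapping described in that code comment (not QEMU's actual implementation, which has more cache modes and handles them in C), the open(2) flags per cache mode could be sketched as follows. The function name and the mode strings used as dictionary keys are assumptions for this example.

```python
import os

# Mapping taken from the comment in the quoted snippet:
# O_DSYNC for write-through, no extra flag for write-back, O_DIRECT for none.
_CACHE_FLAGS = {
    "writethrough": getattr(os, "O_DSYNC", 0),
    "writeback": 0,
    "none": getattr(os, "O_DIRECT", 0),
}


def open_flags_for_cache(mode, read_write=True):
    """Return the os.open() flags a given cache mode would imply."""
    if mode not in _CACHE_FLAGS:
        raise ValueError("unknown cache mode: %r" % (mode,))
    flags = os.O_RDWR if read_write else os.O_RDONLY
    return flags | _CACHE_FLAGS[mode]
```

With this mapping, "-t none" leads to an open with O_RDWR|O_DIRECT, which matches the openat() call in the strace output earlier in this report.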

affects: nova → charm-test-infra
Changed in charm-test-infra:
importance: Undecided → Low
status: Confirmed → Triaged
James Page (james-page) wrote :

Moving this bug to charm-test-infra - I don't think we can really blame nova for wanting to ensure that image operations are consistently written to disk.

James Page (james-page) wrote :
Andrew McLeod (admcleod) wrote :

The default for Pike (and in fact Ocata) is

force_raw_images = True

I have put "force_raw_images = False" into the default section of the nova.conf on a compute slave (in this case arm64), restarted nova-compute, and am able to launch an instance.
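
For anyone wanting to script the same change, here is a small sketch using Python's standard configparser on the text of a nova.conf. The function name is hypothetical, and this is only an approximation: configparser does not preserve comments and is not the parser nova itself uses, so treat it as an illustration of the setting rather than a safe way to rewrite a production file.

```python
import configparser
import io


def set_force_raw_images(conf_text, enabled):
    """Return conf_text with force_raw_images set in the [DEFAULT] section.

    Assumes conf_text already contains a [DEFAULT] header. Comments in the
    input are dropped by configparser when the file is written back out.
    """
    parser = configparser.RawConfigParser(strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    parser.set("DEFAULT", "force_raw_images", "True" if enabled else "False")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

After writing the result back, nova-compute still has to be restarted for the change to take effect, as described above.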

Next step is to decide if this should be a charm config option or if the charm should attempt to detect the backend storage?

Ryan Beisner (1chb1n) wrote :

Let's add a new charm config option for this, with default True. The o-o-lxd test procedure and bundles will need to set it False, but the norm for metal/production deploys should be True if I understand correctly. Thank you.

Changed in charm-nova-compute:
assignee: nobody → Andrew McLeod (admcleod)
status: New → Confirmed
importance: Undecided → Medium
Changed in charm-test-infra:
status: Triaged → Invalid
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-compute (master)

Fix proposed to branch: master
Review: https://review.openstack.org/513872

Changed in charm-nova-compute:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-compute (master)

Reviewed: https://review.openstack.org/513872
Committed: https://git.openstack.org/cgit/openstack/charm-nova-compute/commit/?id=a6638ce3bf633e8abc6e52b030d9d1c12338f9b4
Submitter: Zuul
Branch: master

commit a6638ce3bf633e8abc6e52b030d9d1c12338f9b4
Author: Andrew McLeod <email address hidden>
Date: Fri Oct 20 16:55:05 2017 -0600

    Add charm config option for force_raw_images

    Since Ocata, backing images are by default converted into raw format
    but since Pike a problem with O_DIRECT calls and ZFS backend storage
    has presented - i.e., Pike on LXD with ZFS backend storage will result
    in an error about image conversion. This config option allows
    that default to be changed so that Pike on LXD with ZFS will lauch
    instances succesfully.

    Closes-Bug: 1710994

    Change-Id: Ieba15b44b4b56a356d58c4504dd259fc80e7575d

Changed in charm-nova-compute:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-compute (stable/17.08)

Fix proposed to branch: stable/17.08
Review: https://review.openstack.org/518419

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-compute (stable/17.08)

Reviewed: https://review.openstack.org/518419
Committed: https://git.openstack.org/cgit/openstack/charm-nova-compute/commit/?id=a382453b91b1333d1e2fbe333c77503efd156b80
Submitter: Zuul
Branch: stable/17.08

commit a382453b91b1333d1e2fbe333c77503efd156b80
Author: Andrew McLeod <email address hidden>
Date: Fri Oct 20 16:55:05 2017 -0600

    Add charm config option for force_raw_images

    Since Ocata, backing images are by default converted into raw format
    but since Pike a problem with O_DIRECT calls and ZFS backend storage
    has presented - i.e., Pike on LXD with ZFS backend storage will result
    in an error about image conversion. This config option allows
    that default to be changed so that Pike on LXD with ZFS will lauch
    instances succesfully.

    Closes-Bug: 1710994

    Change-Id: Ieba15b44b4b56a356d58c4504dd259fc80e7575d
    (cherry picked from commit a6638ce3bf633e8abc6e52b030d9d1c12338f9b4)

Changed in charm-nova-compute:
milestone: none → 17.11
James Page (james-page)
Changed in charm-nova-compute:
status: Fix Committed → Fix Released