Comment 2 for bug 1646181

JB Broccard (jbbroccard) wrote : Re: Fail to boot VM out of large snapshots (30GB+)

All,
I was able to get around the issue by running the "qemu-img info" command right before the Python code executes it. See my change in /usr/lib/python2.7/site-packages/nova/virt/images.py:

def qemu_img_info(path, format=None):
    """Return an object containing the parsed output from qemu-img info."""
    # TODO(mikal): this code should not be referring to a libvirt specific
    # flag.
    # NOTE(sirp): The config option import must go here to avoid an import
    # cycle
    CONF.import_opt('images_type', 'nova.virt.libvirt.imagebackend',
                    group='libvirt')
    if not os.path.exists(path) and CONF.libvirt.images_type != 'rbd':
        raise exception.DiskNotFound(location=path)

    try:
        cmd = ('env', 'LC_ALL=C', 'LANG=C', 'qemu-img', 'info', path)
        if format is not None:
            cmd = cmd + ('-f', format)
        os.system('(time /usr/bin/qemu-img info ' + str(path) + ') >> /tmp/testfile 2>&1') # <- my change
        out, err = utils.execute(*cmd, prlimit=QEMU_IMG_LIMITS)

With this the issue does not occur anymore.
What I noticed is that, over NFS, accessing files on the NFS-mounted filesystem can be very slow right after the copy/convert. Running the same command once right before the call that triggers the exception was enough to make that call succeed.
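For anyone who wants to try the same warm-up without going through a shell, here is a minimal sketch. It assumes that simply running the command once beforehand is what helps; timed_warm_up is a hypothetical helper name, not something that exists in nova.

```python
import os
import subprocess
import time


def timed_warm_up(cmd):
    """Run cmd once, discard its output, return (exit_code, seconds)."""
    start = time.time()
    with open(os.devnull, 'wb') as devnull:
        # Argument list, no shell: a path containing spaces or quotes
        # is passed to the program unchanged.
        code = subprocess.call(list(cmd), stdout=devnull, stderr=devnull)
    return code, time.time() - start


# Hypothetical use inside qemu_img_info(), before utils.execute():
#     code, secs = timed_warm_up(('qemu-img', 'info', path))
```

The elapsed time could be logged instead of being appended to /tmp/testfile, which would make it easier to see how slow the first access over NFS really is.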

I tried to track down where the "Exit code: -9" comes from, but it seems to come from the qemu-img program itself, and the oslo_concurrency library does not interpret this "-9".
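One thing that may help with reading the "-9": on POSIX, Python's subprocess module reports a child that was terminated by signal N as return code -N, so -9 would correspond to SIGKILL. Whether that is what happens to qemu-img here (for example because of the limits passed via prlimit) is only an assumption on my side. The snippet below just demonstrates the convention with a stand-in child process.

```python
import signal
import subprocess
import sys

# Child process that sends SIGKILL to itself, standing in for a
# qemu-img process that gets killed from the outside.
child = [sys.executable, '-c',
         'import os, signal; os.kill(os.getpid(), signal.SIGKILL)']

returncode = subprocess.call(child)

# On POSIX, a negative return code -N means "terminated by signal N".
killed_by = -returncode if returncode < 0 else None
print(returncode, killed_by)
```

If the same mapping applies in nova, the interesting question becomes who sends the signal, not what qemu-img returns.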

Could anyone give an opinion on this workaround?