Comment 2 for bug 1818847

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/640781
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e7b64eaad82db38dd46f586b650da4ddde42533b
Submitter: Zuul
Branch: master

commit e7b64eaad82db38dd46f586b650da4ddde42533b
Author: Kashyap Chamarthy <email address hidden>
Date: Thu Feb 28 12:33:12 2019 +0100

    qemu: Make disk image conversion dramatically faster

    tl;dr: Use 'writeback' instead of 'writethrough' as the cache mode of
    the target image for `qemu-img convert`. Two reasons: (a) if the image
    conversion completes successfully, then with 'writeback' `qemu-img`
    still calls fsync() to safely write the data to the physical disk; and
    (b) 'writeback' makes the image conversion a _lot_ faster.

    Back-of-the-envelope "benchmark" (on an SSD)
    --------------------------------------------

    (Ran both the tests thrice each; version: qemu-img-2.11.0)

    With 'writethrough':

        $> time (qemu-img convert -t writethrough -f qcow2 -O raw \
                Fedora-Cloud-Base-29.qcow2 Fedora-Cloud-Base-29.raw)
        real 1m43.470s
        user 0m8.310s
        sys 0m3.661s

    With 'writeback':

        $> time (qemu-img convert -t writeback -f qcow2 -O raw \
                Fedora-Cloud-Base-29.qcow2 5-Fedora-Cloud-Base-29.raw)

        real 0m7.390s
        user 0m5.179s
        sys 0m1.780s

    I.e. ~103 seconds of elapsed wall-clock time for 'writethrough' vs. ~7
    seconds for 'writeback' -- IOW, 'writeback' is about _14_ times faster!
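
The speedup can be checked directly from the two "real" timings quoted above. This is only a small arithmetic sketch; the helper name `to_seconds` is made up for illustration.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a `time`-style 'XmY.YYYs' reading into seconds."""
    return minutes * 60 + seconds


# "real" (wall-clock) values copied from the benchmark output above.
writethrough_real = to_seconds(1, 43.470)
writeback_real = to_seconds(0, 7.390)

# Ratio of the two wall-clock times.
speedup = writethrough_real / writeback_real

if __name__ == "__main__":
    print(f"writethrough: {writethrough_real:.2f}s")
    print(f"writeback:    {writeback_real:.2f}s")
    print(f"speedup:      {speedup:.1f}x")
```

The ratio comes out at about 14, for this one image on this one SSD; other storage will give different numbers.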

    Details
    -------

    Nova commit e6ce9557f84cdcdf4ffdd12ce73a008c96c7b94a ("qemu-img do not
    use cache=none if no O_DIRECT support") was introduced to make instances
    boot on filesystems that don't support 'O_DIRECT' (which bypasses the
    host page cache and flushes data directly to the disk), such as 'tmpfs'.
    In doing so it introduced the 'writethrough' cache for the target image
    for `qemu-img convert`.

    This patch proposes to change that to 'writeback'.
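
A minimal sketch of the kind of selection logic being discussed, assuming a probe for 'O_DIRECT' support followed by a choice of cache mode. The function names (`supports_direct_io`, `pick_cache_mode`, `build_convert_command`) are hypothetical and are not claimed to match Nova's actual code; only the `qemu-img convert` flags `-t`, `-f` and `-O` are taken from the commands shown earlier.

```python
import errno
import os


def supports_direct_io(dirpath: str) -> bool:
    """Probe whether files under `dirpath` can be opened with O_DIRECT.

    Filesystems such as tmpfs reject O_DIRECT with EINVAL.
    """
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        # Platform has no O_DIRECT at all.
        return False
    testfile = os.path.join(dirpath, ".directio.test")
    fd = None
    try:
        fd = os.open(testfile, os.O_CREAT | os.O_WRONLY | o_direct)
        return True
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            return False
        raise
    finally:
        if fd is not None:
            os.close(fd)
        try:
            os.unlink(testfile)
        except FileNotFoundError:
            pass


def pick_cache_mode(direct_io_supported: bool) -> str:
    """'none' needs O_DIRECT; otherwise fall back to 'writeback'
    (the mode this patch proposes in place of 'writethrough')."""
    return "none" if direct_io_supported else "writeback"


def build_convert_command(src, dst, in_fmt, out_fmt, direct_io_supported):
    """Assemble the argv list for `qemu-img convert` (not executed here)."""
    cache_mode = pick_cache_mode(direct_io_supported)
    return ["qemu-img", "convert", "-t", cache_mode,
            "-f", in_fmt, "-O", out_fmt, src, dst]
```

For example, `build_convert_command("a.qcow2", "a.raw", "qcow2", "raw", False)` yields a command line with `-t writeback`, which could then be passed to `subprocess.run`.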

    Let's address the 'safety' concern:

      "What about data integrity in the event of a host crash (especially
       on shared file systems such as NFS)?"

    Answer: If the host crashes mid-way during image conversion, then
    neither "data integrity" nor the cache mode in use matters, because
    the partially written target image is unusable either way. But if the
    image conversion completes _successfully_, then 'writeback' will safely
    write the data to the physical disk, just as 'writethrough' does.

    So we are as safe as we can be, but with the extra benefit of image
    conversion being _much_ faster.
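
The difference between the two strategies can be illustrated with plain file I/O. This is a loose analogy under stated assumptions, not QEMU's implementation: the "writethrough-like" writer opens the file with O_DSYNC so every write waits for stable storage, while the "writeback-like" writer lets writes land in the page cache and calls fsync() once at the end. The function names and chunk sizes are made up for illustration.

```python
import os
import tempfile

CHUNK = b"x" * 4096
CHUNKS = 16


def write_flush_each(path: str) -> None:
    """Writethrough-like: each write() is synchronous where O_DSYNC or
    O_SYNC exists; on platforms with neither, the flag falls back to 0
    and this degrades to ordinary buffered writes."""
    sync_flag = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC | sync_flag, 0o600)
    try:
        for _ in range(CHUNKS):
            os.write(fd, CHUNK)
    finally:
        os.close(fd)


def write_flush_once(path: str) -> None:
    """Writeback-like: buffered writes, then a single fsync() at the end."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    try:
        for _ in range(CHUNKS):
            os.write(fd, CHUNK)
        os.fsync(fd)  # data is on stable storage once this returns
    finally:
        os.close(fd)


def demo():
    """Write the same data both ways and return both files' contents."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "each.bin")
        b = os.path.join(tmp, "once.bin")
        write_flush_each(a)
        write_flush_once(b)
        with open(a, "rb") as fa, open(b, "rb") as fb:
            return fa.read(), fb.read()


if __name__ == "__main__":
    each, once = demo()
    print("identical on-disk result:", each == once)
```

After a successful run both files hold the same bytes and both have been flushed; the strategies differ only in how often the writer waits for the disk along the way.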

            * * *

    The `qemu-img convert` command defaults to 'cache=writeback' for the
    source image and to 'cache=unsafe' for the target, because if
    `qemu-img` "crashes during the conversion, the user will throw away
    the broken output file anyway and start over"[1]. In addition,
    `qemu-img convert` has supported[2] fsync() for the target image since
    QEMU 1.1 (2012).

    [1] https://git.qemu.org/?p=qemu.git;a=commitdiff;h=1bd8e175
        -- "qemu-img convert: Use cache=unsafe for output image"
    [2] https://git.qemu.org/?p=qemu.git;a=commitdiff;h=80ccf93b
        -- "qemu-img: let 'qemu-img convert' flush data"

    Closes-Bug: #1818847

    Change-Id: I574be2b629aaff23556e25f8db0d740105be6f07
    Signed-off-by: Kashyap Chamarthy <email address hidden>
    Looks-good-to-me'd-by: Kevin Wolf <email address hidden>