Post-migration, Cinder volumes lose disk cache value, resulting in I/O latency

Bug #1706083 reported by Kashyap Chamarthy
Affects                    Status          Importance   Assigned to         Milestone
OpenStack Compute (nova)   Fix Released    Medium       Kashyap Chamarthy
Newton                     Fix Committed   Medium       Kashyap Chamarthy
Ocata                      Fix Committed   Medium       Tony Breeds

Bug Description

Description
===========

[This was initially reported by a Red Hat OSP customer.]

The I/O latency of a Cinder volume increases significantly after live
migration of the instance to which it is attached. The latency stays
elevated until the VM is stopped and started again. [The VM is booted
from the Cinder volume.]

This is not the case when using a disk from a Nova store backend [i.e.
without a Cinder volume] -- or at least the difference is not as
significant after a live migration.

The storage backend is Ceph 2.0.

How reproducible: Consistently

Steps to Reproduce
==================

(0) Both the Nova instances and Cinder volumes are located on Ceph

(1) Create a Nova instance with a Cinder volume attached to it

(2) Live migrate it to a target Compute node

(3) Run `ioping` (`ioping -c 10 .`) on the Cinder volume.
    Alternatively, run another I/O benchmark such as `fio` with
    'direct=1' (which uses non-buffered I/O) as a sanity check to get a
    second opinion on latency.

Actual result
=============

Before live migration: `ioping` output on the Cinder volume attached to a Nova
instance:

    [guest]$ sudo ioping -c 10 .
    4 KiB <<< . (xfs /dev/sda1): request=1 time=98.0 us (warmup)
    4 KiB <<< . (xfs /dev/sda1): request=2 time=135.6 us
    4 KiB <<< . (xfs /dev/sda1): request=3 time=155.5 us
    4 KiB <<< . (xfs /dev/sda1): request=4 time=161.7 us
    4 KiB <<< . (xfs /dev/sda1): request=5 time=148.4 us
    4 KiB <<< . (xfs /dev/sda1): request=6 time=354.3 us
    4 KiB <<< . (xfs /dev/sda1): request=7 time=138.0 us (fast)
    4 KiB <<< . (xfs /dev/sda1): request=8 time=150.7 us
    4 KiB <<< . (xfs /dev/sda1): request=9 time=149.6 us
    4 KiB <<< . (xfs /dev/sda1): request=10 time=138.6 us (fast)

    --- . (xfs /dev/sda1) ioping statistics ---
    9 requests completed in 1.53 ms, 36 KiB read, 5.87 k iops, 22.9 MiB/s
    generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
    min/avg/max/mdev = 135.6 us / 170.3 us / 354.3 us / 65.6 us

After live migration, `ioping` output on the same Cinder volume:

    [guest]$ sudo ioping -c 10 .
    4 KiB <<< . (xfs /dev/sda1): request=1 time=1.03 ms (warmup)
    4 KiB <<< . (xfs /dev/sda1): request=2 time=948.6 us
    4 KiB <<< . (xfs /dev/sda1): request=3 time=955.7 us
    4 KiB <<< . (xfs /dev/sda1): request=4 time=920.5 us
    4 KiB <<< . (xfs /dev/sda1): request=5 time=1.03 ms
    4 KiB <<< . (xfs /dev/sda1): request=6 time=838.2 us
    4 KiB <<< . (xfs /dev/sda1): request=7 time=1.13 ms (slow)
    4 KiB <<< . (xfs /dev/sda1): request=8 time=868.6 us
    4 KiB <<< . (xfs /dev/sda1): request=9 time=985.2 us
    4 KiB <<< . (xfs /dev/sda1): request=10 time=936.6 us

    --- . (xfs /dev/sda1) ioping statistics ---
    9 requests completed in 8.61 ms, 36 KiB read, 1.04 k iops, 4.08 MiB/s
    generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
    min/avg/max/mdev = 838.2 us / 956.9 us / 1.13 ms / 81.0 us

The latency returns to an average of roughly 200 us after the instance
is shut down and started again.
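Independent of guest-side benchmarks, the regression can also be confirmed from the host by checking the cache mode libvirt actually applied. A sketch, assuming shell access on the target compute node; the instance name is illustrative:

```shell
# On the target compute node, after the live migration completes.
# Dump the live domain XML and inspect the <driver> line of each disk:
virsh dumpxml instance-00000001 | grep "driver name"
# An affected Cinder volume shows cache='none' here instead of the
# configured cache='writeback'.
```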

Expected result
===============

No increase in I/O latency on Cinder volumes after a live migration.

Changed in nova:
assignee: nobody → Kashyap Chamarthy (kashyapc)
status: New → Confirmed
Changed in nova:
status: Confirmed → In Progress
Revision history for this message
Kashyap Chamarthy (kashyapc) wrote :

Here's the change to fix this issue:

    https://review.openstack.org/#/c/485752/ -- libvirt: Post-migration,
    set cache value for Cinder volume(s)

tags: added: libvirt volumes
tags: added: live-migration
summary: - Post-migration, Cinder volumes lose disk cache value resulting in I/O
+ Post-migration, Cinder volumes lose disk cache value, resulting in I/O
latency
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/485752
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=14c38ac0f253036da79f9d07aedf7dfd5778fde8
Submitter: Jenkins
Branch: master

commit 14c38ac0f253036da79f9d07aedf7dfd5778fde8
Author: Kashyap Chamarthy <email address hidden>
Date: Thu Jul 20 19:01:23 2017 +0200

    libvirt: Post-migration, set cache value for Cinder volume(s)

    This was noticed in a downstream bug when a Nova instance with Cinder
    volume (in this case, both the Nova instance storage _and_ Cinder volume
    are located on Ceph) is migrated to a target Compute node, the disk
    cache value for the Cinder volume gets changed. I.e. the QEMU
    command-line for the Cinder volume stored on Ceph turns into the
    following:

    Pre-migration, QEMU command-line for the Nova instance:

        [...] -drive file=rbd:volumes/volume-[...],cache=writeback

    Post-migration, QEMU command-line for the Nova instance:

        [...] -drive file=rbd:volumes/volume-[...],cache=none

    Furthermore, Jason Dillaman from Ceph confirms RBD cache being enabled
    pre-migration:

        $ ceph --admin-daemon /var/run/qemu/ceph-client.openstack.[...] \
            config get rbd_cache
        {
            "rbd_cache": "true"
        }

    And disabled, post-migration:

        $ ceph --admin-daemon /var/run/qemu/ceph-client.openstack.[...] \
            config get rbd_cache
        {
            "rbd_cache": "false"
        }

    This change in cache value post-migration causes I/O latency on the
    Cinder volume.

    From a chat with Daniel Berrangé on IRC: Prior to live migration, Nova
    rewrites all the <disk> elements, and passes this updated guest XML
    across to target libvirt. And it is never calling _set_cache_mode()
    when doing this. So `nova.conf`'s `writeback` setting is getting lost,
    leaving the default `cache=none` setting. And this mistake (of leaving
    the default cache value to 'none') will of course be correct when you
    reboot the guest on the target later.

    So:

      - Call _set_cache_mode() in _get_volume_config() method -- because it
        is a callback function to _update_volume_xml() in
        nova/virt/libvirt/migration.py.

      - And remove duplicate calls to _set_cache_mode() in
        _get_guest_storage_config() and attach_volume().

      - Fix broken unit tests; adjust test_get_volume_config() to reflect
        the disk cache mode.

    Thanks: Jason Dillaman of Ceph for observing the change in cache modes
            in a downstream bug analysis, Daniel Berrangé for help in
            analysis from a Nova libvirt driver POV, and Stefan Hajnoczi
            from QEMU for help on I/O latency instrumentation with `perf`.

    Closes-bug: 1706083
    Change-Id: I4184382b49dd2193d6a21bfe02ea973d02d8b09f
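The root cause described in the commit message above -- Nova regenerating the `<disk>` XML for the target host without re-applying the configured cache mode -- can be illustrated with a small standalone sketch. The element layout mirrors libvirt's domain XML, but the helper functions are hypothetical, not Nova's actual `_set_cache_mode()`/`_get_volume_config()` code:

```python
import xml.etree.ElementTree as ET

# A <disk> element as libvirt sees it before migration: cache='writeback'.
ORIGINAL = """
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <source protocol='rbd' name='volumes/volume-1234'/>
  <target dev='vda' bus='virtio'/>
</disk>
"""

def rewrite_disk_without_cache(disk_xml):
    """Mimic the buggy path: rebuild the <disk> element for the target
    host but drop the driver's cache attribute, so QEMU falls back to
    the default cache=none."""
    disk = ET.fromstring(disk_xml)
    disk.find('driver').attrib.pop('cache', None)  # configured value lost here
    return disk

def set_cache_mode(disk, cache_mode='writeback'):
    """Mimic the fix: re-apply the configured cache mode whenever the
    volume config is (re)generated, including on the migration path."""
    disk.find('driver').set('cache', cache_mode)
    return disk

broken = rewrite_disk_without_cache(ORIGINAL)
assert broken.find('driver').get('cache') is None   # post-migration: cache lost

fixed = set_cache_mode(broken)
print(fixed.find('driver').get('cache'))  # writeback
```

The actual fix takes the same shape: instead of sprinkling `_set_cache_mode()` calls at individual attach points, it is called from `_get_volume_config()`, which the migration XML-update path already uses as a callback, so the cache mode survives every place the volume XML is rebuilt.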

Changed in nova:
status: In Progress → Fix Released
Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/488959

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/489198

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0rc1

This issue was fixed in the openstack/nova 16.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/489198
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=306b3f1aac4fc30ee14a7f5f66474d68860768fa
Submitter: Jenkins
Branch: stable/ocata

commit 306b3f1aac4fc30ee14a7f5f66474d68860768fa
Author: Kashyap Chamarthy <email address hidden>
Date: Thu Jul 20 19:01:23 2017 +0200

    libvirt: Post-migration, set cache value for Cinder volume(s)

    Conflicts [stable/ocata]:
     - libvirt/driver.py: The _get_scsi_controller() method from Git master
       isn't in Ocata, so adjust the _get_guest_storage_config() method
       accordingly.
     - Fix unit test conflicts in the method
       test_attach_volume_with_vir_domain_affect_live_flag().

    Closes-bug: 1706083
    Change-Id: I4184382b49dd2193d6a...


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/488959
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1797d73efc0601a0a664d32a127669e93bce3d45
Submitter: Jenkins
Branch: stable/newton

commit 1797d73efc0601a0a664d32a127669e93bce3d45
Author: Kashyap Chamarthy <email address hidden>
Date: Thu Jul 20 19:01:23 2017 +0200

    libvirt: Post-migration, set cache value for Cinder volume(s)

    Conflicts [stable/newton]:
     - libvirt/driver.py: The _get_scsi_controller() method from Git master
       isn't in Newton, so adjust the _get_guest_storage_config() method
       accordingly.
     - Fix unit test conflicts in the method
       test_attach_volume_with_vir_domain_affect_live_flag().

    Closes-bug: 1706083
    Change-Id: I4184382b49dd2193...


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.7

This issue was fixed in the openstack/nova 15.0.7 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.8

This issue was fixed in the openstack/nova 14.0.8 release.
