Make RBD Usable for Ephemeral Storage

Bug #1226351 reported by Mike Perez on 2013-09-17
This bug affects 19 people
Affects: OpenStack Compute (nova)
Importance: Medium
Assigned to: Dmitry Borodaenko

Bug Description

Currently in Havana development, RBD as ephemeral storage has serious stability
and performance issues that make the Ceph cluster a bottleneck when an image is
used as a source.

Nova currently has to communicate with the external service Glance, which has
to talk to the separate Ceph storage backend to fetch path information. The
entire image is then downloaded to local disk, and then imported from local
disk to RBD. This is a stability concern: with large images in particular,
instance creation may not complete successfully.

This can be eliminated by instead having Nova's RBD image backend utility
communicate directly with the Ceph backend to do a copy-on-write of the image.
Not only does this greatly improve stability, but performance is drastically
improved by not having to do a full copy of the image.
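
To illustrate why a copy-on-write clone avoids the cost of a full copy, here is a toy in-memory model. It is not librbd and not Nova code; `ToyImage` and all of its methods are hypothetical names made up for this sketch. In Ceph, a clone is normally taken from a protected snapshot of the parent image; the toy skips that step.

```python
class ToyImage:
    """Toy block image: a map of block index -> bytes, plus an optional parent."""

    def __init__(self, blocks=None, parent=None):
        self.blocks = dict(blocks or {})  # blocks owned by this image
        self.parent = parent              # image to fall back to on read

    def read(self, index):
        # Serve the block from this image if it was written here,
        # otherwise fall through to the parent (copy-on-write lookup).
        if index in self.blocks:
            return self.blocks[index]
        if self.parent is not None:
            return self.parent.read(index)
        return b""  # unallocated block

    def write(self, index, data):
        # Writes always land in this image and never touch the parent.
        self.blocks[index] = data

    def full_copy(self):
        # Work proportional to the number of blocks in the source.
        return ToyImage(blocks=self.blocks)

    def clone(self):
        # Constant work: only a reference to the parent is stored.
        return ToyImage(parent=self)


base = ToyImage({0: b"boot", 1: b"root"})
child = base.clone()
child.write(1, b"changed")
```

After the write, `child` holds only the one block it changed, and `base` is untouched. A full copy would have duplicated every block up front, which is the step the bug description wants to eliminate.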

Fix proposed to branch: master
Review: https://review.openstack.org/46879

Changed in nova:
assignee: nobody → Josh Durgin (jdurgin)
status: New → In Progress
tags: added: ceph rbd

This seems to have stalled?

Josh Durgin (jdurgin) wrote :

hudson didn't update this, but it's up for review: https://review.openstack.org/#/c/59149/
I expect it will be merged soon since several rounds of review have already been done.

Andrew Woodward (xarses) wrote :

I think that this patch is very useful to the community. What can we do to help push this through?

Can we get this tagged for inclusion into icehouse and an importance set?

Changed in nova:
importance: Undecided → Medium

Reviewed: https://review.openstack.org/59149
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c25c60f6a9ab1ccf12f72f76d400e7c9c0d090b3
Submitter: Jenkins
Branch: master

commit c25c60f6a9ab1ccf12f72f76d400e7c9c0d090b3
Author: Josh Durgin <email address hidden>
Date: Wed Jan 22 15:07:17 2014 -0800

    enable cloning for rbd-backed ephemeral disks

    Currently when using rbd as an image backend, nova downloads the
    glance image to local disk and then copies it again into rbd. This
    can be very slow for large images, and wastes bandwidth as well as
    disk space.

    When the glance image is stored in the same ceph cluster, the data is
    being pulled out and pushed back in unnecessarily. Instead, create a
    copy-on-write clone of the image. This is fast, and does not depend
    on the size of the image. Instead of taking minutes, booting takes
    seconds, and is not limited by the disk copy.

    Add some rbd utility functions from cinder to support cloning and
    let the rbd imagebackend rely on librbd instead of the rbd
    command line tool for checking image existence.

    Add an ImageHandler for rbd that does the cloning if an applicable
    image location is available. If no such location is available, or rbd
    is not configured for ephemeral disks, this handler does nothing, so
    enable it by default.

    blueprint rbd-clone-image-handler
    Closes-bug: 1226351
    Change-Id: I9b77a50206d0eda709df8356faaeeba35d232f22
    Signed-off-by: Josh Durgin <email address hidden>
    Signed-off-by: Zhi Yan Liu <email address hidden>
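
The "applicable image location" check described in the commit message can be sketched roughly as below. This is an illustration, not the merged code: it assumes Glance exposes RBD locations in the form `rbd://<fsid>/<pool>/<image>/<snapshot>`, and the function names `parse_rbd_url` and `is_cloneable` as well as the `local_fsid` parameter are hypothetical.

```python
from urllib.parse import unquote


def parse_rbd_url(url):
    """Split an assumed rbd://fsid/pool/image/snapshot URL into its four parts."""
    prefix = "rbd://"
    if not url.startswith(prefix):
        raise ValueError("not an rbd location: %r" % url)
    pieces = [unquote(p) for p in url[len(prefix):].split("/")]
    if len(pieces) != 4 or "" in pieces:
        raise ValueError("malformed rbd location: %r" % url)
    return tuple(pieces)


def is_cloneable(url, local_fsid):
    """Return True only if the image lives in the cluster identified by local_fsid."""
    try:
        fsid, _pool, _image, _snapshot = parse_rbd_url(url)
    except ValueError:
        return False
    # Cloning only makes sense inside the same Ceph cluster.
    return fsid == local_fsid
```

If the check fails, the caller would do nothing special and the image is fetched the usual way, which matches the "this handler does nothing" behaviour described above.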

Changed in nova:
status: In Progress → Fix Committed
Changed in nova:
milestone: none → icehouse-rc1
Andrew Woodward (xarses) on 2014-03-13
Changed in nova:
status: Fix Committed → In Progress
Changed in nova:
milestone: icehouse-rc1 → none

Any status on this patch? It was reverted before Icehouse was released.

Shuquan Huang (shuquan) wrote :

I merged the patch from master, but it doesn't work. Is there anything else that needs to be configured?

Changed in nova:
assignee: Josh Durgin (jdurgin) → Dmitry Borodaenko (dborodaenko)
Dmitry Borodaenko (angdraug) wrote :

Current incarnation of this patch:
https://review.openstack.org/94295

Changed in nova:
assignee: Dmitry Borodaenko (dborodaenko) → Jay Pipes (jaypipes)
Changed in nova:
assignee: Jay Pipes (jaypipes) → Dmitry Borodaenko (dborodaenko)
Changed in nova:
assignee: Dmitry Borodaenko (dborodaenko) → Michael Still (mikalstill)
Changed in nova:
assignee: Michael Still (mikalstill) → Dmitry Borodaenko (dborodaenko)

Reviewed: https://review.openstack.org/94295
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=112b140e2daa7207a8d37c77a92456b155f3ecb9
Submitter: Jenkins
Branch: master

commit 112b140e2daa7207a8d37c77a92456b155f3ecb9
Author: Dmitry Borodaenko <email address hidden>
Date: Mon May 19 16:30:14 2014 -0700

    Enable cloning for rbd-backed ephemeral disks

    Currently when using rbd as an image backend, nova downloads the glance
    image to local disk and then copies it again into rbd. This can be very
    slow for large images, and wastes bandwidth as well as disk space.

    When the glance image is stored in the same ceph cluster, the data is
    being pulled out and pushed back in unnecessarily. Instead, create a
    copy-on-write clone of the image. This is fast, and does not depend on
    the size of the image. Instead of taking minutes, booting takes seconds,
    and is not limited by the disk copy.

    Add some rbd utility functions from cinder to support cloning and let
    the rbd imagebackend rely on librbd instead of the rbd command line tool
    for checking image existence.

    Adds a new clone() method to the image backend, so backends like rbd can
    make optimizations like this. Try to use clone() for the root disk when
    it comes from an image, but fall back to fetch_to_raw() if clone()
    fails.

    Instead of calling disk.get_disk_size() directly from
    verify_base_size(), which assumes the disk is stored locally, add a new
    method that is overridden by the Rbd subclass to get the disk size.

    DocImpact

    Implements: blueprint rbd-clone-image-handler
    Closes-Bug: 1226351
    Co-Authored-By: Josh Durgin <email address hidden>
    Signed-Off-By: Josh Durgin <email address hidden>
    Signed-Off-By: Zhi Yan Liu <email address hidden>
    Signed-Off-By: Dmitry Borodaenko <email address hidden>
    Change-Id: I0f50659b54a92fc21086990be8925ea15008569a
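
The clone-with-fallback flow and the overridable size lookup from this commit message can be sketched as follows. Only the method names `clone()`, `fetch_to_raw()`, `verify_base_size()` and `get_disk_size()` come from the commit message; the classes, signatures, exception type and string return values are hypothetical stand-ins and do not reflect Nova's real interfaces.

```python
class CloneNotSupported(Exception):
    """Raised by a backend that cannot clone the given location."""


class LocalBackend:
    def __init__(self, size_on_disk=0):
        self._size = size_on_disk

    def clone(self, location):
        # A plain local backend has no cheap way to clone.
        raise CloneNotSupported(location)

    def fetch_to_raw(self, location):
        # Stand-in for downloading and converting the image.
        return "fetched " + location

    def get_disk_size(self):
        # Stand-in for inspecting a file on local disk.
        return self._size


class RbdBackend(LocalBackend):
    def __init__(self, fsid, image_size=0):
        super().__init__()
        self.fsid = fsid
        self._image_size = image_size

    def clone(self, location):
        # Only images in the same cluster can be cloned.
        if not location.startswith("rbd://%s/" % self.fsid):
            raise CloneNotSupported(location)
        return "cloned " + location

    def get_disk_size(self):
        # Overridden: the disk is not a local file, so ask the cluster instead.
        return self._image_size


def create_root_disk(backend, location):
    # Prefer the cheap clone; fall back to a full fetch if it is not possible.
    try:
        return backend.clone(location)
    except CloneNotSupported:
        return backend.fetch_to_raw(location)


def verify_base_size(backend, requested_size):
    # The size check no longer assumes the base image is stored locally.
    return requested_size >= backend.get_disk_size()
```

For example, `create_root_disk(RbdBackend("abc"), "rbd://abc/images/i/snap")` takes the clone path, while a location in another cluster or a non-RBD backend falls back to `fetch_to_raw()`.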

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2014-09-05
Changed in nova:
milestone: none → juno-3
status: Fix Committed → Fix Released
Abel Lopez (al592b) wrote :

Would love to see this in Icehouse...

Xav Paice (xavpaice) wrote :

+1 for Icehouse. The patch does apply cleanly - there's a good source for the patch in the Debian packages for Icehouse (plus a bunch of other useful ones for Ceph users).

Dmitry Borodaenko (angdraug) wrote :

The whole patch series for Icehouse is available on Github here:
https://github.com/angdraug/nova/tree/rbd-ephemeral-clone-stable-icehouse

If there's a will from the Nova team to review and merge this to stable/icehouse, I can post this patch series to gerrit.

Abel Lopez (al592b) on 2014-09-22
tags: added: icehouse-backport-potential
Thierry Carrez (ttx) on 2014-10-16
Changed in nova:
milestone: juno-3 → 2014.2
cristi1979 (cristi-falcas) wrote :

What files do I have to update from the git repo (icehouse) for this to work?

Dmitry Borodaenko (angdraug) wrote :

$ git whatchanged --oneline 2014.1.3..rbd-ephemeral-clone-stable-icehouse|awk '/^:/{print $6}'|grep -v ^nova/tests/|sort -u
nova/compute/manager.py
nova/compute/rpcapi.py
nova/exception.py
nova/virt/baremetal/driver.py
nova/virt/driver.py
nova/virt/fake.py
nova/virt/hyperv/driver.py
nova/virt/imagehandler/__init__.py
nova/virt/images.py
nova/virt/libvirt/driver.py
nova/virt/libvirt/imagebackend.py
nova/virt/libvirt/rbd.py
nova/virt/libvirt/rbd_utils.py
nova/virt/libvirt/utils.py
nova/virt/vmwareapi/driver.py
nova/virt/xenapi/driver.py
