RBD driver doesn't support customized ceph cluster name

Bug #1444855 reported by Huang Zhiteng
This bug affects 1 person
Affects   Status         Importance   Assigned to     Milestone
Cinder    Fix Released   Low          Huang Zhiteng
Juno      Fix Released   Undecided    Huang Zhiteng

Bug Description

The RBD driver fails to connect to a Ceph cluster if the cluster has a name other than 'ceph'.

The problem is that Ceph's rados.py assumes the cluster name is 'ceph' when 'clustername' is None at the time the Rados class is instantiated (https://github.com/ceph/ceph/blob/master/src/pybind/rados.py#L297), which unfortunately is how the Cinder RBD driver uses it (https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L302).

That results in an error like this:

2015-04-16 00:43:17.052 20354 ERROR cinder.volume.drivers.rbd [req-ecf1c044-e4d0-4ea4-a080-c40acbafc233 None None] error connecting to ceph cluster
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd Traceback (most recent call last):
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd File "/opt/openstack/cinder/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 262, in check_for_setup_error
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd with RADOSClient(self):
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd File "/opt/openstack/cinder/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 234, in __init__
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd self.cluster, self.ioctx = driver._connect_to_rados(pool)
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd File "/opt/openstack/cinder/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 283, in _connect_to_rados
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd client.connect()
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd File "/usr/lib/python2.7/dist-packages/rados.py", line 417, in connect
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd raise make_ex(ret, "error calling connect")
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd ObjectNotFound: error calling connect
2015-04-16 00:43:17.052 20354 TRACE cinder.volume.drivers.rbd
2015-04-16 00:43:17.054 20354 ERROR cinder.volume.manager [req-ecf1c044-e4d0-4ea4-a080-c40acbafc233 None None] Error encountered during initialization of driver: RBDDriver
2015-04-16 00:43:17.054 20354 ERROR cinder.volume.manager [req-ecf1c044-e4d0-4ea4-a080-c40acbafc233 None None] Bad or unexpected response from the storage volume backend API: error connecting to ceph cluster
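
The fallback described in the report can be illustrated with a minimal sketch. It does not import or reproduce rados.py; resolve_cluster_name is a hypothetical helper that only mirrors the behaviour stated above (a missing 'clustername' is treated as 'ceph').

```python
def resolve_cluster_name(clustername=None):
    """Mirror the fallback described in the report: no name means 'ceph'."""
    return "ceph" if clustername is None else clustername


# The driver call described in the report passes no cluster name,
# so a cluster called "cluster_foo" (hypothetical name) is never looked up.
name_used_by_driver = resolve_cluster_name()
name_needed = resolve_cluster_name("cluster_foo")
print(name_used_by_driver, name_needed)
```

With the name resolved to 'ceph', the connection is attempted against a cluster that does not exist on the node, which is consistent with the ObjectNotFound error in the trace.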

tags: added: driver rbd
Jon Bernard (jbernard)
Changed in cinder:
assignee: nobody → Jon Bernard (jbernard)
Jon Bernard (jbernard)
Changed in cinder:
status: New → Confirmed
importance: Undecided → Low
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/174415

Changed in cinder:
assignee: Jon Bernard (jbernard) → Huang Zhiteng (zhiteng-huang)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/174482

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/174415
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=6db57c53a1363ff267e58f88b0937cd5d0e842c4
Submitter: Jenkins
Branch: master

commit 6db57c53a1363ff267e58f88b0937cd5d0e842c4
Author: Zhiteng Huang <email address hidden>
Date: Thu Apr 16 22:47:31 2015 +0800

    Add support for customized cluster name

    Current RBD driver assumes ceph cluster name to be 'ceph', for
    cluster has a different name, the driver won't be able to connect
    to the cluster. This change add a new config option
    'rbd_cluster_name' to address this issue.

    DocImpact

    Change-Id: I02ae1a255fd613fce291cc7ddf90cfd9175255a8
    Closes-bug: #1444855
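
A rough sketch of what the change amounts to, assuming the driver forwards the new option to the Rados constructor. This is not the merged code: rados_connect_kwargs and the SimpleNamespace stand-in for the driver configuration are hypothetical, and the option values are examples. The keyword names rados_id, clustername, and conffile are those accepted by rados.Rados.

```python
from types import SimpleNamespace


def rados_connect_kwargs(configuration):
    """Build keyword arguments for rados.Rados(), including the cluster name."""
    return {
        "rados_id": configuration.rbd_user,
        "clustername": configuration.rbd_cluster_name,
        "conffile": configuration.rbd_ceph_conf,
    }


# Stand-in for the driver configuration; the values are examples only.
conf = SimpleNamespace(
    rbd_user="volumes",
    rbd_cluster_name="cluster_foo",
    rbd_ceph_conf="/etc/ceph/cluster_foo.conf",
)
kwargs = rados_connect_kwargs(conf)
print(kwargs)
# With python-rados installed, the client would be created as rados.Rados(**kwargs).
```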

Changed in cinder:
status: In Progress → Fix Committed
Josh Durgin (jdurgin) wrote :

Thanks Winston! That's definitely the proper fix. As a workaround for older versions, one could set rbd_ceph_conf to a ceph config file with the setting cluster=clustername in the global section (and possibly other settings that depend on cluster name by default, like keyring).

Huang Zhiteng (zhiteng-huang) wrote :

Hi Josh,

Thanks for the tip. I tried the workaround you mentioned by adding 'cluster=clustername' to both the ceph config file and the client keyring, but the RBD driver is still emitting the same error. I'll take a closer look at the ceph configuration and rados.py.

Josh Durgin (jdurgin) wrote :

Ah, there may be a librados internal I'm forgetting that depends on the cluster name that's only initialized during rados_create(). If so it'd be good to figure out what that is, since it should be documented in the librados interface.

Huang Zhiteng (zhiteng-huang) wrote :

Josh, it seems that librados::RadosClient only looks at clustername when it is created, at https://github.com/ceph/ceph/blob/master/src/pybind/rados.py#L300 -> https://github.com/ceph/ceph/blob/master/src/librados/librados.cc#L2219. I'm not sure whether there are other places where librados creates a client and would parse the ceph config file.

You know librados much better than I do; could you please post your findings here? If the workaround works, it would be nice for old versions; if not, I'll backport this fix.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/174700

Josh Durgin (jdurgin) wrote :

By all means backport the fix. I'll check into librados for more clarity, and see if there's a workaround for older openstack versions that don't get backports anymore.

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/juno)

Reviewed: https://review.openstack.org/174700
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=f5895fe819646f1b30b83b9d28cbaa75ff5dffc4
Submitter: Jenkins
Branch: stable/juno

commit f5895fe819646f1b30b83b9d28cbaa75ff5dffc4
Author: Zhiteng Huang <email address hidden>
Date: Thu Apr 16 22:47:31 2015 +0800

    Add support for customized cluster name

    Current RBD driver assumes ceph cluster name to be 'ceph', for
    cluster has a different name, the driver won't be able to connect
    to the cluster. This change add a new config option
    'rbd_cluster_name' to address this issue.

    DocImpact

    Change-Id: I02ae1a255fd613fce291cc7ddf90cfd9175255a8
    Closes-bug: #1444855
    (cherry picked from commit 6db57c53a1363ff267e58f88b0937cd5d0e842c4)

tags: added: in-stable-juno
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/175258

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/175475

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/175475
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=877c8e7d8c0d397021e1a06b2d2a36962d6a9d42
Submitter: Jenkins
Branch: master

commit 877c8e7d8c0d397021e1a06b2d2a36962d6a9d42
Author: Zhiteng Huang <email address hidden>
Date: Tue Apr 21 00:25:11 2015 +0800

    RBD: Add missing Ceph customized cluster name support

    It turns out '--cluster' is also needed when RBD driver talks to
    ceph cluster using 'ceph' command (not via librados). This change
    appends RBDDriver._ceph_args with '--cluster' when 'rbd_cluster_name'
    config option is not None.

    Change-Id: Ie957a3658a630947a140f4172f775e42b7611c6e
    Closes-bug: #1444855
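
The argument handling this commit describes could look roughly like the sketch below. It is a standalone function written for illustration, not the driver's actual RBDDriver._ceph_args method; the parameter names follow the Cinder option names, and the example values are hypothetical. The --id, --conf, and --cluster flags are standard options of the 'ceph' command.

```python
def ceph_args(rbd_user=None, rbd_ceph_conf=None, rbd_cluster_name=None):
    """Build the common arguments passed to the 'ceph' command line tool."""
    args = []
    if rbd_user:
        args.extend(["--id", rbd_user])
    if rbd_ceph_conf:
        args.extend(["--conf", rbd_ceph_conf])
    # Only append '--cluster' when a cluster name is configured.
    if rbd_cluster_name is not None:
        args.extend(["--cluster", rbd_cluster_name])
    return args


default_args = ceph_args(rbd_user="volumes")
custom_args = ceph_args(rbd_user="volumes", rbd_cluster_name="cluster_foo")
print(default_args)
print(custom_args)
```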

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/175723

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/175724

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/175725

OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (stable/icehouse)

Change abandoned by Huang Zhiteng (<email address hidden>) on branch: stable/icehouse
Review: https://review.openstack.org/175725

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Huang Zhiteng (<email address hidden>) on branch: stable/icehouse
Review: https://review.openstack.org/175258

Josh Durgin (jdurgin) wrote :

The following workaround worked for me (just changing the cinder-volume node):

1) If you don't have one, create a ceph config file with at least these settings. It can be anywhere readable by cinder-volume, but an easy place to remember is /etc/ceph/cluster_foo.conf

[global]
mon host = host1,host2,host3
[client.volumes]
keyring = /etc/ceph/cluster_foo.client.volumes.keyring

2) In cinder.conf, set rbd_ceph_conf = /etc/ceph/cluster_foo.conf or wherever you put the ceph configuration

The rbd_ceph_conf setting is already used by all the shell commands and library calls in the rbd driver, so this should work for everything.

The reason the keyring needs to be referenced explicitly is that the default locations in ceph expand the $cluster variable before it can be applied from the config file. Other ceph settings that use $cluster in the default value are log_file and admin_socket.
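
The $cluster expansion issue can be illustrated with a small sketch. It assumes the first default keyring location in ceph is /etc/ceph/$cluster.$name.keyring (an assumption, not stated in this thread); default_keyring is a hypothetical helper that uses Python's string.Template to imitate the substitution, not ceph's own code.

```python
from string import Template

# Assumed default keyring pattern; ceph substitutes $cluster and $name.
DEFAULT_KEYRING = Template("/etc/ceph/$cluster.$name.keyring")


def default_keyring(cluster, name):
    """Expand the assumed default keyring path for a cluster and client name."""
    return DEFAULT_KEYRING.substitute(cluster=cluster, name=name)


# The default is expanded while the cluster name is still 'ceph' ...
expanded_too_early = default_keyring("ceph", "client.volumes")
# ... whereas the keyring for the renamed cluster lives here:
actual_keyring = default_keyring("cluster_foo", "client.volumes")
print(expanded_too_early)
print(actual_keyring)
```

Because the two paths differ, the keyring has to be given explicitly in the [client.volumes] section, as in step 1 of the workaround.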

OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (stable/juno)

Change abandoned by Huang Zhiteng (<email address hidden>) on branch: stable/juno
Review: https://review.openstack.org/175724

OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (stable/kilo)

Change abandoned by Huang Zhiteng (<email address hidden>) on branch: stable/kilo
Review: https://review.openstack.org/175723

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Huang Zhiteng (<email address hidden>) on branch: stable/kilo
Review: https://review.openstack.org/174482

Thierry Carrez (ttx)
Changed in cinder:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: liberty-1 → 7.0.0