manila-ganesha fails to initialize CephFS driver along with cephfs charm usage: ceph_volume_client.EvictionError: Failed to evict client with auth_name=manila-ganesha from mds

Bug #1929699 reported by Vladimir Grevtsev
This bug affects 4 people
Affects: OpenStack Manila-Ganesha Charm
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

== Environment

bionic/ussuri cloud, latest stable charms. The same behaviour was observed on the focal/ussuri cloud as well.

bundle: https://pastebin.canonical.com/p/yrXmGxpKrY/
juju status: https://paste.ubuntu.com/p/twgjCHmKYR/

== Problem statement

The manila-share service on the manila-ganesha unit tries to initialize the CephFS driver, but fails at the client eviction step:

Error encountered during initialization of driver CephFSDriver@juju-551dcc-5-lxd-2@cephfsnfs1: ceph_volume_client.EvictionError: Failed to evict client with auth_name=manila-ganesha from mds 0/4941: Error -13 ("") while Sending evict to mds.4941

== Analysis

The full log output of the manila-share service: https://paste.ubuntu.com/p/hgDwzYZFrZ/

Meanwhile, the MDS log on the ceph-fs unit contains the following:

2021-05-26T11:36:29.675+0000 7faafe573700 1 mds.juju-551dcc-3-lxd-0 handle_command: received command from client without `tell` capability: 172.27.84.251:0/2721661976

And indeed, there is no "mds" capability defined for the client.manila-ganesha user created in Ceph:

ubuntu@OrangeBox84:~/fce-demo$ j ssh ceph-mon/0

Last login: Wed May 26 11:06:58 2021 from 172.27.84.1
ubuntu@juju-551dcc-1-lxd-0:~$ sudo ceph auth get client.manila-ganesha
exported keyring for client.manila-ganesha
[client.manila-ganesha]
 key = AQC8Ja5gbC1ZNhAAZEcfuaaJrrx/CuMbOvg89A==
 caps mon = "allow r; allow command \"osd blacklist\""
 caps osd = "allow rwx"

The "mds" capability has only been granted for the mon units, ceph-fs units, and the "client.admin" and "client.bootstrap-mds" users: http://paste.ubuntu.com/p/bcwPD9psMY/
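The check above (looking for a missing "mds" section in the keyring) can be scripted. The following is a minimal sketch, assuming the keyring text uses the `caps <section> = "<value>"` layout shown in the `ceph auth get` output in this report. `parse_caps` and `missing_sections` are hypothetical helper names, not part of Ceph or the charm, and the key in the sample is a placeholder.

```python
import re

# Matches lines such as:  caps mon = "allow r; allow command \"osd blacklist\""
CAPS_LINE = re.compile(r'^\s*caps\s+(\w+)\s*=\s*"(.*)"\s*$')


def parse_caps(keyring_text):
    """Return {section: caps string} parsed from `ceph auth get` output."""
    caps = {}
    for line in keyring_text.splitlines():
        match = CAPS_LINE.match(line)
        if match:
            section, value = match.groups()
            # The keyring escapes inner double quotes; undo that for readability.
            caps[section] = value.replace('\\"', '"')
    return caps


def missing_sections(keyring_text, required=("mds", "mon", "osd")):
    """List the required cap sections that are absent from the keyring."""
    caps = parse_caps(keyring_text)
    return [section for section in required if section not in caps]


# Keyring shaped like the one in this report before the manual fix
# (the key value is a placeholder, not a real key).
SAMPLE = '''[client.manila-ganesha]
 key = <redacted>
 caps mon = "allow r; allow command \\"osd blacklist\\""
 caps osd = "allow rwx"
'''

print(missing_sections(SAMPLE))
```

For the sample keyring this reports `mds` as the only missing section, which matches the observation above.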

The issue disappears after granting the capability manually:

ubuntu@juju-551dcc-1-lxd-0:~$ sudo ceph auth caps client.manila-ganesha mon 'allow r; allow command "osd blacklist"' mds 'allow *' osd 'allow rwx'
updated caps for client.manila-ganesha
ubuntu@juju-551dcc-1-lxd-0:~$ sudo ceph auth get client.manila-ganesha
exported keyring for client.manila-ganesha
[client.manila-ganesha]
 key = AQC8Ja5gbC1ZNhAAZEcfuaaJrrx/CuMbOvg89A==
 caps mds = "allow *"
 caps mon = "allow r; allow command \"osd blacklist\""
 caps osd = "allow rwx"

After that, the manila-share service was finally able to initialize the driver:

2021-05-26 11:44:32.807 18296 INFO manila.share.manager [req-22acddb5-b684-4172-bd90-31c4123f3676 - - - - -] Finished initialization of driver: 'CephFSDriver@juju-551dcc-5-lxd-2@cephfsnfs1'

(full log: https://paste.ubuntu.com/p/2Gb48B9yXs/)

Tags: field-high
Vladimir Grevtsev (vlgrevtsev) wrote :

Subscribing field-high as this affects ongoing customer delivery.

Side note: the target environment is focal/ussuri; the bionic/ussuri cloud was a lab environment used to reproduce the issue.

tags: added: field-high
Vladimir Grevtsev (vlgrevtsev) wrote :

Even after adding the capability manually, the "manila access-allow ..." command fails with a RADOS command error in manila-share.log: https://pastebin.canonical.com/p/JcYDPTGY5F/

Meanwhile, the ceph-mon log shows the following:

2021-05-27T13:25:03.934+0000 7fdbf9bf1700 0 mon.juju-291059-3-lxd-1@2(peon) e2 handle_command mon_command({"prefix": "auth get", "entity": "client.ganesha-0737adf5-016d-4472-8926-161d6fdf583e", "format": "json"} v 0) v1
2021-05-27T13:25:03.934+0000 7fdbf9bf1700 1 mon.juju-291059-3-lxd-1@2(peon) e2 handle_command access denied
2021-05-27T13:25:03.934+0000 7fdbf9bf1700 0 log_channel(audit) log [INF] : from='client.? 172.16.155.21:0/2708241706' entity='client.manila-ganesha' cmd=[{"prefix": "auth get", "entity": "client.ganesha-0737adf5-016d-4472-8926-161d6fdf583e", "format": "json"}]: access denied
2021-05-27T13:25:03.998+0000 7fdbf9bf1700 0 mon.juju-291059-3-lxd-1@2(peon) e2 handle_command mon_command({"prefix": "auth get-or-create", "entity": "client.ganesha-0737adf5-016d-4472-8926-161d6fdf583e", "caps": ["mds", "allow rw path=/volumes/_nogroup/0737adf5-016d-4472-8926-161d6fdf583e", "osd", "allow rw pool=ceph-fs_data namespace=fsvolumens_0737adf5-016d-4472-8926-161d6fdf583e", "mon", "allow r"], "format": "json"} v 0) v1
2021-05-27T13:25:03.998+0000 7fdbf9bf1700 1 mon.juju-291059-3-lxd-1@2(peon) e2 handle_command access denied
2021-05-27T13:25:03.998+0000 7fdbf9bf1700 0 log_channel(audit) log [INF] : from='client.? 172.16.155.21:0/2708241706' entity='client.manila-ganesha' cmd=[{"prefix": "auth get-or-create", "entity": "client.ganesha-0737adf5-016d-4472-8926-161d6fdf583e", "caps": ["mds", "allow rw path=/volumes/_nogroup/0737adf5-016d-4472-8926-161d6fdf583e", "osd", "allow rw pool=ceph-fs_data namespace=fsvolumens_0737adf5-016d-4472-8926-161d6fdf583e", "mon", "allow r"], "format": "json"}]: access denied
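To see at a glance which mon commands are being refused, the audit entries above can be filtered with a short script. This is a minimal sketch that assumes the `entity='...' cmd=[{...}]: access denied` layout of the lines quoted above. `denied_commands` is a hypothetical helper name, and the `client.ganesha-example` entity in the sample is made up for illustration.

```python
import json
import re

# Matches the audit-channel entries quoted above, e.g.
#   ... entity='client.manila-ganesha' cmd=[{...}]: access denied
DENIED = re.compile(r"entity='([^']+)' cmd=\[(\{.*\})\]: access denied")


def denied_commands(log_text):
    """Yield (entity, command prefix) for each denied mon command."""
    for line in log_text.splitlines():
        match = DENIED.search(line)
        if not match:
            continue
        entity, raw_cmd = match.groups()
        try:
            cmd = json.loads(raw_cmd)
        except json.JSONDecodeError:
            continue  # skip entries whose command is not a single JSON object
        yield entity, cmd.get("prefix", "")


# One audit line in the same shape as the log above (entity name is illustrative).
SAMPLE = (
    "2021-05-27T13:25:03.934+0000 7fdbf9bf1700 0 log_channel(audit) log [INF] : "
    "from='client.? 172.16.155.21:0/2708241706' entity='client.manila-ganesha' "
    'cmd=[{"prefix": "auth get", "entity": "client.ganesha-example", '
    '"format": "json"}]: access denied'
)

print(list(denied_commands(SAMPLE)))
```

Applied to the log above, this would list `auth get` and `auth get-or-create` as the commands denied to `client.manila-ganesha`.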

Vladimir Grevtsev (vlgrevtsev) wrote :

I've noticed that the previously granted permissions are not enough, according to the charm code: https://github.com/openstack/charm-manila-ganesha/blob/master/src/lib/charm/openstack/manila_ganesha.py#L45-L52

Trying to fix that:

root@juju-291059-3-lxd-1:~# ceph auth caps client.manila-ganesha mon 'allow r; allow command "osd blacklist"; allow command "auth del"; allow command "auth caps"; allow command "auth get"; allow command "auth get-or-create"' mds 'allow *' osd 'allow rwx'
updated caps for client.manila-ganesha
root@juju-291059-3-lxd-1:~# ceph auth get client.manila-ganesha
exported keyring for client.manila-ganesha
[client.manila-ganesha]
        key = AQDZmZtg7rn3JxAAg1H3RhG5iqC6kRzEEdZcnA==
        caps mds = "allow *"
        caps mon = "allow r; allow command \"osd blacklist\"; allow command \"auth del\"; allow command \"auth caps\"; allow command \"auth get\"; allow command \"auth get-or-create\""
        caps osd = "allow rwx"
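A mon caps string can be compared against the commands granted in the `ceph auth caps` call above. The following is a minimal sketch. The `REQUIRED_COMMANDS` list is copied from that call and is only assumed to be what manila-share needs; `allowed_commands` and `missing_commands` are hypothetical helper names. The input is the unescaped caps string as passed on the command line.

```python
import re

# Commands granted via "allow command" in the `ceph auth caps` call above.
# Assumption: this list is what manila-share needs; it is taken from this report.
REQUIRED_COMMANDS = (
    "osd blacklist",
    "auth del",
    "auth caps",
    "auth get",
    "auth get-or-create",
)


def allowed_commands(mon_caps):
    """Return the set of commands named in `allow command "..."` grants."""
    return set(re.findall(r'allow command "([^"]+)"', mon_caps))


def missing_commands(mon_caps, required=REQUIRED_COMMANDS):
    """List required commands the mon caps string does not grant."""
    if re.search(r"(^|;)\s*allow \*\s*(;|$)", mon_caps):
        return []  # a blanket grant covers every command
    granted = allowed_commands(mon_caps)
    return [cmd for cmd in required if cmd not in granted]


# The mon caps originally found on client.manila-ganesha in this report.
print(missing_commands('allow r; allow command "osd blacklist"'))
```

For the original caps this reports the four `auth ...` commands as missing.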

Trying again:
manila create --share-type cephfsnfstype --name cephnfsshare1 nfs 1
manila access-allow cephnfsshare1 ip <IP_ADDR> --access-level rw

-> same rados.Error: access denied in manila-share.log

ceph-mon: https://pastebin.canonical.com/p/CmQTDpNSYf/

Even granting "allow *" on mon didn't help:

root@juju-291059-3-lxd-1:~# ceph auth caps client.manila-ganesha mon 'allow *' mds 'allow *' osd 'allow rwx'
updated caps for client.manila-ganesha
root@juju-291059-3-lxd-1:~# ceph auth get client.manila-ganesha
exported keyring for client.manila-ganesha
[client.manila-ganesha]
        key = AQDZmZtg7rn3JxAAg1H3RhG5iqC6kRzEEdZcnA==
        caps mds = "allow *"
        caps mon = "allow *"
        caps osd = "allow rwx"

Same result in manila-share.log on the manila-ganesha unit: https://pastebin.canonical.com/p/jybZqWGtSg/

Vladimir Grevtsev (vlgrevtsev) wrote :

However, after restarting the manila-share service, it was finally able to change the permissions:

2021-05-27 14:34:09.393 688319 INFO manila.share.access [req-cd5d3e37-5fdc-46bf-80f6-234e491e3e77 9b7cb182ef794fcda9ca67392da762c4 fd3fdad279c148a58ce3e0602f862d7e - - -] Access rules were successfully modified for share instance 3902e78a-9317-462d-9eb4-263d2d4a6824 belonging to share 450880ec-9675-4c64-a09a-e0e9e598fb3f.

The final workaround is:

1) [on ceph-mon] sudo ceph auth caps client.manila-ganesha mon 'allow r; allow command "osd blacklist"; allow command "auth del"; allow command "auth caps"; allow command "auth get"; allow command "auth get-or-create"' mds 'allow *' osd 'allow rwx'
2) [on manila-ganesha] sudo service manila-share restart
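Step 1 of the workaround can be generated rather than typed by hand, which avoids quoting mistakes in the long mon caps string. This is a minimal sketch. The caps are copied from the workaround above and are an assumption about what the charm intends to grant; `build_auth_caps_argv` is a hypothetical helper name. The script only prints the command and does not run it.

```python
import shlex

# Caps from the workaround in this report; treat them as an assumption about
# what the charm intends to grant, not as an authoritative list.
MON_COMMANDS = (
    "osd blacklist",
    "auth del",
    "auth caps",
    "auth get",
    "auth get-or-create",
)


def build_auth_caps_argv(entity="client.manila-ganesha"):
    """Return the argv for the `ceph auth caps` call used in step 1."""
    mon = "; ".join(["allow r"] + [f'allow command "{c}"' for c in MON_COMMANDS])
    return ["ceph", "auth", "caps", entity,
            "mon", mon, "mds", "allow *", "osd", "allow rwx"]


argv = build_auth_caps_argv()
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

The printed line should match the command in step 1 (without `sudo`). Passing `argv` as a list to `subprocess.run` on the ceph-mon unit would avoid shell quoting altogether; `shlex.join` requires Python 3.8 or later.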

However, it's not clear yet why those permissions were missing initially, as the charm code clearly defines them.

Vladimir Grevtsev (vlgrevtsev) wrote :

I did two redeployments recently and the permissions are now back in place, so this currently looks like a race condition. I'll do some more redeployments to confirm.

Vladimir Grevtsev (vlgrevtsev) wrote :

ubuntu@juju-09efdd-3-lxd-2:~$ sudo ceph auth get client.manila-ganesha
exported keyring for client.manila-ganesha
[client.manila-ganesha]
        key = AQBOiLRgipCgHRAAtLmcLpW12lY1GSWWJwaFxA==
        caps mon = "allow r; allow command \"osd blacklist\""
        caps osd = "allow rwx"
ubuntu@juju-09efdd-3-lxd-2:~$

And again, the expected caps (mds and the extra mon commands) are not defined for some reason.

Nobuto Murata (nobuto) wrote :

I'm working on the same environment, but the issue is no longer reproducible somehow. Setting this as Incomplete for now.

Changed in charm-manila-ganesha:
status: New → Incomplete
Chris MacNaughton (chris.macnaughton) wrote :

@Nobuto: that's likely due to the fix for https://bugs.launchpad.net/charm-manila-ganesha/+bug/1889287 landing, as the change does include a bit more guarding for setting up the cluster.

Nobuto Murata (nobuto) wrote :

> @Nobuto: that's likely due to the fix for https://bugs.launchpad.net/charm-manila-ganesha/+bug/1889287 landing, as the change does include a bit more guarding for setting up the cluster.

The latest patches will make it more solid for sure, but I couldn't reproduce it with the stable charms either, despite multiple attempts. Anyway, we can leave it as Incomplete and let it expire unless other people hit this.

Billy Olsen (billy-olsen) wrote :

Since this is Incomplete and cannot be reproduced now that changes improving this area have landed, I'm removing the field-high subscription.

Przemyslaw Hausman (phausman) wrote :

I'm hitting the same bug on a different focal/ussuri deployment.

manila-ganesha charm revision 37

After I applied the workaround from @vlgrevtsev, the CephFS driver was finally able to initialize.

Changed in charm-manila-ganesha:
status: Incomplete → New