ReadWriteMany volume permissions are too restrictive

Bug #1866262 reported by David Coronel
Affects                          Status        Importance  Assigned to      Milestone
CDK Addons                       Fix Released  High        Joseph Borg
Kubernetes Control Plane Charm   Fix Released  Medium      Kevin W Monroe
OpenStack Ceph-FS Charm          Invalid       Undecided   Unassigned

Bug Description

When I create a notebook in Kubeflow and I specify an additional data volume with ReadWriteMany, I get an error when I try to mount or use the directory inside the notebook. I use a terminal inside Kubeflow to do the testing.

Here is what a Kubeflow notebook looks like with one additional ReadWriteOnce volume and one additional ReadWriteMany volume:

tf-docker ~ > ls -la
ls: data-vol-2: Permission denied
total 6
drwxrws--- 5 root users 1058 Mar 5 21:37 .
drwxr-xr-x 1 root root 4096 Sep 28 06:31 ..
-rw-r--r-- 1 jovyan users 0 Mar 5 21:37 allo
drwxrws--- 2 root users 0 Mar 5 22:10 data-vol-1
drwxr-x--- 2 root root 0 Mar 5 20:46 data-vol-2
drwx--S--- 3 jovyan users 1058 Mar 5 20:50 .local

tf-docker ~ > touch data-vol-1/test-file

tf-docker ~ > touch data-vol-2/test-file
touch: cannot touch 'data-vol-2/test-file': Permission denied

This is on a fresh Charmed Kubernetes 1.17 deployment with the following charm revisions:

ceph-fs rev 30
ceph-mon rev 45
kubernetes-master rev 808

Kubeflow has been deployed manually on top of Charmed Kubernetes. I have CephFS in the bundle and the relations in place to get a CephFS storage class in k8s. I have defined cephfs as my default storage class.
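
For reference, marking an existing storage class as the cluster default is a single annotation (a sketch, assuming the class is named cephfs as above):

$ kubectl patch storageclass cephfs -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'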

The only difference between the two volumes is that one is RWO and the other is RWM. They come up with different permissions inside the pod, and I don't understand what is responsible for that behavior.

Note that these TensorFlow images from gcr.io don't run as root; they run as the jovyan user.

The TensorFlow image used is gcr.io/kubeflow-images-public/tensorflow-2.0.0a0-notebook-gpu:v0.7.0

I also tried to set a security context via a pod manifest (https://github.com/kubeflow/kubeflow/blob/master/components/admission-webhook/README.md), but it didn't change the mount permissions of the volumes.
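
For illustration, the kind of pod-level security context that approach sets looks roughly like this (a minimal sketch, not necessarily what the webhook injects; runAsUser 1000 is an assumed UID for jovyan, and fsGroup 100 matches the "users" group seen in the listing above):

spec:
  securityContext:
    runAsUser: 1000   # jovyan (assumed UID)
    fsGroup: 100      # "users" group; Kubernetes applies this group to mounted volumes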

Revision history for this message
David Coronel (davecore) wrote :

subscribed ~field-high

Revision history for this message
David Coronel (davecore) wrote :

Here is the YAML output of each PV and PVC. I don't see any difference other than that one is RWO and the other is RWM:

$ kubectl get pvc -n david-coronel | grep march
march9-vol-1-rwo Bound pvc-04a32697-883a-4fd3-976d-76b7997b9b77 10Gi RWO cephfs 48s
march9-vol-2-rwm Bound pvc-4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d 10Gi RWX cephfs 48s
workspace-march9 Bound pvc-03555334-9618-4138-ba87-bb59660c3601 10Gi RWO cephfs 49s

$ kubectl get pv -n david-coronel | grep march
pvc-03555334-9618-4138-ba87-bb59660c3601 10Gi RWO Delete Bound david-coronel/workspace-march9 cephfs 61s
pvc-04a32697-883a-4fd3-976d-76b7997b9b77 10Gi RWO Delete Bound david-coronel/march9-vol-1-rwo cephfs 61s
pvc-4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d 10Gi RWX Delete Bound david-coronel/march9-vol-2-rwm cephfs 61s

$ kubectl get pvc -n david-coronel march9-vol-1-rwo -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: cephfs.csi.ceph.com
  creationTimestamp: "2020-03-09T14:26:35Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: march9-vol-1-rwo
  namespace: david-coronel
  resourceVersion: "1597491"
  selfLink: /api/v1/namespaces/david-coronel/persistentvolumeclaims/march9-vol-1-rwo
  uid: 04a32697-883a-4fd3-976d-76b7997b9b77
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: cephfs
  volumeMode: Filesystem
  volumeName: pvc-04a32697-883a-4fd3-976d-76b7997b9b77
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  phase: Bound

$ kubectl get pvc -n david-coronel march9-vol-2-rwm -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: cephfs.csi.ceph.com
  creationTimestamp: "2020-03-09T14:26:35Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: march9-vol-2-rwm
  namespace: david-coronel
  resourceVersion: "1597497"
  selfLink: /api/v1/namespaces/david-coronel/persistentvolumeclaims/march9-vol-2-rwm
  uid: 4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: cephfs
  volumeMode: Filesystem
  volumeName: pvc-4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  phase: Bound

$ kubectl get pv -n david-coronel pvc-04a32697-883a-4fd3-976d-76b7997b9b77 -o yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: cephfs.csi.ceph.com
  creationTimestamp: ...


Revision history for this message
David Coronel (davecore) wrote :

Here is an ls of the directory on the Kubernetes worker where the containerd containers actually run. You can see that the RWM mount is "drwxr-x---" and owned by root:root, whereas the RWO mount is "drwxrws---" and owned by root:users:

root@<worker node>:/var/lib/kubelet/pods/26a9857d-bd4a-4219-9667-b3b1932db700/volumes/kubernetes.io~csi# ls -lR
.:
total 12
drwxr-x--- 3 root root 4096 Mar 9 14:26 pvc-03555334-9618-4138-ba87-bb59660c3601
drwxr-x--- 3 root root 4096 Mar 9 14:26 pvc-04a32697-883a-4fd3-976d-76b7997b9b77
drwxr-x--- 3 root root 4096 Mar 9 14:26 pvc-4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d

./pvc-03555334-9618-4138-ba87-bb59660c3601:
total 5
drwxrws--- 6 root users 893 Mar 9 14:26 mount
-rw-r--r-- 1 root root 305 Mar 9 14:26 vol_data.json

./pvc-03555334-9618-4138-ba87-bb59660c3601/mount:
total 2
drwxr-sr-x 2 root users 0 Mar 9 14:26 data-vol-1
drwxr-sr-x 2 root users 0 Mar 9 14:26 data-vol-2
drwxr-sr-x 2 ubuntu users 0 Mar 9 14:26 'Untitled Folder'

./pvc-03555334-9618-4138-ba87-bb59660c3601/mount/data-vol-1:
total 0

./pvc-03555334-9618-4138-ba87-bb59660c3601/mount/data-vol-2:
total 0

'./pvc-03555334-9618-4138-ba87-bb59660c3601/mount/Untitled Folder':
total 0

./pvc-04a32697-883a-4fd3-976d-76b7997b9b77:
total 5
drwxrws--- 2 root users 0 Mar 9 14:26 mount
-rw-r--r-- 1 root root 305 Mar 9 14:26 vol_data.json

./pvc-04a32697-883a-4fd3-976d-76b7997b9b77/mount:
total 0

./pvc-4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d:
total 5
drwxr-x--- 2 root root 0 Mar 9 14:26 mount
-rw-r--r-- 1 root root 305 Mar 9 14:26 vol_data.json

./pvc-4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d/mount:
total 0

Revision history for this message
David Coronel (davecore) wrote :

One workaround is to manually change, on the worker node, the ownership and permissions of pvc-4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d/mount to root:users and "drwxrws---".

This allows the Kubeflow pod to write to the volume. Mounting this RWM volume in another Kubeflow pod also works, and I can see the files created from one notebook in the other notebook.
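
Concretely, that amounts to running something like the following as root on the worker node (the pod UID and PVC name are taken from the listing above and will differ on other deployments):

cd /var/lib/kubelet/pods/26a9857d-bd4a-4219-9667-b3b1932db700/volumes/kubernetes.io~csi
chown root:users pvc-4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d/mount
chmod 2770 pvc-4cc7d51b-f1a2-4d2e-bcfb-791bb3b1622d/mount   # 2770 == setgid + rwxrwx--- (drwxrws---)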

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

charm-ceph-fs has no involvement in creating or mounting the shares that are used by Kubernetes. As I understand it, the cephfs-csi plugin is responsible for mounting the shares on the hosts.

Changed in charm-ceph-fs:
status: New → Incomplete
Revision history for this message
David Coronel (davecore) wrote :

Thanks for the reply, Chris. Maybe this bug needs to be filed under another Launchpad project then? Is containerd the component responsible for creating the volume, assigning its permissions/ownership, and mounting it?

Revision history for this message
Cory Johns (johnsca) wrote :

It would be the ceph-csi[1] plugin itself, or possibly Kubernetes internals. I haven't been able to find any settings, other than the securityContext section for the PVC (which isn't included in your output above), that would give any control over the mount permissions.

[1]: https://github.com/ceph/ceph-csi

Revision history for this message
George Kraft (cynerva) wrote :

Added cdk-addons and kubernetes-master as components that need investigation in relation to this bug. I agree with Cory that the issue likely lies within ceph-csi, which is deployed and configured by cdk-addons.

Revision history for this message
David Coronel (davecore) wrote :

Here is the output of kubectl get pod in yaml format for my pod that contains these volumes:

https://pastebin.canonical.com/p/sMycPfNMmB/

I think these lines apply to my workload container:

  securityContext:
    fsGroup: 100

Could this issue be related? https://github.com/kubernetes/examples/issues/260

Revision history for this message
Cory Johns (johnsca) wrote :

So, it turns out that there was a core Kubernetes issue[1] which led to fsGroup being changed to explicitly apply only to RWO volumes[2]; that change is what causes the permission issue being seen here. There is a ceph-csi upstream workaround[3] for this, which is part of the 2.0 plugin version that will ship with the CK 1.18 release. In the meantime, the workaround you mentioned can be incorporated into the pod spec via an initContainer (a sketch follows below), as mentioned[4] on the kubernetes/examples#260 issue you linked.

[1]: https://github.com/kubernetes/kubernetes/issues/66323
[2]: https://github.com/kubernetes/kubernetes/blob/06ad960bfd03b39c8310aaf92d1e7c12ce618213/pkg/volume/csi/csi_mounter.go#L391-L394
[3]: https://github.com/ceph/ceph-csi/pull/423
[4]: https://github.com/kubernetes/examples/issues/260#issuecomment-534160265
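
For reference, that initContainer approach looks roughly like this in the pod spec (a minimal sketch; the busybox image, the /data mount path, and the volume name are placeholders, and GID 100 is the "users" group used by fsGroup above):

  initContainers:
  - name: fix-rwx-permissions
    image: busybox                 # placeholder; any image with chown/chmod will do
    command: ["sh", "-c", "chown root:100 /data && chmod 2770 /data"]
    volumeMounts:
    - name: data-vol-2             # the same RWX PVC-backed volume used by the notebook container
      mountPath: /data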

Changed in charm-ceph-fs:
status: Incomplete → Invalid
David Coronel (davecore)
Changed in cdk-addons:
status: New → Confirmed
Cory Johns (johnsca)
Changed in cdk-addons:
importance: Undecided → Medium
Changed in charm-kubernetes-master:
importance: Undecided → Medium
status: New → Triaged
Changed in cdk-addons:
status: Confirmed → Triaged
Changed in charm-kubernetes-master:
milestone: none → 1.18
status: Triaged → Fix Committed
assignee: nobody → Kevin W Monroe (kwmonroe)
Revision history for this message
Cory Johns (johnsca) wrote :

It seems that there's an issue with the v2.0 release of ceph-csi and the current release of OpenStack (Train), so we ended up having to revert to the v1.1 release of ceph-csi for the 1.18 release [1]. The plan is to reapply the upgrade once the next release of OpenStack (Ussuri) is available [2].

In the meantime, we are adding information about the initContainers workaround to the docs [3].

[1]: https://bugs.launchpad.net/cdk-addons/+bug/1868150
[2]: https://bugs.launchpad.net/cdk-addons/+bug/1867940
[3]: https://github.com/charmed-kubernetes/kubernetes-docs/pull/374

Changed in charm-kubernetes-master:
status: Fix Committed → Fix Released
Cory Johns (johnsca)
Changed in cdk-addons:
milestone: none → 1.19
assignee: nobody → Kevin W Monroe (kwmonroe)
George Kraft (cynerva)
Changed in cdk-addons:
importance: Medium → High
Revision history for this message
Cory Johns (johnsca) wrote :
Changed in cdk-addons:
assignee: Kevin W Monroe (kwmonroe) → Joseph Borg (joeborg)
status: Triaged → In Progress
Cory Johns (johnsca)
Changed in cdk-addons:
status: In Progress → Fix Committed
Changed in cdk-addons:
status: Fix Committed → Fix Released