Comment 6 for bug 1858839

George Kraft (cynerva) wrote :

I'm able to reproduce this easily enough. It looks like a bug in Ceph CSI v1.0.2's getVolumeName function[1], which takes the target_path from kubelet's NodePublishVolume request:

{ ... "target_path": "/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/publish/pvc-c5a78d65-52a7-4cd2-92c9-d2a250db5089/0a023bbf-4302-4385-8cad-ff7446692518" ... }

and takes the last directory to be the volume name:

0a023bbf-4302-4385-8cad-ff7446692518

In fact, the actual volume name is the second-to-last directory:

pvc-c5a78d65-52a7-4cd2-92c9-d2a250db5089

The volume name and RBD image name are the same, so csi-rbdplugin ends up looking for the RBD image with the wrong name, and fails to find it. The bug only occurs with block volumes because the target_path for block volumes is slightly different and has its own code path in getVolumeName.
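To illustrate the difference, here is a rough sketch in Go. It is not the actual Ceph CSI code: the helper names lastComponent and parentComponent are made up for this example, and it only assumes the block-volume target_path layout shown above.

```go
package main

import (
	"fmt"
	"path"
)

// lastComponent mimics the behavior described above: it treats the final
// path element as the volume name. Hypothetical helper, not ceph-csi code.
func lastComponent(targetPath string) string {
	return path.Base(targetPath)
}

// parentComponent returns the second-to-last path element, which is where
// the volume name sits in the block-volume target_path from this report.
// Hypothetical helper, not ceph-csi code.
func parentComponent(targetPath string) string {
	return path.Base(path.Dir(targetPath))
}

func main() {
	targetPath := "/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/publish/" +
		"pvc-c5a78d65-52a7-4cd2-92c9-d2a250db5089/0a023bbf-4302-4385-8cad-ff7446692518"

	// Prints 0a023bbf-4302-4385-8cad-ff7446692518 (the wrong name).
	fmt.Println(lastComponent(targetPath))

	// Prints pvc-c5a78d65-52a7-4cd2-92c9-d2a250db5089 (the volume name).
	fmt.Println(parentComponent(targetPath))
}
```

This is only meant to show why the lookup ends up with the wrong RBD image name. Newer Ceph CSI versions may well derive the name some other way[2] rather than by parsing the path.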

Later versions of Ceph CSI (v1.1.0+) appear to use a different approach for determining the volume name[2]. So, I suspect we can fix this by updating to a more recent version of Ceph CSI.

[1]: https://github.com/ceph/ceph-csi/blob/a4dd8457350b4c4586743d78cbd5776437e618b6/pkg/rbd/nodeserver.go#L107-L109
[2]: https://github.com/ceph/ceph-csi/blob/c7ba26d23d784f7e21d850c6d014897f25a9868d/pkg/rbd/nodeserver.go#L132-L142