Add support for CSI snapshots

Bug #1926494 reported by Paul Goins
This bug affects 2 people
Affects: CDK Addons
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

This is likely related to https://bugs.launchpad.net/cdk-addons/+bug/1896765. This bug covers snapshots in CDK in general and is not specific to CephFS.

It appears that CDK does not deploy the components needed to support the beta and GA APIs for volume snapshots.
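One way to confirm the gap is to look for the snapshot API group in the output of `kubectl api-versions`. The following is a minimal sketch, assuming the beta and GA APIs are served as `snapshot.storage.k8s.io/v1beta1` and `snapshot.storage.k8s.io/v1`; the helper name is hypothetical and the input lines are illustrative, not captured from a real cluster.

```python
# Sketch: report which volume snapshot API versions a cluster serves,
# given the lines printed by `kubectl api-versions`.
SNAPSHOT_GROUP = "snapshot.storage.k8s.io"


def served_snapshot_versions(api_versions):
    """Return the sorted snapshot API versions found in the given lines."""
    found = set()
    for line in api_versions:
        # Each line looks like "group/version"; the core API is just "v1".
        group, _, version = line.strip().partition("/")
        if group == SNAPSHOT_GROUP and version:
            found.add(version)
    return sorted(found)


if __name__ == "__main__":
    # Illustrative input only: a cluster without the snapshot CRDs.
    sample = ["apps/v1", "storage.k8s.io/v1", "v1"]
    versions = served_snapshot_versions(sample)
    print(versions or "no volume snapshot API served")
```

An empty result would be consistent with the snapshot CRDs not having been installed.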

This can be worked around manually by installing the upstream CRDs, roles, and deployments for the external snapshotter (e.g. https://github.com/kubernetes/kubernetes/tree/v1.20.6/cluster/addons/volumesnapshots for K8s 1.20.6). However, CDK does not do this by default.
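As a rough sketch of the manual workaround, the helper below builds a `kubectl apply` command that applies every manifest under a local copy of the upstream `volumesnapshots` addon directory. The function name and the checkout path are assumptions for illustration; `--recursive` and `-f` are standard `kubectl apply` flags.

```python
import shlex


def kubectl_apply_argv(manifest_dir, kubectl="kubectl"):
    """Build the argv for applying all manifests under manifest_dir."""
    # --recursive descends into subdirectories of the addon directory.
    return [kubectl, "apply", "--recursive", "-f", manifest_dir]


if __name__ == "__main__":
    # Hypothetical path to a local checkout of kubernetes at tag v1.20.6.
    argv = kubectl_apply_argv("kubernetes/cluster/addons/volumesnapshots")
    print(" ".join(shlex.quote(arg) for arg in argv))
    # To actually run it: subprocess.run(argv, check=True)
```

The command is only printed here; running it against a cluster is left to the operator.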

Additionally, the upstream docs describe a "snapshot validation webhook" that should be deployed as well. There does not seem to be a "stock" webhook available for use, only an example that requires building a custom Docker image. While not strictly necessary, the webhook seems to be strongly recommended, and the manual workaround above makes no provision for it. This seems like something that CDK should provide as well.

References:

* https://kubernetes.io/blog/2020/12/10/kubernetes-1.20-volume-snapshot-moves-to-ga/
* https://github.com/kubernetes-csi/external-snapshotter

Tags: sts
George Kraft (cynerva)
Changed in cdk-addons:
importance: Undecided → Wishlist
status: New → Triaged
Paul Goins (vultaire) wrote:

A correction to my previous explanation regarding the webhook: a stock webhook does exist. The issue is that there is more than one way to install it, and the docs provide an example rather than prescribing a specific method as the "proper" way to do it.

There also appear to be prebuilt images for the webhook (e.g. k8s.gcr.io/sig-storage/snapshot-validation-webhook:v4.0.0).

I don't know why the webhook isn't in the main K8s repo even though the core functionality has been brought into the central repo; I wish I understood that...

Paul Goins (vultaire) wrote:

Putting the webhook concern aside for now, I've opened two initial pull requests for this. I don't expect them to be accepted as-is, but I hope for feedback:

* https://github.com/charmed-kubernetes/cdk-addons/pull/206
* https://github.com/charmed-kubernetes/charm-kubernetes-master/pull/162

I was able to manually verify that the needed components get deployed successfully in my test environment after redirecting rocks.canonical.com:443/cdk/sig-storage/snapshot-controller:v3.0.2 to k8s.gcr.io/sig-storage/snapshot-controller:v3.0.2.
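The image redirect amounts to swapping the registry prefix on the image reference. A minimal sketch of that mapping follows; the function name is hypothetical, and how the redirect is actually applied in a cluster (registry mirror, manifest edit, or otherwise) is not specified in this report.

```python
# Prefixes taken from the image references quoted in the comment above.
ROCKS_PREFIX = "rocks.canonical.com:443/cdk/"
UPSTREAM_PREFIX = "k8s.gcr.io/"


def to_upstream_image(image):
    """Map a rocks.canonical.com CDK image reference to its k8s.gcr.io equivalent."""
    if image.startswith(ROCKS_PREFIX):
        return UPSTREAM_PREFIX + image[len(ROCKS_PREFIX):]
    # Leave references from other registries untouched.
    return image


if __name__ == "__main__":
    print(to_upstream_image(
        "rocks.canonical.com:443/cdk/sig-storage/snapshot-controller:v3.0.2"
    ))
```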

I have not yet verified that volume snapshots actually work with these patches; I'm now trying to set up a Ceph RBD backend to verify.
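For the verification step, a test snapshot could be requested with a VolumeSnapshot object. The sketch below builds such a manifest as JSON, which `kubectl apply -f -` accepts. All object names are hypothetical, and the API version is left as a parameter because the version served depends on which snapshot CRDs are installed.

```python
import json


def volume_snapshot_manifest(name, pvc_name, snapshot_class,
                             namespace="default",
                             api_version="snapshot.storage.k8s.io/v1"):
    """Build a VolumeSnapshot manifest that snapshots an existing PVC."""
    return {
        # Assumption: use .../v1beta1 instead if only the beta CRDs are installed.
        "apiVersion": api_version,
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


if __name__ == "__main__":
    # Hypothetical names; a matching PVC and VolumeSnapshotClass must already exist.
    manifest = volume_snapshot_manifest("test-snap", "test-pvc", "csi-rbd-snapclass")
    print(json.dumps(manifest, indent=2))
```

If the snapshot works, the object's `status.readyToUse` field should eventually report true.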

Paul Goins (vultaire) wrote:

I unfortunately have been having problems setting up a proper testing environment. I've got K8s set up with calico and a small test ceph cluster, but traffic isn't getting NATted out of the pods to the ceph-mons, despite the calico IP pool being set with natOutgoing=true. I'll need to step back from this for now because of how much time this setup has consumed.
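One thing that could be checked here, assuming Calico's documented behavior that natOutgoing only masquerades traffic to destinations outside every Calico IP pool, is whether the ceph-mon addresses fall inside a pool CIDR. The sketch below encodes that rule; the function name and addresses are hypothetical, and it is not a diagnosis of the problem described above.

```python
import ipaddress


def would_masquerade(dest_ip, source_pool_nat_outgoing, all_pool_cidrs):
    """Return True if pod traffic to dest_ip is expected to be source-NATted.

    Assumed rule: natOutgoing applies only when it is enabled on the source
    pool and the destination lies outside every Calico IP pool.
    """
    if not source_pool_nat_outgoing:
        return False
    dest = ipaddress.ip_address(dest_ip)
    return not any(dest in ipaddress.ip_network(cidr) for cidr in all_pool_cidrs)


if __name__ == "__main__":
    pools = ["192.168.0.0/16"]  # hypothetical Calico pool CIDR
    for mon_ip in ("10.0.0.5", "192.168.10.20"):  # hypothetical ceph-mon addresses
        print(mon_ip, would_masquerade(mon_ip, True, pools))
```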

I may need to look around for a good test K8s bundle including ceph-osd which I could spin up to test this on...

tags: added: sts