Comment 3 for bug 1949913

Paul Goins (vultaire) wrote:

Hi George,

It can lead to permanent data loss. And indeed, while the immediate bug is in layer-vault-kv in my opinion, we have perhaps a second issue related to this in kubernetes-master. I'll try to lay this out simply below.

1. Two kubernetes-master apps are deployed in different models and connected to the same vault via a cross-model relation (CMR). Each app sees itself as "kubernetes-master", and thus the _get_secret_backend() function mentioned in my last comment returns "charm-kubernetes-master" for both.

2. Each app stores its encryption_secret at charm-kubernetes-master/kv/app. (Source from current main branch tip: https://github.com/charmed-kubernetes/charm-kubernetes-control-plane/blob/54e02bb/reactive/kubernetes_control_plane.py#L3307) The last one to call the generate_encryption_key handler will clobber whatever was previously stored in vault at that path.

3. Running the create_secure_storage() function, either directly or because the kubernetes-control-plane.secure-storage.created flag was cleared (perhaps during attempted debugging of issues, or as a follow-up to the layer.vaultlocker.ready flag being cleared), will cause the /var/snap/kube-apiserver/common/encryption/encryption_config.yaml file to be rewritten with whatever value is currently in vault. This appears to be the moment at which the previous key is irrevocably lost, as it may have persisted in this config file despite already having been overwritten in vault.

4. After the config is rewritten based upon current vault values, you find yourself in a situation where you cannot read any existing secrets. "kubectl get secrets" will fail with cryptic errors like "Internal error occurred: unable to transform key "/registry/secrets/kube-system/attachdetach-controller-token-4bcd3": invalid padding on input" since it cannot decode these secrets. However, creating and reading new secrets via kubectl works fine, and pulling those secrets via etcdctl confirms that they are indeed being encrypted. In other words: most likely, the config file's key has changed and is no longer able to decrypt any secrets created prior to the accidental change.

5. As a slight tangent, but related to the above: the kubernetes-control-plane charm doesn't appear to handle key rotation. Upon key change, secrets are not re-encrypted with the new key; the old key is simply unceremoniously dropped and existing secrets become inaccessible. This is a use case which should also be considered, at least to limit the damage of a collision or other unexpected changes to the data in vault.
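
To make the sequence above concrete, here is a minimal Python simulation of steps 1 to 4, plus one possible mitigation for point 5. It is only a sketch and not the charm's code: the `vault` dict, `secret_backend()`, `render_encryption_keys()` and the `previous_keys` parameter are hypothetical stand-ins, and `generate_encryption_key()` here only imitates the behaviour of the handler of the same name. The mitigation relies on the kube-apiserver behaviour that the first key listed in the encryption config is used for writes while every listed key is tried for reads.

```python
import base64
import os

# Hypothetical stand-in for the shared vault KV store: path -> value.
vault = {}


def secret_backend(app_name):
    # Assumption: the backend name is derived from the application name only,
    # so two models deploying the same app name get the same backend.
    return f"charm-{app_name}"


def generate_encryption_key(app_name):
    """Store a fresh key at <backend>/kv/app, overwriting any existing value."""
    key = base64.b64encode(os.urandom(32)).decode()
    vault[f"{secret_backend(app_name)}/kv/app"] = key
    return key


def render_encryption_keys(app_name, previous_keys=()):
    """Return the key list for the encryption config.

    The current vault value comes first (used for new writes). Any keys in
    previous_keys are kept afterwards so older secrets stay readable. Passing
    previous_keys is the hypothetical mitigation, not current charm behaviour.
    """
    current = vault[f"{secret_backend(app_name)}/kv/app"]
    return [current] + [k for k in previous_keys if k != current]


# Steps 1 and 2: two models, same app name, same vault path.
key_a = generate_encryption_key("kubernetes-master")  # model A
config_a = render_encryption_keys("kubernetes-master")
key_b = generate_encryption_key("kubernetes-master")  # model B clobbers A's key

# Step 3: model A re-renders its config from vault and loses key_a.
rerendered_a = render_encryption_keys("kubernetes-master")

# Point 5 mitigation: keep the old keys from the existing config as fallbacks.
rotated_a = render_encryption_keys("kubernetes-master", previous_keys=config_a)
```

In this sketch `rerendered_a` contains only model B's key, which matches the "invalid padding on input" symptom for older secrets, whereas `rotated_a` still lists model A's original key after the new one. Retaining old keys would limit the damage of a collision but would not remove the underlying path collision in vault.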

Hope this helps - please reach out if you wish for more details or clarification.