Comment 4 for bug 1949913

George Kraft (cynerva) wrote :

Thanks for the details. Given the delayed but inevitable loss of Kubernetes secret data, I agree with your assessment that this is a critical issue. Worse, the cluster can appear healthy at first, with the damage only surfacing much later, when the encryption config file gets rewritten.

I believe we should be able to fix the collision by adding the model UUID to the secret backend name. On upgrade, we'll have to be careful to ensure that existing data is migrated properly.
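
To illustrate what I mean, here's a rough sketch of a per-model backend name. The helper and naming scheme are hypothetical, not the charm's current code:

    def secret_backend_name(app_name, model_uuid):
        # Including the model UUID keeps identically named apps in
        # two different models from colliding on the same backend.
        # (Hypothetical naming scheme, for illustration only.)
        return "charm-{}-{}-kv".format(model_uuid, app_name)

    # Same app name, different models, distinct backends:
    a = secret_backend_name("kubernetes-control-plane",
                            "11111111-2222-3333-4444-555555555555")
    b = secret_backend_name("kubernetes-control-plane",
                            "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee")
    assert a != b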

When migrating the encryption key, instead of grabbing it from Vault, perhaps we should read it from the local encryption config file if it exists. This might heal existing clusters that have been damaged but are not showing symptoms yet.
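
Something like the following, as a sketch. The config path and the Vault lookup callable are assumptions on my part; the file layout is the upstream EncryptionConfiguration format:

    import os
    import yaml  # PyYAML

    # Path is an assumption; use wherever the charm writes the config.
    ENCRYPTION_CONFIG = "/path/to/encryption_config.yaml"

    def local_encryption_key():
        # Return the first aescbc key from the local encryption
        # config, or None if the file or key is absent.
        if not os.path.exists(ENCRYPTION_CONFIG):
            return None
        with open(ENCRYPTION_CONFIG) as f:
            config = yaml.safe_load(f)
        for resource in config.get("resources", []):
            for provider in resource.get("providers", []):
                keys = provider.get("aescbc", {}).get("keys", [])
                if keys:
                    return keys[0]["secret"]
        return None

    def migrate_encryption_key(fetch_from_vault):
        # Prefer the key the apiserver is actually using on disk;
        # fall back to Vault only if no local config exists yet.
        return local_encryption_key() or fetch_from_vault()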

Regarding point 5, I agree that key rotation would be good to have and would have lessened the impact here. The upstream Kubernetes documentation describes how to do this[1]. It will be tricky to implement in the charm. To keep the scope of this issue small, I think we will have to consider key rotation separately.
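
For reference, the upstream procedure boils down to reordering the keys list in the encryption config, restarting kube-apiserver between steps, and rewriting all secrets before dropping the old key. A minimal sketch of just the config edits involved (restarts and file handling elided; names are illustrative):

    import base64
    import secrets

    def new_aescbc_key(name):
        # Generate a fresh 32-byte AES-CBC key entry.
        secret = base64.b64encode(secrets.token_bytes(32)).decode()
        return {"name": name, "secret": secret}

    def rotate(keys):
        # Step 1: add the new key as the *second* entry so every
        # apiserver can decrypt with it before any server encrypts
        # with it. Restart all kube-apiservers after this step.
        keys.append(new_aescbc_key("key2"))

        # Step 2: promote the new key to the first position so new
        # writes use it. Restart all kube-apiservers again, then
        # re-encrypt everything:
        #   kubectl get secrets -A -o json | kubectl replace -f -
        keys.insert(0, keys.pop())

        # Step 3: drop the old key once nothing is encrypted with it.
        keys.pop()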

On a slight tangent of my own: from a security standpoint, it seems unfortunate that units using the vault-kv relation have full read and write access to a database shared by other models. I don't think a kubernetes-control-plane app in one model should be able to read the secret data of a kubernetes-control-plane app in a different model. Does Vault have access controls that we're not taking advantage of in the charms? Or is this an indication that each cluster should have its own Vault instance? I'm not sure, but it seems worth looking into.
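
Vault does have policies that can scope a token to a path, so per-model isolation looks feasible in principle. A hedged sketch with the hvac client, where the mount, path layout, and policy name are all hypothetical:

    import hvac

    client = hvac.Client(url="https://vault.example:8200",
                         token="placeholder-token")

    MODEL_UUID = "11111111-2222-3333-4444-555555555555"  # example

    # Restrict a policy to one model's subtree so a unit in this
    # model cannot read another model's secrets.
    policy = f"""
    path "charm-kv/{MODEL_UUID}/*" {{
      capabilities = ["create", "read", "update", "delete", "list"]
    }}
    """
    client.sys.create_or_update_policy(name=f"model-{MODEL_UUID}",
                                       policy=policy)

    # Tokens handed to units in this model carry only this policy.
    token = client.auth.token.create(policies=[f"model-{MODEL_UUID}"])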

[1]: https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/#rotating-a-decryption-key