Comment 0 for bug 1940549

Steven Parker (sbparke) wrote :

While attempting to refresh certificates for a k8s installation, no units other than the client leaders were updated.

Steps to replicate:
  Deploy the k8s stack and vault with a replication count of 3 (HA).
  Delete the vault unit that is the leader and add another.
  Execute the refresh certificates action.
  Check whether the k8s client.crt was actually updated or failed to update:
       juju ssh kubernetes-worker/0 sudo openssl x509 -in /root/cdk/client.crt -text
  Repeat a few times.
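
The check in the steps above can be repeated across several units with a small helper. This is only a sketch: the function name check_client_certs is made up for illustration, and it assumes the certificate path /root/cdk/client.crt shown above is the same on every unit. It prints the serial number and validity dates rather than the full -text dump, so that a change is easier to spot between runs.

```shell
# Hypothetical helper: print the serial number and validity dates of
# client.crt on each unit passed as an argument.
check_client_certs() {
  for unit in "$@"; do
    echo "== $unit =="
    juju ssh "$unit" sudo openssl x509 -in /root/cdk/client.crt -noout -serial -dates
  done
}
```

For example:
       check_client_certs kubernetes-worker/0 kubernetes-worker/1
If the serial and notBefore values are unchanged after the action, the unit most likely did not pick up the new certificate.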

  The issue is that several vault units each share their own relation data with the other applications, instead of there being one source of truth (the leader).
Only one vault unit is the leader, and it provides the correct data when we re-issue certificates.
However, older vault units that were leader at some point still retain stale certificate data, which is shared with all the clients.
That stale data conflicts with the newly provided certificates: the clients think nothing has changed (the stale data has the original certs)
and thus do not write the new client certificates to disk.
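
One way to see which vault units still publish certificate data is to dump each unit's own side of the relation. The following is a rough sketch, assuming Juju 2.x (juju run -u) and that the relation id is known (it is deployment-specific); the function name dump_vault_relation_data is invented for illustration.

```shell
# Hypothetical helper: show the relation data each given vault unit has
# published on the given relation id (for example certificates:61).
dump_vault_relation_data() {
  rel="$1"; shift
  for unit in "$@"; do
    echo "== $unit =="
    # "-" asks relation-get for all keys set by the named unit.
    juju run -u "$unit" "relation-get -r $rel - $unit"
  done
}
```

For example:
       dump_vault_relation_data certificates:61 vault/0 vault/1 vault/2
A non-leader unit that still shows server.cert or processed_client_requests keys is a candidate for holding stale data.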

We cleared the data from the non-leaders to resolve the issue.
For example, here is the command for vault/0, which is a non-leader (vault/1 is the current leader):
  juju run -u vault/0 "relation-set -r certificates:61
       kubernetes-master_0.server.key='' kubernetes-master_0.server.cert='' kubernetes-master_0.processed_client_requests=''
       kubernetes-master_1.server.key='' kubernetes-master_1.server.cert='' kubernetes-master_1.processed_client_requests=''
       kubernetes-master_2.server.key='' kubernetes-master_2.server.cert='' kubernetes-master_2.processed_client_requests=''
       kubernetes-master_5.server.key='' kubernetes-master_5.server.cert='' kubernetes-master_5.processed_client_requests=''
    "

Once the stale data was cleared, the clients saw the new certificates and updated correctly.