Comment 1 for bug 1940549

Cory Johns (johnsca) wrote :

As mentioned, this arises because the interface protocol in question was created before application-level relation data was available, so the leader has no choice but to write the response data in its unit data bucket, potentially leading to conflicting data being presented on the relation. The requesting side has no way to know which unit is the leader and thus which data is authoritative, but it could perhaps parse the cert data and pick the best one based on the effective and expiration dates. However, there are many more clients than providers for this relation and this issue impacts all of them, not just Kubernetes.
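To illustrate the client-side workaround of picking a cert by its dates, here is a minimal sketch. It assumes the client has already parsed each candidate cert into its effective and expiration dates (the parsing itself is not shown), and the names `CertCandidate` and `pick_best_cert` are hypothetical, not part of any existing interface library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class CertCandidate:
    """Validity window of a cert found in one provider unit's relation data."""
    unit: str             # hypothetical provider unit name, e.g. "easyrsa/0"
    not_before: datetime  # effective date
    not_after: datetime   # expiration date


def pick_best_cert(candidates: Iterable[CertCandidate],
                   now: datetime) -> Optional[CertCandidate]:
    """Return the currently valid candidate that became effective most recently."""
    valid = [c for c in candidates if c.not_before <= now <= c.not_after]
    if not valid:
        return None
    # Prefer the newest effective date; break ties on the later expiration.
    return max(valid, key=lambda c: (c.not_before, c.not_after))


# Example: a stale cert from a former leader and a fresh one from the current leader.
utc = timezone.utc
stale = CertCandidate("easyrsa/0", datetime(2020, 1, 1, tzinfo=utc), datetime(2030, 1, 1, tzinfo=utc))
fresh = CertCandidate("easyrsa/1", datetime(2021, 8, 1, tzinfo=utc), datetime(2031, 8, 1, tzinfo=utc))
best = pick_best_cert([stale, fresh], now=datetime(2021, 9, 1, tzinfo=utc))
```

Note that this is only a heuristic: the newest cert is not guaranteed to be the one written by the current leader, which is why the provider-side fixes are preferable.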

Possible solutions:

1) Migrate the interface protocol to app-level relation data. This would be the best solution for a new interface, but migrating to it now would require updating every charm that uses this interface on either side of the relation. It might be possible to do this incrementally by writing the data in both buckets, applying one of the other fixes, and then gradually updating the client charms to prefer the app-level data.

2) Make provider units clear their relation data whenever they see that they are not the leader. This requires no updates to the clients, and possibly no communication between the leader and non-leaders of the provider. However, there is a chance that the non-leaders wipe out their relation data before the leader has written the new data, so the clearing may need to be coordinated using a leader data field.

3) Make all provider units write the latest data as soon as it's available. I think this should be possible for Vault if the non-leader units can read the secrets out, but they'll need some trigger to know when the leader has generated the initial or updated data. For EasyRSA, the cert data will need to be copied to leader data if it isn't already. This is a bit more complex than option 2 but ensures that the correct data is always available on the relation.
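As a rough illustration of how option 2 could guard against non-leaders clearing too early, the sketch below models unit relation data and leader data as plain dicts rather than using any real charm framework API. The field name `cert-data-published-by` and the function `reconcile_provider_unit` are assumptions made up for this example.

```python
from typing import Dict, Optional

# Hypothetical leader data field naming the unit that last published cert data.
PUBLISHED_KEY = "cert-data-published-by"


def reconcile_provider_unit(unit_name: str,
                            is_leader: bool,
                            unit_data: Dict[str, str],
                            leader_data: Dict[str, str],
                            new_data: Optional[Dict[str, str]] = None) -> str:
    """Update this unit's relation data bucket in place and report what was done."""
    if is_leader:
        if new_data is None:
            return "unchanged"
        unit_data.clear()
        unit_data.update(new_data)
        # Set the flag only after the relation data is in place.
        leader_data[PUBLISHED_KEY] = unit_name
        return "published"
    publisher = leader_data.get(PUBLISHED_KEY)
    if publisher is None or publisher == unit_name:
        # The current leader has not published yet, so keep the old data
        # rather than leave the relation with nothing on it.
        return "kept"
    unit_data.clear()
    return "cleared"


# Example: vault/0 publishes as leader, then leadership moves to vault/1.
leader_data: Dict[str, str] = {}
old_bucket: Dict[str, str] = {}
new_bucket: Dict[str, str] = {}
step1 = reconcile_provider_unit("vault/0", True, old_bucket, leader_data, {"cert": "old"})
step2 = reconcile_provider_unit("vault/0", False, old_bucket, leader_data)
step3 = reconcile_provider_unit("vault/1", True, new_bucket, leader_data, {"cert": "new"})
step4 = reconcile_provider_unit("vault/0", False, old_bucket, leader_data)
```

In this walkthrough the former leader keeps its data at the second step, because the new leader has not published yet, and only clears it at the fourth. For option 3, the same leader data field could instead serve as the trigger telling non-leaders when to write the latest data themselves.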