kubelet ID collisions when using multiple kubernetes-worker apps
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Kubernetes Control Plane Charm | New | Undecided | Kevin W Monroe | |
Bug Description
Hi,
I've been doing some exploratory work to see whether it's possible to deploy multiple kubernetes-worker apps and relate them to the same kubernetes-master. Unfortunately, it looks like this is not presently possible.
I believe that this line of code in kubernetes_
userid = "kubelet-
In plain English for those lacking immediate context: the request variable is a 2-item tuple of (unit_name, data_mapping) reflecting workers related to the kubernetes-master app. We're simply taking the unit_name (e.g. kubernetes-
userid is a field used in the known_tokens.csv file. The problem here is, if a second kubernetes-worker app is deployed, the kubernetes-master will hit this code when trying to handle setting up auth for the new worker - and if the unit is e.g. kubernetes-
The rest of the code is written so that it will be detected that the new worker doesn't have a token (since that's looked up via username, e.g. system:
Likewise, if the existing worker's record is processed afterwards, the code will see that its record no longer exists, and a new token will be created for it as well, clobbering the new worker's record.
The end result is: tokens may end up getting replaced for both conflicting units (not just the new one), and ultimately only one of the tokens will be retained, thus only one of the conflicting workers will be allowed to communicate with the kubernetes-master.
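To make the clobbering concrete, here is a minimal sketch of the behaviour described above. It is not the charm's code: it assumes the userid is built from only the numeric suffix of the unit name, models known_tokens.csv as a dict keyed by userid, and uses a hypothetical second app called kubernetes-worker-kvm. The userid_from_full_name helper is likewise just one hypothetical way to avoid the collision, not necessarily the fix that will ship.

```python
# Illustrative model only; function names and the second app name are hypothetical.

def userid_from_suffix(unit_name: str) -> str:
    # Assumption: only the part after "/" ends up in the userid.
    suffix = unit_name.split("/")[-1]
    return f"kubelet-{suffix}"


def userid_from_full_name(unit_name: str) -> str:
    # Hypothetical alternative: keep the app name so units of different apps differ.
    return "kubelet-" + unit_name.replace("/", "-")


def build_token_table(unit_names, make_userid):
    # Stand-in for known_tokens.csv: one record per userid, later writes win.
    table = {}
    for index, unit in enumerate(unit_names):
        table[make_userid(unit)] = {"unit": unit, "token": f"token-{index}"}
    return table


units = ["kubernetes-worker/0", "kubernetes-worker-kvm/0"]

colliding = build_token_table(units, userid_from_suffix)
distinct = build_token_table(units, userid_from_full_name)

print(len(colliding))  # 1: both units mapped to "kubelet-0", one record was lost
print(len(distinct))   # 2: each unit keeps its own record
```

In this model whichever unit is processed last keeps its record, which matches the observation that only one of the conflicting workers ends up able to talk to the kubernetes-master.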
Why a fix is desired: while it may be a niche use case, there may be cases where different classes of kubernetes-workers need to be deployed, e.g. a mix of bare-metal machines and KVMs, or a mix of machines with different types of network access. The current code more or less prevents this, since it creates a conflict that can only be avoided by ensuring the unit name suffixes never collide, e.g. that there is never a kubernetes-worker/0 and kubernetes-
Changed in charm-kubernetes-master:
assignee: nobody → Kevin W Monroe (kwmonroe)
@Paul, thanks for the report and triage! This was recently hit again and opened as 1906732 with a few more logs detailing the issue. I'm going to dupe this to that; we should have a fix out in the first bugfix release of CK 1.20.