kube-controller-manager fails to start when vsphere user includes '\'

Bug #1872811 reported by Camille Rodriguez
Affects: Kubernetes Control Plane Charm
Status: Fix Released
Importance: Medium
Assigned to: Tim Van Steenburgh

Bug Description

I deployed Charmed Kubernetes on VMware. Everything deploys, but the pods won't come up.

It might be because kube-controller-manager keeps restarting on the kubernetes-master. The leader also seems to keep switching between the two masters. I attached a juju crashdump here.

ubuntu@juju-bc2079-14:~$ journalctl -u snap.kube-controller-manager.daemon.service
-- Logs begin at Tue 2020-04-14 18:39:32 UTC, end at Tue 2020-04-14 19:08:42 UTC. --
Apr 14 18:42:41 juju-bc2079-14 systemd[1]: Started Service for snap application kube-controller-manager.daemon.
Apr 14 18:42:41 juju-bc2079-14 kube-controller-manager.daemon[14096]: cat: /var/snap/kube-controller-manager/1518/args: No such file or directory
Apr 14 18:42:43 juju-bc2079-14 kube-controller-manager.daemon[14096]: I0414 18:42:43.192359 14096 serving.go:312] Generated self-signed cert in-memory
Apr 14 18:42:43 juju-bc2079-14 kube-controller-manager.daemon[14096]: W0414 18:42:43.192451 14096 client_config.go:543] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
Apr 14 18:42:43 juju-bc2079-14 kube-controller-manager.daemon[14096]: W0414 18:42:43.192457 14096 client_config.go:548] error creating inClusterConfig, falling back to default config: unable to load in-cluster configuration, KUBERNETES
Apr 14 18:42:43 juju-bc2079-14 kube-controller-manager.daemon[14096]: invalid configuration: no configuration has been provided
Apr 14 18:42:43 juju-bc2079-14 systemd[1]: snap.kube-controller-manager.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 14 18:42:43 juju-bc2079-14 systemd[1]: snap.kube-controller-manager.daemon.service: Failed with result 'exit-code'.
Apr 14 18:42:43 juju-bc2079-14 systemd[1]: snap.kube-controller-manager.daemon.service: Service hold-off time over, scheduling restart.
Apr 14 18:42:43 juju-bc2079-14 systemd[1]: snap.kube-controller-manager.daemon.service: Scheduled restart job, restart counter is at 1.
Apr 14 18:42:43 juju-bc2079-14 systemd[1]: Stopped Service for snap application kube-controller-manager.daemon.
Apr 14 18:42:43 juju-bc2079-14 systemd[1]: Started Service for snap application kube-controller-manager.daemon.
Apr 14 18:42:44 juju-bc2079-14 kube-controller-manager.daemon[15230]: I0414 18:42:44.088992 15230 serving.go:312] Generated self-signed cert in-memory
Apr 14 18:42:44 juju-bc2079-14 kube-controller-manager.daemon[15230]: W0414 18:42:44.089177 15230 client_config.go:543] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
Apr 14 18:42:44 juju-bc2079-14 kube-controller-manager.daemon[15230]: W0414 18:42:44.089188 15230 client_config.go:548] error creating inClusterConfig, falling back to default config: unable to load in-cluster configuration, KUBERNETES
Apr 14 18:42:44 juju-bc2079-14 kube-controller-manager.daemon[15230]: invalid configuration: no configuration has been provided
Apr 14 18:42:44 juju-bc2079-14 systemd[1]: snap.kube-controller-manager.daemon.service: Main process exited, code=exited, status=1/FAILURE
Apr 14 18:42:44 juju-bc2079-14 systemd[1]: snap.kube-controller-manager.daemon.service: Failed with result 'exit-code'.
Apr 14 18:42:44 juju-bc2079-14 systemd[1]: snap.kube-controller-manager.daemon.service: Service hold-off time over, scheduling restart.
Apr 14 18:42:44 juju-bc2079-14 systemd[1]: snap.kube-controller-manager.daemon.service: Scheduled restart job, restart counter is at 2.
Apr 14 18:42:44 juju-bc2079-14 systemd[1]: Stopped Service for snap application kube-controller-manager.daemon.
Apr 14 18:42:44 juju-bc2079-14 systemd[1]: Started Service for snap application kube-controller-manager.daemon.
Apr 14 18:42:44 juju-bc2079-14 kube-controller-manager.daemon[15319]: I0414 18:42:44.656610 15319 serving.go:312] Generated self-signed cert in-memory

Revision history for this message
Camille Rodriguez (camille.rodriguez) wrote :

I redeployed without the vsphere-integrator and now the pods are coming up. So the issue can be narrowed down to the vsphere-integrator charm, but I do not know exactly why.

Revision history for this message
George Kraft (cynerva) wrote :

> I attached a juju crashdump here.

I'm not seeing it. Can you try attaching the crashdump again? We'll need the crashdump, or otherwise more details, to properly assess this.

Changed in charmed-kubernetes-bundles:
status: New → Incomplete
Revision history for this message
Camille Rodriguez (camille.rodriguez) wrote :

Launchpad kept failing and timing out when I tried to attach it. I will try again. However, I did isolate the issue.

When deploying with the vsphere-integrator, if the username contains a backslash (\), the connection to VMware fails silently, which is why no pods were coming up.

Without the vsphere-integrator, Juju and Kubernetes had no trouble spawning the pods with the same user (i.e. DOMAIN\my_vmware_user).

I believe the charm should handle this, or at least give a clear warning when the username contains a \. Changing the username to my_vmware_user@DOMAIN resolved my issue.
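
For reference, the workaround of switching from the DOMAIN\user form to the user@DOMAIN form can be sketched in Python as below. This is only an illustration: the helper name to_upn_style is hypothetical and not part of any charm, and it assumes the part before the backslash is also valid as the suffix after the @, which was true in this report but may not hold in every directory setup.

```python
def to_upn_style(username: str) -> str:
    """Rewrite 'DOMAIN\\user' as 'user@DOMAIN'.

    Hypothetical helper for illustration only. Usernames without a
    backslash are returned unchanged.
    """
    if "\\" not in username:
        return username
    # Split on the first backslash: left side is the domain, right side the user.
    domain, _, user = username.partition("\\")
    return f"{user}@{domain}"


if __name__ == "__main__":
    print(to_upn_style("DOMAIN\\my_vmware_user"))
```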

Revision history for this message
Camille Rodriguez (camille.rodriguez) wrote :

Got a timeout again. Sorry, Launchpad won't let me attach a juju-crashdump to this bug.

Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :
Revision history for this message
George Kraft (cynerva) wrote :

Thank you. Good job isolating the failure condition. Here is the relevant error from kube-controller-manager:

F0414 18:03:11.338599 22305 controllermanager.go:230] error building controller context: cloud provider could not be initialized: could not init cloud provider "vsphere": 6:8: unquoted '\' must be followed by new line

I believe that's referring to line 6 of the vsphere cloud config file, which is indeed the username: https://github.com/charmed-kubernetes/charm-kubernetes-master/blob/4fe832d3d891e21d87a9aaee4e1162a768bbb291/reactive/kubernetes_master.py#L2398

Sounds like we can fix it by adding quotes around the username when rendering that file.
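
A minimal sketch of that idea in Python, not the actual charm code. The function names here are hypothetical, and the escaping step is an assumption: it presumes the cloud config parser follows git-config-style quoting, where a backslash or double quote inside a quoted value must itself be backslash-escaped. If the parser accepts raw backslashes inside quotes, the escaping should be dropped.

```python
def quote_config_value(value: str) -> str:
    """Wrap a value in double quotes for an INI-style config line.

    Assumption: backslashes and double quotes inside the quoted value
    must be backslash-escaped (git-config-style rules).
    """
    # Escape backslashes first so the escapes added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def render_user_line(username: str) -> str:
    """Render the 'user' line of the config with a quoted value (hypothetical helper)."""
    return f"user = {quote_config_value(username)}"


if __name__ == "__main__":
    # Prints: user = "DOMAIN\\my_vmware_user"
    print(render_user_line("DOMAIN\\my_vmware_user"))
```

With quoting in place, the username line no longer contains an unquoted backslash, which is the condition the parser error above complains about.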

no longer affects: charmed-kubernetes-bundles
Changed in charm-kubernetes-master:
status: New → Confirmed
summary: - CK does not deploy any pods, kube-controller-manager keeps restarting
+ kube-controller-manager fails to start when vsphere user includes '\'
Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :
tags: added: review-needed
Changed in charm-kubernetes-master:
status: Confirmed → In Progress
assignee: nobody → Tim Van Steenburgh (tvansteenburgh)
milestone: none → 1.18+ck1
importance: Undecided → Medium
George Kraft (cynerva)
Changed in charm-kubernetes-master:
status: In Progress → Fix Committed
tags: removed: review-needed
Revision history for this message
George Kraft (cynerva) wrote :
Changed in charm-kubernetes-master:
status: Fix Committed → Fix Released