Can't bootstrap k8s controller

Bug #1876091 reported by Natalia Litvinova
This bug affects 2 people
Affects: Canonical Juju
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none)

Bug Description

After deploying a kubernetes-core cluster with the NFS charm on Juju 2.7.6 and running the add-k8s command successfully, juju hangs when trying to bootstrap the controller on the new k8s cloud:

$ juju add-k8s newcore
This operation can be applied to both a copy on this client and to the one on a controller.
Do you want to add k8s cloud newcore to:
    1. client only (--client)
    2. controller "core" only (--controller core)
    3. both (--client --controller core)
Enter your choice, or type Q|q to quit: 3

k8s substrate added as cloud "newcore" with storage provisioned
by the existing "default" storage class.
You can now bootstrap to this cloud by running 'juju bootstrap newcore'.

$ juju bootstrap newcore
Creating Juju controller "newcore" on newcore
Creating k8s resources for controller "controller-newcore"
Starting controller pod
Bootstrap agent now started
Contacting Juju controller at 10.152.183.204 to verify accessibility...

I can see that the controller was created, but the address 10.152.183.204 is not pingable.

Revision history for this message
Ian Booth (wallyworld) wrote :

If the k8s cluster creates ClusterIP services, these may not be accessible from outside the cluster, e.g. by the Juju client.

How the controller service is spun up can be configured

https://discourse.juju.is/t/new-features-and-changes-in-juju-2-7/2268

Configuring the controller service when bootstrapping

When bootstrapping directly to a k8s cluster, Juju will create the front end service to the controller pod according to the type of the underlying cluster. For microk8s, a ClusterIP service is created; for Charmed Kubernetes on AWS or Azure, a LoadBalancer service is created etc.
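
For reference, you can confirm which service type was actually created by listing the services in the controller's namespace; the namespace name below comes from the bootstrap output at the top of this report and changes with the controller name:

$ kubectl get svc -n controller-newcore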

You can override this behaviour to choose a specific service type and, where necessary, provide the external IP address(es) with which to configure the service. Here's how you'd specify a k8s ExternalName service:

$ juju bootstrap microk8s test \
  --config controller-service-type=external \
  --config controller-external-name=mydnsname \
  --config controller-external-ips=[10.0.0.1,10.0.0.2]
(where "mydnsname" and the external IPs are provided as dictated by the k8s setup being used).

Other options for service type are cluster and loadbalancer.
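
For completeness, a loadbalancer bootstrap is just the corresponding value of the same option; a rough sketch, using the cloud name from this report as a placeholder:

$ juju bootstrap newcore test --config controller-service-type=loadbalancer

This only works where the substrate can actually provision a load balancer, as noted elsewhere in this thread.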

Maybe this information will help you get started.

Revision history for this message
Natalia Litvinova (natalytvinova) wrote :

Thanks for pointing to the new juju changes doc! I tried bootstrapping with the following command using my k8s worker IP: https://paste.ubuntu.com/p/T7yTWymM46/

And I can see that the service is being created, but I can't telnet to it, and the controller pod does not have juju on it: https://paste.ubuntu.com/p/nxkDb92fbR/

Revision history for this message
Ian Booth (wallyworld) wrote :

There are 2 containers in the controller pod. The exec in the pastebin has gone to the container running mongo. You can exec into the container running the jujud agent by using the "-c api-server" option with kubectl exec.

telnet isn't enabled inside the juju controller container.
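
For example, something along these lines should land you in the jujud container rather than the mongo one (the pod name "controller-0" and namespace "controller-newcore" are assumptions here; kubectl get pods -n <namespace> shows the real names):

$ kubectl exec -it -n controller-newcore controller-0 -c api-server -- /bin/bash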

ExternalName k8s services may or may not be applicable to your scenario.
https://kubernetes.io/docs/concepts/services-networking/service/#externalname

Did you try "loadbalancer"? It also depends on how things have been set up, e.g. what underlying cloud kubernetes-core is deployed to: does that substrate support creating a loadbalancer automatically (as is done on AWS etc.)? If not, has a bespoke resource been set up to direct external traffic to the cluster? If so, those IP addresses would need to be specified via the controller-external-ips arg.

Are you able to give more context to your setup?

Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

Hi, I am running Juju 2.8.5 and Charmed Kubernetes 1.18 on stable charms and am also seeing this issue.
The k8s cluster is running on top of MAAS 2.6.
When I apply the configs described in comment #1, I get to the point where juju starts the containers and waits for the controller to respond.

The extra configs, as described in comment #1, can be seen at: https://pastebin.canonical.com/p/Rfn4wpxZNH/
as well as the --debug output.
The IPs listed as options correspond to the kubernetes workers' IPs; the external-name config corresponds to an A record that points to the same worker IPs.

Here are the logs from api-server: https://pastebin.canonical.com/p/BBfBKf62sw/
I attached a bash shell to that container and could verify that both localhost:17070 and localhost:37017 were reachable.
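
(For the record, once exec'ed into the api-server container as described above, a quick port check that needs no extra tools is bash's /dev/tcp trick, e.g.:

$ (echo > /dev/tcp/localhost/17070) && echo "17070 open"
$ (echo > /dev/tcp/localhost/37017) && echo "37017 open"

This is only a sketch of the kind of check used; the pastebins above are the actual evidence.)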

Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

Juju status from that Charmed Kubernetes deployment: https://pastebin.canonical.com/p/NCyCW2RHy9/

Revision history for this message
John A Meinel (jameinel) wrote :

@pedro:
19:17:33 DEBUG juju.api apiclient.go:758 looked up caas-controller.maas -> [172.27.16.118 172.27.16.116 72.27.16.117]
19:27:33 ERROR juju.cmd.juju.commands bootstrap.go:795 unable to contact api server after 1 attempts: dial tcp 72.27.16.117:17070: i/o timeout

One of those addresses looks like a typo. Namely 72.27.16.117 and not 172.27.16.117 like the other ones.

It is unclear to me why we seem to only try that one address and not the others (possibly because the other two addresses are RFC1918 Private IP addresses). But I'm guessing the IP address is incorrect because other than the missing 1 at the beginning and the final digit, they share the same values.

Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

Hi @jameinel, yes, that was my very first run and it had a wrong IP (missing the leading 1). I fixed that later; sorry for sharing old logs.

Revision history for this message
John A Meinel (jameinel) wrote :

juju bootstrap --debug test-k8s --config controller-service-type=external --config controller-external-name=caas-controller.maas --config controller-external-ips=[172.27.16.117,172.27.16.116,172.27.16.118]

That command tells us that the controller should be available on those specific IPs; however, you are saying that it was *actually* available at 'localhost:17070', is that correct?

Do you have newer logs of this configuration? Ultimately we need to be able to route packets to the controller and have it respond, and have configuration which tells us where to connect to it. There are a few knobs that can be tweaked but Juju can't know ahead of time what your routing configuration is.
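
(As a quick routing check from the client side, something like the following can show whether the API port is reachable at all on those addresses, assuming netcat is installed on the client machine:

$ nc -vz 172.27.16.117 17070
$ nc -vz caas-controller.maas 17070

A timeout here points at routing or firewalling between the client and the cluster rather than at Juju itself.)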

Changed in juju:
importance: Undecided → High
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired
Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

Hi, I am moving this bug back to "New" status. I am still not able to bootstrap a controller using the external service type.

Kubernetes and Juju versions: https://pastebin.canonical.com/p/HkDdRDrZfm/

I've registered a caas-controller.maas record, which is resolvable from my juju client, as shown: https://pastebin.canonical.com/p/dXvKvXBsCh/

You can see that the record matches all 3 of my kubernetes workers: https://pastebin.canonical.com/p/MDmDymkWpZ/

And I can resolve that same record from any of the workers: https://pastebin.canonical.com/p/TCf2YDtMYb/
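
(For reference, the resolution checks in the pastebins above boil down to something like the following, assuming the host utility is available on the client and the workers:

$ host caas-controller.maas

which should list all three worker addresses.)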

juju clouds: https://pastebin.canonical.com/p/tBs3bMwfpS/

I run the following command to bootstrap: $ juju bootstrap --debug test --config controller-service-type=external --config controller-external-name=caas-controller.maas --config controller-external-ips=[172.27.16.70,172.27.16.71,172.27.16.72]

Bootstrap times out: https://pastebin.canonical.com/p/nkK3KcTKwQ/

While it is trying to bootstrap, I can collect the following information:
Controller pod describe: https://pastebin.canonical.com/p/HSVMtmQPVf/
Service describe: https://pastebin.canonical.com/p/kjkrMfNCBw/

If I ssh to one of the worker nodes and try to access port 17070, it fails: https://pastebin.canonical.com/p/FJKcgH67bH/

Changed in juju:
status: Expired → New
Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

The only practical option I can see right now is to use the Load Balancer type: https://github.com/juju/juju/blob/4953ff65902d95a7497ee16ee56a8070872f7a0e/environs/bootstrap/config.go#L133

Can we have the option for NodePort as well?
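
(Purely as an illustrative, unverified sketch of the NodePort idea: the controller's front-end service could in principle be patched to NodePort by hand after bootstrap, e.g.

$ kubectl -n controller-test patch svc controller-service \
    --type merge -p '{"spec":{"type":"NodePort"}}'

The namespace and service name here are assumptions about what Juju creates, and the Juju client would still need to be pointed at the resulting node IP and port, so this is not a supported path.)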

Revision history for this message
Ian Booth (wallyworld) wrote :

The expectation from the Juju perspective is that the CLI can route to the external IPs specified for the controller.
Can you provision a load balancer to allow ingress to the cluster to achieve that?

Unfortunately, NodePort is not currently on the radar as something we can schedule to be done "soon". Marking this as Wishlist so that we can at least still track the request.

Changed in juju:
importance: High → Wishlist
status: New → Triaged
Revision history for this message
John A Meinel (jameinel) wrote :

I thought that with Juju 2.9 we already use tunneling from k8s to get access to the Juju controller for the juju CLI.

Which means you wouldn't have to declare any of the external information (depending on what else you are trying to do).
