Juju -> LXD Cluster - Waiting for kubelet to start

Bug #1834374 reported by Joel Johnston
Affects: AWS Integrator Charm
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

Using the instructions found here: https://github.com/charmed-kubernetes/bundle/wiki/Deploying-on-LXD

I have built a five-machine LXD cluster using MAAS. I've created a passthrough bridge on br0 on each node and am using the static IPs assigned by MAAS in place of the primary interface. After that I manually created the LXD cluster and added the subsequent nodes.
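
(For reference, the bridge on each host is the usual netplan shape, whether written by hand or rendered by MAAS; the interface name and addresses below are placeholders:)

network:
  version: 2
  ethernets:
    eno1: {}
  bridges:
    br0:
      interfaces: [eno1]
      addresses: [<static-ip>/24]
      gateway4: <gateway-ip>
      nameservers:
        addresses: [<dns-ip>]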

I then added the LXD cluster as a cloud in Juju, provisioned credentials, and then bootstrapped Juju against the LXD cluster. This all seems to work well.
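
(Roughly, with the cloud name and endpoint as placeholders, that step looked like:)

cat > lxd-cluster-cloud.yaml <<EOF
clouds:
  lxd-cluster:
    type: lxd
    auth-types: [certificate]
    endpoint: https://<cluster-node-ip>:8443
EOF
juju add-cloud lxd-cluster lxd-cluster-cloud.yaml
juju add-credential lxd-cluster
juju bootstrap lxd-cluster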

I then use the lxd-profile.yaml and the instructions listed above to modify the LXD profile on my MAAS/Juju machine. I deploy the Kubernetes cluster against the LXD cluster with juju deploy cs:bundle/canonical-kubernetes-592
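
(From memory, the profile step boils down to creating the model first and then loading the wiki's lxd-profile.yaml into the juju-<model> profile that Juju uses for it, roughly:)

juju add-model kubernetes
# create the profile if Juju hasn't already, then overwrite it with the wiki's contents
lxc profile create juju-kubernetes 2>/dev/null || true
lxc profile edit juju-kubernetes < lxd-profile.yaml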

I then apply the proxy config listed in the instructions to modify the worker network profile with juju config -m "$JUJU_CONTROLLER:$JUJU_MODEL" kubernetes-worker proxy-extra-args="proxy-mode=userspace"

The cluster comes up and gets most of the way until it sticks here.

Every 2.0s: juju status --color lv-maas-01: Wed Jun 26 21:47:03 2019

Model Controller Cloud/Region Version SLA Timestamp
kubernetes lxd-cluster-default lxd-cluster/default 2.6.4 unsupported 21:47:03Z

App Version Status Scale Charm Store Rev OS Notes
easyrsa 3.0.1 active 1 easyrsa jujucharms 222 ubuntu
etcd 3.2.10 active 3 etcd jujucharms 397 ubuntu
flannel 0.10.0 active 5 flannel jujucharms 386 ubuntu
kubeapi-load-balancer 1.14.0 active 1 kubeapi-load-balancer jujucharms 583 ubuntu exposed
kubernetes-master 1.13.7 waiting 2 kubernetes-master jujucharms 604 ubuntu
kubernetes-worker 1.13.7 waiting 3 kubernetes-worker jujucharms 472 ubuntu exposed

Unit Workload Agent Machine Public address Ports Message
easyrsa/0* active idle 0 <myip> Certificate Authority connected.
etcd/0* active idle 1 <myip> 2379/tcp Healthy with 3 known peers
etcd/1 active idle 2 <myip> 2379/tcp Healthy with 3 known peers
etcd/2 active idle 3 <myip> 2379/tcp Healthy with 3 known peers
kubeapi-load-balancer/0* active idle 4 <myip> 443/tcp Loadbalancer ready.
kubernetes-master/0* waiting idle 5 <myip> 6443/tcp Waiting for 7 kube-system pods to start
  flannel/2 active idle <myip> Flannel subnet 10.1.69.1/24
kubernetes-master/1 waiting idle 6 <myip> 6443/tcp Waiting for 7 kube-system pods to start
  flannel/3 active idle <myip> Flannel subnet 10.1.7.1/24
kubernetes-worker/0 waiting idle 7 <myip> 80/tcp,443/tcp Waiting for kubelet to start.
  flannel/1 active idle <myip> Flannel subnet 10.1.90.1/24
kubernetes-worker/1* waiting idle 8 <myip> 80/tcp,443/tcp Waiting for kubelet to start.
  flannel/0* active idle <myip> Flannel subnet 10.1.77.1/24
kubernetes-worker/2 waiting idle 9 <myip> 80/tcp,443/tcp Waiting for kubelet to start.
  flannel/4 active idle <myip> Flannel subnet 10.1.97.1/24

Machine State DNS Inst id Series AZ Message
0 started <myip> juju-c4ad65-0 bionic Running
1 started <myip> juju-c4ad65-1 bionic Running
2 started <myip> juju-c4ad65-2 bionic Running
3 started <myip> juju-c4ad65-3 bionic Running
4 started <myip> juju-c4ad65-4 bionic Running
5 started <myip> juju-c4ad65-5 bionic Running
6 started <myip> juju-c4ad65-6 bionic Running
7 started <myip> juju-c4ad65-7 bionic Running
8 started <myip> juju-c4ad65-8 bionic Running
9 started <myip> juju-c4ad65-9 bionic Running

Please let me know what I should check to verify the config. Thank you.

Mike Wilson (knobby) wrote :

Thanks for the bug report. This certainly should work. It would be great to get a cdk-field-agent run for this bug to help out. If not, could we get the output of `kubectl describe no` and `kubectl describe po -A`, run from the master unit?
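
(Something like this from the Juju client should do it; the kubectl path and kubeconfig location on the master are assumptions based on the usual CDK layout, and on 1.13 use --all-namespaces rather than -A:)

juju ssh kubernetes-master/0 -- /snap/bin/kubectl --kubeconfig /home/ubuntu/config describe no
juju ssh kubernetes-master/0 -- /snap/bin/kubectl --kubeconfig /home/ubuntu/config describe po --all-namespaces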

Joel Johnston (joeldjohnston) wrote :

Not sure if that attachment is going to be effective; the cdk-field-agent run ended in a loop complaining of a timeout error:

"Error checking action output. Ignoring.
ERROR timeout reached
Error checking action output. Ignoring.
ERROR timeout reached
Error checking action output. Ignoring.
ERROR timeout reached
Error checking action output. Ignoring.
ERROR timeout reached
Error checking action output. Ignoring.
ERROR timeout reached
Error checking action output. Ignoring."

I'm going to assume this is because the Master pods are in a waiting state?

George Kraft (cynerva) wrote :

Indeed, that's missing the info we needed. It looks like cdk-field-agent spews that message when it's waiting for a debug action to complete. I think if you run it again and wait longer, it will eventually complete with the info we need. I'll follow up and see about making the output of cdk-field-agent cleaner.

Please run cdk-field-agent again and wait for it to complete.

Alternatively, if you're willing to do a bit of back-and-forth on this issue, we can walk through more direct commands to troubleshoot this. The main thing is that we need to see why the workers are "Waiting for kubelet to start." Seeing the kubelet logs would help:

juju run --unit kubernetes-worker/0 -- journalctl -o cat -u snap.kubelet.daemon

Joel Johnston (joeldjohnston) wrote :

cloudadmin@lv-maas-01:~$ juju run --unit kubernetes-worker/0 -- journalctl -o cat -u snap.kubelet.daemon
snap.kubelet.daemon.service: Failed to reset devices.list: Operation not permitted
Started Service for snap application kubelet.daemon.
cat: /var/snap/kubelet/1031/args: No such file or directory
I0626 19:42:13.523303 17004 server.go:407] Version: v1.13.7
I0626 19:42:13.523538 17004 plugins.go:103] No cloud provider specified.
W0626 19:42:13.523559 17004 server.go:552] standalone mode, no API client
W0626 19:42:13.633959 17004 server.go:464] No api server defined - no events will be sent to API server.
I0626 19:42:13.633991 17004 server.go:666] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
F0626 19:42:13.634931 17004 server.go:261] failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: [Filename Type Size Used Priority none virtual 8388604 8388336 0]
snap.kubelet.daemon.service: Main process exited, code=exited, status=255/n/a
snap.kubelet.daemon.service: Failed with result 'exit-code'.
snap.kubelet.daemon.service: Service hold-off time over, scheduling restart.
snap.kubelet.daemon.service: Scheduled restart job, restart counter is at 1.
Stopped Service for snap application kubelet.daemon.
snap.kubelet.daemon.service: Failed to reset devices.list: Operation not permitted
Started Service for snap application kubelet.daemon.
snap.kubelet.daemon.service: Failed to reset devices.list: Operation not permitted
I0626 19:42:14.052796 18823 server.go:407] Version: v1.13.7
I0626 19:42:14.052983 18823 plugins.go:103] No cloud provider specified.
W0626 19:42:14.053000 18823 server.go:552] standalone mode, no API client
W0626 19:42:14.088469 18823 server.go:464] No api server defined - no events will be sent to API server.
I0626 19:42:14.088489 18823 server.go:666] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
F0626 19:42:14.089407 18823 server.go:261] failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: [Filename Type Size Used Priority none virtual 8388604 8388336 0]
snap.kubelet.daemon.service: Main process exited, code=exited, status=255/n/a
snap.kubelet.daemon.service: Failed with result 'exit-code'.
snap.kubelet.daemon.service: Service hold-off time over, scheduling restart.
snap.kubelet.daemon.service: Scheduled restart job, restart counter is at 2.
Stopped Service for snap application kubelet.daemon.
snap.kubelet.daemon.service: Failed to reset devices.list: Operation not permitted
Started Service for snap application kubelet.daemon.
snap.kubelet.daemon.service: Failed to reset devices.list: Operation not permitted
I0626 19:42:14.445545 18919 server.go:407] Version: v1.13.7
I0626 19:42:14.446270 18919 plugins.go:103] No cloud provider specified.
W0626 19:42:14.446295 18919 server.go:552] standalone mode, no API client
W0626 19:42:14.495318 18919 server.go:464] No api server defined - no even...

George Kraft (cynerva) wrote :

Thanks. This is the fatal error:

F0626 19:55:06.745273 60563 kubelet.go:1384] Failed to start ContainerManager [open /proc/sys/vm/overcommit_memory: permission denied, open /proc/sys/kernel/panic: permission denied, open /proc/sys/kernel/panic_on_oops: permission denied

We usually see this error when the LXD profile hasn't been applied. Can you confirm that the profile has been applied with the name "juju-kubernetes", and that the instances are using it?

Command with example output below. This will show you both the profile contents, and the instances that are using the profile:

$ lxc profile show juju-kubernetes
config:
  boot.autostart: "true"
  linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
  raw.lxc: |
    lxc.apparmor.profile=unconfined
    lxc.mount.auto=proc:rw sys:rw
    lxc.cap.drop=
  security.nesting: "true"
  security.privileged: "true"
description: ""
devices:
  aadisable:
    path: /sys/module/nf_conntrack/parameters/hashsize
    source: /dev/null
    type: disk
  aadisable1:
    path: /sys/module/apparmor/parameters/enabled
    source: /dev/null
    type: disk
  aadisable2:
    path: /dev/kmsg
    source: /dev/kmsg
    type: unix-char
name: juju-kubernetes
used_by:
- /1.0/containers/juju-d5cfa2-0
- /1.0/containers/juju-d5cfa2-1
- /1.0/containers/juju-d5cfa2-2
- /1.0/containers/juju-d5cfa2-3
- /1.0/containers/juju-d5cfa2-4
- /1.0/containers/juju-d5cfa2-6
- /1.0/containers/juju-d5cfa2-5
- /1.0/containers/juju-d5cfa2-7
- /1.0/containers/juju-d5cfa2-9
- /1.0/containers/juju-d5cfa2-8

Reading through your original description more carefully, this stood out:

> I have built a five-machine LXD cluster using MAAS.

Ah! I don't think we've tested the case where an LXD cluster spans multiple machines. I'm not too familiar with this scenario - is it possible you need to apply the LXD profile on all five hosts?
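
(An untested sketch of what I mean, with the wiki's lxd-profile.yaml saved locally and placeholder hostnames for the five LXD hosts:)

for host in <host-1> <host-2> <host-3> <host-4> <host-5>; do
    ssh ubuntu@$host lxc profile edit juju-kubernetes < lxd-profile.yaml
done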

Joel Johnston (joeldjohnston) wrote :

ubuntu@lv-kube-01:~$ lxc profile show juju-kubernetes
config:
  boot.autostart: "true"
  security.nesting: "true"
description: ""
devices: {}
name: juju-kubernetes
used_by:
- /1.0/containers/juju-c4ad65-0
- /1.0/containers/juju-c4ad65-1
- /1.0/containers/juju-c4ad65-2
- /1.0/containers/juju-c4ad65-4
- /1.0/containers/juju-c4ad65-5
- /1.0/containers/juju-c4ad65-3
- /1.0/containers/juju-c4ad65-7
- /1.0/containers/juju-c4ad65-6
- /1.0/containers/juju-c4ad65-8
- /1.0/containers/juju-c4ad65-9

ubuntu@lv-kube-02:~$ lxc profile show juju-kubernetes
config:
  boot.autostart: "true"
  security.nesting: "true"
description: ""
devices: {}
name: juju-kubernetes
used_by:
- /1.0/containers/juju-c4ad65-0
- /1.0/containers/juju-c4ad65-1
- /1.0/containers/juju-c4ad65-2
- /1.0/containers/juju-c4ad65-4
- /1.0/containers/juju-c4ad65-5
- /1.0/containers/juju-c4ad65-3
- /1.0/containers/juju-c4ad65-7
- /1.0/containers/juju-c4ad65-6
- /1.0/containers/juju-c4ad65-8
- /1.0/containers/juju-c4ad65-9

ubuntu@lv-kube-03:~$ lxc profile show juju-kubernetes
config:
  boot.autostart: "true"
  security.nesting: "true"
description: ""
devices: {}
name: juju-kubernetes
used_by:
- /1.0/containers/juju-c4ad65-0
- /1.0/containers/juju-c4ad65-1
- /1.0/containers/juju-c4ad65-2
- /1.0/containers/juju-c4ad65-4
- /1.0/containers/juju-c4ad65-5
- /1.0/containers/juju-c4ad65-3
- /1.0/containers/juju-c4ad65-7
- /1.0/containers/juju-c4ad65-6
- /1.0/containers/juju-c4ad65-8
- /1.0/containers/juju-c4ad65-9

ubuntu@lv-kube-04:~$ lxc profile show juju-kubernetes
config:
  boot.autostart: "true"
  security.nesting: "true"
description: ""
devices: {}
name: juju-kubernetes
used_by:
- /1.0/containers/juju-c4ad65-0
- /1.0/containers/juju-c4ad65-1
- /1.0/containers/juju-c4ad65-2
- /1.0/containers/juju-c4ad65-4
- /1.0/containers/juju-c4ad65-5
- /1.0/containers/juju-c4ad65-3
- /1.0/containers/juju-c4ad65-7
- /1.0/containers/juju-c4ad65-6
- /1.0/containers/juju-c4ad65-8
- /1.0/containers/juju-c4ad65-9

ubuntu@lv-kube-05:~$ lxc profile show juju-kubernetes
config:
  boot.autostart: "true"
  security.nesting: "true"
description: ""
devices: {}
name: juju-kubernetes
used_by:
- /1.0/containers/juju-c4ad65-0
- /1.0/containers/juju-c4ad65-1
- /1.0/containers/juju-c4ad65-2
- /1.0/containers/juju-c4ad65-4
- /1.0/containers/juju-c4ad65-5
- /1.0/containers/juju-c4ad65-3
- /1.0/containers/juju-c4ad65-7
- /1.0/containers/juju-c4ad65-6
- /1.0/containers/juju-c4ad65-8
- /1.0/containers/juju-c4ad65-9

Joel Johnston (joeldjohnston) wrote :

So, I thought I'd add a bit of the "why" to this thread to make sure I'm working under the correct assumptions. The purpose of this attempt to build Kubernetes (and eventually MongoDB, OpenStack, Ceph, and ...?) against LXD has to do with Juju.

I have a five-machine lab environment that I want to use to house multiple environments simultaneously. I have successfully deployed Kubernetes against MAAS using Juju a number of times. However, once this is done, the machines are marked as deployed, so if I were to try to deploy a MongoDB stack against it (even if I used containers as targets), Juju sees those nodes as deployed.

So my thought was to build these hosts into an LXD cluster and thus be able to provision (n) nodes as containers, so that I could build a Kubernetes cluster and a simultaneous MongoDB cluster in the same LXD cluster without deploying Mongo on Kubernetes. Eventually, I'd like to have quite a number of simultaneous lab environments stacked here, using Juju's multi-tenant capabilities.

If there's an easier way to achieve this with Juju and MAAS, without an LXD cluster per se, I'd be all ears on that. Thanks again for your help.

George Kraft (cynerva) wrote :

From your output of `lxc profile show juju-kubernetes`, it looks like the profile wasn't applied correctly. The profile from your output is the default one created by Juju, not the one from our CDK-on-LXD documentation. Can you try applying the profile again?

> I have a five-machine lab environment that I want to use to house multiple environments simultaneously. I have successfully deployed Kubernetes against MAAS using Juju a number of times. However, once this is done, the machines are marked as deployed, so if I were to try to deploy a MongoDB stack against it (even if I used containers as targets), Juju sees those nodes as deployed.

You should be able to deploy multiple units to a small number of machines using the Juju CLI. For example, if you want to deploy mongodb to an LXD container on machine 2 while it's already in use, you'd do:

juju deploy mongodb --to lxd:2
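
(Placement also takes a list if you want several units spread across machines you already have, e.g. something like:)

juju deploy mongodb -n 3 --to lxd:0,lxd:1,lxd:2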

> If there's an easier way to achieve this with Juju and MAAS, without an LXD cluster per se, I'd be all ears on that. Thanks again for your help.

I believe the most common approach for this type of scenario is to run your Juju controller and models on a MAAS cloud, and use placement directives to tell Juju to deploy to LXD containers as needed (see above).

One benefit to this approach is that you could run troublesome units like kubernetes-worker directly on bare metal, while still utilizing LXD for the remaining units that don't have problems with it.
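
(As a rough illustration only: a bundle overlay of this shape, passed with --overlay at deploy time, could pin the workers to bare metal while keeping the rest in containers; the machine numbers are placeholders:)

applications:
  kubernetes-worker:
    to: ["0", "1", "2"]
  kubernetes-master:
    to: ["lxd:3", "lxd:4"]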

What you're trying seems reasonable, too, though. It just hasn't been tested well, so you may run into issues we're not aware of.

Tim Van Steenburgh (tvansteenburgh) wrote :

Yeah, your approach makes sense and AFAIK should work. We don't test on clustered lxd right now, but I'd like to see this work - I'm not aware of any reasons why it wouldn't.

But, the juju-kubernetes profiles you show above are not correct, so that's a problem. Did your steps look roughly like this?

1. After you bootstrap, do `juju add-model kubernetes`
2. Using the instructions on the wiki you cited, apply the custom profile on all 5 hosts, ensuring that $JUJU_MODEL is "kubernetes".
3. juju deploy cs:bundle/canonical-kubernetes-592

This will get easier in the future, once this feature is done: https://bugs.launchpad.net/charm-kubernetes-master/+bug/1835078

As for alternatives, you could also explore using the KVM Pods feature of MAAS: https://docs.maas.io/2.6/en/manage-kvm-intro

Joel Johnston (joeldjohnston) wrote :

Alright, thank you to both George and Tim. One confusing thing about running in cluster mode is that profiles APPEAR to be maintained across the cluster. For instance, I created a profile on machine 5 and `lxc profile list` shows that profile on machine 1. However, it would appear that the profile needs to be applied to all of the machines for it to actually work.

The second thing that got me was the naming of the profile. In Juju you name the environment "kubernetes"; when it deploys, it uses the profile "juju-kubernetes". I was trying to match these names and "outsmart" the instructions, and that didn't work out very well for me :)
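
(For anyone else hitting this: the profile Juju actually consumes is juju-<model name>, which you can sanity-check on each host with something like:)

lxc profile list
lxc config show <container-name> | grep -A 2 profiles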

Per your instructions, I named the environment kubernetes via Juju and then applied the custom profile to the juju-kubernetes profile on each machine. Juju used said juju-kubernetes profile to deploy the containers and....

ubuntu@lv-kube-01:~$ lxc list
+---------------+---------+-----------------------+------+------------+-----------+------------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS | LOCATION |
+---------------+---------+-----------------------+------+------------+-----------+------------+
| juju-7f1191-0 | RUNNING | <my ip range>.99 (eth0) | | PERSISTENT | 0 | lv-kube-01 |
+---------------+---------+-----------------------+------+------------+-----------+------------+
| juju-cfe9ce-0 | RUNNING | <my ip range>.117 (eth0) | | PERSISTENT | 0 | lv-kube-04 |
+---------------+---------+-----------------------+------+------------+-----------+------------+
| juju-cfe9ce-1 | RUNNING | <my ip range>.115 (eth0) | | PERSISTENT | 0 | lv-kube-05 |
+---------------+---------+-----------------------+------+------------+-----------+------------+
| juju-cfe9ce-2 | RUNNING | <my ip range>.116 (eth0) | | PERSISTENT | 0 | lv-kube-03 |
+---------------+---------+-----------------------+------+------------+-----------+------------+
| juju-cfe9ce-3 | RUNNING | <my ip range>.124 (eth0) | | PERSISTENT | 0 | lv-kube-02 |
+---------------+---------+-----------------------+------+------------+-----------+------------+
| juju-cfe9ce-4 | RUNNING | <my ip range>.120 (eth0) | | PERSISTENT | 0 | lv-kube-02 |
+---------------+---------+-----------------------+------+------------+-----------+------------+
| juju-cfe9ce-5 | RUNNING | 172.17.0.1 (docker0) | | PERSISTENT | 0 | lv-kube-02 |
| | | <my ip range>.118 (eth0) | | | | |
| | | 10.1.79.0 (flannel.1) | | | | |
+---------------+---------+-----------------------+------+------------+-----------+------------+
| juju-cfe9ce-6 | RUNNING | 172.17.0.1 (docker0) | | PERSISTENT | 0 | lv-kube-05 |
| | | <my ip range>.108 (eth0) | | | | |
| | | 10.1.61.0 (flannel.1) | | | | |
+---------------+---------+-----------------------+------+------------+-----------+------------+
| juju-cfe9ce-7 | RUNNING | 172...

Tim Van Steenburgh (tvansteenburgh) wrote :

Awesome Joel! Glad you got it working, and thanks for taking the time to leave a detailed update here.

Tim Van Steenburgh (tvansteenburgh) wrote :

I made some additions to the wiki instructions to clarify the profile naming and to highlight that the profile must be applied to each host in an LXD cluster.

Changed in charm-aws-integrator:
status: New → Fix Released