Kubernetes master fails to start the pods when using calico

Bug #1854520 reported by Alexander Balderson
Affects: Kubernetes Control Plane Charm
Status: Invalid
Importance: High
Assigned to: George Kraft

Bug Description

All Solutions-QA runs have been failing when trying to start the pods since we moved to calico.

All other units come up normally, and nothing is standing out to me in the logs.

Crashdump is attached.

Revision history for this message
Alexander Balderson (asbalderson) wrote :
George Kraft (cynerva)
Changed in charm-kubernetes-master:
assignee: nobody → George Kraft (cynerva)
tags: added: cdo-qa foundations-engine
Revision history for this message
Alexander Balderson (asbalderson) wrote :

subscribing field-high since this is blocking solutions-qa testing of calico

Revision history for this message
George Kraft (cynerva) wrote :

Kubelet is failing to start pods with:

Error syncing pod eb88b735-ce2c-4b1d-ab97-24390482bada ("coredns-568cb7d86-x5v7c_kube-system(eb88b735-ce2c-4b1d-ab97-24390482bada)"), skipping: failed to "CreatePodSandbox" for "coredns-568cb7d86-x5v7c_kube-system(eb88b735-ce2c-4b1d-ab97-24390482bada)" with CreatePodSandboxError: "CreatePodSandbox for pod \"coredns-568cb7d86-x5v7c_kube-system(eb88b735-ce2c-4b1d-ab97-24390482bada)\" failed: rpc error: code = Unknown desc = failed to setup network for sandbox \"a7d821e0e598a2d78c38c115dc804f9e839cfb393ee3e99f35b0b0125fd3e473\": Get https://10.5.0.7:443/api/v1/namespaces/kube-system: Service Unavailable"

That's the IP for kubeapi-load-balancer/0. I'm not seeing that request in the kubeapi-load-balancer/0 nginx logs, which tells me that traffic is not going where it's supposed to - possibly a proxy or firewall issue.

I haven't been able to reproduce this. I need more information. How are the applications configured? That information isn't included in the crashdump, see https://github.com/juju/juju-crashdump/issues/50 and a proposed fix here: https://github.com/juju/juju-crashdump/pull/52

Changed in charm-kubernetes-master:
importance: Undecided → High
status: New → Incomplete
Revision history for this message
Alexander Balderson (asbalderson) wrote :

Thanks for the proposal!

I'm attaching the bundle from the deploy. It is basically our standard daily run of kubernetes on serverstack, but with flannel replaced by calico. The flannel-based run passes every night, whereas the calico-based run fails.

Changed in charm-kubernetes-master:
status: Incomplete → New
Revision history for this message
George Kraft (cynerva) wrote :

Ah, thanks. I suspect it has something to do with proxy configuration on containerd. I'll have another go at reproducing this and get back to you.

Changed in charm-kubernetes-master:
status: New → In Progress
Revision history for this message
George Kraft (cynerva) wrote :

I can reproduce this by deploying cs:~containers/kubernetes-calico with http_proxy and https_proxy set. If I add the local network to no_proxy, though, then it works.

Can you try adding 10.5.0.0/24 (or whatever your actual subnet is) to containerd's no_proxy config and see if that resolves your problem?
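The suggestion works because containerd's Go HTTP client honours CIDR blocks in its no_proxy setting, so listing the subnet makes requests to the load balancer bypass the proxy. A rough Python emulation of that matching (an illustration only, not containerd's actual code; the `bypass_proxy` helper is hypothetical) shows why 10.5.0.7 is only exempted once the subnet is added:

```python
import ipaddress

def bypass_proxy(host: str, no_proxy: str) -> bool:
    """Rough emulation of CIDR-aware no_proxy matching, in the style of
    Go's HTTP proxy handling; not containerd's actual implementation."""
    for entry in filter(None, (e.strip() for e in no_proxy.split(","))):
        try:
            # CIDR or bare-IP entries: check network membership.
            if ipaddress.ip_address(host) in ipaddress.ip_network(entry, strict=False):
                return True
        except ValueError:
            # Hostname entries: exact match or domain-suffix match.
            if host == entry or host.endswith("." + entry.lstrip(".")):
                return True
    return False

# Without the subnet listed, requests to the load balancer go via the proxy:
print(bypass_proxy("10.5.0.7", "localhost,127.0.0.1"))    # False
print(bypass_proxy("10.5.0.7", "localhost,10.5.0.0/24"))  # True
```

With the containerd charm, this would typically be applied through its proxy config options, e.g. something like `juju config containerd no_proxy=10.5.0.0/24` (option name assumed from the charm's standard proxy settings).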

George Kraft (cynerva)
Changed in charm-kubernetes-master:
status: In Progress → Incomplete
Revision history for this message
Alexander Balderson (asbalderson) wrote :

That did seem to resolve the issue. Thanks for the help!

Revision history for this message
George Kraft (cynerva) wrote :

Cool, thanks for the follow-up!

Changed in charm-kubernetes-master:
status: Incomplete → Invalid