kubernetes-master unit fails to connect to itself

Bug #1929234 reported by Joshua Genet
Affects: Kubernetes Control Plane Charm
Status: Fix Released
Importance: High
Assigned to: Gabriel Cocenza
Milestone: 1.24

Bug Description

Run here:
https://solutions.qa.canonical.com/testruns/testRun/d3a7df7a-c49b-447c-8061-0aa2ff02590a

Logs/Artifacts/Bundles here:
https://oil-jenkins.canonical.com/artifacts/d3a7df7a-c49b-447c-8061-0aa2ff02590a/index.html

---

1.21 k8s on baremetal

Looking through system.journal shows a bunch of "connection refused" errors for several k8s resources at the IP of the container for kubernetes-master/2:
Get "https://192.168.33.39:6443/apis/apps/v1/replicasets?limit=500&resourceVersion=0": dial tcp 192.168.33.39:6443: connect: connection refused

Following this run we had 2 successful runs, so this seems to be either a race or just bad hardware on our end.

Revision history for this message
George Kraft (cynerva) wrote :

Looks like yet another case where pacemaker (hacluster), which hijacks control of the systemd services, has decided not to restart kube-apiserver. Yay.

kube-apiserver stopped at 03:11:35, no restart attempted for the next ~4 hours.

I could have sworn there is an open issue about this, but I can't find it.
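
For anyone triaging a similar run, a quick way to confirm this state on an affected unit is to compare what systemd and pacemaker each report for the service. This is only a diagnostic sketch; the unit and resource names depend on the deployment:

import subprocess

def run(cmd):
    # Run a command and return its output; non-zero exit codes are tolerated
    # because a stopped/failed service is exactly what we are looking for.
    return subprocess.run(cmd, capture_output=True, text=True).stdout

# What systemd thinks of the unit (active, failed, inactive, ...).
print(run(["systemctl", "is-active", "kube-apiserver"]))

# What pacemaker thinks of its resources, including stopped/inactive ones.
print(run(["crm_mon", "--one-shot", "--inactive"]))

# Journal entries around the time the service stopped.
print(run(["journalctl", "-u", "kube-apiserver", "--since", "-4h", "--no-pager"]))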

Changed in charm-kubernetes-master:
importance: Undecided → High
status: New → Triaged
Revision history for this message
Michael Skalka (mskalka) wrote :

Seeing what looks like a similar issue on aws here: https://solutions.qa.canonical.com/testruns/testRun/33ed9e4f-5d3b-4a62-b34b-b54642df3862

test_dns_provider fails, but on the backend it looks like it's actually an issue getting a lock on the nginx pod:

E0528 13:15:59.858909 7 leaderelection.go:325] error retrieving resource lock ingress-nginx-kubernetes-worker/ingress-controller-leader-nginx: Get "https://10.152.183.1:443/api/v1/namespaces/ingress-nginx-kubernetes-worker/configmaps/ingress-controller-leader-nginx": dial tcp 10.152.183.1:443: connect: connection refused

Revision history for this message
George Kraft (cynerva) wrote :

> Seeing what looks like a similar issue on aws here: ...

This failure is completely unrelated - hacluster is not present in that environment, and kube-apiserver is running just fine. Please open a new issue for the failure on AWS.

Revision history for this message
Michael Skalka (mskalka) wrote :

Worth noting that this can cause an issue in test_sans:

=================================== FAILURES ===================================
__________________________________ test_sans ___________________________________
Traceback (most recent call last):
  File "/home/ubuntu/k8s-validation/jobs/integration/validation.py", line 1071, in test_sans
    await retry_async_with_timeout(
  File "/home/ubuntu/k8s-validation/jobs/integration/utils.py", line 202, in retry_async_with_timeout
    raise asyncio.TimeoutError(timeout_msg)
asyncio.exceptions.TimeoutError: extra sans config did not propagate to server certs
------------------------------ Captured log setup ------------------------------
WARNING juju.client.connection:connection.py:706 unknown facade CAASApplication
WARNING juju.client.connection:connection.py:730 unexpected facade CAASApplication found, unable to decipher version to use
WARNING juju.client.connection:connection.py:706 unknown facade CAASApplicationProvisioner
WARNING juju.client.connection:connection.py:730 unexpected facade CAASApplicationProvisioner found, unable to decipher version to use
WARNING juju.client.connection:connection.py:706 unknown facade CAASFirewallerEmbedded
WARNING juju.client.connection:connection.py:730 unexpected facade CAASFirewallerEmbedded found, unable to decipher version to use
WARNING juju.client.connection:connection.py:706 unknown facade CAASModelOperator
WARNING juju.client.connection:connection.py:730 unexpected facade CAASModelOperator found, unable to decipher version to use
WARNING juju.client.connection:connection.py:724 unknown common facade version for CAASUnitProvisioner
WARNING juju.client.connection:connection.py:706 unknown facade CharmHub
WARNING juju.client.connection:connection.py:730 unexpected facade CharmHub found, unable to decipher version to use
WARNING juju.model:model.py:905 unknown delta type: id
- generated xml file: /home/ubuntu/project/generated/kubernetes/k8s-suite/test_sans-junit.xml -
----- generated html file: file:///home/ubuntu/k8s-validation/report.html ------
=========================== short test summary info ============================
FAILED jobs/integration/validation.py::test_sans - asyncio.exceptions.Timeout...
======================== 1 failed in 606.01s (0:10:06) =========================

var/log/juju/unit-kubernetes-master-0.log:

2021-08-05 13:47:09 INFO unit.kubernetes-master/0.juju-log server.go:314 Executing ['kubectl', '--kubeconfig=/root/.kube/config', 'get', 'secrets', '-n', 'kube-system', '--field-selector', 'type=juju.is/token-auth', '-o', 'json']
2021-08-05 13:47:09 WARNING unit.kubernetes-master/0.update-status logger.go:60 The connection to the server 192.168.33.170:6443 was refused - did you specify the right host or port?
2021-08-05 13:47:09 INFO unit.kubernetes-master/0.juju-log server.go:314 Executing ['kubectl', '--kubeconfig=/root/.kube/config', 'get', 'secrets', '-n', 'kube-system', '--field-selector', 'type=juju.is/token-auth', '-o', 'json']
2021-08-05 13:47:09 WARNING unit.kubernetes-master/0.update-status logger.go:60 The connection to the server 192.168.33.170:6443 was refused - did you specify the right host or port...


Changed in charm-kubernetes-master:
assignee: nobody → Gabriel Angelo Sgarbi Cocenza (gabrielcocenza)
status: Triaged → In Progress
Revision history for this message
Gabriel Cocenza (gabrielcocenza) wrote :

I was able to reproduce the bug in a test environment and the logs [1] show that pacemaker and corosync are not able to recover the failed service (kube-scheduler in my case). Rebooting the unit or restarting the service seems to solve the issue, but that defeats the purpose of the unit healing itself from an eventual failure.

Talking with the OpenStack team, the template [2] added to always restart the systemd services might be causing this conflict. In an environment where pacemaker and corosync are not present, this template fulfills the need to restart a service if it crashes. On the other hand, in an environment where hacluster is present and running, I think it makes sense for the resources to be managed only by pacemaker.

The approach I'm thinking of implementing is to remove always-restart.conf while configuring the cluster when HA is connected.

[1] https://pastebin.canonical.com/p/p9T2tCdsNn/
[2] https://github.com/charmed-kubernetes/charm-kubernetes-master/blob/master/reactive/kubernetes_master.py#L958-L982
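
For illustration, a minimal sketch of that approach in reactive-charm style. The drop-in path, service names, and flag name below are assumptions for the example, not the charm's actual code:

import os
import subprocess

from charms.reactive import when, set_flag

# Hypothetical service list; the charm's real unit names may differ
# (e.g. snap-based unit names).
MASTER_SERVICES = ["kube-apiserver", "kube-controller-manager", "kube-scheduler"]

@when("ha.connected")
def remove_always_restart_overrides():
    # Once hacluster/pacemaker owns the services, drop the systemd
    # "Restart=always" drop-ins so the two restart mechanisms don't conflict.
    removed = False
    for svc in MASTER_SERVICES:
        dropin = "/etc/systemd/system/{}.service.d/always-restart.conf".format(svc)
        if os.path.exists(dropin):
            os.remove(dropin)
            removed = True
    if removed:
        # Reload systemd so the removed overrides take effect.
        subprocess.check_call(["systemctl", "daemon-reload"])
    set_flag("kubernetes-master.ha.always-restart-removed")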

Revision history for this message
Gabriel Cocenza (gabrielcocenza) wrote :

Updating what I've been doing lately with this bug...

After removing the template responsible for always restarting the systemd services, I saw that the issue persisted. The documentation [1] says that the services should be disabled so that they are controlled only by the cluster. After doing this, the problem still persisted, and I opened a bug against the pacemaker project [2], which wasn't very helpful.

Reading the logs, it's possible to see that the services fail to start a couple of times before becoming stable and running. Pacemaker by default considers a start failure fatal, and that is why it doesn't try to restart the service again. [3] [4]

Knowing that the services eventually become stable and keep running when systemd is responsible for them, my new approach is to wait until all services are running on all nodes before adding them as resources to the cluster. For that I will use the kube-masters interface [5], trigger this logic only in the presence of hacluster (flag ha.connected), and keep the best practices of disabling the services and removing the always-restart template.

[1] https://clusterlabs.org/pacemaker/doc/deprecated/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_systemd.html

[2] https://bugs.clusterlabs.org/show_bug.cgi?id=5487
[3] https://clusterlabs.org/pacemaker/doc/deprecated/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_failure_response.html
[4] https://clusterlabs.org/pacemaker/doc/deprecated/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-cluster-options.html
[5] https://github.com/charmed-kubernetes/interface-kube-masters/blob/master/peers.py
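
For illustration, the two cluster-side changes described above could look roughly like the following (service names are illustrative, and in practice the charm drives hacluster through relations rather than calling crmsh directly):

import subprocess

MASTER_SERVICES = ["kube-apiserver", "kube-controller-manager", "kube-scheduler"]

# 1. Disable the systemd units so only the cluster starts and stops them,
#    as the Pacemaker documentation recommends for systemd-class resources.
for svc in MASTER_SERVICES:
    subprocess.check_call(["systemctl", "disable", svc])

# 2. Make a failed start non-fatal so pacemaker keeps retrying the resource
#    instead of giving up after the first failed start (the default is fatal).
subprocess.check_call(
    ["crm", "configure", "property", "start-failure-is-fatal=false"]
)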

Revision history for this message
Gabriel Cocenza (gabrielcocenza) wrote :

I think I got to the root of the problem regarding the systemd services and Pacemaker. When those services run independently of Pacemaker, the always-restart file does a great job of keeping them alive. OTOH, Pacemaker needs more configuration to run as expected. The first thing I noticed is that if the `start-failure-is-fatal` option is not set to false, Pacemaker gives up trying to restart the service, and that is why you could hit the problem described in this bug. The second thing is that apiserver should be running before controller-manager is started on the same node.

Right now pacemaker starts apiserver on one node and tries to start controller-manager on another node that does not have apiserver.

Those constraints can be expressed using colocation and order, but I found out that charm-hacluster doesn't wait for all configuration to settle before it starts allocating the services to the nodes. I've opened a bug [1] for this.

Another issue I found is that right now charm-hacluster doesn't group the VIP with the other resources. This means that if a service starts to fail, Pacemaker won't try to move the VIP to another node. I've also opened a bug for this [2].

The approach now will be to remove the master services from Pacemaker and leave them to systemd. At every hook execution the charm will check the health of those services (already implemented) and, in case of a failed service, force migration of the VIP to a node where the services are healthy.

[1] https://bugs.launchpad.net/charm-hacluster/+bug/1952492
[2] https://bugs.launchpad.net/charm-hacluster/+bug/1952753
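
A rough sketch of that last approach, assuming charmhelpers for the service check and a hacluster-managed VIP resource. The resource id and the direct use of crmsh here are illustrative only:

import subprocess

from charmhelpers.core.host import service_running

MASTER_SERVICES = ["kube-apiserver", "kube-controller-manager", "kube-scheduler"]
VIP_RESOURCE = "res_kubernetes_master_vip"  # hypothetical pacemaker resource id

def ensure_vip_on_healthy_node():
    # Called from every hook: if any control-plane service is down on this
    # node, ask pacemaker to move the VIP away so it lands on a healthy peer.
    unhealthy = [svc for svc in MASTER_SERVICES if not service_running(svc)]
    if not unhealthy:
        return
    subprocess.check_call(["crm", "resource", "move", VIP_RESOURCE])
    # A real implementation would also clear the move constraint
    # ("crm resource unmigrate") once the local services recover.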

Revision history for this message
Gabriel Cocenza (gabrielcocenza) wrote :
tags: added: review-needed
George Kraft (cynerva)
Changed in charm-kubernetes-master:
status: In Progress → Fix Committed
milestone: none → 1.23+ck1
tags: removed: review-needed
Changed in charm-kubernetes-master:
milestone: 1.23+ck1 → 1.24
Changed in charm-kubernetes-master:
status: Fix Committed → Fix Released