test_status_messages fails waiting for one system pod even though all pods are up and healthy

Bug #1928354 reported by Michael Skalka
Affects: Charmed Kubernetes Testing
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

In test run [0] (crashdump [1]), the test_status_messages test fails while waiting for one system pod to come up:

...
===Flaky Test Report===

test_status_messages failed (4 runs remaining out of 5).
 <class 'AssertionError'>
 assert 'Waiting for ... pod to start' == 'Kubernetes master running.'
  - Kubernetes master running.
  + Waiting for 1 kube-system pod to start
 [<TracebackEntry /home/ubuntu/k8s-validation/jobs/integration/validation.py:257>]
...

However, according to one of the k8s-master units, those pods are alive:

kubernetes-master/0* waiting idle 12 52.5.182.211 6443/tcp Waiting for 1 kube-system pod to start
...
kubernetes-master/1 active idle 13 3.80.166.226 6443/tcp Kubernetes master running.
...

K8s master-1 has the right info:

2021-05-13 06:59:25 INFO juju-log Checking system pods status: calico-kube-controllers-54cf4fcc4d-stkr5=Running, coredns-6f867cd986-7zm99=Running, kube-state-metrics-7799879d89-7hzmk=Running, metrics-server-v0.3.6-c68f9b948-kwxz4=Running

whereas k8s-master-0 thinks one pod is still pending:

2021-05-13 06:59:37 INFO juju-log Checking system pods status: calico-kube-controllers-54cf4fcc4d-stkr5=Running, coredns-6f867cd986-7zm99=Running, kube-state-metrics-7799879d89-7hzmk=Running, metrics-server-v0.3.6-c68f9b948-kwxz4=Running, metrics-server-v0.3.6-f6cf867b4-xzjxd=Pending

In the pod-logs there are no failures, so this looks like k8s-master-0 just has stale data.
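To make the comparison between the two units easier to repeat, here is a minimal sketch that parses the two "Checking system pods status" log lines quoted above and reports only the pods on which the units disagree. The helper names (parse_pod_status, diff_pod_views) are hypothetical and not part of the charm or the validation suite; the sketch only assumes the comma-separated name=phase format seen in these logs.

```python
LOG_MARKER = "Checking system pods status: "


def parse_pod_status(log_line):
    """Return {pod_name: phase} parsed from one juju-log status line."""
    # Everything after the marker is a comma-separated list of name=phase.
    _, _, payload = log_line.partition(LOG_MARKER)
    pods = {}
    for item in payload.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, phase = item.rpartition("=")
        pods[name] = phase
    return pods


def diff_pod_views(view_a, view_b):
    """Return {pod_name: (phase_in_a, phase_in_b)} for pods that differ.

    A phase of None means that unit did not report the pod at all.
    """
    names = sorted(set(view_a) | set(view_b))
    return {
        name: (view_a.get(name), view_b.get(name))
        for name in names
        if view_a.get(name) != view_b.get(name)
    }


# Log lines copied from the report (k8s-master-1, then k8s-master-0).
master_1_line = (
    "2021-05-13 06:59:25 INFO juju-log Checking system pods status: "
    "calico-kube-controllers-54cf4fcc4d-stkr5=Running, "
    "coredns-6f867cd986-7zm99=Running, "
    "kube-state-metrics-7799879d89-7hzmk=Running, "
    "metrics-server-v0.3.6-c68f9b948-kwxz4=Running"
)
master_0_line = (
    "2021-05-13 06:59:37 INFO juju-log Checking system pods status: "
    "calico-kube-controllers-54cf4fcc4d-stkr5=Running, "
    "coredns-6f867cd986-7zm99=Running, "
    "kube-state-metrics-7799879d89-7hzmk=Running, "
    "metrics-server-v0.3.6-c68f9b948-kwxz4=Running, "
    "metrics-server-v0.3.6-f6cf867b4-xzjxd=Pending"
)

differences = diff_pod_views(
    parse_pod_status(master_1_line), parse_pod_status(master_0_line)
)
for pod, (on_master_1, on_master_0) in differences.items():
    print(f"{pod}: master-1={on_master_1} master-0={on_master_0}")
```

For these two lines the only difference is metrics-server-v0.3.6-f6cf867b4-xzjxd, which k8s-master-0 reports as Pending and k8s-master-1 does not list at all. Its name carries a different hash (f6cf867b4) from the Running metrics-server pod (c68f9b948), which may be worth checking when deciding whether the data is actually stale.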

0. https://solutions.qa.canonical.com/testruns/testRun/d45690eb-24c7-4e8c-b598-8fb64583303e
1. https://oil-jenkins.canonical.com/artifacts/d45690eb-24c7-4e8c-b598-8fb64583303e/generated/generated/kubernetes/juju-crashdump-kubernetes-2021-05-13-07.00.31.tar.gz

George Kraft (cynerva) wrote :

Duplicate of LP:1923041

Konstantinos Kaskavelis (kaskavel) wrote :

Closing this due to inactivity (low number of occurrences, and no hit for more than one year)

Changed in charmed-kubernetes-testing:
status: New → Invalid
tags: added: solutions-qa-expired