qa - test_status_messages failing waiting for services to start

Bug #1879545 reported by Adam Stokes
Affects: Kubernetes Control Plane Charm
Status: Fix Released
Importance: High
Assigned to: Kevin W Monroe

Bug Description

Test report: https://jenkaas.s3.amazonaws.com/41f14796-6d90-4c06-8a7b-f56094d80b57/index.html
Crashdump: http://jenkaas.s3-website-us-east-1.amazonaws.com/41f14796-6d90-4c06-8a7b-f56094d80b57/artifacts.tar.gz

Test output:
Traceback (most recent call last):
  File "/var/lib/jenkins/slaves/jenkins-slave-3/workspace/validate-ck/arch/amd64/channel/edge/node/runner-validate/series/bionic/snap_version/1.18/edge/jobs/integration/validation.py", line 229, in test_status_messages
    assert unit.workload_status_message == message
AssertionError: assert 'Stopped serv...er,kube-proxy' == 'Kubernetes master running.'
  - Kubernetes master running.
  + Stopped services: kube-controller-manager,kube-proxy

Changed in charmed-kubernetes-testing:
assignee: nobody → Kevin W Monroe (kwmonroe)
importance: Undecided → High
status: New → In Progress
status: In Progress → Confirmed
Changed in charm-kubernetes-master:
status: New → In Progress
importance: Undecided → High
Changed in charmed-kubernetes-testing:
importance: High → Undecided
assignee: Kevin W Monroe (kwmonroe) → nobody
Changed in charm-kubernetes-master:
assignee: nobody → Kevin W Monroe (kwmonroe)
Kevin W Monroe (kwmonroe) wrote :

This came in with the fix for bug 1841226. We no longer generate basic_auth.csv on install, so k8s-master followers will never make it to 'active' status; instead they loop on:

-----
unit-kubernetes-master-1: 09:36:57 INFO unit.kubernetes-master/1.juju-log Invoking reactive handler: reactive/apt.py:49:ensure_package_status
unit-kubernetes-master-1: 09:36:57 INFO unit.kubernetes-master/1.juju-log Invoking reactive handler: reactive/kubernetes_master.py:473:safely_join_cohort
unit-kubernetes-master-1: 09:36:57 INFO unit.kubernetes-master/1.juju-log Invoking reactive handler: reactive/kubernetes_master.py:658:setup_non_leader_authentication
unit-kubernetes-master-1: 09:36:57 INFO unit.kubernetes-master/1.juju-log Missing content for file /root/cdk/basic_auth.csv
-----

The fix for this will be to make 'basic_auth.csv' optional. If it's there, followers should keep it in sync with the leader. If it's not there, that shouldn't be a blocker.
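
For illustration only, here is a rough sketch of the "optional file" behaviour described above. It is not the charm's actual code: the function name sync_from_leader, its parameters, and the way leader content is passed in are all assumptions made for this example.

```python
from pathlib import Path
from typing import Optional


def sync_from_leader(content: Optional[str], target: Path, required: bool = True) -> bool:
    """Mirror leader-provided content into `target` (hypothetical helper).

    Returns True when the follower may proceed, False when it must keep waiting.
    """
    if not content:
        # Nothing from the leader. A required file blocks the follower
        # (the looping behaviour in the log above); an optional one does not.
        return not required
    # Only rewrite the file when it is missing or out of sync with the leader.
    if not target.exists() or target.read_text() != content:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
    return True
```

With required=False, a follower that receives no basic_auth.csv content would simply carry on instead of looping on "Missing content for file".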

George Kraft (cynerva)
no longer affects: charmed-kubernetes-testing
Kevin W Monroe (kwmonroe) wrote :
tags: added: review-needed
Kevin W Monroe (kwmonroe) wrote :

I tweaked the proposed solution from comment 1. We don't make basic_auth.csv optional, but rather ensure it's a stubbed-out file. I went this route because I thought it would be nicer if cluster operators could see that the file was still there, with a clear message that it shouldn't be used for auth in 1.19+. Making it optional could have led to ops generating their own basic_auth.csv and wondering why those entries had no effect.
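
As a rough sketch of the stub approach (again not the charm's actual code), something like the following would guarantee the file always exists with a notice in it. The helper name ensure_stub_file and the notice wording are hypothetical. This sketch overwrites whatever is in the file; whether the real fix preserves existing entries isn't stated here.

```python
from pathlib import Path

# Hypothetical wording; the real stub message in the charm may differ.
STUB_NOTICE = "# basic auth is not used in 1.19+; entries in this file have no effect\n"


def ensure_stub_file(target: Path, notice: str = STUB_NOTICE) -> bool:
    """Make sure `target` holds exactly `notice` (hypothetical helper).

    Returns True if the file was written, False if it was already in place.
    """
    if target.exists() and target.read_text() == notice:
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(notice)
    return True
```

Because the stub is always present, a follower never sees missing content for the file, and an operator who opens it gets the notice rather than an empty or absent file.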

tags: removed: review-needed
Changed in charm-kubernetes-master:
status: In Progress → Fix Committed
Changed in charm-kubernetes-master:
milestone: none → 1.19
Changed in charm-kubernetes-master:
status: Fix Committed → Fix Released