During the deploy of a bundle with 3 kubernetes-master units, one unit goes into error state when trying to execute the upgrade-charm hook

Bug #1836063 reported by Giuseppe Petralia
This bug affects 1 person
Affects                          Status         Importance   Assigned to
Canonical Juju                   Incomplete     Undecided    Unassigned
Kubernetes Control Plane Charm   Fix Released   Medium       Robert Gildein

Bug Description

kubernetes-master revision: 700

juju version 2.6.5 xenial

When deploying a bundle with 3 kubernetes-master units, the installation of one of the three units fails with the following error:

2019-07-10 12:08:41 DEBUG upgrade-charm Traceback (most recent call last):
2019-07-10 12:08:41 DEBUG upgrade-charm File "/var/lib/juju/agents/unit-kubernetes-master-2/charm/hooks/upgrade-charm", line 22, in <module>
2019-07-10 12:08:41 DEBUG upgrade-charm main()
2019-07-10 12:08:41 DEBUG upgrade-charm File "/var/lib/juju/agents/unit-kubernetes-master-2/.venv/lib/python3.5/site-packages/charms/reactive/__init__.py", line 73, in main
2019-07-10 12:08:41 DEBUG upgrade-charm bus.dispatch(restricted=restricted_mode)
2019-07-10 12:08:41 DEBUG upgrade-charm File "/var/lib/juju/agents/unit-kubernetes-master-2/.venv/lib/python3.5/site-packages/charms/reactive/bus.py", line 379, in dispatch
2019-07-10 12:08:41 DEBUG upgrade-charm _invoke(hook_handlers)
2019-07-10 12:08:41 DEBUG upgrade-charm File "/var/lib/juju/agents/unit-kubernetes-master-2/.venv/lib/python3.5/site-packages/charms/reactive/bus.py", line 359, in _invoke
2019-07-10 12:08:41 DEBUG upgrade-charm handler.invoke()
2019-07-10 12:08:41 DEBUG upgrade-charm File "/var/lib/juju/agents/unit-kubernetes-master-2/.venv/lib/python3.5/site-packages/charms/reactive/bus.py", line 181, in invoke
2019-07-10 12:08:41 DEBUG upgrade-charm self._action(*args)
2019-07-10 12:08:41 DEBUG upgrade-charm File "/var/lib/juju/agents/unit-kubernetes-master-2/charm/reactive/kubernetes_master.py", line 169, in check_for_upgrade_needed
2019-07-10 12:08:41 DEBUG upgrade-charm update_certificates()
2019-07-10 12:08:41 DEBUG upgrade-charm File "/var/lib/juju/agents/unit-kubernetes-master-2/charm/reactive/kubernetes_master.py", line 915, in update_certificates
2019-07-10 12:08:41 DEBUG upgrade-charm send_data()
2019-07-10 12:08:41 DEBUG upgrade-charm File "/var/lib/juju/agents/unit-kubernetes-master-2/charm/reactive/kubernetes_master.py", line 855, in send_data
2019-07-10 12:08:41 DEBUG upgrade-charm ingress_ip = get_ingress_address(kube_api_endpoint.endpoint_name)
2019-07-10 12:08:41 DEBUG upgrade-charm AttributeError: 'NoneType' object has no attribute 'endpoint_name'
2019-07-10 12:08:41 ERROR juju.worker.uniter.operation runhook.go:132 hook "upgrade-charm" failed: exit status 1

The failing unit is trying to execute the upgrade-charm hook, which is never executed on the other two units.

https://paste.ubuntu.com/p/36bywpk5W7/

Pedro Guimarães (pguimaraes) wrote :

Adding juju to get their input.

We have the following situation: one of the kubernetes-master units fails with "upgrade-charm hook failed". That hook is called right after install on that unit.
None of the other units run that hook.

Looking deeper with debug-hooks, we found that .unit-state.db never gets populated on this unit, whereas the other 2 units get all their relation info into the kv table. That causes this specific unit to fail.
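
For anyone repeating the check described above, here is a minimal sketch of listing what is stored in a unit's kv table. It assumes .unit-state.db is a SQLite file whose kv table has `key` and `data` text columns. That schema, the helper name `kv_keys`, and the sample key are assumptions for illustration and are not taken from this report. The demo uses in-memory databases in place of the real file.

```python
import sqlite3


def kv_keys(conn):
    """Return the keys stored in the kv table, or [] if the table is absent."""
    has_table = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'kv'"
    ).fetchone()
    if not has_table:
        return []
    return [row[0] for row in conn.execute("SELECT key FROM kv ORDER BY key")]


# Stand-in for a healthy unit: kv table present with one (hypothetical) key.
populated = sqlite3.connect(":memory:")
populated.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, data TEXT)")
populated.execute("INSERT INTO kv VALUES (?, ?)", ("example.key", "true"))

# Stand-in for the failing unit: the database was never populated.
empty = sqlite3.connect(":memory:")

print(kv_keys(populated))
print(kv_keys(empty))
```

On a real unit the connection would instead be opened on the unit's .unit-state.db file, so the output of the failing unit can be compared with that of a healthy one.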

George Kraft (cynerva) wrote :

Looks like a bug in the kubernetes-master charm. It happens when an upgrade-charm hook runs before a kube-api-endpoint relation is established.

The upgrade-charm handler calls update_certificates()[1], which calls send_data()[2], which assumes kube-api-endpoint is available[3]. Either this code path needs to be eliminated, or there needs to be a check somewhere along the line that kube-api-endpoint is actually available.

It is weird that Juju runs an upgrade-charm hook on initial deployment. But the charm should be able to handle it.

[1]: https://github.com/charmed-kubernetes/charm-kubernetes-master/blob/aa7065651343626bac2ca16bcf849f6e9e051ab5/reactive/kubernetes_master.py#L169
[2]: https://github.com/charmed-kubernetes/charm-kubernetes-master/blob/aa7065651343626bac2ca16bcf849f6e9e051ab5/reactive/kubernetes_master.py#L915
[3]: https://github.com/charmed-kubernetes/charm-kubernetes-master/blob/aa7065651343626bac2ca16bcf849f6e9e051ab5/reactive/kubernetes_master.py#L846
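
The kind of availability check suggested in the comment above can be sketched as follows. This is a simplified illustration, not the fix that was released for the charm. The names `send_data`, `get_ingress_address`, and `kube_api_endpoint` come from the traceback, while `FakeEndpoint`, the stand-in body of `get_ingress_address`, the sample address, and the choice to return `None` are assumptions made so the example runs on its own.

```python
class FakeEndpoint:
    """Hypothetical stand-in for the kube-api-endpoint relation object."""

    def __init__(self, endpoint_name):
        self.endpoint_name = endpoint_name


def get_ingress_address(endpoint_name):
    # Hypothetical stand-in: the real charm looks up the unit's ingress
    # address for the named endpoint. A fixed sample value is returned here.
    return "10.0.0.1"


def send_data(kube_api_endpoint):
    """Return the ingress address, or None if the relation is not there yet."""
    if kube_api_endpoint is None:
        # The relation has not been established, which is the state the
        # failing unit was in when upgrade-charm ran. Skip instead of raising
        # AttributeError on kube_api_endpoint.endpoint_name.
        return None
    return get_ingress_address(kube_api_endpoint.endpoint_name)


print(send_data(None))
print(send_data(FakeEndpoint("kube-api-endpoint")))
```

With a guard like this, a hook that runs before the relation exists skips the step, and a later hook can retry once the relation data is available.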

Changed in charm-kubernetes-master:
status: New → Confirmed
Joseph Phillips (manadart) wrote :

Can we get the full log for the unit agent?

Giuseppe Petralia (peppepetra) wrote :

unit log

Joseph Phillips (manadart) wrote :

Despite the fact that the charm could be made resilient to this issue, the fact that a charm upgrade is queued right after installation looks like it could be a race that was addressed in the 2.6.6 release.

Can you confirm whether this bundle is OK under 2.6.6?

Changed in juju:
status: New → Incomplete
George Kraft (cynerva)
Changed in charm-kubernetes-master:
importance: Undecided → Medium
status: Confirmed → Triaged
Changed in charm-kubernetes-master:
status: Triaged → In Progress
assignee: nobody → Robert Gildein (rgildein)
Changed in charm-kubernetes-master:
milestone: none → 1.20+ck1
milestone: 1.20+ck1 → none
Robert Gildein (rgildein) wrote :
tags: added: review-needed
Cory Johns (johnsca)
Changed in charm-kubernetes-master:
status: In Progress → Fix Committed
milestone: none → 1.20+ck1
tags: removed: review-needed
George Kraft (cynerva) wrote :
Changed in charm-kubernetes-master:
status: Fix Committed → Fix Released