Comment 2 for bug 1948906

DUFOUR Olivier (odufourc) wrote :

From my experience over the past few days, this can also happen in the pre-series-upgrade hook, with very similar behavior: the unit gets stuck in a loop there too, for various reasons.

Let me add another reproducer as well.

The environment is as follows:
MAAS: 3.1
Juju: 2.9.31
OpenStack: bionic-ussuri

The goal is to upgrade from bionic-ussuri to focal-ussuri.

As per the following documents:
https://juju.is/docs/olm/upgrade-a-machines-series
https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/upgrade-series.html
The following steps are executed:
juju upgrade-series <machine-id> prepare focal
(manual steps below including do-release-upgrade)
juju upgrade-series <machine-id> complete

If the pre-series-upgrade hook (in the “upgrade-series prepare” step) or the post-series-upgrade hook (in the “upgrade-series complete” step) fails, there is no way to recover the unit from the “blocked” status, and an infinite loop occurs[1][2]. The unit stays stuck in that state even after the underlying errors in the OS layer, such as APT package errors, have been resolved.

How to reproduce:
deploy aodh on bionic by Juju with the bundle attached to this ticket
* juju deploy upgrade-issue-bundle.yaml
do upgrade-series prepare
* juju upgrade-series 0/lxd/0 prepare focal
do release-upgrade (optional for testing purposes)
* juju run --machine 0/lxd/0 --timeout=60m \
  sudo DEBIAN_FRONTEND=noninteractive \
  do-release-upgrade -f DistUpgradeViewNonInteractive
break apt on purpose
* sudo ln -s /bin/false /usr/local/bin/apt-get
do upgrade-series complete
* juju upgrade-series 0/lxd/0 complete
fix apt
* sudo rm /usr/local/bin/apt-get
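
For convenience, the steps above can be collected into one script. This is only a rough sketch: the function names (step, reproduce) and the DRY_RUN, MACHINE and BUNDLE variables are mine, and running the apt-get symlink commands through "juju ssh" is an assumption about where they are executed. By default it only prints the commands; a real run would also need to wait for the model to settle between steps, which the sketch does not do.

```shell
#!/bin/sh
# Sketch of the reproducer above. DRY_RUN=1 (the default) only prints commands.
MACHINE="${MACHINE:-0/lxd/0}"
BUNDLE="${BUNDLE:-upgrade-issue-bundle.yaml}"
DRY_RUN="${DRY_RUN:-1}"

# step: print the command in dry-run mode, otherwise execute it.
step() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    step juju deploy "$BUNDLE"
    step juju upgrade-series "$MACHINE" prepare focal
    # Optional for testing: the actual release upgrade.
    step juju run --machine "$MACHINE" --timeout=60m \
        sudo DEBIAN_FRONTEND=noninteractive \
        do-release-upgrade -f DistUpgradeViewNonInteractive
    # Assumption: apt is broken and fixed on the machine via "juju ssh".
    step juju ssh "$MACHINE" sudo ln -s /bin/false /usr/local/bin/apt-get
    step juju upgrade-series "$MACHINE" complete
    step juju ssh "$MACHINE" sudo rm /usr/local/bin/apt-get
}

reproduce
```

Running it with DRY_RUN=0 executes the commands instead of printing them.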

After that, there is no way to complete or rerun the “complete” step, since the unit is stuck in the blocked state, and running the same command again errors out:

$ juju upgrade-series 0/lxd/0 complete
-> ERROR machine "0/lxd/0" can not complete, it is either not prepared or already completed

[1] in upgrade-series prepare
2022-06-13 07:06:10 ERROR juju.worker.dependency engine.go:693 "uniter" manifold worker returned unexpected error: executing operation "run pre-series-upgrade hook" for placement/5: upgrade series status "prepare running"
2022-06-13 07:08:21 ERROR juju.worker.uniter.operation runhook.go:194 error updating workload status before pre-series-upgrade hook: upgrade series status "prepare running"
2022-06-13 07:08:21 ERROR juju.worker.uniter agent.go:31 resolver loop error: executing operation "run pre-series-upgrade hook" for placement/5: upgrade series status "prepare running"
2022-06-13 07:08:21 ERROR juju.worker.dependency engine.go:693 "uniter" manifold worker returned unexpected error: executing operation "run pre-series-upgrade hook" for placement/5: upgrade series status "prepare running"

[2] in upgrade-series complete
2022-06-13 07:42:13 ERROR juju.worker.dependency engine.go:693 "uniter" manifold worker returned unexpected error: executing operation "run post-series-upgrade hook" for placement/4: upgrade series status "complete running"
2022-06-13 07:44:10 ERROR juju.worker.uniter.operation runhook.go:194 error updating workload status before post-series-upgrade hook: upgrade series status "complete running"
2022-06-13 07:44:10 ERROR juju.worker.uniter agent.go:31 resolver loop error: executing operation "run post-series-upgrade hook" for placement/4: upgrade series status "complete running"
2022-06-13 07:44:10 ERROR juju.worker.dependency engine.go:693 "uniter" manifold worker returned unexpected error: executing operation "run post-series-upgrade hook" for placement/4: upgrade series status "complete running"
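
To confirm that a unit really is looping, one option is to count how often the "resolver loop error" line shows up per hook in the unit log. A small sketch follows; count_loop_errors is a helper name I made up, and the log path in the usage comment assumes the usual /var/log/juju/unit-<app>-<n>.log location, which may differ on your machines.

```shell
# Count "resolver loop error" lines per series-upgrade hook in a unit log.
# Usage (path is an assumption):
#   count_loop_errors /var/log/juju/unit-placement-5.log
count_loop_errors() {
    grep 'resolver loop error' "$1" \
        | grep -o 'run [a-z-]*-series-upgrade hook' \
        | sort | uniq -c
}
```

A count that keeps growing between two runs a few minutes apart suggests the uniter is still restarting on the same hook.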