hook failures due to API connectivity should not put charm in error state

Bug #1884332 reported by John George
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Canonical Juju
Triaged
High
Unassigned

Bug Description

With 2.7.7+2.7-5323b9 the apache2 charm gets an install hook error on vSphere

The bundle, crashdump and other test artifacts are available through the link at the bottom of the test run page:
https://solutions.qa.canonical.com/#/qa/testRun/2825b321-6320-404c-ab98-bf3776851b21

2020-06-19 20:49:32 DEBUG worker.uniter.jujuc server.go:204 running hook tool "config-get"
2020-06-19 20:49:32 DEBUG install ERROR connection is shut down
2020-06-19 20:49:32 DEBUG install Traceback (most recent call last):
2020-06-19 20:49:32 DEBUG install File "hooks/hooks.py", line 1093, in <module>
2020-06-19 20:49:32 DEBUG install main(hook_name)
2020-06-19 20:49:32 DEBUG install File "hooks/hooks.py", line 1045, in main
2020-06-19 20:49:32 DEBUG install install_hook()
2020-06-19 20:49:32 DEBUG install File "hooks/hooks.py", line 294, in install_hook
2020-06-19 20:49:32 DEBUG install apt_source = config_get('apt-source') or ''
2020-06-19 20:49:32 DEBUG install File "hooks/hooks.py", line 285, in config_get
2020-06-19 20:49:32 DEBUG install result = orig_config_get(scope)
2020-06-19 20:49:32 DEBUG install File "/var/lib/juju/agents/unit-apache2-0/charm/hooks/charmhelpers/core/hookenv.py", line 436, in config
2020-06-19 20:49:32 DEBUG install subprocess.check_output(config_cmd_line).decode('UTF-8'))
2020-06-19 20:49:32 DEBUG install File "/usr/lib/python2.7/subprocess.py", line 223, in check_output
2020-06-19 20:49:32 DEBUG install raise CalledProcessError(retcode, cmd, output=output)
2020-06-19 20:49:32 DEBUG install subprocess.CalledProcessError: Command '['config-get', '--all', '--format=json']' returned non-zero exit status 1

Revision history for this message
Michael Skalka (mskalka) wrote :
Download full text (3.6 KiB)

Seeing this bug again here: https://solutions.qa.canonical.com/#/qa/testRun/beec9b26-8e52-40f4-8eeb-a0bbf22c0371

e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG worker.uniter.jujuc server.go:204 running hook tool "config-get"
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 ERROR connection is shut down
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 Traceback (most recent call last):
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 File "hooks/hooks.py", line 1093, in <module>
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 main(hook_name)
35c20d9c-69ba-4c05-8152-70a77f6a60ce: machine-0 2020-06-24 03:10:57 DEBUG juju.apiserver request_notifier.go:178 <- [69] unit-apache2-0 {"request-id":28,"type":"NotifyWatcher","version":1,"id":"5","request":"Next","params":"'params redacted'"}
35c20d9c-69ba-4c05-8152-70a77f6a60ce: machine-0 2020-06-24 03:10:57 DEBUG juju.apiserver request_notifier.go:195 -> [69] unit-apache2-0 2.317674ms {"request-id":27,"response":"'body redacted'"} ProxyUpdater[""].ProxyConfig
35c20d9c-69ba-4c05-8152-70a77f6a60ce: machine-0 2020-06-24 03:10:57 DEBUG juju.apiserver request_notifier.go:195 -> [69] unit-apache2-0 17.674085ms {"request-id":16,"response":"'body redacted'"} MeterStatus[""].WatchMeterStatus
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 File "hooks/hooks.py", line 1045, in main
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 install_hook()
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 File "hooks/hooks.py", line 294, in install_hook
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 apt_source = config_get('apt-source') or ''
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 File "hooks/hooks.py", line 285, in config_get
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 result = orig_config_get(scope)
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 File "/var/lib/juju/agents/unit-apache2-0/charm/hooks/charmhelpers/core/hookenv.py", line 436, in config
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 subprocess.check_output(config_cmd_line).decode('UTF-8'))
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 File "/usr/lib/python2.7/subprocess.py", line 223, in check_output
e24a18bc-f64b-473e-854d-6bb19b6c00b9: unit-apache2-0 2020-06-24 03:10:54 DEBUG unit.apache2/0.install runner.go:387 r...

Read more...

Revision history for this message
Michael Skalka (mskalka) wrote :
Tim Penhey (thumper)
Changed in juju:
status: New → Invalid
Revision history for this message
Tim Penhey (thumper) wrote :

This isn't a bug, but expected behaviour of Juju.

The charm wasn't able to execute the command because the API connection was dropped.
The API connection was dropped because the API server was restarting.
The API server was restarting because the state worker was restarting.
The state worker was restarting because mongo failed to respond to a ping.

This is how juju responds to errors in the underlying substrate. This is all expected behaviour, and not a bug.

When everything comes back on line, the uniter will rerun the hooks and they should continue like normal.

The root cause is mongo not getting the time it needs to run on the machine.

Revision history for this message
Alexander Balderson (asbalderson) wrote :

This check should be handled by the juju worker internally, with a retry + back off setup, so the hook doesnt fail in this case, but instead tries while mongo restarts. otherwise every charm would have to have a retry written around these commands.

Changed in juju:
status: Invalid → New
Tim Penhey (thumper)
summary: - 2.7.7+2.7-5323b92 worker.uniter.jujuc config-get ERROR connection is
- shut down
+ hook failures due to API connectivity should not put charm in error
+ state
Changed in juju:
status: New → Triaged
importance: Undecided → High
milestone: none → 2.8-next
Michael Skalka (mskalka)
tags: removed: cdo-release-blocker
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.8-next → 2.9.1
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.9.1 → 2.9.2
Changed in juju:
milestone: 2.9.2 → 2.9.3
Changed in juju:
milestone: 2.9.3 → 2.9.4
Changed in juju:
milestone: 2.9.4 → 2.9.5
John A Meinel (jameinel)
Changed in juju:
milestone: 2.9.5 → 2.9-next
Revision history for this message
Harry Pidcock (hpidcock) wrote :

We should add some retry logic to the uniter where safe to do so. With the move away from mongo, this should also get better.

Changed in juju:
milestone: 2.9-next → 3.1-beta1
Revision history for this message
Vitaly Antonenko (anvial) wrote :

Looks like we don't plan to work under this for 3.1 release

Changed in juju:
milestone: 3.1-beta1 → none
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.