refresh doesn't warn about hook failures

Bug #1915224 reported by Jeff Pihach
This bug affects 4 people
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Canonical Juju | Triaged | Low | Unassigned | |

Bug Description

A local charm deploy failed in the install hook so after fixing the charm I ran:

```
juju refresh foo --path ./foo.charm
Added charm "local:focal/foo-2" to the model.
```

The following `juju status` showed that the charm had been upgraded but the install hook was still in a failed state and the error in `debug-log` showed the same error as before.

```
App  Version  Status  Scale  Charm  Store  Rev  OS      Message
foo           error       1  foo    local    2  ubuntu  hook failed: "install"
```

It's my understanding then that the charm wasn't actually upgraded even though the status indicated as such. So which charm is actually running here?

I'd have expected a warning when refreshing that there was an issue AND how to start resolving it, something like:

```
juju refresh foo --path ./foo.charm
Application foo has an install hook failure, try running `juju resolved install` to refresh
```
or

```
juju refresh foo --path ./foo.charm
Application foo has an install hook failure.
If you'd like to ignore this error, run
  juju refresh foo --path ./foo.charm --force-units
```

Harry Pidcock (hpidcock) wrote :

Curious, was this a k8s charm?

Jeff Pihach (hatch) wrote :

Nope just a basic focal one from `charmcraft init`

Ian Booth (wallyworld) wrote :

The charm does appear to be upgraded because the bug description has juju status showing that the Revision is indeed 2.

Can you check to see what juju show-status-log on the affected unit prints out? To see what hooks, if any, were run.

John A Meinel (jameinel) wrote :

The issue (as I understand it) is that 'juju status' reports the *desired* version, and you have to look at "juju status --format=yaml" or "juju show-unit" to find out that a given unit has failed to upgrade.
It doesn't help that by default 'juju upgrade-charm' doesn't actually upgrade a charm that is in an error state; you have to use 'juju upgrade-charm --force-units'. Otherwise it keeps trying to run the hook that is currently failing, and only once that "succeeds" will it move on to unpacking the new charm and running the upgrade-charm hook.
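
The ordering described above can be modelled roughly as follows. This is a simplified sketch of the behaviour as stated in this comment, not Juju's actual agent code; `Unit`, `next_action`, and the returned strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Unit:
    name: str
    running_rev: int            # charm revision actually unpacked on the unit
    failed_hook: Optional[str]  # e.g. "install", or None if the unit is not in error


def next_action(unit: Unit, desired_rev: int, force_units: bool) -> str:
    """Return what the unit would do next under the model described in the comment."""
    if unit.running_rev == desired_rev:
        return "nothing to upgrade"
    if unit.failed_hook is not None and not force_units:
        # Without --force-units, the unit stays on the old charm until the
        # failed hook is resolved.
        return (
            f"blocked: waiting on failed {unit.failed_hook!r} hook "
            f"(still on rev {unit.running_rev})"
        )
    return f"unpack rev {desired_rev} and run upgrade-charm hook"
```

In the scenario from the bug description, `next_action(Unit("foo/0", 1, "install"), 2, False)` reports the unit as blocked on revision 1 even though the application-level revision already reads 2.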

A few thoughts

a) We could change 'juju upgrade-charm' to warn if something is currently in an error state. This isn't perfect because of race conditions, but it is very likely the tool people would go to when they are iterating on a charm and realize they have a typo/bug causing it to error out. It should at least print something like: Warning: unit foo/0 is in error state, did you want '--force-units'?

I don't *really* like that, because it's too late. You've already issued upgrade-charm; you don't really want to issue the upgrade *again*. What you want is a way to poke a unit that wants an upgrade. (You could `juju resolve --no-retry foo/0`, but that causes the hook to never run, and `juju resolve foo/0` will rerun the hook, but not with the upgraded charm, I believe.)

Which raises

b) `juju resolve --upgrade-unit` ?

c) I really do want `juju status` to have some sort of indication when something is incomplete (for example, after 'juju upgrade-model' when not all unit agents have upgraded, or after `juju upgrade-charm` when some units haven't moved to the new charm code yet).
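
As a rough illustration of point (c), a status summary could compare the desired revision with what each unit is actually running. This is only a sketch: the input shape (a mapping of unit name to running revision) and the names `units_behind` and `status_suffix` are assumptions for illustration and do not reflect Juju's real status data model.

```python
from typing import Dict, List


def units_behind(desired_rev: int, unit_revs: Dict[str, int]) -> List[str]:
    """Return sorted names of units whose running revision differs from the desired one."""
    return sorted(name for name, rev in unit_revs.items() if rev != desired_rev)


def status_suffix(desired_rev: int, unit_revs: Dict[str, int]) -> str:
    """Build a hypothetical status note; empty when every unit is up to date."""
    behind = units_behind(desired_rev, unit_revs)
    if not behind:
        return ""
    return f"upgrade to rev {desired_rev} incomplete: {', '.join(behind)}"
```

For the case in this bug, `status_suffix(2, {"foo/0": 1})` would flag `foo/0` as not yet on revision 2, which is the information the default `juju status` output currently hides.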

tags: added: charm-developer-workflow status upgrade-charm
Changed in juju:
importance: Undecided → Medium
status: New → Triaged
Canonical Juju QA Bot (juju-qa-bot) wrote :

This Medium-priority bug has not been updated in 60 days, so we're marking it Low importance. If you believe this is incorrect, please update the importance.

Changed in juju:
importance: Medium → Low
tags: added: expirebugs-bot
Leon (sed-i) wrote :

I keep running into this when I `juju refresh`.

I wonder if the AX (admin experience?) could be "refreshed" :)

Proposal:
On juju refresh:
- if the charm is not in error status, then refresh (same behavior as now).
- if the charm is in error status, refuse the refresh (error out with a non-zero exit code). Apply the refresh only if `--force-units` is provided.
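
The proposal above could be sketched as a pre-refresh check along these lines. This is a hypothetical illustration of the proposed behaviour, not existing Juju code; `check_refresh`, its input mapping, and the message wording are all assumptions. Only the `--force-units` flag comes from the thread.

```python
from typing import Dict, Optional, Tuple


def check_refresh(
    unit_errors: Dict[str, Optional[str]], force_units: bool
) -> Tuple[int, str]:
    """Return (exit_code, message) for the proposed pre-refresh check.

    unit_errors maps unit name -> name of the failed hook, or None when healthy.
    """
    failed = {u: h for u, h in sorted(unit_errors.items()) if h is not None}
    if not failed or force_units:
        # No units in error, or the user explicitly asked to override.
        return 0, "refreshing"
    details = ", ".join(f"{u} ({h} hook failed)" for u, h in failed.items())
    return 1, (
        f"refusing to refresh, units in error: {details}; "
        "rerun with --force-units to override"
    )
```

With the state from the bug description, `check_refresh({"foo/0": "install"}, force_units=False)` returns exit code 1 and names the failing hook, while passing `force_units=True` lets the refresh proceed.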
