Unit in "agent-state: error" cannot be resolved

Bug #1270041 reported by Ryan Finnie
70
This bug affects 13 people
Affects Status Importance Assigned to Milestone
juju-core
Fix Released
Medium
Unassigned

Bug Description

I have a unit which is in "agent-state: error":

$ juju status
services:
  charmname:
    units:
      charmname/0:
        agent-state: started
        agent-version: 1.14.1.1
        machine: "2"
      charmname/1:
        agent-state: error
        agent-state-info: 'hook failed: "upgrade-charm"'
        agent-version: 1.14.1.1
        machine: "15"

but cannot be resolved:
$ juju resolved --retry charmname/1
error: cannot set resolved mode for unit "charmname/1": already resolved

The symptoms sound similar to LP: #1175031, but in a much newer version. Through a combination of "juju upgrade-charm --force charmname", "juju resolved charmname/1" and restarting the charm agent on the unit, I've been occasionally able to get it to retry, but no matter the outcome of the hooks, it doesn't clear the error state.

=== Agent start example ===

2014-01-16 20:50:27 INFO juju.jujud unit.go:72 unit agent unit-charmname-1 start
2014-01-16 20:50:27 INFO juju runner.go:253 worker: start "state"
2014-01-16 20:50:27 INFO juju.state open.go:68 opening state; mongo addresses: ["10.33.17.73:37017"]; entity "unit-charmname-1"
2014-01-16 20:50:27 INFO juju runner.go:253 worker: start "api"
2014-01-16 20:50:27 INFO juju apiclient.go:106 state/api: dialing "wss://10.33.17.73:17070/"
2014-01-16 20:50:27 INFO juju apiclient.go:116 state/api: connection established
2014-01-16 20:50:27 INFO juju.state open.go:106 connection established
2014-01-16 20:50:28 INFO juju runner.go:253 worker: start "upgrader"
2014-01-16 20:50:28 INFO juju runner.go:253 worker: start "uniter"
2014-01-16 20:50:28 INFO juju.worker.uniter uniter.go:83 unit "charmname/1" started
2014-01-16 20:50:28 INFO juju.worker.upgrader upgrader.go:139 desired tool version: 1.14.1.1
2014-01-16 20:50:28 INFO juju.worker.uniter modes.go:408 ModeInit starting
2014-01-16 20:50:28 INFO juju.worker.uniter modes.go:29 updating unit addresses
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:472 charm check skipped, not yet installed.
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:289 got unit change
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:313 got relations change
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:297 got service change
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:478 preparing new upgrade event
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:305 got config change
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:309 preparing new config event
2014-01-16 20:50:28 INFO juju.worker.uniter modes.go:52 reconciling relation state
2014-01-16 20:50:28 INFO juju.worker.uniter uniter.go:508 joining relation "subcharm:a charmname:a"
2014-01-16 20:50:28 INFO juju.worker.uniter uniter.go:530 joined relation "subcharm:a charmname:a"
2014-01-16 20:50:28 DEBUG juju.worker.uniter modes.go:410 ModeInit exiting
2014-01-16 20:50:28 INFO juju.worker.uniter modes.go:408 ModeContinue starting
2014-01-16 20:50:28 INFO juju.worker.uniter modes.go:65 loading uniter state
2014-01-16 20:50:28 INFO juju.worker.uniter modes.go:109 awaiting error resolution for "upgrade-charm" hook
2014-01-16 20:50:28 DEBUG juju.worker.uniter modes.go:410 ModeContinue exiting
2014-01-16 20:50:28 INFO juju.worker.uniter modes.go:408 ModeHookError starting
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:388 want resolved event
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:382 want forced upgrade true
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:486 no new charm event
2014-01-16 20:50:28 DEBUG juju.worker.uniter.filter filter.go:335 sent resolved event

=== Agent start example w/ upgrade-charm retry ===

2014-01-16 20:30:38 INFO juju.jujud unit.go:72 unit agent unit-charmname-1 start
2014-01-16 20:30:38 INFO juju runner.go:253 worker: start "state"
2014-01-16 20:30:38 INFO juju.state open.go:68 opening state; mongo addresses: ["10.33.17.73:37017"]; entity "unit-charmname-1"
2014-01-16 20:30:38 INFO juju runner.go:253 worker: start "api"
2014-01-16 20:30:38 INFO juju apiclient.go:106 state/api: dialing "wss://10.33.17.73:17070/"
2014-01-16 20:30:38 INFO juju apiclient.go:116 state/api: connection established
2014-01-16 20:30:38 INFO juju.state open.go:106 connection established
2014-01-16 20:30:39 INFO juju runner.go:253 worker: start "uniter"
2014-01-16 20:30:39 INFO juju.worker.uniter uniter.go:83 unit "charmname/1" started
2014-01-16 20:30:39 INFO juju.worker.uniter modes.go:408 ModeInit starting
2014-01-16 20:30:39 INFO juju.worker.uniter modes.go:29 updating unit addresses
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:472 charm check skipped, not yet installed.
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:289 got unit change
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:297 got service change
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:478 preparing new upgrade event
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:313 got relations change
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:305 got config change
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:309 preparing new config event
2014-01-16 20:30:39 INFO juju runner.go:253 worker: start "upgrader"
2014-01-16 20:30:39 INFO juju.worker.upgrader upgrader.go:139 desired tool version: 1.14.1.1
2014-01-16 20:30:39 INFO juju.worker.uniter modes.go:52 reconciling relation state
2014-01-16 20:30:39 INFO juju.worker.uniter uniter.go:508 joining relation "subcharm:a charmname:a"
2014-01-16 20:30:39 INFO juju.worker.uniter uniter.go:530 joined relation "subcharm:a charmname:a"
2014-01-16 20:30:39 DEBUG juju.worker.uniter modes.go:410 ModeInit exiting
2014-01-16 20:30:39 INFO juju.worker.uniter modes.go:408 ModeContinue starting
2014-01-16 20:30:39 INFO juju.worker.uniter modes.go:65 loading uniter state
2014-01-16 20:30:39 INFO juju.worker.uniter modes.go:109 awaiting error resolution for "upgrade-charm" hook
2014-01-16 20:30:39 DEBUG juju.worker.uniter modes.go:410 ModeContinue exiting
2014-01-16 20:30:39 INFO juju.worker.uniter modes.go:408 ModeHookError starting
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:388 want resolved event
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:382 want forced upgrade true
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:478 preparing new upgrade event
2014-01-16 20:30:39 DEBUG juju.worker.uniter modes.go:410 ModeHookError exiting
2014-01-16 20:30:39 INFO juju.worker.uniter modes.go:408 ModeUpgrading local:precise/charmname-155082009 starting
2014-01-16 20:30:39 INFO juju.worker.uniter uniter.go:230 fetching charm "local:precise/charmname-155082009"
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:332 sent upgrade event
2014-01-16 20:30:39 INFO juju charm.go:56 worker/uniter/charm: downloading local:precise/charmname-155082009 from http://kelpie.internal:8080/v1/AUTH_xxx/juju-env/local_3a_precise_2f_charmname-155082009
2014-01-16 20:30:39 INFO juju charm.go:68 worker/uniter/charm: download complete
2014-01-16 20:30:39 INFO juju charm.go:80 worker/uniter/charm: download verified
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:347 changing charm to "local:precise/charmname-155082009"
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:486 no new charm event
2014-01-16 20:30:39 INFO juju.worker.uniter uniter.go:253 deploying charm "local:precise/charmname-155082009"
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:305 got config change
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:309 preparing new config event
2014-01-16 20:30:39 DEBUG juju.worker.uniter.filter filter.go:313 got relations change
2014-01-16 20:30:39 INFO juju deployer.go:144 worker/uniter/charm: preparing charm upgrade
2014-01-16 20:30:39 INFO juju deployer.go:164 worker/uniter/charm: deploying charm
2014-01-16 20:30:40 INFO juju deployer.go:98 worker/uniter/charm: charm deployment succeeded
2014-01-16 20:30:40 INFO juju.worker.uniter uniter.go:264 charm "local:precise/charmname-155082009" is deployed
2014-01-16 20:30:40 DEBUG juju.worker.uniter modes.go:410 ModeUpgrading local:precise/charmname-155082009 exiting
2014-01-16 20:30:40 INFO juju.worker.uniter modes.go:408 ModeContinue starting
2014-01-16 20:30:40 INFO juju.worker.uniter modes.go:109 awaiting error resolution for "upgrade-charm" hook
2014-01-16 20:30:40 DEBUG juju.worker.uniter modes.go:410 ModeContinue exiting
2014-01-16 20:30:40 INFO juju.worker.uniter modes.go:408 ModeHookError starting
2014-01-16 20:30:40 DEBUG juju.worker.uniter.filter filter.go:388 want resolved event
2014-01-16 20:30:40 DEBUG juju.worker.uniter.filter filter.go:382 want forced upgrade true
2014-01-16 20:30:40 DEBUG juju.worker.uniter.filter filter.go:486 no new charm event
2014-01-16 20:30:40 DEBUG juju.worker.uniter.filter filter.go:335 sent resolved event
2014-01-16 20:30:43 DEBUG juju.worker.uniter.filter filter.go:289 got unit change

Curtis Hovey (sinzui)
tags: added: resolved
tags: added: 1.14.1
Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
tags: added: upgrade-charm
Curtis Hovey (sinzui)
Changed in juju-core:
importance: High → Medium
Revision history for this message
Peter Sabaini (peter-sabaini) wrote :
Download full text (19.7 KiB)

I have a similar issue:

$ juju status rabbitmq-server/1 | grep -A 3 rabbitmq-server/1:
      rabbitmq-server/1:
        agent-state: error
        agent-state-info: 'hook failed: "config-changed"'
        agent-version: 1.20.9.1

$ juju resolved rabbitmq-server/1
ERROR cannot set resolved mode for unit "rabbitmq-server/1": already resolved

When restarting jujud I can see it retrying the config-changed hook w/o issue, but the error state remains:

# restart jujud-unit-rabbitmq-server-1

 -- log excerpt of unit-rabbitmq-server-1

2015-02-05 13:42:42 INFO juju.worker.uniter uniter.go:115 unit "rabbitmq-server/1" started
2015-02-05 13:42:42 DEBUG juju.worker.uniter runlistener.go:84 juju-run listener running
2015-02-05 13:42:42 INFO juju.worker.uniter modes.go:396 ModeContinue starting
2015-02-05 13:42:42 INFO juju.worker.uniter modes.go:31 loading uniter state
2015-02-05 13:42:42 INFO juju.worker.uniter modes.go:48 continuing after "config-changed" hook
2015-02-05 13:42:42 DEBUG juju.worker.uniter modes.go:398 ModeContinue exiting
2015-02-05 13:42:42 INFO juju.worker.uniter modes.go:396 ModeConfigChanged starting
2015-02-05 13:42:42 DEBUG juju.worker.uniter uniter.go:757 new environment change
2015-02-05 13:42:42 DEBUG juju.worker.uniter.filter filter.go:507 charm check skipped, not yet installed.
2015-02-05 13:42:42 DEBUG juju.worker.uniter.filter filter.go:325 got service change
2015-02-05 13:42:42 DEBUG juju.worker.uniter.filter filter.go:521 no new charm event
2015-02-05 13:42:42 DEBUG juju.worker.uniter.filter filter.go:317 got unit change
2015-02-05 13:42:42 DEBUG juju.worker.uniter.filter filter.go:341 got relations change
2015-02-05 13:42:42 DEBUG juju.worker.uniter.filter filter.go:333 got config change
2015-02-05 13:42:42 DEBUG juju.worker.uniter.filter filter.go:337 preparing new config event
2015-02-05 13:42:42 DEBUG juju.worker.uniter.filter filter.go:440 discarded config event
2015-02-05 13:42:42 INFO juju.worker.uniter uniter.go:480 running "config-changed" hook
2015-02-05 13:42:43 INFO worker.uniter.jujuc server.go:102 running hook tool "config-get" ["source" "--format=json"]
2015-02-05 13:42:43 DEBUG worker.uniter.jujuc server.go:103 hook context id "rabbitmq-server/1:config-changed:5433364711736979567"; dir "/var/lib/juju/agents/unit-rabbitmq-server-1/charm"
2015-02-05 13:42:43 INFO worker.uniter.jujuc server.go:102 running hook tool "config-get" ["key" "--format=json"]
2015-02-05 13:42:43 DEBUG worker.uniter.jujuc server.go:103 hook context id "rabbitmq-server/1:config-changed:5433364711736979567"; dir "/var/lib/juju/agents/unit-rabbitmq-server-1/charm"
2015-02-05 13:42:43 INFO worker.uniter.jujuc server.go:102 running hook tool "juju-log" ["Source is not present. Skipping"]
2015-02-05 13:42:43 DEBUG worker.uniter.jujuc server.go:103 hook context id "rabbitmq-server/1:config-changed:5433364711736979567"; dir "/var/lib/juju/agents/unit-rabbitmq-server-1/charm"
2015-02-05 13:42:43 INFO juju-log Source is not present. Skipping
2015-02-05 13:42:43 DEBUG juju.worker.rsyslog worker.go:164 Reloading rsyslog configuration
2015-02-05 13:42:43 INFO config-changed Ign http://security.ubuntu.com trusty-security InRelease
2015-02-05 13:...

Revision history for this message
Peter Sabaini (peter-sabaini) wrote :

Context: the config-changed hook did indeed fail previously because of another issue which is resolved now

tags: added: canonical-bootstack
Revision history for this message
Junien F (axino) wrote :

FWIW I ran into this bug, and got it fixed by removing stale lock files in /var/lib/juju/locks (if there is no juju action running, this directory should be empty) and bouncing machine-0 jujud.

Revision history for this message
Paul Gear (paulgear) wrote :

Junien's workaround above worked for me today on a cloud with units stuck in resolved state; it didn't seem necessary to restart jujud-machine-0 (status was clean straight after removing the stale lock files), but I did it anyway.

Paul Gear (paulgear)
tags: added: canonical-is
Revision history for this message
Anastasia (anastasia-macmood) wrote :

Lock implementation has been completely re-designed for newer versions of Juju - 1.25.6 and Juju 2.
We believe this has been fixed.

If you are experiencing further failures with Juju 2, please file a bug with newer logs against "juju" launchpad project.

Changed in juju-core:
status: Triaged → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.