Juju 2.8 error about unit not being the leader

Bug #1875481 reported by Kenneth Koski
This bug affects 1 person
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Harry Pidcock

Bug Description

I am trying to deploy a charm on Juju 2.8, and I'm getting this error:

2020-04-27 19:24:57 ERROR juju.worker.uniter.context context.go:753 "dex-auth/1" is not the leader but is setting application k8s spec
2020-04-27 19:24:57 ERROR juju-log oidc-client:8: pod-spec-set encountered an error: `ERROR this unit is not the leader`
2020-04-27 19:24:57 ERROR juju-log oidc-client:8: Hook error:
Traceback (most recent call last):
  File "lib/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "lib/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "lib/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "lib/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-dex-auth-1/charm/reactive/dex_auth.py", line 154, in start_charm
    for crd in yaml.safe_load_all(Path("resources/crds.yaml").read_text())
  File "lib/charms/layer/caas_base.py", line 34, in pod_spec_set
    run_hook_command("pod-spec-set", spec)
  File "lib/charms/layer/caas_base.py", line 13, in run_hook_command
    run([cmd], stdout=PIPE, stderr=PIPE, check=True, input=stdin.encode('utf-8'))
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['pod-spec-set']' returned non-zero exit status 1.

In this case, `dex-auth` was not deployed any differently from the other charms in the bundle, yet it ended up as unit dex-auth/1, whereas the other charms are all charm-name/0. There's only one instance of dex-auth, so I'm not sure how this happened.

Tags: k8s
Revision history for this message
Harry Pidcock (hpidcock) wrote :

Can we please get the full log of when this happened?

Revision history for this message
Yang Kelvin Liu (kelvin.liu) wrote :

Hi Ken,
`pod-spec-set` can only be run on the leader unit.
The error means Juju complained that `dex-auth/1` is not the leader but was trying to set the pod spec.
So the charm has to check is_leader() before running `pod-spec-set`.
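
As a minimal sketch (an editor's illustration, not the charm's actual code), assuming the charmhelpers hookenv API and the caas_base layer helper visible in the traceback above, the guard would look roughly like this:

from charmhelpers.core import hookenv
from charms import layer

def set_pod_spec(spec):
    # pod-spec-set is a leader-only hook tool; non-leader units must skip it.
    if not hookenv.is_leader():
        hookenv.log("Not the leader; skipping pod-spec-set")
        return
    layer.caas_base.pod_spec_set(spec)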

Revision history for this message
Yang Kelvin Liu (kelvin.liu) wrote :

Sorry, replied too quickly.
Could you provide a charm that shows how to reproduce it?

Revision history for this message
Kenneth Koski (knkski) wrote :

I had this happen with two charms for my most recent attempt. The full logs from one of them:

$ kubectl logs --tail 100 -fl juju-operator=katib-controller
2020-04-28 13:28:46 DEBUG juju.worker.introspection socket.go:127 stats worker now serving
2020-04-28 13:28:46 DEBUG juju.worker.dependency engine.go:564 "api-config-watcher" manifold worker started at 2020-04-28 13:28:46.875767974 +0000 UTC
2020-04-28 13:28:46 DEBUG juju.worker.dependency engine.go:564 "migration-fortress" manifold worker started at 2020-04-28 13:28:46.876854252 +0000 UTC
2020-04-28 13:28:46 DEBUG juju.api apiclient.go:1105 successfully dialed "wss://172.31.33.208:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:46 INFO juju.api apiclient.go:637 connection established to "wss://172.31.33.208:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:46 INFO juju.worker.apicaller connect.go:158 [81f158] "application-katib-controller" successfully connected to "172.31.33.208:17070"
2020-04-28 13:28:46 DEBUG juju.api monitor.go:35 RPC connection died
2020-04-28 13:28:46 DEBUG juju.worker.dependency engine.go:584 "api-caller" manifold worker completed successfully
2020-04-28 13:28:46 DEBUG juju.worker.apicaller connect.go:128 connecting with old password
2020-04-28 13:28:47 DEBUG juju.api apiclient.go:1105 successfully dialed "wss://3.87.99.136:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:47 INFO juju.api apiclient.go:637 connection established to "wss://3.87.99.136:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:47 INFO juju.worker.apicaller connect.go:158 [81f158] "application-katib-controller" successfully connected to "3.87.99.136:17070"
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "api-caller" manifold worker started at 2020-04-28 13:28:47.105344894 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "upgrader" manifold worker started at 2020-04-28 13:28:47.115656 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "migration-minion" manifold worker started at 2020-04-28 13:28:47.115705665 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "log-sender" manifold worker started at 2020-04-28 13:28:47.115728209 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "upgrade-steps-runner" manifold worker started at 2020-04-28 13:28:47.116695916 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:584 "upgrade-steps-runner" manifold worker completed successfully
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "migration-inactive-flag" manifold worker started at 2020-04-28 13:28:47.117828492 +0000 UTC
2020-04-28 13:28:47 INFO juju.worker.caasupgrader upgrader.go:112 abort check blocked until version event received
2020-04-28 13:28:47 DEBUG juju.worker.caasupgrader upgrader.go:127 current agent binary version: 2.8-rc1
2020-04-28 13:28:47 INFO juju.worker.caasupgrader upgrader.go:118 unblocking abort check
2020-04-28 13:28:47 INFO juju.worker.migrationminion worker.go:140 migration phase is now: NONE
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "charm...

Revision history for this message
Kenneth Koski (knkski) wrote :

If I'm actively watching the deploy, I can see the multiple units:

$ juju status
...
dex-auth/0* active idle 10.1.19.73 5556/TCP
dex-auth/1 maintenance executing 10.1.17.61 5556/TCP configuring container

Eventually the earlier units will go away, leaving only the later, broken units.

Revision history for this message
Kenneth Koski (knkski) wrote :

And here's logs from the first charm that I noticed this for:

$ kubectl logs --tail 1000 -l juju-operator=dex-auth
2020-04-28 13:28:28 INFO juju.cmd supercommand.go:91 running jujud [2.8-rc1 3584 da98e184dd907fe3263b7f098147cd99aba4c73c gc go1.14.2]
2020-04-28 13:28:28 DEBUG juju.cmd supercommand.go:92 args: []string{"/var/lib/juju/tools/jujud", "caasoperator", "--application-name=dex-auth", "--debug"}
2020-04-28 13:28:28 DEBUG juju.agent agent.go:571 read agent config, format "2.0"
2020-04-28 13:28:28 INFO juju.worker.upgradesteps worker.go:70 upgrade steps for 2.8-rc1 have already been run.
2020-04-28 13:28:28 INFO juju.cmd.jujud caasoperator.go:200 caas operator application-dex-auth start (2.8-rc1 [gc])
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "agent" manifold worker started at 2020-04-28 13:28:28.429503399 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "upgrade-steps-gate" manifold worker started at 2020-04-28 13:28:28.429686675 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "clock" manifold worker started at 2020-04-28 13:28:28.429925853 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "api-config-watcher" manifold worker started at 2020-04-28 13:28:28.430132147 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.introspection socket.go:97 introspection worker listening on "@jujud-application-dex-auth"
2020-04-28 13:28:28 DEBUG juju.worker.introspection socket.go:127 stats worker now serving
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "upgrade-steps-flag" manifold worker started at 2020-04-28 13:28:28.439923237 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.apicaller connect.go:128 connecting with old password
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "migration-fortress" manifold worker started at 2020-04-28 13:28:28.451176989 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.api apiclient.go:1105 successfully dialed "wss://172.31.33.208:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:28 INFO juju.api apiclient.go:637 connection established to "wss://172.31.33.208:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:28 INFO juju.worker.apicaller connect.go:158 [81f158] "application-dex-auth" successfully connected to "172.31.33.208:17070"
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "api-caller" manifold worker started at 2020-04-28 13:28:28.655837161 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "migration-minion" manifold worker started at 2020-04-28 13:28:28.666100125 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "log-sender" manifold worker started at 2020-04-28 13:28:28.666290424 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "upgrader" manifold worker started at 2020-04-28 13:28:28.666330008 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:564 "upgrade-steps-runner" manifold worker started at 2020-04-28 13:28:28.667080449 +0000 UTC
2020-04-28 13:28:28 DEBUG juju.worker.dependency engine.go:584 "upgrade-steps-runner" manifold worker completed succes...

Revision history for this message
Ian Booth (wallyworld) wrote :

As an aside, one issue Juju currently has is that if the leader unit goes away, there is no immediate election of a new leader - the leadership lease needs to time out, which can take up to a minute. So that could explain the symptom. See bug 1469731.

It comes down to: was more than one unit asked for, and why was unit 0 removed? If possible, until bug 1469731 is fixed, it's best to avoid removing the leader unit.

Revision history for this message
Kenneth Koski (knkski) wrote :

I tried deploying the latest stable bundle from the charm store instead of building the charms locally and was able to reproduce the issue.

Revision history for this message
Kenneth Koski (knkski) wrote :

I haven't intentionally used the >1 unit functionality for the Kubeflow charms, and I don't think I've accidentally enabled it either, given that I'm able to reproduce with the stable bundle from the charm store.

Revision history for this message
Kenneth Koski (knkski) wrote :

Looks like it's spinning up new units quite frequently:

dex-auth/60* error idle 10.1.17.121 5556/TCP hook failed: "oidc-client-relation-joined"

Revision history for this message
Ian Booth (wallyworld) wrote :

In a deployment, if a pod is terminated and recreated, this will show up as a new unit in Juju, since it is a different pod with a new UUID. If that is happening in error, then the root cause of that error will need to be addressed. A hook error should not result in the pod being restarted, though.
All that notwithstanding, if a deployment pod for which the Juju unit is the leader does legitimately get restarted, causing a new unit to show up in Juju, then due to bug 1469731, leadership is not immediately transferred.

Ian Booth (wallyworld)
tags: added: k8s
Revision history for this message
Ian Booth (wallyworld) wrote :

TL;DR: a quick win is to fix the charms to do an is_leader() check before making leader-only calls.

So there are a few things here.

In trying to reproduce this on microk8s, I've had it work many times and fail a few times. The charms that have failed are dex-auth and katib-controller. One thing to note about the charms is that start_charm() in dex-auth does not appear to have an is_leader() check. This check is needed in *all* charms that use leader-only API calls, so it needs to be added to any charm in the bundle that doesn't already do it.
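
For illustration, here is a hedged sketch of how such a guard might sit inside a reactive handler like start_charm(); the trigger flag, status messages, and pod spec below are placeholders rather than the actual dex-auth code. Leaving the completion flag unset when the unit is not the leader means the handler runs again on a later hook, by which time the unit may have acquired leadership (for example once the old lease expires):

from charmhelpers.core import hookenv
from charms import layer
from charms.reactive import set_flag, when, when_not

@when('oci-image.available')   # placeholder trigger flag
@when_not('charm.started')
def start_charm():
    if not hookenv.is_leader():
        # Only the leader may call pod-spec-set; don't set 'charm.started',
        # so this handler re-runs on a later hook and re-checks leadership.
        layer.status.waiting('waiting for leadership')
        return

    layer.status.maintenance('configuring container')
    spec = {'containers': []}  # build the real pod spec here
    layer.caas_base.pod_spec_set(spec)
    layer.status.active('ready')
    set_flag('charm.started')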

One way in which Juju could trigger a pod bounce is in how it writes out the deployment yaml - it does not read the existing yaml and update it, so the result is a new replicaset, which means a bounce of the pod(s). That needs fixing in Juju, but it doesn't appear to be the issue here.

When the issue has been observed, extra logging added to Juju appears to show that Juju is creating the deployment with scale 1 and correctly leaving it alone after that. Something else is causing the pod to bounce, and this triggers the cycle of new pod -> new unit -> start_charm() -> error not leader. Fixing bug 1469731 will mask the issue somewhat.

Adding to 2.8 milestone to track the work to improve how juju creates deployments.

Changed in juju:
milestone: none → 2.8-rc1
importance: Undecided → High
status: New → Triaged
Revision history for this message
Kenneth Koski (knkski) wrote :

I updated the dex-auth charm to check whether or not it's the leader before calling pod-spec-set, and that stopped the repeated bouncing of pods, but I end up with the dex-auth/1 unit, which is unable to do any work. Here are the logs:

2020-05-08 23:18:02 INFO juju.cmd supercommand.go:91 running jujud [2.9-beta1 3641 493556091fc053367946e6aada9c863982377a94 gc go1.14.2]
2020-05-08 23:18:02 DEBUG juju.cmd supercommand.go:92 args: []string{"/var/lib/juju/tools/jujud", "caasoperator", "--application-name=dex-auth", "--debug"}
2020-05-08 23:18:02 DEBUG juju.agent agent.go:571 read agent config, format "2.0"
2020-05-08 23:18:02 INFO juju.worker.upgradesteps worker.go:70 upgrade steps for 2.9-beta1 have already been run.
2020-05-08 23:18:02 INFO juju.cmd.jujud caasoperator.go:200 caas operator application-dex-auth start (2.9-beta1 [gc])
2020-05-08 23:18:02 DEBUG juju.worker.dependency engine.go:564 "upgrade-steps-gate" manifold worker started at 2020-05-08 23:18:02.906930387 +0000 UTC
2020-05-08 23:18:02 DEBUG juju.worker.dependency engine.go:564 "agent" manifold worker started at 2020-05-08 23:18:02.907023013 +0000 UTC
2020-05-08 23:18:02 DEBUG juju.worker.apicaller connect.go:128 connecting with old password
2020-05-08 23:18:02 DEBUG juju.worker.dependency engine.go:564 "clock" manifold worker started at 2020-05-08 23:18:02.90718082 +0000 UTC
2020-05-08 23:18:02 DEBUG juju.worker.dependency engine.go:564 "upgrade-steps-flag" manifold worker started at 2020-05-08 23:18:02.908256142 +0000 UTC
2020-05-08 23:18:02 DEBUG juju.worker.introspection socket.go:97 introspection worker listening on "@jujud-application-dex-auth"
2020-05-08 23:18:02 DEBUG juju.worker.introspection socket.go:127 stats worker now serving
2020-05-08 23:18:02 DEBUG juju.worker.dependency engine.go:564 "api-config-watcher" manifold worker started at 2020-05-08 23:18:02.917148121 +0000 UTC
2020-05-08 23:18:02 DEBUG juju.worker.dependency engine.go:564 "migration-fortress" manifold worker started at 2020-05-08 23:18:02.918394872 +0000 UTC
2020-05-08 23:18:02 DEBUG juju.api apiclient.go:1105 successfully dialed "wss://34.229.248.19:17070/model/34904a0c-8909-4a13-85eb-6d35dd1535f8/api"
2020-05-08 23:18:02 INFO juju.api apiclient.go:637 connection established to "wss://34.229.248.19:17070/model/34904a0c-8909-4a13-85eb-6d35dd1535f8/api"
2020-05-08 23:18:02 INFO juju.worker.apicaller connect.go:158 [34904a] "application-dex-auth" successfully connected to "34.229.248.19:17070"
2020-05-08 23:18:02 DEBUG juju.api monitor.go:35 RPC connection died
2020-05-08 23:18:02 DEBUG juju.worker.dependency engine.go:584 "api-caller" manifold worker completed successfully
2020-05-08 23:18:02 DEBUG juju.worker.apicaller connect.go:128 connecting with old password
2020-05-08 23:18:03 DEBUG juju.api apiclient.go:1105 successfully dialed "wss://172.31.36.124:17070/model/34904a0c-8909-4a13-85eb-6d35dd1535f8/api"
2020-05-08 23:18:03 INFO juju.api apiclient.go:637 connection established to "wss://172.31.36.124:17070/model/34904a0c-8909-4a13-85eb-6d35dd1535f8/api"
2020-05-08 23:18:03 INFO juju.worker.apicaller connect.go:158 [34904a] "application-dex-auth" successfully connected to "172.31.36.124:17070"
2020-05-08 2...

Revision history for this message
Ian Booth (wallyworld) wrote :

Yeah, right now, sadly, unit 1 will not be able to act as the leader until the lease held by unit 0 times out. This is bug 1469731. If unit 1 does start, I'm thinking unit 0 would already have run start_charm() to get things set up?

Changed in juju:
milestone: 2.8-rc1 → none
Ian Booth (wallyworld)
Changed in juju:
milestone: none → 2.8-rc2
Revision history for this message
Kenneth Koski (knkski) wrote :

Unit 0 will have run start_charm; unfortunately, that means it will have been started with incomplete configuration and will not be running with the data received over the relation. Unit 1 could update the deployment if it could read the relation data.

Ian Booth (wallyworld)
Changed in juju:
milestone: 2.8-rc2 → 2.8.1
Revision history for this message
Tim Penhey (thumper) wrote :

We think this has now been fixed, due to leadership being revoked on unit removal. I'll mark the bug Incomplete; if it is seen on 2.8 again, please reopen with the Juju version and logs.

Changed in juju:
status: Triaged → Incomplete
milestone: 2.8.1 → none
Revision history for this message
Kenneth Koski (knkski) wrote :

I'm running into this with 2.8.0. I added a check to all of the Kubeflow charms that looks like this:

if not hookenv.is_leader():
    layer.status.blocked("this unit is not a leader")
    return False

That left me with many copies of some charms:

argo-controller/0* active idle 10.1.48.187
argo-controller/1 blocked idle 10.1.90.124 this unit is not a leader
argo-controller/2 blocked idle 10.1.90.127 this unit is not a leader
argo-controller/3 blocked idle 10.1.90.128 this unit is not a leader
argo-controller/4 error idle 10.1.90.126 Started container tensorflow-serve
argo-controller/5 blocked idle 10.1.90.125 this unit is not a leader
argo-controller/6 blocked idle 10.1.90.132 this unit is not a leader
argo-controller/7 blocked idle 10.1.90.130 this unit is not a leader
argo-controller/8 blocked idle 10.1.90.133 this unit is not a leader

Other charms worked fine, though. I'm not really sure what is triggering this behavior.

Revision history for this message
Ian Booth (wallyworld) wrote :

argo-controller/0 is the leader (as indicated by the *)

The other argo-controller units are not the leader.

So from that perspective, Juju is correctly only allowing one unit to be the leader.

Is the question why there are 8 argo-controller units? Are there 8 corresponding pods?

Revision history for this message
Kenneth Koski (knkski) wrote :

It looks like there's only one pod for argo-controller, but those extra units aren't going away, which seems wrong.

Also, the dex-auth charm is getting constantly recycled:

dex-auth/1880* terminated executing 10.1.90.162 5556/TCP (stop) unit stopped by the cloud
dex-auth/1881 blocked idle 10.1.19.92 5556/TCP this unit is not a leader

Revision history for this message
Kenneth Koski (knkski) wrote :

I think this points at the underlying issue:

    application-dex-auth: 12:24:02 ERROR juju.worker.caasoperator exited "dex-auth/1": executing operation "remote init": caas-unit-init for unit "dex-auth/6" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-6 --charm-dir /tmp/unit-dex-auth-6215732929/charm --upgrade" failed: sh: /var/lib/juju/tools/jujud: not found

I exec'ed into the dex-auth pod, and that file exists. However, trying to run it similarly fails with the "not found" error from sh.

After poking around a bit, I believe this is due to Alpine Linux using musl while Ubuntu uses glibc. I ran "ldd /var/lib/juju/tools/jujud" and got this line:

    /lib64/ld-linux-x86-64.so.2 (0x7f3ba0a74000)

That file doesn't exist on Alpine Linux; instead, there's a /lib/ld-musl-x86_64.so.1 file.

I was able to install a compatibility layer with "apk add libc6-compat" and it got further, but still errored out:

    # ./jujud
    Error relocating ./jujud: __vfprintf_chk: symbol not found
    Error relocating ./jujud: __fprintf_chk: symbol not found

Those are the same error messages that ldd prints out. This is probably due to different libc implementations being in play.

Harry Pidcock (hpidcock)
Changed in juju:
status: Incomplete → In Progress
assignee: nobody → Harry Pidcock (hpidcock)
milestone: none → 2.8.1
Revision history for this message
Harry Pidcock (hpidcock) wrote :
Harry Pidcock (hpidcock)
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released
Revision history for this message
Kenneth Koski (knkski) wrote :

It looks like the above error about jujud not being found is not the underlying cause of this issue. I tried deploying Kubeflow with Juju 2.8.1 and saw the same behavior of errors about the unit not being the leader, but without the error about jujud not being found.

It seems to be a flaky issue: I was able to deploy Kubeflow once successfully, but then wasn't able to a second time. The script broke because juju-wait saw a unit go into an error state due to the not-the-leader error.

I'm able to reproduce this in CI, see here:

https://github.com/juju-solutions/bundle-kubeflow/pull/225/checks?check_run_id=870323388

Juju 2.8.1 errors out in those jobs due to this issue.

Revision history for this message
Kenneth Koski (knkski) wrote :

I'm getting these errors from the operator pod of dex-auth (which is having these issues):

2020-07-14 22:38:17 ERROR juju.worker.uniter agent.go:31 resolver loop error: executing operation "remote init": caas-unit-init for unit "dex-auth/1" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-1 --charm-dir /tmp/unit-dex-auth-1863151550/charm --upgrade" failed: : command terminated with exit code 1
2020-07-14 22:38:17 ERROR juju.worker.caasoperator runner.go:430 exited "dex-auth/1": executing operation "remote init": caas-unit-init for unit "dex-auth/1" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-1 --charm-dir /tmp/unit-dex-auth-1863151550/charm --upgrade" failed: : command terminated with exit code 1
2020-07-14 22:38:21 ERROR juju.worker.uniter agent.go:31 resolver loop error: executing operation "remote init": caas-unit-init for unit "dex-auth/1" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-1 --charm-dir /tmp/unit-dex-auth-1414093216/charm --upgrade" failed: : command terminated with exit code 1
2020-07-14 22:38:21 ERROR juju.worker.caasoperator runner.go:430 exited "dex-auth/1": executing operation "remote init": caas-unit-init for unit "dex-auth/1" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-1 --charm-dir /tmp/unit-dex-auth-1414093216/charm --upgrade" failed: : command terminated with exit code 1
2020-07-14 22:38:25 ERROR juju.worker.uniter agent.go:31 resolver loop error: executing operation "remote init": caas-unit-init for unit "dex-auth/1" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-1 --charm-dir /tmp/unit-dex-auth-1422214098/charm --upgrade" failed: : command terminated with exit code 1
2020-07-14 22:38:25 ERROR juju.worker.caasoperator runner.go:430 exited "dex-auth/1": executing operation "remote init": caas-unit-init for unit "dex-auth/1" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-1 --charm-dir /tmp/unit-dex-auth-1422214098/charm --upgrade" failed: : command terminated with exit code 1
2020-07-14 22:38:28 ERROR juju.worker.uniter agent.go:31 resolver loop error: executing operation "remote init": caas-unit-init for unit "dex-auth/1" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-1 --charm-dir /tmp/unit-dex-auth-1886864084/charm --upgrade" failed: : command terminated with exit code 1
2020-07-14 22:38:28 ERROR juju.worker.caasoperator runner.go:430 exited "dex-auth/1": executing operation "remote init": caas-unit-init for unit "dex-auth/1" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-1 --charm-dir /tmp/unit-dex-auth-1886864084/charm --upgrade" failed: : command terminated with exit code 1
2020-07-14 22:38:32 ERROR juju.worker.uniter agent.go:31 resolver loop error: executing operation "remote init": caas-unit-init for unit "dex-auth/1" with command: "/var/lib/juju/tools/jujud caas-unit-init --unit unit-dex-auth-1 --charm-dir /tmp/unit-dex-auth-1909562150/charm --upgrade" failed: ERROR failed to remove unit tools dir /var/lib/juju/tools/unit-dex-auth-1: unlinkat /var/lib/juju/tools/unit...

Revision history for this message
Harry Pidcock (hpidcock) wrote :

What user does the container run as? I'm guessing it's not root, which is probably the issue here.

Revision history for this message
Kenneth Koski (knkski) wrote :

Yeah, the container is running as id 1001:

https://github.com/dexidp/dex/blob/master/Dockerfile#L16

This is a popular method of running containers, so it definitely seems like something we should support.

Revision history for this message
Kenneth Koski (knkski) wrote :

I've narrowed down the issue to a small, reproducible test case:

juju add-model kubeflow --config update-status-hook-interval=30s
juju deploy cs:~kubeflow-charmers/dex-auth-53
juju deploy cs:~kubeflow-charmers/oidc-gatekeeper-53
juju relate dex-auth oidc-gatekeeper
juju config oidc-gatekeeper client-secret=password
juju config dex-auth static-username=admin static-password=password
juju wait -wv
juju config dex-auth public-url=localhost
juju config oidc-gatekeeper public-url=localhost
