Comment 4 for bug 1875481

Kenneth Koski (knkski) wrote :

I had this happen with two charms during my most recent attempt. Here are the full logs from one of them:

$ kubectl logs --tail 100 -fl juju-operator=katib-controller
2020-04-28 13:28:46 DEBUG juju.worker.introspection socket.go:127 stats worker now serving
2020-04-28 13:28:46 DEBUG juju.worker.dependency engine.go:564 "api-config-watcher" manifold worker started at 2020-04-28 13:28:46.875767974 +0000 UTC
2020-04-28 13:28:46 DEBUG juju.worker.dependency engine.go:564 "migration-fortress" manifold worker started at 2020-04-28 13:28:46.876854252 +0000 UTC
2020-04-28 13:28:46 DEBUG juju.api apiclient.go:1105 successfully dialed "wss://172.31.33.208:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:46 INFO juju.api apiclient.go:637 connection established to "wss://172.31.33.208:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:46 INFO juju.worker.apicaller connect.go:158 [81f158] "application-katib-controller" successfully connected to "172.31.33.208:17070"
2020-04-28 13:28:46 DEBUG juju.api monitor.go:35 RPC connection died
2020-04-28 13:28:46 DEBUG juju.worker.dependency engine.go:584 "api-caller" manifold worker completed successfully
2020-04-28 13:28:46 DEBUG juju.worker.apicaller connect.go:128 connecting with old password
2020-04-28 13:28:47 DEBUG juju.api apiclient.go:1105 successfully dialed "wss://3.87.99.136:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:47 INFO juju.api apiclient.go:637 connection established to "wss://3.87.99.136:17070/model/81f1584d-a77f-4f30-8a12-417f4c2fcedd/api"
2020-04-28 13:28:47 INFO juju.worker.apicaller connect.go:158 [81f158] "application-katib-controller" successfully connected to "3.87.99.136:17070"
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "api-caller" manifold worker started at 2020-04-28 13:28:47.105344894 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "upgrader" manifold worker started at 2020-04-28 13:28:47.115656 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "migration-minion" manifold worker started at 2020-04-28 13:28:47.115705665 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "log-sender" manifold worker started at 2020-04-28 13:28:47.115728209 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "upgrade-steps-runner" manifold worker started at 2020-04-28 13:28:47.116695916 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:584 "upgrade-steps-runner" manifold worker completed successfully
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "migration-inactive-flag" manifold worker started at 2020-04-28 13:28:47.117828492 +0000 UTC
2020-04-28 13:28:47 INFO juju.worker.caasupgrader upgrader.go:112 abort check blocked until version event received
2020-04-28 13:28:47 DEBUG juju.worker.caasupgrader upgrader.go:127 current agent binary version: 2.8-rc1
2020-04-28 13:28:47 INFO juju.worker.caasupgrader upgrader.go:118 unblocking abort check
2020-04-28 13:28:47 INFO juju.worker.migrationminion worker.go:140 migration phase is now: NONE
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "charm-dir" manifold worker started at 2020-04-28 13:28:47.128343176 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "api-address-updater" manifold worker started at 2020-04-28 13:28:47.128419952 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.logger logger.go:64 initial log config: "<root>=DEBUG"
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "proxy-config-updater" manifold worker started at 2020-04-28 13:28:47.129173596 +0000 UTC
2020-04-28 13:28:47 DEBUG juju.worker.dependency engine.go:564 "logging-config-updater" manifold worker started at 2020-04-28 13:28:47.129250537 +0000 UTC
2020-04-28 13:28:47 INFO juju.worker.logger logger.go:118 logger worker started
2020-04-28 13:28:47 DEBUG juju.worker.caasupgrader upgrader.go:150 desired agent binary version: 2.8-rc1
2020-04-28 13:28:47 DEBUG juju.worker.logger logger.go:92 reconfiguring logging from "<root>=DEBUG" to "<root>=WARNING"
2020-04-28 13:33:55 ERROR juju.worker.uniter.context context.go:753 "katib-controller/1" is not the leader but is setting application k8s spec
2020-04-28 13:33:55 ERROR juju-log pod-spec-set encountered an error: `ERROR this unit is not the leader`
2020-04-28 13:33:56 ERROR juju-log Hook error:
Traceback (most recent call last):
  File "lib/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "lib/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "lib/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "lib/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-katib-controller-1/charm/reactive/katib_controller.py", line 279, in start_charm
    'files/defaultTrialTemplate.yaml.tmpl'
  File "lib/charms/layer/caas_base.py", line 34, in pod_spec_set
    run_hook_command("pod-spec-set", spec)
  File "lib/charms/layer/caas_base.py", line 13, in run_hook_command
    run([cmd], stdout=PIPE, stderr=PIPE, check=True, input=stdin.encode('utf-8'))
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['pod-spec-set']' returned non-zero exit status 1.

2020-04-28 13:33:56 ERROR juju.worker.uniter.operation runhook.go:136 hook "install" (via explicit, bespoke hook script) failed: exit status 1
2020-04-28 13:34:04 ERROR juju.worker.uniter.context context.go:753 "katib-controller/1" is not the leader but is setting application k8s spec
2020-04-28 13:34:04 ERROR juju-log pod-spec-set encountered an error: `ERROR this unit is not the leader`
2020-04-28 13:34:04 ERROR juju-log Hook error:
Traceback (most recent call last):
  File "lib/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "lib/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "lib/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "lib/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-katib-controller-1/charm/reactive/katib_controller.py", line 279, in start_charm
    'files/defaultTrialTemplate.yaml.tmpl'
  File "lib/charms/layer/caas_base.py", line 34, in pod_spec_set
    run_hook_command("pod-spec-set", spec)
  File "lib/charms/layer/caas_base.py", line 13, in run_hook_command
    run([cmd], stdout=PIPE, stderr=PIPE, check=True, input=stdin.encode('utf-8'))
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['pod-spec-set']' returned non-zero exit status 1.

2020-04-28 13:34:04 ERROR juju.worker.uniter.operation runhook.go:136 hook "install" (via explicit, bespoke hook script) failed: exit status 1
2020-04-28 13:34:17 ERROR juju.worker.uniter.context context.go:753 "katib-controller/1" is not the leader but is setting application k8s spec
2020-04-28 13:34:17 ERROR juju-log pod-spec-set encountered an error: `ERROR this unit is not the leader`
2020-04-28 13:34:17 ERROR juju-log Hook error:
Traceback (most recent call last):
  File "lib/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "lib/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "lib/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "lib/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-katib-controller-1/charm/reactive/katib_controller.py", line 279, in start_charm
    'files/defaultTrialTemplate.yaml.tmpl'
  File "lib/charms/layer/caas_base.py", line 34, in pod_spec_set
    run_hook_command("pod-spec-set", spec)
  File "lib/charms/layer/caas_base.py", line 13, in run_hook_command
    run([cmd], stdout=PIPE, stderr=PIPE, check=True, input=stdin.encode('utf-8'))
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['pod-spec-set']' returned non-zero exit status 1.

2020-04-28 13:34:17 ERROR juju.worker.uniter.operation runhook.go:136 hook "install" (via explicit, bespoke hook script) failed: exit status 1
2020-04-28 13:34:34 WARNING juju.worker.uniter.operation leader.go:123 we should run a leader-deposed hook here, but we can't yet
2020-04-28 13:34:38 ERROR juju.worker.uniter.context context.go:753 "katib-controller/1" is not the leader but is setting application k8s spec
2020-04-28 13:34:38 ERROR juju-log pod-spec-set encountered an error: `ERROR this unit is not the leader`
2020-04-28 13:34:38 ERROR juju-log Hook error:
Traceback (most recent call last):
  File "lib/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "lib/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "lib/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "lib/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-katib-controller-1/charm/reactive/katib_controller.py", line 279, in start_charm
    'files/defaultTrialTemplate.yaml.tmpl'
  File "lib/charms/layer/caas_base.py", line 34, in pod_spec_set
    run_hook_command("pod-spec-set", spec)
  File "lib/charms/layer/caas_base.py", line 13, in run_hook_command
    run([cmd], stdout=PIPE, stderr=PIPE, check=True, input=stdin.encode('utf-8'))
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['pod-spec-set']' returned non-zero exit status 1.

2020-04-28 13:34:38 ERROR juju.worker.uniter.operation runhook.go:136 hook "install" (via explicit, bespoke hook script) failed: exit status 1
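
In the traceback above, `pod_spec_set` runs `pod-spec-set` from the `install` hook while Juju reports that `katib-controller/1` is not the leader. As an illustration only, and not a fix for the leadership behaviour reported in this bug, a charm could check the `is-leader` hook tool before calling `pod-spec-set`. This is a minimal sketch: the names `is_leader` and `pod_spec_set_if_leader` and the injectable `run` parameter are hypothetical and are not part of `charms.layer.caas_base`.

```python
import json
import subprocess


def is_leader(run=subprocess.run):
    """Ask the `is-leader` hook tool whether this unit is the leader.

    `run` is injectable (an assumption for this sketch) so the function
    can be exercised outside a hook context.
    """
    result = run(["is-leader", "--format=json"],
                 stdout=subprocess.PIPE, check=True)
    # The tool prints JSON `true` or `false`.
    return json.loads(result.stdout.decode("utf-8")) is True


def pod_spec_set_if_leader(spec, run=subprocess.run):
    """Send `spec` (a YAML string) to `pod-spec-set` only on the leader.

    Returns True if the spec was submitted, False if this unit is not
    the leader and the call was skipped.
    """
    if not is_leader(run):
        return False
    # Mirrors the call in the traceback: the spec is passed on stdin.
    run(["pod-spec-set"], input=spec.encode("utf-8"),
        stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
    return True
```

With a guard like this, a non-leader unit would skip the call instead of failing the hook. It would not explain why the unit is not the leader in the first place.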