Comment 40 for bug 2056193

John A Meinel (jameinel) wrote :

Summary from today's investigation.

We checked that the actions were still listed as "pending" but nothing was running. We *were* able to run "juju cancel-task XX" and have the task move from "pending" to "cancelled".

We also saw that "juju exec httprequest-lego-provider/0 -- hostname" would not run, and neither would "juju exec --execution-group=test -- hostname".
Both of them show up in `juju operations`, but only as "pending".

We also tested and saw that "juju exec nginx-ingress-integrator/0 -- hostname" worked correctly: it showed up in `juju operations` and returned its output to the juju CLI. (So we know that exec-as-actions works in general, just not for this unit agent.)

We then dug into `juju debug-log` a bit, and turned logging up to TRACE for `juju.worker.uniter` and `juju.worker.caasoperator`.

We didn't get a chance to try restarting just the unit agent to see whether there were any obvious failures in the log.

We did inspect /var/log/juju/machine-lock.log and could see that it has entries for hook executions (update-status-hook) but does not appear to ever record anything for actions. (We did not see any record of anything in nginx-ingress-integrator, which did have successful hook runs.)

We did see one thing that was a red herring: a debug log entry that read "juju-exec listener stopping". It is a red herring because that listener serves the local "as a separate process, I want to be able to run a command inside a charm context" case.

We confirmed that on the local machine `/usr/bin/juju-exec httprequest-lego-provider/0 relation-ids` returned correct information. (it doesn't use Actions to run the command, but at least the unit agent is receiving requests and responding)

We sourced /etc/profile.d/juju_introspection.sh and ran `juju_engine_report`:
https://pastebin.canonical.com/p/BGXRjzYvNF/

Everything seems reasonable. All the start counts are 1, and the entries that are stopped seem reasonable:
```
  signal-handler:
    error: '"dead-flag" not set: dependency not available'
    inputs:
    - dead-flag
    state: stopped
  uniter:
...
    report:
      local-state:
        installed: true
        leader: true
        operation-kind: continue
        operation-step: pending
        removed: false
        started: true
        stopped: false
  caas-zombie-prober:
    error: '"dead-flag" not set: dependency not available'
    inputs:
    - probe-http-server
    - dead-flag
    state: stopped
  upgrade-steps-runner:
    inputs:
    - agent
    - api-caller
    - upgrade-steps-gate
    - not-dead-flag
    start-count: 1
    state: stopped
```
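
For anyone repeating this check on a longer report, here is a rough Python sketch for pulling out the stopped entries instead of eyeballing them. It is not part of Juju. The function name `stopped_workers` and the sample text are made up for illustration, and it assumes the layout seen in the excerpt above: entry names indented by two spaces, with `state:` and `error:` fields indented by four. It is a plain line scan, not a real YAML parser, so treat the result as a hint only.

```python
import re

# Illustrative sample only: trimmed from the excerpt above, plus one
# hypothetical started entry ("example-started-worker") for contrast.
SAMPLE = """\
  signal-handler:
    error: '"dead-flag" not set: dependency not available'
    inputs:
    - dead-flag
    state: stopped
  upgrade-steps-runner:
    inputs:
    - agent
    start-count: 1
    state: stopped
  example-started-worker:
    start-count: 1
    state: started
"""


def stopped_workers(report_text):
    """Return {name: error or None} for entries whose state is 'stopped'.

    Assumes entry names sit at exactly two spaces of indentation and
    their direct fields at exactly four, as in the excerpt.
    """
    entries = {}
    current = None
    for line in report_text.splitlines():
        # An entry header such as "  signal-handler:".
        header = re.match(r"^  ([A-Za-z0-9_-]+):\s*$", line)
        if header:
            current = header.group(1)
            entries[current] = {"state": None, "error": None}
            continue
        if current is None:
            continue
        # Only direct fields (four spaces); deeper nesting is ignored.
        field = re.match(r"^    (state|error):\s*(.*)$", line)
        if field:
            # Drop the single quotes YAML puts around the error text.
            entries[current][field.group(1)] = field.group(2).strip().strip("'")
    return {
        name: info["error"]
        for name, info in entries.items()
        if info["state"] == "stopped"
    }


if __name__ == "__main__":
    for name, error in stopped_workers(SAMPLE).items():
        print(name, "->", error or "(no error recorded)")
```

An entry that is stopped with no error, like upgrade-steps-runner here, shows up with `None`, which makes it easy to tell apart from the ones waiting on "dead-flag".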