Running vault authorize-charm action against non-leader fails

Bug #1915853 reported by Peter Sabaini
This bug affects 2 people
Affects      Status       Importance  Assigned to       Milestone
vault-charm  In Progress  Medium      Bartosz Woronicz

Bug Description

When running the authorize-charm action against a non-leader unit, I get an error:

2021-02-16 09:09:12 INFO juju-log Initializing Leadership Layer (is follower)
2021-02-16 09:09:12 INFO juju-log Invoking reactive handler: reactive/vault_handlers.py:217:configure_vault_mysql
2021-02-16 09:09:13 INFO juju-log Etcd detected, setting api_addr to http://172.21.30.59:8200
2021-02-16 09:09:13 INFO juju-log Invoking reactive handler: reactive/vault_handlers.py:254:mysql_setup
2021-02-16 09:09:13 INFO juju-log Invoking reactive handler: reactive/vault_handlers.py:283:database_not_ready
2021-02-16 09:09:13 INFO juju-log Invoking reactive handler: reactive/vault_handlers.py:366:cluster_connected
2021-02-16 09:09:13 INFO juju-log Invoking reactive handler: reactive/vault_handlers.py:511:send_vault_url_and_ca
2021-02-16 09:09:13 INFO juju-log Invoking reactive handler: reactive/vault_handlers.py:545:prime_assess_status
2021-02-16 09:09:13 INFO juju-log Invoking reactive handler: hooks/relations/tls-certificates/provides.py:45:joined:certificates
2021-02-16 09:09:16 INFO juju.worker.uniter.operation runhook.go:142 ran "update-status" hook (via explicit, bespoke hook script)
2021-02-16 09:09:32 INFO juju-log DEPRECATION WARNING: Function action_fail is being removed : moved to function_fail()
2021-02-16 09:09:32 INFO juju-log DEPRECATION WARNING: Function action_get is being removed : moved to function_get()
2021-02-16 09:09:32 WARNING authorize-charm ERROR cannot write leadership settings: cannot write settings: not the leader
2021-02-16 09:09:32 ERROR juju-log Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-vault-0/charm/actions/authorize-charm", line 191, in main
    action(args)
  File "/var/lib/juju/agents/unit-vault-0/charm/actions/authorize-charm", line 46, in authorize_charm_action
    hookenv.leader_set({vault.CHARM_ACCESS_ROLE_ID: role_id})
  File "/var/lib/juju/agents/unit-vault-0/.venv/lib/python3.6/site-packages/charmhelpers/core/hookenv.py", line 1166, in inner_translate_exc2
    return f(*args, **kwargs)
  File "/var/lib/juju/agents/unit-vault-0/.venv/lib/python3.6/site-packages/charmhelpers/core/hookenv.py", line 1227, in leader_set
    subprocess.check_call(cmd)
  File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['leader-set', 'local-charm-access-id=5e373741-72b3-969b-aa62-473847d3095f']' returned non-zero exit status 1.

Version: cs:vault-40

Changed in vault-charm:
status: New → Triaged
importance: Undecided → Medium
tags: added: onboarding
Changed in vault-charm:
assignee: nobody → mastier1 (mastier1)
Bartosz Woronicz (mastier1) wrote :

I can reproduce the bug with a simple clustered Vault setup:

$ juju run-action --wait vault/4 authorize-charm token=s.Fz4eOtkZXi2hmM7xEbdydWrx
unit-vault-4:
  UnitId: vault/4
  id: "8"
  message: 'subprocess.CalledProcessError: Command ''[''leader-set'', ''local-charm-access-id=c220509c-043e-e450-9231-2e3063cd8886'']''
    returned non-zero exit status 1.'
  results:
    Stderr: |
      ERROR cannot write leadership settings: cannot write settings: not the leader
  status: failed
  timing:
    completed: 2021-02-19 16:14:24 +0000 UTC
    enqueued: 2021-02-19 16:14:19 +0000 UTC
    started: 2021-02-19 16:14:22 +0000 UTC

It seems the leader check is not working correctly for the authorize-charm action in this case. I will investigate further.

Changed in vault-charm:
status: Triaged → Confirmed
status: Confirmed → In Progress
Bartosz Woronicz (mastier1) wrote :

The problem is that, when run on a non-leader unit, the action method sets the status to failed but does not return immediately, so it goes on to call hookenv.leader_set.

I also see the same problem in other actions that rely on hookenv.leader_set, such as get-csr, so the patch will cover those as well.

$ juju run-action --wait vault/4 get-csr token=s.vRujzbfRgzf5fBtZKnY4HcCC ; juju debug-log
unit-vault-4:
  UnitId: vault/4
  id: "43"
  message: 'subprocess.CalledProcessError: Command ''[''leader-set'', ''root-ca='']''
    returned non-zero exit status 1.'
  results:
    Stderr: |
      ERROR cannot write leadership settings: cannot write settings: not the leader
  status: failed
  timing:
    completed: 2021-02-22 13:22:20 +0000 UTC
    enqueued: 2021-02-22 13:22:19 +0000 UTC
    started: 2021-02-22 13:22:19 +0000 UTC
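
The missing early return described in this comment can be illustrated with a minimal sketch. It is not the charm's actual code: FakeHookenv is a hypothetical stand-in for the charmhelpers hookenv calls (is_leader, action_fail, leader_set) so that the example runs outside a Juju unit, and the two authorize_charm_* functions are simplified assumptions about the shape of the action before and after the fix.

```python
# Sketch only. FakeHookenv is a hypothetical stand-in for
# charmhelpers.core.hookenv so the example runs outside a Juju unit.


class FakeHookenv:
    def __init__(self, leader):
        self.leader = leader
        self.failures = []          # messages passed to action_fail
        self.leader_settings = {}   # values written via leader_set

    def is_leader(self):
        return self.leader

    def action_fail(self, message):
        # Marks the action as failed; it does NOT stop execution.
        self.failures.append(message)

    def leader_set(self, settings):
        # Mimics Juju rejecting leader-set on a follower.
        if not self.leader:
            raise RuntimeError(
                "cannot write leadership settings: not the leader")
        self.leader_settings.update(settings)


def authorize_charm_buggy(hookenv, role_id):
    """Assumed shape of the bug: failure is recorded, execution continues."""
    if not hookenv.is_leader():
        hookenv.action_fail("Please run on leader unit!")
        # Missing return: falls through to leader_set below.
    hookenv.leader_set({"local-charm-access-id": role_id})


def authorize_charm_fixed(hookenv, role_id):
    """Assumed shape of the fix: return right after recording the failure."""
    if not hookenv.is_leader():
        hookenv.action_fail("Please run on leader unit!")
        return
    hookenv.leader_set({"local-charm-access-id": role_id})


if __name__ == "__main__":
    follower = FakeHookenv(leader=False)
    authorize_charm_fixed(follower, "example-role-id")
    print(follower.failures)
```

In the buggy variant the follower still reaches leader_set and raises, which matches the CalledProcessError seen in the action output; in the fixed variant only the failure message is recorded.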

Bartosz Woronicz (mastier1) wrote :

The proposed fix is here:
https://review.opendev.org/c/openstack/charm-vault/+/776935

It seems that unit tests are not feasible here. I checked other charms and they do not cover actions in any way. The keystone charm, for instance, has some coverage, but only with dummy test actions, so it doesn't make sense.
On the other hand, a functional test could run all actions on a non-leader unit and check whether each returns the message "Please run on leader unit!". But does that make any sense?

tags: added: good-first-bug
removed: onboarding