[percona cluster -> mysql-innodb-cluster upgrade] "Vault cannot authorize approle" after unseal and change of database.

Bug #1946053 reported by Xav Paice
This bug affects 4 people
Affects                Status   Importance  Assigned to  Milestone
OpenStack Charm Guide  Triaged  Medium      Unassigned
vault-charm            Triaged  Medium      Unassigned

Bug Description

cs:vault-46, 3 units.

Units were upgraded to Focal from Bionic.

We removed the relation to mysql (bionic), and added a relation to mysql-router, per https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/percona-series-upgrade-to-focal.html

I then ran the 'pause' action on one unit, followed by 'resume'. After running resume, I logged into the unit and unsealed it.

The vault unit status is 'blocked', 'idle', "Vault cannot authorize approle"

The unit log contains the following traceback:

2021-10-05 03:35:41 DEBUG update-status active
2021-10-05 03:35:41 DEBUG worker.uniter.jujuc server.go:204 running hook tool "application-version-set"
2021-10-05 03:35:41 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:35:41 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:35:41 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:35:43 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:35:45 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:35:49 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:35:57 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:36:13 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:36:45 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:37:45 DEBUG worker.uniter.jujuc server.go:204 running hook tool "leader-get"
2021-10-05 03:37:45 DEBUG worker.uniter.jujuc server.go:204 running hook tool "juju-log"
2021-10-05 03:37:45 WARNING juju-log InternalServerError: Unable to athorize approle. This may indicate failure to communicate with the database
2021-10-05 03:37:45 DEBUG worker.uniter.jujuc server.go:204 running hook tool "juju-log"
2021-10-05 03:37:45 ERROR juju-log Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-vault-8/charm/reactive/vault_handlers.py", line 856, in client_approle_authorized
    vault.get_local_client()
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 339, in wrapped_f
    return self(f, *args, **kw)
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 430, in __call__
    do = self.iter(retry_state=retry_state)
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 378, in iter
    raise retry_exc.reraise()
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 206, in reraise
    raise self.last_attempt.result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/tenacity/__init__.py", line 433, in __call__
    result = fn(*args, **kwargs)
  File "/var/lib/juju/agents/unit-vault-8/charm/lib/charm/vault.py", line 254, in get_local_client
    client.auth_approle(app_role_id)
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/hvac/v1/__init__.py", line 2072, in auth_approle
    return self.auth('/v1/auth/{0}/login'.format(mount_point), json=params, use_token=use_token)
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/hvac/v1/__init__.py", line 1726, in auth
    return self._adapter.auth(
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/hvac/adapters.py", line 159, in auth
    response = self.post(url, **kwargs).json()
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/hvac/adapters.py", line 103, in post
    return self.request('post', url, **kwargs)
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/hvac/adapters.py", line 233, in request
    utils.raise_for_error(response.status_code, text, errors=errors)
  File "/var/lib/juju/agents/unit-vault-8/.venv/lib/python3.8/site-packages/hvac/utils.py", line 39, in raise_for_error
    raise exceptions.InternalServerError(message, errors=errors)
hvac.exceptions.InternalServerError: local node not active but active cluster node not found
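
The timestamps of the repeated "leader-get" calls in the log above suggest that the charm retried with a doubling delay, capped at 60 seconds, for about two minutes before giving up. That would fit the tenacity frames in the traceback. A small sketch that extracts the gaps from those log lines follows; treating each distinct timestamp as one retry attempt is an assumption and has not been confirmed against the charm source.

```python
from datetime import datetime

# Timestamps of the "leader-get" hook tool calls, copied from the unit log.
LEADER_GET_TIMES = [
    "2021-10-05 03:35:41",
    "2021-10-05 03:35:41",
    "2021-10-05 03:35:41",
    "2021-10-05 03:35:43",
    "2021-10-05 03:35:45",
    "2021-10-05 03:35:49",
    "2021-10-05 03:35:57",
    "2021-10-05 03:36:13",
    "2021-10-05 03:36:45",
    "2021-10-05 03:37:45",
]


def retry_gaps(stamps):
    """Return the gaps in seconds between consecutive distinct timestamps."""
    times = sorted({datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps})
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


print(retry_gaps(LEADER_GET_TIMES))  # [2, 2, 4, 8, 16, 32, 60]
```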

Revision history for this message
Xav Paice (xavpaice) wrote :

Looks like the units are attempting to connect to the 'old' mysql host, rather than mysql-router, until Vault is restarted. Restarting the first unit resulted in that error, but when the second unit was restarted, the cluster re-formed and hooks started to run again.

This is therefore an issue in three parts:
- Vault needs to be restarted after changing the db relation, but the charm does not restart it (and, naturally, it then needs unsealing).
- The error message supplied by the charm doesn't actually tell us that.
- Vault needs to be added to the guide at https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/percona-series-upgrade-to-focal.html with some notes on how to properly restart.
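
After restarting and unsealing, it can help to check the state of each unit directly rather than waiting for hooks to recover. A minimal sketch, assuming each unit's Vault API is reachable over HTTP and relying on the status codes Vault documents for its "/v1/sys/health" endpoint. The function names are hypothetical and not part of the charm.

```python
import urllib.error
import urllib.request

# Status codes documented for Vault's /v1/sys/health endpoint.
HEALTH_STATES = {
    200: "active",
    429: "standby",
    472: "DR secondary",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}


def describe_health(status_code):
    """Map a /v1/sys/health status code to a short state name."""
    return HEALTH_STATES.get(status_code, "unknown ({})".format(status_code))


def unit_health(base_url, timeout=5):
    """Query one Vault unit and return its state as a string."""
    url = base_url.rstrip("/") + "/v1/sys/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return describe_health(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx codes; Vault uses them to report state.
        return describe_health(err.code)
    except OSError as err:
        # Connection refused, timeout, DNS failure and similar.
        return "unreachable ({})".format(err)


def has_active_node(states):
    """Given {unit: state}, report whether any unit holds the active role.

    If every unit is standby or sealed, no node is active. That appears to
    be the situation behind "local node not active but active cluster node
    not found" (an interpretation, not confirmed against Vault's source).
    """
    return any(state == "active" for state in states.values())
```

For example, unit_health("http://10.0.0.11:8200") (a made-up address) would return "sealed" for a unit that has been restarted but not yet unsealed. Collecting the result for all three units and passing the mapping to has_active_node shows whether the cluster has re-formed.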

Revision history for this message
Navdeep (navdeep-bjn) wrote :

Seeing the same issue with vault 1.7.9 revision 107

tags: added: sts
Revision history for this message
Moises Emilio Benzan Mora (moisesbenzan) wrote :
tags: added: cdo-qa foundations-engine
Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Triaging as Medium, since a workaround exists. It does need to be documented somewhere, probably in the charm-guide as part of upgrades.

Changed in vault-charm:
importance: Undecided → Medium
status: New → Triaged
Changed in charm-guide:
status: New → Triaged
importance: Undecided → Medium
summary: - "Vault cannot authorize approle" after unseal
+ [percona cluster -> mysql-innodb-cluster upgrade] "Vault cannot
+ authorize approle" after unseal and change of database.
Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

@moisesbenzan was your bug actually due to an upgrade/switch from percona-cluster to mysql-router/mysql-innodb-cluster? If not, it's almost certainly a different bug. Thanks.
