failed to retry join raft cluster / error during raft bootstrap init call: Error making API request: unsupported path

Bug #2017514 reported by Dominik Bender
This bug affects 2 people
Affects      Status  Importance  Assigned to  Milestone
vault-charm  New     Undecided   Unassigned

Bug Description

After upgrading from Juju 2.9 to 3.1, the unit vault/2 had a problem, so I redeployed it. Unfortunately, the new unit could not join the cluster.

The third vault unit (previously vault/2) failed to join the raft cluster.
The other two units are ready and active.

---
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: 2023-04-24T00:01:52.768Z [ERROR] core: failed to retry join raft cluster: retry=2s
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: 2023-04-24T00:01:52.768Z [WARN] core: join attempt failed:
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: error=
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: | error during raft bootstrap init call: Error making API request.
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: |
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: | URL: PUT http://10.105.121.5:8200/v1/sys/storage/raft/bootstrap/challenge
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: | Code: 404. Errors:
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: |
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: | * unsupported path
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]:
Apr 24 00:01:52 juju-b9f7b8-2-lxd-48 vault[287]: 2023-04-24T00:01:52.768Z [ERROR] core: failed to retry join raft cluster: retry=2s
Apr 24 00:01:53 juju-b9f7b8-2-lxd-48 vault[287]: 2023-04-24T00:01:53.037Z [INFO] http: Accept error: accept tcp 127.0.0.1:8220: accept4: too many open files; retrying in 1s
---
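
The final log line above reports `accept4: too many open files` on the local listener at 127.0.0.1:8220, which may mean the vault process on the joining unit exhausted its file descriptor limit while retrying the join. As a rough check, and assuming a Linux host with a readable /proc, the sketch below counts a process's open descriptors and reads its soft limit. The helper names `count_open_fds` and `soft_nofile_limit` are hypothetical and are not part of the charm or of Vault.

```python
import os
import sys


def count_open_fds(pid, proc_root="/proc"):
    """Count entries in <proc_root>/<pid>/fd (one entry per open descriptor).

    Reading another user's process usually requires root.
    """
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def soft_nofile_limit(pid, proc_root="/proc"):
    """Return the soft 'Max open files' limit from <proc_root>/<pid>/limits.

    Returns None if the limit is 'unlimited' or the line is not found.
    """
    with open(os.path.join(proc_root, str(pid), "limits")) as fh:
        for line in fh:
            if line.startswith("Max open files"):
                # Fields: Max open files <soft> <hard> <units>
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    return None


if __name__ == "__main__" and len(sys.argv) == 2:
    pid = int(sys.argv[1])
    print("pid:", pid)
    print("open fds:", count_open_fds(pid))
    print("soft limit:", soft_nofile_limit(pid))
```

For example, running the script as root with the PID from the journal prefix (287 in the excerpt above) would show how close the process is to its limit. A count at or near the soft limit would be consistent with the `too many open files` message, although it would not by itself explain the 404 from the peer.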

Model    Controller  Cloud/Region  Version  SLA          Timestamp
dbi7-c1  dbi7-prod   dbi7/default  3.1.2    unsupported  10:12:41Z

App                 Version  Status   Scale  Charm         Channel      Rev  Exposed  Message
vault               1.8.8    active       2  vault         1.8/stable   108  no       Unit is ready (active: true, mlock: disabled)
vault-hacluster              blocked      2  hacluster     latest/edge  118  no       Insufficient peer units for ha cluster (require 3)
vault-mysql-router  8.0.32   active       2  mysql-router  8.0/stable    35  no       Unit is ready
---

vault status (leader)
Key           Value
---           -----
Seal Type     shamir
Initialized   true
Sealed        false
Total Shares  5
Threshold     3
Version       1.8.8
Cluster Name  vault-cluster-58ab99ae
Cluster ID    0ffa6358-7fde-bf1c-10e4-96bac1b8227a
HA Enabled    false

Tags: cdo-qa
Dominik Bender (ephermeral) wrote :

I tried adding a third unit again today. The unit is stuck in "Vault needs to be initialized".
Executing changes:
- set application options for vault-hacluster
- add unit vault/5 to 2/lxd/0

vault/5                 blocked  executing  2/lxd/49  10.105.121.85  8200/tcp  Vault needs to be initialized
  vault-hacluster/5     active   idle                 10.105.121.85            Unit is ready and clustered
  vault-mysql-router/5  active   idle                 10.105.121.85            Unit is ready

debug-log shows "Failed to join raft cluster: HTTPConnectionPool(host='127.0.0.1', port=8220): Read timed out":
...
unit-vault-5: 22:09:13 INFO unit.vault/5.juju-log certificates:439: Invoking reactive handler: reactive/vault_handlers.py:474:cluster_connected
unit-vault-5: 22:09:13 INFO unit.vault/5.juju-log certificates:439: Invoking reactive handler: reactive/vault_handlers.py:516:join_raft_peers
unit-vault-5: 22:09:13 INFO unit.vault/5.juju-log certificates:439: Joining raft cluster address http://10.105.121.5:8200
unit-vault-5: 22:09:43 WARNING unit.vault/5.juju-log certificates:439: Failed to join raft cluster: HTTPConnectionPool(host='127.0.0.1', port=8220): Read timed out. (read timeout=30)
unit-vault-5: 22:09:43 INFO unit.vault/5.juju-log certificates:439: Joining raft cluster address http://10.105.121.6:8200
unit-vault-5: 22:10:13 WARNING unit.vault/5.juju-log certificates:439: Failed to join raft cluster: HTTPConnectionPool(host='127.0.0.1', port=8220): Read timed out. (read timeout=30)
unit-vault-5: 22:10:13 INFO unit.vault/5.juju-log certificates:439: Invoking reactive handler: reactive/vault_handlers.py:690:send_vault_url_and_ca
unit-vault-5: 22:10:13 WARNING unit.vault/5.juju-log certificates:439: Use of remote_binding in publish_url is deprecated. See LP Bug #1895185
unit-vault-5: 22:10:13 INFO unit.vault/5.juju-log certificates:439: Invoking reactive handler: reactive/vault_handlers.py:736:prime_assess_status
unit-vault-5: 22:10:13 INFO unit.vault/5.juju-log certificates:439: Invoking reactive handler: hooks/relations/tls-certificates/provides.py:45:joined:certificates
unit-vault-5: 22:09:12 INFO unit.vault/5.juju-log certificates:439: Reactive main running for hook certificates-relation-joined
unit-vault-5: 22:09:12 ERROR unit.vault/5.juju-log certificates:439: Unable to find implementation for relation: peers of vault-ha
unit-vault-5: 22:09:13 INFO unit.vault/5.juju-log certificates:439: Initializing Leadership Layer (is follower)
unit-vault-5: 22:09:13 INFO unit.vault/5.juju-log certificates:439: Initializing Snap Layer
unit-vault-5: 22:09:13 INFO unit.vault/5.juju-log certificates:439: Invoking reactive handler: reactive/vault_handlers.py:294:configure_vault_raft
unit-vault-5: 22:09:13 INFO unit.vault/5.juju-log certificates:439: Making dir /var/snap/vault/common/data/ root:root 700
unit-vault-5: 22:09:13 INFO unit.vault/5.juju-log certificates:439: Invoking reactive handler: reactive/vault_handlers.py:353:mysql_setup
unit-vault-5: 22:09:13 INFO unit.vault/5.juju-log certificates:439: Invoking reactive handler: reactive/vault_handlers.py:384:database_not_ready
...

syslog:
...
Apr 24 22:18:54 juju-b9f7b8-2-lxd-49 vault[6220]: 20...

description: updated
Dominik Bender (ephermeral) wrote :

Strange. I tried again today, and this time the third unit joined the cluster.

App                 Version  Status  Scale  Charm         Channel      Rev  Exposed  Message
vault               1.8.8    active      3  vault         1.8/stable   108  no       Unit is ready (active: true, mlock: disabled)
vault-hacluster              active      3  hacluster     latest/edge  118  no       Unit is ready and clustered
vault-mysql-router  8.0.32   active      3  mysql-router  8.0/stable    35  no       Unit is ready

Unit                     Workload  Agent  Machine   Public address  Ports     Message
vault/0*                 active    idle   0/lxd/13  10.105.121.5    8200/tcp  Unit is ready (active: true, mlock: disabled)
  vault-hacluster/0      active    idle             10.105.121.5              Unit is ready and clustered
  vault-mysql-router/0   active    idle             10.105.121.5              Unit is ready
vault/1                  active    idle   1/lxd/14  10.105.121.6    8200/tcp  Unit is ready (active: true, mlock: disabled)
  vault-hacluster/1*     active    idle             10.105.121.6              Unit is ready and clustered
  vault-mysql-router/1*  active    idle             10.105.121.6              Unit is ready
vault/6                  active    idle   2/lxd/50  10.105.121.86   8200/tcp  Unit is ready (active: true, mlock: disabled)
  vault-hacluster/6      active    idle             10.105.121.86             Unit is ready and clustered
  vault-mysql-router/6   active    idle             10.105.121.86             Unit is ready

Konstantinos Kaskavelis (kaskavel) wrote :

The Solutions QA team has a failed run with similar behavior:

Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.145Z [WARN] core: join attempt failed:
Jul 15 12:49:23 vault3-9 vault[24682]: error=
Jul 15 12:49:23 vault3-9 vault[24682]: | error during raft bootstrap init call: Error making API request.
Jul 15 12:49:23 vault3-9 vault[24682]: |
Jul 15 12:49:23 vault3-9 vault[24682]: | URL: PUT http://10.246.167.185:8200/v1/sys/storage/raft/bootstrap/challenge
Jul 15 12:49:23 vault3-9 vault[24682]: | Code: 503. Errors:
Jul 15 12:49:23 vault3-9 vault[24682]: |
Jul 15 12:49:23 vault3-9 vault[24682]: | * Vault is sealed
Jul 15 12:49:23 vault3-9 vault[24682]:
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.145Z [ERROR] core: failed to retry join raft cluster: retry=2s
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.145Z [ERROR] core: failed to retry join raft cluster: retry=2s
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.152Z [INFO] core: security barrier not initialized
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.152Z [INFO] core: attempting to join possible raft leader node: leader_addr=http://10.246.165.221:8200
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.154Z [INFO] core: security barrier not initialized
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.154Z [INFO] core: attempting to join possible raft leader node: leader_addr=http://10.246.165.221:8200
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.154Z [INFO] core: security barrier not initialized
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.154Z [INFO] core: security barrier not initialized
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.154Z [INFO] core: attempting to join possible raft leader node: leader_addr=http://10.246.165.221:8200
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.154Z [INFO] core: attempting to join possible raft leader node: leader_addr=http://10.246.165.221:8200
Jul 15 12:49:23 vault3-9 vault[24682]: 2023-07-15T12:49:23.155Z [WARN] core: join attempt failed:
Jul 15 12:49:23 vault3-9 vault[24682]: error=
Jul 15 12:49:23 vault3-9 vault[24682]: | error during raft bootstrap init call: Error making API request.
Jul 15 12:49:23 vault3-9 vault[24682]: |
Jul 15 12:49:23 vault3-9 vault[24682]: | URL: PUT http://10.246.165.221:8200/v1/sys/storage/raft/bootstrap/challenge
Jul 15 12:49:23 vault3-9 vault[24682]: | Code: 404. Errors:
Jul 15 12:49:23 vault3-9 vault[24682]: |
Jul 15 12:49:23 vault3-9 vault[24682]: | * unsupported path

Failed run: https://solutions.qa.canonical.com/testruns/9e59f6e9-206c-483a-a020-c7dabb57fd3e

Logs: https://oil-jenkins.canonical.com/artifacts/9e59f6e9-206c-483a-a020-c7dabb57fd3e/index.html
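
The excerpt above contains two different failures from two peers: a 503 (`Vault is sealed`) from 10.246.167.185 and a 404 (`unsupported path`) from 10.246.165.221. When comparing runs, it can help to pull just the URL, status code, and error text out of the journal. The following is a minimal sketch that assumes the line layout shown in this report; `parse_join_failures` is a hypothetical helper, not part of Vault or the charm.

```python
import re

# Patterns match the multi-line error block Vault writes to the journal,
# as seen in the excerpts in this report (layout is an assumption).
URL_RE = re.compile(r"URL:\s+(?P<method>[A-Z]+)\s+(?P<url>\S+)")
CODE_RE = re.compile(r"Code:\s+(?P<code>\d+)\.")
ERR_RE = re.compile(r"\|\s+\*\s+(?P<msg>.+)$")

SAMPLE = [
    "Jul 15 12:49:23 vault3-9 vault[24682]: | URL: PUT http://10.246.167.185:8200/v1/sys/storage/raft/bootstrap/challenge",
    "Jul 15 12:49:23 vault3-9 vault[24682]: | Code: 503. Errors:",
    "Jul 15 12:49:23 vault3-9 vault[24682]: |",
    "Jul 15 12:49:23 vault3-9 vault[24682]: | * Vault is sealed",
    "Jul 15 12:49:23 vault3-9 vault[24682]: | URL: PUT http://10.246.165.221:8200/v1/sys/storage/raft/bootstrap/challenge",
    "Jul 15 12:49:23 vault3-9 vault[24682]: | Code: 404. Errors:",
    "Jul 15 12:49:23 vault3-9 vault[24682]: |",
    "Jul 15 12:49:23 vault3-9 vault[24682]: | * unsupported path",
]


def parse_join_failures(lines):
    """Return one dict per failed API request found in the log lines."""
    failures = []
    current = None
    for line in lines:
        match = URL_RE.search(line)
        if match:
            # A URL line starts a new failure record.
            current = {
                "method": match["method"],
                "url": match["url"],
                "code": None,
                "errors": [],
            }
            failures.append(current)
            continue
        if current is None:
            continue
        match = CODE_RE.search(line)
        if match:
            current["code"] = int(match["code"])
            continue
        match = ERR_RE.search(line)
        if match:
            current["errors"].append(match["msg"].strip())
    return failures


if __name__ == "__main__":
    for failure in parse_join_failures(SAMPLE):
        print(failure["code"], failure["url"], failure["errors"])
```

Grouping the results by peer address and status code makes it easier to see whether a given peer was sealed or was answering 404 on the raft bootstrap path at the time of the join attempt.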

tags: added: cdo-q
tags: added: cdo-qa
removed: cdo-q