Vault becomes inaccessible if an etcd unit is removed/down
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
vault-charm | Fix Released | Critical | Liam Young | | |
Bug Description
If Vault is deployed in an HA configuration and the etcd unit with the lowest IP address is shut down, Vault becomes inaccessible:
$ export VAULT_ADDR="http://
$ export VAULT_TOKEN=
$ vault status
Key Value
--- -----
Seal Type shamir
Sealed false
Total Shares 1
Threshold 1
Version 0.10.1
Cluster Name vault-cluster-
Cluster ID 6f5b5c26-
HA Enabled true
HA Cluster https:/
HA Mode active
$ juju status etcd | awk '/^etcd\// {print $5;}' | sort | head -n1
10.53.82.119
$ juju status | awk '/started.
juju-4d60d9-2
$ lxc stop juju-4d60d9-2
$ vault status
Error checking leader status: Error making API request.
URL: GET http://
Code: 500. Errors:
* context deadline exceeded
Looking at Vault's config on a vault unit, the etcd units are listed:
$ sudo grep -A5 ha_storage /var/snap/
address = "https:/
It looks suspiciously like it is supposed to work through them in order but never gets past the first one.
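The expected "work through them in order" behaviour can be sketched as below. This is only an illustration of the failover logic described above, not Vault's or the etcd client's actual code. The helper names `parse_endpoints` and `first_reachable`, the comma-separated address format, and the example addresses are all assumptions made for the sketch.

```python
from typing import Callable, Iterable, List, Optional


def parse_endpoints(address: str) -> List[str]:
    """Split an address string into endpoints.

    Assumption: the endpoints are comma-separated, in listed order.
    """
    return [part.strip() for part in address.split(",") if part.strip()]


def first_reachable(endpoints: Iterable[str],
                    probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first endpoint for which probe() succeeds, else None.

    probe is a hypothetical health check supplied by the caller.
    """
    for endpoint in endpoints:
        try:
            if probe(endpoint):
                return endpoint
        except OSError:
            # Connection errors and timeouts are OSError subclasses:
            # treat the endpoint as down and try the next one.
            continue
    return None


if __name__ == "__main__":
    # Hypothetical addresses; pretend the first (lowest-IP) unit is down.
    address = ("https://10.0.0.1:2379,"
               "https://10.0.0.2:2379,"
               "https://10.0.0.3:2379")
    down = {"https://10.0.0.1:2379"}
    print(first_reachable(parse_endpoints(address),
                          lambda ep: ep not in down))
```

The behaviour reported here matches what would happen if the loop stopped after the first endpoint instead of continuing to the next one.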
summary:
- Vault becomes inaccessible if a vault unit and an etcd unit are removed
+ Vault becomes inaccessible if an etcd unit is removed
Changed in vault-charm:
importance: Undecided → Critical
Changed in vault-charm:
status: Confirmed → In Progress
summary:
- Vault becomes inaccessible if an etcd unit is removed
+ Vault becomes inaccessible if an etcd unit is removed/down
Changed in vault-charm:
milestone: none → 19.04
Changed in vault-charm:
status: Fix Committed → Fix Released
Confirmed: I was able to reproduce this issue with a three-unit vault cluster. Shutting down one of the etcd units results in the described error message.