Vault HA does not work as documented

Bug #1897818 reported by Cory Johns
Affects                      Importance   Assigned to
Charmed Kubernetes Testing   Medium       Cory Johns
Kubernetes Master Charm      Medium       Cory Johns
vault-charm                  Undecided    Unassigned

Bug Description

In lp:1833595 [1] we added instructions for using Vault in HA without EasyRSA using a manual step to transition from non-HA to HA. This was later changed [2] to just use an overlay, stating that testing had shown that the manual transition was no longer necessary. However, it seems that the situation may have regressed or there was a mistake in the original testing, because following the current instructions leads to the secondary Vault unit going into an errored state, as well as showing misleading info in the status (though I already proposed a fix for this [3]). It seems that our Vault CI test also does not test it in HA, so we didn't catch this.

Ideally, the Vault charm would not go into an errored state, but in the meantime, we may need to revert the overlay change to the docs.

[1]: https://bugs.launchpad.net/charm-kubernetes-master/+bug/1833595
[2]: https://github.com/charmed-kubernetes/kubernetes-docs/pull/285
[3]: https://review.opendev.org/#/c/667234/

Cory Johns (johnsca) wrote :

I also saw k8s-master go into an errored state during the manual HA transition. The cause was a connection error, which we could handle more gracefully.

George Kraft (cynerva)
Changed in charmed-kubernetes-testing:
importance: Undecided → Medium
Changed in charm-kubernetes-master:
importance: Undecided → Medium
Changed in charmed-kubernetes-testing:
status: New → Triaged
Changed in charm-kubernetes-master:
status: New → Triaged
Cory Johns (johnsca) wrote :

Doc revert: https://github.com/charmed-kubernetes/kubernetes-docs/pull/482

CI PR for single-step Vault HA: https://github.com/charmed-kubernetes/jenkins/pull/641 (will need to modify this in the meantime to do the multi-step, per the doc PR above)

Cory Johns (johnsca)
Changed in vault-charm:
status: New → In Progress
status: In Progress → Invalid
Changed in charm-kubernetes-master:
status: Triaged → Invalid
Changed in charmed-kubernetes-testing:
assignee: nobody → Cory Johns (johnsca)
status: Triaged → In Progress
Changed in charm-kubernetes-master:
status: Invalid → In Progress
assignee: nobody → Cory Johns (johnsca)
Cory Johns (johnsca) wrote :

This updates the Jenkins test to actually use Vault in HA mode without EasyRSA, and works around the status issues in both the Vault and k8s-master charm: https://github.com/charmed-kubernetes/jenkins/pull/641

This fixes the hook error in the k8s-master charm during the Vault restart: https://github.com/juju-solutions/layer-vault-kv/pull/11

This fixes the libjuju unit.resolved() issue (which would be used in the workaround for the hook error in the k8s-master charm): https://github.com/juju/python-libjuju/pull/485
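A minimal sketch of what the unit.resolved() workaround could look like in a test, not the code from the Jenkins PR. It assumes that libjuju Unit objects expose `name`, `workload_status`, and an async `resolved(retry=...)` method. The `resolve_errored_units` helper and the `FakeUnit` stand-in are hypothetical names used only so the example runs without a Juju controller.

```python
import asyncio


async def resolve_errored_units(units, retry=True):
    """Call resolved() on every unit whose workload status is "error".

    `units` is any iterable of objects exposing `name`, `workload_status`,
    and an async `resolved(retry=...)` method (assumed here to match
    libjuju's Unit objects). Returns the names of the units resolved.
    """
    resolved = []
    for unit in units:
        if unit.workload_status == "error":
            await unit.resolved(retry=retry)
            resolved.append(unit.name)
    return resolved


class FakeUnit:
    """Hypothetical stand-in for a libjuju Unit, for illustration only."""

    def __init__(self, name, workload_status):
        self.name = name
        self.workload_status = workload_status
        self.resolve_calls = []

    async def resolved(self, retry=False):
        # A real unit would ask the Juju controller to clear the hook error;
        # the fake just records the call and flips its own status.
        self.resolve_calls.append(retry)
        self.workload_status = "active"


demo_units = [
    FakeUnit("kubernetes-master/0", "error"),
    FakeUnit("kubernetes-master/1", "active"),
]
demo_result = asyncio.run(resolve_errored_units(demo_units))
```

In a real test the units would presumably come from the deployed model rather than from `FakeUnit`, and the call would only be needed until the layer-vault-kv fix above is in the charm.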

The status issues in the Vault charm were fixed in https://review.opendev.org/c/openstack/charm-vault/+/782585 and https://review.opendev.org/c/openstack/charm-vault/+/782585

Cory Johns (johnsca) wrote :

Also created https://review.opendev.org/c/openstack/charm-vault/+/784608 so that Vault will report when a manual restart is needed to pick up the HA config.
