get-csr action should be labeled as destructive

Bug #1947265 reported by Paul Goins
This bug affects 4 people
Affects: vault-charm (status tracked in Trunk)

Series  Status        Importance  Assigned to  Milestone
1.7     In Progress   High        Jadon Naas
1.8     In Progress   High        Jadon Naas
Trunk   Fix Released  High        Liam Young

Bug Description

I recently hit an issue on a cloud where the vault leader was in a blocked state with this status:

"Missing CA cert"

Upon investigation, I saw that the root-ca value, which was previously set via an upload-signed-csr action, was no longer set in the leader data; "leader-get" didn't return anything for it.

I found that breaking and re-establishing a certificates relation with another app (I used heat) would result in no certificate data coming across to that app, and thus that app being unable to start in HTTPS mode.

I traced this backwards and found in the Juju audit logs that someone had run the get-csr action a few months ago.

get-csr sounds benign, but it has some important side effects which affect how the charm operates:

    clear_flag('charm.vault.ca.ready')
    hookenv.leader_set(
        {'root-ca': None})

Unfortunately, the above has proven to not be benign, as described.

Can this action be labeled as destructive somehow, or gain an extra confirmation parameter, so that it doesn't end up wiping the root-ca leader value? Other parts of the charm use that value as a flag, and its absence also blocks charm.vault.ca.ready from being set.
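
One way the requested safeguard could look is an explicit confirmation parameter on the action. The sketch below is hypothetical and is not the charm's code: the `force` parameter, the `get_csr_guarded` function, the `ActionRefused` exception, and the injected stand-ins for leader data, flags, and CSR generation are all assumptions made for illustration. Only the `root-ca` key and the `charm.vault.ca.ready` flag come from the snippet above.

```python
from typing import Callable, Dict, Optional, Set


class ActionRefused(Exception):
    """Raised when the action would discard CA state without confirmation."""


def get_csr_guarded(
    params: Dict[str, object],
    leader_data: Dict[str, Optional[str]],
    flags: Set[str],
    generate_csr: Callable[[], str],
) -> str:
    """Hypothetical guard around the destructive part of get-csr.

    params       -- action parameters; 'force' is an assumed name
    leader_data  -- stand-in for the Juju leader settings
    flags        -- stand-in for the charm's reactive flags
    generate_csr -- stand-in for the call that asks Vault for a new CSR
    """
    ca_present = leader_data.get('root-ca') is not None
    if ca_present and params.get('force') is not True:
        # Refuse before touching any state, so an accidental run is harmless.
        raise ActionRefused(
            "A CA certificate is already installed; re-run with force=true "
            "to discard it and generate a new CSR.")

    csr = generate_csr()
    # Same side effects as the snippet above, now reached only deliberately.
    flags.discard('charm.vault.ca.ready')
    leader_data['root-ca'] = None
    return csr
```

With a guard like this, running the action against a unit that already has a CA raises an error and leaves `root-ca` untouched unless the operator opts in explicitly.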

Tags: sts
Liam Young (gnuoy)
Changed in vault-charm:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Liam Young (gnuoy)
Revision history for this message
Andrea Ieri (aieri) wrote :

Subscribing field-high. We've hit this again; adding an extra action parameter as a safety mechanism would be tremendously useful.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

I can reproduce this as follows:

1. Deploy an OpenStack environment [1]
2. juju run-action --wait vault/leader get-csr
3. vault/0 reports "Missing CA cert"
4. Inside vault/0:
export VAULT_ADDR='http://127.0.0.1:8200'
vault read charm-pki-local/cert/ca_chain

Error reading charm-pki-local/cert/ca_chain: Error making API request.

URL: GET http://127.0.0.1:8200/v1/charm-pki-local/cert/ca_chain
Code: 500. Errors:

* stored CA information not able to be parsed

tags: added: sts
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Hello

I think the problem is that the vault charm doesn't separate the root CA PKI path from the intermediate path.

Please refer to the Vault tutorial ( https://learn.hashicorp.com/tutorials/vault/pki-engine ).

It separates them into pki and pki_int.

So even if get_csr overwrites the intermediate, it can be recovered with step 2.4 of the tutorial.

But in our charm environment, we only have charm-pki-local.

When we run get_csr, it breaks the cert, so we need to produce a signed CSR; step 2.4 doesn't work.
It shows us the "stored CA information not able to be parsed" error.

Revision history for this message
Liam Young (gnuoy) wrote :

Hi seyeongkim,
Thank you for your update to the bug. I think what you have highlighted, the root CA sharing a mount point with the intermediate CA, is a bug, but it is not this bug. I believe that in most deployments the charm config option auto-generate-root-ca-cert is left at its default value of False, in which case there is no root CA in vault. Users generate a CSR for vault to act as an intermediate CA. The CSR is then signed externally to vault (this replaces step 2.4 that you referenced in your comment) and uploaded.

Revision history for this message
Liam Young (gnuoy) wrote :

The destructive side-effect of the get-csr action is a result of the behaviour of vault, not of the charm. This is noted at https://www.vaultproject.io/api-docs/secret/pki#generate-intermediate : "This will overwrite any previously existing CA private key." It can also be seen by interacting directly with the vault API:

juju run --unit vault/0 "leader-get local-charm-access-id"
f52eeaeb-da57-088b-c3e8-0e0437a01bd6

juju ssh vault/0
export VAULT_ADDR='http://127.0.0.1:8220'
export VAULT_TOKEN=$(vault write auth/approle/login role_id=f52eeaeb-da57-088b-c3e8-0e0437a01bd6 | awk '/token\s/ {print $NF}')

# Generating a certificate works:
vault write charm-pki-local/issue/local common_name="test-0.project.serverstack" ttl="24h"

# Generating a new csr:
vault write charm-pki-local/intermediate/generate/internal common_name="Vault Intermediate Authority"
Key Value
--- -----
csr -----BEGIN CERTIFICATE REQUEST-----
MIICbDCCAVQCAQAwJzElMCMGA1UEAxMcVmF1bHQgSW50ZXJtZWRpYXRlIEF1dGhv
cml0eTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBALmstpUShGX4DalT
fnopZJW3xq0TwraMXQReOmv3lvqE70XdszqqUP26KyMxvbc0WM9Qeqpa8QqeEAcx
N13qVCrbQb095cp5YkHBRflZvypnv80urJMkXzaKs895a8T0GI7xZ34RautVWxhU
ew3ryZCcWbRIX0wXMq+qv7lVZOTjNYgkGymHtjx8a4RNOSG9H4ZHFfaD/xUeC6Sv
/m0NGRiXKaFX43s1dcdcKLHfVwdXYGtHJEf3kCWJ1GpVrJy3i4RkDxg7ZVPXb2U1
ai3oM5coKBboRmUjJrUwVhENiy4whhe1hFZKBEqyXdagWeMCjvToo3m5i0fhn4y0
AndwJPsCAwEAAaAAMA0GCSqGSIb3DQEBCwUAA4IBAQA0IMHQ5Uz1n84bgJDONxQ3
eG1kd37BAb6eDLZ3iVfyOIzVSZ7sV+liN9xn/StlY6JfRF+pVsXeHQlmctlrGsZp
4fmuIBSRlWSDzovLC2TeRF2cwawn73M0dRcij8B5Qh18oB4RVnuoNhtJ+hB8Iv6t
Uf8TwaO1NGza2VScNZ0b1/aBahgshltJFj+bSDxkJVUapMzd9E+MermgYRJT7dC4
au66KLrx1RYjO+2M34qe9V56cFbY9OrP0bA8Bc9c72IwznuvJEgsOYqsjGcZ+xZy
bZd0/Wa+2bWdkNt7uEGrlR8PkFsl5sisoO3k92AZaf1WLlHP3mkwFxN4iuT/I3fk
-----END CERTIFICATE REQUEST-----

# Generating a new certificate now fails:
vault write charm-pki-local/issue/local common_name="test-0.project.serverstack" ttl="24h"
Error writing data to charm-pki-local/issue/local: Error making API request.

URL: PUT http://127.0.0.1:8220/v1/charm-pki-local/issue/local
Code: 500. Errors:

* 1 error occurred:
        * error fetching CA certificate: stored CA information not able to be parsed

Revision history for this message
Erlon R. Cruz (sombrafam) wrote :

As my suggestion in the linked patch, I believe we should work on a more definitive fix so the behavior of the command would not delete any keys. That can be done by creating a side namespace in vault and only switching to it once the certs are uploaded.
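
The proposal can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the `PkiMounts` class, its method names, and the `charm-pki-staging` mount name are hypothetical and do not correspond to any Vault or charm API. It shows just the ordering idea, that the active mount stays usable until a signed certificate has been uploaded.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PkiMounts:
    """Toy model: mount name -> installed CA certificate (None if unset)."""
    active: str = 'charm-pki-local'
    staging: str = 'charm-pki-staging'  # hypothetical side mount
    certs: Dict[str, Optional[str]] = field(default_factory=dict)

    def get_csr(self) -> str:
        # Generate the new key and CSR on the staging mount only,
        # so the active mount can keep issuing certificates.
        self.certs[self.staging] = None
        return f'CSR for {self.staging}'

    def upload_signed_csr(self, cert: str) -> None:
        # Install the signed certificate on staging, then swap the roles.
        self.certs[self.staging] = cert
        self.active, self.staging = self.staging, self.active

    def can_issue(self) -> bool:
        # Issuing works only while the active mount has a certificate.
        return self.certs.get(self.active) is not None
```

In this model, calling `get_csr()` never changes what `can_issue()` returns; only a completed `upload_signed_csr()` switches the active mount.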

Revision history for this message
Jadon Naas (jadonn) wrote :

It looks like the fix in https://review.opendev.org/c/openstack/charm-vault/+/829118 was successfully merged to trunk. I'm going to mark this as "Fix Released" at this time.

Changed in vault-charm:
status: In Progress → Fix Released
Revision history for this message
Wesley Hershberger (whershberger) wrote :

Hi Jadon,

Is it intended that the commit in question isn't present in any branch other than master?

$ git branch --contains 457a513
* master

get-csr still clobbers an existing CA on channel 1.7/stable (rev 371), and the new action isn't present on 1.8/stable either [1].

[1] https://charmhub.io/vault/actions?channel=1.8/stable

Revision history for this message
Jadon Naas (jadonn) wrote :

Hello Wesley!

I'm not sure if the intent of the team was to only keep the commit on trunk. I updated this bug to have 1.7 and 1.8 as affected targets. I put up cherry-picks for 1.8/stable and 1.7/stable for the fix:

1.8/stable cherry-pick: https://review.opendev.org/c/openstack/charm-vault/+/925427
1.7/stable cherry-pick: https://review.opendev.org/c/openstack/charm-vault/+/925429

Thanks for pointing this out!
