NRPE check returns an error after snap upgrade

Bug #2060173 reported by Wong Hong Han
This bug affects 2 people
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| vault-charm | Triaged | High | Jadon Naas | |

Bug Description

We have hit an issue where the snapd socket returns more information in its response, which breaks the Vault version check.

The function only reads the last element of the returned data:

```
info = json.loads(
    snapd.recv(1024 * 1024).decode('utf-8').split('\n')[-1])
```

However, I got a different response with the following versions:
```
snap 2.61.2
snapd 2.61.2
```

Here the response becomes:
```
['HTTP/1.1 200 OK\r', 'Content-Type: application/json\r', 'Date: Thu, 04 Apr 2024 03:38:20 GMT\r', 'Transfer-Encoding: chunked\r', '\r', '946\r', '{"type":"sync","status-code":200,"status":"OK","result":{"id": ... }]}}\r', '0\r', '\r', '']
```

Note that there are 3 more elements at the end of the array, which causes the check to fail.
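
The failure can be reproduced without a live snapd socket. The sketch below uses a shortened, hypothetical stand-in for the response above (the real `result` object is much larger, and the `Date` header is omitted) and shows that the last element after `split('\n')` is an empty string, which `json.loads` rejects.

```python
import json

# Hypothetical, shortened stand-in for the JSON body returned by snapd.
body = ('{"type":"sync","status-code":200,"status":"OK",'
        '"result":{"name":"vault","version":"1.15.6"}}')

# Chunked HTTP reply shaped like the one quoted in this report:
# headers, blank line, chunk size in hex, body, terminating "0" chunk.
chunked_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    "Transfer-Encoding: chunked\r\n"
    "\r\n"
    f"{len(body):x}\r\n"
    f"{body}\r\n"
    "0\r\n"
    "\r\n"
)

lines = chunked_response.split("\n")
last_line = lines[-1]  # what the current check passes to json.loads

try:
    json.loads(last_line)
    failed = False
    error_message = ""
except json.JSONDecodeError as exc:
    failed = True
    error_message = str(exc)

print(repr(last_line), failed, error_message)
```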

A better approach is to look for the element that contains `vault` and parse that one:
```
original_response = snapd.recv(1024 * 1024).decode('utf-8').split('\n')
result = [ele for ele in original_response if "vault" in ele]
info = json.loads(result[0])
```

Wong Hong Han (honghan)
description: updated
Jadon Naas (jadonn) wrote :

I confirmed that the version check is currently failing for the Vault snap on version 1.15.6 from 1.15/stable. I got this error when I worked through the commands in the get_vault_snap_version() function:

Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Is this the kind of error you were seeing?

Also, what version of the Vault snap are you using? I did not get this error on version 1.5.9 of the Vault snap, for example.

Thank you for the fix suggestion! I did confirm that the fix you suggested would work. Here is output from working through that fix you provided:

>>> snapd = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
>>> snapd.connect(SNAPD_SOCKET)
>>> snapd.sendall(SNAPD_INFO_REQUEST.format(snap='vault').encode('utf-8'))
>>> original_response = snapd.recv(1024 * 1024).decode('utf-8').split('\n')
>>> result = [ele for ele in original_response if "vault" in ele]
>>> info = json.loads(result[0])
>>> print(info)
{'type': 'sync', 'status-code': 200, 'status': 'OK', 'result': {'id': 'bIb4p4yWWjyZdo2EU64whkZhw9QYYsMH', 'title': 'vault', 'summary': 'Vault is a tool for securely accessing secrets.', 'description': 'A modern system requires access to a multitude of secrets: database\ncredentials, API keys for external services, credentials for service-oriented\narchitecture communication, etc. Understanding who is accessing what secrets\nis already very difficult and platform-specific. Adding on key rolling,\nsecure storage, and detailed audit logs is almost impossible without a custom\nsolution. This is where Vault steps in.\n\nThis snap is maintained by Canonical.\n\n**Usage**\n \n* To start the Vault service, edit the configuration file at `/var/snap/vault/common/vault.hcl` and start the service with `sudo snap start vault.vaultd`.\n\n* To use the Vault Client, run `vault` commands. For example: `vault status`.', 'icon': 'https://dashboard.snapcraft.io/site_media/appmedia/2017/09/android-chrome-512x512.png', 'installed-size': 191340544, 'install-date': '2024-04-09T19:44:21.65660055Z', 'name': 'vault', 'publisher': {'id': 'canonical', 'username': 'canonical', 'display-name': 'Canonical', 'validation': 'verified'}, 'developer': 'canonical', 'status': 'active', 'type': 'app', 'base': 'core22', 'version': '1.15.6', 'channel': '1.15/stable', 'tracking-channel': '1.15/stable', 'ignore-validation': False, 'revision': '2213', 'confinement': 'strict', 'private': False, 'devmode': False, 'jailmode': False, 'apps': [{'snap': 'vault', 'name': 'vault'}, {'snap': 'vault', 'name': 'vaultd', 'daemon': 'simple', 'daemon-scope': 'system'}], 'mounted-from': '/var/lib/snapd/snaps/vault_2213.snap', 'links': {'issues': ['https://github.com/canonical/snap-vault/issues'], 'source': ['https://github.com/hashicor...


Changed in vault-charm:
status: New → Confirmed
DUFOUR Olivier (odufourc) wrote :

I just faced this issue with an ongoing deployment.

This is not related to the Vault snap version at all; it is caused solely by the latest snapd update (version 2.61.2 and later).

After reusing the code for a quick reproducer, I noticed that the output of snapd's API is no longer the same and introduces more elements, as indicated in the first comment. (See the attachment snapd-api-output-change.txt for more details.)

So when doing "snapd.recv(1024 * 1024).decode('utf-8').split('\n')[-1]" (see the code in [1]):
In previous releases of snapd, it would match the JSON output from the API.
With the latest release of snapd, it matches an empty line.

Another quick and ugly workaround is to run the following against the Vault units:
juju exec -a vault "sed -i -e 's/-1/-4/' /usr/lib/nagios/plugins/check_vault_health.py"
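
For illustration, the effect of changing the index from `-1` to `-4` can be checked against a list shaped like the one in the bug description. The JSON body below is a shortened, hypothetical stand-in for the real one. The index only works while snapd appends exactly three elements after the body, and the `sed` expression rewrites the first `-1` on every line of the file, so this is probably best treated as a stopgap.

```python
import json

# Split response shaped like the one in the bug description; the JSON
# body is a shortened, hypothetical stand-in for the real one.
split_response = [
    'HTTP/1.1 200 OK\r',
    'Content-Type: application/json\r',
    'Transfer-Encoding: chunked\r',
    '\r',
    '5a\r',
    '{"type":"sync","status-code":200,"status":"OK",'
    '"result":{"name":"vault","version":"1.15.6"}}\r',
    '0\r',
    '\r',
    '',
]

# Index -1 is now the empty string left after the final "\r\n".
old_choice = split_response[-1]

# Index -4 skips '', '\r' and '0\r' and lands on the JSON body.
# json.loads tolerates the trailing '\r' because it is JSON whitespace.
info = json.loads(split_response[-4])
version = info["result"]["version"]
print(version)
```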

[1] https://opendev.org/openstack/charm-vault/src/commit/8413b3f9eed9cc4100d744729a04daa028fda035/src/files/nagios/check_vault_health.py#L40

Nobuto Murata (nobuto) wrote :

Subscribing ~field-high as it's affecting a customer delivery.

tl;dr: the vault charm needs some maintenance updates to react to the recently completed SRU:
https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/2039017

Alex Kavanagh (ajkavanagh) wrote :

That code in `def get_vault_snap_version():` is very brittle, I feel.

As @odufourc/@nobuto have pointed out, this is due to a change in behaviour of the response from the snapd socket, and not the charm. It's essentially a regression, but equally, the charm code is brittle and very reliant on the formatting.

As for a fix: use `.splitlines()` instead of `.split('\n')`, and find the line that startswith("{") and endswith("}"), which can then be passed to json.loads() - but catch any exceptions at that point. Then pull out the version (also with a trap for exceptions), and return it. Log any exceptions to help diagnose future breakage.
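
The steps described above could be sketched roughly as follows. This is not the patch that was later proposed; `parse_snap_version` is a hypothetical helper name, and the sketch assumes the whole JSON body arrives on a single line of one `recv()` result, as in the responses quoted in this report.

```python
import json
import logging

log = logging.getLogger(__name__)


def parse_snap_version(raw_response):
    """Return result.version from a raw snapd HTTP reply, or None.

    Hypothetical helper: scans every line instead of relying on the
    position of the JSON body in the reply.
    """
    for line in raw_response.splitlines():
        line = line.strip()
        # Only lines that look like a complete JSON object are candidates.
        if not (line.startswith("{") and line.endswith("}")):
            continue
        try:
            info = json.loads(line)
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError.
            log.exception("Could not decode snapd reply line as JSON")
            continue
        try:
            return info["result"]["version"]
        except (KeyError, TypeError):
            log.exception("snapd reply has no result.version field")
            return None
    log.error("No JSON object found in snapd reply")
    return None
```

Because the helper never indexes from the end of the list, it should not matter whether the body is followed by chunk trailers or nothing at all; a malformed reply yields `None` and a log entry instead of an uncaught traceback.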

Changed in vault-charm:
status: Confirmed → Triaged
importance: Undecided → High
importance: High → Critical
assignee: nobody → Jadon Naas (jadonn)
Changed in vault-charm:
importance: Critical → High
Jadon Naas (jadonn) wrote :

I put up a patch against master to fix this problem. It's at https://review.opendev.org/c/openstack/charm-vault/+/917766. I will continue to work to get it finished and merged.
