tripleo_cephadm/tasks/export: Save tripleo_ceph_client_vars file JSONDecodeError

Bug #1935774 reported by John Fulton
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Medium
Assigned to: Emilien Macchi

Bug Description

While deploying with tripleo and cephadm, the following task and template do not seem to handle environment data that is not proper JSON.

 https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_cephadm/tasks/export.yaml#L74-L80

 https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_cephadm/templates/ceph_client.yaml.j2

tripleo_cephadm/tasks/export: Save tripleo_ceph_client_vars file JSONDecodeError: Expecting value: line 1 column 1 (char 0)

2021-07-10 23:25:05,280 p=30009 u=root n=ansible | 2021-07-10 23:25:05.280012 | 80615f03-9370-813a-b3f1-000000000114 | TASK | Wait for the expected number of monitors to be running
2021-07-10 23:25:05,303 p=30009 u=root n=ansible | 2021-07-10 23:25:05.303077 | 126f82a0-4644-44e6-ab40-f449cd79bad0 | INCLUDED | /usr/share/ansible/roles/tripleo_cephadm/tasks/wait_for_expected_num_mons.yaml | mecha-az0
2021-07-10 23:25:05,312 p=30009 u=root n=ansible | 2021-07-10 23:25:05.312259 | 80615f03-9370-813a-b3f1-00000000033b | TASK | Read the spec file
2021-07-10 23:25:05,336 p=30009 u=root n=ansible | 2021-07-10 23:25:05.336111 | 80615f03-9370-813a-b3f1-00000000033b | OK | Read the spec file | mecha-az0
2021-07-10 23:25:05,342 p=30009 u=root n=ansible | 2021-07-10 23:25:05.342457 | 80615f03-9370-813a-b3f1-00000000033c | TASK | Parse each yaml document in the spec file looking for the list of mons
2021-07-10 23:25:05,377 p=30009 u=root n=ansible | 2021-07-10 23:25:05.376515 | 80615f03-9370-813a-b3f1-00000000033c | SKIPPED | Parse each yaml document in the spec file looking for the list of mons | mecha-az0 | item={'addr': '192.168.24.10', 'hostname': 'mecha-az0.ci.vexxhost.ca', 'labels': ['mon', '_admin', 'mgr', 'osd'], 'service_type': 'host'}
2021-07-10 23:25:05,387 p=30009 u=root n=ansible | 2021-07-10 23:25:05.387753 | 80615f03-9370-813a-b3f1-00000000033c | OK | Parse each yaml document in the spec file looking for the list of mons | mecha-az0 | item={'placement': {'hosts': ['mecha-az0.ci.vexxhost.ca']}, 'service_id': 'mon', 'service_name': 'mon', 'service_type': 'mon'}
2021-07-10 23:25:05,398 p=30009 u=root n=ansible | 2021-07-10 23:25:05.398790 | 80615f03-9370-813a-b3f1-00000000033c | SKIPPED | Parse each yaml document in the spec file looking for the list of mons | mecha-az0 | item={'placement': {'hosts': ['mecha-az0.ci.vexxhost.ca']}, 'service_id': 'mgr', 'service_name': 'mgr', 'service_type': 'mgr'}
2021-07-10 23:25:05,399 p=30009 u=root n=ansible | 2021-07-10 23:25:05.399473 | 80615f03-9370-813a-b3f1-00000000033c | SKIPPED | Parse each yaml document in the spec file looking for the list of mons | mecha-az0 | item={'data_devices': {'rotational': 0}, 'db_devices': {'rotational': 0}, 'placement': {'hosts': ['mecha-az0.ci.vexxhost.ca']}, 'service_id': 'default_drive_group', 'service_name': 'osd.default_drive_group', 'service_type': 'osd'}
2021-07-10 23:25:05,407 p=30009 u=root n=ansible | 2021-07-10 23:25:05.406942 | 80615f03-9370-813a-b3f1-00000000033e | TASK | Wait for expected number of mons to be running
2021-07-10 23:25:06,304 p=30009 u=root n=ansible | 2021-07-10 23:25:06.303429 | 80615f03-9370-813a-b3f1-00000000033e | CHANGED | Wait for expected number of mons to be running | mecha-az0
2021-07-10 23:25:06,311 p=30009 u=root n=ansible | 2021-07-10 23:25:06.311109 | 80615f03-9370-813a-b3f1-000000000115 | TASK | Run ceph mon dump to get all monitors
2021-07-10 23:25:07,584 p=30009 u=root n=ansible | 2021-07-10 23:25:07.583911 | 80615f03-9370-813a-b3f1-000000000115 | CHANGED | Run ceph mon dump to get all monitors | mecha-az0
2021-07-10 23:25:07,592 p=30009 u=root n=ansible | 2021-07-10 23:25:07.591907 | 80615f03-9370-813a-b3f1-000000000116 | TASK | Extract mons_json
2021-07-10 23:25:07,616 p=30009 u=root n=ansible | 2021-07-10 23:25:07.616063 | 80615f03-9370-813a-b3f1-000000000116 | OK | Extract mons_json | mecha-az0
2021-07-10 23:25:07,623 p=30009 u=root n=ansible | 2021-07-10 23:25:07.623500 | 80615f03-9370-813a-b3f1-000000000117 | TASK | Build mons_list
2021-07-10 23:25:07,648 p=30009 u=root n=ansible | 2021-07-10 23:25:07.647639 | 80615f03-9370-813a-b3f1-000000000117 | OK | Build mons_list | mecha-az0 | item=[{'type': 'v2', 'addr': '192.168.24.10:3300', 'nonce': 0}, {'type': 'v1', 'addr': '192.168.24.10:6789', 'nonce': 0}]
2021-07-10 23:25:07,655 p=30009 u=root n=ansible | 2021-07-10 23:25:07.654913 | 80615f03-9370-813a-b3f1-000000000118 | TASK | Set external_cluster_mon_ips from mons_list
2021-07-10 23:25:07,669 p=30009 u=root n=ansible | 2021-07-10 23:25:07.668692 | 80615f03-9370-813a-b3f1-000000000118 | OK | Set external_cluster_mon_ips from mons_list | mecha-az0
2021-07-10 23:25:07,675 p=30009 u=root n=ansible | 2021-07-10 23:25:07.675247 | 80615f03-9370-813a-b3f1-000000000119 | TASK | Extract keys
2021-07-10 23:25:08,338 p=30009 u=root n=ansible | 2021-07-10 23:25:08.337588 | 80615f03-9370-813a-b3f1-000000000119 | OK | Extract keys | mecha-az0 | item={'name': 'client.openstack', 'key': 'AQCnm+lgAAAAABAA+0pMmMjhmW+ku8Vj1ORJMA==', 'mode': '0600', 'caps': {'mgr': 'allow *', 'mon': 'profile rbd', 'osd': 'profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images'}}
2021-07-10 23:25:08,970 p=30009 u=root n=ansible | 2021-07-10 23:25:08.969680 | 80615f03-9370-813a-b3f1-000000000119 | OK | Extract keys | mecha-az0 | item={'name': 'client.radosgw', 'key': 'AQCnm+lgAAAAABAAu/00i4ydTZaczzeL0ckN3g==', 'mode': '0600', 'caps': {'mgr': 'allow *', 'mon': 'allow rw', 'osd': 'allow rwx'}}
2021-07-10 23:25:08,982 p=30009 u=root n=ansible | 2021-07-10 23:25:08.982045 | 80615f03-9370-813a-b3f1-00000000011a | TASK | Save tripleo_ceph_client_vars file
2021-07-10 23:25:09,037 p=30009 u=root n=ansible | 2021-07-10 23:25:09.037113 | 80615f03-9370-813a-b3f1-00000000011a | FATAL | Save tripleo_ceph_client_vars file | mecha-az0 | error={"changed": false, "msg": "JSONDecodeError: Expecting value: line 1 column 1 (char 0)"}
2021-07-10 23:25:09,039 p=30009 u=root n=ansible | PLAY RECAP *********************************************************************
2021-07-10 23:25:09,039 p=30009 u=root n=ansible | mecha-az0 : ok=38 changed=13 unreachable=0 failed=1 skipped=42 rescued=0 ignored=0

John Fulton (jfulton-org) wrote :

Emilien,

Can you reproduce this with ansible-playbook -vvv and share the value tripleo_cephadm_client_keys ?

The config-download dir should have a cephadm directory containing a shell script you can modify to add -vvv so you can re-run only the relevant part.

Once I know what your env is producing I can ensure the template can handle it.

Thanks,
  John

Emilien Macchi (emilienm) wrote :

The from_json filter fails because the keys can't be found (UX could be improved here, perhaps with a check in the module).

The root cause is that this does not work with a customized Ceph cluster name:

    "ansible_loop_var": "item",
    "changed": false,
    "delta": "",
    "end": "",
    "invocation": {
        "module_args": {
            "attributes": null,
            "backup": null,
            "caps": null,
            "cluster": "ceph",
            "content": null,
            "delimiter": null,
            "dest": "/etc/ceph/",
            "directory_mode": null,
            "fetch_initial_keys": "false",
            "follow": false,
            "force": null,
            "group": null,
            "import_key": true,
            "mode": null,
            "name": "client.radosgw",
            "output_format": "json",
            "owner": null,
            "regexp": null,
            "remote_src": null,
            "secret": null,
            "selevel": null,
            "serole": null,
            "setype": null,
            "seuser": null,
            "src": null,
            "state": "info",
            "unsafe_writes": null,
            "user": "client.admin",
            "user_key": null
        }
    },
    "item": {
        "caps": {
            "mgr": "allow *",
            "mon": "allow rw",
            "osd": "allow rwx"
        },
        "key": "AQCnm+lgAAAAABAAu/00i4ydTZaczzeL0ckN3g==",
        "mode": "0600",
        "name": "client.radosgw"
    },
    "rc": 0,
    "start": "",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "skipped, since client.radosgw does not exist",
    "stdout_lines": [
        "skipped, since client.radosgw does not exist"
    ]
}

In my environment, I'm setting CephClusterName: az0
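
The failure mode described above can be reproduced outside Ansible. The following is a minimal sketch, assuming the `from_json` filter ultimately hands the task's stdout to Python's `json.loads`; the helper names and the sample JSON payload are hypothetical and are not taken from tripleo-ansible. It shows that the "skipped" message is not JSON and yields the same error text as the log, and how a guard could turn that into a clearer check.

```python
import json

# stdout reported by the ceph_key task in the comment above (not JSON).
SKIPPED_STDOUT = "skipped, since client.radosgw does not exist"

# Hypothetical JSON payload; the real output shape is an assumption.
SAMPLE_JSON_STDOUT = '[{"entity": "client.openstack", "key": "<redacted>"}]'


def decode_error_message(stdout):
    """Return the JSONDecodeError text for stdout, or None if it parses."""
    try:
        json.loads(stdout)
    except json.JSONDecodeError as exc:
        return str(exc)
    return None


def parse_key_stdout(stdout):
    """Return the decoded JSON, or None when stdout is not valid JSON."""
    try:
        return json.loads(stdout)
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    # Same message as in the failed "Save tripleo_ceph_client_vars file" task.
    print(decode_error_message(SKIPPED_STDOUT))
    # A guarded parse lets the caller report "key not found" instead.
    print(parse_key_stdout(SKIPPED_STDOUT))
    print(parse_key_stdout(SAMPLE_JSON_STDOUT))
```

A check of this kind, placed before the template is rendered, would let the role fail with a message that names the missing key rather than a bare decode error.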

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (master)
Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
importance: Undecided → Medium
assignee: John Fulton (jfulton-org) → Emilien Macchi (emilienm)
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/800549

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/800472
Committed: https://opendev.org/openstack/tripleo-ansible/commit/b754b17724f5fb3b2ad1f374637acaeb0a7c17a1
Submitter: "Zuul (22348)"
Branch: master

commit b754b17724f5fb3b2ad1f374637acaeb0a7c17a1
Author: Emilien Macchi <email address hidden>
Date: Mon Jul 12 09:22:47 2021 -0400

    cephadm: add missing cluster name

    Two tasks were calling `ceph_key` module without the `cluster` parameter
    which is problematic in DCN environment, where we specify the cluster
    name.

    Closes-Bug: #1935774
    Change-Id: I2f5647ea9bc219f0eca9b9580662ca4b5c3c226e
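
To illustrate why the cluster name matters here, the sketch below builds the conventional Ceph file names, which are prefixed with the cluster name (for example `/etc/ceph/ceph.conf` for the default cluster). The function `ceph_paths` is hypothetical, and whether the `ceph_key` module resolves paths in exactly this way is an assumption; the point is only that a task left at the default `ceph` looks in a different place than one given `az0`.

```python
from pathlib import PurePosixPath


def ceph_paths(cluster="ceph", user="client.admin", conf_dir="/etc/ceph"):
    """Build the conventional conf and keyring paths for a cluster name.

    Hypothetical helper: it follows Ceph's default naming convention
    (<cluster>.conf and <cluster>.<user>.keyring), not module internals.
    """
    base = PurePosixPath(conf_dir)
    return {
        "conf": str(base / f"{cluster}.conf"),
        "keyring": str(base / f"{cluster}.{user}.keyring"),
    }


if __name__ == "__main__":
    # Default cluster name, as seen in the module_args of the failing task.
    print(ceph_paths())
    # Cluster name from the reporter's environment (CephClusterName: az0).
    print(ceph_paths(cluster="az0"))
```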

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/801248

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/801249

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/801248
Committed: https://opendev.org/openstack/tripleo-ansible/commit/0ba5c160ea19aa7a7f02b7c6f802b0e49e0bccb3
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 0ba5c160ea19aa7a7f02b7c6f802b0e49e0bccb3
Author: Emilien Macchi <email address hidden>
Date: Mon Jul 12 09:22:47 2021 -0400

    cephadm: add missing cluster name

    Two tasks were calling `ceph_key` module without the `cluster` parameter
    which is problematic in DCN environment, where we specify the cluster
    name.

    Closes-Bug: #1935774
    Change-Id: I2f5647ea9bc219f0eca9b9580662ca4b5c3c226e
    (cherry picked from commit b754b17724f5fb3b2ad1f374637acaeb0a7c17a1)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/800549
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/30574f183fa0c3cfeaa0d7e89c17c045f65ecd29
Submitter: "Zuul (22348)"
Branch: master

commit 30574f183fa0c3cfeaa0d7e89c17c045f65ecd29
Author: John Fulton <email address hidden>
Date: Mon Jul 12 15:28:36 2021 -0400

    Test override of CephClusterName in 004 standalone

    Override CephClusterName to something other than the
    default in 004 to exercise code patch of related bug.

    Change-Id: I3d0ebaf009e4fda5cf17e2aabf4f9f689e6dfe33
    Related-Bug: #1935774
    Depends-On: I2f5647ea9bc219f0eca9b9580662ca4b5c3c226e

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/801249
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/a0fb4001a0841544893f100c60363ca6e4a3813b
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit a0fb4001a0841544893f100c60363ca6e4a3813b
Author: John Fulton <email address hidden>
Date: Mon Jul 12 15:28:36 2021 -0400

    Test override of CephClusterName in 004 standalone

    Override CephClusterName to something other than the
    default in 004 to exercise code patch of related bug.

    Change-Id: I3d0ebaf009e4fda5cf17e2aabf4f9f689e6dfe33
    Related-Bug: #1935774
    Depends-On: I2f5647ea9bc219f0eca9b9580662ca4b5c3c226e

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 3.3.0

This issue was fixed in the openstack/tripleo-ansible 3.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 4.1.0

This issue was fixed in the openstack/tripleo-ansible 4.1.0 release.
