tripleo.storage.v1.ceph-install fails on ppc64le

Bug #1790447 reported by Tony Breeds
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Medium
Assigned to: Tony Breeds

Bug Description

The collect_nodes_uuid[1] task in tripleo.storage.v1.ceph-install unconditionally calls dmidecode[2] on each node in the overcloud. dmidecode isn't supported on ppc64le, so the whole deploy fails with something like:

---
(undercloud) [stack@director ~]$ WORKFLOW='tripleo.storage.v1.ceph-install'
(undercloud) [stack@director ~]$ UUID=$(mistral execution-list | grep $WORKFLOW | awk {'print $2'} | tail -1)
(undercloud) [stack@director ~]$ echo $UUID
c99331db-0ee7-47b4-aefc-a9e25c3286d6
(undercloud) [stack@director ~]$ mistral task-list $UUID -c ID -c Name -c State
+--------------------------------------+--------------------------+---------+
| ID                                   | Name                     | State   |
+--------------------------------------+--------------------------+---------+
| b40ea310-bfff-42ed-a36a-10737787ded9 | set_swift_container      | SUCCESS |
| f53306ff-24be-49e2-a617-e482f0d2bd87 | collect_puppet_hieradata | SUCCESS |
| 88087252-07d8-426d-b3e2-f2a4ca7d8189 | check_hieradata          | SUCCESS |
| 48929a63-4ae5-4689-94ce-bf28ad0dd5f7 | set_ip_lists             | SUCCESS |
| 6bc7f6bc-71d7-412b-9298-b6857261bfa3 | merge_ip_lists           | SUCCESS |
| ed2d92e4-6c2d-49a6-8f30-7992f4301b52 | enable_ssh_admin         | SUCCESS |
| f70a3a39-c505-4d60-92cf-179b41bddcd3 | set_blacklisted_ips      | SUCCESS |
| 0a203aab-002a-4aef-a3f6-dfb2f32e7d05 | verify_container_exists  | ERROR   |
| 0a3d443c-1f04-4f1d-843d-9a8bac24f117 | get_private_key          | SUCCESS |
| 81c28e75-97d3-455f-a85e-ec2e823705b9 | create_container         | SUCCESS |
| 8a248462-1e8d-404d-ba90-e3cbb6aa7519 | collect_nodes_uuid       | ERROR   |
| ac9ae3da-9a7f-493f-b109-9e750c12f4d8 | make_fetch_directory     | SUCCESS |
+--------------------------------------+--------------------------+---------+
(undercloud) [stack@director ~]$ TASK_ID=8a248462-1e8d-404d-ba90-e3cbb6aa7519
(undercloud) [stack@director ~]$ mistral task-get-result $TASK_ID | jq . | sed -e 's/\\n/\n/g' -e 's/\\"/"/g' | grep -A1 '\"dmidecode -s'
                            "cmd": "dmidecode -s system-uuid", \
                            "failed": true, \
--
                                    "_raw_params": "dmidecode -s system-uuid", \
                                    "_uses_shell": false, \
--
                            "cmd": "dmidecode -s system-uuid", \
                            "failed": true, \
--
                                    "_raw_params": "dmidecode -s system-uuid", \
                                    "_uses_shell": false, \
--
                                    "_raw_params": "dmidecode -s system-uuid", \
                                    "_uses_shell": false, \
(undercloud) [stack@director ~]$
---

[1] http://git.openstack.org/cgit/openstack/tripleo-common/tree/workbooks/ceph-ansible.yaml#n89
[2] http://git.openstack.org/cgit/openstack/tripleo-common/tree/workbooks/ceph-ansible.yaml#n111
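
For illustration only, the tolerant behaviour described above can be sketched in plain shell. The names `collect_uuid` and its probe argument are hypothetical and are not part of tripleo-common; the sketch assumes that a node without a usable dmidecode should report an empty UUID instead of aborting the run.

```shell
# Hypothetical helper: print the system UUID, or an empty line when the
# probe command is missing or exits non-zero. The probe defaults to
# dmidecode; passing another name makes the function easy to exercise.
collect_uuid() {
    probe=${1:-dmidecode}

    # A missing binary is not treated as fatal: report "no UUID".
    if ! command -v "$probe" >/dev/null 2>&1; then
        echo ""
        return 0
    fi

    # A failing probe (for example, no DMI table) also yields "no UUID".
    uuid=$("$probe" -s system-uuid 2>/dev/null) || uuid=""
    echo "$uuid"
}
```

Calling `collect_uuid` with no argument probes dmidecode itself; on a host where that binary is absent the function prints an empty line and still returns 0.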

Tony Breeds (o-tony)
Changed in tripleo:
assignee: nobody → Tony Breeds (o-tony)
Changed in tripleo:
milestone: none → stein-1
status: New → Triaged
importance: Undecided → Medium

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/599920

Tony Breeds (o-tony) wrote :

With this in place:

(undercloud) [stack@director ~]$ source ~stack/stackrc
(undercloud) [stack@director ~]$ WORKFLOW='tripleo.storage.v1.ceph-install'
(undercloud) [stack@director ~]$ UUID=$(mistral execution-list | grep $WORKFLOW | awk {'print $2'} | tail -1)
(undercloud) [stack@director ~]$ mistral task-list $UUID -c ID -c Name -c State
+--------------------------------------+----------------------------------------+---------+
| ID                                   | Name                                   | State   |
+--------------------------------------+----------------------------------------+---------+
| 11cb072d-c381-4f0f-851e-015ccbbb4095 | set_swift_container                    | SUCCESS |
| feaff3a9-421e-4734-b12c-622095ef4ce2 | collect_puppet_hieradata               | SUCCESS |
| 26131e3d-6e44-40a6-9659-411d869f92c7 | set_blacklisted_ips                    | SUCCESS |
| 5c17b066-fc85-4e21-9211-e87b181b9308 | check_hieradata                        | SUCCESS |
| 802279a3-9664-4942-8c39-1250348fa166 | set_ip_lists                           | SUCCESS |
| e71b3359-1e41-4c66-8677-dbafbc95e924 | merge_ip_lists                         | SUCCESS |
| fcb815ae-33cd-4ecb-9ddf-8981c6cf386b | enable_ssh_admin                       | SUCCESS |
| 57ff7bac-5c4f-4309-8f65-afb34b0c25a7 | get_private_key                        | SUCCESS |
| dd91814e-ec4b-4150-ba52-7db57b7fafe4 | verify_container_exists                | ERROR   |
| f7bb184c-ca88-4974-a456-ca74356a8b3d | make_fetch_directory                   | SUCCESS |
| 0e0ded2d-42d1-4d1d-8383-ba05a7c2b9f6 | create_container                       | SUCCESS |
| ea574df9-f472-4503-a482-4d37c1d2a3ca | collect_nodes_uuid                     | SUCCESS |
| 45293a3c-2e48-483d-a763-aa2ee1de8d27 | set_ip_uuids                           | SUCCESS |
| 7de03eaa-3655-4904-bc91-186464182243 | set_role_vars                          | SUCCESS |
| 8bdd9127-c11a-4272-a2f8-5303705aa846 | parse_node_data_lookup                 | SUCCESS |
| 9237015c-54ac-4abe-ac56-5bd725a88093 | map_node_data_lookup                   | SUCCESS |
| 724daaf0-f94c-4dc8-beaf-274c27138c80 | ceph_install                           | SUCCESS |
| ac952326-d7e8-4692-b2e1-a892fda2a566 | build_extra_vars                       | SUCCESS |
| 7d1fb2a2-bce6-45a2-9349-8f4d857e145c | save_fetch_directory                   | SUCCESS |
| 0ea12abe-55ba-4b20-bc92-8e7f508b19f8 | remove_ceph_osd_package_from_baremetal | SUCCESS |
| e3b2f867-5aa8-40cd-8dee-ff16bb9b5a64 | purge_fetch_directory                  | SUCCESS |
+--------------------------------------+----------------------------------------+---------+
(undercloud) [stack@director ~]$ read TASK_ID
ea574df9-f472-4503-a482-4d37c1d2a3ca
(undercloud) [stack@director ~]$ echo $json | sed -e 's/\\n/\n/g' -e 's/\\"/"/g' | jq .plays[0].tasks[0].hosts[].stdout
null
"DFE7543E-7D20-4232-9C57-1AC2D356CFEF"
null
(undercloud) [stack@director ~]$ echo $json | sed -e 's/\\n/\n/g' -e 's/\\"/"/g' | jq .plays[0].tasks[0].hosts[].rc
2
0
2

The two nodes that returned rc 2 are ppc64le systems.
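
The per-host return codes shown above can be tallied with a small filter. This is a sketch, not part of the workflow: `summarize_rcs` is a hypothetical name, and the grouping assumes the policy from the fix (rc 0 yields a UUID, rc 1 or 2 is tolerated, anything else is an error).

```shell
# Hypothetical helper: read one return code per line on stdin and count
# hosts that produced a UUID (rc 0), hosts tolerated without one (rc 1
# or 2), and hosts with any other rc. Exits non-zero if any host is "bad".
summarize_rcs() {
    awk '
        $1 == 0            { ok++;      next }
        $1 == 1 || $1 == 2 { skipped++; next }
                           { bad++ }
        END {
            printf "ok=%d skipped=%d bad=%d\n", ok, skipped, bad
            exit (bad > 0)
        }'
}

# The three rc values from the run above.
printf '2\n0\n2\n' | summarize_rcs    # prints: ok=1 skipped=2 bad=0
```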

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/601493

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/599920
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=f4be399e5e8d4e0e2ddf78159dd3ace61e10caca
Submitter: Zuul
Branch: master

commit f4be399e5e8d4e0e2ddf78159dd3ace61e10caca
Author: Tony Breeds <email address hidden>
Date: Wed Sep 5 15:25:27 2018 +1000

    Handle missing or bad dmidecode

    dmidecode isn't functional on all architectures. Don't treat a missing
    binary or a missing DMI table as fatal from an install POV.

    Change-Id: I85f595f5fbb7fcd8c6dc589bb348e78148b80992
    Related-Bug: 1790447

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by Juan Antonio Osorio Robles (<email address hidden>) on branch: master
Review: https://review.openstack.org/601493
Reason: Purging the gate to free up resources and address the timeout issues

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/601493
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=459b2664d99befa452e1946a4e4cf400183dd9e3
Submitter: Zuul
Branch: master

commit 459b2664d99befa452e1946a4e4cf400183dd9e3
Author: Tony Breeds <email address hidden>
Date: Tue Sep 11 14:48:19 2018 +1000

    Handle missing or bad dmidecode

    dmidecode isn't functional on all architectures. Don't treat a missing
    binary or a missing DMI table as fatal from an install POV.

    Change-Id: I33c50ee00ac0b478839b2536f0b965e444e66e53
    Related-Bug: 1790447

Changed in tripleo:
milestone: stein-1 → stein-2

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.openstack.org/624262

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.openstack.org/624522

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/rocky)

Reviewed: https://review.openstack.org/624262
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=244736fdfa3766c5e727695a0627be541ce74dc0
Submitter: Zuul
Branch: stable/rocky

commit 244736fdfa3766c5e727695a0627be541ce74dc0
Author: Tony Breeds <email address hidden>
Date: Wed Sep 5 15:25:27 2018 +1000

    Handle missing or bad dmidecode

    dmidecode isn't functional on all architectures. Don't treat a missing
    binary or a missing DMI table as fatal from an install POV.

    Change-Id: I85f595f5fbb7fcd8c6dc589bb348e78148b80992
    Related-Bug: 1790447
    (cherry picked from commit f4be399e5e8d4e0e2ddf78159dd3ace61e10caca)

tags: added: in-stable-rocky

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.openstack.org/624522
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=35eb3e1f478a7bce394c7d3bc8150644074fc434
Submitter: Zuul
Branch: stable/rocky

commit 35eb3e1f478a7bce394c7d3bc8150644074fc434
Author: Tony Breeds <email address hidden>
Date: Tue Sep 11 14:48:19 2018 +1000

    Handle missing or bad dmidecode

    dmidecode isn't functional on all architectures. Don't treat a missing
    binary or a missing DMI table as fatal from an install POV.

    Change-Id: I33c50ee00ac0b478839b2536f0b965e444e66e53
    Related-Bug: 1790447
    (cherry picked from commit 459b2664d99befa452e1946a4e4cf400183dd9e3)

Tony Breeds (o-tony) wrote :

The fix above introduces a new bug, which has also been backported. We now accept return values of 0, 1 or 2 from command modules that call 'dmidecode'. However, we then unconditionally look at the stdout key in the result object. When the dmidecode binary is missing, the result dictionary from the command module doesn't contain a stdout key, resulting in something like:

---
fatal: [overcloud-novacomputeppc64le-1]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'stdout'\n\nThe error appears to have been in '/var/lib/mistral/overcloud/ceph-ansible/nodes_uuid_playbook.yml': line 14, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n failed_when: machine_uuid.rc not in [0, 1, 2]\n - name: generate host vars from nodes data\n ^ here\n"}
---
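
As an analogy only (this is shell, not the Jinja2 expression used in the playbook), the same class of failure and its remedy can be reproduced with `set -u`: referencing an unset value is fatal unless a default is supplied. `lookup_uuid` is a hypothetical name used purely for illustration.

```shell
# Analogy: under 'set -u' an unset parameter aborts the shell, much like
# the undefined-variable error for a missing 'stdout' key. Supplying a
# default ("${1-}") turns the missing value into an empty string.
# The body runs in a subshell so 'set -u' does not leak to the caller.
lookup_uuid() (
    set -u
    uuid=${1-}          # empty string when no argument was given
    printf 'uuid=[%s]\n' "$uuid"
)

lookup_uuid "DFE7543E-7D20-4232-9C57-1AC2D356CFEF"   # prints: uuid=[DFE7543E-7D20-4232-9C57-1AC2D356CFEF]
lookup_uuid                                          # prints: uuid=[]
```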

Changed in tripleo:
status: Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/629178

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/629128
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=f9b5401c1f478084031543d8c9f016068935596b
Submitter: Zuul
Branch: master

commit f9b5401c1f478084031543d8c9f016068935596b
Author: Tony Breeds <email address hidden>
Date: Tue Jan 8 20:58:21 2019 +1100

    Do not dereference .stdout if dmidecode is missing

    In 459b2664d99befa452e1946a4e4cf400183dd9e3 (Handle missing or bad
    dmidecode) we accept return values of 0, 1 or 2 from command modules
    that call 'dmidecode'. However we then unconditionally look at the
    stdout key in the result object. When the dmidecode binary is missing
    the result dictionary from the command module doesn't contain a stdout
    key (resulting in something like):

    ---
    fatal: [overcloud-novacomputeppc64le-1]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'stdout'\n\nThe error appears to have been in '/var/lib/mistral/overcloud/ceph-ansible/nodes_uuid_playbook.yml': line 14, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n failed_when: machine_uuid.rc not in [0, 1, 2]\n - name: generate host vars from nodes data\n ^ here\n"}
    ---

    This change just adds a default('') filter to the lookup so a missing
    key will fall back to an empty string

    Closes-Bug: 1790447
    Change-Id: I7db180674c3696508a7f449e2e825e7083a00f6e

Changed in tripleo:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.openstack.org/629178
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=9d4dce3cea557c149dca98a2e1608ce73f42c1c8
Submitter: Zuul
Branch: stable/rocky

commit 9d4dce3cea557c149dca98a2e1608ce73f42c1c8
Author: Tony Breeds <email address hidden>
Date: Tue Jan 8 20:58:21 2019 +1100

    Do not dereference .stdout if dmidecode is missing

    In 459b2664d99befa452e1946a4e4cf400183dd9e3 (Handle missing or bad
    dmidecode) we accept return values of 0, 1 or 2 from command modules
    that call 'dmidecode'. However we then unconditionally look at the
    stdout key in the result object. When the dmidecode binary is missing
    the result dictionary from the command module doesn't contain a stdout
    key (resulting in something like):

    ---
    fatal: [overcloud-novacomputeppc64le-1]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'stdout'\n\nThe error appears to have been in '/var/lib/mistral/overcloud/ceph-ansible/nodes_uuid_playbook.yml': line 14, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n failed_when: machine_uuid.rc not in [0, 1, 2]\n - name: generate host vars from nodes data\n ^ here\n"}
    ---

    This change just adds a default('') filter to the lookup so a missing
    key will fall back to an empty string

    Closes-Bug: 1790447
    Change-Id: I7db180674c3696508a7f449e2e825e7083a00f6e
    (cherry picked from commit f9b5401c1f478084031543d8c9f016068935596b)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 10.3.0

This issue was fixed in the openstack/tripleo-heat-templates 10.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 9.3.0

This issue was fixed in the openstack/tripleo-heat-templates 9.3.0 release.
