tripleo-ansible ceph roles badly inherrit ansible environment variables

Bug #1863629 reported by John Fulton
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
tripleo
Fix Released
Critical
John Fulton

Bug Description

Executions of ceph-related nested ansible in tripleo-ansible can result in unwanted newlines which break command parsing.

For example at point "2020-02-17 05:27:24" in a patche's log [1] the command made by [2] looks like it's getting additional new lines added which breaks the interpolation of the command. At least it looks like an extra newline problem to me because I see additional date entries in the log where I should only see one:

2020-02-17 05:27:24 | cmd: (blah blah blah)
2020-02-17 05:27:24 | ANSIBLE_CACHE_PLUGIN=...
2020-02-17 05:27:24 | ANSIBLE_ROLES_PATH=...

I think (from lines 24/25 of [1]) that the following variables contain new lines:

      - "{{ calling_ansible_environment_variables|join(' ') }}"
      - "{{ ceph_ansible_environment_variables|join(' ') }}"

Most likely the calling_ansible_environment_variables which is basically getting whatever the config-download ansible got.

Either we require these vars to not contain new lines or we modify lines 24/25 to go from this:

  - "{{ calling_ansible_environment_variables|join(' ') }}"

to something like this:

  - "{{ calling_ansible_environment_variables|join(' ')|replace('\n', '') }}"

[1] https://logserver.rdoproject.org/15/706815/7/openstack-check/tripleo-ci-rhel-8-scenario001-standalone-rdo/a444f63/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

[2] https://opendev.org/openstack/tripleo-ansible/src/branch/master/tripleo_ansible/roles/tripleo_ceph_uuid/tasks/gather.yml

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (master)

Fix proposed to branch: master
Review: https://review.opendev.org/708252

Changed in tripleo:
status: Triaged → In Progress
wes hayutin (weshayutin)
tags: added: alert
Revision history for this message
John Fulton (jfulton-org) wrote : Re: tripleo-ansible ceph roles fail to correctly inherit calling ansible env
summary: - tripleo-ansible ceph executions add unwanted newlines when interpolating
- ansible environment variables
+ tripleo-ansible ceph roles fail to correctly inherit calling ansible env
Revision history for this message
John Fulton (jfulton-org) wrote :

New plan:

1. do not inherit by default (this will fix recent breaks)
2. improve how we inherit
3. probably continue to not inherit by default

Revision history for this message
John Fulton (jfulton-org) wrote :

To improve how the env is inherited, Alex pointed out that `env`, as used in [1], is not going to contain quoted variables and now that ANSIBLE_SSH_ARGS is getting set with multiple "-o foo=bar" [2] we need to have those quotes in order for the inheritance to not break. A better way to fix this is to use something like [3].

[1] https://review.opendev.org/#/c/674706/4/tripleo_ansible/roles/tripleo-ceph-common/tasks/main.yml

[2] https://review.opendev.org/#/c/705793/21/tripleoclient/utils.py@360

[3] https://review.opendev.org/#/c/706143/15/roles/tripleo_overcloud_image_build/tasks/main.yml@13

summary: - tripleo-ansible ceph roles fail to correctly inherit calling ansible env
+ tripleo-ansible ceph roles badly inherrit ansible environment variables
Changed in tripleo:
importance: Medium → Critical
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/708252
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=692678dce5b7f4950387c154086a0c1cdc6b887b
Submitter: Zuul
Branch: master

commit 692678dce5b7f4950387c154086a0c1cdc6b887b
Author: John Fulton <email address hidden>
Date: Mon Feb 17 14:51:52 2020 -0500

    Do not inherit the calling ansible env by default

    As per I428f6deb416b540dae552b5fc7a778cbc7505e8c ANSIBLE_*
    environment variables are passed to Ansible executions which
    are used to deploy Ceph.

    The method of inheritance is fragile and possibly unsafe. This
    patch disables inheritance by default. In inheritance feature
    can be made more safe in subsequent patches for Related-Bug but
    but this new default should stay.

    Change-Id: Ifa6d147664fc993ca08c8fb2efa860f7605628cb
    Fixes-Bug: #1863629

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
Marios Andreou (marios-b) wrote :

more examples there

                 * https://f1b67923f9334b17f05d-278276a93d7fc7ef4213a4b26089e78a.ssl.cf1.rackcdn.com/708137/1/gate/tripleo-ci-centos-7-scenario004-standalone/70df41a/logs/undercloud/home/zuul/standalone_deploy.log
                        * 2020-02-18 03:08:21 | TASK [tripleo_ceph_uuid : run nodes-uuid command] ******************************
                        * 2020-02-18 03:08:21 | fatal: [undercloud]: FAILED! => changed=true

                * http://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_68d/708137/1/gate/tripleo-ci-centos-7-scenario001-standalone/68de2e5/logs/undercloud/home/zuul/standalone_deploy.log
                        * 2020-02-18 03:06:51 | TASK [tripleo_ceph_uuid : run nodes-uuid command] ******************************
                        * 2020-02-18 03:06:52 | fatal: [undercloud]: FAILED! => changed=true
                        * 2020-02-18 03:06:52 | stderr: '/bin/sh: -o: command not found'
                        * 2020-02-18 03:06:53 | Exception: Ansible execution failed. playbook: deploy_steps_playbook.yaml, Run Status: failed, Return Code: 2

looks like https://review.opendev.org/#/c/708252/ merged so rechecking my patch thanks for the quick fix folks

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/708485

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/train)

Reviewed: https://review.opendev.org/708485
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=1a3620532e29a42ad4e05570979f9fee6725ddd7
Submitter: Zuul
Branch: stable/train

commit 1a3620532e29a42ad4e05570979f9fee6725ddd7
Author: John Fulton <email address hidden>
Date: Mon Feb 17 14:51:52 2020 -0500

    Do not inherit the calling ansible env by default

    As per I428f6deb416b540dae552b5fc7a778cbc7505e8c ANSIBLE_*
    environment variables are passed to Ansible executions which
    are used to deploy Ceph.

    The method of inheritance is fragile and possibly unsafe. This
    patch disables inheritance by default. In inheritance feature
    can be made more safe in subsequent patches for Related-Bug but
    but this new default should stay.

    Change-Id: Ifa6d147664fc993ca08c8fb2efa860f7605628cb
    Fixes-Bug: #1863629
    (cherry picked from commit 692678dce5b7f4950387c154086a0c1cdc6b887b)

tags: added: in-stable-train
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.