Handle intentional task failures properly

Bug #1633438 reported by Jesse Pretorius on 2016-10-14
This bug affects 1 person
Affects: openstack-ansible
Importance: Undecided
Assigned to: Kevin Carter

Bug Description

We have many tasks spread across many roles which intentionally fail, using 'ignore_errors: true'. Most of these produce Ansible output that looks like an error, which is very confusing to anyone seeing it for the first time. Worse, it takes walking the code or reading the output very carefully to see that the failure is intentional and harmless.

We should take the time to handle these more gracefully and prevent output that looks like a failure. This would improve usability and reduce confusion for newcomers.

Some examples from https://review.openstack.org/386128 (console log: http://logs.openstack.org/28/386128/1/check/gate-openstack-ansible-openstack-ansible-aio-ubuntu-trusty/4df3caa/console.html):

2016-10-13 18:22:58.256281 | TASK [bootstrap-host : Determine whether partitions labeled openstack-data{1,2} are present] ***
2016-10-13 18:22:58.440644 | fatal: [localhost]: FAILED! => {"changed": false, "cmd": "parted --script -l -m | egrep -q ':ext4:openstack-data[12]:;$'", "delta": "0:00:00.010736", "end": "2016-10-13 18:22:58.401710", "failed": true, "rc": 1, "start": "2016-10-13 18:22:58.390974", "stderr": "", "stdout": "", "stdout_lines": [], "warnings": []}
2016-10-13 18:22:58.440771 | ...ignoring

2016-10-13 18:50:59.802765 | TASK [galera_server : Check major galera install version] **********************
2016-10-13 18:51:00.340879 | fatal: [aio1_galera_container-fbdcb707]: FAILED! => {"changed": true, "cmd": ["dpkg", "-s", "mariadb-galera-server-10.0"], "delta": "0:00:00.011568", "end": "2016-10-13 18:51:00.164284", "failed": true, "rc": 1, "start": "2016-10-13 18:51:00.152716", "stderr": "dpkg-query: package 'mariadb-galera-server-10.0' is not installed and no information is available\nUse dpkg --info (= dpkg-deb --info) to examine archive files,\nand dpkg --contents (= dpkg-deb --contents) to list their contents.", "stdout": "", "stdout_lines": [], "warnings": []}
2016-10-13 18:51:00.341304 | ...ignoring
2016-10-13 18:51:00.350742 |
2016-10-13 18:51:00.350793 | TASK [galera_server : Check for any galera install version] ********************
2016-10-13 18:51:00.894828 | fatal: [aio1_galera_container-fbdcb707]: FAILED! => {"changed": true, "cmd": "dpkg --get-selections | grep mariadb-galera-server", "delta": "0:00:00.012244", "end": "2016-10-13 18:51:00.720672", "failed": true, "rc": 1, "start": "2016-10-13 18:51:00.708428", "stderr": "", "stdout": "", "stdout_lines": [], "warnings": []}
2016-10-13 18:51:00.895223 | ...ignoring

2016-10-13 18:51:05.227996 | TASK [galera_server : Gather mysql facts] **************************************
2016-10-13 18:51:05.710802 | fatal: [aio1_galera_container-fbdcb707]: FAILED! => {"changed": false, "failed": true, "msg": "Mysql fact collection failed: \"ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (111 \"Connection refused\")\"."}
2016-10-13 18:51:05.711187 | ...ignoring

2016-10-13 18:58:53.882361 | TASK [os_keystone : Enable/disable mod_shib2 for apache2] **********************
2016-10-13 18:58:54.362767 | fatal: [aio1_keystone_container-5c43f682]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to disable module shib2: "}
2016-10-13 18:58:54.363169 | ...ignoring
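
As a rough sketch of what more graceful handling could look like for the first example, the check can register its result and treat a non-zero return code as an answer rather than an error. The command is copied from the log above; the variable name `data_partition_check` and the follow-up debug task are hypothetical and not taken from the bootstrap-host role. Because egrep exits 1 when nothing matches and 2 on a real error, the sketch only fails the task for return codes other than 0 or 1.

```yaml
# Sketch only; the variable name and the follow-up task are hypothetical.
- name: Determine whether partitions labeled openstack-data{1,2} are present
  shell: "parted --script -l -m | egrep -q ':ext4:openstack-data[12]:;$'"
  register: data_partition_check
  # A read-only check never changes the host.
  changed_when: false
  # egrep exits 0 on a match and 1 on no match; anything else is a real error.
  failed_when: data_partition_check.rc not in [0, 1]

- name: Note that no labeled data partitions exist
  debug:
    msg: "No openstack-data partitions found"
  when: data_partition_check.rc == 1
```

Narrowing 'failed_when' to the unexpected return codes keeps the output clean for the expected "not found" case while still surfacing genuine errors, which a blanket 'failed_when: false' would hide.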

Fix proposed to branch: master
Review: https://review.openstack.org/386792

Changed in openstack-ansible:
assignee: nobody → Kevin Carter (kevin-carter)
status: New → In Progress
Changed in openstack-ansible:
assignee: Kevin Carter (kevin-carter) → Jesse Pretorius (jesse-pretorius)

Change abandoned by Jesse Pretorius (odyssey4me) (<email address hidden>) on branch: master
Review: https://review.openstack.org/386826
Reason: This repo has been retired.

Changed in openstack-ansible:
assignee: Jesse Pretorius (jesse-pretorius) → Kevin Carter (kevin-carter)

Reviewed: https://review.openstack.org/386796
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-haproxy_server/commit/?id=696056db674d06a855566e81e2fcc1a889c13a97
Submitter: Jenkins
Branch: master

commit 696056db674d06a855566e81e2fcc1a889c13a97
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:44:03 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: I460b4eb12c30d66769f3093a451dc39c77150ea3
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>
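
The conversion this commit describes might look roughly like the following. The task is modeled on the galera_server check quoted in the bug description, and the variable name `galera_major_check` is hypothetical. Whether the 'failed' filter still reports such a task as failed appears to depend on the Ansible version, so this sketch tests the return code directly instead of relying on it.

```yaml
# Before (sketch): the failure is ignored, but Ansible still prints
# "fatal: ... FAILED!" followed by "...ignoring".
- name: Check major galera install version
  command: "dpkg -s mariadb-galera-server-10.0"
  register: galera_major_check   # hypothetical variable name
  ignore_errors: true

# After (sketch): the task is never reported as failed, so the output stays clean.
- name: Check major galera install version
  command: "dpkg -s mariadb-galera-server-10.0"
  register: galera_major_check
  changed_when: false
  failed_when: false

# A later task can still branch on the outcome through the return code.
- name: Report a missing galera package
  debug:
    msg: "mariadb-galera-server-10.0 is not installed"
  when: galera_major_check.rc != 0
```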

Changed in openstack-ansible:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/386799
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-openstack_hosts/commit/?id=84587efdc8db5de1d1117c948afd45ee6dc7eec5
Submitter: Jenkins
Branch: master

commit 84587efdc8db5de1d1117c948afd45ee6dc7eec5
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:44:22 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: Icd0afaaf8f0d9c5e06751f284f99985af6a924c6
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>

Reviewed: https://review.openstack.org/386794
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-galera_client/commit/?id=9dc6e60247b0f43ce8328743a6ead422e4415457
Submitter: Jenkins
Branch: master

commit 9dc6e60247b0f43ce8328743a6ead422e4415457
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:43:45 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: I2c1b39905720e8e6ecb51d88f36c9eb47329d328
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>

Reviewed: https://review.openstack.org/386825
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-pip_install/commit/?id=c45165acd3581203e7ac0b6dfcf6c7c90ca3d3cc
Submitter: Jenkins
Branch: master

commit c45165acd3581203e7ac0b6dfcf6c7c90ca3d3cc
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:46:45 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: Iab6a4e4be29ae358cc95d94eb1090a7963a925df
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>

Reviewed: https://review.openstack.org/386798
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-lxc_hosts/commit/?id=6642af670f67e30c99e03cf3ebcade5fc79bf811
Submitter: Jenkins
Branch: master

commit 6642af670f67e30c99e03cf3ebcade5fc79bf811
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:44:13 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: I298510082bcecb0b84eb252851f8044d8b7f7f61
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>

Reviewed: https://review.openstack.org/386795
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-galera_server/commit/?id=5038acc3907c75d6a5ff1af6382e5abec4ea1650
Submitter: Jenkins
Branch: master

commit 5038acc3907c75d6a5ff1af6382e5abec4ea1650
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:43:57 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: I2a40fa9a0da45602a76f2d56611971fcf4063512
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>

tags: added: in-stable-newton

Reviewed: https://review.openstack.org/389773
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_ceilometer/commit/?id=b97fc6046995c3ec1491e8fa5c0644ea8e6d2f48
Submitter: Jenkins
Branch: stable/newton

commit b97fc6046995c3ec1491e8fa5c0644ea8e6d2f48
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:44:50 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: Ide6ae6a5ea5d2279c42003f379b3a3460e62a525
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>
    (cherry picked from commit 2ef2b9e735fdb63abd4763f2bdfc92b3ca86aa97)

Reviewed: https://review.openstack.org/386813
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_magnum/commit/?id=14df603af349e90f240ed4429bf1638ab26cc877
Submitter: Jenkins
Branch: master

commit 14df603af349e90f240ed4429bf1638ab26cc877
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:45:43 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: Ib906fa3a67a8d70174da10608921a1a213dfd739
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>

Reviewed: https://review.openstack.org/386818
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_sahara/commit/?id=3e65a5a166a8ddf92f957eca7b865b7fc07b2157
Submitter: Jenkins
Branch: master

commit 3e65a5a166a8ddf92f957eca7b865b7fc07b2157
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:46:08 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: Ia6f55ae6fd6270d03e6bf0541d577cd862b3a16b
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>

Reviewed: https://review.openstack.org/392105
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_magnum/commit/?id=1aae6c283e0610c10ccb037ef42417ea19bcccf5
Submitter: Jenkins
Branch: stable/newton

commit 1aae6c283e0610c10ccb037ef42417ea19bcccf5
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:45:43 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: Ib906fa3a67a8d70174da10608921a1a213dfd739
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>
    (cherry picked from commit 14df603af349e90f240ed4429bf1638ab26cc877)

This issue was fixed in the openstack/openstack-ansible-os_ceilometer 14.0.1 release.

Reviewed: https://review.openstack.org/394325
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_sahara/commit/?id=36281d2de08a40774db0727004a10103ee24a1d3
Submitter: Jenkins
Branch: stable/newton

commit 36281d2de08a40774db0727004a10103ee24a1d3
Author: Kevin Carter <email address hidden>
Date: Fri Oct 14 16:46:08 2016 -0500

    Remove 'ignore_errors: true' in favor of 'failed_when: false'

    This change removes the use of 'ignore_errors: true' because it causes deployers
    to see red output and a stacktrace, which traditionally means something is broken,
    even when the failure is known to have a fall back option or be intentional. This
    conversion will provide a generally cleaner interface.

    It should be noted that the 'failed' filter will still function normally. Tasks
    with the 'failed_when: false' option will still be marked as 'failed' in any
    registered variable. This change simply makes the output look cleaner.

    Change-Id: Ia6f55ae6fd6270d03e6bf0541d577cd862b3a16b
    Closes-Bug: #1633438
    Signed-off-by: Kevin Carter <email address hidden>
    (cherry picked from commit 3e65a5a166a8ddf92f957eca7b865b7fc07b2157)

This issue was fixed in the openstack/openstack-ansible 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-pip_install 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-galera_client 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-galera_server 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-ceph_client 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-haproxy_server 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-lxc_container_create 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-lxc_hosts 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-openstack_hosts 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_keystone 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_aodh 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_ceilometer 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_cinder 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_glance 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_gnocchi 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_heat 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_horizon 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_ironic 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_magnum 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_neutron 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_nova 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_rally 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_sahara 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_swift 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_tempest 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-rabbitmq_server 15.0.0.0b1 development milestone.

This issue was fixed in the openstack/openstack-ansible-os_magnum 14.0.3 release.

This issue was fixed in the openstack/openstack-ansible-os_sahara 14.0.3 release.
