TQE::modify-image should fail early to avoid false positive

Bug #1771609 reported by Matt Young
Affects: tripleo
Status: Invalid
Importance: High
Assigned to: Unassigned

Bug Description

An issue has been identified with how we modify images in CI. We are not detecting a failure early enough, which allows a corrupted or invalid IPA image to be used by subsequent deployment steps. In this case it causes introspection to fail, which takes out nearly all OVB jobs across all release pipelines.

This specific LP issue tracks identifying failures of this class. The example used here has a fix in testing now (https://review.openstack.org/568838, "Mount /dev for chrooted environment").

https://bugs.launchpad.net/tripleo/+bug/1770972/comments/14

Reproducing specifics here for clarity.

---

When running this:

https://github.com/openstack/tripleo-quickstart-extras/blob/69ad943adda9000f79277f0230a5751869de9cb3/roles/modify-image/tasks/manual.yml#L33-L70

```
  - name: Run script on image
    shell: >
      mv {{ mount_tempdir }}/etc/resolv.conf{,_};
      echo -e "nameserver 8.8.8.8\nnameserver 8.8.4.4" > {{ mount_tempdir }}/etc/resolv.conf;
      cp {{ modify_script }} {{ mount_tempdir }}/tmp/{{ modify_script|basename }};
      {% if initramfs_image|bool %}sed -i "s/sudo //g" {{ mount_tempdir }}/tmp/{{ modify_script|basename }};{% endif %}
      set -o pipefail && chroot {{ mount_tempdir }} /bin/bash /tmp/{{ modify_script|basename }} 2>&1
      {{ timestamper_cmd }} > {{ working_dir }}/{{ modify_script|basename }}.$(date +%s).log;
      mv -f {{ mount_tempdir }}/etc/resolv.conf{_,};
```

the following was happening in the chroot:

https://logs.rdoproject.org/15/568715/2/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z5df1951657694a9ebaad63e71362a76a/undercloud/home/jenkins/repo_setup.sh.1526444104.log.txt.gz

```
+ sudo rm -rf '/etc/yum.repos.d/delorean*'
+ sudo rm -rf '/etc/yum.repos.d/*.rpmsave'
+ sudo yum clean all
error: Failed to initialize NSS library
There was a problem importing one of the Python modules
required to run yum. The error leading to this problem was:

   cannot import name ts

Please install a package which provides this module, or
verify that the module is installed correctly.

It's possible that the above module doesn't match the
current version of Python, which is:
2.7.5 (default, Apr 11 2018, 07:36:10)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]

If you cannot solve this problem yourself, please go to
the yum faq at:
  http://yum.baseurl.org/wiki/Faq

```

---

We should be failing at this step and identifying that the chroot operation failed.
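
A minimal bash sketch of why the task passes today, and one way it could fail early. This is not the role's actual code: `run_in_chroot` is a hypothetical stand-in for the chrooted script, `cat > /dev/null` stands in for the log pipeline (assuming `{{ timestamper_cmd }}` expands to a pipe into a timestamping filter), and `true` stands in for the final `mv -f` that restores resolv.conf. Because the commands in the task are joined with `;`, the shell reports the exit status of the last command, so the successful cleanup hides the chroot failure.

```shell
#!/bin/bash
# Hypothetical stand-in for the chrooted script; it always fails.
run_in_chroot() {
    echo "error: Failed to initialize NSS library"
    return 1
}

# Pattern used by the task: steps are joined with ';', so the status of
# the whole sequence is the status of the final cleanup command.
current_pattern() {
    set -o pipefail
    run_in_chroot 2>&1 | cat > /dev/null;
    true   # stands in for: mv -f .../etc/resolv.conf{_,}
}

# Possible fix: remember the pipeline status, still run the cleanup,
# then return the remembered status so the caller sees the failure.
fail_early_pattern() {
    set -o pipefail
    local rc=0
    run_in_chroot 2>&1 | cat > /dev/null || rc=$?
    true   # cleanup still runs
    return "$rc"
}

current_rc=0
current_pattern || current_rc=$?
fixed_rc=0
fail_early_pattern || fixed_rc=$?

echo "current pattern exit status: $current_rc"
echo "fail-early pattern exit status: $fixed_rc"
```

Run with bash, the first pattern reports 0 and the second reports 1. In the real task the same idea would mean saving the status of the `chroot` pipeline before restoring resolv.conf and exiting with it afterwards, so Ansible marks the task as failed.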

Marking this as "high" importance because it is effectively a false positive from this step, which has a cost multiplier: later deployment steps run against the bad image. However, since we don't know of this happening apart from the issues linked above, "medium" might make more sense.

Thoughts?

Tags: ci quickstart
Matt Young (halcyondude)
summary: - TQE:modify-image should fail early if failure occurs
+ TQE:modify-image should fail early to avoid false positive
Matt Young (halcyondude)
summary: - TQE:modify-image should fail early to avoid false positive
+ TQE::modify-image should fail early to avoid false positive
Matt Young (halcyondude)
Changed in tripleo:
milestone: none → rocky-2
Changed in tripleo:
milestone: rocky-2 → rocky-3
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Changed in tripleo:
milestone: stein-1 → stein-2
Changed in tripleo:
milestone: stein-2 → stein-3
Juan Antonio Osorio Robles (juan-osorio-robles) wrote :

Is this still needed?

Changed in tripleo:
milestone: stein-3 → stein-rc1
Changed in tripleo:
milestone: stein-rc1 → train-1
Changed in tripleo:
milestone: train-1 → train-2
Changed in tripleo:
milestone: train-2 → train-3
Changed in tripleo:
milestone: train-3 → ussuri-1
Changed in tripleo:
milestone: ussuri-1 → ussuri-2
wes hayutin (weshayutin)
Changed in tripleo:
status: Triaged → Invalid