periodic jobs delorean reporting broken when using featureset_override['dlrn_hash_tag']

Bug #1933448 reported by Marios Andreou
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Marios Andreou

Bug Description

As seen during testing at [1], when running a periodic job for a particular delorean hash (i.e. overriding with featureset_override['dlrn_hash_tag']), the reporting of the job result to delorean is broken. In [2], with dlrn_hash_tag: 4a47487922712b220c9e80ee0ca16ed9, you can see that the correct hash is specified for the containers:

 2021-06-22 16:27:34.055732 | primary | TASK [container-prep : echo container_build_id] ********************************
 2021-06-22 16:27:34.055765 | primary | Tuesday 22 June 2021 16:27:34 +0000 (0:00:00.031) 0:01:44.634 **********
 2021-06-22 16:27:34.071185 | primary | ok: [undercloud] => {
 2021-06-22 16:27:34.071227 | primary | "container_build_id": "4a47487922712b220c9e80ee0ca16ed9"
 2021-06-22 16:27:34.071239 | primary | }

but the hash reported by delorean is instead pulled from the current value of tripleo-ci-testing:

 2021-06-22 17:19:29.990691 | primary | + dlrnapi --url https://trunk.rdoproject.org/api-centos8-ussuri report-result --agg-hash 8067ada389f4f00c33cdd36ac7892dc2 --job-id periodic-tripleo-ci-centos-8-standalone-on-multinode-ipa-ussuri --info-url https://logserver.rdoproject.org/56/34256/2/check/periodic-tripleo-ci-centos-8-standalone-on-multinode-ipa-ussuri/ff76ac6 --timestamp 1624382369 --success True
 2021-06-22 17:19:32.053382 | primary | {
 2021-06-22 17:19:32.053521 | primary | "aggregate_hash": "8067ada389f4f00c33cdd36ac7892dc2",

The problem comes from [3], where hash_info is populated for delorean reporting. In particular, the get_hash role makes no attempt to use the passed delorean hash; instead it directly takes {{ promote_source }}/delorean.repo.md5 at [4].

To be clear, this is a bug against tripleo-ci ruck|rover tooling and should not impact regular users. However, being able to override a hash is an important part of the ruck|rover toolbox, so this needs to be fixed.

[1] https://review.rdoproject.org/r/c/testproject/+/34256/2/.zuul.yaml
[2] https://logserver.rdoproject.org/56/34256/2/check/periodic-tripleo-ci-centos-8-standalone-on-multinode-ipa-ussuri/ff76ac6/job-output.txt
[3] https://github.com/rdo-infra/review.rdoproject.org-config/blob/7cd678523975efa5235e1b034bab0e23344605cf/playbooks/tripleo-ci-periodic-base/pre.yaml#L6-L10
[4] https://github.com/rdo-infra/ci-config/blob/a0c8e40e395cbf64e301378a370138c54f7be742/ci-scripts/infra-setup/roles/get_hash/tasks/get_hash.yaml#L49-L61

Tags: ci
Marios Andreou (marios-b) wrote :

I posted a test fix at [1] so when featureset_override is provided with a delorean hash it is used directly for the md5.
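
The intended behaviour can be sketched roughly as follows. This is a Python illustration only, not the actual Ansible task in get_hash.yaml (which is not reproduced here). The function name select_report_hash and the fetch_md5 callback are hypothetical; only featureset_override, dlrn_hash_tag, promote_source and delorean.repo.md5 come from this report.

```python
from typing import Callable, Mapping, Optional


def select_report_hash(
    featureset_override: Optional[Mapping[str, str]],
    promote_source: str,
    fetch_md5: Callable[[str], str],
) -> str:
    """Return the hash that should be reported to delorean.

    Hypothetical helper: prefer an explicitly passed dlrn_hash_tag and
    only fall back to <promote_source>/delorean.repo.md5 when none is set.
    """
    override = (featureset_override or {}).get("dlrn_hash_tag")
    if override:
        # The user passed a hash: use it directly for the md5.
        return override
    # No override given: look up the current hash for the promotion name.
    return fetch_md5(f"{promote_source}/delorean.repo.md5").strip()


if __name__ == "__main__":
    # Stand-in for the real fetch (assumption): always returns the hash
    # that tripleo-ci-testing pointed at in the first log excerpt.
    def fake_fetch(path: str) -> str:
        return "8067ada389f4f00c33cdd36ac7892dc2\n"

    print(select_report_hash(
        {"dlrn_hash_tag": "4a47487922712b220c9e80ee0ca16ed9"},
        "tripleo-ci-testing",
        fake_fetch,
    ))
```

With an override the passed hash wins; without one the behaviour stays as it is today, so regular periodic runs should be unaffected.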

I am trying to test it with [2], except the fix isn't included via depends-on. It is in ci-config and, besides the fact that delorean probably can't build the ci-config repo (I cannot see any related distgit at https://github.com/rdo-packages?q=ci-config&type=&language=&sort= ), the conditional at [3] fails because zuul.override_repo is stable/ussuri for the periodic-tripleo-ci-centos-8-standalone-on-multinode-ipa-ussuri job running in my testproject, e.g. logs at [4].

Not sure how to proceed... I have considered whether we can enhance build-test-packages to apply changes from git directly to a checkout. Even if we do want to do that, I don't think it is something that can be done in this ruck|rover cycle, and it needs discussion etc.

We could try merging https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/34275 - the conditional there means it only takes effect when the user specifies the hash. It might fix the issue, or we may need something further, which we can iterate on.

[1] https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/34275/1/ci-scripts/infra-setup/roles/get_hash/tasks/get_hash.yaml
[2] https://review.rdoproject.org/r/c/testproject/+/34256
[3] https://opendev.org/openstack/tripleo-quickstart-extras/src/commit/e423bb068a96ed919ead08226f9447b2cdfc0332/roles/build-test-packages/tasks/main.yml#L200
[4] https://logserver.rdoproject.org/56/34256/3/check/periodic-tripleo-ci-centos-8-standalone-on-multinode-ipa-ussuri/7385797/zuul-info/inventory.yaml

Marios Andreou (marios-b) wrote :

Moving this to Fix Released - the patch at https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/34275 worked.

Example at [1] where we are specifying

    "dlrn_hash_tag: 6439b21a91a11b464ad5b2cc147e81cd"

From the logs at [2] you can see:

 2021-06-29 17:50:14.267246 | primary | "container_build_id": "6439b21a91a11b464ad5b2cc147e81cd"
 ...
 2021-06-29 17:42:23.565173 | TASK [get_hash : If set use a passed hash 6439b21a91a11b464ad5b2cc147e81cd]
 ...
 2021-06-29 21:07:48.633352 | primary | + dlrnapi --url https://trunk.rdoproject.org/api-centos8-master-uc report-result --agg-hash 6439b21a91a11b464ad5b2cc147e81cd --job-id periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master --info-url https://logserver.rdoproject.org/25/34325/2/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master/4c1f33c --timestamp 1625000868 --success True
 2021-06-29 21:07:50.381573 | primary | {
 2021-06-29 21:07:50.381716 | primary | "aggregate_hash": "6439b21a91a11b464ad5b2cc147e81cd",

And the hash_info.sh file contains the specified hash at [3].
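
For anyone repeating this check on another run, the comparison done by eye above could be scripted along these lines. This is only a sketch: the helper names are made up for illustration, and it assumes job-output.txt contains a "container_build_id" line and a dlrnapi report-result line with --agg-hash in the same shape as the excerpts in this report.

```python
import re
from typing import Optional, Pattern

# Patterns modelled on the log excerpts in this report (assumed stable).
BUILD_ID_RE = re.compile(r'"container_build_id":\s*"([0-9a-f]+)"')
AGG_HASH_RE = re.compile(r"--agg-hash\s+([0-9a-f]+)")


def first_match(pattern: Pattern[str], text: str) -> Optional[str]:
    """Return the first captured hash in text, or None if absent."""
    match = pattern.search(text)
    return match.group(1) if match else None


def reported_hash_matches(job_output: str, expected_hash: str) -> bool:
    """True when both the container build id and the hash passed to
    dlrnapi report-result equal the hash given via dlrn_hash_tag."""
    return (
        first_match(BUILD_ID_RE, job_output) == expected_hash
        and first_match(AGG_HASH_RE, job_output) == expected_hash
    )
```

Run against the job output from the original report, this would return False (the containers used the requested hash but dlrnapi reported a different one); against the output at [2] it should return True.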

[1] https://review.rdoproject.org/r/c/testproject/+/34325/2/.zuul.yaml
[2] https://logserver.rdoproject.org/25/34325/2/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master/4c1f33c/job-output.txt
[3] https://logserver.rdoproject.org/25/34325/2/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master/4c1f33c/logs/undercloud/home/zuul/workspace/hash_info.sh.txt.gz

Changed in tripleo:
status: Triaged → Fix Released