Promoter raises PromotionError after successful promotion: candidate hash timestamps are not gathered correctly.

Bug #1874839 reported by Gabriele Cerami
Affects: tripleo
Status: Incomplete
Importance: High
Assigned to: Gabriele Cerami
Milestone: none

Bug Description

The logs at

http://promoter.rdoproject.org/centos8_master.log-20200423

show that the promoter fails with:

2020-04-23 22:07:02,344 28533 ERROR promoter Dlrn promote 'aggregate: 724a63c3a2f42ab8fda68b9f796c246a, commit: 75e2d0853ee460b6d4d8b9f42208b6e95bf3e3e0, distro: 99cfdee900f916eb84b606955e081b4a01f08d3d, component: ui, timestamp: 1587659479' from tripleo-ci-testing to current-tripleo: (subhash aggregate: 724a63c3a2f42ab8fda68b9f796c246a, commit: 75e2d0853ee460b6d4d8b9f42208b6e95bf3e3e0, distro: 99cfdee900f916eb84b606955e081b4a01f08d3d, component: ui, timestamp: None) API returned different promoted hash: 'aggregate: 724a63c3a2f42ab8fda68b9f796c246a, commit: d24de1440e562f55f015829a8e18a339cc2b1343, distro: 88306870fae5b8186c4ab72b86840deee1b57edf, component: compute, timestamp: None'
2020-04-23 22:07:02,344 28533 ERROR promoter Candidate hash 'aggregate: 724a63c3a2f42ab8fda68b9f796c246a, commit: 75e2d0853ee460b6d4d8b9f42208b6e95bf3e3e0, distro: 99cfdee900f916eb84b606955e081b4a01f08d3d, component: ui, timestamp: None': client dlrn_client FAILED promotion attempt to current-tripleo
2020-04-23 22:07:02,344 28533 ERROR promoter API returned different promoted hash
Traceback (most recent call last):
  File "/home/centos/ci-config-refactored/ci-scripts/dlrnapi_promoter/logic.py", line 140, in promote
    candidate_label=candidate_label)
  File "/home/centos/ci-config-refactored/ci-scripts/dlrnapi_promoter/dlrn_client.py", line 359, in promote
    candidate_label=candidate_label)
  File "/home/centos/ci-config-refactored/ci-scripts/dlrnapi_promoter/dlrn_client.py", line 550, in promote_hash
    raise PromotionError("API returned different promoted hash")
PromotionError: API returned different promoted hash
2020-04-23 22:07:02,347 28533 ERROR promoter Error while trying to promote tripleo-ci-testing to current-tripleo

The summary then shows:

2020-04-23 22:07:04,994 28533 INFO promoter Summary: Promoted 0 hashes this round
2020-04-23 22:07:04,994 28533 INFO promoter ------- -------- Promoter terminated normally

But if we check the DLRN API, the hash has in fact been promoted.

Tags: ci
Gabriele Cerami (gcerami) wrote :

This is a problem in a runtime check.
The promoter uses the batch promotion API call to promote. This call takes a list of hashes to promote together. The hashes are processed in order, and the last hash promoted is the one bound to the aggregate hash. The API then returns the aggregate hash with the bound commit/distro hash as proof that the promotion succeeded.
The promoter then checks that the aggregate hash in the request matches the aggregate hash in the response, including the bound commit/distro hash.

In this case, the requested and the promoted aggregate hashes are bound to different commit/distro hashes.

So the error is in the expectation of the runtime check, not in the promotion process itself.
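
For illustration, the check described above can be sketched roughly as follows. This is not the promoter's actual code: `Hash`, `batch_promote` (a stand-in that only simulates the batch promotion API as described in this comment) and `check_promotion` are hypothetical names, and the hash values are shortened forms of the ones in the log.

```python
from collections import namedtuple

# Hypothetical stand-in for one component's commit/distro hash pair.
Hash = namedtuple("Hash", ["commit", "distro", "component"])


def batch_promote(hashes, aggregate):
    """Simulate the batch promotion API (assumed behavior, per the comment).

    Hashes are processed in order; the last one promoted is reported
    as the one bound to the aggregate hash in the response.
    """
    last = hashes[-1]
    return {
        "aggregate": aggregate,
        "commit": last.commit,
        "distro": last.distro,
        "component": last.component,
    }


def check_promotion(expected, aggregate, response):
    """Return True if the response binds the aggregate to the expected hash."""
    return (
        response["aggregate"] == aggregate
        and response["commit"] == expected.commit
        and response["distro"] == expected.distro
    )


ui = Hash("75e2d08", "99cfdee", "ui")
compute = Hash("d24de14", "8830687", "compute")
aggregate = "724a63c"

# The promoter expects "ui" to be bound to the aggregate, but the list it
# sends ends with "compute", so the check fails even though the promotion
# itself went through.
response = batch_promote([ui, compute], aggregate)
print(check_promotion(ui, aggregate, response))       # False
print(check_promotion(compute, aggregate, response))  # True
```

The aggregate hash matches in both cases; only the bound commit/distro pair differs, which is the mismatch reported in the log.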

Gabriele Cerami (gcerami) wrote :

The code that assembles the list of hashes for batch promotion sorts the hashes by timestamp, which is a fair assumption: the commit/distro hash that was promoted last is usually the one bound to the aggregate hash.
The problem is that the hash timestamp is gathered incorrectly, from the dt_commit field of the component's commit.yaml. None of the information in commit.yaml refers to the promotion timestamp.

The timestamp should be gathered directly from the promotion query API.
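
A minimal sketch of why the choice of timestamp matters, under stated assumptions: the records below are hypothetical, `dt_commit` mimics the value read from commit.yaml, `promotion_timestamp` mimics a value obtained from a promotion query, and all numbers are invented for illustration. It is one plausible reading of the failure, not the promoter's code.

```python
# Hypothetical per-component records; all timestamp values are invented.
components = [
    {"component": "ui", "dt_commit": 100, "promotion_timestamp": 3000},
    {"component": "compute", "dt_commit": 300, "promotion_timestamp": 1000},
]


def last_by(records, key):
    """Return the component that sorts last under the given timestamp key."""
    return sorted(records, key=lambda record: record[key])[-1]["component"]


# Sorting by commit time puts "compute" last in the batch, so it is the
# hash the API would report as bound to the aggregate.
last_by_commit = last_by(components, "dt_commit")

# Sorting by promotion time puts "ui" last, matching the candidate hash.
last_by_promotion = last_by(components, "promotion_timestamp")

print(last_by_commit)     # compute
print(last_by_promotion)  # ui
```

When the two orderings disagree, the batch ends with a different hash than the one the runtime check expects, which would produce the "API returned different promoted hash" error even though the promotion succeeded.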

Working on a fix.

wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc1 → ussuri-rc3
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc3 → victoria-1
Changed in tripleo:
milestone: victoria-1 → victoria-3
Changed in tripleo:
milestone: victoria-3 → wallaby-1
Changed in tripleo:
milestone: wallaby-1 → wallaby-2
Changed in tripleo:
milestone: wallaby-2 → wallaby-3
Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Marios Andreou (marios-b) wrote :

This is an automated action. Bug status has been set to 'Incomplete' and target milestone has been removed due to inactivity. If you disagree please re-set these values and reach out to us on freenode #tripleo

Changed in tripleo:
milestone: xena-1 → none
status: In Progress → Incomplete