dlrn_promoter is updating links in dlrn_trunk prior to all the containers being pushed

Bug #1846662 reported by wes hayutin on 2019-10-04
This bug affects 1 person
Affects: tripleo
Importance: Critical
Assigned to: Unassigned

Bug Description

Container promotion in progress:

2019-10-03 23:19:19,025 23202 DEBUG promoter periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-stein-upload at 2019-10-01T08:32:12, logs at https://logs.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-stein-upload/1e13973
2019-10-03 23:19:19,026 23202 DEBUG promoter periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-stein at 2019-10-01T09:38:25, logs at https://logs.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-stein/2bd88dd
2019-10-03 23:19:19,026 23202 INFO promoter Promoting the container images for dlrn hash a361c5edd5734320f781e10358a602f4ba57fc04 on stein to current-tripleo
2019-10-03 23:19:19,026 23202 INFO promoter Running: env ANSIBLE_LOG_PATH=/home/centos/promoter_logs/container-push/20191003-231919.log RELEASE=stein COMMIT_HASH=a361c5edd5734320f781e10358a602f4ba57fc04 DISTRO_HASH=647b08e47a72f7533142c09074f227628f08f9fa FULL_HASH=a361c5edd5734320f781e10358a602f4ba57fc04_647b08e4 PROMOTE_NAME=current-tripleo SCRIPT_ROOT=/home/centos/ci-config/ DISTRO_NAME=centos DISTRO_VERSION=7 ansible-playbook /home/centos/ci-config/ci-scripts/container-push/container-push.yml

The dlrn_trunk symlink has already been updated:

 delorean.repo 2019-10-01 03:56 230
[delorean]
name=delorean-openstack-tripleo-heat-templates-7e7d310e1b7ef51a163560cfa56c675afa3a1683
baseurl=https://trunk.rdoproject.org/centos7-stein/7e/7d/7e7d310e1b7ef51a163560cfa56c675afa3a1683_a3792cfa
enabled=1
gpgcheck=0
priority=1

[DIR] current-tripleo/ 2019-10-03 21:43 -

wes hayutin (weshayutin) wrote :

The order of promotion needs to be preserved in case something goes wrong while things promote.
The order needs to be:

1. Pull / tag / push containers
2. Update overcloud image links
3. Update dlrn_trunk links

Containers are now being requested on the promoted hash because dlrn_trunk was updated; however, the containers may not be in docker.io yet.
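
The ordering constraint above could be sketched roughly as follows. This is only an illustration, not the promoter's actual code: `promote`, `PromotionError`, and the step callables passed in are hypothetical names. The point is that the steps run strictly in sequence and the first failure aborts the run, so the dlrn_trunk link is never touched unless the container push and image link update both succeeded.

```python
from typing import Callable, List, Tuple


class PromotionError(Exception):
    """Raised when a promotion step fails (hypothetical helper type)."""

    def __init__(self, failed_step: str, completed: List[str], cause: Exception):
        super().__init__(f"promotion aborted at '{failed_step}': {cause}")
        self.failed_step = failed_step
        self.completed = completed


def promote(steps: List[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run the steps in the given order and stop at the first failure.

    Returns the names of the completed steps when everything succeeds.
    """
    completed: List[str] = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # Abort here: later steps (e.g. the dlrn_trunk link) must not run.
            raise PromotionError(name, completed, exc) from exc
        completed.append(name)
    return completed
```

A caller would pass the steps in the required order, for example `promote([("containers", push_containers), ("images", update_image_links), ("dlrn_trunk", update_dlrn_link)])`, where the three callables are placeholders for whatever actually performs each step.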

Gabriele Cerami (gcerami) wrote :

Something weird happened somewhere, because the order in the script has not changed. Not in the code, not in the promoter server.

Gabriele Cerami (gcerami) wrote :

I see the promoter was trying to promote a361c5edd5734320f781e10358a602f4ba57fc04, but the dlrn.repo you showed has 7e7d310e1b7ef51a163560cfa56c675afa3a1683.
Where did you see the promoter pushing containers for a promoted hash?

Ronelle Landy (rlandy) wrote :

As far as we can tell:

 - docker.io: consider https://hub.docker.com/r/tripleostein/centos-binary-mariadb/tags

current-tripleo: 335.01 MB, last updated 14 hours ago by rdotripleomirror
 --- vs ---
a361c5edd5734320f781e10358a602f4ba57fc04_647b08e4: 335.01 MB, last updated 14 hours ago by rdotripleomirror

so hash a361c matches current-tripleo

which is consistent with images:

https://images.rdoproject.org/centos7/stein/rdo_trunk/current-tripleo/undercloud.qcow2.md5
https://images.rdoproject.org/centos7/stein/rdo_trunk/a361c5edd5734320f781e10358a602f4ba57fc04_647b08e4/undercloud.qcow2.md5

compare with rdo registry:

a361c5edd5734320f781e10358a602f4ba57fc04_647b08e4_x86_64 (pushed image, 3 days ago)
sha256:8b7375d81bd9685cd076b9e9e54b9bc122cf2d2c373fe45222fa1887d69ef713

current-tripleo (pushed image, 3 days ago)
sha256:8b7375d81bd9685cd076b9e9e54b9bc122cf2d2c373fe45222fa1887d69ef713

Now let's look at the referenced jobs:

Now the dlrn repo shows: name=delorean-openstack-tripleo-common-a361c5edd5734320f781e10358a602f4ba57fc04

but in the referenced jobs ...
baseurl=http://mirror.regionone.rdo-cloud-tripleo.rdoproject.org:8080/rdo/centos7-stein/66/6d/666df6cc4171f6ecebafb083a778aa4597301c13_df7e45b9

^^ not yet updated.

Ronelle Landy (rlandy) wrote :

This is a problem though .... https://logs.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-stein/2bd88dd/logs/undercloud/etc/yum.repos.d/delorean.repo.txt.gz

hash 666df6cc4171f6ecebafb083a778aa4597301c13_df7e45b9

is in use while trying to promote

a361c5edd5734320f781e10358a602f4ba57fc04&distro_hash=647b08e47a72f7533142c09074f227628f08f9fa

It's not repeatable though - the next hash matches dlrn repo to hash under test.

Possible that this happened just once???

wes hayutin (weshayutin) wrote :

Perhaps, before pushing containers, ensure the current-tripleo repo is NOT yet at the same hash as the promotion. If it is, throw an error in the tests.
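
A minimal sketch of such a guard, assuming the text of the current-tripleo delorean.repo has already been fetched and that the last path component of its baseurl is the full hash (as in the repo file quoted in the description). The function names `repo_full_hash` and `ensure_not_already_promoted` are made up for illustration and do not come from the promoter code.

```python
import configparser


def repo_full_hash(repo_text: str, section: str = "delorean") -> str:
    """Return the last path component of baseurl, e.g. '<commit>_<distro>'."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    baseurl = parser[section]["baseurl"]
    return baseurl.rstrip("/").rsplit("/", 1)[-1]


def ensure_not_already_promoted(repo_text: str, candidate_full_hash: str) -> None:
    """Raise if the promoted repo already points at the candidate hash.

    Intended to run before any containers are pushed.
    """
    current = repo_full_hash(repo_text)
    if current == candidate_full_hash:
        raise RuntimeError(
            f"current-tripleo already points at {candidate_full_hash}; "
            "the dlrn link was updated before the container push"
        )
```

With the repo file from the description, `repo_full_hash` would give the 7e7d310e... hash, so a check against a361c5edd5734320f781e10358a602f4ba57fc04_647b08e4 would pass without raising.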

wes hayutin (weshayutin) wrote :

Please update the bug w/ the review that adds tests.. removing promotion blocker

tags: removed: promotion-blocker

Marios Andreou (marios-b) wrote :

I am looking here because marios|rover but I posted something yesterday that may be relevant

It's just a WIP for now for https://tree.taiga.io/project/tripleo-ci-board/task/1325 - continue digging/staging dlrnapi_promoter.py @ https://review.rdoproject.org/r/23037

We discussed a bit with panda yesterday - adding it elsewhere (e.g. in the ansible tasks) is another option to explore. This is in the promoter.py itself.

Changed in tripleo:
milestone: train-rc1 → ussuri-1