periodic-tripleo-ci-centos-9-scenario010-kvm-internal-standalone-master is failing tempest octavia tests - show_pool operating_status failed to update to ONLINE within the required time 300. Current status of show_pool: OFFLINE

Bug #1992668 reported by Ronelle Landy
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
tripleo   Fix Released   Critical     Unassigned

Bug Description

The scenario010 kvm tests were failing sporadically. Since then, a dedicated nested-virt aggregate has been set up. When testing this aggregate, the master jobs fail with a new error.

periodic-tripleo-ci-centos-9-scenario010-kvm-internal-standalone-master is failing tempest octavia tests with:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/octavia_tempest_plugin/tests/scenario/v2/test_pool.py", line 113, in test_UDP_LC_pool_with_listener_CRUD
    self._test_pool_CRUD(listener_protocol=const.UDP,
  File "/usr/lib/python3.9/site-packages/octavia_tempest_plugin/tests/scenario/v2/test_pool.py", line 514, in _test_pool_CRUD
    pool = waiters.wait_for_status(
  File "/usr/lib/python3.9/site-packages/octavia_tempest_plugin/tests/waiters.py", line 96, in wait_for_status
    raise exceptions.TimeoutException(message)
tempest.lib.exceptions.TimeoutException: Request timed out
Details: (PoolScenarioTest:test_UDP_LC_pool_with_listener_CRUD) show_pool operating_status failed to update to ONLINE within the required time 300. Current status of show_pool: OFFLINE
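For readers unfamiliar with this kind of failure, the traceback comes from a polling waiter: the test repeatedly calls show_pool and gives up once the timeout elapses without operating_status reaching ONLINE. The following is a minimal sketch of that pattern only. It is not the code in octavia_tempest_plugin/tests/waiters.py; the signature, the TimeoutException stand-in class, and the clock/sleep parameters are assumptions made for illustration.

```python
import time


class TimeoutException(Exception):
    """Hypothetical stand-in for tempest.lib.exceptions.TimeoutException."""


def wait_for_status(show_func, status_key, expected, timeout, interval,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll show_func() until its result has status_key == expected.

    Raises TimeoutException if `timeout` seconds pass first.
    This signature is illustrative, not the plugin's real one.
    """
    start = clock()
    while True:
        obj = show_func()
        current = obj[status_key]
        if current == expected:
            return obj
        if clock() - start >= timeout:
            raise TimeoutException(
                f"{show_func.__name__} {status_key} failed to update to "
                f"{expected} within the required time {timeout}. "
                f"Current status of {show_func.__name__}: {current}"
            )
        sleep(interval)
```

With a show function that always reports OFFLINE, the sketch raises a message of the same shape as the one in the log above. A pool that stays OFFLINE usually means no healthy report is arriving from the amphora, which is consistent with the cause identified later in this report.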

Example log:

https://sf.hosted.upshift.rdu2.redhat.com/logs/82/430582/1/check/periodic-tripleo-ci-centos-9-scenario010-kvm-internal-standalone-master/e65121a/logs/undercloud/var/log/tempest/stestr_results.html

Same error on vexxhost:

https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-scenario010-kvm-standalone-master/440bdf0/logs/undercloud/var/log/tempest/stestr_results.html.gz

Wallaby tests pass - so this is master only.

Revision history for this message
Ronelle Landy (rlandy) wrote :
tags: added: promotion-blocker
Changed in tripleo:
milestone: none → zed-1
importance: Undecided → Critical
status: New → Triaged
description: updated
Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :

There is a mismatch between the version of the control plane and the version of the amphora image:

on the control plane:
/usr/bin/octavia-worker version 11.1.0.dev10
(this is master)

in the amphora:
/usr/bin/amphora-agent version 9.1.0.dev56
(9 is xena)

It appears that the RDO amphora image for master hasn't been updated since February:
https://images.rdoproject.org/octavia/master/
amphora-x64-haproxy-..> 2022-02-24 00:15 584M
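The mismatch can be shown by comparing the leading component of the two version strings quoted in this comment. This is a rough sketch: the helper names are hypothetical, and treating a differing major version as a mismatch is an assumption for illustration, not a rule taken from Octavia itself.

```python
def major_version(version: str) -> int:
    """Return the leading numeric component of a string like '11.1.0.dev10'."""
    return int(version.split(".", 1)[0])


def versions_match(control_plane: str, amphora_agent: str) -> bool:
    """Assumed check: both sides report the same major version."""
    return major_version(control_plane) == major_version(amphora_agent)


worker = "11.1.0.dev10"  # /usr/bin/octavia-worker, as quoted above
agent = "9.1.0.dev56"    # /usr/bin/amphora-agent, as quoted above

print(major_version(worker), major_version(agent))  # prints: 11 9
print(versions_match(worker, agent))                # prints: False
```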

Revision history for this message
Jakob Meng (jm1337) wrote :

Last success of build job 'octavia-build-amphora-centos8' was on 2022-02-24 00:01:48:

https://review.rdoproject.org/zuul/builds?job_name=octavia-build-amphora-centos8&branch=master&result=SUCCESS&skip=0

Revision history for this message
Jakob Meng (jm1337) wrote :

Patch to add 'octavia-build-amphora-centos9' job:

  https://review.rdoproject.org/r/c/rdo-jobs/+/45627

Testproject:

  https://review.rdoproject.org/r/c/testproject/+/45628

The latter fails during image build with

  error: can't create transaction lock on /tmp/dib_build.8AioN7ve/mnt/var/lib/rpm/.rpm.lock (Permission denied)

in

  https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/elements/yum-minimal/root.d/08-yum-chroot

logs: https://logserver.rdoproject.org/28/45628/2/check/octavia-build-amphora-centos9/437a634/job-output.txt

Revision history for this message
Jakob Meng (jm1337) wrote :

With Gregory's SELinux patch [1] applied to playbooks/octavia-build-amphora/run.yaml, the job gets past the previous step and now fails with error [2]:

  No package basesystem available.
  Exiting due to strict setting.
  Error: No package basesystem available.
  Failed to download initial packages: basesystem filesystem setup centos-gpg-keys centos-linux-release centos-linux-repos

in the same file root.d/08-yum-chroot [3].

[1] https://review.rdoproject.org/r/c/rdo-jobs/+/45632
[2] https://logserver.rdoproject.org/28/45628/3/check/octavia-build-amphora-centos9/783933c/job-output.txt
[3] https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/elements/yum-minimal/root.d/08-yum-chroot

Revision history for this message
Jakob Meng (jm1337) wrote :

Having adapted the octavia-build-amphora/run.yaml playbook to CentOS 9 Stream [1], the testproject job [2] is failing with the same error as the centos8 job [3]:

 dib-run-parts Running /tmp/in_target.d/post-install.d/12-enable-prometheus-proxy-systemd
 Failed to enable unit, unit prometheus-proxy.service does not exist.

[1] https://review.rdoproject.org/r/c/rdo-jobs/+/45637
[2] https://review.rdoproject.org/r/c/testproject/+/45628
[3] https://logserver.rdoproject.org/28/45628/4/check/octavia-build-amphora-centos9/9eee4b0/job-output.txt

Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :

Commit proposed to openstack/octavia master:

861145: Fix prometheus-proxy service name in Red Hat-based distros | https://review.opendev.org/c/openstack/octavia/+/861145

Revision history for this message
Jakob Meng (jm1337) wrote :

Patch to update OpenStack releases for periodic octavia-build-amphora-centos{7,8,9} jobs:

  https://review.rdoproject.org/r/c/config/+/45638

Revision history for this message
Jakob Meng (jm1337) wrote :

Alfredo built a temporary yum repo [1] containing octavia rpms with Gregory's upstream patch [2] and included it in our testproject [3],[4]. With that, the new periodic job octavia-build-amphora-centos9 passes.

[1] https://review.rdoproject.org/r/c/openstack/octavia-distgit/+/45639
[2] https://review.opendev.org/c/openstack/octavia/+/861145
[3] https://review.rdoproject.org/r/c/rdo-jobs/+/45640
[4] https://review.rdoproject.org/r/c/testproject/+/45628

Revision history for this message
Jakob Meng (jm1337) wrote :

All patches for rdo c9 master have been merged [1]. The c9 master octavia component has been promoted [2], but a promotion of the integration line is still outstanding before we see passes on the periodic image build job [3] and new images in [4].

Gregory's patch [5] still has to be backported to Wallaby etc. and then promoted through component and integration lines!

[1] https://review.rdoproject.org/r/q/topic:octavia-build-amphora
[2] https://trunk.rdoproject.org/centos9-master/component/octavia/promoted-components/versions.csv
[3] https://review.rdoproject.org/r/c/config/+/45638
[4] https://images.rdoproject.org/octavia/master/
[5] https://review.opendev.org/c/openstack/octavia/+/861145

Revision history for this message
Ananya Banerjee (frenzyfriday) wrote :

Octavia was promoted. We are waiting for master promotion.

Revision history for this message
Ananya Banerjee (frenzyfriday) wrote :

We had a master promotion. The job had a green run. Adding it back to criteria.

Ronelle Landy (rlandy)
Changed in tripleo:
status: Triaged → Fix Released
Revision history for this message
Jakob Meng (jm1337) wrote :

Build job octavia-build-amphora-centos9 is still failing for the yoga and zed branches [1] because Gregory's patch [2] has yet to be backported to Yoga and Zed and then promoted through the component and integration lines.

[1] https://review.rdoproject.org/zuul/builds?job_name=octavia-build-amphora-centos9
[2] https://review.opendev.org/c/openstack/octavia/+/861145

Changed in tripleo:
status: Fix Released → Triaged
Revision history for this message
Jakob Meng (jm1337) wrote :

For Yoga, Gregory had to backport a patch to build proper Octavia rpms:

https://review.rdoproject.org/r/c/openstack/octavia-distgit/+/45775

Revision history for this message
Jakob Meng (jm1337) wrote :

Cherry-picks for Yoga and Zed have been merged. The image build job for Zed has passed [1], and the one for Yoga will pass once [2] has been merged.

[1] https://review.rdoproject.org/zuul/builds?job_name=octavia-build-amphora-centos9
[2] https://review.rdoproject.org/r/c/openstack/octavia-distgit/+/45775

Revision history for this message
Jakob Meng (jm1337) wrote :

Now that all build jobs are passing, we are done with this bug :D

Changed in tripleo:
status: Triaged → Fix Released