periodic-tripleo-ci-centos-9-scenario003-standalone-wallaby is failing deploy - Failed containers: designate_db_sync

Bug #1989795 reported by Ronelle Landy
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

periodic-tripleo-ci-centos-9-scenario003-standalone-wallaby has been failing the standalone deployment since 09/13:

https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-scenario003-standalone-wallaby&skip=0

The error is in container designate_db_sync:

2022-09-15 07:01:15.905266 | | WARNING | ERROR: Can't run container designate_db_sync
stderr: + sudo -E kolla_set_configs
sudo: unable to send audit message: Operation not permitted
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Creating directory /etc/designate/private
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/designate/private/bind1.conf to /etc/designate/private/bind1.conf
INFO:__main__:Deleting /etc/designate/designate.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/designate/designate.conf to /etc/designate/designate.conf
INFO:__main__:Creating directory /etc/my.cnf.d
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/my.cnf.d/tripleo.cnf to /etc/my.cnf.d/tripleo.cnf
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/log/designate
++ cat /run_command
+ CMD='/usr/bin/bootstrap_host_exec designate_central su designate -s /bin/bash -c '\''designate-manage --config-file /etc/designate/designate.conf database sync'\'''
+ ARGS=
+ [[ ! -n '' ]]
+ . kolla_extend_start
+ echo 'Running command: '\''/usr/bin/bootstrap_host_exec designate_central su designate -s /bin/bash -c '\''designate-manage --config-file /etc/designate/designate.conf database sync'\'''\'''
+ exec /usr/bin/bootstrap_host_exec designate_central su designate -s /bin/bash -c ''\''designate-manage' --config-file /etc/designate/designate.conf database 'sync'\'''
2022-09-15 07:01:15.907051 | fa163e81-657f-2e96-a17f-0000000023f0 | FATAL | Create containers managed by Podman for /var/lib/tripleo-config/container-startup-config/step_3 | standalone | error={"changed": false, "msg": "Failed containers: designate_db_sync"}
2022-09-15 07:01:15.907919 | fa163e81-657f-2e96-a17f-0000000023f0 | TIMING | tripleo_container_manage : Create containers managed by Podman for /var/lib/tripleo-config/container-startup-config/step_3 | standalone | 0:15:22.822866 | 44.24s
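
When scanning many standalone_deploy.log files for the same failure, the container names can be pulled out of the FATAL line mechanically. The following is a minimal sketch, not part of tripleo-ci: the helper name `failed_containers` is hypothetical, and it assumes the line ends with an `error={...}` JSON payload as in the excerpt above and that multiple names, if present, would be comma-separated.

```python
import json
import re

# Matches the JSON payload that follows "error=" at the end of a FATAL line.
ERROR_PAYLOAD = re.compile(r"error=(\{.*\})\s*$")
PREFIX = "Failed containers:"


def failed_containers(line):
    """Return the container names from a 'Failed containers: ...' message.

    Hypothetical helper; assumes names are comma-separated when there is
    more than one (the log above only shows a single name).
    """
    match = ERROR_PAYLOAD.search(line)
    if not match:
        return []
    msg = json.loads(match.group(1)).get("msg", "")
    if not msg.startswith(PREFIX):
        return []
    names = msg[len(PREFIX):].split(",")
    return [name.strip() for name in names if name.strip()]


fatal_line = (
    "2022-09-15 07:01:15.907051 | fa163e81-657f-2e96-a17f-0000000023f0 | FATAL | "
    "Create containers managed by Podman for "
    "/var/lib/tripleo-config/container-startup-config/step_3 | standalone | "
    'error={"changed": false, "msg": "Failed containers: designate_db_sync"}'
)
print(failed_containers(fatal_line))
```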

Example logs:

https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-scenario003-standalone-wallaby/f973cd2/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-scenario003-standalone-wallaby/e700966/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-scenario003-standalone-wallaby/c833bef/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

Ronelle Landy (rlandy) wrote :
Changed in tripleo:
milestone: none → zed-1
importance: Undecided → Critical
status: New → Triaged
tags: added: promotion-blocker
Michael Johnson (johnsom) wrote :

The error is in DB migration 80, which hasn't changed in seven years:

2022-09-15 11:00:37.348 9 CRITICAL designate [designate-manage - - - - -] Unhandled error: oslo_db.exception.DBMigrationError: Neither 'Column' object nor 'Comparator' object has an attribute '_get_table'
2022-09-15 11:00:37.348 9 ERROR designate Traceback (most recent call last):
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib64/python3.9/site-packages/sqlalchemy/sql/elements.py", line 846, in __getattr__
2022-09-15 11:00:37.348 9 ERROR designate return getattr(self.comparator, key)
2022-09-15 11:00:37.348 9 ERROR designate AttributeError: 'Comparator' object has no attribute '_get_table'
2022-09-15 11:00:37.348 9 ERROR designate
2022-09-15 11:00:37.348 9 ERROR designate The above exception was the direct cause of the following exception:
2022-09-15 11:00:37.348 9 ERROR designate
2022-09-15 11:00:37.348 9 ERROR designate Traceback (most recent call last):
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib/python3.9/site-packages/oslo_db/sqlalchemy/migration.py", line 87, in db_sync
2022-09-15 11:00:37.348 9 ERROR designate migration = versioning_api.upgrade(engine, repository, version)
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib/python3.9/site-packages/migrate/versioning/api.py", line 186, in upgrade
2022-09-15 11:00:37.348 9 ERROR designate return _migrate(url, repository, version, upgrade=True, err=err, **opts)
2022-09-15 11:00:37.348 9 ERROR designate File "<decorator-gen-15>", line 2, in _migrate
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib/python3.9/site-packages/migrate/versioning/util/__init__.py", line 167, in with_engine
2022-09-15 11:00:37.348 9 ERROR designate return f(*a, **kw)
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib/python3.9/site-packages/migrate/versioning/api.py", line 366, in _migrate
2022-09-15 11:00:37.348 9 ERROR designate schema.runchange(ver, change, changeset.step)
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib/python3.9/site-packages/migrate/versioning/schema.py", line 93, in runchange
2022-09-15 11:00:37.348 9 ERROR designate change.run(self.engine, step)
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib/python3.9/site-packages/migrate/versioning/script/py.py", line 154, in run
2022-09-15 11:00:37.348 9 ERROR designate script_func(engine)
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib/python3.9/site-packages/designate/storage/impl_sqlalchemy/migrate_repo/versions/080_domain_to_zone_rename.py", line 108, in upgrade
2022-09-15 11:00:37.348 9 ERROR designate drop_foreign_key(fk)
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib/python3.9/site-packages/designate/storage/impl_sqlalchemy/migrate_repo/versions/080_domain_to_zone_rename.py", line 52, in drop_foreign_key
2022-09-15 11:00:37.348 9 ERROR designate table = fk_def[0]._get_table()
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib64/python3.9/site-packages/sqlalchemy/sql/elements.py", line 848, in __getattr__
2022-09-15 11:00:37.348 9 ERROR designate util.raise_(
2022-09-15 11:00:37.348 9 ERROR designate File "/usr/lib64/python3.9/si...


Michael Johnson (johnsom) wrote :

I am not sure why that would start failing now. My guess is that one of the following packages was updated: sqlalchemy, oslo.db, or migrate.

Michael Johnson (johnsom) wrote :

https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-scenario003-standalone-wallaby/f973cd2/logs/undercloud/var/log/dnf.rpm.log.txt.gz

lists:
2022-09-15T06:38:45-0400 SUBDEBUG Installed: python3-sqlalchemy-1.4.37-3.el9.x86_64

Whereas upper-constraints and the passing job list:
SQLAlchemy===1.3.23
2022-09-12T18:07:48-0400 SUBDEBUG Installed: python3-sqlalchemy-1.3.24-1.el9s.x86_64
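
The same comparison can be made across jobs by extracting the installed version from each dnf.rpm.log. This is a minimal sketch under stated assumptions: the helper name `installed_versions` is hypothetical, and the line layout (`<timestamp> SUBDEBUG Installed: <name-version-release.arch>`) is taken only from the two lines quoted above.

```python
import re

# One "Installed:" record per line, as in the dnf.rpm.log excerpts above.
INSTALLED = re.compile(r"^(\S+) SUBDEBUG Installed: (\S+)$")


def installed_versions(lines, package):
    """Return (timestamp, version-release.arch) for each install of `package`.

    Hypothetical helper. The digit check skips sibling packages whose name
    merely starts with `package` followed by a hyphen.
    """
    prefix = package + "-"
    found = []
    for line in lines:
        match = INSTALLED.match(line.strip())
        if not match or not match.group(2).startswith(prefix):
            continue
        rest = match.group(2)[len(prefix):]
        if rest[:1].isdigit():
            found.append((match.group(1), rest))
    return found


failing_job = [
    "2022-09-15T06:38:45-0400 SUBDEBUG Installed: python3-sqlalchemy-1.4.37-3.el9.x86_64",
]
passing_job = [
    "2022-09-12T18:07:48-0400 SUBDEBUG Installed: python3-sqlalchemy-1.3.24-1.el9s.x86_64",
]
print(installed_versions(failing_job, "python3-sqlalchemy"))
print(installed_versions(passing_job, "python3-sqlalchemy"))
```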

Michael Johnson (johnsom) wrote :

Confirmed, the wallaby version of Designate is not compatible with the 1.4.x version of SQLAlchemy.
You will need to go back to aligning to the wallaby upper-constraints.txt version.

(Though I tested 1.3.24 as well and it passes fine)

The changes made for this SQLAlchemy ticket cause the issue (a feature removal in the 1.4 branch):
https://www.sqlalchemy.org/trac/ticket/4755
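
A guard for this could be sketched as a simple version comparison. This is only an illustration: the function names are hypothetical, and the 1.4 ceiling is taken from the statement above that wallaby Designate works with 1.3.23 and 1.3.24 but not with 1.4.x.

```python
def parse_version(text):
    """Turn '1.3.24' or '1.4.37-3.el9.x86_64' into a tuple of leading integers."""
    upstream = text.split("-", 1)[0]  # drop any RPM release/arch suffix
    parts = []
    for piece in upstream.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    if not parts:
        raise ValueError("no numeric version found in %r" % text)
    return tuple(parts)


def works_with_wallaby_designate(version_text, ceiling=(1, 4)):
    """Hypothetical check: True when the SQLAlchemy version is below 1.4."""
    return parse_version(version_text) < ceiling


for candidate in ("1.3.23", "1.3.24-1.el9s.x86_64", "1.4.37-3.el9.x86_64"):
    print(candidate, works_with_wallaby_designate(candidate))
```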

Marios Andreou (marios-b) wrote :

Michael Johnson o/ thank you for digging here, much appreciated.

we merged this on the 13th (when the job started failing): Revert "Downgrade python3-sqlalchemy" https://review.opendev.org/c/openstack/tripleo-quickstart/+/850568/3#message-e80ed32855f76477d3025858e57d44a3c5857f6e

I did some digging, and rdoinfo on wallaby/9 already pins this to the right version, i.e. cloud9s-openstack-wallaby-testing: python-sqlalchemy-1.3.24-1.el9s
https://github.com/redhat-openstack/rdoinfo/blob/819d8b0b549fe88c3979fcfb851ff1a82974c045/buildsys-tags/cloud9s-openstack-wallaby-testing.yml#L842

We can re-add the exclusion at https://review.opendev.org/c/openstack/tripleo-quickstart/+/850568/3/config/release/tripleo-ci/CentOS-9/promotion-testing-hash-wallaby.yml but, given that pin, I am not sure why we need it.

[EDIT]: forgot to add link to the previous related bug https://bugs.launchpad.net/tripleo/+bug/1982227 (where we did the downgrade for sqlalchemy)
[EDIT]: another related bug there https://bugs.launchpad.net/tripleo/+bug/1982195 (we were tracking two apparently)

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-quickstart/+/858030

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-quickstart (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-quickstart/+/858030
Committed: https://opendev.org/openstack/tripleo-quickstart/commit/054053061579c626d960b38cd2bb0874de1651bb
Submitter: "Zuul (22348)"
Branch: master

commit 054053061579c626d960b38cd2bb0874de1651bb
Author: Marios Andreou <email address hidden>
Date: Fri Sep 16 07:55:35 2022 +0000

    Revert "Revert "Downgrade python3-sqlalchemy""

    This reverts commit ab4cff620fc31ecad36b42635a6bf538264eefab.

    Reason for revert: causing problems again new bug @ https://bugs.launchpad.net/tripleo/+bug/1989795

    Change-Id: I998f9e7ab1ef4713660a75f6bc77e5e43a08b549
    Related-Bug: 1989795
    Related-Bug: 1982195

Ronelle Landy (rlandy) wrote :
Changed in tripleo:
status: Triaged → Fix Released
Marios Andreou (marios-b) wrote :

moving this back to in progress

we are running with a pinned version of sqlalchemy, so we need to track the proper fix and when we can revert that revert: https://review.opendev.org/c/openstack/tripleo-quickstart/+/858030

Changed in tripleo:
status: Fix Released → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-quickstart/+/860810

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-quickstart (master)

Change abandoned by "Marios Andreou <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-quickstart/+/860810
Reason: nope! that one instead https://review.opendev.org/c/openstack/tripleo-heat-templates/+/860824 :)

Rabi Mishra (rabi)
Changed in tripleo:
status: In Progress → Fix Released