tripleo component standalone-upgrade ussuri DBConnectionError No route to host

Bug #1906822 reported by Marios Andreou
This bug affects 1 person
Affects   Status         Importance   Assigned to      Milestone
tripleo   Fix Released   High         Marios Andreou

Bug Description

At [1][2] the periodic-tripleo-ci-centos-8-standalone-upgrade-tripleo-ussuri job is failing during the standalone deployment. The error comes from the glance-api service container, which cannot reach the database [3]:

  2020-12-03 10:03:47.568 16 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 1 attempts left.: oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.24.3' ([Errno 113] No route to host)")
  2020-12-03 10:03:57.582 16 CRITICAL glance [-] Unhandled error: oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.24.3' ([Errno 113] No route to host)")
  (Background on this error at: http://sqlalche.me/e/e3q8)
  2020-12-03 10:03:57.582 16 ERROR glance Traceback (most recent call last):
  2020-12-03 10:03:57.582 16 ERROR glance File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 920, in connect
  2020-12-03 10:03:57.582 16 ERROR glance **kwargs)
  2020-12-03 10:03:57.582 16 ERROR glance File "/usr/lib64/python3.6/socket.py", line 724, in create_connection
  2020-12-03 10:03:57.582 16 ERROR glance raise err
  2020-12-03 10:03:57.582 16 ERROR glance File "/usr/lib64/python3.6/socket.py", line 713, in create_connection
  2020-12-03 10:03:57.582 16 ERROR glance sock.connect(sa)
  2020-12-03 10:03:57.582 16 ERROR glance OSError: [Errno 113] No route to host
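
For anyone wanting to check reachability by hand: Errno 113 is EHOSTUNREACH on Linux, so the TCP connect itself fails before any MySQL handshake takes place. Below is a minimal sketch of a probe that could be run from inside the affected container or from the host. The helper name probe_tcp is hypothetical, and port 3306 is an assumption (the MySQL default); the logs above show only the address, not the port.

```python
import errno
import socket


def probe_tcp(host, port=3306, timeout=3.0):
    """Attempt a plain TCP connection and return (ok, description).

    port=3306 is an assumption (MySQL default); the bug logs only
    show the address 192.168.24.3.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # exc.errno can be None (e.g. on a timeout), so fall back to the
        # exception class name in that case.
        if exc.errno is not None:
            name = errno.errorcode.get(exc.errno, "UNKNOWN")
        else:
            name = type(exc).__name__
        return False, f"{name}: {exc}"


if __name__ == "__main__":
    # Address taken from the log excerpt above.
    print(probe_tcp("192.168.24.3"))
```

A result starting with EHOSTUNREACH would match the failure in the glance log, whereas ECONNREFUSED would mean the address is reachable but nothing is listening on that port.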

This is a new job, being added with [4]. This issue doesn't seem to affect master or victoria as can be seen in the test review at [5].

The integration pipeline standalone-upgrade-ussuri job also seems to be unaffected, with green runs at [6].

[1] https://logserver.rdoproject.org/01/31201/1/check/periodic-tripleo-ci-centos-8-standalone-upgrade-tripleo-ussuri/f68f65b/job-output.txt
[2] https://logserver.rdoproject.org/01/31201/1/check/periodic-tripleo-ci-centos-8-standalone-upgrade-tripleo-ussuri/a1ce402/job-output.txt
[3] https://logserver.rdoproject.org/01/31201/1/check/periodic-tripleo-ci-centos-8-standalone-upgrade-tripleo-ussuri/a1ce402/logs/undercloud/var/log/containers/glance/api.log.txt.gz
[4] https://review.rdoproject.org/r/#/c/31200/
[5] https://review.rdoproject.org/r/#/c/31201/
[6] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-standalone-upgrade-ussuri

summary: - tripleo component standalone-upgrade ussuri glance DBConnectionError No route to host
         + tripleo component standalone-upgrade ussuri DBConnectionError No route to host
Marios Andreou (marios-b) wrote :

Still happening, see [1].

It doesn't appear to be glance-specific, so I removed that from the title.

In logs from yesterday [2] I see the same issue, but in the keystone log:

  2020-12-13 20:00:56.150 27 DEBUG migrate.versioning.repository [-] Repository /usr/lib/python3.6/site-packages/keystone/common/sql/migrate_repo loaded successfully __init__ /usr/lib/python3.6/site-packages/migrate/versioning/repository.py:82
  2020-12-13 20:00:56.150 27 DEBUG migrate.versioning.repository [-] Config: OrderedDict([('db_settings', OrderedDict([('repository_id', 'keystone'), ('version_table', 'migrate_version'), ('required_dbs', '[]'), ('use_timestamp_numbering', 'False')]))]) __init__ /usr/lib/python3.6/site-packages/migrate/versioning/repository.py:83
  2020-12-13 20:00:59.295 27 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -1 attempts left.: oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.24.3' ([Errno 113] No route to host)")
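
To confirm that the glance and keystone failures point at the same database address, the relevant fields can be pulled out of the oslo_db WARNING lines. This is a rough sketch: the regex and the helper name parse_db_failure are my own, written against the two lines quoted in this report, and may not cover other oslo_db message formats.

```python
import re

# Written against the two WARNING lines quoted in this bug; other
# oslo_db message formats may need a different pattern.
DB_FAILURE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \d+ "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) .*?"
    r"Can't connect to MySQL server on '(?P<host>[^']+)' "
    r"\(\[Errno (?P<errno>\d+)\] (?P<reason>[^)]+)\)"
)

# Lines copied from the glance and keystone logs in this report.
GLANCE_LINE = (
    '2020-12-03 10:03:47.568 16 WARNING oslo_db.sqlalchemy.engines [-] '
    'SQL connection failed. 1 attempts left.: '
    'oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) '
    '(2003, "Can\'t connect to MySQL server on \'192.168.24.3\' '
    '([Errno 113] No route to host)")'
)
KEYSTONE_LINE = (
    '2020-12-13 20:00:59.295 27 WARNING oslo_db.sqlalchemy.engines [-] '
    'SQL connection failed. -1 attempts left.: '
    'oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) '
    '(2003, "Can\'t connect to MySQL server on \'192.168.24.3\' '
    '([Errno 113] No route to host)")'
)


def parse_db_failure(line):
    """Return a dict of fields for a DB connection failure line, else None."""
    match = DB_FAILURE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["errno"] = int(fields["errno"])
    return fields


if __name__ == "__main__":
    for line in (GLANCE_LINE, KEYSTONE_LINE):
        info = parse_db_failure(line)
        print(info["ts"], info["host"], info["errno"], info["reason"])
```

Both lines yield the same host and errno, which is consistent with the failure not being glance-specific.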

I don't know yet whether this is related to the bug at [3]. The error is happening in a different step, though, so I'm really not sure that it is (i.e. that this is a pacemaker-version-related issue).

[1] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-standalone-upgrade-tripleo-ussuri
[2] https://logserver.rdoproject.org/openstack-component-tripleo/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-upgrade-tripleo-ussuri/0e7721a/logs/undercloud/var/log/containers/keystone/keystone.log.txt.gz
[3] https://bugs.launchpad.net/tripleo/+bug/1907769

Marios Andreou (marios-b) wrote :

Hmm, this might be because I forgot to add the explicit HA environment files to the ussuri job definition in [1].

I am trying to do that in [2]; let's see if it resolves the issue.

[1] https://github.com/rdo-infra/rdo-jobs/blob/2c65bcec6a8fa3c2e2b353dc1bd5c276b813ff87/zuul.d/component-jobs.yaml#L2716
[2] https://review.rdoproject.org/r/31370

Marios Andreou (marios-b) wrote :

Yes, confirmed: with [1] applied, my test run at [2] no longer hits this issue... it is now failing on a different bug :D [3] \o/ Logs at [4] if interested.

I will close this one out once we merge [1].

[1] https://review.rdoproject.org/r/#/c/31370/
[2] https://review.rdoproject.org/r/#/c/31201/
[3] https://bugs.launchpad.net/tripleo/+bug/1907769
[4] https://logserver.rdoproject.org/01/31201/2/check/periodic-tripleo-ci-centos-8-standalone-upgrade-tripleo-ussuri/9031b8a/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

Changed in tripleo:
status: Triaged → Fix Released
Changed in tripleo:
assignee: nobody → Marios Andreou (marios-b)