nova-scheduler can fail to start if keystone setup takes too long

Bug #1931283 reported by Marian Gasparovic
This bug affects 1 person
Affects (Status / Importance / Assigned to):
Gnocchi Charm: New / Undecided / Unassigned
MySQL InnoDB Cluster Charm: Invalid / High / Unassigned
OpenStack Nova Cloud Controller Charm: Confirmed / High / Unassigned

Bug Description

Keystone fails with a traceback when accessing the database.

The original issue is in nova-cloud-controller: one of three units stays in a blocked state with "Services not running that should be: nova-scheduler".

The n-c-c log shows plenty of keystone 500 errors.

And keystone_error.log shows plenty of tracebacks:

oslo_db.exception.DBNonExistentTable: (sqlite3.OperationalError) no such table: project
[SQL: SELECT project.id AS project_id, project.name AS project_name, project.domain_id AS project_domain_id, project.description AS project_description, project.enabled AS project_enabled, project.extra AS project_extra, project.parent_id AS project_parent_id, project.is_domain AS project_is_domain

This is the first time we encountered it in our tests.

Artifacts from the test run
https://oil-jenkins.canonical.com/artifacts/842595bd-b750-405e-b7a7-2fee838ebd90/index.html

Revision history for this message
Liam Young (gnuoy) wrote :

I am guessing that this is a cloud that has significant resource contention, as the problem comes down to the nova-scheduler systemd service giving up waiting for keystone to be ready. This can be seen by looking at https://oil-jenkins.canonical.com/artifacts/842595bd-b750-405e-b7a7-2fee838ebd90/generated/generated/openstack/juju-crashdump-openstack-2021-06-08-03.31.59.tar.gz
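For reference, the restart/start-limit settings the scheduler was running with can be read back from systemd on the affected unit; a minimal read-only sketch (property names vary a little between systemd versions, hence the broad grep):

$ systemctl show nova-scheduler --no-pager | grep -Ei 'restart|startlimit'
# shows how many retries and over what window systemd allows before it
# declares "Failed to start OpenStack Compute Scheduler"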

The nova-cc/2 unit shows the nova-scheduler that failed to start, and its log reports that keystone is returning 500s:

nova-cloud-controller_2/var/log/nova/nova-scheduler.log
2021-06-08 00:20:18.011 109392 ERROR nova keystoneauth1.exceptions.http.InternalServerError: Internal Server Error (HTTP 500)
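A quick way to confirm keystone is still unhealthy at a given moment is to poll its version endpoint; a sketch, with <keystone-address> standing in for the actual keystone endpoint from the bundle:

$ curl -s -o /dev/null -w '%{http_code}\n' http://<keystone-address>:5000/v3/
# anything other than 200 (e.g. 500) matches what nova-scheduler was seeing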

The timestamp seems to tie up with keystone being unable to access MySQL due to an ACL issue:

keystone_2/var/log/apache2/keystone_error.log
2021-06-08 00:21:06.464096 sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1045, "Access denied for user 'keystone'@'192.168.33.92' (using password: YES)")

It seems the grant was made shortly after this:
mysql-innodb-cluster_0/var/log/juju/machine-0-lxd-7.log
2021-06-08 00:21:06 DEBUG juju-log db-router:153: Grant does NOT exist for host '192.168.33.92' on db 'keystone'
2021-06-08 00:21:06 DEBUG juju-log db-router:153: Grant exists for host '192.168.33.92' on db 'keystone'
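The grant can also be verified by hand on the mysql-innodb-cluster leader; a sketch, assuming the cluster's admin credentials are to hand:

$ mysql -u root -p -e "SHOW GRANTS FOR 'keystone'@'192.168.33.92'"
# if the user/host pair has not been created yet this errors out, which lines
# up with the 1045 access-denied failures keystone was logging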

The journal file on nova-cc/2 shows the scheduler gave up ~50s before keystone was granted db access:
$ journalctl --utc --file nova-cloud-controller_2/var/log/journal/50fc85977028476087ad98d996a6f007/system.journal | grep 'Failed to start OpenStack Compute Scheduler'
Jun 08 00:20:18 juju-695a4b-4-lxd-8 systemd[1]: Failed to start OpenStack Compute Scheduler.
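Once keystone and its DB grant are healthy, the stuck unit can in principle be recovered manually; a sketch of the commands on nova-cc/2:

$ sudo systemctl reset-failed nova-scheduler
$ sudo systemctl restart nova-scheduler
$ sudo systemctl status nova-scheduler --no-pager
# reset-failed clears the start-limit state so the restart is not refused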

(I notice that the error included in the bug description is a red herring: it shows that keystone was trying to access its local sqlite db. This is quite normal when standing up a new cloud, as the relation with MySQL has not yet formed and the mysql config is not in the service conf file, so the service falls back to its default of using a local sqlite db.)
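To confirm which database keystone is actually pointed at on a given unit, the rendered config can be checked directly; a sketch, assuming the usual /etc/keystone/keystone.conf location:

$ sudo grep -A2 '^\[database\]' /etc/keystone/keystone.conf
# a connection line still referencing sqlite means the MySQL relation data has
# not been rendered into the config yet, matching the traceback above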

summary: - no such table: project
+ nova-scheduler can fail to start if keystone setup takes too long
Changed in charm-mysql-innodb-cluster:
importance: Undecided → High
status: New → Confirmed
Changed in charm-nova-cloud-controller:
status: New → Confirmed
Changed in charm-mysql-innodb-cluster:
status: Confirmed → Invalid
Changed in charm-nova-cloud-controller:
importance: Undecided → High