openstack terraform fails due to "temporary overloading or maintenance"; Keystone fails to connect to MySQL

Bug #2045298 reported by Alexander Balderson
Affects: OpenStack Snap
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Testing the openstack snap from rev 332 with 3 dedicated control instances and 3 instances running as compute/storage:
2023-11-30-00:57:34 root DEBUG ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
2023-11-30-00:57:34 root DEBUG ┃ Node ┃ Status ┃ Control ┃ Compute ┃ Storage ┃
2023-11-30-00:57:34 root DEBUG ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
2023-11-30-00:57:34 root DEBUG │ solqa-lab1-server-31.nosilo.lab1.sol… │ up │ x │ │ │
2023-11-30-00:57:34 root DEBUG │ solqa-lab1-server-32.nosilo.lab1.sol… │ up │ x │ │ │
2023-11-30-00:57:34 root DEBUG │ solqa-lab1-server-33.nosilo.lab1.sol… │ up │ x │ │ │
2023-11-30-00:57:34 root DEBUG │ solqa-lab1-server-34.nosilo.lab1.sol… │ up │ │ x │ x │
2023-11-30-00:57:34 root DEBUG │ solqa-lab1-server-35.nosilo.lab1.sol… │ up │ │ x │ x │
2023-11-30-00:57:34 root DEBUG │ solqa-lab1-server-36.nosilo.lab1.sol… │ up │ │ x │ x │
2023-11-30-00:57:34 root DEBUG └───────────────────────────────────────┴────────┴─────────┴─────────┴─────────┘

The deployment comes up and is happily idle before the terraform to configure flavors and images is run. While running the terraform to create the images, the apply errored out and reported:

Error: Error creating openstack_compute_flavor_v2 m1.small: The service is currently unable to handle the request due to a temporary overloading or maintenance. This is a temporary condition. Try again later.

  with openstack_compute_flavor_v2.m1_small,
  on main.tf line 22, in resource "openstack_compute_flavor_v2" "m1_small":
  22: resource "openstack_compute_flavor_v2" "m1_small" {

and similar errors for different images and networks.
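
For what it's worth, the failing call is easy to reproduce outside Terraform with openstacksdk, which makes it quick to check whether the 503 is genuinely transient. This is only a sketch; the cloud name and flavor sizes below are assumptions, not values taken from the failed run:

import openstack

# Assumes a clouds.yaml entry for the deployed cloud; "sunbeam" is a placeholder name.
conn = openstack.connect(cloud="sunbeam")

# Same flavor name as in the Terraform error; the ram/vcpus/disk values are guesses,
# since the real ones live in main.tf which is not part of this report.
flavor = conn.compute.create_flavor(name="m1.small", ram=2048, vcpus=1, disk=20)
print("created flavor", flavor.name, flavor.id)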

Looking at the openstack logs, around the time this happened the primary keystone unit reports that it got connection refused to the mysql-router and mysql. The whole traceback is too long to post, but the key lines at the end are:

2023-11-30T01:10:07.844Z [wsgi-keystone] 2023-11-30 01:10:07.844770 oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on 'keystone-mysql-router.openstack.svc.cluster.local' ([Errno 111] Connection refused)")
2023-11-30T01:10:07.844Z [wsgi-keystone] 2023-11-30 01:10:07.844772 (Background on this error at: https://sqlalche.me/e/14/e3q8)
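
A quick way to check whether the router is actually refusing connections (rather than Keystone hitting an auth or schema problem) would be to probe the same endpoint directly with pymysql from inside the cluster. The hostname comes from the traceback above; the port and credentials here are assumptions:

import pymysql

try:
    conn = pymysql.connect(
        host="keystone-mysql-router.openstack.svc.cluster.local",
        port=3306,               # assumed; the router's listening port is not in the logs
        user="keystone",         # assumed
        password="<password>",   # assumed
        database="keystone",     # assumed
        connect_timeout=5,
    )
    print("connected:", conn.get_server_info())
    conn.close()
except pymysql.err.OperationalError as exc:
    # Errno 111 here would reproduce the "Connection refused" that Keystone saw.
    print("connection failed:", exc)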

The MySQL router for Keystone and MySQL for Keystone both look to be healthy, and I was unable to find many more errors in the logs that looked relevant.

I wonder if the right approach is to retry the configuration, but understanding why Keystone lost its connection seems relevant.
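
If retrying turns out to be the right stopgap, a minimal sketch of that idea is below: re-run `terraform apply` a few times and only fail if the error persists. Nothing here comes from the snap or the test harness; the working directory and retry counts are placeholders:

import subprocess
import time

def apply_with_retries(workdir: str, attempts: int = 3, delay: int = 30) -> None:
    """Re-run `terraform apply` to ride out transient 503s from the API."""
    for attempt in range(1, attempts + 1):
        result = subprocess.run(["terraform", "apply", "-auto-approve"], cwd=workdir)
        if result.returncode == 0:
            return
        print(f"apply attempt {attempt} failed; retrying in {delay}s")
        time.sleep(delay)
    raise RuntimeError("terraform apply still failing after retries")

apply_with_retries(".")  # placeholder working directory containing main.tf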

The testrun with this error can be found at:
https://solutions.qa.canonical.com/testruns/86582fa7-4caf-4980-a275-dbcfc6633267/

and the logs can be found at:
https://oil-jenkins.canonical.com/artifacts/86582fa7-4caf-4980-a275-dbcfc6633267/index.html
