stx-openstack fails to come back up after controllers reboot
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
StarlingX | Fix Released | High | zhipeng liu | |
Bug Description
Brief Description
-----------------
After rebooting both controllers, OpenStack fails to come back up.
[root@controller-1 sysadmin(
Failed to discover available identity versions when contacting http://
Service Unavailable (HTTP 503)
Node status:
[root@controller-1 sysadmin(
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | degraded |
| 2 | controller-1 | controller | unlocked | enabled | available |
+----+-
[root@controller-1 sysadmin(
+------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
| 400.001 | Service group cloud-services warning; dbmon(enabled-
| | | controller. | | 3:02:13. |
| | | service_
| | | services.host= | | |
| | | controller-1 | | |
| | | | | |
| 400.001 | Service group cloud-services warning; dbmon(enabled-
| | | controller. | | 3:01:13. |
| | | service_
| | | services.host= | | |
| | | controller-0 | | |
| | | | | |
| 200.006 | controller-0 is degraded due to the failure of its 'pci-irq- | host=controller-0. | major | 2020-06-03T1 |
| | affinity-agent' process. Auto recovery of this major process is in | process=pci-irq- | | 2:46:13. |
| | progress. | affinity-agent | | 918380 |
| | | | | |
+------
[root@controller-1 sysadmin(
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cinder-
cinder-
cinder-
cinder-
cinder-
cinder-
fm-rest-
fm-rest-
glance-
heat-api-
heat-cfn-
heat-engine-
heat-engine-
horizon-
keystone-
libvirt-
mariadb-server-0 0/1 CrashLoopBackOff 8 9h 172.16.166.158 controller-1 <none> <none>
neutron-
neutron-
neutron-
neutron-
neutron-
neutron-
neutron-
nova-api-
nova-api-
nova-api-
nova-compute-
nova-conductor-
nova-novncproxy
nova-scheduler-
nova-scheduler-
nova-service-
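The only pod row that survives intact in the listing above is mariadb-server-0, which is 0/1 ready and in CrashLoopBackOff. As a minimal sketch (not part of the original report), a pod listing in this column layout could be scanned for pods that are not fully ready. The function name `unhealthy_pods` is hypothetical, and rows truncated to a bare name are skipped rather than guessed at.

```python
def unhealthy_pods(listing: str) -> list:
    """Return (name, ready, status) for pods that are not fully ready.

    Assumes whitespace-separated columns in the order NAME READY STATUS ...,
    as in the listing above. Hypothetical helper, for illustration only.
    """
    bad = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and rows truncated to a bare name.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        name, ready, status = fields[0], fields[1], fields[2]
        if "/" not in ready:
            continue
        ready_count, _, total = ready.partition("/")
        # Finished job pods are expected to show 0/1 Completed.
        if status == "Completed":
            continue
        if status != "Running" or ready_count != total:
            bad.append((name, ready, status))
    return bad


# Header and the one complete row from the report.
SAMPLE = """\
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mariadb-server-0 0/1 CrashLoopBackOff 8 9h 172.16.166.158 controller-1 <none> <none>
"""

print(unhealthy_pods(SAMPLE))
```

Against the surviving row this reports mariadb-server-0 as the unhealthy pod.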
Severity
--------
Critical: OpenStack is unusable
Steps to Reproduce
------------------
1. Reboot both controllers with reboot -f
2. Wait for them to come back up
Expected Behavior
------------------
'openstack endpoint list' should work
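The "wait, then check" part of the reproduction could be scripted roughly as follows. This is a sketch under assumptions, not the procedure used in the report: `wait_until` and `endpoint_list_works` are hypothetical helpers, the timeout and interval values are arbitrary, and it assumes the `openstack` client is on the PATH with admin credentials already loaded in the environment.

```python
import subprocess
import time


def wait_until(check, timeout_s, interval_s, sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def endpoint_list_works():
    """Return True if 'openstack endpoint list' exits with status 0."""
    try:
        result = subprocess.run(
            ["openstack", "endpoint", "list"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # The openstack client is not installed on this machine.
        return False
    return result.returncode == 0


def main():
    # Arbitrary example values: poll every 30 s for up to 30 minutes.
    ok = wait_until(endpoint_list_works, timeout_s=1800, interval_s=30)
    print("endpoint list works" if ok else "still failing after timeout")
```

Calling `main()` on the active controller after the reboot would, in the failure described here, be expected to time out, since the command keeps returning HTTP 503.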
Actual Behavior
----------------
[root@controller-1 sysadmin(
Failed to discover available identity versions when contacting http://
Service Unavailable (HTTP 503)
Reproducibility
---------------
100% reproducible
System Configuration
-------
AIO-DX ipv4
Branch/Pull Time/Commit
-------
master
Test Activity
-------------
Developer Testing
Workaround
----------
None
description: updated
tags: added: stx.distro.openstack
Changed in starlingx:
assignee: yong hu (yhu6) → zhipeng liu (zhipengs)
Changed in starlingx:
status: Triaged → Confirmed
tags: added: stx.retestneeded
Changed in starlingx:
status: Confirmed → Fix Released
tags: removed: stx.retestneeded
The issue appears to be caused by MariaDB failing to recover after the reboot (mariadb-server-0 is stuck in CrashLoopBackOff). With the database down, none of the OpenStack services respond.
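To dig into why the MariaDB pod does not recover, one would typically look at the pod's events and at the logs of the crashed container. The sketch below only builds the command lines and does not run them. The pod name comes from the listing in this report, but the `openstack` namespace is an assumption, and `mariadb_diagnostic_commands` is a hypothetical helper.

```python
import shlex


def mariadb_diagnostic_commands(pod="mariadb-server-0", namespace="openstack"):
    """Build kubectl command lines for inspecting a crash-looping pod.

    The namespace default is an assumption; adjust it to the deployment.
    """
    return [
        # Pod events, restart count, and last termination state.
        ["kubectl", "describe", "pod", pod, "-n", namespace],
        # Logs from the previous (crashed) container instance.
        ["kubectl", "logs", pod, "-n", namespace, "--previous", "--tail=100"],
        # Logs from the current container instance.
        ["kubectl", "logs", pod, "-n", namespace, "--tail=100"],
    ]


for argv in mariadb_diagnostic_commands():
    print(shlex.join(argv))
```

Keeping the commands as argument lists makes them easy to pass to `subprocess.run` later without shell quoting issues.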