Brief Description
-----------------
Rebooted both controllers, but OpenStack fails to come back up.
[root@controller-1 sysadmin(keystone_admin)]# openstack endpoint list
Failed to discover available identity versions when contacting http://keystone.openstack.svc.cluster.local/v3. Attempting to parse version from URL.
Service Unavailable (HTTP 503)
Node status:
[root@controller-1 sysadmin(keystone_admin)]# system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | degraded |
| 2 | controller-1 | controller | unlocked | enabled | available |
+----+--------------+-------------+----------------+-------------+--------------+
[root@controller-1 sysadmin(keystone_admin)]# fm alarm-list
+----------+---------------------------------------------------------------------+------------------------+----------+--------------+
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+---------------------------------------------------------------------+------------------------+----------+--------------+
| 400.001 | Service group cloud-services warning; dbmon(enabled-active, ) | service_domain= | minor | 2020-06-03T1 |
| | | controller. | | 3:02:13. |
| | | service_group=cloud- | | 385662 |
| | | services.host= | | |
| | | controller-1 | | |
| | | | | |
| 400.001 | Service group cloud-services warning; dbmon(enabled-standby, ) | service_domain= | minor | 2020-06-03T1 |
| | | controller. | | 3:01:13. |
| | | service_group=cloud- | | 112434 |
| | | services.host= | | |
| | | controller-0 | | |
| | | | | |
| 200.006 | controller-0 is degraded due to the failure of its 'pci-irq- | host=controller-0. | major | 2020-06-03T1 |
| | affinity-agent' process. Auto recovery of this major process is in | process=pci-irq- | | 2:46:13. |
| | progress. | affinity-agent | | 918380 |
| | | | | |
+----------+---------------------------------------------------------------------+------------------------+----------+--------------+
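The 503 can be confirmed independently of the python client plumbing by hitting the same discovery URL directly. A minimal sketch, assuming curl is available on the controller; classify_http and its verdict strings are hypothetical helpers written for this report, not part of any platform tooling:

```shell
# classify_http: hypothetical helper mapping an HTTP status code from the
# keystone discovery URL to a short verdict. Not a StarlingX/OpenStack command.
classify_http() {
    case $1 in
        2??) echo "keystone reachable" ;;
        503) echo "service unavailable - keystone or its database is down" ;;
        *)   echo "unexpected status: $1" ;;
    esac
}

# On a live controller (same URL the failing CLI reports):
#   code=$(curl -s -o /dev/null -w '%{http_code}' \
#       http://keystone.openstack.svc.cluster.local/v3)
#   classify_http "$code"
```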
[root@controller-1 sysadmin(keystone_admin)]# kubectl get pods -o wide -n openstack | grep -v Running | grep -v Completed
NAME                                                READY   STATUS             RESTARTS   AGE     IP               NODE           NOMINATED NODE   READINESS GATES
cinder-api-59979594ff-25hrl                         0/1     Init:0/2           0          26m     172.16.192.109   controller-0   <none>           <none>
cinder-backup-6dd95fc9dd-svp5r                      0/1     Init:0/4           0          26m     172.16.192.123   controller-0   <none>           <none>
cinder-scheduler-76c65f6979-5tmt5                   0/1     Init:0/2           0          26m     172.16.192.82    controller-0   <none>           <none>
cinder-volume-b7dfbb7b9-f47bk                       0/1     Init:0/4           0          26m     172.16.192.90    controller-0   <none>           <none>
cinder-volume-b7dfbb7b9-mtkz8                       0/1     Init:3/4           7          9h      172.16.166.185   controller-1   <none>           <none>
cinder-volume-usage-audit-1591187700-g64xf          0/1     Init:0/1           0          27m     172.16.166.157   controller-1   <none>           <none>
fm-rest-api-8b5b97bf8-qdlbx                         0/1     Init:0/1           0          26m     172.16.192.67    controller-0   <none>           <none>
fm-rest-api-8b5b97bf8-v5wz9                         0/1     CrashLoopBackOff   9          8h      172.16.166.140   controller-1   <none>           <none>
glance-api-6b74f659d-w9t4g                          0/1     Init:0/3           0          26m     172.16.192.101   controller-0   <none>           <none>
heat-api-846d848bd9-hd46z                           0/1     Init:0/1           0          26m     172.16.192.108   controller-0   <none>           <none>
heat-cfn-9d6f7ffc5-rvb4d                            0/1     Init:0/1           0          26m     172.16.192.73    controller-0   <none>           <none>
heat-engine-6487ff65c6-zk4n7                        0/1     Init:0/1           0          26m     172.16.192.80    controller-0   <none>           <none>
heat-engine-cleaner-1591187700-kd2pn                0/1     Init:0/1           0          27m     172.16.166.156   controller-1   <none>           <none>
horizon-65d4b5bdcf-ltms2                            0/1     Init:0/1           0          21m     172.16.192.83    controller-0   <none>           <none>
keystone-api-6c76774bf7-l7c4d                       0/1     Init:0/1           0          26m     172.16.192.113   controller-0   <none>           <none>
libvirt-libvirt-default-4mf4v                       0/1     Init:0/3           1          9h      192.168.204.2    controller-0   <none>           <none>
mariadb-server-0                                    0/1     CrashLoopBackOff   8          9h      172.16.166.158   controller-1   <none>           <none>
neutron-dhcp-agent-controller-0-937646f6-r5skk      0/1     Init:0/1           1          9h      192.168.204.2    controller-0   <none>           <none>
neutron-l3-agent-controller-0-937646f6-fk5hc        0/1     Init:0/1           1          9h      192.168.204.2    controller-0   <none>           <none>
neutron-metadata-agent-controller-0-937646f6-nzkxz  0/1     Init:0/2           1          9h      192.168.204.2    controller-0   <none>           <none>
neutron-ovs-agent-controller-0-937646f6-dbfc2       0/1     Init:0/3           1          9h      192.168.204.2    controller-0   <none>           <none>
neutron-server-7c9678cf58-dq52p                     0/1     Init:0/1           0          26m     172.16.192.74    controller-0   <none>           <none>
neutron-server-7c9678cf58-s85dg                     0/1     CrashLoopBackOff   8          9h      172.16.166.191   controller-1   <none>           <none>
neutron-sriov-agent-controller-0-937646f6-qxt9g     0/1     Init:0/2           1          9h      192.168.204.2    controller-0   <none>           <none>
nova-api-metadata-b9b4fdb9b-d2gr6                   0/1     CrashLoopBackOff   8          9h      172.16.166.177   controller-1   <none>           <none>
nova-api-metadata-b9b4fdb9b-kg859                   0/1     Init:0/2           0          26m     172.16.192.110   controller-0   <none>           <none>
nova-api-osapi-856679d49f-4ljnl                     0/1     Init:0/1           0          26m     172.16.192.117   controller-0   <none>           <none>
nova-compute-controller-0-937646f6-9lrqs            0/2     Init:0/6           1          9h      192.168.204.2    controller-0   <none>           <none>
nova-conductor-6cbc75dd89-nxvwc                     0/1     Init:0/1           0          26m     172.16.192.66    controller-0   <none>           <none>
nova-novncproxy-5bd676cfc4-82r8x                    0/1     Init:0/3           0          26m     172.16.192.72    controller-0   <none>           <none>
nova-scheduler-7fbf5cdd4-ckmkd                      0/1     CrashLoopBackOff   6          9h      172.16.166.172   controller-1   <none>           <none>
nova-scheduler-7fbf5cdd4-h65j5                      0/1     Init:0/1           0          26m     172.16.192.121   controller-0   <none>           <none>
nova-service-cleaner-1591189200-927h5               0/1     Init:0/1           0          6m56s   172.16.192.92    controller-0   <none>           <none>
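The failures are spread across both nodes; a quick way to see the spread is to tally STATUS per NODE from the 'kubectl get pods -o wide' listing. summarize_stuck_pods is an ad-hoc helper written for this report, not a platform command:

```shell
# summarize_stuck_pods: ad-hoc helper (not a platform command) that tallies
# pods per NODE/STATUS from 'kubectl get pods -o wide' output on stdin.
# In the default wide layout, field 3 is STATUS and field 7 is NODE; the
# header line (NR == 1) is skipped.
summarize_stuck_pods() {
    awk 'NR > 1 { count[$7 "/" $3]++ } END { for (k in count) print k, count[k] }' | sort
}

# Live usage, matching the command above:
#   kubectl get pods -o wide -n openstack \
#       | grep -v Running | grep -v Completed | summarize_stuck_pods
```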
Severity
--------
Critical: OpenStack is unusable
Steps to Reproduce
------------------
1. Reboot both controllers with reboot -f
2. Wait for them to come back up
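Step 2 is the open-ended part; in practice the wait can be scripted. A minimal sketch using a hypothetical wait_for helper (the 60 x 30 s budget is an arbitrary assumption, not a documented recovery time):

```shell
# wait_for: hypothetical polling helper - retries a command until it succeeds
# or the attempt budget runs out. Arguments: <attempts> <sleep_secs> <command...>
wait_for() {
    tries=$1; delay=$2; shift 2
    n=0
    while [ "$n" -lt "$tries" ]; do
        "$@" && return 0
        n=$((n + 1))
        sleep "$delay"
    done
    return 1
}

# After 'reboot -f' on both controllers, e.g.:
#   wait_for 60 30 openstack endpoint list
```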
Expected Behavior
------------------
'openstack endpoint list' should work
Actual Behavior
----------------
'openstack endpoint list' fails with "Service Unavailable (HTTP 503)", and many OpenStack pods remain stuck after the reboot, including mariadb-server-0 (CrashLoopBackOff) and keystone-api (stuck in Init).
Reproducibility
---------------
100% reproducible
System Configuration
--------------------
AIO-DX ipv4
Branch/Pull Time/Commit
-----------------------
master
Last Pass
---------
Did this test scenario pass previously? If so, please indicate the load/pull time info of the last pass.
Use this section to also indicate if this is a new test scenario.
Test Activity
-------------
Developer Testing
Workaround
----------
None