OpenStack pods were not recovered after force reboot active controller
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | High | yong hu |
Bug Description
Brief Description
-----------------
Many OpenStack pods fail to recover, or are slow to recover, after a forced reboot of the active controller.
Severity
--------
Major
Steps to Reproduce
------------------
- Install and configure the system, then apply the stx-openstack application
- Run 'sudo reboot -f' on the active controller
Expected Behavior
------------------
- The system swacts to the standby controller, and all OpenStack pods recover to the Running or Completed state.
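The expected behavior could be checked with a polling loop along the following lines. This is only a sketch: the function names `count_unrecovered` and `wait_for_recovery`, the `openstack` namespace default, and the timeout and interval values are assumptions made for illustration and are not part of the original report or the test framework.

```shell
#!/bin/sh
# Sketch only: function names, namespace, and timing defaults are assumptions.

# Count pods whose STATUS is neither Running nor Completed.
# Expects `kubectl get pods -n <ns> --no-headers` output on stdin,
# where STATUS is the third column (NAME READY STATUS RESTARTS AGE).
count_unrecovered() {
    awk 'NF >= 3 && $3 != "Running" && $3 != "Completed" { n++ }
         END { print n + 0 }'
}

# Poll until no unrecovered pods remain or the timeout (seconds) expires.
# Returns 0 on recovery, 1 on timeout.
wait_for_recovery() {
    ns="${1:-openstack}"
    timeout="${2:-1800}"
    interval="${3:-30}"
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # The API server may be unreachable during the swact; treat a failed
        # query as "not recovered yet" instead of as an empty pod list.
        if out=$(kubectl get pods -n "$ns" --no-headers 2>/dev/null); then
            pending=$(printf '%s\n' "$out" | count_unrecovered)
            if [ "$pending" -eq 0 ]; then
                echo "all pods in $ns recovered after ${elapsed}s"
                return 0
            fi
            echo "${elapsed}s: $pending pod(s) in $ns not yet recovered"
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "timed out after ${timeout}s" >&2
    return 1
}
```

After the forced reboot, this could be run from the newly active controller as `wait_for_recovery openstack 1800 30`. Note that a namespace with no pods at all is reported as recovered, so the check is only meaningful once stx-openstack has been applied.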
Actual Behavior
----------------
- After the active controller was force rebooted, a number of OpenStack pods were stuck in the Init state. The keystone API and cinder-volume pods crashed.
controller-0:~$ kubectl get pods --all-namespaces | grep -v -e Completed -e Running
NAMESPACE NAME READY STATUS RESTARTS AGE
openstack cinder-
openstack cinder-
openstack fm-rest-
openstack glance-
openstack heat-api-
openstack heat-cfn-
openstack heat-engine-
openstack heat-engine-
openstack horizon-
openstack keystone-
openstack keystone-
openstack neutron-
openstack nova-api-
openstack nova-api-
openstack nova-conductor-
openstack nova-novncproxy
openstack nova-scheduler-
openstack nova-service-
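When triaging a listing like the one above, it can help to group the unrecovered pods by status. The following is a minimal sketch, assuming the default column layout of `kubectl get pods --all-namespaces` (NAMESPACE NAME READY STATUS RESTARTS AGE); the function name `summarize_stuck` is hypothetical.

```shell
#!/bin/sh
# Sketch only: the function name is an assumption made for illustration.

# Print "<count> <status>" for every status other than Running or Completed.
# Expects `kubectl get pods --all-namespaces` output on stdin, including the
# header row, with STATUS in the fourth column.
summarize_stuck() {
    awk 'NR > 1 && NF >= 4 && $4 != "Running" && $4 != "Completed" { c[$4]++ }
         END { for (s in c) print c[s], s }' | sort -k2
}
```

A possible invocation is `kubectl get pods --all-namespaces | summarize_stuck`, which prints one line per status so that pods stuck in init can be told apart from pods that are crashing.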
Reproducibility
---------------
Intermittent (2 out of 3)
System Configuration
-------
Multi-node system
Branch/Pull Time/Commit
-------
r/stx.3.0 as of 2019-12-05 02:30:00
Timestamp/Logs
--------------
[2019-12-06 15:21:50,338] 181 INFO MainThread host_helper.
[2019-12-06 15:21:50,338] 311 DEBUG MainThread ssh.send :: Send 'sudo reboot -f'
description: updated
tags: added: stx.3.0
Changed in starlingx:
importance: Undecided → Medium
Changed in starlingx:
assignee: zhipeng liu (zhipengs) → yong hu (yhu6)
tags: removed: stx.cherrypickneeded
Assigning to the distro.openstack PL for triage and release recommendation -- keystone & cinder pods are not recovering.