nova and neutron service didn't recover after force unlocking the host
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Triaged | Medium | Jim Gauld | | |
Bug Description
Brief Description
-----------------
After force rebooting a host, the neutron and nova services stay in Init status and do not recover.
Severity
--------
Critical
Steps to Reproduce
------------------
1. While the host (e.g. compute-0) is unlocked and available, force reboot it with "sudo reboot -f".
2. Wait long enough for the host to come back, then run "kubectl get pod" to check the pod status.
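The check in step 2 can be sketched as a small filter. This is only an illustration, not the command the test framework ran: the function name unhealthy_pods is hypothetical, and it assumes the column layout of "kubectl get pod --all-namespaces" shown in this report, where STATUS is the fourth column.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pod --all-namespaces` output on stdin
# and print namespace, name and status of every pod that is neither Running
# nor Completed. Assumes the columns are NAMESPACE NAME READY STATUS ...
unhealthy_pods() {
    # NR > 1 skips the header line; $4 is the STATUS column.
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1, $2, $4 }'
}

# Usage on a controller with cluster access:
#   kubectl get pod --all-namespaces -o wide | unhealthy_pods
```

Empty output would correspond to the expected behavior; any line printed (for example a pod stuck in an Init state) corresponds to the failure reported here.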
Expected Behavior
------------------
All pods are running or completed
Actual Behavior
----------------
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
openstack libvirt-
openstack neutron-
openstack neutron-
openstack neutron-
openstack neutron-
openstack neutron-
openstack nova-compute-
Reproducibility
---------------
100% Reproducible
System Configuration
-------
2 + 2 system or two node system
Branch/Pull Time/Commit
-------
stx master as of: 20190720T013000Z
Last Pass
---------
20190720T013000Z
Timestamp/Logs
--------------
[2019-08-06 02:38:58,214] 165 INFO MainThread host_helper.
[2019-08-06 02:38:58,214] 301 DEBUG MainThread ssh.send :: Send 'sudo reboot -f'
[2019-08-06 02:38:58,328] 423 DEBUG MainThread ssh.expect :: Output:
Password:
[2019-08-06 02:38:58,329] 301 DEBUG MainThread ssh.send :: Send 'Li69nux*'
[2019-08-06 02:39:08,488] 423 DEBUG MainThread ssh.expect :: Output:
Rebooting.
packet_write_wait: Connection to 192.168.204.174 port 22: Broken pipe
controller-1:~$
[2019-08-06 02:39:38,507] 3619 INFO MainThread system_
[2019-08-06 02:39:38,508] 466 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-08-06 02:39:38,508] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-08-06 02:39:40,047] 423 DEBUG MainThread ssh.expect :: Output:
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | disabled | offline |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | controller-1 | controller | unlocked | enabled | available |
+----+-
[2019-08-06 02:49:45,734] 301 DEBUG MainThread ssh.send :: Send 'kubectl get pod --all-namespaces --field-
[2019-08-06 02:49:46,009] 423 DEBUG MainThread ssh.expect :: Output:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
openstack libvirt-
openstack neutron-
openstack neutron-
openstack neutron-
openstack neutron-
openstack neutron-
openstack nova-compute-
[2019-08-06 03:02:08,744] 301 DEBUG MainThread ssh.send :: Send 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-08-06 03:02:10,193] 423 DEBUG MainThread ssh.expect :: Output:
+------
| UUID | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
| 0192e25d-
| 4cb4a0ee-
| 9ac05c3b-
| a2e2ec3c-
| 2409cab2-
+------
controller-1:~$
[2019-08-06 03:02:10,194] 301 DEBUG MainThread ssh.send :: Send 'echo $?'
[2019-08-06 03:02:10,297] 423 DEBUG MainThread ssh.expect :: Output:
0
controller-1:~$
[2019-08-06 03:02:10,297] 1534 DEBUG MainThread ssh.get_
[2019-08-06 03:02:10,297] 466 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-08-06 03:02:10,297] 301 DEBUG MainThread ssh.send :: Send 'kubectl get pod --all-namespaces --field-
[2019-08-06 03:02:10,528] 423 DEBUG MainThread ssh.expect :: Output:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
openstack libvirt-
openstack neutron-
openstack neutron-
openstack neutron-
openstack neutron-
openstack neutron-
openstack nova-compute-
openstack nova-service-
controller-1:~$
[2019-08-06 03:02:10,528] 301 DEBUG MainThread ssh.send :: Send 'echo $?'
[2019-08-06 03:02:10,631] 423 DEBUG MainThread ssh.expect :: Output:
0
controller-1:~$
[2019-08-06 03:02:10,632] 1534 DEBUG MainThread ssh.get_
[2019-08-06 03:02:10,632] 466 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-08-06 03:02:10,632] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-08-06 03:02:12,153] 423 DEBUG MainThread ssh.expect :: Output:
+------
| application | version | manifest name | manifest file | status | progress |
+------
| platform-integ-apps | 1.0-7 | platform-
| stx-openstack | 1.0-17-
+------
controller-1:~$
[2019-08-06 03:02:12,153] 301 DEBUG MainThread ssh.send :: Send 'echo $?'
[2019-08-06 03:02:12,256] 423 DEBUG MainThread ssh.expect :: Output:
0
controller-1:~$
[2019-08-06 03:02:12,258] 266 DEBUG MainThread conftest.
+++++++
Test steps started for: testcases/
[2019-08-06 03:02:12,258] 1534 DEBUG MainThread ssh.get_
[2019-08-06 03:02:12,259] 466 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-08-06 03:02:12,259] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-08-06 03:02:13,793] 423 DEBUG MainThread ssh.expect :: Output:
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | enabled | degraded |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | controller-1 | controller | unlocked | enabled | available |
+----+-
Test Activity
-------------
MTC Regression Testing
tags: added: stx.retestneeded
tags: added: stx.distro.openstack removed: stx.containers
Found similar issue: https://bugs.launchpad.net/starlingx/+bug/1816842