neutron-l3-agent and neutron-dhcp-agent never recovered after force reboot
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Won't Fix | High | Joseph Richard | | |
Bug Description
Brief Description
-----------------
neutron-l3-agent and neutron-dhcp-agent went down and never recovered after compute-3 was force rebooted. The output below shows the agents up before the reboot and down after compute-3 was rebooted.
Send 'openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-07-07 04:53:48,062] 423 DEBUG MainThread ssh.expect :: Output:
+------
| ID | Agent Type | Host | Availability Zone | Alive | State | Binary |
+------
| 02266b80-
| 0398ee2d-
| 0488ea35-
| 084b585c-
| 0dc1dd0e-
| 167b07dc-
| 1b8f8bb7-
| 1d09fdbd-
| 2ac2053c-
| 2b3e4e26-
| 45a5a758-
| 4684d7f7-
| 46ad03ef-
| 4c4413b2-
| 5593bdb5-
| 58290c99-
| 603e727d-
| 639282a0-
| 6e13479a-
| 79ac04e7-
| 844028c3-
| 864fbc99-
| a647d5ea-
| ad20d257-
| caad4860-
+------
Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-07-07 04:54:57,648] 423 DEBUG MainThread ssh.expect :: Output:
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | compute-2 | worker | unlocked | enabled | available |
| 5 | compute-3 | worker | unlocked | enabled | available |
| 6 | compute-4 | worker | unlocked | enabled | available |
| 7 | controller-1 | controller | unlocked | enabled | available |
| 8 | storage-0 | storage | unlocked | enabled | available |
| 9 | storage-1 | storage | unlocked | enabled | available |
+----+-
controller-1:~$
[2019-07-07 04:54:57,649] 301 DEBUG MainThread ssh.send :: Send 'echo $?'
[2019-07-07 04:54:57,751] 423 DEBUG MainThread ssh.expect :: Output:
0
controller-1:~$
[2019-07-07 04:54:57,752] 282 DEBUG MainThread system_
[2019-07-07 04:54:57,752] 1534 DEBUG MainThread ssh.get_
[2019-07-07 04:54:57,752] 466 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-07-07 04:54:57,753] 301 DEBUG MainThread ssh.send :: Send 'whoami'
[2019-07-07 04:54:57,857] 423 DEBUG MainThread ssh.expect :: Output:
sysadmin
controller-1:~$
[2019-07-07 04:54:57,857] 301 DEBUG MainThread ssh.send :: Send ''
[2019-07-07 04:54:57,959] 423 DEBUG MainThread ssh.expect :: Output:
controller-1:~$
[2019-07-07 04:54:57,960] 1185 INFO MainThread ssh.connect :: Attempt to connect to compute-3 from 128.224.150.45...
[2019-07-07 04:54:57,960] 301 DEBUG MainThread ssh.send :: Send '/usr/bin/ssh -o RSAAuthenticati
[2019-07-07 04:54:58,131] 423 DEBUG MainThread ssh.expect :: Output:
Warning: Permanently added 'compute-
sysadmin@
[2019-07-07 04:54:58,131] 301 DEBUG MainThread ssh.send :: Send 'Li69nux*'
[2019-07-07 04:54:58,398] 423 DEBUG MainThread ssh.expect :: Output:
Last login: Sat Jul 6 23:06:06 2019 from controller-0
/etc/motd.
compute-3:~$
[2019-07-07 04:54:58,712] 165 INFO MainThread host_helper.
[2019-07-07 04:54:58,713] 301 DEBUG MainThread ssh.send :: Send 'sudo reboot -f'
[2019-07-07 04:54:58,826] 423 DEBUG MainThread ssh.expect :: Output:
Password:
Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-07-07 04:55:40,600] 423 DEBUG MainThread ssh.expect :: Output:
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | compute-2 | worker | unlocked | enabled | available |
| 5 | compute-3 | worker | unlocked | disabled | offline |
| 6 | compute-4 | worker | unlocked | enabled | available |
| 7 | controller-1 | controller | unlocked | enabled | available |
| 8 | storage-0 | storage | unlocked | enabled | available |
| 9 | storage-1 | storage | unlocked | enabled | available |
+----+-
Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-07-07 05:00:05,427] 423 DEBUG MainThread ssh.expect :: Output:
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | compute-2 | worker | unlocked | enabled | available |
| 5 | compute-3 | worker | unlocked | enabled | available |
| 6 | compute-4 | worker | unlocked | enabled | available |
| 7 | controller-1 | controller | unlocked | enabled | available |
| 8 | storage-0 | storage | unlocked | enabled | available |
| 9 | storage-1 | storage | unlocked | enabled | available |
+----+-
Send 'openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-07-07 05:00:43,303] 423 DEBUG MainThread ssh.expect :: Output:
+----+-
| ID | Hypervisor Hostname | Hypervisor Type | Host IP | State |
+----+-
| 4 | compute-1 | QEMU | 192.168.223.77 | up |
| 7 | compute-2 | QEMU | 192.168.223.176 | up |
| 10 | compute-0 | QEMU | 192.168.223.171 | up |
| 13 | compute-3 | QEMU | 192.168.223.237 | up |
| 16 | compute-4 | QEMU | 192.168.223.64 | up |
+----+-
Send 'openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-07-07 05:03:19,248] 423 DEBUG MainThread ssh.expect :: Output:
+------
| ID | Agent Type | Host | Availability Zone | Alive | State | Binary |
+------
| 02266b80-
| 0398ee2d-
| 0488ea35-
| 084b585c-
| 0dc1dd0e-
| 167b07dc-
| 1b8f8bb7-
| 1d09fdbd-
| 2ac2053c-
| 2b3e4e26-
| 45a5a758-
| 4684d7f7-
| 46ad03ef-
| 4c4413b2-
| 5593bdb5-
| 58290c99-
| 603e727d-
| 639282a0-
| 6e13479a-
| 79ac04e7-
| 844028c3-
| 864fbc99-
| a647d5ea-
| ad20d257-
| caad4860-
+------
controller-1:~$
[2019-07-07 05:03:19,248] 301 DEBUG MainThread ssh.send :: Send 'echo $?'
Severity
--------
Major
Steps to Reproduce
------------------
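1. Verify system health: check the alarms, confirm all hosts are in the available state, and check the network agent list.
2. Force reboot a compute host (here 'sudo reboot -f' on compute-3) and wait for the host to become available again.
3. Check the network agent list again. As the description says, neutron-l3-agent and neutron-dhcp-agent never came up.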
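The check in steps 1 and 3 can be sketched as a small script. This is only an illustration, not part of the test framework that produced the logs above: the function name dead_agents and the SAMPLE data are hypothetical, and it assumes the agent list was captured as JSON (for example with 'openstack network agent list -f json') using the same column names as the table in the description. Depending on the client version, the Alive column may be a boolean or a rendered string such as ':-)' or 'XXX', so the sketch accepts both.

```python
import json

# Hypothetical sample data, shaped like the agent list columns in the
# description (ID, Agent Type, Host, Availability Zone, Alive, State, Binary).
# The IDs and values are made up for illustration only.
SAMPLE = json.dumps([
    {"ID": "a1", "Agent Type": "L3 agent", "Host": "compute-3",
     "Availability Zone": "nova", "Alive": False, "State": "UP",
     "Binary": "neutron-l3-agent"},
    {"ID": "a2", "Agent Type": "DHCP agent", "Host": "compute-3",
     "Availability Zone": "nova", "Alive": "XXX", "State": "UP",
     "Binary": "neutron-dhcp-agent"},
    {"ID": "a3", "Agent Type": "Metadata agent", "Host": "compute-3",
     "Availability Zone": None, "Alive": True, "State": "UP",
     "Binary": "neutron-metadata-agent"},
    {"ID": "a4", "Agent Type": "L3 agent", "Host": "compute-0",
     "Availability Zone": "nova", "Alive": ":-)", "State": "UP",
     "Binary": "neutron-l3-agent"},
])


def dead_agents(agent_list_json, host):
    """Return the sorted binaries on `host` whose agent is not reported alive."""
    dead = []
    for agent in json.loads(agent_list_json):
        if agent.get("Host") != host:
            continue
        alive = agent.get("Alive")
        # Assumption: Alive is either a JSON boolean or the string ":-)"
        # for a live agent; anything else is treated as not alive.
        if not (alive is True or alive == ":-)"):
            dead.append(agent.get("Binary"))
    return sorted(dead)


if __name__ == "__main__":
    print(dead_agents(SAMPLE, "compute-3"))
```

Running the same check before the reboot and again after the host reports available would make the failure explicit: an empty list before, and the two neutron binaries after.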
Expected Behavior
------------------
After the force reboot, the network agent list shows all binaries up.
Actual Behavior
----------------
As described above, two binaries (neutron-l3-agent and neutron-dhcp-agent) are not up.
Reproducibility
---------------
Tried only once, on the latest load.
System Configuration
-------
storage system
Branch/Pull Time/Commit
-------
20190706T013000Z
Last Pass
---------
Not known
Timestamp/Logs
--------------
2019-07-
Test Activity
-------------
Regression test
description: | updated |
tags: | added: stx.retestneeded |
tags: | added: stx.regression |
summary: |
neutron-l3-agent and neutron-dhcp-agent never recovered after force - reboot on compute + reboot |
tags: | removed: stx.nfv |
tags: | removed: stx.retestneeded |
Please add the collect logs.