After controller swact incorrect host state for system host-list on split brain test scenario
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Eric MacDonald |
Bug Description
As part of HA improvement testing, a split-brain scenario was executed by blocking traffic between the active controller (controller-0) and both controller-1 and compute-0 with iptables rules. Soon afterwards a swact from controller-0 to controller-1 occurred, as expected, because controller-1 is the healthier controller and can still see compute-0. After this, host-list from the new active controller(
-------
[wrsroot@
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | disabled | online |
| 2 | controller-1 | controller | unlocked | enabled | failed |
| 3 | storage-0 | storage | unlocked | enabled | available |
| 4 | storage-1 | storage | unlocked | enabled | available |
| 5 | compute-0 | worker | unlocked | enabled | failed |
| 6 | compute-1 | worker | unlocked | enabled | available |
| 7 | compute-2 | worker | unlocked | enabled | available |
| 8 | compute-3 | worker | unlocked | enabled | available |
+----+-
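The inconsistent rows in the listing above can be picked out mechanically. The following is a minimal sketch and not part of the original report: `flag_unhealthy_hosts` is a hypothetical helper name, and it assumes the pipe-delimited column layout shown above (operational state in the fifth column, availability in the sixth). It prints every host whose operational state is not `enabled` or whose availability is not `available`. The sample rows are copied from the listing.

```shell
#!/bin/sh
# Hypothetical helper: read a pipe-delimited host-list table on stdin and
# print "hostname operational availability" for every host that is not
# enabled/available. Assumes the column order shown in the report.
flag_unhealthy_hosts() {
    awk -F'|' '
        # Data rows have 8 fields (empty first and last) and a numeric id.
        NF >= 8 && $2 ~ /[0-9]/ {
            gsub(/ /, "", $3)   # hostname
            gsub(/ /, "", $6)   # operational
            gsub(/ /, "", $7)   # availability
            if ($6 != "enabled" || $7 != "available")
                print $3, $6, $7
        }'
}

# Sample rows copied from the listing above.
flag_unhealthy_hosts <<'EOF'
| id | hostname | personality | administrative | operational | availability |
| 1 | controller-0 | controller | unlocked | disabled | online |
| 2 | controller-1 | controller | unlocked | enabled | failed |
| 3 | storage-0 | storage | unlocked | enabled | available |
| 5 | compute-0 | worker | unlocked | enabled | failed |
EOF
```

On a live system the same function could be fed from the host-list command instead of the sample rows. With the sample it reports controller-0, controller-1 and compute-0, which are the three rows that do not read enabled/available.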
Services on controller-1 are all enabled-active, and services on controller-0 are all disabled.
=======
SYSTEM: yow-cgcs-
=======
controller-0:~$ sudo sm-dump
Password:
-Service_
oam-services disabled disabled
controller-services disabled disabled
cloud-services disabled disabled
patching-services disabled disabled
directory-services disabled disabled
web-services disabled disabled
storage-services disabled disabled
storage-
vim-services disabled disabled
-------
controller-1:~$ sudo sm-dump
Password:
-Service_
oam-services active active
controller-services active active
cloud-services active active
patching-services active active
directory-services active active
web-services active active
storage-services active active
storage-
vim-services active active
-------
Severity
--------
Major
Steps to Reproduce
------------------
1. Run the command below on the active controller (controller-0) to block traffic from the standby controller-1 and from compute-0.
sudo iptables -I INPUT 1 -s 192.168.223.57 -j DROP && sudo iptables -I INPUT 1 -s 192.168.222.156 -j DROP && \
sudo iptables -I INPUT 1 -s 192.168.222.4 -j DROP && sudo iptables -I INPUT 1 -s 192.168.223.3 -j DROP && sudo iptables -I INPUT 1 -s 128.224.150.57 -j DROP
2. After the above command, a swact to controller-1 occurs, but the host-list output is unstable, as shown in the description.
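The rules inserted in step 1 stay in place until they are deleted. Below is a minimal cleanup sketch, not part of the original report, assuming the rules were inserted exactly as in step 1. `print_unblock_commands` is a hypothetical helper that only prints the matching `iptables -D` commands for review; nothing changes until the printed commands are actually run on controller-0.

```shell
#!/bin/sh
# Hypothetical helper: print one delete command per source address that
# step 1 blocked. "iptables -D" removes the rule whose specification
# matches the one given to "iptables -I" (the position argument is not needed).
print_unblock_commands() {
    for src in 192.168.223.57 192.168.222.156 192.168.222.4 \
               192.168.223.3 128.224.150.57; do
        printf 'sudo iptables -D INPUT -s %s -j DROP\n' "$src"
    done
}

# Print the commands; review them before executing.
print_unblock_commands
```

Printing rather than executing keeps the sketch safe to run anywhere. To apply the cleanup, the output can be piped to `sh` on the controller where the rules were added.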
Expected Behavior
------------------
The host-list output is stable and shows the correct host states.
Actual Behavior
----------------
As per description
Reproducibility
---------------
Yes. Reproduced with controller-0 as the active controller.
System Configuration
-------
regular system
Branch/Pull Time/Commit
-------
StarlingX_
Timestamp/Logs
--------------
2019-01-
description: updated
Changed in starlingx:
assignee: Eric MacDonald (rocksolidmtce) → Cindy Xie (xxie1)
tags: added: stx.2.0; removed: stx.2019.05
tags: added: stx.retestneeded
Changed in starlingx:
assignee: Cindy Xie (xxie1) → chen haochuan (martin1982)
Marking as release gating -- issue found during feature testing (HA Recovery Improvements)