Activity log for bug #1813976

Date Who What changed Old value New value Message
2019-01-30 18:30:59 Anujeyan Manokeran bug added bug
2019-01-30 18:47:27 Ghada Khalil starlingx: assignee Eric MacDonald (rocksolidmtce)
2019-01-30 18:47:29 Ghada Khalil starlingx: importance Undecided Medium
2019-01-30 18:47:31 Ghada Khalil starlingx: status New Triaged
2019-01-30 18:48:01 Ghada Khalil tags stx.2019.05 stx.metal
2019-01-30 18:59:12 Anujeyan Manokeran description

Old value:

Bug Description:
As part of HA improvement testing, a split-brain test scenario was executed by stopping the traffic flowing from the active controller (controller-0) to controller-1 and compute-0 using the iptables command. Soon after this there was a swact from controller-0 to controller-1, as expected, because controller-1 is the healthier controller that can see compute-0. After this, host-list from the new active controller (controller-1) showed incorrect data: controller-0 is online, controller-1 is failed and compute-0 is failed. This was discussed with Eric and Bin and found to be an issue in maintenance, as controller-1 still thinks controller-0 is active since it is getting messages from controller-0 and is unable to send messages.

[wrsroot@controller-1 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname     | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1  | controller-0 | controller  | unlocked       | disabled    | online       |
| 2  | controller-1 | controller  | unlocked       | enabled     | failed       |
| 3  | storage-0    | storage     | unlocked       | enabled     | available    |
| 4  | storage-1    | storage     | unlocked       | enabled     | available    |
| 5  | compute-0    | worker      | unlocked       | enabled     | failed       |
| 6  | compute-1    | worker      | unlocked       | enabled     | available    |
| 7  | compute-2    | worker      | unlocked       | enabled     | available    |
| 8  | compute-3    | worker      | unlocked       | enabled     | available    |
+----+--------------+-------------+----------------+-------------+--------------+

Services on controller-1 are all enabled-active, and on controller-0 are all disabled.

====================================================================
SYSTEM: yow-cgcs-wildcat-113_121
====================================================================
controller-0:~$ sudo sm-dump
Password:
-Service_Groups------------------------------------------------------------------------
oam-services                 disabled disabled
controller-services          disabled disabled
cloud-services               disabled disabled
patching-services            disabled disabled
directory-services           disabled disabled
web-services                 disabled disabled
storage-services             disabled disabled
storage-monitoring-services  disabled disabled
vim-services                 disabled disabled
---------------------------------------------------------------------------------------

controller-1:~$ sudo sm-dump
Password:
-Service_Groups------------------------------------------------------------------------
oam-services                 active active
controller-services          active active
cloud-services               active active
patching-services            active active
directory-services           active active
web-services                 active active
storage-services             active active
storage-monitoring-services  active active
vim-services                 active active
---------------------------------------------------------------------------------------

Severity
--------
Major

Steps to Reproduce
------------------
1. Execute the command below from the active controller-0 to block traffic from standby controller-1 and compute-0.
sudo iptables -I INPUT 1 -s 192.168.223.57 -j DROP && sudo iptables -I INPUT 1 -s 192.168.222.156 -j DROP && \
sudo iptables -I INPUT 1 -s 192.168.222.4 -j DROP && sudo iptables -I INPUT 1 -s 192.168.223.3 -j DROP && sudo iptables -I INPUT 1 -s 128.224.150.57 -j DROP
2. After the above command there is a swact to controller-1, but the host-list display is unstable, as per the description.

Expected Behavior
------------------
Stable host-list with correct information

Actual Behavior
----------------
As per description

Reproducibility
---------------
Yes. Reproduced with controller-0 as the active controller.

System Configuration
--------------------
regular system

Branch/Pull Time/Commit
-----------------------
StarlingX_Upstream_build release branch build as of 019-01-16_20-18-01

Timestamp/Logs
--------------
2019-01-28T16:29:08.185

New value:

Bug Description:
As part of HA improvement testing, a split-brain test scenario was executed by stopping the traffic flowing from the active controller (controller-0) to controller-1 and compute-0 using the iptables command. Soon after this there was a swact from controller-0 to controller-1, as expected, because controller-1 is the healthier controller that can see compute-0. After this, host-list from the new active controller (controller-1) showed incorrect data: controller-0 is online, controller-1 is failed and compute-0 is failed. It is an issue in maintenance, as controller-1 still thinks controller-0 is active since it is getting messages from controller-0 and is unable to send messages.

[wrsroot@controller-1 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname     | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1  | controller-0 | controller  | unlocked       | disabled    | online       |
| 2  | controller-1 | controller  | unlocked       | enabled     | failed       |
| 3  | storage-0    | storage     | unlocked       | enabled     | available    |
| 4  | storage-1    | storage     | unlocked       | enabled     | available    |
| 5  | compute-0    | worker      | unlocked       | enabled     | failed       |
| 6  | compute-1    | worker      | unlocked       | enabled     | available    |
| 7  | compute-2    | worker      | unlocked       | enabled     | available    |
| 8  | compute-3    | worker      | unlocked       | enabled     | available    |
+----+--------------+-------------+----------------+-------------+--------------+

Services on controller-1 are all enabled-active, and on controller-0 are all disabled.

====================================================================
SYSTEM: yow-cgcs-wildcat-113_121
====================================================================
controller-0:~$ sudo sm-dump
Password:
-Service_Groups------------------------------------------------------------------------
oam-services                 disabled disabled
controller-services          disabled disabled
cloud-services               disabled disabled
patching-services            disabled disabled
directory-services           disabled disabled
web-services                 disabled disabled
storage-services             disabled disabled
storage-monitoring-services  disabled disabled
vim-services                 disabled disabled
---------------------------------------------------------------------------------------

controller-1:~$ sudo sm-dump
Password:
-Service_Groups------------------------------------------------------------------------
oam-services                 active active
controller-services          active active
cloud-services               active active
patching-services            active active
directory-services           active active
web-services                 active active
storage-services             active active
storage-monitoring-services  active active
vim-services                 active active
---------------------------------------------------------------------------------------

Severity
--------
Major

Steps to Reproduce
------------------
1. Execute the command below from the active controller-0 to block traffic from standby controller-1 and compute-0.
sudo iptables -I INPUT 1 -s 192.168.223.57 -j DROP && sudo iptables -I INPUT 1 -s 192.168.222.156 -j DROP && \
sudo iptables -I INPUT 1 -s 192.168.222.4 -j DROP && sudo iptables -I INPUT 1 -s 192.168.223.3 -j DROP && sudo iptables -I INPUT 1 -s 128.224.150.57 -j DROP
2. After the above command there is a swact to controller-1, but the host-list display is unstable, as per the description.

Expected Behavior
------------------
Stable host-list with correct information

Actual Behavior
----------------
As per description

Reproducibility
---------------
Yes. Reproduced with controller-0 as the active controller.

System Configuration
--------------------
regular system

Branch/Pull Time/Commit
-----------------------
StarlingX_Upstream_build release branch build as of 019-01-16_20-18-01

Timestamp/Logs
--------------
2019-01-28T16:29:08.185

2019-03-18 19:06:09 Ken Young starlingx: assignee Eric MacDonald (rocksolidmtce) Cindy Xie (xxie1)
2019-03-20 19:53:07 Ken Young bug added subscriber Ken Young
2019-04-05 20:36:45 Ken Young tags stx.2019.05 stx.metal stx.2.0 stx.metal
2019-04-09 18:17:43 Ghada Khalil tags stx.2.0 stx.metal stx.2.0 stx.metal stx.retestneeded
2019-04-26 23:43:19 chen haochuan starlingx: assignee Cindy Xie (xxie1) chen haochuan (martin1982)
2019-04-29 11:52:26 Bill Zvonar bug added subscriber Bill Zvonar
2019-05-03 14:43:44 Frank Miller starlingx: assignee chen haochuan (martin1982) Eric MacDonald (rocksolidmtce)
2019-05-03 14:46:54 Frank Miller bug added subscriber Dariush Eslimi
2019-05-07 20:27:37 OpenStack Infra starlingx: status Triaged In Progress
2019-05-13 14:32:17 OpenStack Infra starlingx: status In Progress Fix Released
2019-05-31 20:41:47 Anujeyan Manokeran tags stx.2.0 stx.metal stx.retestneeded stx.2.0 stx.metal