2019-01-30 18:30:59 |
Anujeyan Manokeran |
bug |
|
|
added bug |
2019-01-30 18:47:27 |
Ghada Khalil |
starlingx: assignee |
|
Eric MacDonald (rocksolidmtce) |
|
2019-01-30 18:47:29 |
Ghada Khalil |
starlingx: importance |
Undecided |
Medium |
|
2019-01-30 18:47:31 |
Ghada Khalil |
starlingx: status |
New |
Triaged |
|
2019-01-30 18:48:01 |
Ghada Khalil |
tags |
|
stx.2019.05 stx.metal |
|
2019-01-30 18:59:12 |
Anujeyan Manokeran |
description |
Bug Description: As part of HA improvement testing, a split-brain scenario was executed by dropping incoming traffic from controller-1 and compute-0 on the active controller (controller-0) using iptables commands. Soon afterwards a swact from controller-0 to controller-1 occurred, as expected, because controller-1 is the healthier controller and can still see compute-0. After the swact, host-list on the new active controller (controller-1) showed incorrect data: controller-0 as online, controller-1 as failed, and compute-0 as failed. This was discussed with Eric and Bin and found to be an issue in maintenance: controller-1 still thinks controller-0 is active since it still receives messages from controller-0 but is unable to send messages.
[wrsroot@controller-1 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | disabled | online |
| 2 | controller-1 | controller | unlocked | enabled | failed |
| 3 | storage-0 | storage | unlocked | enabled | available |
| 4 | storage-1 | storage | unlocked | enabled | available |
| 5 | compute-0 | worker | unlocked | enabled | failed |
| 6 | compute-1 | worker | unlocked | enabled | available |
| 7 | compute-2 | worker | unlocked | enabled | available |
| 8 | compute-3 | worker | unlocked | enabled | available |
+----+--------------+-------------+----------------+-------------+--------------+
Services on controller-1 are all enabled-active, and services on controller-0 are all disabled.
====================================================================
SYSTEM: yow-cgcs-wildcat-113_121
====================================================================
controller-0:~$ sudo sm-dump
Password:
-Service_Groups------------------------------------------------------------------------
oam-services disabled disabled
controller-services disabled disabled
cloud-services disabled disabled
patching-services disabled disabled
directory-services disabled disabled
web-services disabled disabled
storage-services disabled disabled
storage-monitoring-services disabled disabled
vim-services disabled disabled
---------------------------------------------------------------------------------------
controller-1:~$ sudo sm-dump
Password:
-Service_Groups------------------------------------------------------------------------
oam-services active active
controller-services active active
cloud-services active active
patching-services active active
directory-services active active
web-services active active
storage-services active active
storage-monitoring-services active active
vim-services active active
---------------------------------------------------------------------------------------
Severity
--------
Major
Steps to Reproduce
------------------
1. Execute the command below on the active controller (controller-0) to block traffic from the standby controller-1 and compute-0.
sudo iptables -I INPUT 1 -s 192.168.223.57 -j DROP && sudo iptables -I INPUT 1 -s 192.168.222.156 -j DROP && \
sudo iptables -I INPUT 1 -s 192.168.222.4 -j DROP && sudo iptables -I INPUT 1 -s 192.168.223.3 -j DROP && sudo iptables -I INPUT 1 -s 128.224.150.57 -j DROP
2. After the above command, a swact to controller-1 occurs, but host-list displays unstable, incorrect data as per the description.
Expected Behavior
------------------
A stable host-list showing correct information.
Actual Behavior
----------------
As per description
Reproducibility
---------------
Yes. Reproduced with controller-0 as the active controller.
System Configuration
--------------------
regular system
Branch/Pull Time/Commit
-----------------------
StarlingX_Upstream_build release branch build as of 019-01-16_20-18-01
Timestamp/Logs
--------------
2019-01-28T16:29:08.185 |
Bug Description: As part of HA improvement testing, a split-brain scenario was executed by dropping incoming traffic from controller-1 and compute-0 on the active controller (controller-0) using iptables commands. Soon afterwards a swact from controller-0 to controller-1 occurred, as expected, because controller-1 is the healthier controller and can still see compute-0. After the swact, host-list on the new active controller (controller-1) showed incorrect data: controller-0 as online, controller-1 as failed, and compute-0 as failed. It is an issue in maintenance: controller-1 still thinks controller-0 is active since it still receives messages from controller-0 but is unable to send messages.
[wrsroot@controller-1 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | disabled | online |
| 2 | controller-1 | controller | unlocked | enabled | failed |
| 3 | storage-0 | storage | unlocked | enabled | available |
| 4 | storage-1 | storage | unlocked | enabled | available |
| 5 | compute-0 | worker | unlocked | enabled | failed |
| 6 | compute-1 | worker | unlocked | enabled | available |
| 7 | compute-2 | worker | unlocked | enabled | available |
| 8 | compute-3 | worker | unlocked | enabled | available |
+----+--------------+-------------+----------------+-------------+--------------+
Services on controller-1 are all enabled-active, and services on controller-0 are all disabled.
====================================================================
SYSTEM: yow-cgcs-wildcat-113_121
====================================================================
controller-0:~$ sudo sm-dump
Password:
-Service_Groups------------------------------------------------------------------------
oam-services disabled disabled
controller-services disabled disabled
cloud-services disabled disabled
patching-services disabled disabled
directory-services disabled disabled
web-services disabled disabled
storage-services disabled disabled
storage-monitoring-services disabled disabled
vim-services disabled disabled
---------------------------------------------------------------------------------------
controller-1:~$ sudo sm-dump
Password:
-Service_Groups------------------------------------------------------------------------
oam-services active active
controller-services active active
cloud-services active active
patching-services active active
directory-services active active
web-services active active
storage-services active active
storage-monitoring-services active active
vim-services active active
---------------------------------------------------------------------------------------
Severity
--------
Major
Steps to Reproduce
------------------
1. Execute the command below on the active controller (controller-0) to block traffic from the standby controller-1 and compute-0.
sudo iptables -I INPUT 1 -s 192.168.223.57 -j DROP && sudo iptables -I INPUT 1 -s 192.168.222.156 -j DROP && \
sudo iptables -I INPUT 1 -s 192.168.222.4 -j DROP && sudo iptables -I INPUT 1 -s 192.168.223.3 -j DROP && sudo iptables -I INPUT 1 -s 128.224.150.57 -j DROP
2. After the above command, a swact to controller-1 occurs, but host-list displays unstable, incorrect data as per the description.
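To restore connectivity once the scenario has been observed, the DROP rules inserted in step 1 can be deleted again. The following is only a minimal sketch, assuming the rules were inserted exactly as shown in step 1 and that no other rules match the same specification. The names BLOCKED, DRY_RUN and unblock are hypothetical and are not part of the original report. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Source addresses that were blocked in step 1 of the reproduction.
BLOCKED="192.168.223.57 192.168.222.156 192.168.222.4 192.168.223.3 128.224.150.57"

# Print the delete commands (DRY_RUN=1, the default) or execute them (DRY_RUN=0).
# "iptables -D INPUT -s <ip> -j DROP" deletes the first INPUT rule matching that specification.
unblock() {
    for ip in $BLOCKED; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo iptables -D INPUT -s $ip -j DROP"
        else
            sudo iptables -D INPUT -s "$ip" -j DROP
        fi
    done
}

unblock
```

Running it with DRY_RUN=0 on controller-0 would actually remove the rules. The dry-run default is a deliberate choice so the commands can be reviewed first.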
Expected Behavior
------------------
A stable host-list showing correct information.
Actual Behavior
----------------
As per description
Reproducibility
---------------
Yes. Reproduced with controller-0 as the active controller.
System Configuration
--------------------
regular system
Branch/Pull Time/Commit
-----------------------
StarlingX_Upstream_build release branch build as of 019-01-16_20-18-01
Timestamp/Logs
--------------
2019-01-28T16:29:08.185 |
|
2019-03-18 19:06:09 |
Ken Young |
starlingx: assignee |
Eric MacDonald (rocksolidmtce) |
Cindy Xie (xxie1) |
|
2019-03-20 19:53:07 |
Ken Young |
bug |
|
|
added subscriber Ken Young |
2019-04-05 20:36:45 |
Ken Young |
tags |
stx.2019.05 stx.metal |
stx.2.0 stx.metal |
|
2019-04-09 18:17:43 |
Ghada Khalil |
tags |
stx.2.0 stx.metal |
stx.2.0 stx.metal stx.retestneeded |
|
2019-04-26 23:43:19 |
chen haochuan |
starlingx: assignee |
Cindy Xie (xxie1) |
chen haochuan (martin1982) |
|
2019-04-29 11:52:26 |
Bill Zvonar |
bug |
|
|
added subscriber Bill Zvonar |
2019-05-03 14:43:44 |
Frank Miller |
starlingx: assignee |
chen haochuan (martin1982) |
Eric MacDonald (rocksolidmtce) |
|
2019-05-03 14:46:54 |
Frank Miller |
bug |
|
|
added subscriber Dariush Eslimi |
2019-05-07 20:27:37 |
OpenStack Infra |
starlingx: status |
Triaged |
In Progress |
|
2019-05-13 14:32:17 |
OpenStack Infra |
starlingx: status |
In Progress |
Fix Released |
|
2019-05-31 20:41:47 |
Anujeyan Manokeran |
tags |
stx.2.0 stx.metal stx.retestneeded |
stx.2.0 stx.metal |
|