2019-05-28 18:01:32 |
Nimalini Rasa |
bug |
|
|
added bug |
2019-05-28 18:02:07 |
Nimalini Rasa |
attachment added |
|
controller-0 log https://bugs.launchpad.net/starlingx/+bug/1830774/+attachment/5267224/+files/controller-0_20190528.150921.tar |
|
2019-05-28 19:00:12 |
Numan Waheed |
tags |
|
stx.retestneeded |
|
2019-05-28 19:24:36 |
Nimalini Rasa |
attachment added |
|
controller-1 log https://bugs.launchpad.net/starlingx/+bug/1830774/+attachment/5267226/+files/controller-1_20190528.192143.tar |
|
2019-05-29 17:01:42 |
Nimalini Rasa |
attachment added |
|
controller-0_20190529.164737.tar https://bugs.launchpad.net/starlingx/+bug/1830774/+attachment/5267443/+files/controller-0_20190529.164737.tar |
|
2019-05-29 17:02:17 |
Nimalini Rasa |
attachment added |
|
controller-1_20190529.164929.tar https://bugs.launchpad.net/starlingx/+bug/1830774/+attachment/5267444/+files/controller-1_20190529.164929.tar |
|
2019-05-29 17:38:24 |
Eric MacDonald |
starlingx: assignee |
|
Eric MacDonald (rocksolidmtce) |
|
2019-05-29 17:45:38 |
Ghada Khalil |
description |
Brief Description
-----------------
After the active controller (controller-0) was rebooted, controller-1 did not take activity. When controller-0 came out of reboot, it became the active controller.
Severity
--------Feature is
Major
Steps to Reproduce
------------------
Install system and issue reboot on active controller
Expected Behavior
------------------
Activity switches to the standby controller (controller-1)
Actual Behavior
----------------
No swact
Reproducibility
---------------
seen once
System Configuration
--------------------
2+10
Branch/Pull Time/Commit
-----------------------
Private build:2019-05-23
Last Pass
---------
not known
Timestamp/Logs
--------------
2019-05-28T14:55:45.000 controller-0 -sh: info HISTORY: PID=276470 UID=1875 sudo reboot
See services going down on controller-0 in sm-customer.log:
| 2019-05-28T14:55:48.015 | 560 | service-scn | docker-distribution | enabled-active | disabled | process (pid=102746) failed
| 2019-05-28T14:55:48.052 | 561 | service-scn | docker-distribution | disabled | enabling | enabled-active state requested
| 2019-05-28T14:55:48.055 | 562 | service-scn | docker-distribution | enabling | enabled-active | enable success
| 2019-05-28T14:55:48.114 | 563 | service-scn | etcd | enabled-active | disabled | process (pid=102871) failed
| 2019-05-28T14:55:48.186 | 564 | service-scn | registry-token-server | enabled-active | disabled | process (pid=101767) failed
| 2019-05-28T14:55:48.228 | 565 | node-scn | controller-0 | unlocked-enabled | unlocked-disabled | node not ready, node unhealthy set
But controller-1 fails to take activity:
| 2019-05-28T14:55:51.046 | 189 | service-domain-scn | controller | backup | leader | leader change
| 2019-05-28T14:55:51.050 | 190 | node-scn | controller-1 | unlocked-enabled | unlocked-disabled | node not ready
| 2019-05-28T14:55:51.050 | 191 | interface-scn | cluster-host-interface | enabled | disabled | node disabled
| 2019-05-28T14:55:51.050 | 192 | interface-scn | oam-interface | enabled | disabled | node disabled
| 2019-05-28T14:55:51.050 | 193 | interface-scn | management-interface | enabled | disabled | node disabled
After controller-0 booted and came up, it took activity, and system host-list shows both controllers as enabled-available:
[wrsroot@controller-0 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname     | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1  | controller-0 | controller  | unlocked       | enabled     | available    |
| 2  | controller-1 | controller  | unlocked       | enabled     | available    |
+----+--------------+-------------+----------------+-------------+--------------+
Test Activity
-------------
Platform Testing |
Brief Description
-----------------
After the active controller (controller-0) was rebooted, controller-1 did not take activity. When controller-0 came out of reboot, it became the active controller.
Severity
--------
Major
Steps to Reproduce
------------------
Install system and issue reboot on active controller
Expected Behavior
------------------
Activity switches to the standby controller (controller-1)
Actual Behavior
----------------
No swact
Reproducibility
---------------
seen once
System Configuration
--------------------
2+10
Branch/Pull Time/Commit
-----------------------
Private build:2019-05-23
Last Pass
---------
not known
Timestamp/Logs
--------------
2019-05-28T14:55:45.000 controller-0 -sh: info HISTORY: PID=276470 UID=1875 sudo reboot
See services going down on controller-0 in sm-customer.log:
| 2019-05-28T14:55:48.015 | 560 | service-scn | docker-distribution | enabled-active | disabled | process (pid=102746) failed
| 2019-05-28T14:55:48.052 | 561 | service-scn | docker-distribution | disabled | enabling | enabled-active state requested
| 2019-05-28T14:55:48.055 | 562 | service-scn | docker-distribution | enabling | enabled-active | enable success
| 2019-05-28T14:55:48.114 | 563 | service-scn | etcd | enabled-active | disabled | process (pid=102871) failed
| 2019-05-28T14:55:48.186 | 564 | service-scn | registry-token-server | enabled-active | disabled | process (pid=101767) failed
| 2019-05-28T14:55:48.228 | 565 | node-scn | controller-0 | unlocked-enabled | unlocked-disabled | node not ready, node unhealthy set
But controller-1 fails to take activity:
| 2019-05-28T14:55:51.046 | 189 | service-domain-scn | controller | backup | leader | leader change
| 2019-05-28T14:55:51.050 | 190 | node-scn | controller-1 | unlocked-enabled | unlocked-disabled | node not ready
| 2019-05-28T14:55:51.050 | 191 | interface-scn | cluster-host-interface | enabled | disabled | node disabled
| 2019-05-28T14:55:51.050 | 192 | interface-scn | oam-interface | enabled | disabled | node disabled
| 2019-05-28T14:55:51.050 | 193 | interface-scn | management-interface | enabled | disabled | node disabled
After controller-0 booted and came up, it took activity, and system host-list shows both controllers as enabled-available:
[wrsroot@controller-0 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname     | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1  | controller-0 | controller  | unlocked       | enabled     | available    |
| 2  | controller-1 | controller  | unlocked       | enabled     | available    |
+----+--------------+-------------+----------------+-------------+--------------+
Test Activity
-------------
Platform Testing |
|
2019-05-29 17:46:09 |
Ghada Khalil |
tags |
stx.retestneeded |
stx.metal stx.retestneeded |
|
2019-05-29 17:47:16 |
Ghada Khalil |
starlingx: importance |
Undecided |
Medium |
|
2019-05-29 17:47:18 |
Ghada Khalil |
starlingx: status |
New |
Triaged |
|
2019-05-29 17:47:26 |
Ghada Khalil |
tags |
stx.metal stx.retestneeded |
stx.2.0 stx.metal stx.retestneeded |
|
2019-06-06 15:12:47 |
Eric MacDonald |
starlingx: status |
Triaged |
In Progress |
|
2019-06-12 14:15:46 |
OpenStack Infra |
starlingx: status |
In Progress |
Fix Released |
|
2019-08-15 15:15:38 |
Ming Lei |
tags |
stx.2.0 stx.metal stx.retestneeded |
stx.2.0 stx.metal |
|