SX controller takes too long to recover after host-unlock

Bug #1890323 reported by Peng Peng on 2020-08-04
Affects: StarlingX
Importance: Low
Assigned to: Unassigned

Bug Description

Brief Description
-----------------
30 minutes after an SX host-unlock, the controller node had still not recovered.

Severity
--------
Major

Steps to Reproduce
------------------
SX host-unlock

TC-name: /networking/test_sriovdp.py::TestSriovMixed::()::test_sriovdp_mixed_add_vf_interface[1]

Expected Behavior
------------------
Controller node recovers in less than 5 minutes

Actual Behavior
----------------
Controller node had not recovered after 30 minutes

Reproducibility
---------------
Unknown; this is the first time it has been seen in sanity

System Configuration
--------------------
One node system

Lab-name: SM-3

Branch/Pull Time/Commit
-----------------------
2020-08-04_00-00-00

Last Pass
---------
2020-08-03_00-00-00

Timestamp/Logs
--------------
[2020-08-04 09:30:02,171] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[abcd:204::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'

[2020-08-04 09:58:21,003] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[abcd:204::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne show'
[2020-08-04 09:58:21,914] 436 DEBUG MainThread ssh.expect :: Output:
Authorization failed: Unable to establish connection to http://[abcd:204::1]:5000/v3/auth/tokens
controller-0:~$
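
As a rough check of how long the controller was unreachable, the gap between the two `ssh.send` timestamps above can be computed as follows. This is only an illustrative sketch: the helper name `elapsed_seconds` and the format constant are assumptions made for this example and are not part of the test framework.

```python
from datetime import datetime

# Timestamp layout used in the test log lines, e.g. "2020-08-04 09:30:02,171".
# (Assumed from the log excerpts above; %f accepts the 3-digit millisecond part.)
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TS_FORMAT)
    t1 = datetime.strptime(end, LOG_TS_FORMAT)
    return (t1 - t0).total_seconds()


# Timestamps copied from the log excerpt above.
unlock_sent = "2020-08-04 09:30:02,171"  # 'host-unlock controller-0' sent
show_sent = "2020-08-04 09:58:21,003"    # 'system show' sent, authorization failed

gap = elapsed_seconds(unlock_sent, show_sent)
print(f"{gap / 60:.1f} minutes between host-unlock and the failed 'system show'")
```

The result is a little over 28 minutes, well beyond the 5 minute expectation stated under Expected Behavior.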

Test Activity
-------------
Sanity

Peng Peng (ppeng) wrote :

Issue was reproduced on
WCP_112
2020-08-16_22-54-19

[2020-08-17 07:29:18,063] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[abcd:204::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'

[2020-08-17 07:57:34,118] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[abcd:204::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne show'
[2020-08-17 07:57:35,394] 436 DEBUG MainThread ssh.expect :: Output:
Authorization failed: Unable to establish connection to http://[abcd:204::1]:5000/v3/auth/tokens
controller-0:~$

It seems the controller had a double reboot after host-unlock.

The collect log has also been added.

Ghada Khalil (gkhalil) wrote :

@Peng, did the original occurrence also involve a double reboot?

Ghada Khalil (gkhalil) wrote :

Also, are you still seeing these double reboots on these two systems or on any others?

Ghada Khalil (gkhalil) on 2020-09-09
Changed in starlingx:
importance: Undecided → Low