Active controller became degraded after lock/unlock compute node
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | High | Paul-Ionut Vaduva | | |
Bug Description
Brief Description
-----------------
After locking and unlocking one compute node, the active controller became degraded and alarm 200.006 was raised.
After a forced reboot of the active controller, the system recovered and the alarm was cleared.
Severity
--------
Major
Steps to Reproduce
------------------
As described in the Brief Description above.
TC-name: mtc/test_
Expected Behavior
------------------
Actual Behavior
----------------
Reproducibility
---------------
Unknown - first time this is seen in sanity, will monitor
System Configuration
-------
Multi-node system
IPv4
Lab-name: WCP_3-6
Branch/Pull Time/Commit
-------
2019-12-10_20-00-00
Last Pass
---------
2019-12-10_20-00-00 on (WP_8-12)
Timestamp/Logs
--------------
[2019-12-11 08:58:20,124] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-12-11 08:58:21,300] 433 DEBUG MainThread ssh.expect :: Output:
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | controller-1 | controller | unlocked | enabled | available |
+----+-
[2019-12-11 08:58:22,661] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-12-11 08:59:40,320] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-12-11 09:05:59,264] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-12-11 09:06:00,442] 433 DEBUG MainThread ssh.expect :: Output:
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | degraded |
| 2 | compute-0 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | controller-1 | controller | unlocked | enabled | available |
+----+-
[sysadmin@
[2019-12-11 09:11:08,717] 311 DEBUG MainThread ssh.send :: Send 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-12-11 09:11:09,693] 433 DEBUG MainThread ssh.expect :: Output:
+------
| UUID | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
| 26e10dab-
+------
[sysadmin@
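The degraded state in the second host listing above can be detected programmatically. The following is a minimal sketch, not part of the test framework used here: the helper names `parse_host_table` and `unhealthy_hosts` are hypothetical, and the sample text is copied from the truncated log output in this report.

```python
def parse_host_table(text):
    """Parse pipe-delimited rows of a `system host-list` style table into dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip table borders, prompts, and log lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited row is the column header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unhealthy_hosts(rows):
    """Return hostnames whose availability is anything other than 'available'."""
    return [r["hostname"] for r in rows if r.get("availability") != "available"]


# Sample copied from the 09:06:00 output in this report (borders are truncated).
sample = """\
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | degraded |
| 2 | compute-0 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | controller-1 | controller | unlocked | enabled | available |
+----+-
"""

print(unhealthy_hosts(parse_host_table(sample)))  # ['controller-0']
```

Running the same check against the 08:58:21 output returns an empty list, since all four hosts were available before the lock/unlock.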
Test Activity
-------------
Sanity
description: updated
tags: added: stx.retestneeded
tags: added: stx.4.0
Changed in starlingx:
status: Fix Released → Confirmed
tags: removed: stx.cherrypickneeded
Waiting for triage by Dan to understand whether this issue was introduced by recent code changes related to: https://review.opendev.org/#/c/695917/