2019-05-13 18:45:22 |
Chris Winnicki |
bug |
|
|
added bug |
2019-05-13 18:45:22 |
Chris Winnicki |
attachment added |
|
controller-0_20190513.183928.tar https://bugs.launchpad.net/bugs/1828877/+attachment/5263502/+files/controller-0_20190513.183928.tar |
|
2019-05-13 18:47:08 |
Chris Winnicki |
description |
Brief Description
-----------------
pci-irq-affinity-agent fails to start - controller-0 stuck in degraded state after initial unlock
pmond reports the following (continuously):
/var/log/pmond.log (snippet)
2019-05-13T18:34:52.837 [93604.05081] controller-0 pmond mon pmonFsm.cpp ( 565) pmon_passive_handler : Info : pci-irq-affinity-agent stability period (20 secs)
2019-05-13T18:34:52.837 [93604.05082] controller-0 pmond mon pmonHdlr.cpp (1003) process_running : Info : pci-irq-affinity-agent process not running
2019-05-13T18:34:52.837 [93604.05083] controller-0 pmond mon pmonHdlr.cpp (1305) respawn_process : Info : pci-irq-affinity-agent Spawn (1200886)
2019-05-13T18:34:53.837 [93604.05084] controller-0 pmond mon pmonHdlr.cpp ( 897) want_degrade_clear : Warn : pci-irq-affinity-agent is still failed 'major' ; degrade assert
controller-0 stuck in degraded state:
[wrsroot@controller-0 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | degraded |
+----+--------------+-------------+----------------+-------------+--------------+
(Alarm snippet)
fm alarm-list
[wrsroot@controller-0 ~(keystone_admin)]$ fm alarm-list
+-------+------------------------------------------------------------------------------------+--------------------------------------+----------+----------------+
| Alarm | Reason Text | Entity ID | Severity | Time Stamp |
| ID | | | | |
+-------+------------------------------------------------------------------------------------+--------------------------------------+----------+----------------+
| 200. | controller-0 is degraded due to the failure of its 'pci-irq-affinity-agent' | host=controller-0.process=pci-irq- | major | 2019-05-13T16: |
| 006 | process. Auto recovery of this major process is in progress. | affinity-agent | | 40:46.408005 |
| | | | | |
+-------+------------------------------------------------------------------------------------+--------------------------------------+----------+----------------+
[wrsroot@controller-0 ~(keystone_admin)]$ date
Mon May 13 18:43:31 UTC 2019
Severity
--------
Major: System cannot be fully installed
Steps to Reproduce
------------------
Install controller-0 as All-in-one duplex mode
Expected Behavior
------------------
controller-0 should not be in degraded state after initial unlock
Actual Behavior
----------------
pci-irq-affinity-agent process keeps failing
controller-0 never gets out of degraded state
Reproducibility
---------------
100% reproducible on build: 20190512T233000Z
System Configuration
--------------------
1+1 system (AIO-DX)
Internal lab name: cgcs-wildcat-69-70
Branch/Pull Time/Commit
-----------------------
BUILD_ID="20190512T233000Z"
JOB="STX_build_master_master"
BUILD_BY="starlingx.build@cengn.ca"
Last Pass
---------
20190508T233000Z
Timestamp/Logs
--------------
Attached
Test Activity
-------------
Lab install |
Brief Description
-----------------
pci-irq-affinity-agent fails to start - controller-0 stuck in degraded state after initial unlock
pmond reports the following (continuously):
/var/log/pmond.log (snippet)
2019-05-13T18:34:52.837 [93604.05081] controller-0 pmond mon pmonFsm.cpp ( 565) pmon_passive_handler : Info : pci-irq-affinity-agent stability period (20 secs)
2019-05-13T18:34:52.837 [93604.05082] controller-0 pmond mon pmonHdlr.cpp (1003) process_running : Info : pci-irq-affinity-agent process not running
2019-05-13T18:34:52.837 [93604.05083] controller-0 pmond mon pmonHdlr.cpp (1305) respawn_process : Info : pci-irq-affinity-agent Spawn (1200886)
2019-05-13T18:34:53.837 [93604.05084] controller-0 pmond mon pmonHdlr.cpp ( 897) want_degrade_clear : Warn : pci-irq-affinity-agent is still failed 'major' ; degrade assert
controller-0 stuck in degraded state:
[wrsroot@controller-0 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | degraded |
+----+--------------+-------------+----------------+-------------+--------------+
(Alarm snippet)
fm alarm-list
[wrsroot@controller-0 ~(keystone_admin)]$ fm alarm-list
+-------+------------------------------------------------------------------------------------+--------------------------------------+----------+----------------+
| Alarm | Reason Text | Entity ID | Severity | Time Stamp |
| ID | | | | |
+-------+------------------------------------------------------------------------------------+--------------------------------------+----------+----------------+
| 200. | controller-0 is degraded due to the failure of its 'pci-irq-affinity-agent' | host=controller-0.process=pci-irq- | major | 2019-05-13T16: |
| 006 | process. Auto recovery of this major process is in progress. | affinity-agent | | 40:46.408005 |
| | | | | |
+-------+------------------------------------------------------------------------------------+--------------------------------------+----------+----------------+
[wrsroot@controller-0 ~(keystone_admin)]$ date
Mon May 13 18:43:31 UTC 2019
The issue is possibly caused by:
https://review.opendev.org/#/c/640264/
Severity
--------
Major: System cannot be fully installed
Steps to Reproduce
------------------
Install controller-0 as All-in-one duplex mode
Expected Behavior
------------------
controller-0 should not be in degraded state after initial unlock
Actual Behavior
----------------
pci-irq-affinity-agent process keeps failing
controller-0 never gets out of degraded state
Reproducibility
---------------
100% reproducible on build: 20190512T233000Z
System Configuration
--------------------
1+1 system (AIO-DX)
Internal lab name: cgcs-wildcat-69-70
Branch/Pull Time/Commit
-----------------------
BUILD_ID="20190512T233000Z"
JOB="STX_build_master_master"
BUILD_BY="starlingx.build@cengn.ca"
Last Pass
---------
20190508T233000Z
Timestamp/Logs
--------------
Attached
Test Activity
-------------
Lab install |
|
2019-05-13 20:26:22 |
Ghada Khalil |
starlingx: assignee |
|
zhipeng liu (zhipengs) |
|
2019-05-13 20:26:43 |
Ghada Khalil |
starlingx: importance |
Undecided |
High |
|
2019-05-13 20:26:55 |
Ghada Khalil |
bug |
|
|
added subscriber Bill Zvonar |
2019-05-13 20:27:07 |
Ghada Khalil |
tags |
|
stx.2.0 |
|
2019-05-13 20:28:01 |
Ghada Khalil |
tags |
stx.2.0 |
stx.2.0 stx.integ |
|
2019-05-13 20:28:51 |
Numan Waheed |
tags |
stx.2.0 stx.integ |
stx.2.0 stx.integ stx.retestneeded |
|
2019-05-14 02:07:43 |
Ghada Khalil |
starlingx: status |
New |
Triaged |
|
2019-05-14 03:08:22 |
OpenStack Infra |
starlingx: status |
Triaged |
In Progress |
|
2019-05-14 12:16:49 |
Ghada Khalil |
tags |
stx.2.0 stx.integ stx.retestneeded |
stx.2.0 stx.metal stx.retestneeded |
|
2019-05-14 12:18:01 |
Ghada Khalil |
summary |
pci-irq-affinity-agent fails to start - controller-0 stuck in degraded state after initial unlock |
All-in-one: pci-irq-affinity-agent fails to start - controller-0 stuck in degraded state after initial unlock |
|
2019-05-14 13:39:31 |
Ghada Khalil |
tags |
stx.2.0 stx.metal stx.retestneeded |
stx.2.0 stx.integ stx.retestneeded |
|
2019-05-15 15:17:06 |
Ghada Khalil |
tags |
stx.2.0 stx.integ stx.retestneeded |
stx.2.0 stx.integ stx.retestneeded stx.sanity |
|
2019-05-15 17:49:02 |
Ghada Khalil |
removed subscriber Bill Zvonar |
|
|
|
2019-05-15 21:08:35 |
OpenStack Infra |
starlingx: status |
In Progress |
Fix Released |
|
2019-06-19 16:57:57 |
Chris Winnicki |
tags |
stx.2.0 stx.integ stx.retestneeded stx.sanity |
stx.2.0 stx.integ stx.sanity |
|