Comment 0 for bug 1828877

Chris Winnicki (chriswinnicki) wrote : pci-irq-affinity-agent fails to start - controller-0 stuck in degraded state after initial unlock

Brief Description
-----------------
pci-irq-affinity-agent fails to start - controller-0 stuck in degraded state after initial unlock

pmond continuously reports the following (snippet from /var/log/pmond.log):

2019-05-13T18:34:52.837 [93604.05081] controller-0 pmond mon pmonFsm.cpp ( 565) pmon_passive_handler : Info : pci-irq-affinity-agent stability period (20 secs)
2019-05-13T18:34:52.837 [93604.05082] controller-0 pmond mon pmonHdlr.cpp (1003) process_running : Info : pci-irq-affinity-agent process not running
2019-05-13T18:34:52.837 [93604.05083] controller-0 pmond mon pmonHdlr.cpp (1305) respawn_process : Info : pci-irq-affinity-agent Spawn (1200886)
2019-05-13T18:34:53.837 [93604.05084] controller-0 pmond mon pmonHdlr.cpp ( 897) want_degrade_clear : Warn : pci-irq-affinity-agent is still failed 'major' ; degrade assert
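
The respawn loop shown above can be quantified over a full pmond.log with a small script. This is only a rough sketch: the line pattern is inferred from the four lines quoted in this report and may not cover every pmond log variant, and the names PMOND_LINE, summarize and SAMPLE are hypothetical helpers, not part of pmond or StarlingX.

```python
import re
from collections import Counter

# Pattern inferred from the four pmond.log lines quoted in this report
# (assumption: other pmond lines follow the same layout).
PMOND_LINE = re.compile(
    r"^(?P<ts>\S+)\s+\[\S+\]\s+(?P<host>\S+)\s+pmond\s+\S+\s+(?P<src>\S+)\s+"
    r"\(\s*(?P<lineno>\d+)\)\s+(?P<func>\S+)\s*:\s*(?P<level>\w+)\s*:\s*(?P<msg>.*)$"
)


def summarize(lines, process):
    """Count log entries for one process, keyed by (function, level)."""
    counts = Counter()
    for line in lines:
        m = PMOND_LINE.match(line.strip())
        # Keep only messages that start with the monitored process name.
        if m and m.group("msg").startswith(process):
            counts[(m.group("func"), m.group("level"))] += 1
    return counts


# The four lines from the snippet above, used as sample input.
SAMPLE = [
    "2019-05-13T18:34:52.837 [93604.05081] controller-0 pmond mon pmonFsm.cpp ( 565) pmon_passive_handler : Info : pci-irq-affinity-agent stability period (20 secs)",
    "2019-05-13T18:34:52.837 [93604.05082] controller-0 pmond mon pmonHdlr.cpp (1003) process_running : Info : pci-irq-affinity-agent process not running",
    "2019-05-13T18:34:52.837 [93604.05083] controller-0 pmond mon pmonHdlr.cpp (1305) respawn_process : Info : pci-irq-affinity-agent Spawn (1200886)",
    "2019-05-13T18:34:53.837 [93604.05084] controller-0 pmond mon pmonHdlr.cpp ( 897) want_degrade_clear : Warn : pci-irq-affinity-agent is still failed 'major' ; degrade assert",
]

if __name__ == "__main__":
    for (func, level), n in summarize(SAMPLE, "pci-irq-affinity-agent").items():
        print(f"{func:24s} {level:5s} {n}")
```

To run it against the real log, pass the lines of /var/log/pmond.log instead of SAMPLE; a steadily growing respawn_process count alongside want_degrade_clear warnings would match the behavior described here.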

controller-0 is stuck in the degraded state:
[wrsroot@controller-0 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | degraded |
+----+--------------+-------------+----------------+-------------+--------------+

Alarm snippet (fm alarm-list):
[wrsroot@controller-0 ~(keystone_admin)]$ fm alarm-list
+-------+------------------------------------------------------------------------------------+--------------------------------------+----------+----------------+
| Alarm | Reason Text | Entity ID | Severity | Time Stamp |
| ID | | | | |
+-------+------------------------------------------------------------------------------------+--------------------------------------+----------+----------------+
| 200. | controller-0 is degraded due to the failure of its 'pci-irq-affinity-agent' | host=controller-0.process=pci-irq- | major | 2019-05-13T16: |
| 006 | process. Auto recovery of this major process is in progress. | affinity-agent | | 40:46.408005 |
| | | | | |
+-------+------------------------------------------------------------------------------------+--------------------------------------+----------+----------------+
[wrsroot@controller-0 ~(keystone_admin)]$ date
Mon May 13 18:43:31 UTC 2019

Severity
--------
Major: System cannot be fully installed

Steps to Reproduce
------------------
Install controller-0 in All-in-one duplex (AIO-DX) mode

Expected Behavior
-----------------
controller-0 should not remain in the degraded state after the initial unlock

Actual Behavior
---------------
The pci-irq-affinity-agent process keeps failing and is repeatedly respawned by pmond
controller-0 never leaves the degraded state

Reproducibility
---------------
100% reproducible on build: 20190512T233000Z

System Configuration
--------------------
1+1 system (AIO-DX)
Internal lab name: cgcs-wildcat-69-70

Branch/Pull Time/Commit
-----------------------
BUILD_ID="20190512T233000Z"
JOB="STX_build_master_master"
<email address hidden>"

Last Pass
---------
20190508T233000Z

Timestamp/Logs
--------------
Attached

Test Activity
-------------
Lab install