BUILD_ID="20190622T013000Z"
SYSTEM_NAME="yow-cgcs-wolfpass-03_07" (2+3 HW system)
VSWITCH_TYPE="ovs-dpdk"

[Failed in teardown in nova regression testcase nova/test_force_lock_with_vms.py::test_force_lock_with_mig_vms]

$ system application-list
+---------------------+--------------------------------+-------------------------------+--------------------+---------+-----------+
| application         | version                        | manifest name                 | manifest file      | status  | progress  |
+---------------------+--------------------------------+-------------------------------+--------------------+---------+-----------+
| platform-integ-apps | 1.0-7                          | platform-integration-manifest | manifest.yaml      | applied | completed |
| stx-openstack       | 1.0-16-centos-stable-versioned | armada-manifest               | stx-openstack.yaml | applied | completed |
+---------------------+--------------------------------+-------------------------------+--------------------+---------+-----------+

1. All instances that were running on compute-0 landed on compute-1 (when compute-0 was force locked at approx 2019-06-25 14:18:30).

The nova hypervisor reverts to down and disabled on compute-0:

$ nova hypervisor-list
+--------------------------------------+---------------------+-------+----------+
| ID                                   | Hypervisor hostname | State | Status   |
+--------------------------------------+---------------------+-------+----------+
| a8aea911-05a5-4410-9f39-08630783d373 | compute-0           | down  | disabled |
| ba4344e6-5d97-4b42-a67b-eabbc95b531b | compute-1           | up    | enabled  |
| a612e8ad-5026-4d9e-b103-339b6d94eefe | compute-2           | up    | enabled  |
+--------------------------------------+---------------------+-------+----------+
[sysadmin@controller-0 ~(keystone_admin)]$ date
Tue Jun 25 14:22:00 UTC 2019

compute-0 is unlocked at approx 2019-06-25 14:24:15.

2 alarms are reported after this (from Horizon). Only the CPU alarm on compute-1 clears.

200.006  compute-0 is degraded due to the failure of its 'pci-irq-affinity-agent' process. Auto recovery of this major process is in progress.
         host=compute-0.process=pci-irq-affinity-agent  major  2019-06-25T10:28:17
100.101  Platform CPU threshold exceeded ; threshold 90.00%, actual 93.43%
         host=compute-1  major  2019-06-25T10:33:11

+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname     | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1  | controller-0 | controller  | unlocked       | enabled     | available    |
| 2  | controller-1 | controller  | unlocked       | enabled     | available    |
| 3  | compute-0    | worker      | unlocked       | disabled    | intest       |
| 4  | compute-1    | worker      | unlocked       | enabled     | degraded     |
| 5  | compute-2    | worker      | unlocked       | enabled     | available    |
+----+--------------+-------------+----------------+-------------+--------------+
$ date
Tue Jun 25 14:28:15 UTC 2019

$ nova hypervisor-list; date
+--------------------------------------+---------------------+-------+---------+
| ID                                   | Hypervisor hostname | State | Status  |
+--------------------------------------+---------------------+-------+---------+
| a8aea911-05a5-4410-9f39-08630783d373 | compute-0           | down  | enabled |
| ba4344e6-5d97-4b42-a67b-eabbc95b531b | compute-1           | up    | enabled |
| a612e8ad-5026-4d9e-b103-339b6d94eefe | compute-2           | up    | enabled |
+--------------------------------------+---------------------+-------+---------+
Tue Jun 25 14:41:14 UTC 2019

$ nova hypervisor-list; date
+--------------------------------------+---------------------+-------+---------+
| ID                                   | Hypervisor hostname | State | Status  |
+--------------------------------------+---------------------+-------+---------+
| a8aea911-05a5-4410-9f39-08630783d373 | compute-0           | down  | enabled |
| ba4344e6-5d97-4b42-a67b-eabbc95b531b | compute-1           | up    | enabled |
| a612e8ad-5026-4d9e-b103-339b6d94eefe | compute-2           | up    | enabled |
+--------------------------------------+---------------------+-------+---------+
Tue Jun 25 15:05:37 UTC 2019

Result
The pci-irq-affinity-agent alarm on compute-0 did not clear long after the compute was unlocked.
The hypervisor on compute-0 remained in state down (status enabled).
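
The repeated manual `nova hypervisor-list; date` checks above could be scripted when re-running the test. The following is only a rough sketch, not part of the test suite: it parses the ASCII table that the CLI prints and lists any hypervisor whose State is not "up". The helper names `parse_cli_table` and `down_hypervisors` are hypothetical, and the sample text is copied from the output captured at 14:41:14 UTC.

```python
# Sketch only: parse_cli_table and down_hypervisors are hypothetical helpers,
# not functions from novaclient or the regression framework.

SAMPLE = """\
+--------------------------------------+---------------------+-------+---------+
| ID                                   | Hypervisor hostname | State | Status  |
+--------------------------------------+---------------------+-------+---------+
| a8aea911-05a5-4410-9f39-08630783d373 | compute-0           | down  | enabled |
| ba4344e6-5d97-4b42-a67b-eabbc95b531b | compute-1           | up    | enabled |
| a612e8ad-5026-4d9e-b103-339b6d94eefe | compute-2           | up    | enabled |
+--------------------------------------+---------------------+-------+---------+
"""


def parse_cli_table(text):
    """Turn a '+---+' bordered CLI table into a list of row dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines, prompts, timestamps, blanks
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first '|' line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def down_hypervisors(text):
    """Return hostnames of hypervisors whose State column is not 'up'."""
    return [
        row["Hypervisor hostname"]
        for row in parse_cli_table(text)
        if row.get("State") != "up"
    ]


if __name__ == "__main__":
    print(down_hypervisors(SAMPLE))
```

In practice the text would come from the captured CLI output rather than a hard-coded string. Run in a loop with a timestamp, this would show how long compute-0 stays down after the unlock.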