Brief Description
-----------------
In a DX system, after a forced reboot of the active controller, alarm "270.001 Host controller-0 compute services failure, failed to get openstack token from keystone" was raised and never cleared.
Severity
--------
Major
Steps to Reproduce
------------------
Force reboot the active controller: sudo reboot -f
TC-name: mtc/test_evacuate.py::TestTisGuest::test_evacuate_vms
Expected Behavior
------------------
The 270.001 alarm should eventually be cleared once controller-0 has recovered from the reboot.
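
The expected behavior could be checked with a polling loop along these lines. This is only a sketch: wait_for_alarm_clear and list_alarm_ids are hypothetical names, not part of the test framework, and the 600 second timeout is an assumed value, not a documented recovery time.

```python
import time
from typing import Callable, Iterable


def wait_for_alarm_clear(
    list_alarm_ids: Callable[[], Iterable[str]],
    alarm_id: str = "270.001",
    timeout: float = 600.0,   # assumed value, tune for the lab
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until alarm_id is no longer active.

    list_alarm_ids is a hypothetical callable that returns the Alarm ID
    column of the current 'fm alarm-list' output.
    Returns True if the alarm cleared, False if the timeout expired.
    """
    deadline = clock() + timeout
    while True:
        if alarm_id not in set(list_alarm_ids()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Passing sleep and clock as parameters keeps the helper testable without real delays.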
Actual Behavior
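----------------
The 270.001 alarm was raised at 2019-05-24T15:07:30, shortly after the reboot, and was still listed in the alarm-list output taken at 16:04:45, about 57 minutes later.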
Reproducibility
---------------
Intermittent
System Configuration
--------------------
Two-node system
Lab-name: WP_1-2
Branch/Pull Time/Commit
-----------------------
stx master as of 2019-05-23_18-37-00
Last Pass
---------
2019-05-18_06-36-50
Timestamp/Logs
--------------
[2019-05-24 14:57:16,230] 262 DEBUG MainThread ssh.send :: Send 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne alarm-list --nowrap --uuid'
[2019-05-24 14:57:18,716] 387 DEBUG MainThread ssh.expect :: Output:
+--------------------------------------+----------+------------------------------------------------------------------------+--------------------------------------------------------------------------+----------+----------------------------+
| UUID | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+--------------------------------------+----------+------------------------------------------------------------------------+--------------------------------------------------------------------------+----------+----------------------------+
| 4332dfc7-86f4-4de8-a5bf-bad51b6838a1 | 400.001 | Service group cloud-services warning; dbmon(enabled-active, ) | service_domain=controller.service_group=cloud-services.host=controller-0 | minor | 2019-05-24T14:57:12.373192 |
| a13d78a1-0f5e-4603-a94f-e6f739edbeef | 400.001 | Service group cloud-services warning; dbmon(enabled-standby, ) | service_domain=controller.service_group=cloud-services.host=controller-1 | minor | 2019-05-24T14:54:59.746219 |
| 95f73f81-cc23-4cf3-83c5-06a0ad586a06 | 100.114 | NTP configuration does not contain any valid or reachable NTP servers. | host=controller-0.ntp | major | 2019-05-24T14:14:18.881022 |
| e6b48246-2c3b-472a-8285-71fcfe615d7a | 200.010 | controller-0 access to board management module has failed. | host=controller-0 | warning | 2019-05-24T14:08:47.419154 |
+--------------------------------------+----------+------------------------------------------------------------------------+--------------------------------------------------------------------------+----------+----------------------------+
[wrsroot@controller-0 ~(keystone_admin)]$
[wrsroot@controller-0 ~(keystone_admin)]$
[2019-05-24 15:06:15,644] 139 INFO MainThread host_helper.reboot_hosts:: Rebooting active controller: controller-0
[2019-05-24 15:06:15,644] 262 DEBUG MainThread ssh.send :: Send 'sudo reboot -f'
[2019-05-24 15:13:02,193] 262 DEBUG MainThread ssh.send :: Send 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne alarm-list --nowrap --uuid'
[2019-05-24 15:13:04,267] 387 DEBUG MainThread ssh.expect :: Output:
+--------------------------------------+----------+-----------------------------------------------------------------------------------------+------------------------------------+----------+----------------------------+
| UUID | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+--------------------------------------+----------+-----------------------------------------------------------------------------------------+------------------------------------+----------+----------------------------+
| 51afc9b4-91ef-4d28-a29e-24c39b0b5145 | 200.010 | controller-0 access to board management module has failed. | host=controller-0 | warning | 2019-05-24T15:08:34.204058 |
| 2bc24217-3f96-46fd-9687-5f88f17b6aaa | 270.001 | Host controller-0 compute services failure, failed to get openstack token from keystone | host=controller-0.services=compute | critical | 2019-05-24T15:07:30.646680 |
| 95f73f81-cc23-4cf3-83c5-06a0ad586a06 | 100.114 | NTP configuration does not contain any valid or reachable NTP servers. | host=controller-0.ntp | major | 2019-05-24T14:14:18.881022 |
+--------------------------------------+----------+-----------------------------------------------------------------------------------------+------------------------------------+----------+----------------------------+
controller-1:~$
[2019-05-24 16:04:45,891] 262 DEBUG MainThread ssh.send :: Send 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne alarm-list --nowrap --uuid'
[2019-05-24 16:04:48,281] 387 DEBUG MainThread ssh.expect :: Output:
+--------------------------------------+----------+-----------------------------------------------------------------------------------------+--------------------------------------------------------------------------+----------+----------------------------+
| UUID | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+--------------------------------------+----------+-----------------------------------------------------------------------------------------+--------------------------------------------------------------------------+----------+----------------------------+
| 8c9b9e65-c341-428d-bf74-fb6601ef6618 | 400.001 | Service group cloud-services warning; dbmon(enabled-active, ) | service_domain=controller.service_group=cloud-services.host=controller-1 | minor | 2019-05-24T16:04:23.328927 |
| 51afc9b4-91ef-4d28-a29e-24c39b0b5145 | 200.010 | controller-0 access to board management module has failed. | host=controller-0 | warning | 2019-05-24T15:08:34.204058 |
| 2bc24217-3f96-46fd-9687-5f88f17b6aaa | 270.001 | Host controller-0 compute services failure, failed to get openstack token from keystone | host=controller-0.services=compute | critical | 2019-05-24T15:07:30.646680 |
| 95f73f81-cc23-4cf3-83c5-06a0ad586a06 | 100.114 | NTP configuration does not contain any valid or reachable NTP servers. | host=controller-0.ntp | major | 2019-05-24T14:14:18.881022 |
+--------------------------------------+----------+-----------------------------------------------------------------------------------------+--------------------------------------------------------------------------+----------+----------------------------+
controller-1:~$
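
To quantify how long the alarm has been active, the table output above can be parsed and the alarm's Time Stamp compared with the time of the query. The sketch below assumes the pipe-delimited layout shown in these logs; parse_alarm_table and alarm_age_seconds are hypothetical helper names, and the sketch assumes the log time and the alarm Time Stamp use the same clock and time zone.

```python
from datetime import datetime
from typing import Dict, List, Optional


def parse_alarm_table(output: str) -> List[Dict[str, str]]:
    """Turn pipe-delimited 'fm alarm-list' output into a list of dicts.

    The first row starting with '|' is taken as the header; border lines
    ('+----+') and shell prompts are skipped.
    """
    header = None
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        elif len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def alarm_age_seconds(
    rows: List[Dict[str, str]], alarm_id: str, now: datetime
) -> Optional[float]:
    """Return seconds since alarm_id was raised, or None if it is not listed."""
    for row in rows:
        if row.get("Alarm ID") == alarm_id:
            raised = datetime.fromisoformat(row["Time Stamp"])
            return (now - raised).total_seconds()
    return None
```

For the 16:04:45 query above, this gives an age of roughly 57 minutes for alarm 270.001.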
Test Activity
-------------
Sanity
Is this a case of a stale alarm or is the compute service in a failed state?