Brief Description
250.001 alarm (controller-0 configuration is out-of-date) appeared on some subclouds after powering them back on
Severity
Major.
Steps to Reproduce
Deploy 1000 virtual subclouds
Install WRA on the System Controller and on the subclouds
Power off all the subclouds
Power on all the subclouds
Expected Behavior
All 1000 subclouds are back online without any alarms.
Actual Behavior
Some subclouds came back in a degraded state with a 250.001 alarm raised.
Reproducibility
Run 1: 1 out of 1000 subclouds
Run 2: 5 out of 1000 subclouds
System Configuration
Distributed Cloud (DC1000-2)
Load info (eg: 2022-03-10_20-00-07)
SW_VERSION="22.12"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="2022-11-29_22-00-05"
SRC_BUILD_ID="11"
Last Pass
NA
Timestamp/Logs
// Collect All
System Controller: /folk/cgts_logs/CGTS-41723/ALL_NODES_20221207.123222.tar
Subcloud47: /folk/cgts_logs/CGTS-41723/subcloud47_20221207.122123.tar
// Subcloud47 degraded
$ dcmanager alarm summary | grep -v OK
+--------------+-----------------+--------------+--------------+----------+----------+
| NAME         | CRITICAL_ALARMS | MAJOR_ALARMS | MINOR_ALARMS | WARNINGS | STATUS   |
+--------------+-----------------+--------------+--------------+----------+----------+
| subcloud47   | 0               | 1            | 0            | 0        | degraded |
+--------------+-----------------+--------------+--------------+----------+----------+
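With 1000 subclouds, filtering the summary with grep still leaves the table borders and header in the output. A minimal sketch of parsing the table instead is shown here. It assumes the output keeps the column names shown in this report; the helper names and the subcloud48 row are hypothetical and exist only to show an OK row being filtered out.

```python
def parse_table(text):
    """Parse a prettytable-style CLI table into a list of dicts keyed by header."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines, prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data-like row is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def non_ok_subclouds(text):
    """Return the names of subclouds whose STATUS column is not 'OK'."""
    return [row["NAME"] for row in parse_table(text) if row.get("STATUS") != "OK"]


# Sample input: the subcloud47 row is from this report, subcloud48 is hypothetical.
SAMPLE = """\
+--------------+-----------------+--------------+--------------+----------+----------+
| NAME         | CRITICAL_ALARMS | MAJOR_ALARMS | MINOR_ALARMS | WARNINGS | STATUS   |
+--------------+-----------------+--------------+--------------+----------+----------+
| subcloud47   | 0               | 1            | 0            | 0        | degraded |
| subcloud48   | 0               | 0            | 0            | 0        | OK       |
+--------------+-----------------+--------------+--------------+----------+----------+
"""

print(non_ok_subclouds(SAMPLE))
```

In practice the text would come from the captured output of `dcmanager alarm summary` on the System Controller rather than from an embedded sample.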
// 250.001 alarm
$ fm alarm-list
+----------+----------------------------------------------------------------------------------------------------------------------+-------------------+----------+----------------------+
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+----------------------------------------------------------------------------------------------------------------------+-------------------+----------+----------------------+
| 250.001 | controller-0 Configuration is out-of-date. (applied: 0db3c488-8b9d-4a73-ab13-6d7e2f1ebbe0 target: | host=controller-0 | major | 2022-12-07T09:45:16. |
| | 1e7583c2-2e50-4272-8800-f3c35e7f70ce) | | | 351124 |
[sysadmin@controller-0 ~(keystone_admin)]$ date
Wed 07 Dec 2022 12:18:01 PM UTC
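The `date` output next to the alarm time stamp shows that the alarm had not cleared by itself. A small sketch of that arithmetic follows, assuming the alarm time stamp is in UTC like the `date` output and that the wrapped time stamp cell is rejoined by hand. The function name and the one-hour threshold are illustrative choices, not part of any StarlingX tooling.

```python
from datetime import datetime, timedelta


def alarm_age(raised, now):
    """Return how long an alarm has been raised.

    raised: alarm time stamp, e.g. '2022-12-07T09:45:16.351124' (assumed UTC)
    now:    output of `date`, e.g. 'Wed 07 Dec 2022 12:18:01 PM UTC'
    """
    raised_dt = datetime.fromisoformat(raised)
    # Day and month names are parsed in the default (C/English) locale.
    now_dt = datetime.strptime(now, "%a %d %b %Y %I:%M:%S %p %Z")
    return now_dt - raised_dt


# Values taken from the transcript in this report.
age = alarm_age("2022-12-07T09:45:16.351124", "Wed 07 Dec 2022 12:18:01 PM UTC")
print(age)

# Hypothetical threshold for calling the alarm "stuck".
print("stuck" if age > timedelta(hours=1) else "recent")
```

For subcloud47 this gives an age of roughly two and a half hours, which is consistent with the alarm persisting until a workaround is applied.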
// target UUID grep in sysinv.log
$ grep "1e7583c2-2e50-4272-8800-f3c35e7f70ce" /var/log/sysinv.log
sysinv 2022-12-07 09:45:16.175 73386 INFO sysinv.conductor.manager [-] _config_update_hosts personalities=['controller'] host_uuids=['b8ed3bbe-5d82-49e1-b61b-d488e520518c'] reboot=False config_uuid=1e7583c2-2e50-4272-8800-f3c35e7f70ce tb= File "/usr/lib/python3/dist-packages/sysinv/conductor/manager.py", line 5826, in _controller_config_active_apply
sysinv 2022-12-07 09:45:16.224 73386 INFO sysinv.conductor.manager [-] Setting config target of host 'controller-0' to '1e7583c2-2e50-4272-8800-f3c35e7f70ce'.
sysinv 2022-12-07 09:45:16.347 73386 WARNING sysinv.conductor.manager [-] controller-0: iconfig out of date: target 1e7583c2-2e50-4272-8800-f3c35e7f70ce, applied 0db3c488-8b9d-4a73-ab13-6d7e2f1ebbe0
sysinv 2022-12-07 09:45:16.349 73386 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 0db3c488-8b9d-4a73-ab13-6d7e2f1ebbe0 vs. target: 1e7583c2-2e50-4272-8800-f3c35e7f70ce.
sysinv 2022-12-07 09:45:16.439 73386 INFO sysinv.conductor.manager [-] _config_update_hosts config_uuid=1e7583c2-2e50-4272-8800-f3c35e7f70ce
sysinv 2022-12-07 09:45:16.470 73386 INFO sysinv.conductor.manager [-] applying runtime manifest config_uuid=1e7583c2-2e50-4272-8800-f3c35e7f70ce, classes: ['openstack::keystone::endpoint::runtime', 'platform::firewall::runtime']
sysinv 2022-12-07 09:45:16.553 73386 INFO sysinv.puppet.puppet [-] Updating hiera for host: controller-0 with config_uuid: 1e7583c2-2e50-4272-8800-f3c35e7f70ce
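When the same check has to be repeated across several degraded subclouds, the applied and target UUIDs can be pulled out of sysinv.log programmatically. The sketch below is a rough illustration only: the regular expression is derived from the single "iconfig out of date" warning quoted in this report and is an assumption about the line format, and the function name is hypothetical.

```python
import re

# Pattern inferred from the one warning line in this report; not a documented format.
OUT_OF_DATE = re.compile(
    r"^sysinv (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*?"
    r"(?P<host>\S+): iconfig out of date: "
    r"target (?P<target>[0-9a-f-]{36}), applied (?P<applied>[0-9a-f-]{36})"
)


def find_config_mismatches(log_text):
    """Return one dict per log line where the applied config differs from the target."""
    mismatches = []
    for line in log_text.splitlines():
        match = OUT_OF_DATE.match(line)
        if match and match.group("target") != match.group("applied"):
            mismatches.append(match.groupdict())
    return mismatches


# The warning line from the subcloud47 log, split only for readability.
SAMPLE_LOG = (
    "sysinv 2022-12-07 09:45:16.347 73386 WARNING sysinv.conductor.manager [-] "
    "controller-0: iconfig out of date: target 1e7583c2-2e50-4272-8800-f3c35e7f70ce, "
    "applied 0db3c488-8b9d-4a73-ab13-6d7e2f1ebbe0\n"
)

for hit in find_config_mismatches(SAMPLE_LOG):
    print(hit["ts"], hit["host"], "applied", hit["applied"], "target", hit["target"])
```

The extracted target UUID can then be used as the grep key, as was done manually for subcloud47.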
// Config_applied
root@controller-0:/var/home/sysadmin# cat /etc/platform/.config_applied
Alarms
Subcloud was free of alarms after the deployment
Test Activity
Scalability Testing
Workaround
Lock and then unlock controller-0 on the affected subcloud
Fixed By: https://review.opendev.org/c/starlingx/config/+/887646