Comment 0 for bug 1928333

Bart Wensley (bartwensley) wrote :

Brief Description
-----------------
The dcmanager audit runs both the patch audit and the load audit for each subcloud roughly every 30 seconds, instead of every 15 minutes as intended. The issue is triggered as soon as the first patch is applied in the System Controller region. Here is a sample from DC-2:

2021-05-12 04:37:57.314 572129 INFO dcmanager.audit.patch_audit [-] Triggered patch audit for subcloud: subcloud66.
2021-05-12 04:37:58.783 572129 INFO dcmanager.audit.patch_audit [-] Auditing load of subcloud subcloud66
2021-05-12 04:37:58.838 572129 INFO dcmanager.audit.patch_audit [-] Patch audit completed for subcloud: subcloud66.
2021-05-12 04:38:28.846 572129 INFO dcmanager.audit.patch_audit [-] Triggered patch audit for subcloud: subcloud66.
2021-05-12 04:38:30.475 572129 INFO dcmanager.audit.patch_audit [-] Auditing load of subcloud subcloud66
2021-05-12 04:38:30.547 572129 INFO dcmanager.audit.patch_audit [-] Patch audit completed for subcloud: subcloud66.
2021-05-12 04:39:00.763 572129 INFO dcmanager.audit.patch_audit [-] Triggered patch audit for subcloud: subcloud66.
2021-05-12 04:39:02.172 572129 INFO dcmanager.audit.patch_audit [-] Auditing load of subcloud subcloud66
2021-05-12 04:39:02.219 572129 INFO dcmanager.audit.patch_audit [-] Patch audit completed for subcloud: subcloud66.
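
The interval between audits can be checked directly from the log. The following is a minimal sketch, not part of dcmanager: the function name `trigger_gaps` and the regular expression are assumptions made for illustration, and the input is the three "Triggered patch audit" lines from the sample above.

```python
import re
from datetime import datetime

# The three trigger lines copied from the DC-2 sample above.
LOG = """\
2021-05-12 04:37:57.314 572129 INFO dcmanager.audit.patch_audit [-] Triggered patch audit for subcloud: subcloud66.
2021-05-12 04:38:28.846 572129 INFO dcmanager.audit.patch_audit [-] Triggered patch audit for subcloud: subcloud66.
2021-05-12 04:39:00.763 572129 INFO dcmanager.audit.patch_audit [-] Triggered patch audit for subcloud: subcloud66.
"""

# Hypothetical pattern: captures the timestamp and the subcloud name.
TRIGGER = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"Triggered patch audit for subcloud: (\S+)\.$"
)


def trigger_gaps(text):
    """Return {subcloud: [seconds between consecutive audit triggers]}."""
    last_seen = {}
    gaps = {}
    for line in text.splitlines():
        match = TRIGGER.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        name = match.group(2)
        if name in last_seen:
            delta = (stamp - last_seen[name]).total_seconds()
            gaps.setdefault(name, []).append(delta)
        last_seen[name] = stamp
    return gaps


if __name__ == "__main__":
    for name, deltas in trigger_gaps(LOG).items():
        print(name, [round(d, 3) for d in deltas])
```

For the sample above, both gaps come out at a little under 32 seconds, far short of the intended 900 seconds.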

With a large number of subclouds in the system, this contributes to flooding the logs (see CGTS-24652) and results in excessive messaging between the system controller and the subclouds: about 30x the expected messages in steady state, since a 15 minute interval divided by a 30 second interval is 30.

Severity
--------
Major: System/Feature is usable but degraded

Steps to Reproduce
------------------
1. Install a distributed cloud system.
2. Import and apply a patch in the system controller region.

Expected Behavior
-----------------
After the patch is applied, a patch audit should be triggered once for all subclouds. After that audit is completed, the audit interval should go back to 15 minutes.
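
The expected behavior amounts to a one-shot trigger on top of a periodic audit. The sketch below is only an illustration of that idea and is not the dcmanager code: the class `PatchAuditScheduler`, its methods, the `simulate` helper and the 30 second loop tick are all assumptions (the tick is inferred from the log spacing above).

```python
class PatchAuditScheduler:
    """Hypothetical scheduler; illustrates the expected behavior only."""

    def __init__(self, interval_s=900):
        self.interval_s = interval_s   # intended 15 minute audit interval
        self.last_audit_s = None       # time of the most recent audit
        self.force_audit = False       # set when a patch is applied

    def request_audit(self):
        """Ask for one extra audit, e.g. after a patch is applied."""
        self.force_audit = True

    def should_audit(self, now_s):
        """Decide whether to audit on this loop tick."""
        due = (self.last_audit_s is None
               or now_s - self.last_audit_s >= self.interval_s)
        if not (due or self.force_audit):
            return False
        # Clear the request so it fires once; leaving it set would cause
        # an audit on every tick, which matches the reported symptom.
        self.force_audit = False
        self.last_audit_s = now_s
        return True


def simulate(scheduler, tick_s, duration_s, patch_applied_at_s):
    """Run the loop on a simulated clock and return the audit times."""
    audit_times = []
    now_s = 0
    while now_s <= duration_s:
        if now_s == patch_applied_at_s:
            scheduler.request_audit()
        if scheduler.should_audit(now_s):
            audit_times.append(now_s)
        now_s += tick_s
    return audit_times


if __name__ == "__main__":
    # 30 minutes of simulated time, patch applied at t=60s.
    print(simulate(PatchAuditScheduler(), 30, 1800, 60))  # [0, 60, 960]
```

In this simulation the audit runs at startup, once more when the patch is applied, and then not again until 15 minutes after that, which is the behavior described above.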

Actual Behavior
---------------
The audit interval drops to about 30 seconds permanently and never returns to 15 minutes.

Reproducibility
---------------
100% Reproducible

System Configuration
--------------------
Distributed Cloud

Branch/Pull Time/Commit
-----------------------
SW_VERSION="21.05"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="2021-05-11_00-00-06"
SRC_BUILD_ID="26"
BUILD_DATE="2021-05-11 00:02:31 -0400"

Last Pass
---------
This was broken by the following commit (February 25, 2021):
https://review.opendev.org/c/starlingx/distcloud/+/769216

Timestamp/Logs
--------------
See above

Test Activity
-------------
Developer Testing

Workaround
----------
None