DCManager audit backoff to avoid flooding with RPC messages

Bug #1979335 reported by Li Zhu
This bug affects 1 person
Affects    Status       Importance  Assigned to  Milestone
StarlingX  In Progress  Undecided   Unassigned

Bug Description

Brief Description
-----------------
To avoid flooding the system with RPC messages during massive outages or
any long-lasting failure, the DC audit manager scales back subcloud
auditing by limiting the number of retries for failures within a patch
audit cycle (the cycle on which all other types of audit cycles are based).
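
The retry limit described above can be sketched as follows. This is only an
illustration, not the actual distcloud change: the function names
(audit_with_retry_cap, run_audit_cycle), the send_audit_rpc callback and the
max_retries parameter are all hypothetical, and the real audit manager may
track failures differently.

```python
from typing import Callable, Dict, Iterable


def audit_with_retry_cap(
    subcloud: str,
    send_audit_rpc: Callable[[str], bool],
    max_retries: int,
) -> int:
    """Audit one subcloud and return the number of RPC messages sent.

    Hypothetical helper: one initial attempt plus at most `max_retries`
    retries, stopping as soon as an attempt succeeds.
    """
    sent = 0
    for _ in range(1 + max_retries):
        sent += 1
        if send_audit_rpc(subcloud):
            break
    return sent


def run_audit_cycle(
    subclouds: Iterable[str],
    send_audit_rpc: Callable[[str], bool],
    max_retries: int = 2,
) -> Dict[str, int]:
    """Run one (hypothetical) audit cycle; map each subcloud to RPCs sent."""
    return {
        name: audit_with_retry_cap(name, send_audit_rpc, max_retries)
        for name in subclouds
    }
```

With such a cap, a subcloud that stays unreachable costs at most
1 + max_retries RPC messages per cycle, so the total RPC and log volume stays
bounded during a long outage instead of growing with every retry.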

Severity
--------
Minor

Steps to Reproduce
------------------
1. Install Distributed Cloud with a large number of subclouds
2. Trigger a long-lasting failure

Expected Behavior
------------------
RPC messages are not congested, and audit logs do not rotate within a
short amount of time.
Audit performance is not impacted.

Actual Behavior
----------------
RPC messages and audit logs are congested.
Audit performance is impacted.

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Distributed Cloud

Branch/Pull Time/Commit
-----------------------
StarlingX master from 2022-06-21

Last Pass
---------
N/A

Timestamp/Logs
--------------
N/A

Test Activity
-------------
Developer Testing

Workaround
----------
N/A

Li Zhu (lzhu1)
description: updated
description: updated
Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil)
tags: added: stx.distcloud
OpenStack Infra (hudson-openstack) wrote : Change abandoned on distcloud (master)

Change abandoned by "Li Zhu <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/distcloud/+/846626
Reason: redo it later
