Brief Description
-----------------
The following issue was observed in a distributed cloud configuration. The /var/log partition filled up because a large number of deleted files were still held open by filebeat, so their disk space was never released.
Severity
--------
Critical
Steps to Reproduce
------------------
Set up a large distributed cloud with stx-monitor applied and soak it for a few days with test activities such as deploying, managing/unmanaging and removing subclouds.
Expected Behavior
------------------
Service logs are saved to disk and rotated accordingly.
Actual Behavior
----------------
The logmgmt process was consuming excessive CPU and no logs were flushed to disk. Log files were rotated rapidly with almost no content, and critical alarms were generated.
The problem documented here (courtesy of Al Bailey) might be the cause of this issue:
https://www.elastic.co/guide/en/beats/filebeat/master/faq-deleted-files-are-not-freed.html
Reproducibility
---------------
Seen once
System Configuration
--------------------
IPv6 Distributed Cloud
Branch/Pull Time/Commit
-----------------------
Feb 22 master code
Last Pass
---------
N/A
Timestamp/Logs
--------------
As logs were not flushed to disk, there are no service logs to attach.
See the attached list of deleted files, produced by running the command "sudo lsof | grep deleted".
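To see how much space each deleted-but-open file is holding, something along the following lines may help. It is a minimal sketch and not part of the original report: it assumes Linux with GNU stat, and the function name deleted_open_files is made up for illustration. It reads the /proc/<pid>/fd symlinks, whose targets end in " (deleted)" once a file has been unlinked, instead of parsing lsof output.

```shell
#!/bin/sh
# Sketch only: list deleted-but-still-open files under a directory prefix.
# Output is one line per open descriptor: PID <TAB> size in bytes <TAB> path.
# Linux only; run as root to see descriptors owned by other users.
deleted_open_files() {
    prefix="${1:-/var/log}"
    for fd in /proc/[0-9]*/fd/*; do
        # Skip descriptors that vanish or are not readable.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$prefix"/*" (deleted)")
                # Extract the PID from /proc/<pid>/fd/<n>.
                pid=${fd#/proc/}
                pid=${pid%%/*}
                # -L follows the fd link to the still-open inode.
                size=$(stat -L -c %s "$fd" 2>/dev/null) || continue
                printf '%s\t%s\t%s\n' "$pid" "$size" "$target"
                ;;
        esac
    done
}
```

For example, running `deleted_open_files /var/log | sort -t "$(printf '\t')" -k2,2nr | head` as root would show the largest entries first. The same file can appear more than once if several processes hold it open.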
Test Activity
-------------
Evaluation
Workaround
----------
Kill the logmgmt process and delete the filebeat pods.
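The workaround could be scripted roughly as follows. This is a hedged sketch: the namespace "monitor" and the label selector "app=filebeat" are assumptions that do not come from this report, so check them against the actual deployment (for example with "kubectl get pods --all-namespaces | grep filebeat") before use. The helper names run and apply_workaround are likewise illustrative.

```shell
#!/bin/sh
# Assumptions (not from the report): namespace and label selector of the filebeat pods.
NAMESPACE="${NAMESPACE:-monitor}"
SELECTOR="${SELECTOR:-app=filebeat}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

apply_workaround() {
    # Stop the CPU-bound logmgmt process; -f matches against the full command line.
    run sudo pkill -f logmgmt
    # Delete the filebeat pods so their open handles on deleted log files are released.
    run kubectl delete pods -n "$NAMESPACE" -l "$SELECTOR"
}
```

Setting DRY_RUN=1 before calling apply_workaround prints the two commands without executing them, which is a reasonable first step on a production system.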