kubernetes api-server event time to live needs to be increased
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | David Sullivan |
Bug Description
Brief Description
-----------------
The Kubernetes API server event time to live (TTL) defaults to 1 hour. This setting controls how long an event remains in the system before it is deleted. With the default value, the end user cannot query events older than 1 hour. For debugging purposes, 1 hour is too short to get a clear picture of what system transitions may have occurred.
The retention period can be controlled with a kube-apiserver option (--event-ttl).
Severity
--------
Minor, but impacts system debuggability.
Steps to Reproduce
------------------
Run "kubectl get events" on a system that has been up for more than 1 hour and observe that no events older than 1 hour are returned.
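
The observation above can be made concrete by computing the age of the oldest retained event from the output of "kubectl get events --all-namespaces -o json". The following is a minimal sketch and not part of the original report: the helper name oldest_event_age is hypothetical, and it assumes each event carries either lastTimestamp or metadata.creationTimestamp in YYYY-MM-DDTHH:MM:SSZ form.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumed timestamp layout for lastTimestamp / creationTimestamp.
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def _parse(ts):
    """Parse a UTC timestamp string into an aware datetime."""
    return datetime.strptime(ts, TS_FORMAT).replace(tzinfo=timezone.utc)


def oldest_event_age(events_json, now=None):
    """Return the age of the oldest event as a timedelta, or None if there are no events.

    events_json is the text produced by `kubectl get events -o json`.
    """
    now = now or datetime.now(timezone.utc)
    items = json.loads(events_json).get("items", [])
    stamps = []
    for event in items:
        # Prefer lastTimestamp; fall back to the object's creation time.
        ts = event.get("lastTimestamp") or event.get("metadata", {}).get(
            "creationTimestamp"
        )
        if ts:
            stamps.append(_parse(ts))
    if not stamps:
        return None
    return now - min(stamps)
```

With the default TTL, the returned age would be expected to stay at or below roughly 1 hour no matter how long the system has been running.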
Expected Behavior
------------------
We should provide a longer retention period to allow time to gather system information following a critical issue. I suggest a 24-hour retention as a better alternative, but that decision is subject to testing to determine the storage impact of persisting events for 24 hours in a large system.
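
As a rough way to frame the storage question before testing, the steady-state volume of retained events can be approximated as event rate × TTL × average event size. The sketch below is only a back-of-the-envelope illustration: the function name is hypothetical, and the event rate and average size are placeholder assumptions, not measurements from a StarlingX system. It also ignores etcd overhead and the fact that repeated events may be aggregated into a single object.

```python
def retained_event_bytes(events_per_second, ttl_hours, avg_event_bytes):
    """Estimate the bytes of event objects held at steady state for a given TTL."""
    retained_events = events_per_second * ttl_hours * 3600
    return retained_events * avg_event_bytes


# Placeholder inputs (assumptions, to be replaced with measured values).
RATE = 5.0        # new events per second across the cluster
AVG_SIZE = 1024   # average serialized event size in bytes

one_hour = retained_event_bytes(RATE, 1, AVG_SIZE)
one_day = retained_event_bytes(RATE, 24, AVG_SIZE)

print(f"1h TTL:  {one_hour / 2**20:.1f} MiB")
print(f"24h TTL: {one_day / 2**20:.1f} MiB")
```

Under this simple model the retained volume scales linearly with the TTL, so moving from 1 hour to 24 hours multiplies it by 24. Actual impact still needs to be confirmed by testing on a large system.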
Actual Behavior
----------------
Events older than 1hr are deleted.
Reproducibility
---------------
100%
System Configuration
-------
Any
Branch/Pull Time/Commit
-------
20190527T233000Z
Last Pass
---------
Never
Timestamp/Logs
--------------
N/A
Test Activity
-------------
Developer Testing
Changed in starlingx:
assignee: Tee Ngo (teewrs) → David Sullivan (dsullivanwr)
Marking as release gating; this should be evaluated as part of stx.2.0 system engineering activities when planned.