TCP keepalive timeouts too high in pods
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | High | Bart Wensley | | |
Bug Description
Brief Description
-----------------
The TCP keepalive timeouts in pods are currently set to the following:
net.ipv4.
net.ipv4.
net.ipv4.
This means that a dropped TCP connection can take more than 2 hours to be removed. That can cause large delays in reacting to unexpected events like the uncontrolled reboot of a host.
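To make the "more than 2 hours" figure concrete, the following is a minimal sketch of how the worst-case detection time for an idle, dead connection follows from the three keepalive parameters. The setting values in this report are truncated, so the numbers below are an assumption: they are the stock Linux kernel defaults (7200 s idle time, 75 s probe interval, 9 probes), which may or may not match what the pods actually had. The function name is hypothetical.

```python
def keepalive_detection_seconds(idle_time: int, interval: int, probes: int) -> int:
    """Approximate time before the kernel gives up on an idle, unreachable peer.

    The connection must first sit idle for `idle_time` seconds, after which
    `probes` unanswered keepalive probes are sent `interval` seconds apart.
    """
    return idle_time + interval * probes


# Assumption: stock Linux defaults, not values taken from this report.
ASSUMED_IDLE_TIME = 7200   # seconds of idle before the first probe
ASSUMED_INTERVAL = 75      # seconds between probes
ASSUMED_PROBES = 9         # unanswered probes before the connection is dropped

total = keepalive_detection_seconds(ASSUMED_IDLE_TIME, ASSUMED_INTERVAL, ASSUMED_PROBES)
print(f"{total} s (about {total / 3600:.2f} h)")
```

Under those assumed defaults the idle time dominates the total, so lowering it has by far the largest effect on how quickly a dead peer is noticed.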
Severity
--------
Major: slow reaction to dropped TCP connections can delay recovery from process restarts, host reboots, etc.
Steps to Reproduce
------------------
N/A
Expected Behavior
------------------
When a TCP connection from inside a pod is dropped, it should be cleaned up in a reasonable amount of time. The current settings for the host OS should be used for pods:
net.ipv4.
net.ipv4.
net.ipv4.
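One way to compare the pod and host settings is to read the values from procfs in each place. The sketch below assumes the truncated entries above refer to the three standard Linux keepalive sysctls (tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes); that is an inference, not something this report states. The helper name and the `base` parameter are hypothetical.

```python
from pathlib import Path

# Assumption: the standard Linux TCP keepalive sysctl names.
KEEPALIVE_SYSCTLS = (
    "tcp_keepalive_time",
    "tcp_keepalive_intvl",
    "tcp_keepalive_probes",
)


def read_keepalive_settings(base="/proc/sys/net/ipv4"):
    """Return {sysctl name: int value}, or None for entries that cannot be read."""
    settings = {}
    for name in KEEPALIVE_SYSCTLS:
        try:
            settings[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # Missing file (e.g. non-Linux system) or unexpected content.
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_keepalive_settings().items():
        print(f"net.ipv4.{name} = {value}")
```

Running the script once on the host and once inside a pod shows whether the two network namespaces carry the same keepalive values.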
Actual Behavior
----------------
See above
Reproducibility
---------------
Reproducible
System Configuration
-------
All
Branch/Pull Time/Commit
-------
All
Last Pass
---------
Never
Timestamp/Logs
--------------
N/A
Test Activity
-------------
Developer Testing
Changed in starlingx:
assignee: nobody → Bart Wensley (bartwensley)
Fix proposed to branch: master
Review: https://review.opendev.org/670822