TCP memory leak, slow network (arm64)

Bug #2045560 reported by Lev Petrushchak
This bug affects 5 people
Affects         Status         Importance  Assigned to  Milestone
linux (Ubuntu)  Fix Committed  Undecided   Philip Cox
Mantic          Fix Committed  Medium      Philip Cox
Noble           Fix Committed  Medium      Philip Cox

Bug Description

Hello! 👋

We have Ubuntu-based servers running in both AWS and Azure. These servers handle thousands of connections, and we have been experiencing problems with TCP memory usage since upgrading from Ubuntu 22.04.2 to 22.04.3.

$ cat /proc/net/sockstat
sockets: used 6642
TCP: inuse 5962 orphan 0 tw 292 alloc 6008 mem 128989
UDP: inuse 5 mem 0
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0

As shown in the output below, even after stopping all possible services and closing all open connections, the TCP memory usage remains high and only decreases very slowly.

$ cat /proc/net/sockstat
sockets: used 138
TCP: inuse 2 orphan 0 tw 0 alloc 3 mem 128320
UDP: inuse 3 mem 0
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0

I have attached a screenshot of linear TCP memory usage growth, which we believe may indicate a TCP memory leak.

When the net.ipv4.tcp_mem limit is reached, the network slows down.
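
The two sockstat snapshots above can be compared programmatically. The following is a minimal sketch, not part of the original report: the function names `parse_sockstat` and `pages_per_socket` are made up for illustration, and the sample text is copied from the outputs quoted above. It shows the symptom in one number: TCP pages accounted per in-use socket.

```python
def parse_sockstat(text):
    """Parse /proc/net/sockstat-style text into {protocol: {field: int}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        proto, _, rest = line.partition(":")
        tokens = rest.split()
        # Fields are alternating name/value pairs, e.g. "inuse 5962 mem 128989".
        stats[proto.strip()] = {
            name: int(value) for name, value in zip(tokens[0::2], tokens[1::2])
        }
    return stats


def pages_per_socket(stats):
    """TCP 'mem' pages divided by in-use TCP sockets (hypothetical helper)."""
    tcp = stats["TCP"]
    return tcp["mem"] / max(tcp["inuse"], 1)


# Snapshot taken under load (copied from the report).
BUSY = """sockets: used 6642
TCP: inuse 5962 orphan 0 tw 292 alloc 6008 mem 128989
UDP: inuse 5 mem 0
"""

# Snapshot taken after stopping services (copied from the report).
IDLE = """sockets: used 138
TCP: inuse 2 orphan 0 tw 0 alloc 3 mem 128320
UDP: inuse 3 mem 0
"""

busy = pages_per_socket(parse_sockstat(BUSY))
idle = pages_per_socket(parse_sockstat(IDLE))
print(f"busy: {busy:.1f} pages per socket")
print(f"idle: {idle:.1f} pages per socket")
```

On a live system the same parser could be fed the contents of /proc/net/sockstat. A per-socket figure that jumps by orders of magnitude once connections are closed suggests that the counter is no longer tracking real socket buffers.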

We've never had these issues before, and the only solution we've found so far is to reboot the node. Do you have any suggestions on how to troubleshoot further?

Thank you for any help or guidance you can provide!

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-6.2.0-1015-aws 6.2.0-1015.15~22.04.1
ProcVersionSignature: Ubuntu 6.2.0-1015.15~22.04.1-aws 6.2.16
Uname: Linux 6.2.0-1015-aws aarch64
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: arm64
CasperMD5CheckResult: unknown
CloudArchitecture: aarch64
CloudID: aws
CloudName: aws
CloudPlatform: ec2
CloudRegion: us-west-2
CloudSubPlatform: metadata (http://169.254.169.254)
Date: Mon Dec 4 13:13:08 2023
Ec2AMI: ami-095a68e28e781dfe1
Ec2AMIManifest: (unknown)
Ec2Architecture: arm64
Ec2AvailabilityZone: us-west-2b
Ec2Imageid: ami-095a68e28e781dfe1
Ec2InstanceType: m7g.large
Ec2Instancetype: m7g.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
Ec2Region: us-west-2
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
RebootRequiredPkgs: Error: path contained symlinks.
SourcePackage: linux-signed-aws-6.2
UpgradeStatus: No upgrade log present (probably fresh install)

Lev Petrushchak (sabretus) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-signed-aws-6.2 (Ubuntu):
status: New → Confirmed
Ryan Huddleston (rshuddleston) wrote :

Affects the 6.2.0-1011-aws through 6.2.0-1016-aws and the 6.2.0-1011-azure through 6.2.0-1016-azure kernels. It did not affect 5.19 and earlier kernels in 22.04.2.

Lev Petrushchak (sabretus) wrote :

We tried downgrading the kernel in Ubuntu 22.04.3 from 6.2 to 5.15 by installing linux-image-{azure|aws}-lts-22.04, and it works: the LTS 5.15 kernel is not affected by this problem.

Lev Petrushchak (sabretus) wrote : Re: TCP memory leak, slow network

We have confirmed that this issue affects only the ARM64 architecture.

summary: - Possible TCP memory leak
+ TCP memory leak, slow network
description: updated
summary: - TCP memory leak, slow network
+ TCP memory leak, slow network (arm64)
description: updated
Jonathan Heathcote (mossblaser) wrote :

As another "me too" situation, I'm seeing the same phenomenon, though on Rocky 9 rather than Ubuntu and on older kernels (5.14). Reporting details here on the off chance this provides some insight.

Hardware: Ampere Altra Max 128 cores (aarch64), ConnectX6-DX NICs (2 x dual 100G port)
Kernel versions tested: 5.14 (Rocky 9 native kernel) and 6.8 (elrepo kernel), both configured with 64 KB pages
OS: Rocky Linux 9
Software: nginx serving ~90k HTTPS clients at ~350 GBit/s (a synthetic load test)
Bare-metal (no virtualisation).

In my test environment, ~90k HTTPS connections are opened (and reused via keepalive) and used to stream ~350 GBit/s of traffic to a cluster of load generators. In this scenario, TCP memory gradually creeps up until reaching the memory pressure threshold in /proc/sys/net/ipv4/tcp_mem (243890 pages, or 15.6 GB in this system). At this point memory usage growth actually increases slightly (and increased CPU load and response times are also observed). The system eventually reaches the ultimate limit (365832 pages, or 23.4 GB) at which point most connections fail and all requests receive very slow responses.
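
The relationship between the counter and the tcp_mem thresholds described above can be sketched as follows. This is a simplified illustration rather than the kernel's logic: it ignores the hysteresis the kernel applies when leaving memory pressure, and the names `TcpMem`, `parse_tcp_mem`, and `classify` are made up. The pressure and limit values are the ones quoted above; the low value (182916) is an assumption, since it is not given in this report.

```python
from typing import NamedTuple


class TcpMem(NamedTuple):
    """The three values of net.ipv4.tcp_mem, all in pages."""
    low: int
    pressure: int
    high: int


def parse_tcp_mem(text):
    """Parse the whitespace-separated contents of /proc/sys/net/ipv4/tcp_mem."""
    low, pressure, high = (int(field) for field in text.split())
    return TcpMem(low, pressure, high)


def classify(allocated_pages, limits):
    """Return a coarse state for a given TCP 'mem' value (simplified)."""
    if allocated_pages > limits.high:
        return "limit"      # allocations are refused; connections stall or fail
    if allocated_pages > limits.pressure:
        return "pressure"   # the stack starts trying to reclaim memory
    return "normal"


# 182916 is an assumed low threshold; the other two come from the comment above.
limits = parse_tcp_mem("182916 243890 365832")

for pages in (128989, 250000, 400000):
    print(pages, classify(pages, limits))
```

Feeding the "mem" field from /proc/net/sockstat into `classify` at regular intervals would show how close a node is to the slowdown before it happens.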

Closing all connections or restarting nginx does not free up the memory; only a reboot resolves the situation, as already reported above.

Leaked memory appears to persist even if all connections are closed prior to hitting any of the above limits.

Unfortunately I don't yet have any ideas on how to fix this but would be glad to hear (and will share) any insights about what might be going on here!

Jonathan Heathcote (mossblaser) wrote :

I've been digging into this, and it appears to be a regression introduced by the following patch, which was first released in Linux 6.0.0: https://github.com/torvalds/linux/commit/3cd3399dd7a84

The bug is not a memory leak but rather a bug in how memory usage is counted. Excess memory is not actually being consumed, though the bug is still fatal since the counter controls Linux's memory pressure logic.

The (apparently) responsible patch is a performance optimisation that attempts to reduce the frequency of writes to the system-wide counter. I suspect it subtly misuses some atomic operation on ARM. If you revert this patch in a recent kernel, the bug disappears.
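
To illustrate how a batched counter of this kind can drift without any memory actually leaking, here is a toy model. It is not the kernel code and makes several assumptions: the class `ReservedCounter`, its methods, the reserve of 256 pages, and the 4-page charge size are all invented for illustration. The model keeps a local delta and folds it into a shared counter only when the delta grows large. If the local update is a non-atomic load/modify/store and a handler runs between the load and the store, the handler's update is overwritten and lost.

```python
class ReservedCounter:
    """Toy model of one CPU's local reserve in front of a shared counter."""

    def __init__(self, reserve):
        self.reserve = reserve  # flush threshold in pages (illustrative value)
        self.local = 0          # per-CPU pending delta
        self.shared = 0         # system-wide counter

    def add(self, amount, interrupt=None):
        # Non-atomic read-modify-write: load, possible interruption, store.
        old = self.local
        if interrupt is not None:
            interrupt()              # the handler updates self.local in between
        self.local = old + amount    # this store overwrites the handler's update
        self._flush()

    def add_protected(self, amount, interrupt=None):
        # Same update, with the handler held off until the update is complete.
        self.local += amount
        self._flush()
        if interrupt is not None:
            interrupt()

    def _flush(self):
        # Fold the local delta into the shared counter only when it is large.
        if abs(self.local) >= self.reserve:
            self.shared += self.local
            self.local = 0

    def accounted(self):
        return self.shared + self.local


def run(protected, rounds=1000, pages=4, reserve=256):
    counter = ReservedCounter(reserve)
    update = counter.add_protected if protected else counter.add
    for _ in range(rounds):
        # Charge `pages` while a handler uncharges the same amount,
        # so the true amount outstanding is always zero.
        update(pages, interrupt=lambda: counter.add(-pages))
    return counter.accounted()


print("racy:", run(protected=False))
print("protected:", run(protected=True))
```

In this model every charge is matched by an uncharge, so the true total is zero. The racy variant still ends up with 4000 pages accounted, while the protected variant stays at zero. That matches the observation that the counter grows even though no excess memory is consumed.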

I am currently working on a detailed bug report for the relevant Kernel maintainers.

NB: It appears that the "5.14" kernel shipped by Rocky (and RHEL) includes a back-port of this bug, which is why I see the bug in that kernel version on Rocky Linux. A vanilla build of 5.14 without the Red Hat patches does not exhibit the bug on my system either.

Lev Petrushchak (sabretus) wrote :

Hi Jonathan,

Thanks for all the information and the deep dive into this issue! Please post the link to the kernel bug report here when it's ready.

Jonathan Heathcote (mossblaser) wrote :

The bug report can be found here:

https://<email address hidden>/

The subsequently produced patch (not by me!) to fix this can be found here:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=3584718cf2ec

I've also verified that the above patch does fix the problem at least in my case!

Philip Cox (philcox)
Changed in linux-signed-aws-6.2 (Ubuntu):
assignee: nobody → Philip Cox (philcox)
Philip Cox (philcox) wrote :

I've looked at this issue in some detail now. The initial memory leak was introduced in Linux kernel version v6.0, and the fix was introduced in v6.9. This means that the 6.5-based and 6.8-based Ubuntu kernels contain this memory leak.

I will submit a fix for the generic 6.8 and generic 6.5 kernels so that all Ubuntu kernels receive the fix.

For reference: The commit that introduces this issue is 3cd3399dd7a84ada85cb839989cdf7310e302c7d

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=3cd3399dd7a84ada85cb839989cdf7310e302c7d

The commit I will be working with to provide the fix is the backport to the 6.8.y stable kernel, which is d2fa3493811ecd49f1581940111ccf475fa5cd38

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d2fa3493811ecd49f1581940111ccf475fa5cd38

affects: linux-signed-aws-6.2 (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu Mantic):
status: New → Confirmed
assignee: nobody → Philip Cox (philcox)
Changed in linux (Ubuntu Noble):
assignee: nobody → Philip Cox (philcox)
status: New → Confirmed
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Philip Cox (philcox) wrote :

Two patches are actually needed for this fix. The following two patches from the 6.8.y stable branch cherry-pick and build cleanly on the generic Ubuntu Mantic (6.5) and Noble (6.8) kernels. I will perform some more testing and send these out for review.

The two patches are:

d2fa3493811ecd49f1581940111ccf475fa5cd38 net: fix sk_memory_allocated_{add|sub} vs softirqs
e830c804e26733fb09219cb9b92f167715c3b8a0 net: make SK_MEMORY_PCPU_RESERV tunable

Philip Cox (philcox) wrote :

The review for this change has been sent out. It is: https://lists.ubuntu.com/archives/kernel-team/2024-May/151087.html

Stefan Bader (smb)
Changed in linux (Ubuntu Noble):
importance: Undecided → Medium
Changed in linux (Ubuntu Mantic):
importance: Undecided → Medium
Changed in linux (Ubuntu Noble):
status: Confirmed → Fix Committed
Stefan Bader (smb)
Changed in linux (Ubuntu Mantic):
status: Confirmed → Fix Committed