Outbound TCP Throughput drops to zero for several drivers

Bug #1390604 reported by Rick Wright
This bug affects 1 person
Affects           Status        Importance  Assigned to    Milestone
linux (Ubuntu)    Fix Released  High        Unassigned
Utopic            Fix Released  High        Chris J Arges
Vivid             Fix Released  High        Unassigned

Bug Description

There is a bug with TCP in kernel 3.16+ described as:

"Some drivers are unable to perform TX completions in a bound time.
They instead call skb_orphan()

Problem is skb_fclone_busy() has to detect this case, otherwise
we block TCP retransmits and can freeze unlucky tcp sessions on
mostly idle hosts."

A patch for this has been submitted upstream:
https://patchwork.ozlabs.org/patch/405110/

The Google engineer who submitted that patch also adds the following:

Backported patch for 3.16 or 3.17 kernel is much simpler :

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 4e4932b5079b..a8794367cd20 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2082,7 +2082,8 @@ static bool skb_still_in_host_queue(const struct sock *sk,
        const struct sk_buff *fclone = skb + 1;

        if (unlikely(skb->fclone == SKB_FCLONE_ORIG &&
-                    fclone->fclone == SKB_FCLONE_CLONE)) {
+                    fclone->fclone == SKB_FCLONE_CLONE &&
+                    fclone->sk == sk)) {
                NET_INC_STATS_BH(sock_net(sk),
                                 LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES);
                return true;

I do not believe this problem affects any release prior to 14.10. I don't know which version of the patch you will need, so I have included both.
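
The effect of the extra `fclone->sk == sk` test in the backport can be illustrated with a small userspace model. This is only a sketch: `sock_model`, `skb_model`, `fclone_pair`, `orphan_model`, `busy_before_fix` and `busy_after_fix` are hypothetical stand-ins invented for illustration, not the kernel's real types or functions. It assumes that orphaning a buffer clears its owning socket pointer, which is the case the quoted commit message says must be detected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's real definitions. */
enum fclone_state { FCLONE_UNAVAILABLE, FCLONE_ORIG, FCLONE_CLONE };

struct sock_model {
    int id;
};

struct skb_model {
    enum fclone_state fclone;     /* role of this buffer in a fast-clone pair */
    const struct sock_model *sk;  /* owning socket, NULL once orphaned */
};

/* Original and clone sit side by side, mirroring the "skb + 1" lookup. */
struct fclone_pair {
    struct skb_model orig;
    struct skb_model clone;
};

/* Assumption: a driver that orphans a buffer drops its socket ownership. */
void orphan_model(struct skb_model *skb)
{
    skb->sk = NULL;
}

/* Check as it stood before the patch: only the clone state is examined. */
bool busy_before_fix(const struct fclone_pair *p)
{
    return p->orig.fclone == FCLONE_ORIG &&
           p->clone.fclone == FCLONE_CLONE;
}

/* Check with the backported condition: the clone must still belong to sk. */
bool busy_after_fix(const struct sock_model *sk, const struct fclone_pair *p)
{
    return p->orig.fclone == FCLONE_ORIG &&
           p->clone.fclone == FCLONE_CLONE &&
           p->clone.sk == sk;
}
```

In this model, once `orphan_model()` has been called on the clone, `busy_before_fix()` still reports the buffer as busy, while `busy_after_fix()` does not. That difference corresponds to the retransmit no longer being blocked for an orphaned clone.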

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1390604

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Rick Wright (wrigri) wrote :

This bug was not discovered by me. Instead, this is a tracking bug for getting an upstream fix included in Ubuntu.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Christopher M. Peñalver (penalvch) wrote :

Rick Wright, thanks for the heads up.

tags: added: cherry-pick regression-release utopic
Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
tags: added: bot-stop-nagging
tags: added: kernel-da-key
Luis Henriques (henrix) wrote :

Thank you, Rick. The backport of this patch (provided by Eric Dumazet) has been queued for the 3.16 stable kernel and should reach Utopic soon.

Tim Gardner (timg-tpi) wrote :

Upstream fix 39bb5e62867de82b269b07df900165029b928359 (net: skb_fclone_busy() needs to detect orphaned skb)

Changed in linux (Ubuntu Vivid):
status: Triaged → Fix Released
Changed in linux (Ubuntu Utopic):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Tim Gardner (timg-tpi) wrote :

Looks like there are some scaffolding patches needed as well.

Changed in linux (Ubuntu Utopic):
assignee: Tim Gardner (timg-tpi) → Chris J Arges (arges)
Chris J Arges (arges) wrote :

Luis already has this applied in his queue:
1cf39b5f88166c08b9bf9917b16c598fe9e68ab7

Brad Figg (brad-figg)
Changed in linux (Ubuntu Utopic):
status: In Progress → Fix Committed
Andy Whitcroft (apw)
Changed in linux (Ubuntu Utopic):
importance: Undecided → High
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Fix is confirmed by a Canonical partner.

Adam Conrad (adconrad)
Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released