TCP stack bug related to F-RTO

Bug #567394 reported by R. Jones
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
Linux            Fix Released   High
linux (Ubuntu)   Invalid        Undecided    Unassigned
Hardy            Fix Released   Medium       Unassigned

Bug Description

I have hit the TCP kernel issue reported in the Linux Kernel mailing list at: http://lkml.org/lkml/2009/9/24/319 (subject: "TCP stack bug related to F-RTO?", date: Thu, 24 Sep 2009 10:43:19 -0700 (PDT), from: Joe Cao) in Ubuntu 8.04 LTS with latest updates. The conclusion of this thread is documented at http://lkml.org/lkml/2009/9/25/274 . The conclusion recommends going to a newer kernel than is provided in the Ubuntu 8.04 LTS release.

From what I have seen, it is obvious to me that this bug can be taken advantage of to produce a remote denial of service attack against Ubuntu 8.04 LTS servers on the public internet via resource exhaustion. This could either be by hitting a limit on the number of open sockets allowed on a service, or (worse) exhausting all available sockets on the OS.

ubuntu-bug -p linux fails to connect to report this bug.

Kernel: 2.6.24-27-generic (also applies to 2.6.24-27-server and earlier kernels.)

$ more lsb_release.log
Description: Ubuntu 8.04.4 LTS
Release: 8.04
$ more version.log
Ubuntu 2.6.24-27.68-generic

Both dmesg log file and the lspci-vnvn log file have more information in them than is needed for this bug report. If specific lines from the files are needed, please request the specific information.

Kees Cook (kees)
affects: linux-meta (Ubuntu) → linux (Ubuntu)
visibility: private → public
Jeremy Foshee (jeremyfoshee) wrote :

Hi R.,

Please be sure to confirm this issue exists with the latest development release of Ubuntu. ISO CD images are available from http://cdimage.ubuntu.com/releases/ . If the issue remains, please run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 567394

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
R. Jones (robejo05) wrote :

As noted in the linked Linux kernel mailing list reference ( http://lkml.org/lkml/2009/9/25/274 ), the kernel fixes are in kernels 2.6.25.7 and 2.6.26 or later. Since Ubuntu 10.04, 9.10 and 9.04 contain later kernels, this bug should not apply to them.
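
The version reasoning above can be sketched in a few lines of Python. This is only an illustration, not part of the original report: the function names are hypothetical, and the check looks solely at the upstream base version in a release string such as the output of "uname -r", using the 2.6.25.7 and 2.6.26 thresholds quoted from the mailing list thread.

```python
import re

# Thresholds quoted in the comment above (from the LKML thread).
FIXED_STABLE = (2, 6, 25, 7)   # first fixed release in the 2.6.25.y series
FIXED_MAINLINE = (2, 6, 26)    # mainline releases from here on carry the fixes


def parse_upstream_version(release):
    """Return the leading dotted numeric part of a release string as a tuple.

    '2.6.24-27-generic' -> (2, 6, 24); the Ubuntu ABI and flavour are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", release)
    if match is None:
        raise ValueError("no version number in %r" % release)
    return tuple(int(part) for part in match.group(0).split("."))


def has_upstream_frto_fixes(release):
    """True if the upstream base version is 2.6.25.7+ or 2.6.26+."""
    version = parse_upstream_version(release)
    padded = version + (0,) * (4 - len(version))  # pad to four components
    if padded[:3] == (2, 6, 25):
        return padded >= FIXED_STABLE
    return padded[:3] >= FIXED_MAINLINE


if __name__ == "__main__":
    for release in ("2.6.24-27-generic", "2.6.28-18-generic", "2.6.32-21-generic"):
        print(release, has_upstream_frto_fixes(release))
```

A distribution kernel can carry backported patches without changing its upstream base version, so a False result here only means the fixes are not guaranteed by the version number alone.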

In preliminary testing under Ubuntu 10.04, I have not (yet) noticed this bug there, where I am using the 2.6.32-21-generic kernel. [I am not set up for full testing against that system at this time.]

Ubuntu 9.10 (karmic) is showing kernel version 2.6.31(-21.59), so it should not be impacted.

Ubuntu 9.04 (jaunty) is showing kernel version 2.6.28(-18), so it should not be impacted.

I don't have Ubuntu 8.10 (intrepid) installed, so I don't have a quick way to check that kernel version.

I don't run Ubuntu 6.06(.1), so I don't know if that is impacted.

When I get the chance, I will try the mainline kernel builds under Ubuntu 8.04(.4) to see if a fix is there.

Thanks.

Kees Cook (kees) wrote :

When I read this earlier, it seemed like just a performance issue. On closer inspection, this will cause the server to potentially get to a giant retransmit timeout that the client can ignore? I.e. client will ack at every doubling of the server's timeout, keeping those sockets alive? Do you have a way to test this that you can share?

R. Jones (robejo05) wrote :

Kees Cook:

You are correct that the client and server get into a retransmission loop with longer and longer delays. Unfortunately, this means the client does not get the data it is waiting for (if the protocol has the client requesting data from the server). Eventually either the client, or a hub / switch / etc. along the way, tends to send an RST, causing the connection to be terminated prematurely. Yes, the client continues to ack the server and waits for the next (needed) packet that never comes.
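
To give a feel for how quickly such a loop stretches out, here is a minimal Python sketch of a doubling retransmission timeout with an upper cap. It is not taken from the kernel source: the function name is hypothetical, and the starting timeout and the cap are illustrative assumptions passed as parameters.

```python
def backoff_schedule(initial_rto=1.0, max_rto=120.0, retransmissions=10):
    """Return the timeout (in seconds) waited before each retransmission.

    The timeout doubles after every unanswered retransmission and is
    clamped at max_rto. Both defaults are assumptions for illustration.
    """
    schedule = []
    rto = initial_rto
    for _ in range(retransmissions):
        schedule.append(rto)
        rto = min(rto * 2, max_rto)
    return schedule


if __name__ == "__main__":
    waits = backoff_schedule()
    for attempt, wait in enumerate(waits, start=1):
        print("retransmission %2d after %6.1f s" % (attempt, wait))
    print("socket held for at least %.0f s" % sum(waits))
```

With these assumed values the socket stays occupied for several minutes while delivering no new data, which is the resource-exhaustion concern raised in the bug description.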

At this time I do not have a way to exercise the bug that I can make public. I will say that it significantly helps if the client turns off the SACK (Selective Acknowledgement) TCP option when you want to exercise the bug. The mailing list report used a large TCP window size on the client end. I hit it with a small TCP window size (and poor handling of out-of-order packets) on the client end.
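
For anyone setting up a similar test, the sketch below reads the relevant settings on a Linux machine. The thread does not say which operating system the client ran, so treating it as Linux is an assumption. The sysctl names net.ipv4.tcp_sack and net.ipv4.tcp_frto are standard Linux settings; the helper names are hypothetical.

```python
import os


def read_sysctl(name, root="/proc/sys"):
    """Return the value of a dotted sysctl name as a string, or None.

    'net.ipv4.tcp_sack' is read from <root>/net/ipv4/tcp_sack.
    """
    path = os.path.join(root, *name.split("."))
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None  # not Linux, or the setting does not exist here


def describe_tcp_settings(root="/proc/sys"):
    """Collect the SACK and F-RTO settings discussed in this report."""
    names = ("net.ipv4.tcp_sack", "net.ipv4.tcp_frto")
    return dict((name, read_sysctl(name, root)) for name in names)


if __name__ == "__main__":
    for name, value in sorted(describe_tcp_settings().items()):
        print("%s = %s" % (name, value))
```

A value of 0 for net.ipv4.tcp_sack means selective acknowledgement is switched off on that machine, which matches the condition described above for exercising the bug.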

Marc Deslauriers (mdeslaur) wrote :
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Stefan Bader (smb) wrote :

Reading back through that thread, it seems the first two patches mentioned were not enough on their own. Looking at the 2.6.25.7 range, there appear to be five patches that could touch the problem area:

commit 7a0c866aacab51afa7a6cbf6eccf5e1aa5fd64b9
Author: Ilpo Järvinen <email address hidden>

    tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))

commit 99d737e98d81762332242cc82e5604520842911a
Author: Ilpo Järvinen <email address hidden>

    tcp FRTO: work-around inorder receivers

commit 59a16700219922a1b095abd76caa25fd4417470c
Author: Ilpo Järvinen <email address hidden>

    tcp FRTO: SACK variant is errorneously used with NewReno

commit 76ab0a7c88886400dd16870db65106215f3e4aa3
Author: Ilpo Järvinen <email address hidden>

    tcp FRTO: Fix fallback to conventional recovery

commit 47478b42b8e74c2311674eda6700a0ced1509383
Author: Ilpo Järvinen <email address hidden>

    tcp: fix skb vs fack_count out-of-sync condition

Stefan Bader (smb) wrote :

Ok, so

commit 47478b42b8e74c2311674eda6700a0ced1509383
Author: Ilpo Järvinen <email address hidden>

    tcp: fix skb vs fack_count out-of-sync condition

will not apply; the rest would after some tweaks. If I built test kernels with those applied, could we get one of them tested against the issue?

Fabián Rodríguez (magicfab) wrote :

Stefan, I can try to help with testing those. Let me know what is needed.

Stefan Bader (smb) wrote : Re: [Bug 567394] Re: TCP stack bug related to F-RTO

Fabián,

if you can reproduce the problem, that would be good (maybe you can say how you do
that). There was an issue a while ago that looked similar, but at that time
the changes I found did not seem to be sufficient. There is a set that we could
try. So if you can reproduce it, I could build a test kernel to see whether the issues
go away. Let me know what flavour and arch (i386 or amd64) you need.

R. Jones (robejo05) wrote :

Flavor: Production is generic kernel. I'm building a test system, and suspect it will install the server kernel.

Arch: Must be amd64.

Bob Jones


R. Jones (robejo05) wrote :

It has taken a bit... I now have a test system on which I can load a test kernel to verify the fixes. The test system is using the default (current) server kernel. My production system is using the generic kernel.

Let me know how you want me to obtain and install the kernel for testing.

Bob Jones

Stefan Bader (smb) wrote :

Ok, I uploaded kernel packages to http://people.canonical.com/~smb/lp567394 for testing. I think one of the linux-image files should be sufficient, but I put headers up there just in case. There is a v1 and a v2. As upstream was so very clear about what fixed the problem in their comments, v1 contains the following patches:

commit 99d737e98d81762332242cc82e5604520842911a
Author: Ilpo Järvinen <email address hidden>

    tcp FRTO: work-around inorder receivers

commit 59a16700219922a1b095abd76caa25fd4417470c
Author: Ilpo Järvinen <email address hidden>

    tcp FRTO: SACK variant is errorneously used with NewReno

commit 76ab0a7c88886400dd16870db65106215f3e4aa3
Author: Ilpo Järvinen <email address hidden>

    tcp FRTO: Fix fallback to conventional recovery

while v2 additionally has

commit 7a0c866aacab51afa7a6cbf6eccf5e1aa5fd64b9
Author: Ilpo Järvinen <email address hidden>

    tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))

So, if v1 fixes the issue we could go with the set of three (there has been another issue with F-RTO where I tried two of the patches and those did not seem to help), otherwise v2 or if even that does not help we can sit down and cry in despair.

R. Jones (robejo05) wrote :

Thanks. I've grabbed the set of files for the amd64 configuration. Now to read up on how to install these test .deb files. [Let's see, can I just add a line to apt's source list configuration file and use the standard cli or gui tools to install as an update?]

The headers will be useful since the test system is a VM under the VMware hypervisor...

I'll let you know when I get the chance to run the various tests. [One to confirm the bug exists in the test system and another to show that either the v1 or the v2 kernel patch fixes the problem.]

Stefan Bader (smb) wrote :

On 05/18/2010 07:20 PM, R. Jones wrote:
> to read up on how to install these test .deb files. [Let's see, can I

Download and "sudo dpkg -i <package file>"

R. Jones (robejo05) wrote :

Preliminary testing shows both versions of the patched kernel corrected the issue.

See private log for more details.

R. Jones (robejo05) wrote :

At the application level, testing of both V1 and V2 patches worked. When packets captured with Wireshark were reviewed, V1 patch looked OK. Traffic using the V2 patched kernel had some additional issues that I haven't had time to investigate.

I have rolled the V1 patched kernel out to my production system and it seems stable.

I would recommend proceeding with the V1 patched kernel for the next steps toward release.

Colin Watson (cjwatson) wrote : Please test proposed package

Accepted linux into hardy-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
R. Jones (robejo05) wrote :

I'm now waiting for the US mirrors to update. It will probably be Monday before I get to this as a result.

Thanks for the reference to the testing / enable proposed archive documentation. Since I'm used to apt-get, I'll need to comment and uncomment the hardy-proposed entry in the sources.list as I work with specific entries from the proposed distribution due to the mix of utilities used for updating.

R. Jones (robejo05) wrote :

I'm having trouble getting the update of the hardy-proposed archive. See private landscape case # 9186 for details.

Stefan Bader (smb) wrote :

This is a bit of a work-around for when not everything in proposed is wanted, and for people who know what they are doing. In this case only the kernel package is needed, so https://launchpad.net/ubuntu/hardy/+source/linux/2.6.24-28.71 is the place where the build of the proposed kernel in question is recorded. There is an area of builds on the lower right. Follow the link for the desired architecture and look under the "Build files" section. Download the linux-image-2.6.24-28-<flavour>*.deb file that matches the flavour shown on the target machine with "uname -r" and install it with "sudo dpkg -i <pkg>".
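
The flavour matching described above can be sketched in Python as follows. This is a hedged illustration: the helper names are hypothetical, the flavour is assumed to be the text after the last hyphen of the "uname -r" output, and the file names in the example are made up to resemble the linux-image-2.6.24-28-<flavour>*.deb pattern rather than copied from the build page.

```python
import fnmatch


def flavour_of(release):
    """Return the flavour from a release string such as '2.6.24-27-generic'."""
    return release.rsplit("-", 1)[-1]


def image_pattern(release, abi="2.6.24-28"):
    """Build the glob for the proposed image package of the running flavour."""
    return "linux-image-%s-%s*.deb" % (abi, flavour_of(release))


def matching_packages(release, filenames, abi="2.6.24-28"):
    """Filter a list of downloaded file names down to the matching image."""
    pattern = image_pattern(release, abi)
    return [name for name in filenames if fnmatch.fnmatchcase(name, pattern)]


if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    files = [
        "linux-image-2.6.24-28-generic_2.6.24-28.71_amd64.deb",
        "linux-image-2.6.24-28-server_2.6.24-28.71_amd64.deb",
        "linux-headers-2.6.24-28-generic_2.6.24-28.71_amd64.deb",
    ]
    print(matching_packages("2.6.24-27-generic", files))
```

The selected file is then the one to pass to "sudo dpkg -i".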

Stefan Bader (smb) wrote :

Any progress on verifying the kernel in proposed? It would be good to get that into updates, but we need verification first.

R. Jones (robejo05) wrote :

See private support log (case 9186). I'm trying to get the test run by the end of the week.

R. Jones (robejo05) wrote :

Initial testing shows the new kernel is working.

Testing with SACK option disabled, and one of my shorter downloads, showed no problem. Normally I would hit the kernel bug with this test. I've run this test twice without hitting the problem. So, it looks like the problem is fixed. And I haven't seen any issues with running the new kernel.

I will leave a longer download running. Unfortunately due to work schedule, it may be a few days before I can post those results.

R. Jones (robejo05) wrote :

The long download finally succeeded without problems. The production server is running with the update without issues at the application level. Looks good.

Stefan Bader (smb) wrote :

Ok, thanks for checking.

tags: added: verification-done
removed: needs-kernel-logs needs-upstream-testing verification-needed
Changed in linux (Ubuntu Hardy):
importance: Undecided → Medium
status: New → Fix Committed
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.24-28.71

---------------
linux (2.6.24-28.71) hardy-proposed; urgency=low

  [Upstream Kernel Changes]

  * tcp FRTO: Fix fallback to conventional recovery
    - LP: #567394
  * tcp FRTO: SACK variant is errorneously used with NewReno
    - LP: #567394
  * tcp FRTO: work-around inorder receivers
    - LP: #567394
 -- Stefan Bader <email address hidden> Wed, 09 Jun 2010 11:15:27 +0200

Changed in linux (Ubuntu Hardy):
status: Fix Committed → Fix Released
Changed in linux:
status: Unknown → Fix Released
Changed in linux:
importance: Unknown → High
This report contains Public Security information  
Everyone can see this security related information.
