Mandriva Linux

TCP option ordering in 2.6.27 is not interoperable with some network equipment

Reported by Marques Johansson on 2008-09-02
This bug affects 5 people
Affects Status Importance Assigned to Milestone
Linux
Fix Released
Unknown
Release Notes for Ubuntu
Medium
Steve Langasek
autorun4linuxcd (Ubuntu)
High
Tim Gardner
Intrepid
High
Tim Gardner
linux (Mandriva)
Invalid
High
procps (Ubuntu)
Critical
Colin Watson
Intrepid
Critical
Colin Watson

Bug Description

2.6.27 introduced a networking issue, leaving sites like youtube.com, and microsoft.com not working via the web browsers on one of my systems (but not others). This system, like the others, was upgraded from Hardy.

I am unable to transfer files from archive.ubuntu.com via apt (http) or lftp (ftp, not http). I can connect via ftp and traverse and list files/directories, but when I try to "get", the transfer stays at 0 bytes.

Likewise, if I do a naive "telnet youtube.com 80"
GET / HTTP/1.1
HOST: youtube.com
I get a response, but when the web browser (firefox,opera) attempts to load these sites they do not progress - the 'transferring' status stays fixed. Some web sites (google, yahoo) work fine though.
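
The manual telnet test above can be scripted so that a stalled transfer shows up as a byte count rather than a frozen browser. This is a minimal sketch, not the reporter's method: the function names are hypothetical, the timeout value is arbitrary, and a `Connection: close` header is added (it was not part of the typed request) so that the server ends the response.

```python
import socket

def build_get_request(host, path="/"):
    """Build a minimal HTTP/1.1 GET request like the one typed into telnet."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")

def count_response_bytes(host, port=80, timeout=10.0):
    """Send the request and count bytes received before close or timeout."""
    total = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host))
        try:
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    break  # Server closed the connection: transfer finished.
                total += len(chunk)
        except socket.timeout:
            pass  # A stalled transfer ends here with a partial byte count.
    return total
```

Calling `count_response_bytes("youtube.com")` under both kernels and comparing the totals would give a repeatable version of the comparison described in this report.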

When I reboot this system back to the 2.6.26-5 kernel everything works as it should. I tried setting rp_filter=0 on all, eth0, and default (instead of the Ubuntu default 1), as I have seen this and ecn settings cause site specific problems. Those settings had no effect in this case.

I only have remote access to this system. I am able to remotely ssh into this box, but I could not scp a file to it. I was able to scp to this host from a host on the same local network, however. I have tested this problem using both the wired ethernet adapter, and using a wireless adapter. The same problem existed in both configurations - some websites could be reached, while others could not.

The other Intrepid host on the same network did not have these connection issues.

I just installed Intrepid alpha 5 and I have the same issue.

This is an issue with both 2.6.27-1 and 2.6.27-2. Reverting back to 2.6.26-5 works (had to copy the packages over via USB of course).

Timothy G. Rundle (tgrundle) wrote :

I am experiencing the same issue (also in 2.6.27-3). If I boot into Hardy everything works just fine, but in Intrepid Alpha 5, some websites and apt don't work. All LAN traffic works just fine. I also eliminated any hardware issues by replacing the router, the network line, and the network card. I did a snoop of the network traffic, but nothing stood out to me there.

I have attached the output of the lspci.

Alex Converse (ajc30) wrote :

I'm experiencing the same issue with an Intel Corporation 82566DC Gigabit Network Connection (rev 02)

bobbyjean58 (bobbyjean58) wrote :

Similar problem since 2.6.27 kernel was introduced in Intrepid. I can connect to some sites, like iGoogle and Yahoo (plus others when I can find them), but not others (like Ubuntu Forums). I can not do any updates either in terminal (cursor just sits there blinking with Waiting for Headers displayed) nor Synaptic. I did find a work around though. If I install Firestarter then internet surfing is back to normal once again (under 2.6.27 kernel). Strange, I know, but it works. My chipset is VIA VT6103L for my on-board Lan connection. I also run thru a Westell 6100 Nat combo modem. I've never had this problem in prior versions of Ubuntu. BTW, in Hardy I compiled a 2.6.27-rc5 kernel and had the same problem.

Glad I found someone else who's having the same problem. At least I know it's not just me.

Thanks.

Marques Johansson (marques) wrote :

On an up-to-date Intrepid system with 2.6.27-4 (i386) I am still experiencing this problem. 2.6.27-2-generic also showed this problem.

When using 2.6.26-5 network connections work correctly.

Marques Johansson (marques) wrote :

Here's the dmesg from 2.6.27-4-generic. Some sites do not make functioning connection - including Ubuntu updates via apt.

bobbyjean58 (bobbyjean58) wrote :

It's the 2.6.27 kernel that's the problem. Yesterday I tried Fedora 10, which uses the 2.6.27-* kernel, and had the same problem, but, just like in Ubuntu, if you install firestarter (I downloaded a copy and saved it before I installed Fedora) internet surfing is fine. Strange.

2.6.27-5 still has this bug.

description: updated

I experience the same bug with Intrepid beta 1. Doing a telnet on port 80 to a slow site works. Doing a GET / is extremely slow, less than 100 bytes per second. Because of this behaviour, every browser I used (Firefox, Opera, Konqueror) gives connection errors.

Booting Intrepid beta 1 with the Hardy kernel 2.6.24-19 works OK.

Nathan Barnes (nathanvbarnes) wrote :

Hi, this is my first ever post to an Ubuntu / Launchpad forum.
I am having the exact same problem with Ubuntu 8.10. I have found that turning off the SPI firewall in my Netgear WPN824 seems to fix it. However, the system update manager is still running very slowly.
Everything worked fine in Hardy, which I used on the same computer previously (an Eee PC 900). However, my housemate wants the SPI firewall on, so I can't use the internet :(

Nathan Barnes (nathanvbarnes) wrote :

I just Installed firestarter and that fixes both Web surfing and update manager downloads. Thanks BobbyJean

Daniel Leiter (dleiter) wrote :

I had been trying to find a solution to this problem for a week! This just made my day, thanks!

Forest Bond (forest-bond) wrote :

I'm experiencing the same issue on my HP Mini Note 2133, both on the wireless interface (b43) and the wired interface (tg3).

I did some tracing with wireshark and found that a response packet was being dropped and not resent. It was followed by a duplicate ACK. I don't know TCP well enough to provide any further insight, but I do have the wireshark capture file. Please ask for it if it would be useful.

Forest Bond (forest-bond) wrote :

One more thing: I don't think it is the site that causes the problem. It's just that larger HTTP downloads are more likely to fail due to increased likelihood of dropped packets.

Forest Bond (forest-bond) wrote :

BTW, firestarter doesn't seem to help here. I suspect that if it appears to work for anyone, it's probably coincidental. Please feel free to correct me if I'm wrong.

What's confusing about this bug is that it certainly appears to be in the TCP stack somewhere, but it obviously isn't affecting a lot of people or there would be more reports. It's not limited to a particular driver, though, because I see it with both my wired and wireless connections.

This is a real show-stopper for me, though. TCP becomes totally unreliable.

Forest Bond (forest-bond) wrote :

Text export of wireshark trace is attached.

BTW, I'm wondering if this is related to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/230456. I don't really know anything about this, though, so I could be way off...

Forest Bond (forest-bond) wrote :

Yeah, can't be #230456 because that patch is already applied.

I have another wireshark trace attached. The basic problem is "TCP Previous segment lost" followed by a series of "TCP Dup Ack".
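
The pattern described above (a lost segment followed by repeated ACKs for the same sequence number) can be picked out of a list of ACK numbers exported from a capture. This is a simplified sketch under stated assumptions: the function name and the sample numbers are hypothetical, and the check only compares ACK numbers, whereas Wireshark's own duplicate-ACK analysis looks at more fields.

```python
def duplicate_ack_runs(acks):
    """Return (ack_number, duplicate_count) for each run of repeated ACKs."""
    runs = []
    prev = None
    dups = 0
    for ack in acks:
        if ack == prev:
            dups += 1  # Same ACK number again: the receiver is still waiting.
        else:
            if dups:
                runs.append((prev, dups))
            prev, dups = ack, 0
    if dups:
        runs.append((prev, dups))
    return runs

# Hypothetical ACK numbers sent by the client while one segment is missing.
sample = [1000, 2460, 2460, 2460, 2460, 3920]
print(duplicate_ack_runs(sample))
```

A run that keeps growing without the ACK number ever advancing would match the symptom reported here, where the missing segment is never received.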

Forest Bond (forest-bond) wrote :

One more wireshark capture. This one shows multiple duplicate ACKs.

Dhaval Patel (dhuv) wrote :

I also notice that on my Verizon DSL it does NOT work but on a Comcast Cable or Time Warner Cable connection, it is fine. I hope this info helps with the troubleshooting. For the time being I will downgrade to 2.6.25.

Forest Bond (forest-bond) wrote :

I'm also on Verizon (well, Earthlink via Fairpoint, previously Verizon) DSL.

Marques Johansson (marques) wrote :

I also experienced the problem with Verizon DSL as the ISP.

bobbyjean58 (bobbyjean58) wrote :

I'm on Verizon DSL.

Omer (mermerico) wrote :

I have this problem as well. Running 64 bit, wireless with ndiswrapper, verizon dsl. I would like to mention that the websites that don't work differ from person to person. I couldn't access launchpad for instance. Switching to the older kernel fixed my problem, I didn't get to try firestarter because google hadn't cached this page since people mentioned that :-). People on the ubuntu forums suggested using some network-manager ppa, but because I couldn't access launchpad, I couldn't do that. Intrepid should really be delayed to make sure this problem gets fixed. It would give the high command a chance to include openoffice 3 as well.

Forest Bond (forest-bond) wrote :

Attached PCAP files as requested by slangasek.

Steve Langasek (vorlon) wrote :

Such a bug has a significant impact on the release so I'm confirming this for intrepid, but from the available information, I'm not sure that we're really looking at a kernel bug here. The wireshark traces available certainly seem to indicate a failure of the return traffic from the server to be received by the client, which I would expect to manifest much more dramatically if it were the result of a bug in the client stack.

I've requested full pcap traces, which may provide more information here.

Changed in linux:
importance: Undecided → High
status: New → Confirmed
Steve Langasek (vorlon) wrote :

I've run a comparable test against www.youtube.com here, and find that the server responds to the query with one very large, fragmented packet. Given that these packet fragments never appear at all in Forest's trace, and given that there's a common theme that this affects users of Verizon DSL, it seems likely that the router is discarding them for some reason.

Nothing stands out to me in the packet trace that would explain why those fragments would be discarded, though.

What router do each of you have in place on your Verizon DSL?

Forest Bond (forest-bond) wrote :

The complicating factor is that the problem does not exist at all on 2.6.24 (running intrepid userland), but is very easy to reproduce with 2.6.27. I agree that our network environments are related, but the kernel version seems to be a factor, as well.

I have a Westell WireSpeed Data Gateway. I can get the exact model if it's important.

Manuel Siggen (manuel-siggen) wrote :

I experienced the same problem on my Thinkpad X40 : running the 2.6.27 kernel I could ping all hosts on the web, but couldn't wget/curl any of them. Even apt-get update couldn't connect to security.ubuntu.com ! Everything works fine with the 2.6.24 kernel. I tried this at home, I'll try it again at work to see if I can reproduce this strange behavior.

Dhaval Patel (dhuv) wrote :

It does not matter which router you use. I have tried this on a few Verizon DSL connections with different DSL modems and routers (Netgear, Linux gateway, Linksys), they all have the same problem. I am able to ping the server and since one of my servers on the inet has this problem I was able to do a tcpdump on my laptop and server. The server definitely gets the request and even sends a response. I did not spend much time verifying each of the lines in the tcpdump but there definitely is a request and response.

I have a server that is on a DSL connection that is on Cloud9 but their lines are leased from Verizon so it is basically a Verizon connection. When my laptop is on a cable connection I can get to all sites just fine, but I am not able to get to the site on that server. So it is a problem with all clients trying to access servers on a Verizon network as well.

NathanMoureyII (snafu7t5) wrote :

I also had this problem. I found my solution here:

http://wiki.mandriva.com/en/2009.0_Errata#Slow_or_no_internet_response_.28particularly_web_browsing.29

Hope this helps.

Nate.

Forest Bond (forest-bond) wrote :

Indeed, this looks very similar:

https://qa.mandriva.com/show_bug.cgi?id=43372

I will try the fix in a few and report back.

Forest Bond (forest-bond) wrote :

I *think* that `sysctl -w net.ipv4.tcp_timestamps=0` resolves this issue for me.

bobbyjean58 (bobbyjean58) wrote :

One of the first things I tried to rectify this problem was to disable ipv6. That didn't work, at least for me.

Steve Langasek (vorlon) wrote :

bobbyjean58, I think the pointer to wiki.mandriva.com referred to the tcp_timestamps change, not ipv6. Can you confirm that using the sysctl command Forest mentions in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/264019/comments/35 fixes the problem for you, when run as root?

If so, I'll close the linux kernel task here and we'll document this in the release notes.

bobbyjean58 (bobbyjean58) wrote :

Thanks, for the clarification, Steve. Running 'sysctl -w net.ipv4.tcp_timestamps=0' as root works but you lose it once you reboot.

Yes, I uninstalled Firestarter and rebooted before trying that command.

bobbyjean58 (bobbyjean58) wrote :

I placed the command 'sysctl -w net.ipv4.tcp_timestamps=0' in /etc/init.d/rc.local, rebooted, and as of right now everything is fine. No connection problems. But for a novice this might be a bit of a problem to do. Isn't there a way to include this command in the Intrepid final release by default?

Thanks for finding a solution for this, Nathan.

Forest Bond (forest-bond) wrote :

bobbyjean58,

The standard place for this kind of setting is in /etc/sysctl.conf. Add the following line to that file:

net.ipv4.tcp_timestamps=0

This is preferred over /etc/init.d/rc.local (actually, you may want to change /etc/rc.local instead of that file, too).
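
To check whether the setting is both configured persistently and active on the running system, the two places can be compared. This is a minimal sketch, assuming the usual `key = value` layout of /etc/sysctl.conf and the usual mapping of dotted keys to paths under /proc/sys; the function names are hypothetical.

```python
from pathlib import Path

def parse_sysctl_conf(text):
    """Return {key: value} from sysctl.conf-style text; later lines win."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not line or line[0] in "#;":
            continue
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings

def read_runtime_value(key, proc_root="/proc/sys"):
    """Read the live value of a sysctl key, or None if it cannot be read."""
    path = Path(proc_root) / key.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        return None
```

For example, `parse_sysctl_conf(Path("/etc/sysctl.conf").read_text()).get("net.ipv4.tcp_timestamps")` gives the persistent value, and `read_runtime_value("net.ipv4.tcp_timestamps")` gives the value currently in effect; a mismatch suggests the file has not been applied since it was edited.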

I agree, though. Ubuntu is supposed to be easy to use, and editing /etc/sysctl.conf is outside of what many users can reasonably be expected to do. I understand that this is ultimately a router bug, but why aren't other OSes affected? Do they also fail to set the timestamp header?

Steve Langasek (vorlon) on 2008-10-16
Changed in linux:
status: Confirmed → Invalid
Colin Watson (cjwatson) on 2008-10-22
Changed in ubuntu-release-notes:
assignee: nobody → kamion
importance: Undecided → Medium
status: New → Triaged
Colin Watson (cjwatson) on 2008-10-22
Changed in ubuntu-release-notes:
status: Triaged → Fix Released
Steve Langasek (vorlon) on 2008-10-22
Changed in ubuntu-release-notes:
status: Fix Released → In Progress
assignee: kamion → vorlon
Changed in linux:
status: Invalid → Triaged
Changed in linux:
status: Unknown → Confirmed
Steve Langasek (vorlon) on 2008-10-22
Changed in ubuntu-release-notes:
status: In Progress → Fix Released
Matt Zimmerman (mdz) on 2008-10-23
Changed in linux:
milestone: none → intrepid-updates

Thanks all for the TCP timestamps workaround on the 8.10 rc release notes page. I can confirm experiencing the described "certain sites" connection problem in ibex after an upgrade from hardy LTS.

I am using a Westell Model 327W wired/wireless router on Verizon DSL. After an attempt to diagnose the issue with system monitor, I rtfm'ed the known issues page and the description looked like it was written to me personally.

Kudos to all for resolving the issue.

Nick Lowe (n-lowe) wrote :

I agree... You should really delay for this if you need more time...

RC shouldn't mean Release ComeHellOrHighWater

The users who are most likely to hit this are home users behind their aged/unmaintained consumer routers who are highly unlikely to understand why they can't access the Web and will just go elsewhere...

Graham Lyon (eviltwin) wrote :

Just to clarify on my point above, what Nick said about it being people who aren't technically minded being affected by this is what I was attempting to show. En masse, the people who are affected by this will be users who don't read the release notes (I often don't bother, for example, only when downloading RC builds do I generally do that). That means that saying "oh well all they have to do is sudo blah and it will all work" is absolutely pointless. This gives the worst out of the box experience for any linux converts and I am anticipating my girlfriend ringing me at the end of the month because she's upgraded her laptop and the internet no longer works...

If you truly care about ubuntu being "linux for human beings" then you will delay the release to get this fixed - having network access is one of the most fundamental features of the distro. How many threads in the forums are you anticipating seeing of people who can't be bothered to search the forums, don't read the release notes and seem to have internet access issues? (assuming they can get to the forums!)

Let's look at the release notes. First, the description isn't /technically/ accurate: the issue isn't a result of tcp timestamping, it's a result of the ordering and disabling timestamps happens to fix the issue. So, I'm a home user and I'm told I should run the command:
sudo sysctl -w net.ipv4.tcp_timestamps=0
Great, where do I type that? What's this TCP stuff anyway? Those release notes are written by devs and clearly target devs at some points, you're just not really thinking about who this affects most badly...

As always, feel free to retort my above points...

Nick Lowe (n-lowe) wrote :

Looking at the release notes, which -normal- user is going to even associate Internet with TCP... (In the UK, most people think TCP is a brand of antiseptic!)

It is implicit to -us- only.

If I asked my brother or mom to read it, what would they say?

They'll just think, I can't get to where I want to go, this doesn't work, Ubuntu is broken!

It really does seem to be another case of Linux, written by geeks for geeks. (And I really thought Ubuntu aimed to be better than that!)

bobbyjean58 (bobbyjean58) wrote :

Quote
"I'm sorry, but just no. You clearly didn't think that through: if they are affected by this issue then they won't be able to get the updated kernel, surely you thought of this? The whole point of the release candidate stage is to catch critical bugs and this bug /is/ critical as you are prevented from updating your installation and so cannot EVER be free of the bug short of downloading the packages to USB. Please correct me if I'm wrong..."

Exactly. How the heck are people affected by this problem going to update the kernel, at some future time, if they can't connect to the update server or seek help if they can't connect to Ubuntu Forum? As we say here in southwest Virginia...that dog don't hunt. BTW, my Verizon DSL modem is only 3 years old (Westell 6100).

Nick Lowe (nick-int-r) wrote :

I've just seen:

http://kerneltrap.org/node/16680

This isn't going to be a corner case issue.

If the problem here is the QA process that has to follow a rebuild, and not the rebuild time itself, then what is needed here, IMO, is clear risk management.

My view is that, even this late in the release cycle, the fix is so obvious and limited in code footprint/scope, that the question needs to be asked:

What is the chance of us introducing a regression making this change?

-weighted against-

What do we know or think will happen if we do nothing about this?

For me, considering that, I definitely feel that this is a strong candidate to be an exception to normal release rules.

Changed in linux:
status: Confirmed → Fix Released
Steve Langasek (vorlon) wrote :

Good point regarding the opacity of the release note entry's title, thanks. I've updated this now to read:

  Unable to browse some websites when connected to particular Internet providers

I'm reluctant to name particular ISPs or routers in our release notes because we haven't established what routers have the bug. Given that users have reported it with Verizon with multiple models of DSL routers it seems likely that the problem lies with Verizon's routers and not with the Westells themselves, but interjecting guesses into the release notes isn't what we want to be doing.

And yes, this is a question of being able to release on time /at all/, not just a question of QAing the result. But we can aim to have the fixed kernel available ASAP after the 8.10 release.

Colin Watson (cjwatson) wrote :

Due to time constraints, it looks as if we're going to work around this in procps for 8.10.

Changed in procps:
assignee: nobody → kamion
importance: Undecided → Critical
milestone: none → ubuntu-8.10
status: New → In Progress
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package procps - 1:3.2.7-9ubuntu2

---------------
procps (1:3.2.7-9ubuntu2) intrepid; urgency=low

  * Add debian/sysctl.d/10-tcp-timestamps-workaround.conf to disable TCP
    timestamping, since its implementation in the 2.6.27 kernel in Ubuntu
    8.10 causes problems with certain routers (LP: #264019).

 -- Colin Watson <email address hidden> Mon, 27 Oct 2008 10:47:21 +0000

Changed in procps:
status: In Progress → Fix Released
Matt Zimmerman (mdz) wrote :

The procps workaround should resolve this issue for affected users for the 8.10 release. We also plan to issue a kernel update after the release with the proper fix, once it has been fully tested.

[For people reading this as a Launchpad bug report: this mail was sent
to users who reported problems here, and CCed to the bug report for the
record.]

Hi,

I understand that all of you have reported being affected by
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/264019. We've put a
workaround in place for Ubuntu 8.10 (turning off TCP timestamps by
default), and I'd like to confirm that this successfully works around
the bug for you.

Assuming that you're still running intrepid, you can apply our
workaround by upgrading to version 1:3.2.7-9ubuntu2 of the 'procps'
package. This is available now on archive.ubuntu.com, although if you're
using a mirror this may take a while to propagate, and if the bug itself
prevents you from reaching archive.ubuntu.com then of course this may be
a problem. I've put copies of the packages here, which you can fetch
from a separate machine and install with 'sudo dpkg -i' if you need to:

  http://www.chiark.greenend.org.uk/~cjwatson/tmp/procps/procps_3.2.7-9ubuntu2_amd64.deb
    (md5sum: 9e7bc81b5bd2122fcbb6a8095e13ca3b, for 64-bit PCs)
  http://www.chiark.greenend.org.uk/~cjwatson/tmp/procps/procps_3.2.7-9ubuntu2_i386.deb
    (md5sum: f574ef691b8ab6492bad30fdbe2d0829, for 32-bit PCs)

Please report your findings as comments on
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/264019 so that we
can confirm this fix as quickly as possible or, if necessary, figure out
what else we need to do.

Thanks,

--
Colin Watson [<email address hidden>]

Matt Zimmerman (mdz) wrote :

As the workaround is now in place, I'm lowering the importance of the kernel task to medium as it's no longer urgent.

Changed in linux:
importance: High → Medium
Michael Hipp (michael-hipp) wrote :

I can confirm the fix works (this on an Alltel 3G/EVDO network with WinXP as a router - don't laugh). The problem had been consistent and repeatable before. Good work Colin and anyone else that helped.

Could you let us know when the fix has made it into an iso so we can test a fresh install?

I can confirm that running procps_3.2.7-9ubuntu2_i386.deb worked as expected.

I downloaded the .deb package, reversed the net.ipv4.tcp_timestamps = 0 in /etc/sysctl.conf fix and then applied the package.

I am using a Westell Model 327W (D90-327W15-06) Manufactured 9/2006 with Verizon DSL.

Thanks all!

On Mon, Oct 27, 2008 at 03:57:37PM -0000, Michael Hipp wrote:
> I can confirm the fix works (this on an Alltel 3G/EVDO network with
> WinXP as a router - don't laugh). The problem had been consistent and
> repeatable before. Good work Colin and anyone else that helped.

> Could you let us know when the fix has made it into an iso so we can
> test a fresh install?

The fix is in all the daily ISO images from today.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

Changed in linux:
status: Unknown → Invalid

The new procps deb I just updated via synaptic fixed the network issue. Congrats. This will give a new user an opportunity to fairly judge Ubuntu without having to deal with a connection issue. Nice work.

Thanks for your efforts to everyone involved.

Bob Good

Tim Gardner (timg-tpi) wrote :

Intrepid - tcp: Restore ordering of TCP options for the sake of inter-operability
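
The commit title above refers to the order in which options appear in the TCP header. A minimal sketch of how that order can be read from raw option bytes follows. The option kinds and the kind/length layout come from the TCP specifications; the function name and both byte strings are hypothetical illustrations, not data from the captures attached to this bug, and the sketch makes no claim about which order any particular kernel emits.

```python
# Option kinds defined by the TCP specifications (RFC 793, RFC 2018, RFC 7323).
OPTION_NAMES = {2: "MSS", 3: "WScale", 4: "SACK-Permitted", 5: "SACK", 8: "Timestamps"}

def parse_tcp_options(data):
    """Return option names in the order they appear in the option bytes."""
    names = []
    i = 0
    while i < len(data):
        kind = data[i]
        if kind == 0:            # End of option list: stop parsing.
            names.append("EOL")
            break
        if kind == 1:            # NOP is a single padding byte, no length field.
            names.append("NOP")
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated option")
        length = data[i + 1]     # Length covers the kind and length bytes too.
        if length < 2 or i + length > len(data):
            raise ValueError("bad option length")
        names.append(OPTION_NAMES.get(kind, f"kind-{kind}"))
        i += length
    return names

# Two hypothetical SYN option blocks carrying the same options in different order.
ORDER_A = bytes.fromhex("020405b4" "0402" "080a0000000100000000" "01" "030307")
ORDER_B = bytes.fromhex("020405b4" "01" "030307" "0402" "080a0000000100000000")

print(parse_tcp_options(ORDER_A))
print(parse_tcp_options(ORDER_B))
```

Both blocks are valid and carry identical options, which is why a problem that depends only on ordering points at equipment in the path rather than at the protocol itself.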

Changed in linux:
assignee: nobody → timg-tpi
importance: Medium → High
status: Triaged → Fix Committed
Nick Lowe (n-lowe) wrote :

http://lwn.net/SubscriberLink/304791/b4651360c812df4a/?format=printable

Posted Oct 29, 2008 14:29 UTC (Wed) by davem (subscriber, #4154) [Link]
Btw, if you care at all about your data, you will not run
Ubuntu's release that doesn't fix the kernel and instead
turns timestamps off.

If you turn timestamps off, at rates of 1GB/s and above you
are exposed to possible sequence number wraparound. This in
turn can lead to data corruption. Without timestamps there is
no PAWS protection (Protection Against Wrapped Sequence numbers)
and thus at high enough data rates new data can be interpreted
as old data and vice versa, corrupting your data stream.

Ubuntu made the wrong decision, there is simply no argument for
the way this was "handled."

I don't understand why everyone gets their tits in a knot when
even the slightest suggestion of slipping a release is suggested
in order to fix a serious bug like one of this magnitude. It is
always the right thing to do, and it avoids crap like what is
happening here.

To reiterate, if timestamps are off, you are exposed to possible
data corruption.
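
The wraparound risk described in the quoted comment can be put into rough numbers. TCP sequence numbers are 32 bits wide and count bytes, so the sequence space of one connection repeats after 2**32 bytes. This is a back-of-the-envelope sketch, assuming the link rates below are in bits per second and fully used by a single connection; the function name is hypothetical.

```python
SEQ_SPACE_BYTES = 2 ** 32  # 32-bit sequence numbers, one per byte sent.

def seconds_to_wrap(bits_per_second):
    """Time for one connection to send 2**32 bytes at the given line rate."""
    bytes_per_second = bits_per_second / 8
    return SEQ_SPACE_BYTES / bytes_per_second

for label, rate in [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
    print(f"{label}: sequence space wraps after about {seconds_to_wrap(rate):.1f} s")
```

At 1 Gbit/s the sequence space wraps in roughly 34 seconds, which is short enough for a delayed old segment to overlap with new data unless something like the timestamp-based PAWS check rejects it.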

Graham Lyon (eviltwin) wrote :

>Ubuntu made the wrong decision, there is simply no argument for
>the way this was "handled."

Called it ;)

Matt Zimmerman (mdz) wrote :

Let's leave the childish remarks to certain bitter Red Hat employees on LWN, ok? This is a bug report and is not the place to engage in this sort of nonsense.

Nick Lowe (n-lowe) wrote :

Sorry? Respectfully, you seem to be missing the point here.

"So its a data corruption issue on top of a security issue.

Has this been communicated into the Ubuntu bug tracker?

-jef"

I have quoted and posted a link here for that very reason.

Thanks,

Nick

Nick Lowe (n-lowe) wrote :

That certain "bitter" red hat employee would happen to be one of the maintainers of the Linux TCP/IP stack:

http://lxr.linux.no/linux+v2.6.26/MAINTAINERS#L2947

I certainly had no appreciation of the importance of PAWS protection...

Nick Lowe (n-lowe) wrote :

Just to clarify this in stark terms...

The workaround -introduces- a data corruption problem at high data rates because it disables PAWS protection in the TCP/IP stack by virtue of the timestamps no longer being there.

The server release will go out with this, which will be run in places highly likely to see these data rates!

Stefano Maioli (smaioli) wrote :

If I understand this correctly, TCP timestamps were completely broken in Hardy, so the workaround introduces no new bug or regression. Plus, a fixed kernel will probably come in a matter of days. If this is true I don't see the point in delaying the release.

On Wed, Oct 29, 2008 at 10:56:23PM -0000, Nick Lowe wrote:
> Just to clarify this in stark terms...

> The workaround -introduces- a data corruption problem at high data rates
> because it disables PAWS protection in the TCP/IP stack by virtue of the
> timestamps no longer being there.

> The server release will go out with this, which will be run in places
> highly likely to see these data rates!

It has always been our intention to provide a kernel update immediately
post-release to correct this issue. This new information doesn't change
that plan (and indeed, Dave appears to acknowledge that disabling tcp
timestamping is the lesser evil here).

We should consider re-adding this to the release notes based on this latest
information, so thank you for the pointer.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

Changed in ubuntu-release-notes:
status: Fix Released → Confirmed

the decision has been made to release this as a security update immediately post-release, so a separate release notes entry is not warranted.

Changed in ubuntu-release-notes:
status: Confirmed → Invalid
Pete Graner (pgraner) wrote :

For those watching this bug you can find out why we did things this way here:

http://blog.redvoodoo.org/2008/10/tcp-timestamps-2627-why-ubuntu-put-out.html

Pete Graner (pgraner) wrote :

linux (2.6.27-7.15) intrepid-security; urgency=low

  [ Upstream Kernel Changes ]

  * tcp: Restore ordering of TCP options for the sake of inter-operability
    - LP: #264019

 -- Tim Gardner <email address hidden> Mon, 27 Oct 2008 19:28:06 -0600

Changed in linux:
status: Fix Committed → Fix Released
Colin Watson (cjwatson) wrote :

procps (1:3.2.7-9ubuntu3) intrepid-security; urgency=low

  * Drop debian/sysctl.d/10-tcp-timestamps-workaround.conf again now that we
    have a fixed kernel, and make sure it gets removed on upgrade to this
    version (LP: #264019).

 -- Colin Watson <email address hidden> Thu, 30 Oct 2008 03:39:39 +0000

Pete Graner (pgraner) wrote :

Technical explanation of how the procps package undoes the tcp timestamp setting described earlier in this bug:

http://blog.redvoodoo.org/2008/10/backing-out-procps-change-how-it-works.html

Dean Holland (speedster-3fl) wrote :

I am still seeing issues even with the 2.6.27-7.15 kernel and updated procps package. Setting tcp_timestamps to 0 does not resolve it for me.

I can, however, reboot into the 2.6.24-21-server (from Hardy) and all browsing etc. returns to normal.

speedster@whinge:~$ apt-cache policy procps linux-image-2.6.27-7-server
procps:
  Installed: 1:3.2.7-9ubuntu2.1
  Candidate: 1:3.2.7-9ubuntu2.1
  Version table:
 *** 1:3.2.7-9ubuntu2.1 0
        500 http://mirror.3fl.net.au intrepid-updates/main Packages
        500 http://security.ubuntu.com intrepid-security/main Packages
        100 /var/lib/dpkg/status
     1:3.2.7-9ubuntu2 0
        500 http://mirror.3fl.net.au intrepid/main Packages
linux-image-2.6.27-7-server:
  Installed: 2.6.27-7.15
  Candidate: 2.6.27-7.15
  Version table:
 *** 2.6.27-7.15 0
        500 http://mirror.3fl.net.au intrepid-updates/main Packages
        500 http://security.ubuntu.com intrepid-security/main Packages
        100 /var/lib/dpkg/status
     2.6.27-7.14 0
        500 http://mirror.3fl.net.au intrepid/main Packages

Il giorno Sun, 02 Nov 2008 13:38:39 -0000
Dean Holland <email address hidden> ha scritto:

> I am still seeing issues even with the 2.6.27-7.15 kernel and updated
> procps package. Setting tcp_timestamps to 0 does not resolve it for
> me.

try adding:
echo 0 > /proc/sys/net/ipv4/tcp_sack

> try adding:
> echo 0 > /proc/sys/net/ipv4/tcp_sack

No good unfortunately, I captured the output of tcpdump and have attached it.

Dean Holland (speedster-3fl) wrote :

libpcap output on 2.6.24-21

Luke Maurer (luke-maurer) wrote :

I'm seeing a rather bizarre variant of this. I just upgraded my masquerading home server to Intrepid, and now browsing from the server itself works fine, but browsing from *other* machines does not, including both my Intrepid desktop machine and my iPhone. Very similar symptoms - if I'm reading tshark right, the request goes out (and sometimes even gets ACK'd) but the HTTP response never materializes. Setting either tcp_timestamps or tcp_sack to 0 doesn't help, and I just upgraded yesterday so everything should be up to date.

Luke Maurer (luke-maurer) wrote :

Ugh ... my bad. It's a userspace problem; I just booted into the Hardy kernel and it's still broken ... so nevermind.

Dean Holland (speedster-3fl) wrote :

I still believe mine is a kernel issue, as per the information in this Ubuntu Forums thread: http://ubuntuforums.org/showthread.php?p=6128733

This machine runs inside of KVM but that should not make any difference as it works _perfectly_ with 2.6.24-21, leading me to believe this hasn't been completely resolved.

To summarise, only changing the default kernel in menu.lst to 2.6.24-21 and rebooting makes the issue go away. I have tried:
 - empty firewall; only a masquerade rule
 - removing all QoS settings
 - clean install of Intrepid (64-bit) as the current was upgraded from Hardy
 - clean install of Intrepid (32-bit)
 - changing MTU and MSS clamping to lower sizes (1460 & 1420)
 - changed the type of network card from virtio to the default (rtl8139)
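
For the MTU and MSS values in the list above, the usual relationship is that the TCP maximum segment size is the MTU minus the IP and TCP headers. This is a small sketch, assuming IPv4 and 20-byte headers with no IP or TCP options; the function name is hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options.
TCP_HEADER_BYTES = 20   # TCP header without options.

def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES

for mtu in (1500, 1460):
    print(f"MTU {mtu} -> MSS {mss_for_mtu(mtu)}")
```

Under these assumptions an MTU of 1460 corresponds to an MSS of 1420, which matches the pair of values quoted in the list.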

I see 2.6.27-7.16 is out, but as that was just a security fix I didn't expect that would have resolved it.

On Sat, Nov 08, 2008 at 05:09:26AM -0000, Dean Holland wrote:
> I still believe mine is a kernel issue, as per the information in this
> Ubuntu Forums thread: http://ubuntuforums.org/showthread.php?p=6128733
>
> This machine runs inside of KVM but that should not make any difference
> as it works _perfectly_ with 2.6.24-21, leading me to believe this
> hasn't been completely resolved.
>
> To summarise, only changing the default kernel in menu.lst to 2.6.24-21 and rebooting makes the issue go away. I have tried:
> - empty firewall; only a masquerade rule
> - removing all QoS settings
> - clean install of Intrepid (64-bit) as the current was upgraded from Hardy
> - clean install of Intrepid (32-bit)
> - changing MTU and MSS clamping to lower sizes (1460 & 1420)
> - changed the type of network card from virtio to the default (rtl8139)
>
> I see 2.6.27-7.16 is out, but as that was just a security fix I didn't
> expect that would have resolved it.

If disabling TCP timestamping doesn't work around the problem, then your
symptoms are different from the others described in this bug, and you should
file your report separately.

--
 - mdz

O.Chr.Jensen (ochrj) on 2009-08-28
Changed in linux:
status: Fix Released → In Progress
Steve Langasek (vorlon) on 2009-08-28
Changed in linux:
status: In Progress → Fix Released
Changed in linux (Mandriva):
importance: Unknown → High