TCP option ordering in 2.6.27 is not interoperable with some network equipment

Bug #264019 reported by Marques Johansson
This bug affects 5 people
Affects: Status / Importance / Assigned to / Milestone ("-" = not shown)
Linux: Fix Released / Unknown / - / -
Release Notes for Ubuntu: - / Medium / Steve Langasek / -
autorun4linuxcd (Ubuntu): - / High / Tim Gardner / -
  Intrepid: - / High / Tim Gardner / -
linux (Mandriva): Invalid / High / - / -
procps (Ubuntu): - / Critical / Colin Watson / -
  Intrepid: - / Critical / Colin Watson / -

Bug Description

Kernel 2.6.27 introduced a networking issue: sites such as youtube.com and microsoft.com do not load in web browsers on one of my systems (but do on others). This system, like the others, was upgraded from Hardy.

I am unable to transfer files from archive.ubuntu.com via apt (http) or lftp (ftp, not http). I can connect via ftp and traverse and list files/directories, but when I try to "get", the transfer stays at 0 bytes.

Likewise, if I do a naive "telnet youtube.com 80"
GET / HTTP/1.1
HOST: youtube.com
I get a response, but when a web browser (Firefox, Opera) attempts to load these sites, the page makes no progress: the 'transferring' status stays fixed. Some web sites (Google, Yahoo) work fine, though.
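The manual telnet test above can be scripted so that it also reports whether the transfer stalled. This is only a rough sketch: the function names, the 10-second timeout, and the added "Connection: close" header are assumptions made for illustration and are not part of the original report.

```python
import socket


def build_request(host, path="/"):
    """Build the same minimal HTTP/1.1 request that was typed into telnet.

    "Connection: close" is an addition so the server closes the socket
    when the response is complete.
    """
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch(host, port=80, timeout=10.0):
    """Return (bytes_received, stalled) for one plain HTTP request.

    stalled is True when no data arrived for `timeout` seconds before the
    server closed the connection, which is the symptom described above.
    """
    received = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host))
        try:
            while True:
                chunk = sock.recv(4096)
                if not chunk:  # orderly close: the whole response arrived
                    break
                received += len(chunk)
        except socket.timeout:
            return received, True
    return received, False
```

Calling `fetch("youtube.com")` on an affected machine and on an unaffected one, and comparing the byte counts, gives a repeatable version of the browser symptom.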

When I reboot this system back to the 2.6.26-5 kernel, everything works as it should. I tried setting rp_filter=0 on all, eth0, and default (instead of the Ubuntu default of 1), as I have seen both rp_filter and ECN settings cause site-specific problems. Those settings had no effect in this case.

I only have remote access to this system. I am able to remotely ssh into this box, but I could not scp a file to it. I was able to scp to this host from a host on the same local network, however. I have tested this problem using both the wired ethernet adapter, and using a wireless adapter. The same problem existed in both configurations - some websites could be reached, while others could not.

The other Intrepid host on the same network did not have these connection issues.

Revision history for this message
Hinrik Örn Sigurðsson (hinrik-sig) wrote :

I just installed Intrepid alpha 5 and I have the same issue.

Revision history for this message
Hinrik Örn Sigurðsson (hinrik-sig) wrote :

This is an issue with both 2.6.27-1 and 2.6.27-2. Reverting back to 2.6.26-5 works (had to copy the packages over via USB of course).

Revision history for this message
Timothy G. Rundle (tgrundle) wrote :

I am experiencing the same issue (also in 2.6.27-3). If I boot into Hardy everything works just fine, but in Intrepid Alpha 5, some websites and apt don't work. All LAN traffic works just fine. I also ruled out hardware issues by replacing the router, the network line, and the network card. I did a snoop of the network traffic, but nothing stood out to me there.

I have attached the output of the lspci.

Revision history for this message
Alex Converse (ajc30) wrote :

I'm experiencing the same issue with an Intel Corporation 82566DC Gigabit Network Connection (rev 02)

Revision history for this message
bobbyjean58 (bobbyjean58) wrote :

I have had a similar problem since the 2.6.27 kernel was introduced in Intrepid. I can connect to some sites, like iGoogle and Yahoo (plus others when I can find them), but not to others (like Ubuntu Forums). I cannot do any updates either, whether in the terminal (the cursor just sits there blinking with "Waiting for Headers" displayed) or in Synaptic. I did find a workaround, though: if I install Firestarter, internet surfing is back to normal (under the 2.6.27 kernel). Strange, I know, but it works. My chipset is a VIA VT6103L for my on-board LAN connection. I also run through a Westell 6100 NAT combo modem. I've never had this problem in prior versions of Ubuntu. BTW, in Hardy I compiled a 2.6.27-rc5 kernel and had the same problem.

Glad I found someone else who's having the same problem. At least I know it's not just me.

Thanks.

Revision history for this message
Marques Johansson (marques) wrote :

On an up-to-date Intrepid system with 2.6.27-4 (i386) I am still experiencing this problem. 2.6.27-2-generic also showed this problem.

When using 2.6.26-5 network connections work correctly.

Revision history for this message
Marques Johansson (marques) wrote :

Here's the dmesg from 2.6.27-4-generic. Some sites do not make functioning connection - including Ubuntu updates via apt.

Revision history for this message
bobbyjean58 (bobbyjean58) wrote :

It's the 2.6.27 kernel that's the problem. Yesterday I tried Fedora 10, which uses the 2.6.27-* kernel, and had the same problem, but, just like in Ubuntu, if you install firestarter (I downloaded a copy and saved it before I installed Fedora) internet surfing is fine. Strange.

Revision history for this message
Hinrik Örn Sigurðsson (hinrik-sig) wrote :

2.6.27-5 still has this bug.

description: updated
Revision history for this message
cropr (ruben-decrop) wrote : Re: unable to visit some websites and ftpsites with 2.6.27

I experience the same bug with Intrepid beta 1. Doing a telnet on port 80 to a slow site works. Doing a GET / is extremely slow, less than 100 bytes per second. Because of this behaviour, every browser I used (Firefox, Opera, Konqueror) gives connection errors.

Booting Intrepid beta 1 with the Hardy kernel 2.6.24-19 works OK.

Revision history for this message
Nathan Barnes (nathanvbarnes) wrote :

Hi, this is my first ever post to an Ubuntu / Launchpad forum.
I am having the exact same problem with Ubuntu 8.10. I have found that turning off the SPI firewall in my Netgear WPN824 seems to fix it. However, the system update manager is still running very slowly.
Everything worked fine in Hardy, which I used on the same computer previously (an Eee PC 900). However, my housemate wants the SPI firewall on, so I can't use the internet :(

Revision history for this message
Nathan Barnes (nathanvbarnes) wrote :

I just Installed firestarter and that fixes both Web surfing and update manager downloads. Thanks BobbyJean

Revision history for this message
Daniel Leiter (dleiter) wrote :

I had been trying to find a solution to this problem for a week! This just made my day, thanks!

Revision history for this message
Forest Bond (forest-bond) wrote :

I'm experiencing the same issue on my HP Mini Note 2133, both on the wireless interface (b43) and the wired interface (tg3).

I did some tracing with wireshark and found that a response packet was being dropped and not resent. It was followed by a duplicate ACK. I don't know TCP well enough to provide any further insight, but I do have the wireshark capture file. Please ask for it if it would be useful.

Revision history for this message
Forest Bond (forest-bond) wrote :

One more thing: I don't think it is the site that causes the problem. It's just that larger HTTP downloads are more likely to fail due to increased likelihood of dropped packets.

Revision history for this message
Forest Bond (forest-bond) wrote :

BTW, firestarter doesn't seem to help here. I suspect that if it appears to work for anyone, it's probably coincidental. Please feel free to correct me if I'm wrong.

What's confusing about this bug is that it certainly appears to be in the TCP stack somewhere, but it obviously isn't affecting a lot of people or there would be more reports. It's not limited to a particular driver, though, because I see it with both my wired and wireless connections.

This is a real show-stopper for me, though. TCP becomes totally unreliable.

Revision history for this message
Forest Bond (forest-bond) wrote :

Text export of wireshark trace is attached.

BTW, I'm wondering if this is related to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/230456. I don't really know anything about this, though, so I could be way off...

Revision history for this message
Forest Bond (forest-bond) wrote :

Yeah, can't be #230456 because that patch is already applied.

I have another wireshark trace attached. The basic problem is "TCP Previous segment lost" followed by a series of "TCP Dup Ack".
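To make the two Wireshark labels concrete: "previous segment lost" roughly means a data segment arrived whose sequence number is beyond the next byte expected, and a "duplicate ACK" is the receiver repeating the same acknowledgement number while it waits for the missing bytes. The sketch below illustrates that idea on simplified inputs. It is not Wireshark's actual analysis logic, it ignores sequence-number wraparound, and the function names and input shapes are made up for illustration.

```python
def find_gaps(segments):
    """Return (expected_seq, actual_seq) pairs where incoming data skipped ahead.

    segments: list of (seq, length) tuples for one direction, in arrival order.
    """
    gaps = []
    expected = None
    for seq, length in segments:
        if expected is not None and seq > expected:
            gaps.append((expected, seq))  # bytes expected..seq-1 never arrived
        end = seq + length
        if expected is None or end > expected:
            expected = end
    return gaps


def count_dup_acks(acks):
    """Return {ack_number: repeats} for ACK numbers sent more than once in a row.

    acks: list of acknowledgement numbers sent by the receiver, in order.
    """
    dups = {}
    previous = None
    for ack in acks:
        if ack == previous:
            dups[ack] = dups.get(ack, 0) + 1
        previous = ack
    return dups
```

For example, `find_gaps([(1, 100), (101, 100), (301, 100)])` reports one gap starting at byte 201, and a receiver that keeps acknowledging 201 afterwards shows up in `count_dup_acks`.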

Revision history for this message
Forest Bond (forest-bond) wrote :

One more wireshark capture. This one shows multiple duplicate ACKs.

Revision history for this message
Dhaval Patel (dhuv) wrote :

I also notice that on my Verizon DSL it does NOT work but on a Comcast Cable or Time Warner Cable connection, it is fine. I hope this info helps with the troubleshooting. For the time being I will downgrade to 2.6.25.

Revision history for this message
Forest Bond (forest-bond) wrote :

I'm also on Verizon (well, Earthlink via Fairpoint, previously Verizon) DSL.

Revision history for this message
Marques Johansson (marques) wrote :

I also experienced the problem with Verizon DSL as the ISP.

Revision history for this message
bobbyjean58 (bobbyjean58) wrote :

I'm on Verizon DSL.

Revision history for this message
Omer (mermerico) wrote :

I have this problem as well. I'm running 64-bit, wireless with ndiswrapper, on Verizon DSL. I would like to mention that the websites that don't work differ from person to person. I couldn't access Launchpad, for instance. Switching to the older kernel fixed my problem; I didn't get to try Firestarter because Google hadn't cached this page since people mentioned it :-). People on the Ubuntu forums suggested using some network-manager PPA, but because I couldn't access Launchpad, I couldn't do that. Intrepid should really be delayed to make sure this problem gets fixed. It would give the high command a chance to include OpenOffice 3 as well.

Revision history for this message
Forest Bond (forest-bond) wrote :

Attached PCAP files as requested by slangasek.

Revision history for this message
Forest Bond (forest-bond) wrote :

Attached PCAP files as requested by slangasek.

Revision history for this message
Forest Bond (forest-bond) wrote :

Attached PCAP files as requested by slangasek.

Revision history for this message
Steve Langasek (vorlon) wrote :

Such a bug has a significant impact on the release so I'm confirming this for intrepid, but from the available information, I'm not sure that we're really looking at a kernel bug here. The wireshark traces available certainly seem to indicate a failure of the return traffic from the server to be received by the client, which I would expect to manifest much more dramatically if it were the result of a bug in the client stack.

I've requested full pcap traces, which may provide more information here.

Changed in linux:
importance: Undecided → High
status: New → Confirmed
Revision history for this message
Steve Langasek (vorlon) wrote :

I've run a comparable test against www.youtube.com here, and find that the server responds to the query with one very large, fragmented packet. Given that these packet fragments never appear at all in Forest's trace, and given that there's a common theme that this affects users of Verizon DSL, it seems likely that the router is discarding them for some reason.

Nothing stands out to me in the packet trace that would explain why those fragments would be discarded, though.

What router do each of you have in place on your Verizon DSL?

Revision history for this message
Forest Bond (forest-bond) wrote :

The complicating factor is that the problem does not exist at all on 2.6.24 (running intrepid userland), but is very easy to reproduce with 2.6.27. I agree that our network environments are related, but the kernel version seems to be a factor, as well.

I have a Westell WireSpeed Data Gateway. I can get the exact model if it's important.

Revision history for this message
Manuel Siggen (manuel-siggen) wrote :

I experienced the same problem on my Thinkpad X40 : running the 2.6.27 kernel I could ping all hosts on the web, but couldn't wget/curl any of them. Even apt-get update couldn't connect to security.ubuntu.com ! Everything works fine with the 2.6.24 kernel. I tried this at home, I'll try it again at work to see if I can reproduce this strange behavior.

Revision history for this message
Dhaval Patel (dhuv) wrote :

It does not matter which router you use. I have tried this on a few Verizon DSL connections with different DSL modems and routers (Netgear, Linux gateway, Linksys), they all have the same problem. I am able to ping the server and since one of my servers on the inet has this problem I was able to do a tcpdump on my laptop and server. The server definitely gets the request and even sends a response. I did not spend much time verifying each of the lines in the tcpdump but there definitely is a request and response.

I have a server that is on a DSL connection that is on Cloud9 but their lines are leased from Verizon so it is basically a Verizon connection. When my laptop is on a cable connection I can get to all sites just fine, but I am not able to get to the site on that server. So it is a problem with all clients trying to access servers on a Verizon network as well.

Revision history for this message
NathanMoureyII (snafu7t5) wrote :

I also had this problem. I found my solution here:

http://wiki.mandriva.com/en/2009.0_Errata#Slow_or_no_internet_response_.28particularly_web_browsing.29

Hope this helps.

Nate.

Revision history for this message
Forest Bond (forest-bond) wrote :

Indeed, this looks very similar:

https://qa.mandriva.com/show_bug.cgi?id=43372

I will try the fix in a few and report back.

Revision history for this message
Forest Bond (forest-bond) wrote :

I *think* that `sysctl -w net.ipv4.tcp_timestamps=0` resolves this issue for me.
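Before and after running that command, it can help to confirm what the kernel currently has set. On Linux, sysctl keys map onto files under /proc/sys, so net.ipv4.tcp_timestamps lives at /proc/sys/net/ipv4/tcp_timestamps. A minimal sketch follows; the helper names are made up for illustration, and the simple dot-to-slash mapping assumes no component of the key (such as an interface name) itself contains a dot.

```python
from pathlib import PurePosixPath


def sysctl_path(name, proc_root="/proc/sys"):
    """Map a dotted sysctl key to its file under /proc/sys."""
    return str(PurePosixPath(proc_root).joinpath(*name.split(".")))


def read_sysctl(name, proc_root="/proc/sys"):
    """Return the current value of a sysctl key as a stripped string."""
    with open(sysctl_path(name, proc_root)) as fh:
        return fh.read().strip()
```

`read_sysctl("net.ipv4.tcp_timestamps")` returns "0" once timestamps have been disabled; reading the file does not require root, unlike `sysctl -w`.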

Revision history for this message
bobbyjean58 (bobbyjean58) wrote :

One of the first things I tried to rectify this problem was to disable ipv6. That didn't work, at least for me.

Revision history for this message
Steve Langasek (vorlon) wrote :

bobbyjean58, I think the pointer to wiki.mandriva.com referred to the tcp_timestamps change, not ipv6. Can you confirm that using the sysctl command Forest mentions in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/264019/comments/35 fixes the problem for you, when run as root?

If so, I'll close the linux kernel task here and we'll document this in the release notes.

Revision history for this message
bobbyjean58 (bobbyjean58) wrote :

Thanks for the clarification, Steve. Running 'sysctl -w net.ipv4.tcp_timestamps=0' as root works, but you lose it once you reboot.

Yes, I uninstalled Firestarter and rebooted before trying that command.

Revision history for this message
bobbyjean58 (bobbyjean58) wrote :

I placed the command 'sysctl -w net.ipv4.tcp_timestamps=0' in /etc/init.d/rc.local, rebooted, and as of right now everything is fine. No connection problems. But for a novice this might be a bit of a problem to do. Isn't there a way to include this command in the Intrepid final release by default?

Thanks for finding a solution for this, Nathan.

Revision history for this message
Forest Bond (forest-bond) wrote :

bobbyjean58,

The standard place for this kind of setting is in /etc/sysctl.conf. Add the following line to that file:

net.ipv4.tcp_timestamps=0

This is preferred over /etc/init.d/rc.local (actually, you may want to change /etc/rc.local instead of that file, too).
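For anyone who would rather script the change than edit the file by hand, here is a rough sketch of updating the text of a sysctl.conf-style file so that a key appears exactly once with the desired value. The function name is made up for illustration; it only transforms text, so reading and writing /etc/sysctl.conf (as root, ideally after taking a backup) is left to the caller.

```python
def set_sysctl_line(conf_text, key, value):
    """Return conf_text with `key` set to `value`, appending it if absent.

    Lines starting with '#' or ';' are treated as comments and left alone.
    """
    new_line = f"{key}={value}"
    out = []
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_setting = (
            stripped
            and not stripped.startswith(("#", ";"))
            and "=" in stripped
        )
        if is_setting and stripped.split("=", 1)[0].strip() == key:
            if not replaced:  # keep one copy of the key, drop later duplicates
                out.append(new_line)
                replaced = True
            continue
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

The setting only takes effect at the next boot, or immediately if the file is reloaded with `sysctl -p` as root.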

I agree, though. Ubuntu is supposed to be easy to use, and editing /etc/sysctl.conf is beyond what many users can reasonably be expected to do. I understand that this is ultimately a router bug, but why aren't other OSes affected? Do they also fail to set the timestamp option?

Steve Langasek (vorlon)
Changed in linux:
status: Confirmed → Invalid
Colin Watson (cjwatson)
Changed in ubuntu-release-notes:
assignee: nobody → kamion
importance: Undecided → Medium
status: New → Triaged
Colin Watson (cjwatson)
Changed in ubuntu-release-notes:
status: Triaged → Fix Released
Steve Langasek (vorlon)
Changed in ubuntu-release-notes:
status: Fix Released → In Progress
assignee: kamion → vorlon
Changed in linux:
status: Invalid → Triaged
Changed in linux:
status: Unknown → Confirmed
Steve Langasek (vorlon)
Changed in ubuntu-release-notes:
status: In Progress → Fix Released
Matt Zimmerman (mdz)
Changed in linux:
milestone: none → intrepid-updates
Revision history for this message
williamlweaver (weaver-lasalle) wrote :

Thanks all for the TCP timestamps workaround on the 8.10 rc release notes page. I can confirm experiencing the described "certain sites" connection problem in ibex after an upgrade from hardy LTS.

I am using a Westell Model 327W wired/wireless router on Verizon DSL. After an attempt to diagnose the issue with system monitor, I rtfm'ed the known issues page and the description looked like it was written to me personally.

Kudos to all for resolving the issue.

Revision history for this message
Nick Lowe (n-lowe) wrote :

I agree... You should really delay for this if you need more time...

RC shouldn't mean Release ComeHellOrHighWater

The users who are most likely to hit this are home users behind their aged/unmaintained consumer routers who are highly unlikely to understand why they can't access the Web and will just go elsewhere...

Revision history for this message
Graham Lyon (eviltwin) wrote :

Just to clarify my point above: what Nick said about the affected people being those who aren't technically minded is what I was attempting to show. En masse, the people who are affected by this will be users who don't read the release notes (I often don't bother, for example; only when downloading RC builds do I generally do that). That means that saying "oh well, all they have to do is sudo blah and it will all work" is absolutely pointless. This gives the worst out-of-the-box experience for any Linux converts, and I am anticipating my girlfriend ringing me at the end of the month because she's upgraded her laptop and the internet no longer works...

If you truly care about ubuntu being "linux for human beings" then you will delay the release to get this fixed - having network access is one of the most fundamental features of the distro. How many threads in the forums are you anticipating seeing of people who can't be bothered to search the forums, don't read the release notes and seem to have internet access issues? (assuming they can get to the forums!)

Let's look at the release notes. First, the description isn't /technically/ accurate: the issue isn't a result of TCP timestamping itself, it's a result of the ordering of the TCP options, and disabling timestamps just happens to work around it. So, I'm a home user and I'm told I should run the command:
sudo sysctl -w net.ipv4.tcp_timestamps=0
Great, where do I type that? What's this TCP stuff anyway? Those release notes are written by devs and clearly target devs at some points, you're just not really thinking about who this affects most badly...
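On the ordering point: TCP options travel in the header as a sequence of kind/length/value records, and any order is legal, so two SYN packets can carry exactly the same options in a different sequence. The sketch below walks an options field and lists the kinds in the order they appear. The kind numbers are the standard ones (2 = MSS, 3 = window scale, 4 = SACK permitted, 8 = timestamps, 1 = NOP padding, 0 = end of list), but the two sample layouts are hypothetical illustrations, not captures from the 2.6.26 or 2.6.27 kernels.

```python
from typing import List

OPTION_NAMES = {0: "EOL", 1: "NOP", 2: "MSS", 3: "WSCALE",
                4: "SACK_PERM", 5: "SACK", 8: "TIMESTAMP"}


def option_order(options: bytes) -> List[str]:
    """List the option kinds in a raw TCP options field, in wire order."""
    order = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # end of option list: nothing follows
            order.append("EOL")
            break
        if kind == 1:            # single-byte padding, no length field
            order.append("NOP")
            i += 1
            continue
        if i + 1 >= len(options):
            raise ValueError("truncated option header")
        length = options[i + 1]  # length covers kind and length bytes too
        if length < 2 or i + length > len(options):
            raise ValueError("bad option length")
        order.append(OPTION_NAMES.get(kind, f"KIND_{kind}"))
        i += length
    return order


# Hypothetical layouts carrying the same options in different orders.
MSS = bytes([2, 4, 0x05, 0xB4])          # MSS 1460
SACK_PERM = bytes([4, 2])
TIMESTAMP = bytes([8, 10]) + bytes(8)    # zeroed TSval/TSecr
WSCALE = bytes([3, 3, 7])
NOP = bytes([1])

LAYOUT_A = MSS + SACK_PERM + TIMESTAMP + NOP + WSCALE
LAYOUT_B = MSS + NOP + NOP + TIMESTAMP + NOP + WSCALE + NOP + NOP + SACK_PERM
```

Both layouts are valid TCP, yet a middlebox that assumes one fixed sequence could mishandle the other, which is consistent with the behaviour reported in this bug. Disabling timestamps removes one option and so changes the layout, which would explain why it helps without being the root cause.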

As always, feel free to retort my above points...

Revision history for this message
Nick Lowe (n-lowe) wrote :

Looking at the release notes, which -normal- user is going to even associate Internet with TCP... (In the UK, most people think TCP is a brand of antiseptic!)

It is implicit to -us- only.

If I asked my brother or mom to read it, what would they say?

They'll just think, I can't get to where I want to go, this doesn't work, Ubuntu is broken!

It really does seem to be another case of Linux, written by geeks for geeks. (And I really thought Ubuntu aimed to be better than that!)

Revision history for this message
bobbyjean58 (bobbyjean58) wrote :

Quote
"I'm sorry, but just no. You clearly didn't think that through: if they are affected by this issue then they won't be able to get the updated kernel, surely you thought of this? The whole point of the release candidate stage is to catch critical bugs and this bug /is/ critical as you are prevented from updating your installation and so cannot EVER be free of the bug short of downloading the packages to USB. Please correct me if I'm wrong..."

Exactly. How the heck are people affected by this problem going to update the kernel, at some future time, if they can't connect to the update server or seek help if they can't connect to Ubuntu Forum? As we say here in southwest Virginia...that dog don't hunt. BTW, my Verizon DSL modem is only 3 years old (Westell 6100).

Revision history for this message
Nick Lowe (nick-int-r) wrote :

I've just seen:

http://kerneltrap.org/node/16680

This isn't going to be a corner case issue.

If the constraint here is the QA process needed after a rebuild, and not the rebuild time itself, then what is needed, IMO, is a clear case of risk management.

My view is that, even this late in the release cycle, the fix is so obvious and limited in code footprint/scope, that the question needs to be asked:

What is the chance of us introducing a regression making this change?

-weighted against-

What do we know or think will happen if we do nothing about this?

For me, considering that, I definitely feel that this is a strong candidate to be an exception to normal release rules.

Changed in linux:
status: Confirmed → Fix Released
Revision history for this message
Steve Langasek (vorlon) wrote :

Good point regarding the opacity of the release note entry's title, thanks. I've updated this now to read:

  Unable to browse some websites when connected to particular Internet providers

I'm reluctant to name particular ISPs or routers in our release notes because we haven't established what routers have the bug. Given that users have reported it with Verizon with multiple models of DSL routers it seems likely that the problem lies with Verizon's routers and not with the Westells themselves, but interjecting guesses into the release notes isn't what we want to be doing.

And yes, this is a question of being able to release on time /at all/, not just a question of QAing the result. But we can aim to have the fixed kernel available ASAP after the 8.10 release.

Revision history for this message
Colin Watson (cjwatson) wrote :

Due to time constraints, it looks as if we're going to work around this in procps for 8.10.

Changed in procps:
assignee: nobody → kamion
importance: Undecided → Critical
milestone: none → ubuntu-8.10
status: New → In Progress
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package procps - 1:3.2.7-9ubuntu2

---------------
procps (1:3.2.7-9ubuntu2) intrepid; urgency=low

  * Add debian/sysctl.d/10-tcp-timestamps-workaround.conf to disable TCP
    timestamping, since its implementation in the 2.6.27 kernel in Ubuntu
    8.10 causes problems with certain routers (LP: #264019).

 -- Colin Watson <email address hidden> Mon, 27 Oct 2008 10:47:21 +0000

Changed in procps:
status: In Progress → Fix Released
Revision history for this message
Matt Zimmerman (mdz) wrote :

The procps workaround should resolve this issue for affected users for the 8.10 release. We also plan to issue a kernel update after the release with the proper fix, once it has been fully tested.

Revision history for this message
Colin Watson (cjwatson) wrote : Testing of bug 264019

[For people reading this as a Launchpad bug report: this mail was sent
to users who reported problems here, and CCed to the bug report for the
record.]

Hi,

I understand that all of you have reported being affected by
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/264019. We've put a
workaround in place for Ubuntu 8.10 (turning off TCP timestamps by
default), and I'd like to confirm that this successfully works around
the bug for you.

Assuming that you're still running intrepid, you can apply our
workaround by upgrading to version 1:3.2.7-9ubuntu2 of the 'procps'
package. This is available now on archive.ubuntu.com, although if you're
using a mirror this may take a while to propagate, and if the bug itself
prevents you from reaching archive.ubuntu.com then of course this may be
a problem. I've put copies of the packages here, which you can fetch
from a separate machine and install with 'sudo dpkg -i' if you need to:

  http://www.chiark.greenend.org.uk/~cjwatson/tmp/procps/procps_3.2.7-9ubuntu2_amd64.deb
    (md5sum: 9e7bc81b5bd2122fcbb6a8095e13ca3b, for 64-bit PCs)
  http://www.chiark.greenend.org.uk/~cjwatson/tmp/procps/procps_3.2.7-9ubuntu2_i386.deb
    (md5sum: f574ef691b8ab6492bad30fdbe2d0829, for 32-bit PCs)
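If the package is fetched on a separate machine and carried over, it is worth checking it against the md5sum listed above before installing. A minimal sketch using Python's standard hashlib follows; the function names are made up for illustration, and the same check can be done with the `md5sum` command-line tool.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 16):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's MD5 equals the published checksum."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For the 32-bit package, that would be `matches("procps_3.2.7-9ubuntu2_i386.deb", "f574ef691b8ab6492bad30fdbe2d0829")`, assuming the file is in the current directory.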

Please report your findings as comments on
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/264019 so that we
can confirm this fix as quickly as possible or, if necessary, figure out
what else we need to do.

Thanks,

--
Colin Watson [<email address hidden>]

Revision history for this message
Forest Bond (forest-bond) wrote : Re: unable to visit some websites and ftpsites with 2.6.27

This is the right decision. Thanks!

Revision history for this message
Matt Zimmerman (mdz) wrote :

As the workaround is now in place, I'm lowering the importance of the kernel task to medium as it's no longer urgent.

Changed in linux:
importance: High → Medium
Revision history for this message
Michael Hipp (michael-hipp) wrote :

I can confirm the fix works (this on an Alltel 3G/EVDO network with WinXP as a router - don't laugh). The problem had been consistent and repeatable before. Good work Colin and anyone else that helped.

Could you let us know when the fix has made it into an iso so we can test a fresh install?

Revision history for this message
williamlweaver (weaver-lasalle) wrote :

I can confirm that running procps_3.2.7-9ubuntu2_i386.deb worked as expected.

I downloaded the .deb package, reverted the net.ipv4.tcp_timestamps = 0 fix in /etc/sysctl.conf, and then installed the package.

I am using a Westell Model 327W (D90-327W15-06) Manufactured 9/2006 with Verizon DSL.

Thanks all!

Revision history for this message
Steve Langasek (vorlon) wrote : Re: [Bug 264019] Re: unable to visit some websites and ftpsites with 2.6.27

On Mon, Oct 27, 2008 at 03:57:37PM -0000, Michael Hipp wrote:
> I can confirm the fix works (this on an Alltel 3G/EVDO network with
> WinXP as a router - don't laugh). The problem had been consistent and
> repeatable before. Good work Colin and anyone else that helped.

> Could you let us know when the fix has made it into an iso so we can
> test a fresh install?

The fix is in all the daily ISO images from today.

--
Steve Langasek      Give me a lever long enough and a Free OS
Debian Developer    to set it on, and I can move the world.
Ubuntu Developer    http://www.debian.org/
<email address hidden> <email address hidden>

Changed in linux:
status: Unknown → Invalid
Revision history for this message
bobbyjean58 (bobbyjean58) wrote : Re: unable to visit some websites and ftpsites with 2.6.27

The new procps deb I just updated via synaptic fixed the network issue. Congrats. This will give a new user an opportunity to fairly judge Ubuntu without having to deal with a connection issue. Nice work.

Thanks for your efforts to everyone involved.

Bob Good

Revision history for this message
Tim Gardner (timg-tpi) wrote :

Intrepid - tcp: Restore ordering of TCP options for the sake of inter-operability

Changed in linux:
assignee: nobody → timg-tpi
importance: Medium → High
status: Triaged → Fix Committed
Revision history for this message
Nick Lowe (n-lowe) wrote :

http://lwn.net/SubscriberLink/304791/b4651360c812df4a/?format=printable

Posted Oct 29, 2008 14:29 UTC (Wed) by davem (subscriber, #4154) [Link]
Btw, if you care at all about your data, you will not run
Ubuntu's release that doesn't fix the kernel and instead
turns timestamps off.

If you turn timestamps off, at rates of 1GB/s and above you
are exposed to possible sequence number wraparound. This in
turn can lead to data corruption. Without timestamps there is
no PAWS protection (Protection Against Wrapped Sequence numbers)
and thus at high enough data rates new data can be interpreted
as old data and vice versa, corrupting your data stream.

Ubuntu made the wrong decision, there is simply no argument for
the way this was "handled."

I don't understand why everyone gets their tits in a knot when
even the slightest suggestion of slipping a release is suggested
in order to fix a serious bug like one of this magnitude. It is
always the right thing to do, and it avoids crap like what is
happening here.

To reiterate, if timestamps are off, you are exposed to possible
data corruption.
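The wraparound concern in the quoted post can be put into numbers. TCP sequence numbers are 32 bits wide and count bytes, so a single connection reuses sequence numbers after 2^32 bytes; PAWS uses the timestamp option to tell an old segment apart from a new one carrying the same sequence number. A back-of-the-envelope sketch follows, assuming a single connection saturating the link; the function name is made up for illustration.

```python
SEQ_SPACE_BYTES = 2 ** 32  # TCP sequence numbers are 32-bit byte counters


def wrap_seconds(bytes_per_second):
    """Seconds until one connection's sequence numbers wrap at a given rate."""
    return SEQ_SPACE_BYTES / bytes_per_second


one_gigabyte_per_s = wrap_seconds(1e9)     # 1 GB/s, the rate named in the post
one_gigabit_per_s = wrap_seconds(1e9 / 8)  # 1 Gbit/s, for comparison
```

At 1 GB/s the sequence space wraps in roughly 4.3 seconds, and at 1 Gbit/s in roughly 34 seconds. Both are well inside the time a delayed segment may survive in the network, which is why high-rate connections rely on timestamps being enabled.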

Revision history for this message
Graham Lyon (eviltwin) wrote :

>Ubuntu made the wrong decision, there is simply no argument for
>the way this was "handled."

Called it ;)

Revision history for this message
Matt Zimmerman (mdz) wrote :

Let's leave the childish remarks to certain bitter Red Hat employees on LWN, ok? This is a bug report and is not the place to engage in this sort of nonsense.

Revision history for this message
Nick Lowe (n-lowe) wrote :

Sorry? Respectfully, you seem to be missing the point here.

"So its a data corruption issue on top of a security issue.

Has this been communicated into the Ubuntu bug tracker?

-jef"

I have quoted and posted a link here for that very reason.

Thanks,

Nick

Revision history for this message
Nick Lowe (n-lowe) wrote :

That certain "bitter" red hat employee would happen to be one of the maintainers of the Linux TCP/IP stack:

http://lxr.linux.no/linux+v2.6.26/MAINTAINERS#L2947

I certainly had no appreciation of the importance of PAWS protection...

Revision history for this message
Nick Lowe (n-lowe) wrote :

Just to clarify this in stark terms...

The workaround -introduces- a data corruption problem at high data rates because it disables PAWS protection in the TCP/IP stack by virtue of the timestamps no longer being there.

The server release will go out with this, which will be run in places highly likely to see these data rates!

Revision history for this message
Stefano Maioli (smaioli) wrote :

If I understand this correctly, TCP timestamps were completely broken in Hardy, so the workaround introduces no new bug or regression. Plus, a fixed kernel will probably come in a matter of days. If this is true I don't see the point in delaying the release.

Revision history for this message
Steve Langasek (vorlon) wrote : Re: [Bug 264019] Re: unable to visit some websites and ftpsites with 2.6.27

On Wed, Oct 29, 2008 at 10:56:23PM -0000, Nick Lowe wrote:
> Just to clarify this in stark terms...

> The workaround -introduces- a data corruption problem at high data rates
> because it disables PAWS protection in the TCP/IP stack by virtue of the
> timestamps no longer being there.

> The server release will go out with this, which will be run in places
> highly likely to see these data rates!

It has always been our intention to provide a kernel update immediately
post-release to correct this issue. This new information doesn't change
that plan (and indeed, Dave appears to acknowledge that disabling tcp
timestamping is the lesser evil here).

We should consider re-adding this to the release notes based on this latest
information, so thank you for the pointer.

--
Steve Langasek      Give me a lever long enough and a Free OS
Debian Developer    to set it on, and I can move the world.
Ubuntu Developer    http://www.debian.org/
<email address hidden> <email address hidden>

Changed in ubuntu-release-notes:
status: Fix Released → Confirmed
Revision history for this message
Steve Langasek (vorlon) wrote : Re: unable to visit some websites and ftpsites with 2.6.27

The decision has been made to release this as a security update immediately post-release, so a separate release notes entry is not warranted.

Changed in ubuntu-release-notes:
status: Confirmed → Invalid
Revision history for this message
Pete Graner (pgraner) wrote :

For those watching this bug you can find out why we did things this way here:

http://blog.redvoodoo.org/2008/10/tcp-timestamps-2627-why-ubuntu-put-out.html

Revision history for this message
Pete Graner (pgraner) wrote :

linux (2.6.27-7.15) intrepid-security; urgency=low

  [ Upstream Kernel Changes ]

  * tcp: Restore ordering of TCP options for the sake of inter-operability
    - LP: #264019

 -- Tim Gardner <email address hidden> Mon, 27 Oct 2008 19:28:06 -0600

Changed in linux:
status: Fix Committed → Fix Released
Revision history for this message
Colin Watson (cjwatson) wrote :

procps (1:3.2.7-9ubuntu3) intrepid-security; urgency=low

  * Drop debian/sysctl.d/10-tcp-timestamps-workaround.conf again now that we
    have a fixed kernel, and make sure it gets removed on upgrade to this
    version (LP: #264019).

 -- Colin Watson <email address hidden> Thu, 30 Oct 2008 03:39:39 +0000

Revision history for this message
Pete Graner (pgraner) wrote :

Technical explanation of how the procps package undoes the tcp timestamp setting described earlier in this bug:

http://blog.redvoodoo.org/2008/10/backing-out-procps-change-how-it-works.html

Revision history for this message
Dean Holland (speedster-3fl) wrote :

I am still seeing issues even with the 2.6.27-7.15 kernel and updated procps package. Setting tcp_timestamps to 0 does not resolve it for me.

I can, however, reboot into the 2.6.24-21-server (from Hardy) and all browsing etc. returns to normal.

speedster@whinge:~$ apt-cache policy procps linux-image-2.6.27-7-server
procps:
  Installed: 1:3.2.7-9ubuntu2.1
  Candidate: 1:3.2.7-9ubuntu2.1
  Version table:
 *** 1:3.2.7-9ubuntu2.1 0
        500 http://mirror.3fl.net.au intrepid-updates/main Packages
        500 http://security.ubuntu.com intrepid-security/main Packages
        100 /var/lib/dpkg/status
     1:3.2.7-9ubuntu2 0
        500 http://mirror.3fl.net.au intrepid/main Packages
linux-image-2.6.27-7-server:
  Installed: 2.6.27-7.15
  Candidate: 2.6.27-7.15
  Version table:
 *** 2.6.27-7.15 0
        500 http://mirror.3fl.net.au intrepid-updates/main Packages
        500 http://security.ubuntu.com intrepid-security/main Packages
        100 /var/lib/dpkg/status
     2.6.27-7.14 0
        500 http://mirror.3fl.net.au intrepid/main Packages

Revision history for this message
Aldo Maggi (sentiniate) wrote : Re: [Bug 264019] Re: unable to visit some websites and ftpsites with 2.6.27

On Sun, 02 Nov 2008 13:38:39 -0000
Dean Holland <email address hidden> wrote:

> I am still seeing issues even with the 2.6.27-7.15 kernel and updated
> procps package. Setting tcp_timestamps to 0 does not resolve it for
> me.

try adding:
echo 0 > /proc/sys/net/ipv4/tcp_sack
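
To confirm which of these settings are actually in effect after a change, the values can be read back from /proc. This is a minimal sketch; the helper name, the `base` parameter, and the choice of settings to check are assumptions made for illustration.

```python
from pathlib import Path

# Settings mentioned in this thread; the list is an assumption about
# what is worth checking, not an exhaustive set.
SETTINGS = ("tcp_timestamps", "tcp_sack")


def read_tcp_settings(base="/proc/sys/net/ipv4"):
    """Return {name: integer value, or None if the file cannot be read}."""
    values = {}
    for name in SETTINGS:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # Missing file, no permission, or unexpected content.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_tcp_settings().items():
        print(f"net.ipv4.{name} = {value}")
```

A value of 0 means the feature is off and 1 means it is on; a value written with echo does not persist across a reboot.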

Revision history for this message
Dean Holland (speedster-3fl) wrote : Re: unable to visit some websites and ftpsites with 2.6.27

> try adding:
> echo 0 > /proc/sys/net/ipv4/tcp_sack

No good unfortunately, I captured the output of tcpdump and have attached it.
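
When comparing captures for this bug, the detail of interest is the order in which the TCP options appear in the SYN. The sketch below walks a raw options field and lists the option kinds in order, so that two captures can be compared side by side. The function name is hypothetical, and the sample bytes are made up for illustration; they are not taken from the attached capture.

```python
# Option kind numbers as assigned for TCP; unknown kinds are reported
# by number rather than guessed at.
KIND_NAMES = {
    0: "eol",
    1: "nop",
    2: "mss",
    3: "wscale",
    4: "sackOK",
    5: "sack",
    8: "timestamp",
}


def option_order(raw: bytes) -> list:
    """Return the option names in the order they appear in `raw`."""
    names = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        names.append(KIND_NAMES.get(kind, f"kind-{kind}"))
        if kind == 0:  # end of option list
            break
        if kind == 1:  # NOP is a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(raw) or raw[i + 1] < 2:
            raise ValueError("malformed option length")
        i += raw[i + 1]  # length covers the kind and length bytes too
    return names


if __name__ == "__main__":
    # Made-up SYN options: MSS 1460, SACK permitted, timestamp, NOP, window scale 7.
    sample = bytes.fromhex("020405b4" "0402" "080a0000000100000000" "01" "030307")
    print(",".join(option_order(sample)))
```

Running the same function over the options field of a SYN from each kernel makes any reordering visible without reading the hex by hand.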

Revision history for this message
Dean Holland (speedster-3fl) wrote :

libpcap output on 2.6.24-21

Revision history for this message
Luke Maurer (luke-maurer) wrote :

I'm seeing a rather bizarre variant of this. I just upgraded my masquerading home server to Intrepid, and now browsing from the server itself works fine, but browsing from *other* machines does not, including both my Intrepid desktop machine and my iPhone. Very similar symptoms - if I'm reading tshark right, the request goes out (and sometimes even gets ACK'd) but the HTTP response never materializes. Setting either tcp_timestamps or tcp_sack to 0 doesn't help, and I just upgraded yesterday so everything should be up to date.

Revision history for this message
Luke Maurer (luke-maurer) wrote :

Ugh ... my bad. It's a userspace problem; I just booted into the Hardy kernel and it's still broken ... so never mind.

Revision history for this message
Dean Holland (speedster-3fl) wrote :

I still believe mine is a kernel issue, as per the information in this Ubuntu Forums thread: http://ubuntuforums.org/showthread.php?p=6128733

This machine runs inside of KVM but that should not make any difference as it works _perfectly_ with 2.6.24-21, leading me to believe this hasn't been completely resolved.

To summarise, only changing the default kernel in menu.lst to 2.6.24-21 and rebooting makes the issue go away. I have tried:
 - empty firewall; only a masquerade rule
 - removing all QoS settings
 - clean install of Intrepid (64-bit) as the current was upgraded from Hardy
 - clean install of Intrepid (32-bit)
 - changing MTU and MSS clamping to lower sizes (1460 & 1420)
 - changed the type of network card from virtio to the default (rtl8139)
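
The two sizes tried above are consistent with the usual IPv4 arithmetic, assuming 1460 was the MTU and 1420 the clamped MSS (the list does not say which is which). A minimal sketch of that arithmetic, with hypothetical helper names and header sizes that assume no IP options:

```python
IPV4_HEADER = 20       # bytes, assuming no IP options
TCP_HEADER = 20        # bytes, fixed part of the TCP header
TIMESTAMP_OPTION = 12  # 10-byte timestamp option plus 2 bytes of padding


def mss_for_mtu(mtu: int) -> int:
    """MSS that fits in one IPv4 packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def payload_per_segment(mtu: int, timestamps: bool = True) -> int:
    """Payload bytes per full segment, with or without the timestamp option."""
    mss = mss_for_mtu(mtu)
    return mss - TIMESTAMP_OPTION if timestamps else mss


if __name__ == "__main__":
    for mtu in (1500, 1460):
        print(mtu, mss_for_mtu(mtu), payload_per_segment(mtu))
```

Clamping changes only the size of the segments, not the layout of the options in the SYN.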

I see 2.6.27-7.16 is out, but as that was just a security fix I didn't expect that would have resolved it.

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 264019] Re: TCP option ordering in 2.6.27 is not interoperable with some network equipment

On Sat, Nov 08, 2008 at 05:09:26AM -0000, Dean Holland wrote:
> I still believe mine is a kernel issue, as per the information in this
> Ubuntu Forums thread: http://ubuntuforums.org/showthread.php?p=6128733
>
> This machine runs inside of KVM but that should not make any difference
> as it works _perfectly_ with 2.6.24-21, leading me to believe this
> hasn't been completely resolved.
>
> To summarise, only changing the default kernel in menu.lst to 2.6.24-21 and rebooting makes the issue go away. I have tried:
> - empty firewall; only a masquerade rule
> - removing all QoS settings
> - clean install of Intrepid (64-bit) as the current was upgraded from Hardy
> - clean install of Intrepid (32-bit)
> - changing MTU and MSS clamping to lower sizes (1460 & 1420)
> - changed the type of network card from virtio to the default (rtl8139)
>
> I see 2.6.27-7.16 is out, but as that was just a security fix I didn't
> expect that would have resolved it.

If disabling TCP timestamping doesn't work around the problem, then your
symptoms are different from the others described in this bug, and you should
file your report separately.

--
 - mdz

O.Chr.Jensen (ochrj)
Changed in linux:
status: Fix Released → In Progress
Steve Langasek (vorlon)
Changed in linux:
status: In Progress → Fix Released
Changed in linux (Mandriva):
importance: Unknown → High