
[regression] all network apps / browsers suffer from multi-second delays by default due to IPv6 DNS lookups

Reported by camper365 on 2009-08-23
This bug affects 217 people
Affects            Status      Importance   Assigned to      Milestone
eglibc (Ubuntu)    -           High         Matthias Klose   -
  Karmic           -           High         Unassigned       -
  Lucid            -           High         Matthias Klose   -
glibc (Fedora)     Confirmed   Unknown      -                -

Bug Description

In Karmic, DNS lookups take a very long time with some routers, because glibc's DNS resolver tries to do IPv6 (AAAA) lookups even if there are no (non-loopback) IPv6 interfaces configured. Routers which do not respond to this cause the lookup to take 20 seconds (until the IPv6 query times out).

*** PLEASE DO NOT COMMENT ON THIS BUG unless you have something constructive to say. Everything that can be said has already been said, and if you comment, you are just adding noise. Please let those that actually know what they are doing concentrate on fixing this bug from now on. ***

If disabling IPv6 or using good DNS servers like OpenDNS fixes the problem, you are not dealing with this bug. Please refrain from complaining here in that case.

camper365 (camper365) wrote :
Micah Gersten (micahg) wrote :

Thank you for reporting this to Ubuntu. Could you please see if you have the same trouble with other browsers such as epiphany-webkit or midori?

Changed in firefox-3.5 (Ubuntu):
status: New → Incomplete
camper365 (camper365) wrote :

Yes it does apply to other browsers.

Micah Gersten (micahg) wrote :

This would appear to be more than a Firefox problem since other browsers are involved. I'm removing the Firefox 3.5 package from the bug and asking for reassignment to the appropriate package.

tags: added: needs-reassignment
affects: firefox-3.5 (Ubuntu) → ubuntu
Changed in ubuntu:
status: Incomplete → New
summary: - Firefox is slow by default due to IPv6 DNS lookups
+ Browsers are slow by default due to IPv6 DNS lookups

I've been struggling with this bug as well. For me it started with updates I installed on 3 September, even though I had no problems like this in Karmic earlier (at that point I installed about two weeks' worth of updates, though). It affects all network apps (not just browsers). I originally filed a ticket with my ISP because I thought it was their DNS servers that were slow.

Martin Olsson (mnemo) wrote :

What I was seeing was 20-40 second page loads for certain webpages; when I set network.dns.disableIPv6 to true, most pages load within 1-3 seconds.

Martin Olsson (mnemo) on 2009-09-06
summary: - Browsers are slow by default due to IPv6 DNS lookups
+ [karmic regression] all network apps / browsers suffer from multi-second
+ delays by default due to IPv6 DNS lookups
affects: ubuntu → linux (Ubuntu)

This is a problem with the DNS resolver.

This problem will occur for any DNS request which the DNS resolver does not support.
The proper solution is to fix the DNS resolver.

What happens:
 - Program is IPv6 enabled.
 - When it looks up a hostname, getaddrinfo() asks first for an AAAA record
 - the DNS resolver sees the request for the AAAA record, goes "uhmmm I dunno what it is, let's throw it away"
 - DNS client (getaddrinfo() in libc) waits for a response..... has to time out as there is no response. (THIS IS THE DELAY)
 - No records received yet, thus getaddrinfo() goes for the A record request. This works.
 - Program gets the A records and uses those.
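
As an illustration of the sequence above, here is a minimal simulation in Python. It is a sketch, not glibc's code: `BrokenResolver`, `lookup`, the 5-second timeout and the 0.05-second answer time are all assumptions made up for the example, and no real DNS traffic is sent.

```python
from dataclasses import dataclass, field

TIMEOUT_S = 5.0  # assumed per-query timeout, for illustration only


@dataclass
class BrokenResolver:
    """Stand-in for a NAT-box resolver that silently drops unsupported query types."""
    supported: set = field(default_factory=lambda: {"A"})
    answer_time_s: float = 0.05  # assumed time to answer a supported query

    def query(self, name, rtype):
        """Return (answer, seconds_spent); answer is None when the query is dropped."""
        if rtype not in self.supported:
            # The query is thrown away, so the client waits for the full timeout.
            return None, TIMEOUT_S
        return f"{rtype} record for {name}", self.answer_time_s


def lookup(resolver, name, order=("AAAA", "A")):
    """Ask for each record type in turn and add up the time spent waiting."""
    total = 0.0
    answers = []
    for rtype in order:
        answer, spent = resolver.query(name, rtype)
        total += spent
        if answer is not None:
            answers.append(answer)
    return answers, total
```

With these assumed numbers, `lookup(BrokenResolver(), "www.example.com")` spends about 5 seconds before the A record arrives, while a resolver created with `supported={"A", "AAAA"}` answers both queries in about 0.1 seconds.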

This does NOT only affect IPv6 (AAAA) records, it also affects any other DNS record that the resolver does not support.
Generally these resolvers are embedded into the "NAT boxes" that consumers have.

Working solution, as we are on Linux anyway: don't use the DNS resolver in the NAT box, but install eg pdns-recursor and use that.

Of course that does not fix the broken box, which might be the NAT box, or the resolvers at the ISP.
Some other people start using OpenDNS because those "work" (But that is not really true either: https://lists.dns-oarc.net/pipermail/dns-operations/2009-July/004217.html)

Note that the DNS queries go over IPv4 (transport), there is no IPv6 _connectivity_ involved here.

Markus Thielmann (thielmann) wrote :

I'm not convinced that this is a resolver bug. I'm running an IPv6 enabled system (aiccu tunnel with sixxs.net), so all IPv6 requests are answered by an IPv6 enabled DNS server. I'm still experiencing the same problems. In addition, this bug was introduced by Karmic and didn't happen before.

Bernard Bou (bbou) wrote :

The 5 second lag occurs with the Livebox (used by Orange, 12 million broadband internet customers in Europe). Better fix this unless you want a number of users to tweak their config files to either disable ipv6 or disable box-based dns server, not something anybody enjoys doing.

max123 (maxrest) wrote :

I also suffer from this problem, it _is_ the DNS-resolver, like Jeroen analysed - there should really be a fix for Karmic RS..

Jeroen Massar (massar) wrote :

@ Markus's #8 comment: as I mentioned "Note that the DNS queries go over IPv4 (transport), there is no IPv6 _connectivity_ involved here.".

You also state "so all IPv6 requests are answered by an IPv6 enabled DNS server."; well, unless you configured IPv6 DNS resolver addresses in your /etc/resolv.conf, queries will still go over IPv4 (transport), even though they are AAAA queries. AICCU only provides IPv6 connectivity (transport); it does not configure DNS resolvers.

@ Bernard's #9 comment: most likely your livebox contains one of these broken DNS resolvers. It happens a lot that CPEs have this issue. Try the below to check this out. Configuring resolv.conf with OpenDNS or other working DNS servers (eg the ones of your ISP directly, instead of the livebox) might solve your problem. Please also realize that this problem ALSO occurs on platforms other than Linux, eg Windows, which is what the majority of people are using; what to use is a choice of the user after all....

To verify this, do a:
for i in `cat /etc/resolv.conf | grep ^nameserver | cut -f2 -d' '`; do dig @$i www.microsoft.com AAAA; done

This should return quite quickly, even though no AAAA records for www.microsoft.com exist yet. Now, if you have a broken resolver somewhere along the way, these requests won't return quickly (unless they are locally or on-path cached as negative).
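
For systems without dig, the same check can be sketched in Python using only the standard library. This is a rough equivalent of the loop above, not a replacement for dig: the helper names (`nameservers`, `build_query`, `time_query`, `main`) are made up for this example, the 5-second timeout is an assumption, and the reply is only timed, not parsed.

```python
import socket
import struct
import time


def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def build_query(name, qtype=28, query_id=0x1234):
    """Build a minimal DNS query packet; qtype 28 is AAAA, 1 is A."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def time_query(server, name, qtype=28, timeout_s=5.0):
    """Send one UDP query; return seconds until any reply, or None on timeout."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        start = time.monotonic()
        try:
            sock.sendto(build_query(name, qtype), (server, 53))
            sock.recvfrom(4096)
        except socket.timeout:
            return None
        return time.monotonic() - start


def main(name="www.microsoft.com"):
    """Time an AAAA query against every nameserver in /etc/resolv.conf."""
    with open("/etc/resolv.conf") as fh:
        for server in nameservers(fh.read()):
            elapsed = time_query(server, name)
            print(server, "no reply (timed out)" if elapsed is None else f"{elapsed:.3f}s")
```

Calling `main()` prints one line per nameserver. A server that gives no reply for AAAA but answers quickly when `qtype=1` (A) is passed to `time_query` would match the broken-resolver behaviour described in this bug.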

I have had the same problem, affecting all network activities. Particularly a problem when performing upgrades via aptitude. Problem completely solved by specifying Opendns as my DNS servers. I do not have this problem when running Jaunty, XP or Vista.

David Solbach (d-vidsolbach) wrote :

Just updated to karmic and experienced the same problem (1-3 second delays on dns lookup).
Switching off IPv6 dns support in firefox "fixes" the problem.

To do that, open up Mozilla or Firefox and type 'about:config' in the address bar.
Scroll down to "network.dns.disableIPv6". It defaults to false; change it to true.

hope that helps.

camper365 (camper365) wrote :

That's a "fix" in Firefox, but the problem still exists in every other network app (evolution, aptitude, etc.)
So web browsing (which is what most users are doing anyway) is normal speed but everything else is behaving like you have dial-up.
After the final release, if the bug isn't fixed in every app people might start complaining to their isp or even drop ubuntu (or not upgrade to Karmic)

Pconfig (thomas9999) wrote :

I also notice the same problem in kubuntu. I remember this happened before on my upgrade to intrepid.

Brian Pitts (bpitts) wrote :

This is still present in today's build. I don't understand why this isn't prioritized as release critical, since it makes web browsing and other network-related tasks unbearably slow.

max123 (maxrest) wrote :

Right, I also stress the incredible delay in thunderbird, network apps on the shell, update manager and everything else network related due to ipv6 lookups without having an ipv6 ip again!

At the university, where I get ipv6 as well, everything works as usual, but at home with ipv4 it's hardly usable..

Please investigate this bug; I am available for testing things on this topic..

regards, max

Pconfig (thomas9999) wrote :

Temporary workaround can be found here:

http://ubuntuforums.org/archive/index.php/t-1281820.html

This proves that it has something to do with DNS resolving.

Jeroen Massar (massar) wrote :

For everybody not reading the other comments, #7 actually explains what goes on....

Yes, indeed, probably the best solution is to just install a local DNS resolver (pdns-recursor), which hits the roots/gtld's etc itself. This is not very friendly to the general Internet, but heck, with the largest DNS server doing short TTLs and based on geography it might not matter too much.

Thus kids, "apt-get install pdns-recursor" and edit your /etc/resolv.conf to point to 127.0.0.1 when you get hit by this issue.

Pconfig (thomas9999) wrote :

I really think the opendns workaround is better at the time. But both solutions aren't good enough. You can't tell your grandmother to edit some config files because her internet is slow

camper365 (camper365) wrote :

I agree that this bug should be considered release critical, or at the very least the workaround should be applied. What could be added to network-manager is a feature where, when you connect, it tries an IPv6 DNS lookup; if that succeeds, it uses the network's DNS resolver, and if it fails, it uses pdns-recursor. (I just don't think that would work for this release, maybe in Lucid.)

Zack Evans (zevans23) wrote :

I have had a privoxy go-slow - several seconds on every lookup - since installing Karmic beta. Hadn't really noticed a problem in any other app but web browsing did sometimes feel sluggish.

In a brainwave just now I have tried disabling ipv6 (using grub method) and now privoxy is working beautifully. I have also noticed that web browsing feels snappier generally, so I think this was slowing -all- of my apps down by a large enough fraction for me to feel the difference now.

Just to reiterate: it's repeatable for me with privoxy. IPV6 on - privoxy massive latency. IPV6 off - privoxy works fine.

I have a Draytek so blaming the router isn't practical - these have a MASSIVE installed base. Whether it's strictly the router's fault or not, it would not be ubuntu of Ubuntu to get all academically correct about it, we need some sort of workaround that can be achieved by clicking buttons.

To be honest, only the advanced users would want IPv6 anyway, so why not have it off by default and make it very easy to switch on?

Zack Evans (zevans23) wrote :

Should also say quite happy to test any other proposed workaround.

camper365 (camper365) wrote :

I have found that when I ping a site by its hostname (for example, www.google.com) it takes a while, but if I ping the IP address (63.251.179.13) then the lag is gone.

@ Pconfig / #20

> You can't tell your grandmother to edit some config files because her internet is slow

Does your grandmother use Ubuntu then? If so, then just help her out in fixing the issue :)

@ Zack Evans / #23

> I have a Draytek so blaming the router isn't practical - these have a MASSIVE installed base

This problem also occurs when the user has Windows with IPv6 enabled. The problem lies in the DNS resolver (which might not be the NAT box, what you call "router", but might even be your ISP), and thus you can avoid the problem by not using the DNS resolver in the NAT box. You might of course also try to upgrade your router; maybe they fixed the problem (you upgrade your Ubuntu and other things too, because they have issues, so try that).

> To be honest, only the advanced users would want IPv6 anyway, so why not have it off by default and make it very easy to switch on?

Because in a few years or so you will have to enable IPv6, as there won't be any new hosts with IPv4 addresses. As such, better to bite the apple today and fix those IPv6 issues than wait till you really need it.

@ camper365 / #24

Yes, that is correct: when you ping www.google.com it has to look up the hostname in DNS, while if you ping the address, it doesn't. DNS resolving (thus figuring out which address belongs to the requested hostname) is where the problem lies. See the hints about OpenDNS or pdns-recursor to solve it.
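
The difference can be sketched in Python: a literal address skips DNS entirely, while a hostname goes through getaddrinfo() and therefore through the resolver. `is_literal_address` and `timed_resolve` are hypothetical helper names for this example, not part of any tool mentioned in this bug.

```python
import ipaddress
import socket
import time


def is_literal_address(host):
    """True if host is already an IPv4/IPv6 address, so no DNS lookup is needed."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def timed_resolve(host):
    """Return (seconds, addresses) for resolving host via getaddrinfo()."""
    start = time.monotonic()
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    elapsed = time.monotonic() - start
    # info[4] is the socket address tuple; its first element is the address string.
    return elapsed, sorted({info[4][0] for info in infos})
```

On an affected machine one would expect `timed_resolve("www.google.com")` to report several seconds, and `timed_resolve("63.251.179.13")` to return almost immediately because nothing has to be looked up.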

@ Ragnarel / #25

as per comment #11 try a:
  for i in `cat /etc/resolv.conf | grep ^nameserver | cut -f2 -d' '`; do dig @$i www.microsoft.com AAAA; done
when connected to wireless and when not connected to wireless. Or just for that matter, check if you are using the same nameservers when connected to wireless and wired, if they are different then you already got a small part of the answer.

Martin Olsson (mnemo) wrote :

Since we're running out of time, maybe we can just ship "network.dns.disableIPv6==true" as the Firefox default? I'd love a real fix for this bug but the RC is coming up very very soon now.

Markus Thielmann (thielmann) wrote :

Is it possible that some patch changed the order in which the nameservers from /etc/resolv.conf are used?

My router does deliver a "dead" nameserver via DHCP [1], which was never a problem since Ubuntu used to query the first (local) nameserver. The local nameserver resolves any given request without a problem [2]. If I remove the dead nameserver from resolv.conf, I no longer have any problems resolving DNS queries.

So it *might* be a solution to just change the usage order of the DNS servers to solve this "bug". Please note that a lot of users never experienced this problem before Karmic, so it might be hard to blame their hardware for this, even if it might be technically true... :-)

[1] It's a SE515, which delivers 217.237.151.97, despite any configuration.
[2] dig @192.168.1.1 www.microsoft.com AAAA without any noticeable delay

Micah Gersten (micahg) wrote :

There's at least enough information here to confirm the issue. I'll see if I can get someone to look at it.

Changed in network-manager (Ubuntu):
importance: Undecided → High
status: New → Confirmed
DodgeV83 (spamfrelow) wrote :

This 100% fixed my problem!

1. In /etc/dhcp3/dhclient.conf add the following line:

prepend domain-name-servers 208.67.222.222,208.67.220.220;

2. In /etc/nsswitch.conf edit this line

hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4

to this

hosts: files dns

I'm not sure which one of these did it to be honest, but it's fixed!

Darren Worrall (dazworrall) wrote :

My results are a little different. At the moment I'm using a draytek router, and am indeed suffering slow resolution in all my apps. Running the snippet above:

for i in `cat /etc/resolv.conf | grep ^nameserver | cut -f2 -d' '`; do dig @$i www.microsoft.com AAAA; done

Is very quick though. I consistently have slow resolution when running updates, but the same command against archive.ubuntu.com is also very quick.

The router has something to do with it I'm sure, my router at home doesn't give me any trouble at all, but querying so directly like this is reproducibly fast, while querying indirectly through update-manager, is reproducibly slow.

csulok (shikakaa) wrote :

For me, what ultimately AND universally fixed/worked around the problem was the following:

edit /etc/sysctl.conf and add the following to the bottom:

#Disable IPv6
net.ipv6.conf.all.disable_ipv6=1
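
To check whether the setting is active (for example after running `sudo sysctl -p` or rebooting), the kernel exposes the value under /proc. A small Python sketch, assuming a Linux system with the usual /proc layout; the function names are made up for this example:

```python
from pathlib import Path

# Kernel view of net.ipv6.conf.all.disable_ipv6 (assumes the usual Linux /proc layout).
SYSCTL_PATH = "/proc/sys/net/ipv6/conf/all/disable_ipv6"


def flag_means_disabled(text):
    """The sysctl file contains '1' when IPv6 is disabled and '0' otherwise."""
    return text.strip() == "1"


def ipv6_disabled(path=SYSCTL_PATH):
    """Return True or False from the sysctl file, or None if it cannot be read."""
    try:
        return flag_means_disabled(Path(path).read_text())
    except OSError:
        return None
```

`ipv6_disabled()` returning None would mean the file could not be read, for instance on a kernel built without IPv6 support.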

Jeroen Massar (massar) wrote :

@ csulok / #32

What that does is avoid fixing the problem. You disable IPv6, and thus glibc plays smart and does not resolve AAAA records anymore.

Your DNS resolver though is still broken. You might not notice it now, but if for instance per next year DNSSEC gets turned on you will run into it again.... (and you will probably just disable DNSSEC....)

csulok, your trick didn't solve my problem.


I'm experiencing these problems on my Dell Studio 1555 laptop with Karmic Beta. I hope it gets fixed soon!

Nech (gerard-guadall) wrote :

I have two targets
07:02.0 Network controller: Broadcom Corporation BCM4318 [AirForce One 54g] 802.11g Wireless LAN Controller (rev 02)
00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network Connection (rev 02)

I work better using wifi than using wired connection.

Jeroen Massar (massar) wrote :

@ Nech / #36

> I work better using wifi than using wired connection.

So, like I ask everybody else, check to see if there is a huge latency time difference when doing:

for i in `cat /etc/resolv.conf | grep ^nameserver | cut -f2 -d' '`; do dig @$i www.microsoft.com AAAA; done

Over the wired or wireless; quicker maybe is to check if you get a different set of DNS servers when connected over wired or wireless (just check if /etc/resolv.conf changes).

Nech (gerard-guadall) wrote :

I think the problem is not DNS. Actually, when I visit different websites in a short period of time, everything gets saturated. Some websites do not load, and others take up to 2 or 3 minutes to do so. I tried it using Google Chromium also, and the result was the same. It is a new Karmic installation, and the upgrade just happened.

Results of wireless
-----------------------
; <<>> DiG 9.6.1-P1 <<>> @80.58.0.33 www.microsoft.com AAAA
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49350
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;www.microsoft.com. IN AAAA

;; ANSWER SECTION:
www.microsoft.com. 2921 IN CNAME toggle.www.ms.akadns.net.
toggle.www.ms.akadns.net. 247 IN CNAME g.www.ms.akadns.net.
g.www.ms.akadns.net. 265 IN CNAME lb1.www.ms.akadns.net.

;; AUTHORITY SECTION:
akadns.net. 90 IN SOA internal.akadns.net. hostmaster.akamai.com. 1256289512 90000 90000 90000 180

;; Query time: 104 msec
;; SERVER: 80.58.0.33#53(80.58.0.33)
;; WHEN: Fri Oct 23 11:20:03 2009
;; MSG SIZE rcvd: 170

wired
-------
; <<>> DiG 9.6.1-P1 <<>> @80.58.0.33 www.microsoft.com AAAA
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1196
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;www.microsoft.com. IN AAAA

;; ANSWER SECTION:
www.microsoft.com. 2805 IN CNAME toggle.www.ms.akadns.net.
toggle.www.ms.akadns.net. 140 IN CNAME g.www.ms.akadns.net.
g.www.ms.akadns.net. 144 IN CNAME lb1.www.ms.akadns.net.

;; AUTHORITY SECTION:
akadns.net. 91 IN SOA internal.akadns.net. hostmaster.akamai.com. 1256289631 90000 90000 90000 180

;; Query time: 102 msec
;; SERVER: 80.58.0.33#53(80.58.0.33)
;; WHEN: Fri Oct 23 11:22:00 2009
;; MSG SIZE rcvd: 170

Jeroen Massar (massar) wrote :

@ Nech / #38

As you have the same DNS server for both wired and wireless, most likely _your problem_ is not a DNS issue* like what the others show here.

* = unless an upstream of your DNS server has the "drop unknown DNS records" problem and your resolver caches the negative answer correctly, which will cause any subsequent query, like the ones above, to be quick again.

To solve your problem, I guess you'll have to take a peek with Wireshark...

Zack Evans (zevans23) wrote :

My problem goes away if I disable IPv6. If I boot with IPv6 though, so I have the problem, DNS lookups from the command line happen quickly.

for i in `cat /etc/resolv.conf | grep ^nameserver | cut -f2 -d' '`; do dig @$i www.microsoft.com AAAA; done

is practically instant. (I have also tried it with some other hostnames to check it is not caching hiding the problem.)

I should add I did not have this problem in Jaunty, and no equipment has changed, only the upgrade to Karmic.

So, as I type, with IPv6 enabled, Privoxy is grinding, everything else seems OK.

If I reboot with IPv6 off, Privoxy and everything else will be OK. DNS AAAA lookups seem OK whether enabled or disabled.

So is there some other subtle interaction between Privoxy and IPv6?

camper365 (camper365) on 2009-10-29
Changed in linux (Ubuntu):
status: New → Confirmed
Micah Gersten (micahg) on 2009-11-01
tags: added: metabug
Changed in linux (Ubuntu):
assignee: nobody → IPv6 Task Force (ipv6)
Changed in network-manager (Ubuntu):
assignee: nobody → IPv6 Task Force (ipv6)
Changed in network-manager (Ubuntu):
assignee: IPv6 Task Force (ipv6) → nobody
Changed in linux (Ubuntu):
assignee: IPv6 Task Force (ipv6) → nobody
Changed in linux (Ubuntu):
assignee: nobody → IPv6 Task Force (ipv6)
Changed in network-manager (Ubuntu):
assignee: nobody → IPv6 Task Force (ipv6)
Martin Olsson (mnemo) on 2009-11-03
Changed in linux (Ubuntu):
assignee: IPv6 Task Force (ipv6) → nobody
Changed in network-manager (Ubuntu):
assignee: IPv6 Task Force (ipv6) → nobody
Martin Pitt (pitti) on 2009-11-04
Changed in linux (Ubuntu Lucid):
importance: Undecided → High
Changed in linux (Ubuntu Karmic):
importance: Undecided → High
tags: added: regression-release
removed: needs-reassignment
Changed in linux (Ubuntu Karmic):
milestone: none → karmic-updates
Martin Pitt (pitti) on 2009-11-04
Changed in network-manager (Ubuntu Karmic):
status: New → Invalid
Changed in network-manager (Ubuntu Lucid):
status: Confirmed → Invalid
Martin Pitt (pitti) on 2009-11-04
affects: linux (Ubuntu Lucid) → glibc (Ubuntu Lucid)
Changed in glibc (Ubuntu Lucid):
assignee: nobody → Canonical Foundations Team (canonical-foundations)
status: Confirmed → Triaged
description: updated
Changed in glibc (Ubuntu Karmic):
status: New → Triaged
Changed in glibc (Ubuntu Lucid):
assignee: Canonical Foundations Team (canonical-foundations) → Matthias Klose (doko)
jordan.sc (jordanjsc) on 2009-11-04
Changed in glibc (Ubuntu Karmic):
status: Triaged → Fix Committed
Micah Gersten (micahg) on 2009-11-04
Changed in glibc (Ubuntu Karmic):
status: Fix Committed → Triaged
Nech (gerard-guadall) on 2009-11-05
Changed in glibc (Ubuntu Karmic):
status: Triaged → In Progress
status: In Progress → Confirmed
finno (finnegan) on 2009-11-07
Changed in glibc (Ubuntu Lucid):
status: Triaged → Invalid
Martin Pitt (pitti) on 2009-11-08
Changed in glibc (Ubuntu Lucid):
status: Invalid → Confirmed
Changed in glibc (Fedora):
status: Unknown → Confirmed
Carropa (carropa) on 2009-11-10
Changed in glibc (Ubuntu Karmic):
status: Confirmed → Fix Released
status: Fix Released → Confirmed
description: updated
Matthias Klose (doko) on 2009-12-24
Changed in glibc (Ubuntu Lucid):
status: Confirmed → Fix Released
Changed in glibc (Ubuntu Karmic):
status: Confirmed → In Progress
Martin Pitt (pitti) on 2010-01-03
tags: added: verification-needed
affects: glibc (Ubuntu Karmic) → eglibc (Ubuntu Karmic)
Changed in eglibc (Ubuntu Karmic):
status: In Progress → Fix Committed
Martin Pitt (pitti) on 2010-01-04
tags: added: verification-done
removed: verification-needed
Martin Pitt (pitti) on 2010-01-07
Changed in eglibc (Ubuntu Lucid):
status: Fix Released → Confirmed
leucomax (w-smetanig) on 2010-01-09
Changed in eglibc (Ubuntu Karmic):
status: Fix Committed → Fix Released
Martin Pitt (pitti) on 2010-01-10
Changed in eglibc (Ubuntu Karmic):
status: Fix Released → Fix Committed
Changed in eglibc (Ubuntu Karmic):
status: Fix Committed → Fix Released
Changed in eglibc (Ubuntu Karmic):
status: Fix Released → Fix Committed
status: Fix Committed → Fix Released
Changed in eglibc (Ubuntu Lucid):
status: Confirmed → Fix Released
Martin Pitt (pitti) on 2010-01-17
Changed in eglibc (Ubuntu Lucid):
status: Fix Released → Triaged
Changed in eglibc (Ubuntu Karmic):
status: Fix Released → Invalid
Micah Gersten (micahg) on 2010-01-18
Changed in eglibc (Ubuntu Karmic):
status: Invalid → Fix Released
Changed in eglibc (Ubuntu Karmic):
status: Fix Released → Fix Committed
Steve Langasek (vorlon) on 2010-03-01
Changed in eglibc (Ubuntu Karmic):
status: Fix Committed → Fix Released
Changed in eglibc (Ubuntu Karmic):
status: Fix Released → Incomplete
status: Incomplete → In Progress
Martin Pitt (pitti) on 2010-03-02
Changed in eglibc (Ubuntu Karmic):
status: In Progress → Fix Released
Changed in eglibc (Ubuntu Karmic):
status: Fix Released → In Progress
Martin Pitt (pitti) on 2010-03-03
Changed in eglibc (Ubuntu Karmic):
status: In Progress → Fix Released
description: updated
Emmet Hikory (persia) on 2010-03-10
tags: added: ipv6
description: updated
steve (swchoi-choi) on 2010-04-16
Changed in eglibc (Ubuntu Lucid):
status: Triaged → Confirmed
Steve Langasek (vorlon) on 2010-04-16
Changed in eglibc (Ubuntu Lucid):
status: Confirmed → Triaged
Johan (deberghes-johan) on 2010-04-24
summary: - [karmic regression] all network apps / browsers suffer from multi-second
- delays by default due to IPv6 DNS lookups
+ [regression] all network apps / browsers suffer from multi-second delays
+ by default due to IPv6 DNS lookups
Derek (bugs-m8y) wrote :

http://sourceware.org/bugzilla/show_bug.cgi?id=4599

This is kind of related to the last few comments. If you decide to force AI_ADDRCONFIG, until those fixes are in place, you should watch out for IPv6 in your hosts file.
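
For reference, this is roughly what passing AI_ADDRCONFIG looks like from an application's side, sketched in Python. The flag asks getaddrinfo() to return IPv6 results only if the machine has an IPv6 address configured (and likewise for IPv4); how loopback addresses are treated depends on the libc. The `resolve` helper is a made-up name for this example, and this is not the preload trick from the earlier comments.

```python
import socket


def resolve(host, port=None, addrconfig=True):
    """Resolve host, optionally passing AI_ADDRCONFIG as an application would."""
    flags = socket.AI_ADDRCONFIG if addrconfig else 0
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_UNSPEC, type=socket.SOCK_STREAM, flags=flags
    )
    # Report each result as (address family name, address string).
    return [(socket.AddressFamily(info[0]).name, info[4][0]) for info in infos]
```

Comparing `resolve(name, addrconfig=True)` with `resolve(name, addrconfig=False)` on an IPv4-only machine should show whether the flag suppresses the AF_INET6 results there; entries from the hosts file may behave differently, as the linked glibc bug discusses.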

Derek (bugs-m8y) wrote :

oh, and that came from:
https://bugzilla.mozilla.org/show_bug.cgi?id=467497#c9
and the following two comments.

I do wish Launchpad allowed anchors to comment numbers in the context of the whole page

omair (omair-hafiz) wrote :

Are there any updates with regards to this bug? I've been waiting patiently for some kind of fix (I check proposed updates every day in the hope that there is some mention of this). My system is becoming frankly unusable since 90% of what I do is on the net. I've seen other bugs that have been open for five years or so and I fear that this one is going the same route. No updates whatsoever; even the discussion on this list has stopped.

It's a pity that I'm thinking of installing something else (even a windows 7 installation at this point) even though I've been thoroughly satisfied with 10.04 in all other respects.

omair (omair-hafiz) wrote :

Also what I don't understand is why DOCKY bugs are assigned as "critical" in the ubuntu bug list but this one is only of 'high' importance!

omair: did you try the pdns-recursor workaround?
maybe this can help you until there is a fix.

install pdns-recursor via synaptic or apt-get.
then edit /etc/resolv.conf (sudo gedit /etc/resolv.conf) and set nameserver to 127.0.0.1

if your problem is only in firefox you can disable ipv6: enter about:config in the address bar and confirm the warning. then enter ipv6 as a filter and double click on network.dns.disableIPv6 (after this the value must be true)

hope this stops you thinking about windows 7 ;-)

smonsarr (smonsarr-junk) wrote :

For me using openDNS works fine as a workaround.

Derek (bugs-m8y) wrote :

omair, if you can't change your DNS, I've found that forcing AI_ADDRCONFIG as noted in #288, #289 and (importantly) #290
works nicely for me.

Also, if it causes trouble for you, you can just remove or comment out the ld.so.preload line.

I've applied it on 5 computers here at work w/ no issues and immediate improvements, where DNS changes are simply not an option.

I do also set:
network.dns.disableIPv6;true

in Firefox as well. But all the other apps on the system are now working nicely (wget, ssh etc).

omair (omair-hafiz) wrote :

hello,

flurin & derek: thanks for the information. I'll definitely try the pdns-recursor workaround. the firefox workaround didn't work for me, unfortunately. but i'll try changing the dns and forcing AI_ADDRCONFIG. Lets see what happens.

As for windows 7, I took out my installation CDs for Windows Vista that came with my Lenovo T400. Installation, with all the crap customizations, took about 2 hours? (I just phased out and started playing GTA4 after a certain point). After that the goddamn updates took literally 12 hours (with all the restarts and my 2 MB shared connection). I ended up with a system that booted up in a minute, with a fingerprint reader that didn't work, and that reintroduced me to the general slowness that drove me to linux in the first place. So now I'm sitting here with lucid back on and checking proposed updates. I have horrible internet, but at least I don't want to throw my laptop out the window.

By the way, I had a Live CD of openSuse 11.2 lying around (KDE) and I can confirm that I had no issues whatsoever with my internet when I installed that on my system, as was the case with Jaunty. Fedora 13 and the latest PCLinuxOS, however, suffer from the same issue. Additionally, the problem is curiously confined only to my internet connection at home and not at work. I initially had the same router (Linksys WRT54G) at home and at work. I changed the one at home thinking that it may have been a router problem and got a DLink Wireless N router instead. That did not work. I have the same ISP at home and work but different modems: a ZyXEL P-600 at work and an Alcatel SpeedTouch Home at home. The modem at home is considerably older than the one at work. Additionally, I have also found that if the internet on my Lucid box is acting up at home, it slows down the internet for everyone else connected to the router. So the lucid lynx is not only managing to annoy me but also other windows users (my wife) as well!

I hope that these workarounds work - Lucid is actually the best OS I've used in a very long time.

Szabolcs (szhorvat) wrote :

It took forever for this to get fixed for Karmic, and now, after upgrading to Lucid, the bug is back. This is absolutely ridiculous. And no, most of us are not in a position to buy a new router or switch ISPs because Ubuntu gets randomly broken with every upgrade.

Jeremy Visser (jeremy-visser) wrote :

On the contrary, Ubuntu is not in a position to deviate from pushing forward with IPv6 just because some of you have broken hardware.

Derek (bugs-m8y) wrote :

Jeremy. Member of the IPv6 taskforce eh.

Well, it is fortunate for me that the code snippet I posted in #288 and #290 worked, because otherwise Ubuntu's pushing forward would have pushed it right off our (large) corporate network.

We have 0 control over that infrastructure. So it was either eliminate the slow and steady introduction of Linux and more open services in general, or find a workaround for this *bug*.

There's ideology, and then there's pragmatism.

It may not work for everyone, but it'd be nice if something equiv to defaulting to AI_ADDRCONFIG without the need for that preload trick was made available in some alternate package that people could add, see if it works for them, and remove once transitions were complete.

Jeroen Massar (massar) wrote :

Dear Derek, there is a way to fix this problem in your large corporate network, like we did for that small corporate network that I am using: fix the resolvers. As you are claiming to have a large corporate network, you most likely have only a handful of recursors but you might have 100k clients. Let's see which ones are easier to upgrade: 100k clients which are all over the place, or those 10 or so recursors at most.... easy pick I would say.

The thing you most likely are forgetting is the fact that the DNS recursors that you are using are not only broken for AAAA records, but most likely for every other record type they do not support. Thus, by resolving this issue you will solve other magical problems too.
You can directly move on to support DNSSEC too for that matter if you are busy anyway.

Yes, the problem is annoying, no there is not much that Ubuntu or any other OS can do about this. Thus fix the problem in the right spot.

Derek (bugs-m8y) wrote :

This is where pragmatism comes in.
We have absolutely no control over those resolvers, and even if we had any influence whatsoever with those who did, corporate networks are very slow to change. Ubuntu is the outsider. The Windows machines work. Your solution is not a pragmatic one.

So, while being "pure" is good, I thought Ubuntu stayed out of such things.

That's why Ubuntu offers easy integration of binary drivers for ATI and nVidia, why Ubuntu has a restricted-extras for convenient meta.

Simply because IPv6 is *better* doesn't mean you should sacrifice adoption for the ideal.

Using AI_ADDRCONFIG is simple enough, and as noted browsers like Firefox and Chrome have adopted that.

What would be nice would be a simple package that forces it across the board, simply as an option for broken networks.
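
For readers who have not used the flag, here is a minimal sketch of what "using AI_ADDRCONFIG" looks like in an application. The helper names addrconfig_hints() and resolve_addrconfig() are made up for illustration and are not taken from Firefox, Chrome or any package. Note that, as described elsewhere in this report, glibc counts link-local IPv6 addresses as configured, so the flag alone may not suppress AAAA queries on an affected system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: build hints that ask the resolver to return only
 * address families for which the host has a configured address. */
struct addrinfo addrconfig_hints(int socktype)
{
    struct addrinfo hints;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 and IPv6 results */
    hints.ai_socktype = socktype;
    hints.ai_flags = AI_ADDRCONFIG;
    return hints;
}

/* Hypothetical helper: resolve host/service with AI_ADDRCONFIG set.
 * Returns 0 or an EAI_* code (see gai_strerror()); on success the
 * caller releases *res with freeaddrinfo(). */
int resolve_addrconfig(const char *host, const char *service,
                       struct addrinfo **res)
{
    struct addrinfo hints = addrconfig_hints(SOCK_STREAM);

    assert(res != NULL);
    return getaddrinfo(host, service, &hints, res);
}
```

A caller would use resolve_addrconfig("example.org", "80", &res), walk the returned list, and free it with freeaddrinfo(res).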

I have to agree with Derek. With due respect to all the techs who do the
hard work of keeping Ubuntu (and especially Kubuntu in my case) so
great, I find that, as technical folks, we sometimes get overly focused
on the technical side and forget about the larger world in which that
exists. Sometimes the technically correct solution is not the right
solution in the real world, at least not at first.

Because of all the issues with Karmic, this should have been anticipated
and accounted for in Lucid. By this I mean that clear documentation,
fixes and workarounds should have been provided - if the problem could
not be accounted for silently in code. This problem is a huge hassle for
users who aren't up to speed on the technical side of connecting to the
internet. Those who are may look down on those who are not, but that's
no way to run an operating system.

--
Tai Sines
"Share your strengths, not your weaknesses." -- Yogi Bhajan

JG (jg+launchpad) wrote :

Thanks Derek. Your patch worked for me. I had already disabled IPv6 via sysctl and took all IPv6 addresses off my interfaces. The about:config solves firefox, but mutt and ssh were still a problem.

I'm a bit surprised at some of the suggested workarounds. I wouldn't really blame the resolvers - things shouldn't be doing AAAA lookups if ipv6 is disabled in the first place. It might be possible to blame the authors of virtually every network-aware app, but that isn't realistic.

Most of us running Ubuntu in corporate networks with broken Microsoft resolvers are doing so completely unsupported. If you open a ticket, you'll be lucky if ignoring it is the worst that happens. More likely you'll be told to use a supported environment, and you'll just give them another reason why Linux users are an expensive problem. "Go fix your resolvers" is just not a reasonable response. Using other DNS servers doesn't work in this case either, because they don't have access to the intranet zones.

alfredo (alacis) wrote :

Hi, Folks.

I set up the function "getaddrinfo()" as specified in #288 & #290 above, but then lost connectivity to Samba shares on other machines in the local LAN.

When I commented-out the line "/usr/local/lib/getaddrinfo_wrap.so" in the file "/etc/ld.so.preload", instantly my Samba shares returned.

What now?

Alf

Derek (bugs-m8y) wrote :

Yeah, dunno what to say. WFM w/ my samba shares.
mount.cifs //intranetdev/wwwroot /home/nemo/Shares/intranet

That sorta thing.

Guess you're out of luck on that fix. Here's hoping something else works. Sorry.

Derek (bugs-m8y) wrote :

Well, kind of a good news/bad news situation.
Bad news. A little while ago my solution stopped working for me. Drove me absolutely batty.
I tried all the other things too: ipv6.disable=1 as a kernel parameter, various options in sysctl.conf that used to work, blacklisting any possible modules (not that they were loaded), and of course stripping down nsswitch.conf, since all that mdns stuff had always caused unresolvable names here.

Heck, I also tried:
        hints->ai_flags|=AI_ADDRCONFIG;
        hints->ai_flags|=AI_V4MAPPED;
        hints->ai_flags&=!AI_ALL;
in my wrapper, even though I have no idea if those would work, and specifying a struct addrinfo newhints; if hints were null.
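
A side note on the flag lines above: in C, `!AI_ALL` is a logical NOT and evaluates to 0, so `hints->ai_flags &= !AI_ALL;` clears every flag, including the AI_ADDRCONFIG bit set two lines earlier. Whether that explains the AAAA queries that still went out is not established here. The sketch below only shows the difference between the two operators, plus one way a wrapper might handle NULL hints. All function names are hypothetical, and this is not the wrapper from #288/#290.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Form used in the snippet above: !AI_ALL is logical NOT, which is 0 for
 * any non-zero constant, so the AND wipes every flag that was set. */
int flags_with_logical_not(int flags)
{
    flags |= AI_ADDRCONFIG;
    flags |= AI_V4MAPPED;
    flags &= !AI_ALL;
    return flags;
}

/* Bitwise complement clears only the AI_ALL bit and keeps the rest. */
int flags_with_bitwise_not(int flags)
{
    flags |= AI_ADDRCONFIG;
    flags |= AI_V4MAPPED;
    flags &= ~AI_ALL;
    return flags;
}

/* Hypothetical helper for a preload wrapper: copy the caller's hints
 * (or start from zeroed defaults when hints is NULL) and force
 * AI_ADDRCONFIG on the copy, leaving the caller's struct untouched. */
struct addrinfo wrapper_hints(const struct addrinfo *hints)
{
    struct addrinfo out;

    if (hints != NULL) {
        out = *hints;
    } else {
        memset(&out, 0, sizeof out);
        out.ai_family = AF_UNSPEC;
    }
    out.ai_flags |= AI_ADDRCONFIG;
    return out;
}
```

A wrapper would then pass the address of the returned copy to the real getaddrinfo() instead of modifying the caller's hints in place.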

Despite trying every conceivable way of saying NO I DO NOT WANT IPV6 CAUSE MY NETWORK SUCKS.
I still saw:
sudo tcpdump -an | grep 192.168.1.100
17:15:07.744171 IP 192.168.1.2.34702 > 192.168.1.100.53: 25164+ AAAA? reddit.com. (28)

(for a wget of reddit.com)

Anyway.
The happy ending is that recently a 3rd DNS resolver was added. The two broken ones are still broken, but so long as I explicitly specify only the new one in network settings and disable resolv.conf setup from DHCP, I'm fine.

I still have no idea what changed, but at least mine is working.

My sympathies for those of you still stuck in this situation.

Tore Anderson (toreanderson) wrote :

There is no question that the underlying problem here is defective DNS resolvers that choke on perfectly legitimate AAAA queries. That said, there are a couple of issues present in software shipped by Ubuntu that cause the problem to manifest itself as slowdowns noticeable by end users:

1) When called with the AI_ADDRCONFIG flag, libc's getaddrinfo() function does not disregard link-local IPv6 addresses when determining whether or not the local host has usable IPv6 connectivity. Since every IPv6-capable OS will have link-local IPv6 addresses assigned to all interfaces - regardless of any external connectivity being available or not - this essentially makes AI_ADDRCONFIG on Linux useless for the purpose of suppressing AAAA queries when they're not useful.

I've submitted a bug to the GNU libc upstream about this issue at <http://sourceware.org/bugzilla/show_bug.cgi?id=12377>.

getaddrinfo() on other operating systems (such as Apple Mac OS X and Microsoft Windows) does disregard link-local IPv6 addresses when called with AI_ADDRCONFIG, which is why the problem appears to affect GNU/Linux distributions more than other operating systems.

2) Many applications do not set the AI_ADDRCONFIG flag when calling getaddrinfo(). This includes, notably, Mozilla Firefox. However, a patch to correct this has recently been committed to the mozilla-central development repo and will likely be part of Firefox 4.0 beta 11 (hopefully also 3.6.15), see <https://bugzilla.mozilla.org/show_bug.cgi?id=614526>. Microsoft Windows enables the use of AI_ADDRCONFIG as the system-wide default, as far as I know, which explains why it is able to cope better with those broken middleware boxes. Mac OS X does not set AI_ADDRCONFIG by default, however it has an extremely short timeout waiting for AAAA responses after the A response has been answered (around 125ms), which in turn hides the problem from most end users. Additionally, most major browsers (except Firefox) do set AI_ADDRCONFIG explicitly, which suppresses the problematic AAAA queries in the first place.

So what Ubuntu could do to avoid this problem is 1) to develop and include a patch to glibc that makes getaddrinfo() ignore link-local addresses for AI_ADDRCONFIG purposes, and 2) to back-port the NSPR patch already committed to mozilla-central to the version of Firefox shipped (or wait until Mozilla releases a new version with the patch already included).

Tore

At last somebody has understood the problem! Well done!

I totally agree with your solution no. 1, which is: don't consider link-local
addresses (the ones which start with fe80::) as IPv6 addresses that can
resolve AAAA DNS records, because that never happens and never will, by design.
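
The rule discussed above (ignore loopback and link-local addresses when deciding whether the host has usable IPv6) can be sketched as a small predicate. This is an illustration only, not the glibc patch: the function names are hypothetical, and a real check would apply the predicate to each configured interface address (for example from getifaddrs()).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdbool.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical predicate: true if addr should count as usable IPv6
 * under the proposed rule, i.e. it is neither loopback (::1) nor
 * link-local (fe80::/10). */
bool ipv6_addr_counts(const struct in6_addr *addr)
{
    return !IN6_IS_ADDR_LOOPBACK(addr) && !IN6_IS_ADDR_LINKLOCAL(addr);
}

/* Convenience wrapper for text-form addresses; returns false when the
 * string is not a valid IPv6 address. */
bool ipv6_text_counts(const char *text)
{
    struct in6_addr addr;

    if (inet_pton(AF_INET6, text, &addr) != 1)
        return false;
    return ipv6_addr_counts(&addr);
}
```

With this rule, a host whose only IPv6 addresses are ::1 and an fe80:: address would be treated as IPv4-only, so AAAA queries could be skipped when AI_ADDRCONFIG is set.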

Neil (goofandfroggie) wrote :

I'm glad to see it's not just me having this problem still.
I was given this little fix and it works great. Maybe it will help; forgive me if this has been posted already, there is a lot to read through.

sudo gedit /etc/resolv.conf

change: nameserver 10.1.1.1 (numbers maybe different on yours)
to: nameserver 8.8.8.8 and then save.

I would be interested if it works for other people.
This is the only way I can use Google Earth, Firefox and Thunderbird without changing the IPv6 settings in Firefox and Thunderbird. I cannot use Google Earth at all unless I change the nameserver; then all is good.
But I have to do this each time I start up.

Martin Pitt (pitti) wrote :

Hello,

Neil [2011-02-01 7:32 -0000]:
> I would be interested if it works for other people.

Yes, for me as well.

> But I have to do this each time I start up.

I created a script for that:

$ cat /etc/network/if-up.d/0nameserver
#!/bin/sh
grep -q Speedport_W_303V_Typ_B /etc/resolv.conf || exit 0
cat <<EOF > /etc/resolv.conf
nameserver 217.0.43.81
nameserver 217.0.43.65
EOF

The first line checks if I'm in my "home" network, as I only want to apply this
workaround when I'm at home.

(The script needs to be executable)

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)

>change: nameserver 10.1.1.1 (numbers maybe different on yours)
>to: nameserver 8.8.8.8 and then save.

I think this is just switching from your ISP's to Google's DNS server. Admittedly many ISPs' servers are broken, but changing the default warrants more discussion.

Tore Anderson (toreanderson) wrote :

FYI, today Mozilla released Firefox 4.0 beta 11, which now calls getaddrinfo() with the AI_ADDRCONFIG flag. You can get it from http://www.firefox.com/beta/.

This solves half of the problem. The remaining piece is now to make glibc ignore link-local IPv6 addresses when called with the AI_ADDRCONFIG flag. (This is how all other major operating systems behave already.) I have a bug open in the glibc bugzilla at http://sourceware.org/bugzilla/show_bug.cgi?id=12377 - it would be really great if the Ubuntu glibc developers could help out by writing a suitable patch and attach it to the bug report. I don't think it should be very hard (just extend the already existing logic that ignores loopback addresses). Unfortunately, I'm not much of a programmer myself...

Tore

Tore Anderson (toreanderson) wrote :

Here's one half of the solution - it's a patch to glibc that makes getaddrinfo() ignore link-local addresses when called with the AI_ADDRCONFIG flag set. This makes getaddrinfo() avoid querying for AAAAs when the host has no IPv6 connectivity, provided that the AI_ADDRCONFIG flag is set.

Tore

Tore Anderson (toreanderson) wrote :

Here's the second half of the solution. It's a patch that makes Mozilla Firefox use AI_ADDRCONFIG when calling getaddrinfo(). Note that the Mozilla release drivers have already approved this patch for inclusion on the 3.6.x branch, and it has already been committed to Firefox 4.0 (it's included in beta11).

Tore

gene (eugenios) wrote :

Unbelievable, this bug still manages to bug people on the latest and fully updated Ubuntu 11.04!
The strangeness of the situation is as follows:

ubuntu 11.04, uname -a:
Linux 3.0.0-mine #3 SMP Thu Jul 28 14:03:44 CDT 2011 i686 i686 i386 GNU/Linux

With Firefox 5.0, Epiphany and Chromium, some websites take a very long time to load, while everything else that uses the network is fast. E.g., w3m, lynx and especially elinks are extremely fast! At the same time, on

ubuntu 10.04:
 uname -a
Linux 2.6.35.13-mine #1 SMP Fri May 6 00:20:57 CDT 2011 x86_64 GNU/Linux
Exact same browsers are almost as fast as their text-based brethren.

So, IMHO, the problem resides not with the actual browser(s) but with something else.
This is getting ridiculous!

gene (eugenios) wrote :

Forgot to mention that neither disabling IPv6 completely nor "playing" with /etc/nsswitch.conf works.
Since this bug is filed against Karmic, I wonder: if I file a new bug for what I see on my machine, do I have to mark it as a duplicate of this one?

Neil (goofandfroggie) wrote :

I still have some problems too, but only with Google Earth and Ubuntu Tweak: Google Earth won't "connect" and Ubuntu Tweak can't get updates. But if I run sudo gedit /etc/resolv.conf and set the nameserver to 8.8.8.8, they work, as I said earlier in comment #312.

Joel (jeidsath) wrote :

I have this issue with ssh in Ubuntu 11.10. Installing the power-dns resolver as mentioned in an earlier comment worked for me.

To install pdns-resolver, I set my nameserver to 127.0.0.1 in /etc/resolv.conf and followed these instructions:
http://www.thatfleminggent.com/2009/08/09/getting-a-powerdns-recursor-up-and-going-fast

Javier Vilalta (jvilalta) wrote :

I'm not sure if this is the same bug I'm experiencing, but if I try to access a domain without an IPv6 address, I get this on tshark:

  0.000000 192.168.2.103 -> 192.168.2.254 DNS 74 Standard query AAAA one.ubuntu.com
  0.074500 192.168.2.254 -> 192.168.2.103 DNS 135 Standard query response
  0.074682 192.168.2.103 -> 192.168.2.254 DNS 95 Standard query AAAA one.ubuntu.com.internal.eudemo.info
  0.075854 192.168.2.254 -> 192.168.2.103 DNS 95 Standard query response, No such name
  0.075991 192.168.2.103 -> 192.168.2.254 DNS 74 Standard query A one.ubuntu.com
  0.147486 192.168.2.254 -> 192.168.2.103 DNS 106 Standard query response A 91.189.89.219 A 91.189.89.218

As you can see, between the AAAA and the A resolution there's a wrong query with my local domain added: this is the one which takes a few seconds to fail (not in this case, because I have set up my dnsmasq with local=/internal.eudemo.info/ to get a fast response).
I have tested it with both Firefox and Chrome and both do the same, so I assume it is a system problem. Is this the same problem, or do I need to open a separate bug report (or is something wrong with my setup)?

Stéphane Graber (stgraber) wrote :

New patches were proposed a few days ago on Red Hat's bug tracker at https://bugzilla.redhat.com/show_bug.cgi?id=505105

Tore Anderson (toreanderson) wrote :

Stéphane, the same patch was posted in this bug as well, see comment #316. (The one in #317 is no longer necessary, as it's been included in the NSPR upstream code for a long time now.)

Tore

LinkedIn
------------

Bug,

I'd like to add you to my professional network on LinkedIn.

- Joe

Joe Klein
Security Researcher at IPv6 Cyber Security Forum
Washington D.C. Metro Area

Confirm that you know Joe Klein:
https://www.linkedin.com/e/vie2u9-h1zn0mwv-2f/isd/7022549566/G7SZOZHv/?hs=false&tok=3uKWta_34LVlc1

dhenry (tfc-duke) wrote :

This LinkedIn invitation is a bit odd : Bug can't reply to you :)

Pavel Šimerda (pavlix-a) wrote :

I would like to add new information and research that has been done in the Fedora project:

https://fedoraproject.org/wiki/Networking/NameResolution/ADDRCONFIG

It links to related fedora bug reports which in turn link to upstream bug reports. It contains enough information about what is required to solve dualstack getaddrinfo() problems.

We are working on this and invite anyone from the community to help us get rid of dualstack-related name resolution problems. Feel free to contact us. Contacts at the Fedora feature page:

https://fedoraproject.org/wiki/Features/DualstackNetworking

Or contact me directly:

https://fedoraproject.org/wiki/User:Pavlix

no longer affects: network-manager (Ubuntu)
no longer affects: network-manager (Ubuntu Karmic)
no longer affects: network-manager (Ubuntu Lucid)
Faye Salwin (faye-salwin) wrote :

In the hope that this helps someone. I spent most of today fighting this and found a solution.

d-i preseed/early_command string grep -q options /etc/resolv.conf || echo "options single-request" >> /etc/resolv.conf ;

and then

d-i preseed/late_command string grep -q options /etc/resolvconf/resolv.conf.d/tail || echo "options single-request" >> /etc/resolvconf/resolv.conf.d/tail ;

The difference in speed of install is marked.

Faye Salwin (faye-salwin) wrote :

oops, that late_command doesn't work, but you get the picture. It's missing in-target, but I'm not sure if I can in-target redirect.
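
For anyone who wants to try the same option outside the installer, this is roughly what the resulting file would contain. The nameserver address below is a documentation placeholder, not a recommendation. As far as the resolv.conf(5) man page describes it, single-request makes glibc send the A and AAAA queries one after the other instead of in parallel, which helps with DNS servers that mishandle the parallel requests, at the cost of slightly slower lookups.

```
# /etc/resolv.conf (sketch; 192.0.2.53 is a placeholder address)
nameserver 192.0.2.53
options single-request
```

On systems where resolvconf regenerates /etc/resolv.conf, the options line would have to go into one of its source files (such as the tail file mentioned above) to survive a reconnect.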

Displaying first 40 and last 40 comments. View all 330 comments or add a comment.