wget doesn't redo DNS lookup

Bug #84104 reported by nrdb
Affects         Status         Importance   Assigned to    Milestone
wget            Unknown        Unknown
wget (Ubuntu)   Fix Released   Wishlist     Micah Cowan

Bug Description

Version GNU Wget 1.10.2

When downloading a big file from a website that uses a dynamic DNS service, if the IP changes, the connection to the server is broken. When trying to reconnect, wget doesn't do a new DNS lookup and keeps trying the old IP address.

1) start a big download from a site using a dynamic DNS
2) cause the site's IP address to change (e.g. reset the ADSL modem)

Wget will try to reconnect

3) note that "$ dig <host name>" reports new IP address of host
4) note that wget is still trying to contact old IP address
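
Steps 3 and 4 amount to comparing a fresh lookup against the address wget is still using. A minimal Python sketch of that check follows. The names `resolve_all` and `address_changed`, and the injectable `resolver` parameter, are hypothetical and exist only for illustration; the default resolver relies on the standard `socket.getaddrinfo`.

```python
import socket


def resolve_all(host, port=80):
    """Return the set of IP addresses the system resolver currently gives for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def address_changed(host, old_ip, resolver=resolve_all):
    """Return True if old_ip is no longer among the addresses host resolves to."""
    return old_ip not in resolver(host)
```

If `address_changed` returns True for the address wget keeps retrying, the host has moved and the retries are going to a stale IP.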

nrdb (nrdb01)
description: updated
Diego Ongaro (ongardie) wrote :

I interpret the --no-dns-cache man documentation to suggest this behavior is intentional. Here's the relevant excerpt.

--no-dns-cache
Turn off caching of DNS lookups. Normally, Wget remembers the IP addresses it
looked up from DNS so it doesn’t have to repeatedly contact the DNS server for
the same (typically small) set of hosts it retrieves from. This cache exists
in memory only; a new Wget run will contact DNS again.

However, it has been reported that in some situations it is not desirable to
cache host names, even for the duration of a short-running application like
Wget. With this option Wget issues a new DNS lookup (more precisely, a new
call to "gethostbyname" or "getaddrinfo") each time it makes a new connection.

[snip]
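
The caching behavior the excerpt describes can be sketched roughly as below. This is not wget's code (wget is written in C); `DnsCache`, its `enabled` flag, and the injectable `resolver` are assumptions made for illustration, with `enabled=False` standing in for the effect of --no-dns-cache.

```python
import socket


class DnsCache:
    """In-memory host -> addresses cache, loosely modelled on the excerpt above."""

    def __init__(self, resolver=None, enabled=True):
        # The resolver is injectable so the sketch can run without network access.
        self._resolver = resolver or self._system_lookup
        self._enabled = enabled
        self._entries = {}

    @staticmethod
    def _system_lookup(host):
        infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
        return [info[4][0] for info in infos]

    def lookup(self, host):
        # With caching on, a host is resolved once and remembered for the whole run.
        if self._enabled and host in self._entries:
            return self._entries[host]
        addresses = self._resolver(host)
        if self._enabled:
            self._entries[host] = addresses
        return addresses
```

With `enabled=True`, a second `lookup` for the same host never reaches the resolver, which is why a changed IP goes unnoticed within a single run.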

nrdb (nrdb01) wrote :

Wouldn't the above option mean doing a DNS lookup before getting each packet?

And if wget is failing to connect, isn't it best for it to do everything it can to reconnect?

Diego Ongaro (ongardie) wrote :

I believe "With this option Wget issues a new DNS lookup ... each time it makes a new connection." indicates that it wouldn't do a DNS lookup per packet, but rather per connection.

Regarding "And if wget is failing to connect isn't it best for it to do everything it can to reconnect?", I personally agree with you, but I do not know the developers' opinion.

Micah Cowan (micahcowan) wrote :

There is no software anywhere that would ever do a DNS lookup before each packet in a TCP connection. For one thing, that would be just ridiculously expensive, at least doubling the number of packets being sent out, and imposing huge delays as the application waits for the DNS response. For another, once a TCP connection is established, all packets must necessarily be between the same IP addresses. A "connection" is defined by an address/port pair on each end. If you change any of those, it's not the same connection anymore. Thirdly, once a connection has been established, the system kernel typically handles all the low-level packet sending, not the application. So, it really wouldn't make sense to have an option that controls DNS querying between packets for a single connection.
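
To illustrate the point that a connection is defined by an address/port pair on each end, here is a small Python sketch using only the standard `socket` module on the loopback interface. The variable names are arbitrary, and the ports are whatever the operating system assigns.

```python
import socket

# Listen on an ephemeral loopback port so the example needs no external network.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

# Each end sees the connection as (local address/port, remote address/port).
client_view = (client.getsockname(), client.getpeername())
server_view = (server_side.getsockname(), server_side.getpeername())
print(client_view)
print(server_view)

for s in (client, server_side, listener):
    s.close()
```

The two views are mirror images of each other. No DNS name appears anywhere in them, so a later change in DNS cannot affect a connection that is already established.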

As Diego indicated, wget caches DNS values by default, to enhance efficiency. If this is not the behavior you desire, you should use the --no-dns-cache option.

...However, I think your point that wget ought "to do everything it can" to succeed in reestablishing connection is a valid one. In the situation you've described, it's not possible for wget to avoid losing the initial connection, but if it fails in its second attempt, it should possibly attempt a DNS lookup. I'll go ahead and confirm this bug (I'm actually the upstream maintainer… as of yesterday!). I'll set the priority low, though, as there is a simple workaround (--no-dns-cache), and I have other things that will keep me busy for a while. :)
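
One way the suggested behavior could look is sketched below in Python. It is not the upstream fix: `connect_with_refresh` and its parameters are hypothetical, `cache` is a plain dict, and for simplicity the sketch drops the cached address after the first failed attempt rather than the second.

```python
def connect_with_refresh(host, port, cache, resolve, connect, max_attempts=3):
    """Try the cached address first; after a failure, forget it and resolve again."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        address = cache.get(host)
        if address is None:
            # Nothing cached (or the entry was dropped): ask DNS again.
            address = resolve(host)
            cache[host] = address
        try:
            return connect(address, port)
        except OSError as exc:
            last_error = exc
            # The cached address may be stale, so force a fresh lookup next time.
            cache.pop(host, None)
    raise last_error
```

Here `resolve(host)` and `connect(address, port)` are supplied by the caller, for example thin wrappers around `socket.getaddrinfo` and `socket.create_connection`.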

Changed in wget:
assignee: nobody → micahcowan
importance: Undecided → Low
status: New → Triaged
Micah Cowan (micahcowan)
Changed in wget:
importance: Low → Wishlist
nrdb (nrdb01) wrote : Re: [Bug 84104] Re: wget doesn't redo DNS lookup

Thanks for listening.

Micah Cowan (micahcowan) wrote :

Hm, well, I wanted to note that it's in our upstream bugtracker at https://savannah.gnu.org/bugs/index.php?20393 , but apparently LP doesn't track Savannah.

Changed in wget:
status: New → Unknown
Adam Buchbinder (adam-buchbinder) wrote :

Launchpad does apparently support Savannah now; it's just picky about the URL form it'll accept.

Changed in wget:
importance: Undecided → Unknown
status: New → Unknown
Clint Byrum (clint-fewbar) wrote :

This was fixed in wget 1.11 according to the upstream bug report. Marking as Fix Released. (wget 1.12 was shipped in Ubuntu 10.04.)

Changed in wget (Ubuntu):
status: Triaged → Fix Released