proxy tries ipv6 and gets 503 when no ipv6 routes

Bug #1547640 reported by Scott Moser
This bug affects 13 people
Affects          Status        Importance  Assigned to  Milestone
squid3 (Ubuntu)  Fix Released  High        Unassigned
  Precise        Fix Released  High        Unassigned
  Trusty         Fix Released  High        Unassigned
  Wily           Fix Released  Medium      Unassigned
  Xenial         Fix Released  High        Unassigned

Bug Description

== Begin SRU Information ==
[Impact]
Users of squid3 as a proxy on a host without ipv6 connectivity will see http '503' errors if they attempt to access a url through that proxy that has greater than 9 ipv6 addresses associated with it.

The failure case that specifically affected Ubuntu users was:
 a.) user uses squid from Ubuntu as a proxy
 b.) security.ubuntu.com and archive.ubuntu.com had additional IPV6 addresses added to their dns, such that there were 10 ipv6 addresses for each.
 c.) the squid system does not have access to the ipv6 addresses. Most likely that would be a result of having no routable ipv6 traffic.

The change as described in the upstream commit is:
| Update forward_max_tries to permit 25 server paths
|
| With cloud sites becoming more popular more CDN servers are producing
| long lists of IPv6 and IPv4 addresses. If there are not enough paths
| selected the IPv4 ones may never be reached.

[Test Case]
The attached 'lp-1547640.sh' can be run with:
  ./lp-1547640.sh setup
  ./lp-1547640.sh test

It installs squid3 and sets up dnsmasq to know about 10 ipv6 addresses for a host, and then attempts to use that squid proxy.
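
The attached script is not reproduced here. As a rough sketch of the DNS side of such a setup, the Python below generates hosts-file style lines that give one name 10 ipv6 addresses plus one ipv4 address. It assumes dnsmasq is pointed at the resulting file (for example via its addn-hosts option); the host name and the addresses (documentation prefixes 2001:db8::/32 and 192.0.2.0/24) are hypothetical and not taken from lp-1547640.sh.

```python
def hosts_lines(name, n_ipv6, ipv4="192.0.2.10"):
    """Build hosts-file lines mapping `name` to n_ipv6 IPv6 addresses and one IPv4 address.

    All addresses are hypothetical and come from documentation prefixes.
    """
    lines = [f"2001:db8::{i:x} {name}" for i in range(1, n_ipv6 + 1)]
    lines.append(f"{ipv4} {name}")
    return lines


if __name__ == "__main__":
    # Ten IPv6 entries is the count that triggered the bug.
    print("\n".join(hosts_lines("mirror.example.test", 10)))
```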

[Regression Potential]
Likely scenarios to cause regression would be for hosts that have several ipv6 addresses. The change has been in squid3 upstream trunk since 2013-08-21, so it has been in use for quite a while. It is released in squid's 3.5 branch.

[Other Info]
After we saw and diagnosed this failure, Canonical's IS team removed one of the ipv6 addresses from security.ubuntu.com and archive.ubuntu.com, so that there are only 9 present now.
  $ host archive.ubuntu.com | grep 'has IPv6'
  archive.ubuntu.com has IPv6 address 2001:67c:1562::16
  archive.ubuntu.com has IPv6 address 2001:67c:1360:8c01::19
  archive.ubuntu.com has IPv6 address 2001:67c:1562::14
  archive.ubuntu.com has IPv6 address 2001:67c:1560:8001::11
  archive.ubuntu.com has IPv6 address 2001:67c:1360:8001::17
  archive.ubuntu.com has IPv6 address 2001:67c:1560:8001::13
  archive.ubuntu.com has IPv6 address 2001:67c:1562::17
  archive.ubuntu.com has IPv6 address 2001:67c:1360:8c01::18
  archive.ubuntu.com has IPv6 address 2001:67c:1562::15

There *were* 10 on the day this caused a problem. Canonical will hold off on adding more ipv6 until this change is rolled out widely.
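
To check whether a name is at or over the threshold, the 'has IPv6 address' lines in the `host` output can be counted. This is only a sketch: the function name and the sample output (using the hypothetical name mirror.example.test) are illustrative, and the limit of 10 is the old forward_max_tries default discussed in this bug.

```python
def count_ipv6(host_output):
    """Count the 'has IPv6 address' lines in the output of `host NAME`."""
    return sum(
        1 for line in host_output.splitlines() if " has IPv6 address " in line
    )


OLD_FORWARD_MAX_TRIES = 10  # default before the fix, per this bug

# Hypothetical `host` output: one IPv4 line and ten IPv6 lines.
sample = "mirror.example.test has address 192.0.2.1\n" + "\n".join(
    f"mirror.example.test has IPv6 address 2001:db8::{i:x}" for i in range(1, 11)
)

n = count_ipv6(sample)
at_risk = n >= OLD_FORWARD_MAX_TRIES
print(n, at_risk)
```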

The fix for this bug will come to xenial through a merge with debian under bug 1473691.

== End SRU Information ==

Many people run squid (squid-deb-proxy, or maas-proxy) to provide ubuntu archive mirror caching and proxying. MAAS sets this up by default for users with the 'maas-proxy' package.

On or about Friday February 19, this setup began to fail for many people.
Users would see 'apt-get update' returning 503 errors. For me, I saw 503 on security.ubuntu.com addresses.

The reason for the failure was that the DNS records for Ubuntu reached a threshold of 10 IPv6 entries. The squid proxy host did not have ipv6 connectivity, and with a limit of 10 tries the failover never reached any IPv4 address, so the request failed.
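
The arithmetic can be illustrated with a small model. This is not squid's code, just a sketch assuming the proxy walks the address list in order, IPv6 first, and gives up after max_tries attempts; the addresses are hypothetical documentation-prefix values.

```python
from ipaddress import ip_address


def first_reachable(candidates, max_tries, has_ipv6_route=False):
    """Return the first address a connection would succeed on, or None.

    Models a forwarder that tries addresses in order and stops after
    max_tries attempts. IPv6 attempts fail unless has_ipv6_route is True.
    """
    for attempt, addr in enumerate(candidates, start=1):
        if attempt > max_tries:
            break
        if ip_address(addr).version == 4 or has_ipv6_route:
            return addr
    return None


# Ten IPv6 addresses listed ahead of ten IPv4 addresses.
ipv6 = [f"2001:db8::{i:x}" for i in range(1, 11)]
ipv4 = [f"192.0.2.{i}" for i in range(1, 11)]

print(first_reachable(ipv6 + ipv4, max_tries=10))      # limit hit before any IPv4
print(first_reachable(ipv6[:9] + ipv4, max_tries=10))  # nine IPv6: tenth try is IPv4
print(first_reachable(ipv6 + ipv4, max_tries=25))      # raised limit reaches IPv4
```

With 10 IPv6 addresses and a limit of 10 the model returns None, which corresponds to the 503; removing one IPv6 address or raising the limit to 25 both let an IPv4 address be reached.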

The fix/workaround is to add the following to your squid config:
  # http://www.squid-cache.org/Doc/config/forward_max_tries/
  forward_max_tries 25

The appropriate squid config file depends on what is running squid.
  maas-proxy: /usr/share/maas/maas-proxy.conf
  squid-deb-proxy: /etc/init/squid-deb-proxy.conf
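
For anyone scripting the workaround, here is a minimal sketch of an idempotent edit on the config text. The helper name is hypothetical, it only handles simple one-line directives, and squid still has to reload its configuration afterwards.

```python
def ensure_directive(conf_text, name, value):
    """Return conf_text with exactly one `name value` line.

    An existing `name ...` line is replaced in place; otherwise the
    directive is appended at the end. Comment lines are left untouched.
    """
    out, found = [], False
    for line in conf_text.splitlines():
        parts = line.split()
        if parts and parts[0] == name:
            if not found:
                out.append(f"{name} {value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{name} {value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(ensure_directive("http_port 3128\n", "forward_max_tries", "25"), end="")
```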

I'm not sure how this previously worked, nor what change was made.
One change that was made in this time frame was a glibc update (2.19-0ubuntu6.6 to 2.19-0ubuntu6.7) for security (CVE-2013-7423 CVE-2014-9402 CVE-2015-1472 CVE-2015-1473). But it doesn't seem to make sense that that would change squid3 to start looking for AAAA records when it did not previously.
I can verify that as late as
  Thu Feb 18 06:36:07 EST 2016
I was seeing entries in my squid logs with
  1455713142.896 335 10.7.2.103 TCP_REFRESH_UNMODIFIED/200 82620 GET http://security.ubuntu.com/ubuntu/dists/xenial-security/InRelease - HIER_DIRECT/91.189.88.149 -
but now I get
  1455879482.210 1 10.7.2.103 TCP_REFRESH_FAIL/200 635 GET http://security.ubuntu.com/ubuntu/dists/precise-security/main/i18n/Index - HIER_DIRECT/2001:67c:1562::14 -
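
The visible difference between the two entries is the address family of the peer after HIER_DIRECT/. A sketch for pulling that out of an access log line follows; it assumes the whitespace-separated layout of the two lines above, with the hierarchy/peer field in the ninth position, and the function name is illustrative.

```python
from ipaddress import ip_address


def peer_ip_version(log_line):
    """Return 4 or 6 for the peer address in a squid access log line.

    Returns None when the peer field is not an IP address (for example '-').
    Assumes the hierarchy/peer field is the ninth whitespace-separated field.
    """
    fields = log_line.split()
    _hierarchy, _, peer = fields[8].partition("/")
    try:
        return ip_address(peer).version
    except ValueError:
        return None


before = ("1455713142.896 335 10.7.2.103 TCP_REFRESH_UNMODIFIED/200 82620 GET "
          "http://security.ubuntu.com/ubuntu/dists/xenial-security/InRelease "
          "- HIER_DIRECT/91.189.88.149 -")
after = ("1455879482.210 1 10.7.2.103 TCP_REFRESH_FAIL/200 635 GET "
         "http://security.ubuntu.com/ubuntu/dists/precise-security/main/i18n/Index "
         "- HIER_DIRECT/2001:67c:1562::14 -")

print(peer_ip_version(before), peer_ip_version(after))
```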

Scott Moser (smoser)
description: updated
Revision history for this message
Seth Arnold (seth-arnold) wrote :

Adding dns_v4_first on to my 14.04 LTS /etc/squid-deb-proxy/squid-deb-proxy.conf solved this for me.

My personal best guess is that something happened during machine reboots in the Canonical datacenter to address the glibc updates.

My failures were to both security.ubuntu.com and archive.ubuntu.com, e.g.:

W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/trusty-security/restricted/binary-amd64/Packages 503 Service Unavailable

W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/trusty-proposed/restricted/binary-amd64/Packages 503 Service Unavailable

(there were dozens more like this, these two were just side-by-side in scrollback.)

Thanks

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in squid (Ubuntu):
status: New → Confirmed
Changed in squid-deb-proxy (Ubuntu):
status: New → Confirmed
Changed in maas:
status: New → Triaged
importance: Undecided → High
milestone: none → 2.0.0
Changed in maas:
assignee: nobody → Andres Rodriguez (andreserl)
tags: added: cloud-installer
Revision history for this message
Johan Ehnberg (johan-ehnberg) wrote :

On my 14.04 LTS system the workaround was to add 'dns_v4_first on' to /etc/maas/maas-proxy.conf

Revision history for this message
🤖 Landscape Builder (landscape-builder) wrote :

Does someone know yet what happened?

affects: squid (Ubuntu) → squid3 (Ubuntu)
Revision history for this message
Stéphane Graber (stgraber) wrote :

I'm unfamiliar with the squid codebase, but if it uses the normal socket library, it would call getaddrinfo and then iterate over the results. Those results would begin with the IPv6 (AAAA) records, as IPv6 is always to be preferred over IPv4 when available, but any attempt to connect would result in a Network unreachable error and so cause a fallback to the next result.

Assuming squid uses the normal resolving code, the only normal situations in which this behavior would happen are if the host does have a route to the target IPv6 subnet (such as a default route) OR if getaddrinfo is only returning AAAA records.

With getaddrinfo being provided by glibc and in the very codepath which was modified for the security update, it'd be my first bet that this is somehow related. It would be pretty trivial to check too for someone with an affected system, just downgrade glibc to the previous version (right before this week's security fix), then restart squid and see if it behaves normally. If it does, upgrade glibc again, restart squid again and confirm that it's again misbehaving.

I can't easily do that check myself as my home network is IPv6 only and so I can't possibly be affected by this bug as all my squid servers run in IPv6-only mode (and are still all working as expected).

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I tried downgrading libc6 and restarting squid (heck, even rebooting the container), but it still happened. Scott IIRC also tried that.

Still, it's a heck of a coincidence.

Squid in debug mode shows it's getting 10 IPv4 and 10 IPv6 addresses back for the archive, and trying each one in turn. But once it crosses over to the ipv4 ones the log gets a bit confusing for me:
2016/02/20 22:18:32.559| ipcache.cc(497) ipcacheParse: ipcacheParse: 20 answers for 'security.ubuntu.com'
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #0 [2001:67c:1360:8001::17]
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #1 [2001:67c:1562::15]
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #2 [2001:67c:1360:8c01::19]
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #3 [2001:67c:1360:8c01::18]
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #4 [2001:67c:1562::14]
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #5 [2001:67c:1562::17]
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #6 [2001:67c:1560:8001::13]
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #7 [2001:67c:1562::13]
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #8 [2001:67c:1562::16]
2016/02/20 22:18:32.559| ipcache.cc(566) ipcacheParse: ipcacheParse: security.ubuntu.com #9 [2001:67c:1560:8001::11]
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #10 91.189.91.13
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #11 91.189.88.153
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #12 91.189.91.15
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #13 91.189.91.23
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #14 91.189.88.152
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #15 91.189.88.149
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #16 91.189.92.201
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #17 91.189.91.24
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #18 91.189.92.200
2016/02/20 22:18:32.559| ipcache.cc(555) ipcacheParse: ipcacheParse: security.ubuntu.com #19 91.189.91.14

Marks :17 as bad:
2016/02/20 22:18:32.560| ipcache.cc(1075) ipcacheMarkBadAddr: ipcacheMarkBadAddr: security.ubuntu.com [2001:67c:1360:8001::17]:80

Decides to check :15
2016/02/20 22:18:32.560| ipcache.cc(1039) ipcacheCycleAddr: ipcacheCycleAddr: security.ubuntu.com now at [2001:67c:1562::15] (2 of 20)
2016/02/20 22:18:32.560| AsyncCall.cc(85) ScheduleCall: ConnOpener.cc(132) will call fwdConnectDoneWrapper(local=[::] remote=[2001:67c:1360:8001::17]:80 flags=1, errno=101, flag=-8, data=0x555c4de451b8) [...


Revision history for this message
Scott Moser (smoser) wrote :

There is a work around for this issue currently in place.
The change was to remove one of the 10 ipv6 addresses that were returned in a query for security.ubuntu.com or archive.ubuntu.com. Now there are only 9 ipv6 addresses in place.

This works around the issue and users should not see this error when using squid as an apt mirror.

We will work on diagnosing the fix and getting the proper change SRU'd into the archive.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I think the MAAS tasks can be removed or marked as invalid here.

no longer affects: maas
no longer affects: maas/1.10
no longer affects: maas/1.9
Revision history for this message
Scott Moser (smoser) wrote :

It appears this bug is fixed in 3.5.1, which Robie Basak is syncing from debian.
I've tried to reproduce but have not been able to. When Robie fixes bug 1473691, the fix should come into xenial (16.04).

We can look upstream to find what fix this actually was and cherry pick back to 14.04.

no longer affects: squid-deb-proxy (Ubuntu)
Changed in squid3 (Ubuntu Trusty):
status: New → Confirmed
Changed in squid3 (Ubuntu Wily):
status: New → Confirmed
Changed in squid3 (Ubuntu Trusty):
importance: Undecided → High
Changed in squid3 (Ubuntu Xenial):
status: Confirmed → In Progress
importance: Undecided → High
Changed in squid3 (Ubuntu Wily):
importance: Undecided → Low
importance: Low → Medium
Revision history for this message
Scott Moser (smoser) wrote :

I've marked 16.04 task as triaged.

Revision history for this message
Amos Jeffries (yadi) wrote :

The upstream fix was http://www.squid-cache.org/Versions/v3/3.5/changesets/squid-3-12982.patch - which is to increase the number of IPs attempted to 25 instead of just 10.

Revision history for this message
Amos Jeffries (yadi) wrote :

And for the record: no, Squid does not use libc getaddrinfo(). That API is several orders of magnitude too slow for even small Squid installations.

description: updated
Revision history for this message
Paul Gear (paulgear) wrote :

@yadi Won't changing the limit from 10 to 25 just put off the problem until later? As I understand it, the main reason this issue caused problems is that squid attempts IPv6 connections from hosts without global IPv6 connectivity.

Revision history for this message
Scott Moser (smoser) wrote :

@paul,
 I suspect the '25' is 25 ipv6 addresses. That's based on our debugging and the fix we put into place. We started seeing the bug when a 10th ipv6 address was added to archive.ubuntu.com (and security.ubuntu.com). The workaround we put in place was to remove a single ipv6 address, resulting in 9 addresses.

So yes, it does just push the issue off, but a fairly large way off.

@Amos,
Thank you for your assessment and the pointer to the fix.

Changed in squid3 (Ubuntu Trusty):
status: Confirmed → Triaged
Changed in squid3 (Ubuntu Wily):
status: Confirmed → Triaged
Scott Moser (smoser)
description: updated
Scott Moser (smoser)
description: updated
Revision history for this message
Andres Rodriguez (andreserl) wrote :

Why wouldn't an appropriate fix be to prevent squid from using IPv6 if the system only has link-local addresses and not global addresses?

Scott Moser (smoser)
Changed in squid3 (Ubuntu Precise):
status: New → Triaged
importance: Undecided → High
Revision history for this message
Amos Jeffries (yadi) wrote :

Andres,
 Because there is no way to distinguish between a local-only network and one using NAT without actually trying to connect to the IPs (which is exactly what Squid is doing, up to the limit of forward_max_tries). The problem is identical and far more widespread in IPv4. Disabling IPv4 whenever RFC1918 addresses were the only ones assigned would cut off connectivity for a huge number of networks.

It simply comes down to the fact that despite some mistaken opinions to the contrary, IPv6 is mandatory for any network that wishes to communicate with the www. IPv4-only networks (even just on the global facing part) will face more and more inability to communicate as time passes. We can juggle some numbers to work around the pain for a while. But in the end IPv6 is mandatory.

Revision history for this message
Paul Gear (paulgear) wrote :

@yadi: link-local only was the proposal by @andreserl (not RFC1918, or its IPv6 equivalent, ULA), and that is completely doable. So is detection of an isolated, but not link-local-only network. It's simple - if there's no route in the local routing table to the destination, it should be excluded from attempts. This would work for both IPv4 and IPv6, regardless of the presence of NAT, because NAT still requires a route to the destination in order to function.
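
A rough sketch of the route-based filtering proposed here, not anything squid implements: candidates are kept only if some entry in a routing table covers them. The routing table is passed in as a plain list of prefixes (reading it from the kernel is out of scope), and the addresses are hypothetical documentation-prefix values.

```python
from ipaddress import ip_address, ip_network


def routable(candidates, routes):
    """Keep only the candidate addresses covered by at least one route prefix.

    Membership tests between different IP versions are simply False,
    so IPv4 routes never match IPv6 candidates and vice versa.
    """
    nets = [ip_network(r) for r in routes]
    return [a for a in candidates if any(ip_address(a) in n for n in nets)]


# A host with an IPv4 default route but only a link-local IPv6 route.
routes = ["0.0.0.0/0", "fe80::/64"]
candidates = ["2001:db8::1", "2001:db8::2", "192.0.2.1", "192.0.2.2"]

print(routable(candidates, routes))
```

With an IPv6 default route (::/0) added to the list, the IPv6 candidates would be kept as well.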

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package squid3 - 3.1.19-1ubuntu3.12.04.6

---------------
squid3 (3.1.19-1ubuntu3.12.04.6) precise-security; urgency=medium

  * SECURITY UPDATE: denial of service via crafted UDP SNMP request
    - debian/patches/CVE-2014-6270.patch: fix off-by-one in
      src/snmp_core.cc.
    - CVE-2014-6270
  * SECURITY UPDATE: error handling vulnerability
    - debian/patches/CVE-2016-2571.patch: better handling of huge response
      headers in src/http.cc.
    - CVE-2016-2571
  * Fix security issue that only applies when package is rebuilt with the
    enable-ssl flag, which is not the case in the Ubuntu archive.
    - debian/patches/CVE-2014-0128.patch: denial of service via a crafted
      range request.
  * debian/patches/increase-default-forward-max-tries.patch:
    change the default setting of 'forward_max_tries' from 10
    to 25. (LP: #1547640)

 -- Marc Deslauriers <email address hidden> Fri, 04 Mar 2016 14:57:14 -0500

Changed in squid3 (Ubuntu Precise):
status: Triaged → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package squid3 - 3.3.8-1ubuntu16.2

---------------
squid3 (3.3.8-1ubuntu16.2) wily-security; urgency=medium

  [ Scott Moser ]
  * debian/patches/increase-default-forward-max-tries.patch:
    change the default setting of 'forward_max_tries' from 10
    to 25. (LP: #1547640)

  [ Marc Deslauriers ]
  * SECURITY UPDATE: denial of service via crafted UDP SNMP request
    - debian/patches/CVE-2014-6270.patch: fix off-by-one in
      src/snmp_core.cc.
    - CVE-2014-6270
  * SECURITY UPDATE: error handling vulnerability
    - debian/patches/CVE-2016-2571.patch: better handling of huge response
      headers in src/http.cc.
    - CVE-2016-2571
  * Fix security issues that only apply when package is rebuilt with the
    enable-ssl flag, which is not the case in the Ubuntu archive.
    - debian/patches/CVE-2014-0128.patch: denial of service via a crafted
      range request.
    - debian/patches/CVE-2015-3455.patch: incorrect X509 server certificate
      domain matching.

 -- Marc Deslauriers <email address hidden> Fri, 04 Mar 2016 14:59:48 -0500

Changed in squid3 (Ubuntu Wily):
status: Triaged → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package squid3 - 3.3.8-1ubuntu6.6

---------------
squid3 (3.3.8-1ubuntu6.6) trusty-security; urgency=medium

  [ Scott Moser ]
  * debian/patches/increase-default-forward-max-tries.patch:
    change the default setting of 'forward_max_tries' from 10
    to 25. (LP: #1547640)

  [ Marc Deslauriers ]
  * SECURITY UPDATE: denial of service via crafted UDP SNMP request
    - debian/patches/CVE-2014-6270.patch: fix off-by-one in
      src/snmp_core.cc.
    - CVE-2014-6270
  * SECURITY UPDATE: error handling vulnerability
    - debian/patches/CVE-2016-2571.patch: better handling of huge response
      headers in src/http.cc.
    - CVE-2016-2571
  * Fix security issues that only apply when package is rebuilt with the
    enable-ssl flag, which is not the case in the Ubuntu archive.
    - debian/patches/CVE-2014-0128.patch: denial of service via a crafted
      range request.
    - debian/patches/CVE-2015-3455.patch: incorrect X509 server certificate
      domain matching.

 -- Marc Deslauriers <email address hidden> Fri, 04 Mar 2016 14:58:52 -0500

Changed in squid3 (Ubuntu Trusty):
status: Triaged → Fix Released
Revision history for this message
Amos Jeffries (yadi) wrote :

This is fixed in the squid package now available in Xenial.

Changed in squid3 (Ubuntu Xenial):
status: In Progress → Fix Released