glibc update caused NSS ABI break

Bug #1674532 reported by Eric Horne on 2017-03-21
This bug affects 91 people
Affects                   Importance   Assigned to
eglibc (Ubuntu Precise)   Critical     Steve Beattie
eglibc (Ubuntu Trusty)    Critical     Steve Beattie
glibc (Ubuntu)            Critical     Adam Conrad
glibc (Ubuntu Xenial)     Critical     Steve Beattie

Bug Description

After installing the libc6_2.19-0ubuntu6.10_amd64_udeb package during the automated install of Ubuntu 14.04.5, the system was suddenly unable to resolve hostnames via DNS. Installing -0ubuntu6.9 resolved the issue. Reinstalling -0ubuntu6.10 broke the system again.

I note that -0ubuntu6.10 was recently added to the archive.

I am currently unable to install Ubuntu 14.04.5.

Eric Horne (ehorne) wrote :

"Suddenly" refers to the log in which it is happily downloading stuff from the archive and after successfully downloading and installing the libc6_2.19-0ubuntu6.10 package it is unable to download the next package because it can no longer resolve the archive.ubuntu.com name that it had been resolving prior to the installation of that libc6 library. I unfortunately don't know how to get logs off the system to send in :(

Steve Beattie (sbeattie) wrote :

Hi Eric, thanks for the report. Sorry you're having difficulties. I can reproduce the issue with the Ubuntu 14.04 mini.iso from trusty-updates.

Changed in eglibc (Ubuntu):
status: New → Confirmed
importance: Undecided → Critical
Steve Beattie (sbeattie) wrote :

I am unable to reproduce it with the mini.iso from Ubuntu 12.04, even though the same problematic patch was backported to that release. It also does not reproduce with the mini.iso from Ubuntu 16.04.

Pieter Lexis (pieter-lexis) wrote :

This problem stems from the inclusion of the fix for CVE-2015-5180. The internal number for A + AAAA changed in glibc (from T_UNSPEC=62321 to T_QUERY_A_AND_AAAA=439963904).

In tcpdump output, we see queries going out for TYPE62321 (the former internal number for A+AAAA).

I think busybox uses the old symbols here?
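For what it's worth, the type field of a DNS question is only 16 bits wide, so a value like 439963904 cannot go on the wire unchanged. The sketch below assumes (this is a guess, not something confirmed in this report) that an oversized value is simply truncated to its low 16 bits. Under that assumption the new internal value comes out as 20736, which matches the TYPE20736 queries reported further down in this bug.

```python
# Values quoted in the comment above.
T_UNSPEC = 62321                # former internal "A + AAAA" query type
T_QUERY_A_AND_AAAA = 439963904  # internal value after the CVE-2015-5180 fix


def wire_qtype(internal_value: int) -> int:
    """Return the low 16 bits, i.e. what fits in a DNS QTYPE field.

    Assumption: an out-of-range value is truncated rather than rejected.
    """
    return internal_value & 0xFFFF


if __name__ == "__main__":
    for name, value in (("T_UNSPEC", T_UNSPEC),
                        ("T_QUERY_A_AND_AAAA", T_QUERY_A_AND_AAAA)):
        # Prints the hex form next to the type a DNS server would log.
        print(f"{name}: {value:#x} -> TYPE{wire_qtype(value)}")
```

So both odd query types seen in server logs (TYPE62321 and TYPE20736) are consistent with an internal glibc marker leaking into a real query.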

Ilia Sharov (elurin) wrote :

I can confirm the same kind of problem in Precise.
After the update to -ubuntu6.10, IPv4 became the preferred protocol even on IPv6-only servers.
%%
telnet google.com 443
Trying 173.194.222.102...
Trying 173.194.222.101...
Trying 173.194.222.113...
Trying 173.194.222.138...
Trying 173.194.222.139...
Trying 173.194.222.100...
telnet: Unable to connect to remote host: Network is unreachable
%%
Explicitly forcing IPv6 works fine:
%%
telnet -6 google.com 443
Trying 2a00:1450:4010:c01::65...
Connected to google.com.
Escape character is '^]'.
%%

Hi

To extend the info on this one:

unattended-upgrades updated this package today, and caused the same issue for me, causing servers to stop resolving DNS internally. Rebooting after installing resolved the issue, suggesting the update itself is fine, but the process of updating breaks resolution.

After updating a server (without reboot), the internal DNS servers received odd queries of type TYPE20736 (no idea what that is)

21-Mar-2017 10:38:49.627 queries: info: client 172.16.x.x#49803 (host.internal.domain): query: host.internal.domain IN TYPE20736 + (172.16.x.x)

After rebooting said server, it starts querying normal type A records again.

I also run some authoritative DNS servers, and have seen the same query type from the public - suggesting that other servers out in the wild are experiencing the same issue!

21-Mar-2017 14:22:22.578 queries: client x.x.x.x#43836 (www.domain.com): query: www.domain.com IN TYPE20736 - (x.x.x.x)

I realize this is not directly related to the subject of this bug, but I think it's relevant.

Cheers,
Michael

Éric Paul (epaul) wrote :

Hi,

I have the same problem. I'm unable to install any 14.04 from my PXE server. The installation proceeds as expected until:

DEBUG: retrieving libc6-udeb 2.19-0ubuntu6.10

The next package triggers a:

wget: unable to resolve host address 'fr.archive.ubuntu.com'

One strange thing I noticed while debugging: using 'route' I get the correctly resolved address for my gateway, while any 'ping fr.archive.ubuntu.com' in BusyBox prints:

ping: bad address 'fr.archive.ubuntu.com'

It's critical as I can't install anything.

Eric.

Stuart Harland (essjayhch) wrote :

This issue does not appear to exist in the installer for xenial but definitely affects the anna-install process for trusty.

The effect is to make the system unable to be installed.

Please see http://termbin.com/1npy which has the output of syslog from the installer.

Eric Horne (ehorne) wrote :

I believe changing the preseed.cfg file to use an IP address for the archive instead of the domain name will work around this. It's not a good long-term solution since it messes up Ubuntu's DNS load balancing, but if you need to perform an install this might work. I'm in the process of verifying this.

d-i mirror/http/hostname string archive.ubuntu.com

to

d-i mirror/http/hostname string (looked up ip address)

and... thank you all for quickly investigating the issue! :)

Éric Paul (epaul) wrote :

Thanks to Eric for suggesting the workaround.

It works if there's no virtual hosting on the mirror. As I manage a local mirror, I changed my apache configuration and I can install with my PXE.

    Eric.

Rally (rallyspam) wrote :

Just hit this bug with our PXE provisioning system when trying to deploy 14.04. We were deploying without issues days ago.

The syslog from the failed install shows wget to our apt mirror location working fine until after anna retrieves libc6-udeb_2.19-0ubuntu6.10_amd64.deb. The subsequent retrieval of libcryptsetup4-udeb 2:1.6.1-1ubuntu1 fails because wget is unable to resolve the host address.

It looks like libc6-udeb_2.19-0ubuntu6.10_amd64.deb was released in the last 24 hours with numerous security fixes related to hostname resolution.

Rally (rallyspam) wrote :

I just tested the workaround of setting the apt mirror to be an IP instead of a hostname. It does finish the base install, but our provisioning process is still broken. We have a late_command that gets a script to do integrations into our environment (AD domain joining, configuration management, certificates, etc.). All of that breaks as well because there are numerous parts that need hostname resolution.

Eric Horne (ehorne) wrote :

The IP workaround doesn't work because archive.ubuntu.com is a virtual host (i.e. it needs the Host: header in the HTTP request). What I did to work around that was add this to my preseed.cfg. Adjust as needed.

d-i preseed/early_command string echo "91.189.88.161 archive.ubuntu.com" >> /etc/hosts && echo "hosts: files dns" >> /etc/nsswitch.conf

I added the dns in there just in case it happened to start working again :)

Of course, add other IPs/names that you need. If you need a lot, not sure. I was playing with the idea of trying to get it to install 6.9 only, but I couldn't figure out how to tell it to do that (and the early_command is too soon, and the late_command is too late).
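If you do need more than a couple of names, one option is to generate the early_command line rather than edit it by hand. The following is only a sketch: `build_early_command` is a made-up helper name, and the second host in the usage example is a placeholder, not a real mirror. The IP addresses have to be looked up beforehand on a machine whose resolver still works.

```python
import ipaddress
from typing import Mapping


def build_early_command(hosts: Mapping[str, str]) -> str:
    """Build a d-i preseed/early_command line that pins names in /etc/hosts.

    `hosts` maps hostname -> IP address (hypothetical helper, not part of d-i).
    """
    parts = []
    for name, ip in hosts.items():
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        parts.append(f'echo "{ip} {name}" >> /etc/hosts')
    # Same nsswitch.conf line as in the workaround above.
    parts.append('echo "hosts: files dns" >> /etc/nsswitch.conf')
    return "d-i preseed/early_command string " + " && ".join(parts)


if __name__ == "__main__":
    print(build_early_command({
        "archive.ubuntu.com": "91.189.88.161",   # address quoted above
        "mirror.example.internal": "192.0.2.10",  # placeholder entry
    }))
```

With only the archive.ubuntu.com entry, this reproduces the line quoted above.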

It'd be great if 6.10 could be rolled back for now.

Steve Beattie (sbeattie) wrote :

Hi, I've put test glibc packages that revert the fix for CVE-2015-5180 in the ubuntu-security-proposed PPA: https://launchpad.net/~ubuntu-security-proposed/+archive/ubuntu/ppa/+packages. If someone could test that the udebs from there don't break your PXE install, that would be great.

Quite likely what will happen is if the libc6-udeb is manually installed after it breaks and the install is continued on successfully, another failure will happen when the libnss-dns-udeb is pulled from the ubuntu archive, and will likely need to also be manually installed.

Eric Horne (ehorne) wrote :

Is it valid to run the broken install, manually pull down and install the new package that you pushed when it breaks, and then let it continue with the installation?

Adam Conrad (adconrad) on 2017-03-21
Changed in eglibc (Ubuntu):
status: Confirmed → Invalid
Changed in eglibc (Ubuntu Precise):
importance: Undecided → Critical
status: New → Confirmed
Changed in eglibc (Ubuntu Trusty):
importance: Undecided → Critical
status: New → Confirmed
Changed in eglibc (Ubuntu Xenial):
status: New → Invalid
Changed in eglibc (Ubuntu Yakkety):
status: New → Invalid
Changed in glibc (Ubuntu Precise):
status: New → Invalid
Changed in glibc (Ubuntu Trusty):
status: New → Invalid
Changed in glibc (Ubuntu):
assignee: nobody → Adam Conrad (adconrad)
Adam Conrad (adconrad) on 2017-03-21
Changed in glibc (Ubuntu):
status: New → Fix Committed
Adam Conrad (adconrad) on 2017-03-21
Changed in glibc (Ubuntu Yakkety):
status: New → Invalid
Changed in glibc (Ubuntu Xenial):
importance: Undecided → Critical
status: New → Confirmed
Changed in eglibc (Ubuntu Precise):
assignee: nobody → Steve Beattie (sbeattie)
Changed in eglibc (Ubuntu Trusty):
assignee: nobody → Steve Beattie (sbeattie)
Changed in glibc (Ubuntu Xenial):
assignee: nobody → Steve Beattie (sbeattie)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.23-0ubuntu7

---------------
glibc (2.23-0ubuntu7) xenial-security; urgency=medium

  * REGRESSION UPDATE: Previous update introduced ABI breakage in
    internal glibc query ABI
    - Revert patches/any/CVE-2015-5180-regression.diff
      (LP: #1674532)

 -- Steve Beattie <email address hidden> Tue, 21 Mar 2017 08:54:23 -0700

Changed in glibc (Ubuntu Xenial):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eglibc - 2.19-0ubuntu6.11

---------------
eglibc (2.19-0ubuntu6.11) trusty-security; urgency=medium

  * REGRESSION UPDATE: Previous update introduced ABI breakage in
    internal glibc query ABI
    - Back out patches/any/CVE-2015-5180-regression.diff
      (LP: #1674532)

 -- Steve Beattie <email address hidden> Tue, 21 Mar 2017 03:28:13 -0700

Changed in eglibc (Ubuntu Trusty):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eglibc - 2.15-0ubuntu10.17

---------------
eglibc (2.15-0ubuntu10.17) precise-security; urgency=medium

  * REGRESSION UPDATE: Previous update introduced ABI breakage in
    internal glibc query ABI
    - Back out patches/any/CVE-2015-5180-regression.diff
      (LP: #1674532)

 -- Steve Beattie <email address hidden> Tue, 21 Mar 2017 08:49:32 -0700

Changed in eglibc (Ubuntu Precise):
status: Confirmed → Fix Released
Rally (rallyspam) wrote :

The libc6-udeb_2.19-0ubuntu6.11_amd64.udeb fix that was just released fixes the name resolution issues and we're able to provision our systems as normal again.

Just like my earlier comment, this _fix_ *broke* a running system after it was automatically applied.

Start-Date: 2017-03-22 04:17:12
Commandline: /usr/bin/unattended-upgrade
Upgrade: libc6:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), locales:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), libc-bin:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), multiarch-support:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7)
End-Date: 2017-03-22 04:17:34

Immediately after that, the system is no longer able to resolve names. On my DNS server, I can see it is sending queries of TYPE62321 instead of A.

22-Mar-2017 04:17:30.407 queries: info: client 172.16.x.x#63762 (host.interna.domain): query: host.interna.domain IN TYPE62321 + (172.16.1.10)

A reboot clears the issue, but the point is that applying the update is still a "critical" bug because it breaks [some] running systems. In this case, the system is running apache+php. It might be that it only affects certain things.

To add to that, I just manually updated another running webserver, and boom. The apache+php webserver uses DNS to resolve the database server, which it can no longer do after the update, bringing the site down.

It sends these incorrect TYPE62321 DNS queries to our DNS server.

22-Mar-2017 05:33:01.205 queries: info: client 172.16.x.x#59821 (mysql.internal.domain): query: mysql.internal.domain IN TYPE62321 + (172.16.x.x)

In this case, restarting apache2 clears the issue. I don't know if anything else also needs restarting (so I did a reboot afterwards to be sure).

If it helps, I also have a traditional resolv.conf instead of resolvconf.

Adam Conrad (adconrad) wrote :

Yes, it's known that updating from the version with the regression to the version that backed it out will cause the same (well, inverse) bug to appear until services or the machine are restarted. That's unfortunate, but there's not a whole bunch we can do to mitigate that, and reverting the regression was the important bit.

Also, once I have installed the update on the DNS server, running apt-get update makes it query a TYPE62321 record.

22-Mar-2017 06:04:37.406 queries: info: client 127.0.0.1#38103 (gb.archive.ubuntu.com.comlaude.lon2): query: gb.archive.ubuntu.com.comlaude.lon2 IN TYPE62321 + (127.0.0.1)

This server is also an apt-cacher server. If I restart apt-cacher, it queries A records correctly again.

22-Mar-2017 06:05:37.426 queries: info: client 127.0.0.1#52629 (gb.archive.ubuntu.com): query: gb.archive.ubuntu.com IN A + (127.0.0.1)
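To see which clients are still sending these bogus queries, the DNS server's query log can be tallied by query type. This is a rough sketch that assumes log lines shaped like the BIND query log lines pasted in this bug; the regular expression and function names are my own, and other logging formats would need a different pattern. BIND prints record types it has no mnemonic for as TYPEnnn, so anything matching that pattern is worth a look.

```python
import re
from collections import Counter

# Matches e.g. "client 127.0.0.1#38103 (name): query: name IN TYPE62321 + (...)"
QUERY_RE = re.compile(
    r"client (?P<client>[^#\s]+)#\d+ \((?P<name>[^)]*)\): "
    r"query: \S+ IN (?P<qtype>\S+)"
)
UNKNOWN_TYPE_RE = re.compile(r"TYPE\d+")


def count_query_types(lines):
    """Count (client, qtype) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match:
            counts[(match.group("client"), match.group("qtype"))] += 1
    return counts


def suspicious(counts):
    """Keep only entries whose type is printed in the numeric TYPEnnn form."""
    return {key: n for key, n in counts.items()
            if UNKNOWN_TYPE_RE.fullmatch(key[1])}


if __name__ == "__main__":
    sample = [
        "22-Mar-2017 06:04:37.406 queries: info: client 127.0.0.1#38103 "
        "(gb.archive.ubuntu.com.comlaude.lon2): query: "
        "gb.archive.ubuntu.com.comlaude.lon2 IN TYPE62321 + (127.0.0.1)",
        "22-Mar-2017 06:05:37.426 queries: info: client 127.0.0.1#52629 "
        "(gb.archive.ubuntu.com): query: gb.archive.ubuntu.com IN A + (127.0.0.1)",
    ]
    for (client, qtype), n in suspicious(count_query_types(sample)).items():
        print(f"{client} sent {n} query(ies) of {qtype}")
```

Any client that shows up in the suspicious set is presumably running a process that has not been restarted since the libc update.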

Again, the question is what else needs to be restarted?

Adam Conrad (adconrad) on 2017-03-22
summary: - Ubuntu 14.04 broken during PXE boot
+ glibc update caused NSS ABI break
Adam Conrad (adconrad) wrote :

"Again, the question is what else needs to be restarted?"

The answer to that is "I don't know", which is why the general recommendation here should be to just reboot. Basically, any long-running process with the older libc loaded that tries to do NSS DNS lookups will fail. This was the bug introduced in the previous update, and reverted in this, but of course, the revert will have the same effect as the original update if you upgraded and restarted with the interim libc.

Again, this sucks, but there's no sane way to both revert the ABI break and make it smooth for the people who restarted services (or the machine) with the bad update in between.
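For anyone trying to work out why the revert re-broke things, here is a toy model of the explanation above. It is a deliberate simplification and rests on one assumption: a long-running process keeps the resolver code it started with mapped in memory, while NSS modules are loaded later from whatever is on disk at that moment, and lookups only work if the two agree on the internal query ABI. The version table and function names are made up for illustration.

```python
# Hypothetical table: which internal query ABI each trusty libc6 build uses.
ABI_BY_VERSION = {
    "2.19-0ubuntu6.9": "old",   # before the CVE-2015-5180 patch
    "2.19-0ubuntu6.10": "new",  # regression: internal query value changed
    "2.19-0ubuntu6.11": "old",  # patch backed out
}


def lookup_works(loaded_version: str, on_disk_version: str) -> bool:
    """True if the code already mapped into the process and a module loaded
    later from disk agree on the internal ABI (simplified model)."""
    return ABI_BY_VERSION[loaded_version] == ABI_BY_VERSION[on_disk_version]


def replay(started_with: str, upgrades):
    """Yield (installed_version, lookups_ok) for a process that is never
    restarted while the listed upgrades are installed one after another."""
    for installed in upgrades:
        yield installed, lookup_works(started_with, installed)


if __name__ == "__main__":
    scenarios = {
        "never restarted": ("2.19-0ubuntu6.9",
                            ["2.19-0ubuntu6.10", "2.19-0ubuntu6.11"]),
        "restarted on 6.10": ("2.19-0ubuntu6.10", ["2.19-0ubuntu6.11"]),
        "skipped 6.10": ("2.19-0ubuntu6.9", ["2.19-0ubuntu6.11"]),
    }
    for label, (start, upgrades) in scenarios.items():
        for installed, ok in replay(start, upgrades):
            print(f"{label}: after {installed} -> {'ok' if ok else 'BROKEN'}")
```

In this model a process restarted on the interim build breaks when the revert lands, one that skipped the interim build never breaks, and in every case a restart after the final update brings the two sides back in agreement.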

fariazz (fariazz) wrote :

I'm having this problem in 14.04 since yesterday: all my DNS resolutions take forever, and some time out. I've restarted several times.

This has broken many integrations in our site. Is there anything that can be done to make things work?

This is the version of libc6 that I have:

dpkg -s libc6
Package: libc6
Status: install ok installed
Priority: required
Section: libs
Installed-Size: 10568
Maintainer: Ubuntu Developers <email address hidden>
Architecture: amd64
Multi-Arch: same
Source: glibc
Version: 2.21-0ubuntu4
Replaces: libc6-amd64
Depends: libgcc1
Suggests: glibc-doc, debconf | debconf-2.0, locales
Breaks: hurd (<< 1:0.5.git20140203-1), libtirpc1 (<< 0.2.3), lsb-core (<= 3.2-27), nscd (<< 2.21)
Conflicts: prelink (<= 0.0.20090311-1), tzdata (<< 2007k-1), tzdata-etch
Conffiles:
 /etc/ld.so.conf.d/x86_64-linux-gnu.conf 593ad12389ab2b6f952e7ede67b8fbbf
Description: GNU C Library: Shared libraries
 Contains the standard libraries that are used by nearly all programs on
 the system. This package includes shared versions of the standard C library
 and the standard math library, as well as many others.
Homepage: http://www.gnu.org/software/libc/libc.html
Original-Maintainer: GNU Libc Maintainers <email address hidden>

Roberts (robertsv) wrote :

Same issue on Ubuntu 14.X LTS
After the auto upgrade on 2017.03.21, the application running on Apache and PHP is sometimes unable to connect to the DB server.
Upgrade log:
Preparing to unpack .../libc6_2.19-0ubuntu6.10_amd64.deb ...
Unpacking libc6:amd64 (2.19-0ubuntu6.10) over (2.19-0ubuntu6.9) ...

Upgrade on 2017.03.22 did not help
Upgrade log:
Preparing to unpack .../libc6_2.19-0ubuntu6.11_amd64.deb ...
Unpacking libc6:amd64 (2.19-0ubuntu6.11) over (2.19-0ubuntu6.10) ...

Is there an estimated time for fixing the libc6 issue for Ubuntu 14.X LTS?

fariazz (fariazz) wrote :

My bad, I'm on 15.04.

Any workaround to make things work again?

This is causing severe problems on our end so any advice would be welcome.

Adam Conrad (adconrad) wrote :

@robertsv: As mentioned in other comments, just reboot. You'll be fine from there.

@fariazz: If you're on 15.04, you're not seeing this bug. Also, 15.04 has been out of support for over a year.

Roberts (robertsv) wrote :

@Adam C.: Thx!

Sean Leach (sean.leach) wrote :

I'm also seeing this issue replicated on 14.04.2 -

$ apt-cache show libc6 | grep Version
Version: 2.19-0ubuntu6.11
Version: 2.19-0ubuntu6

$ cat /var/log/apt/history.log

...
Start-Date: 2017-03-21 07:11:13
Upgrade: libgnutls-openssl27:amd64 (2.12.23-12ubuntu2.6, 2.12.23-12ubuntu2.7), multiarch-support:amd64 (2.19-0ubuntu6.9, 2.19-0ubuntu6.10), libfreetype6:amd64 (2.5.2-1ubuntu2.5, 2.5.2-1ubuntu2.6), libc-dev-bin:amd64 (2.19-0ubuntu6.9, 2.19-0ubuntu6.10), libc-bin:amd64 (2.19-0ubuntu6.9, 2.19-0ubuntu6.10), libc6:amd64 (2.19-0ubuntu6.9, 2.19-0ubuntu6.10), libgnutls26:amd64 (2.12.23-12ubuntu2.6, 2.12.23-12ubuntu2.7), libc6-dev:amd64 (2.19-0ubuntu6.9, 2.19-0ubuntu6.10)
End-Date: 2017-03-21 07:11:31
...

Note that some people are suggesting a reboot will fix this. In my case a reboot did not permanently fix the issue, but it did give me around 18 hours of stability before performance was too poor to serve operationally.

Harry Wiles (coddyman) wrote :

14.04
Reboot temporarily fixed issue for a few hours, but has resurfaced since.

thecyborgus (justin-demaris) wrote :

I'm still seeing this on Ubuntu 16.04 connecting to Amazon RDS as well.

Apache service reloads had temporarily resolved it, but I got a huge rash of these again across all of my servers during the Unattended installs again last night.

Start-Date: 2017-03-21 04:30:58
Commandline: /usr/bin/unattended-upgrade
Upgrade: libc6-dev:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), libc6:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), locales:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), libc-bin:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), libc-dev-bin:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), multiarch-support:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), libfreetype6:amd64 (2.6.1-0.1ubuntu2, 2.6.1-0.1ubuntu2.1)
End-Date: 2017-03-21 04:31:02

Start-Date: 2017-03-22 03:53:14
Commandline: /usr/bin/unattended-upgrade
Upgrade: libc6-dev:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), libc6:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), locales:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), libc-bin:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), libc-dev-bin:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), multiarch-support:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7)
End-Date: 2017-03-22 03:53:18

I rebooted after the error cropped up and it worked for about 8 or 9 hours before it happened again on the same servers.

rmuch (rmuch) wrote :

Experienced this issue when it brought down production services yesterday,
manifesting itself as the PHP issue described in #1674733.

A reboot temporarily solved the problem but then it reoccurred approximately 18
hours later.

I'm currently unable to provide reproduction steps other than it seems to affect
long-running processes.

Jeff Klink (techcanuck) wrote :

Agreed on this one, we're also experiencing production failures, but they happen every night around 1am and simply restarting apache clears it up.

From our logs

Start-Date: 2017-03-21 03:22:40
Commandline: /usr/bin/unattended-upgrade
Upgrade: libc6-dev:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), libc6:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), locales:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), libc-bin:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), libc-dev-bin:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6), multiarch-support:amd64 (2.23-0ubuntu3, 2.23-0ubuntu6)
End-Date: 2017-03-21 03:22:46

Start-Date: 2017-03-22 00:27:23
Commandline: /usr/bin/unattended-upgrade
Upgrade: libc6-dev:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), libc6:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), locales:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), libc-bin:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), libc-dev-bin:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7), multiarch-support:amd64 (2.23-0ubuntu6, 2.23-0ubuntu7)
End-Date: 2017-03-22 00:27:29

We believe that the unattended upgrade updated and killed us RIGHT around the same time, thus needing the apache restart. 2 nights in a row we found these failures, and 2 libc upgrades at the same time.

Taylor Otwell (taylorotwell) wrote :

Is this the new norm for Ubuntu 14.04? Just restart your entire server every night? This is the most ridiculous shit I have ever seen in my life. When will this be fixed?

Chris Monahan (cobra-v) wrote :

Taylor - I agree. I have trusted automatic security updates for years. I have to turn them off and start running updates manually because I can't trust Ubuntu won't do this again.

Adam Conrad (adconrad) wrote :

To everyone asking "when will this be fixed", it's fixed. If you're up to date, the regression is fixed. This doesn't require "nightly" reboots, it requires one reboot after being fully up to date. Sarcasm and snark and overstating the issue may make you feel better, but it doesn't help people who are actually reading the bug for factual information.

thecyborgus (justin-demaris) wrote :

For clarity's sake, and so I can understand what happened on my servers, is it correct that this issue cropped up with two different unattended installs on two consecutive days (i.e. the 21st and the 22nd)?

information type: Public → Public Security
Adam Conrad (adconrad) wrote :

@justin-demaris: Yes, sort of. What happened is that the first update caused a regression, and the second update reverted that. Any services that were restarted with the broken update would then have been "re-broken" with the update that backed out that change until they were again re-started. The good news is that people with less aggressive update schedules probably skipped the bug entirely (as upgrading from an older libc to the current one won't break), but that's scant comfort for those who came along for the ride.

Marech (marechs) wrote :

Hey! I'm a little bit confused: which version is the latest stable release where the bug is fixed?

I'm using

Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Codename: xenial

Package: libc6
Status: install ok installed
Priority: required
Section: libs
Installed-Size: 10948
Maintainer: Ubuntu Developers <email address hidden>
Architecture: amd64
Multi-Arch: same
Source: glibc
Version: 2.23-0ubuntu7
Replaces: libc6-amd64
Depends: libgcc1
Suggests: glibc-doc, debconf | debconf-2.0, locales
Breaks: hurd (<< 1:0.5.git20140203-1), libtirpc1 (<< 0.2.3), locales (<< 2.23), locales-all (<< 2.23), lsb-core (<= 3.2-27), nscd (<< 2.23)
Conffiles:
 /etc/ld.so.conf.d/x86_64-linux-gnu.conf 593ad12389ab2b6f952e7ede67b8fbbf
Description: GNU C Library: Shared libraries
 Contains the standard libraries that are used by nearly all programs on
 the system. This package includes shared versions of the standard C library
 and the standard math library, as well as many others.
Homepage: http://www.gnu.org/software/libc/libc.html
Original-Maintainer: GNU Libc Maintainers <email address hidden>

And I still got the same error this morning.

Andrew Wason (rectalogic) wrote :

After installing libc6 2.23-0ubuntu7, you can check for running processes that are using a deleted (pre-update) /lib/x86_64-linux-gnu/libc-2.23.so using lsof:

sudo lsof -d DEL | grep /lib/x86_64-linux-gnu/libc-2.23.so

Those processes need to be restarted to re-link with the fixed libc.
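If lsof is not installed, roughly the same check can be done by reading /proc directly. This is a sketch, assuming a Linux /proc layout where a replaced library shows up in /proc/PID/maps with a " (deleted)" suffix; the function names are my own. It needs to run as root to see other users' processes.

```python
import os


def uses_deleted_libc(maps_text: str, needle: str = "/libc-") -> bool:
    """Return True if a /proc/<pid>/maps dump still maps a deleted libc."""
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5]
            if needle in path and path.endswith(" (deleted)"):
                return True
    return False


def stale_pids(proc_root: str = "/proc"):
    """List PIDs whose memory maps still reference a deleted libc."""
    if not os.path.isdir(proc_root):
        return []
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "maps")) as fh:
                if uses_deleted_libc(fh.read()):
                    pids.append(int(entry))
        except OSError:
            # Process exited, or we lack permission to read its maps.
            continue
    return sorted(pids)


if __name__ == "__main__":
    for pid in stale_pids():
        print(f"PID {pid} still has the pre-update libc mapped")
```

Whatever this prints are the candidates for a restart; rebooting remains the simpler option.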

Jamie (jm7485) wrote :

I just want to chime in here and say this bug has cost our company thousands of dollars in lost revenue, not to mention the unnecessary cost of debugging time, because it broke two of the most important API endpoints we use: Stripe for payments and SendGrid for email delivery. Customers were unable to sign up, and even if they could, they were unable to pay us.

We will be disabling automatic security updates from now on.

Ponny (ponny) wrote :

Still getting this despite being on latest. Seems to come and go. I'm on 14.04.5.

Robie Basak (racb) on 2017-03-24
tags: added: regression-update
Martin Su (mrtns) wrote :

We are still encountering intermittent DNS resolution issues, even after an OS reboot. For example:

Net::OpenTimeout: execution expired
                  /usr/lib/ruby/2.3.0/resolv-replace.rb: 25:in `initialize'
                  /usr/lib/ruby/2.3.0/resolv-replace.rb: 25:in `initialize'
                        /usr/lib/ruby/2.3.0/net/smtp.rb: 542:in `open'
                        /usr/lib/ruby/2.3.0/net/smtp.rb: 542:in `tcp_socket'

Environment:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.4 LTS
Release: 14.04
Codename: trusty

$ grep '2017-03-2' /var/log/unattended-upgrades/unattended-upgrades.log
2017-03-20 06:42:25,345 INFO Initial blacklisted packages: postgresql redis-server
2017-03-20 06:42:25,346 INFO Starting unattended upgrades script
2017-03-20 06:42:25,347 INFO Allowed origins are: ['o=Ubuntu,a=trusty-security']
2017-03-20 06:42:41,350 INFO No packages found that can be upgraded unattended and no pending auto-removals
2017-03-21 06:35:11,528 INFO Initial blacklisted packages: postgresql redis-server
2017-03-21 06:35:11,531 INFO Starting unattended upgrades script
2017-03-21 06:35:11,531 INFO Allowed origins are: ['o=Ubuntu,a=trusty-security']
2017-03-21 06:35:47,806 INFO Packages that will be upgraded: libc-bin libc-dev-bin libc6 libc6-dev libfreetype6 libgnutls-openssl27 libgnutls26 multiarch-support
2017-03-21 06:35:47,807 INFO Writing dpkg log to '/var/log/unattended-upgrades/unattended-upgrades-dpkg_2017-03-21_06:35:47.807180.log'
2017-03-21 06:35:55,379 INFO All upgrades installed
2017-03-21 06:36:02,861 INFO Packages that are auto removed: ''
2017-03-21 06:36:02,941 INFO Packages auto-removed
2017-03-22 06:31:53,606 INFO Initial blacklisted packages: postgresql redis-server
2017-03-22 06:31:53,607 INFO Starting unattended upgrades script
2017-03-22 06:31:53,607 INFO Allowed origins are: ['o=Ubuntu,a=trusty-security']
2017-03-22 06:32:13,869 INFO Packages that will be upgraded: libc-bin libc-dev-bin libc6 libc6-dev multiarch-support
2017-03-22 06:32:13,870 INFO Writing dpkg log to '/var/log/unattended-upgrades/unattended-upgrades-dpkg_2017-03-22_06:32:13.869840.log'
2017-03-22 06:32:19,722 INFO All upgrades installed
2017-03-22 06:32:25,622 INFO Packages that are auto removed: ''
2017-03-22 06:32:25,668 INFO Packages auto-removed
2017-03-23 06:28:02,799 INFO Initial blacklisted packages: postgresql redis-server
2017-03-23 06:28:02,800 INFO Starting unattended upgrades script
2017-03-23 06:28:02,801 INFO Allowed origins are: ['o=Ubuntu,a=trusty-security']
2017-03-23 06:28:10,368 INFO No packages found that can be upgraded unattended and no pending auto-removals
2017-03-24 06:29:48,590 INFO Initial blacklisted packages: postgresql redis-server
2017-03-24 06:29:48,591 INFO Starting unattended upgrades script
2017-03-24 06:29:48,592 INFO Allowed origins are: ['o=Ubuntu,a=trusty-security']
2017-03-24 06:30:03,308 INFO Packages that will be upgraded: git git-core git-man
2017-03-24 06:30:03,309 INFO Writing dpkg log to '/var/log/unattended-upgrades/unattended-upgrades-dpkg_2017-03-24_06:30:03.308474.log'
2017-03-24 06:30:09,616 INFO All upgrades installed
2017-03-24 06:30:1...

Nacho Vazquez (ivazquez1) wrote :

It's still happening to my servers too.

It looks like they are being all hit (12 of them) at the same time. This is after an apt-get update / upgrade cycle and a reboot.

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"

$ apt-cache show libc6 | grep Version
Version: 2.19-0ubuntu6.11
Version: 2.19-0ubuntu6

Lukas (jossnaz) wrote :

http://stackoverflow.com/questions/43053872/file-get-contents-with-full-url-to-same-server-very-slow-with-php-7-as-of-very-r

The DNS lookup doesn't even have to fail outright; it can also just be very, very slow, or so it seems to me at least.

Lukas (jossnaz) wrote :

and yes, to say as well:

we have 4 servers affected, clients are very upset. We will be disabling automatic updates.

for the meantime: what is the proper solution to this?

Seth Arnold (seth-arnold) wrote :

Lukas, if your DNS works but is slow that's probably two junk nameservers listed in /etc/resolv.conf before one working nameserver. A full debugging of your DNS is probably out of scope for a bug report; I suggest heading to askubuntu.com or IRC.

Thanks

no longer affects: eglibc (Ubuntu)
no longer affects: eglibc (Ubuntu Xenial)
no longer affects: eglibc (Ubuntu Yakkety)
no longer affects: glibc (Ubuntu Precise)
no longer affects: glibc (Ubuntu Trusty)
no longer affects: glibc (Ubuntu Yakkety)
Changed in glibc (Ubuntu):
importance: Undecided → Critical
Paul Dubs (paul-dubs) wrote :

I am also still seeing problems related to hostname resolution.

Sometimes, a simple 'dig @8.8.8.8 google.com' will take up to 30 seconds and then timeout, sometimes it works after a long delay, and sometimes it works as quick as it should.

In any instance where it works, it says that the dns server answered in about 30ms, so the delay must be from somewhere else.

I'm running this with glibc version 2.24-3ubuntu2 on yakkety.

bhat3 (bhat3) wrote :

#Fixed LTS package versions: 2.23-0ubuntu7 (xenial) & 2.19-0ubuntu6.11 (trusty)

@jossnaz, jm7485 & co: You're not alone; for me it was way more than 50 boxes where the PHP stuff couldn't resolve names anymore and needed my attention. Nasty and frustrating for sure ... and I was also swearing ;)

But for all the "haters": Keep in mind that automatic updates mean you have automated a change request that can fail by definition. That's why you should run them in a specific time frame, so you're able to respond quickly in case of failure. If you don't know how to do that and you're going fully productive with your servers, better find a real Linux admin who knows more than just "apt install unattended-upgrades" ;)

Otherwise a preintegrated solution to problems with libc updates is to blacklist them in /etc/apt/apt.conf.d/50unattended-upgrades:

// List of packages to not update (regexp are supported)
Unattended-Upgrade::Package-Blacklist {
// "vim";
// "libc6";
// "libc6-dev";
// "libc6-i686";
};

But be aware that the security holes in there can affect a lot of stuff and will not get patched if you blacklist them. In general, I have run unattended-upgrades for many, many years now and this was only the second time I got nuked, so it's still a good trade-off if you consider the time you save on manual patching, or that your systems get patched when you don't have the time.

Another, paid, solution would be to talk with Canonical about Landscape and how it can help here: https://landscape.canonical.com/

Adam Conrad (adconrad) wrote :

@paul-dubs: "dig @8.8.8.8", by definition, completely bypasses the libc resolver. It makes a direct connection to 8.8.8.8 and queries it. If that's slow or broken for you, you have a link layer networking issue, or Google's geo-located 8.8.8.8 near you is broken, or some other similar thing. It literally can't relate to this update.
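To test the path that this bug actually touches, the lookup has to go through the C library. One way, sketched below, is Python's socket.getaddrinfo, which calls the system's getaddrinfo and therefore goes through NSS. The function name and the default host are placeholders of mine; pass whatever names your services really resolve.

```python
import socket
import sys
import time


def time_libc_lookup(host: str, port: int = 80):
    """Resolve `host` through the C library and time it.

    Returns (seconds, sorted list of addresses, error string or None).
    """
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        addresses = sorted({info[4][0] for info in infos})
        error = None
    except socket.gaierror as exc:
        addresses, error = [], str(exc)
    return time.monotonic() - start, addresses, error


if __name__ == "__main__":
    # Hosts to check come from the command line; "localhost" is a placeholder.
    for host in sys.argv[1:] or ["localhost"]:
        elapsed, addresses, error = time_libc_lookup(host)
        status = error if error else ", ".join(addresses)
        print(f"{host}: {elapsed:.3f}s {status}")
```

If this is fast and correct in a freshly started process but dig is slow, the problem is elsewhere in the network rather than in the libc update.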

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.24-9ubuntu2

---------------
glibc (2.24-9ubuntu2) zesty; urgency=medium

  * debian/patches/any/cvs-resolv-internal-qtype.diff: Revert to avoid
    failure in name resolution on upgrades from yakkety (LP: #1674532)

 -- Adam Conrad <email address hidden> Tue, 21 Mar 2017 15:27:15 -0600

Changed in glibc (Ubuntu):
status: Fix Committed → Fix Released
Daniel Colceag (danielcolceag) wrote :

I'm having issues with DNS over VPN since 2.24-0ubuntu2. I cannot access the company's internal sites through VPN anymore. Please tell me what I should post here for more details.

Seth Arnold (seth-arnold) wrote :

Daniel, if this bug is related to your issue you can simply reboot to fix it. If it's still broken then it's not this issue.

Thanks
