id crashed with SIGSEGV in sock_eq()

Bug #1571456 reported by Anders Kaseorg on 2016-04-18
This bug affects 5 people
Affects          Status         Importance  Assigned to  Milestone
GLibC            Unknown        Unknown
glibc (Debian)   Fix Released   Unknown
glibc (Fedora)   Fix Released   Undecided
glibc (Ubuntu)   Fix Released   High        Adam Conrad
Xenial           Fix Released   High        Adam Conrad

Bug Description

[Impact]

The nss_hesiod nsswitch module, which worked in previous releases, does not work at all in Ubuntu 16.04. Enabling it causes NULL pointer dereferences in calls such as getpwuid(). This will prevent any user logins from succeeding in our environment of hundreds of workstations, which in turn blocks us from upgrading from 14.04 to 16.04.

[Test Case]

# sed -i 's/passwd: *compat/& hesiod/' /etc/nsswitch.conf
# cat > /etc/hesiod.conf <<EOF
lhs=.ns
rhs=.athena.mit.edu
EOF
# id andersk
Segmentation fault (core dumped)

Expected output: uid=39270(andersk) gid=101(…) groups=101(…).
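
The sed command in the test case edits /etc/nsswitch.conf in place. As a rough sketch, the same substitution can be previewed on sample input first. The helper name add_hesiod_source and the sample lines are assumptions for illustration, not part of the report.

```shell
# Hypothetical helper: apply the test case's substitution to stdin
# instead of editing /etc/nsswitch.conf in place.
add_hesiod_source() {
    sed 's/passwd: *compat/& hesiod/'
}

# Sample input (assumed; a real nsswitch.conf may differ).
printf 'passwd:         compat\ngroup:          compat\n' | add_hesiod_source
```

Only the passwd line gains the hesiod source; the group line passes through unchanged.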

[Regression Potential]

I wrote a 6-line patch that conditionalizes an errant res_nclose call. There is also a bigger upstream patch on the glibc 2.22 and 2.23 stable branches that entirely removes the unused abstraction that necessitated the res_nclose calls in the first place. Neither patch makes any changes outside of the glibc hesiod directory, which as of now is so thoroughly broken that there is nothing left to regress.

[Other Info]

ProblemType: Crash
DistroRelease: Ubuntu 16.04
Package: coreutils 8.25-2ubuntu2
ProcVersionSignature: Ubuntu 4.4.0-18.34-generic 4.4.6
Uname: Linux 4.4.0-18-generic x86_64
NonfreeKernelModules: openafs
ApportVersion: 2.20.1-0ubuntu2
Architecture: amd64
CurrentDesktop: GNOME
Date: Sun Apr 17 22:39:06 2016
EcryptfsInUse: Yes
ExecutablePath: /usr/bin/id
ExecutableTimestamp: 1455802667
InstallationDate: Installed on 2016-02-19 (58 days ago)
InstallationMedia: Ubuntu-GNOME 16.04 LTS "Xenial Xerus" - Alpha amd64 (20160218)
ProcCmdline: id andersk
ProcCwd: /home/anders
SegvAnalysis:
 Segfault happened at: 0x7fef32217a88 <__libc_res_nsend+3192>: cmp %dx,(%rax)
 PC (0x7fef32217a88) ok
 source "%dx" ok
 destination "(%rax)" (0x00000000) not located in a known VMA region (needed writable region)!
SegvReason: writing NULL VMA
Signal: 11
SourcePackage: coreutils
StacktraceTop:
 sock_eq (a2=0x0, a1=0x7fef33b9daf4 <_res+20>) at res_send.c:1584
 __libc_res_nsend (statp=0x7fef33b9dae0 <_res>, buf=buf@entry=0x7ffd88e80910 "@\267\001", buflen=45, buf2=buf2@entry=0x0, buflen2=buflen2@entry=0, ans=ans@entry=0x7ffd88e80d10 " you want. Don't add spaces after the\n", anssiz=1024, ansp=0x0, ansp2=0x0, nansp2=0x0, resplen2=0x0, ansp2_malloced=0x0) at res_send.c:408
 __GI___res_nsend (statp=<optimized out>, buf=buf@entry=0x7ffd88e80910 "@\267\001", buflen=<optimized out>, ans=ans@entry=0x7ffd88e80d10 " you want. Don't add spaces after the\n", anssiz=anssiz@entry=1024) at res_send.c:630
 get_txt_records (class=1, name=name@entry=0xff3dd0 "39270.uid.ns.athena.mit.edu", ctx=0xff27e0) at hesiod.c:374
 hesiod_resolve (context=context@entry=0xff27e0, name=name@entry=0x7ffd88e81190 "39270", type=type@entry=0x7fef3242a486 "uid") at hesiod.c:240
Title: id crashed with SIGSEGV in sock_eq()
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm bumblebee cdrom dip libvirtd lpadmin plugdev sambashare sbuild sudo wireshark
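
In the stack trace above, hesiod_resolve is called with name "39270" and type "uid", and get_txt_records then queries "39270.uid.ns.athena.mit.edu". That matches the key, the type, and the lhs and rhs values from /etc/hesiod.conf joined together. A minimal sketch of that composition, covering only the simple case seen in the trace (the helper name is hypothetical, and this is not glibc's code):

```shell
# Hypothetical helper: compose the DNS name queried for a Hesiod key.
# Covers only the simple case seen in the trace: <name>.<type><lhs><rhs>.
hesiod_query_name() {
    name=$1
    type=$2
    lhs=$3
    rhs=$4
    printf '%s.%s%s%s\n' "$name" "$type" "$lhs" "$rhs"
}

# Values taken from the test case's hesiod.conf and the stack trace.
hesiod_query_name 39270 uid .ns .athena.mit.edu
```

With the values from the report this prints 39270.uid.ns.athena.mit.edu, the name passed to get_txt_records.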

Created attachment 1061638
gdb "where full" results, plus a couple of variables

Description of problem:
With a working hesiod configuration, and hesiod enabled for group resolution, multiple applications are crashing while initializing a supplemental groups list.

Version-Release number of selected component (if applicable):
glibc-2.21.90-21.fc23.x86_64
coreutils-8.24-2.fc23.x86_64 used to reproduce the bug

How reproducible:
Always

Steps to Reproduce:
1. cat > /etc/hesiod.conf << EOF
lhs=.hs
rhs=.devel.redhat.com
EOF
2. Add "hesiod" as a source for "group" information in /etc/nsswitch.conf. Mine reads "files hesiod".
3. Run "groups nalin" or similar.

Actual results:
"groups" segfaults. I'll attach the gdb backtrace.

Expected results:
The expected groups list.

Additional info:

Caused by upstream commit 2212c1420c92a33b0e0bd9a34938c9814a56c0f7. Bug reported upstream. There are various ways to fix this, but which approach is best is unclear.

Reproducer without changing /etc:

cat > hesiod.conf << EOF
lhs=.hs
rhs=.devel.redhat.com
EOF
HESIOD_CONFIG=hesiod.conf getent -s hesiod group 0 0
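
For scripted regression checks, the reproducer can be wrapped so that a crash is detected from the exit status. This is a sketch under assumptions: the function names are hypothetical, the configuration is written to a scratch directory created with mktemp, and it relies on bash and dash reporting a child killed by signal N as exit status 128+N (139 for SIGSEGV).

```shell
# Hypothetical helpers for scripting the reproducer; names are illustrative.

# bash and dash report a child killed by signal N as exit status 128+N,
# so SIGSEGV (signal 11) shows up as 139.
died_of_segv() {
    [ "$1" -eq 139 ]
}

# Write the configuration to a scratch directory and run the lookup there.
# Returns 1 if getent crashed with SIGSEGV, 0 otherwise.
check_hesiod_group() {
    dir=$(mktemp -d) || return 2
    printf 'lhs=.hs\nrhs=.devel.redhat.com\n' > "$dir/hesiod.conf"
    HESIOD_CONFIG="$dir/hesiod.conf" getent -s hesiod group "$1" >/dev/null 2>&1
    status=$?
    rm -rf "$dir"
    if died_of_segv "$status"; then
        echo "getent crashed with SIGSEGV"
        return 1
    fi
    echo "getent exited with status $status"
}
```

Calling check_hesiod_group 0 on an affected system should report the crash; on a fixed system it should report an ordinary exit status instead.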

I'm reverting the upstream commit which introduced this bug.

glibc-2.22-9.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-0f9e9a34ce

glibc-2.22-9.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0f9e9a34ce

glibc-2.22-9.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Due to this change, a glibc update to the fixed versions *without* a reboot (or process restart) may cause name resolution failures.

Anders Kaseorg (andersk) wrote :
affects: coreutils (Ubuntu) → glibc (Ubuntu)
information type: Private → Public

Changed in glibc (Ubuntu):
importance: Undecided → Medium
tags: removed: need-amd64-retrace
Changed in glibc (Debian):
status: Unknown → Confirmed
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in glibc (Ubuntu):
status: New → Confirmed
Anders Kaseorg (andersk) wrote :

Here’s a debdiff adding the patch I submitted to the glibc mailing list (https://sourceware.org/ml/libc-alpha/2016-04/msg00563.html). Unlike the Fedora revert, this only touches nss_hesiod, which is completely unusable without the fix. So it should be very low risk.

Anders Kaseorg (andersk) wrote :

The attachment "glibc_2.23-0ubuntu3_lp1571456.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Adam Conrad (adconrad) wrote :

As soon as you get this committed to glibc master, I'll backport it to the 2.23 branch and pull it into the first xenial SRU.

Anders Kaseorg (andersk) wrote :

This was fixed in glibc master by ripping out the affected code. Either strategy is fine.

https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=5018f16c6205404ba3aa7298dc8a3d45fbd46bfc

Changed in glibc (Debian):
status: Confirmed → Fix Released
Changed in glibc (Ubuntu):
status: Confirmed → Triaged

Unsubscribing sponsors as it looks like this is going to happen via upstream.

Changed in glibc (Ubuntu):
assignee: nobody → Adam Conrad (adconrad)
Anders Kaseorg (andersk) wrote :

It happened upstream. What are we waiting for now?

Anders Kaseorg (andersk) wrote :

Upstream backports:

master: 5018f16c6205404ba3aa7298dc8a3d45fbd46bfc
2.23: 2d1f6790183dabf54c5b05be97d3872dab720c83
2.22: a64be6fb2f1317ce7039a4bb8638bd0c30c31e28

Ken Baker (bakerkj) wrote :

Are there any updates on the status of this?

Anders Kaseorg (andersk) wrote :

This was fixed in yakkety’s libc6 2.23-1ubuntu1. We are still waiting for a xenial SRU.

description: updated
Anders Kaseorg (andersk) on 2016-07-19
description: updated
Anders Kaseorg (andersk) wrote :

If you want to fix this by pulling in the upstream 2.23 stable branch, here is a debdiff for that. I verified that debian/patches/git-updates.diff contains the output of ‘git diff glibc-2.23 glibc-2.23-21-g146b58d’, replaced it with the output of ‘git diff glibc-2.23 glibc-2.23-71-gbbe472f’ (minus the manual directory, which was removed from the source package), and resolved a trivial patch conflict.

There are some other important fixes there, including CVE-2016-1234, CVE-2016-3706, and CVE-2016-4429.

There’s a test build in ppa:anders-kaseorg/ppa.

Anders Kaseorg (andersk) on 2016-07-22
Changed in glibc (Ubuntu):
status: Triaged → Fix Released
Changed in glibc (Ubuntu Xenial):
status: New → Confirmed
Changed in glibc (Ubuntu Xenial):
importance: Undecided → Medium
Ken Baker (bakerkj) wrote :

@Anders,

I have taken your patch (glibc_2.23-0ubuntu3_stable-updates.debdiff), applied it, and built a version on armhf and so far it works.

Thank you!

Anders Kaseorg (andersk) wrote :

Thanks Ken. I believe that either patch is ready to be uploaded to xenial-proposed (the smaller patch would just need the changelog version adjusted). I’d really appreciate any feedback from sponsors here; if you’d prefer something else in between the extremes of a targeted patch fixing just this problem and a wholesale update from the upstream stable branch, I am happy to help out in whatever way is most likely to get this fixed quickly.

Luke Faraone (lfaraone) on 2016-08-14
Changed in glibc (Ubuntu Xenial):
status: Confirmed → Triaged
Anders Kaseorg (andersk) on 2016-09-21
tags: added: patch-accepted-debian patch-accepted-upstream
Anders Kaseorg (andersk) wrote :

This is still awaiting sponsorship of one of these patches to xenial-proposed.

Taylor Yu (tlyu) on 2016-09-28
description: updated
Changed in glibc (Debian):
status: Fix Released → Confirmed
Changed in glibc (Ubuntu):
importance: Medium → High
Changed in glibc (Ubuntu Xenial):
importance: Medium → High
Adam Conrad (adconrad) on 2016-09-30
Changed in glibc (Ubuntu Xenial):
assignee: nobody → Adam Conrad (adconrad)

Hello Anders, or anyone else affected,

Accepted glibc into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/glibc/2.23-0ubuntu4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in glibc (Ubuntu Xenial):
status: Triaged → Fix Committed
tags: added: verification-needed
Adam Conrad (adconrad) wrote :

Verified that the 2.23-0ubuntu4 binaries in xenial-proposed resolve this issue.

tags: added: verification-done
removed: verification-needed
Ken Baker (bakerkj) wrote :

Verified that the 2.23-0ubuntu4 binaries in xenial-proposed resolve this issue for me on both armhf and amd64.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.23-0ubuntu4

---------------
glibc (2.23-0ubuntu4) xenial; urgency=medium

  * debian/rules.d/tarball.mk: Apply --no-renames to make the diff readable.
  * debian/patches/git-updates.diff: Update from release/2.23/master branch:
    - Include fix for potential makecontext() hang on ARMv7 (CVE-2016-6323)
    - Include fix for SEGV in sock_eq with nss_hesiod module (LP: #1571456)
    - Include malloc fixes, addressing multithread deadlocks (LP: #1630302)
    - debian/patches/hurd-i386/cvs-libpthread.so.diff: Dropped, upstreamed.
    - debian/patches/any/submitted-argp-attribute.diff: Dropped, upstreamed.
    - debian/patches/hurd-i386/tg-hurdsig-fixes-2.diff: Rebased to upstream.
  * debian/patches/ubuntu/local-altlocaledir.diff: Updated to latest version
    from Martin that limits scope to LC_MESSAGES, fixing segv (LP: #1577460)
  * debian/patches/any/cvs-cos-precision.diff: Fix cos() bugs (LP: #1614966)
  * debian/testsuite-xfail-debian.mk: Allow nptl/tst-signal6 to fail on ARM.

 -- Adam Conrad <email address hidden> Fri, 14 Oct 2016 00:00:34 -0600

Changed in glibc (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for glibc has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Changed in glibc (Debian):
status: Confirmed → Fix Released
Changed in glibc (Fedora):
importance: Unknown → Undecided
status: Unknown → Fix Released