exec'ing a setuid binary from a threaded program sometimes fails to setuid

Bug #1672819 reported by John Lenton
This bug affects 2 people
Affects              Status        Importance  Assigned to           Milestone
Linux                Confirmed     High
golang-1.6 (Ubuntu)  Invalid       Undecided   Unassigned
  Xenial             Fix Released  High        Michael Hudson-Doyle
  Yakkety            Invalid       Undecided   Unassigned
  Zesty              Invalid       Undecided   Unassigned
linux (Ubuntu)       Fix Released  High        Colin Ian King
  Xenial             Fix Released  High        Colin Ian King
  Yakkety            Fix Released  High        Colin Ian King
  Zesty              Fix Released  High        Colin Ian King

Bug Description

== SRU template for golang-1.6 ==

[Impact]
The kernel bug reported below means that occasionally (maybe 1 in 1000 times) the snapd -> snap-confine exec that is part of running a snap fails to honour the setuid bit on the snap-confine binary, so the execution fails. This is extremely confusing for the user of the snap, who just sees a permission denied error with no explanation.

The kernel bug has been fixed in Xenial+ but not all users of snapd are on xenial+ kernels (they might be on trusty or another distribution entirely).
Backporting this fix means that the snapd in the core snap will pick up the workaround the next time it is built. Because the snapd on trusty or another distro re-execs into the snapd in the core snap before exec'ing snap-confine, users should no longer see the behaviour described above.

[Test case]
This will be a bit tricky as the kernel bug has been fixed. A xenial container on a trusty host/VM should do the trick. The test case from https://gist.github.com/chipaca/806c90d96c437444f27f45a83d00a813 should be sufficient to demonstrate the bug and then, once golang-1.6 has been upgraded from proposed, the fix.

[Regression potential]
If there is a bug in the patch it could cause deadlocks in currently working programs. But the patch is pretty simple and has passed review upstream so I think it should be OK.

== SRU REQUEST XENIAL, YAKKETY, ZESTY ==

Due to two race conditions in check_unsafe_exec(), exec'ing a setuid binary from a threaded program sometimes fails to setuid.

== Fix ==

Sauce patch for Xenial, Yakkety + Zesty:

https://lists.ubuntu.com/archives/kernel-team/2017-May/084102.html

This fix re-runs the unsafe check if the expected fs count and the count actually found disagree in the racy window during thread exec or exit. The re-check occurs very infrequently and avoids a lot of additional locking on per-thread structures, which would make fork/exec/exit prohibitively expensive.

== Test case ==

See the example C code in the patch, https://lists.ubuntu.com/archives/kernel-team/2017-May/084102.html

Run the test code as follows: for i in $(seq 1000); do ./a; done

With the patch, no messages are emitted; without the patch, one sees a message:

"Failed, got euid 1000 (expecting 0)"

which shows that the setuid program failed the check in check_unsafe_exec() because of the race.

== Regression potential ==

The change risks breaking existing safe exec semantics.

====================

This can be reproduced with
https://gist.github.com/chipaca/806c90d96c437444f27f45a83d00a813

With that, and go 1.8, if you run “make” and then

for i in `seq 99`; do ./a_go; done

you'll see a variable number of “GOT 1000” (or whatever your user id is). If you don't, add one or two more 9s on there.

That's a simple go reproducer. You can also use “a_p” instead of “a_go” to see one that only uses pthreads. “a_c” is a C version that does *not* reproduce the issue.

But it's not pthreads: if in a_go.go you comment out the “import "C"”, you'll still see the “GOT 1000” messages, in a static binary that uses no pthreads, just clone(2). You'll also see a bunch of warnings because it's not properly handling an EAGAIN from clone, but that's unrelated.

If you pin the process to a single CPU using taskset, a_go no longer shows the issue, but a_p still does. In some virtualized environments we haven't been able to reproduce the issue either (e.g. some AWS instances), but KVM works (you need -smp to see the issue from a_go).

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-64-generic 4.4.0-64.85
ProcVersionSignature: Ubuntu 4.4.0-64.85-generic 4.4.44
Uname: Linux 4.4.0-64-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/pcmC0D0p: john 2354 F...m pulseaudio
 /dev/snd/controlC0: john 2354 F.... pulseaudio
CurrentDesktop: Unity
Date: Tue Mar 14 17:17:23 2017
HibernationDevice: RESUME=UUID=b9fd155b-dcbe-4337-ae77-6daa6569beaf
InstallationDate: Installed on 2014-04-27 (1051 days ago)
InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Release amd64 (20140417)
MachineType: Dell Inc. Latitude E6510
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-64-generic root=/dev/mapper/ubuntu--vg-root ro enable_mtrr_cleanup mtrr_spare_reg_nr=8 mtrr_gran_size=32M mtrr_chunk_size=32M quiet splash
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-64-generic N/A
 linux-backports-modules-4.4.0-64-generic N/A
 linux-firmware 1.157.8
SourcePackage: linux
SystemImageInfo: Error: command ['system-image-cli', '-i'] failed with exit code 2:
UpgradeStatus: Upgraded to xenial on 2015-06-18 (634 days ago)
dmi.bios.date: 12/05/2013
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A16
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA16:bd12/05/2013:svnDellInc.:pnLatitudeE6510:pvr0001:rvnDellInc.:rn:rvr:cvnDellInc.:ct9:cvr:
dmi.product.name: Latitude E6510
dmi.product.version: 0001
dmi.sys.vendor: Dell Inc.

John Lenton (chipaca) wrote :
John Lenton (chipaca) wrote :

I also tried this in 4.10.0-11-generic, same results.

Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
status: New → Triaged
importance: Undecided → High
tags: added: kernel-key
Kamal Mostafa (kamalmostafa) wrote :

I can reproduce this with the simple pthreads-only reproducer (loop of ./a_p running setuid binary ./b) running 4.4.0-57-generic on bare metal.

$ for i in `seq 10`; do ./a_p; done
GOT 1000
GOT 1000

$ for i in `seq 1000`; do ./a_p; done | wc -l
117

Kamal Mostafa (kamalmostafa) wrote :

An AWS instance (t2.xlarge with 4 vCPUs) running 4.4.0-1001-aws reproduces the problem:

$ for i in `seq 10000`; do ./a_p; done | wc -l
124

Michael Hudson-Doyle (mwhudson) wrote :

I had a bit of a stare at the kernel source and suspected that the downgrade of uid is happening here: https://github.com/torvalds/linux/blob/v4.4/security/commoncap.c#L547-L548

I added a "WARN(1, "downgrading in subprocess %d %d\n", bprm->unsafe, (int)capable(CAP_SETUID))" which revealed that bprm->unsafe is 1 aka LSM_UNSAFE_SHARE.

The only place (I can find) that bprm->unsafe is set to LSM_UNSAFE_SHARE is this check in check_unsafe_exec here (from https://github.com/torvalds/linux/blob/v4.4/fs/exec.c#L1281):

 t = p;
 n_fs = 1;
 spin_lock(&p->fs->lock);
 rcu_read_lock();
 while_each_thread(p, t) {
  if (t->fs == p->fs)
   n_fs++;
 }
 rcu_read_unlock();

 if (p->fs->users > n_fs)
  bprm->unsafe |= LSM_UNSAFE_SHARE;
 else
  p->fs->in_exec = 1;
 spin_unlock(&p->fs->lock);

So I think (and here it gets a bit sketchy) we're racing with copy_process in kernel/fork.c: that calls copy_fs (which is what increments p->fs->users) some way before it does the stuff necessary to make the new thread be included in the while_each_thread(p, t) loop. So n_fs is too low, the check triggers and the setuid bits get ignored.

No idea at all how to fix this of course.

tags: added: kernel-da-key
removed: kernel-key
tags: added: kernel-key
removed: kernel-da-key
Changed in linux (Ubuntu Xenial):
assignee: nobody → Colin Ian King (colin-king)
status: Triaged → In Progress
Colin Ian King (colin-king) wrote :

The following seems to fix it, but I need to exercise this a bit more to be 100% certain it is rock solid:

diff --git a/fs/fs_struct.c b/fs/fs_struct.c
index 7dca743..cd7175e2 100644
--- a/fs/fs_struct.c
+++ b/fs/fs_struct.c
@@ -98,8 +98,10 @@ void exit_fs(struct task_struct *tsk)
                int kill;
                task_lock(tsk);
                spin_lock(&fs->lock);
+               rcu_read_lock();
                tsk->fs = NULL;
                kill = !--fs->users;
+               rcu_read_unlock();
                spin_unlock(&fs->lock);
                task_unlock(tsk);
                if (kill)

Colin Ian King (colin-king) wrote :

Nope, that fails too.

Colin Ian King (colin-king) wrote :

So the thread fs has been torn down and so t->fs is null, which then triggers the miscounting of n_fs; so I'm speculating we may need to try:

 while_each_thread(p, t) {
  if (t->fs == p->fs || !t->fs)
   n_fs++;
 }

John Lenton (chipaca) wrote :
Colin Ian King (colin-king) wrote :

With the change mentioned in comment #8 I now cannot reproduce the issue.

Zygmunt Krynicki (zyga) wrote :

This also happens on Fedora 25 running 4.10.8-200.fc25.x86_64

In , colin.king (colin.king-linux-kernel-bugs) wrote :

There is a race condition on the fs->users counter and fs->in_exec flag that impacts exec() on a suid program when there are multiple threads being created and destroyed by a parent process. This in_exec flag was introduced way back in 2009 with commit:

commit 498052bba55ecaff58db6a1436b0e25bfd75a7ff
Author: Al Viro <email address hidden>
Date: Mon Mar 30 07:20:30 2009 -0400

When performing an exec() on a suid program, check_unsafe_exec() checks whether p->fs->users is greater than the count of child threads that share the same fs with the parent.

However, there are two race conditions that this seems to fail on:

1. The parent may be creating a new thread, and copy_fs in kernel/fork.c bumps the fs->users count before the thread is attached to the parent. This makes the check p->fs->users > n_fs in check_unsafe_exec() true, so the exec of the suid program by the parent is marked as unsafe and it runs as a non-suid executable. The crux of the matter is that the fs->users count is temporarily ahead by 1 of the number of threads linked to the process, and we don't have any locking mechanism to protect us from this in the exec phase.

2. A thread may be dying, and its fs pointer is nullified before the thread is removed from the parent; this also causes issues in the check_unsafe_exec() fs sanity check. I believe this can be fixed by checking for t->fs being NULL, e.g.:

diff --git a/fs/exec.c b/fs/exec.c
index 72934df68471..ebfd9b76b69f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1447,7 +1447,7 @@ static void check_unsafe_exec(struct linux_binprm *bprm)
        spin_lock(&p->fs->lock);
        rcu_read_lock();
        while_each_thread(p, t) {
-               if (t->fs == p->fs)
+               if (t->fs == p->fs || !t->fs)
                        n_fs++;
        }
        rcu_read_unlock();

The reproducer is quite simple and triggers the problem reliably:

$ cat Makefile
ALL=a b
all: $(ALL)

a: LDFLAGS=-pthread

b: b.c
 $(CC) b.c -o b
 sudo chown root:root b
 sudo chmod u+s b

test:
 for I in $$(seq 1000); do echo $$I; ./a ; done

clean:
 rm -vf $(ALL)

$ cat a.c
#include <unistd.h>
#include <stdio.h>
#include <pthread.h>
#include <time.h>

void *nothing(void *p)
{
 return NULL;
}

void *target(void *p) {
 for (;;) {
  pthread_t t;
  if (pthread_create(&t, NULL, nothing, NULL) == 0)
   pthread_join(t, NULL);
  }
 return NULL;
}

int main(void)
{
 struct timespec tv;
 int i;

 for (i = 0; i < 10; i++) {
  pthread_t t;
  pthread_create(&t, NULL, target, NULL);
 }
 tv.tv_sec = 0;
 tv.tv_nsec = 100000;
 nanosleep(&tv, NULL);
 if (execl("./b", "./b", NULL) < 0)
  perror("execl");
 return 0;
}

$ cat b.c
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>

int main(void)
{
 const uid_t euid = geteuid();
 if (euid != 0) {
  printf("Failed, got euid %d (expecting 0)\n", euid);
  return 1;
 }
 return 0;
}

To reproduce:

make
make test

You will see "Failed, got euid 1000 (expecting 0)" errors whenever the suid program being exec'd fails to exec as a suid process because of the race and failure in check_unsafe_exec().

Colin Ian King (colin-king) wrote :

This bug has been around since at least 2009.

Kernel Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195453

In , dev.jongy (dev.jongy-linux-kernel-bugs) wrote :

I don't think the second condition you gave is relevant, because once the task_struct->fs pointer is nullified, the thread is no longer accounted for in fs->users. Since the thread counts toward neither fs->users nor n_fs, it is okay.

The first condition is indeed a problem. I'm not sure of the wanted fix (probably some kind of locking in copy_process, that while_each_thread could use too), but currently this race is bad.

In , colin.king (colin.king-linux-kernel-bugs) wrote :

Thanks for looking at this. The locking on copy_process concerns me as I didn't want to introduce a locking fix that caused a serious performance regression on the copy and exec paths.

tags: added: kernel-da-key
removed: kernel-key
In , colin.king (colin.king-linux-kernel-bugs) wrote :

Any updates on this?

Colin Ian King (colin-king) wrote :

exec'ing from a thread is an interesting problem; the semantics of exec should be to terminal all the threads before the exec occurs according to http://maxim.int.ru/bookshelf/PthreadsProgram/htm/r_44.html

The normal idiom would be to do:
  fork()
      child exec's
      parent waits for child

I'm not sure in your case if you desire all the threads to terminate after the exec, so the wait() may in fact be replaced by pthread termination calls on all the threads for your implementation.

Unfortunately there is an issue with fork'ing in a thread; any mutex held by another thread at the moment of fork becomes locked forever since we have one mutex locked by the parent and one by the child. Normally one would therefore use pthread_atfork() to help work around this issue, see https://stackoverflow.com/questions/14407544/mixing-threads-fork-and-mutexes-what-should-i-watch-out-for

Colin Ian King (colin-king) wrote :

"to terminal all the threads" should read "to terminate all the threads"

Michael Hudson-Doyle (mwhudson) wrote : Re: [Bug 1672819] Re: exec'ing a setuid binary from a threaded program sometimes fails to setuid

On 8 May 2017 at 10:32, Colin Ian King <email address hidden> wrote:

> exec'ing from a thread is an interesting problem; the semantics of exec
> should be to terminal all the threads before the exec occurs according
> to http://maxim.int.ru/bookshelf/PthreadsProgram/htm/r_44.html
>
> The normal idiom would be to do:
> fork()
> child exec's
> parent waits for child
>
> I'm not sure in your case if you desire all the threads to terminate
> after the exec, so the wait() may in fact be replaced by pthread
> termination calls on all the threads for your implementation.
>

The original bug report was about a process calling execve directly, not a
fork/exec situation. So yes, the expectation is that all threads are gone.

> Unfortunately there is an issue with fork'ing in a thread; any mutex
> held by another thread at the moment of fork becomes locked forever
> since we have one mutex locked by the parent and one by the child.
> Normally one would therefore use pthread_atfork() to help work around
> this issue, see
> https://stackoverflow.com/questions/14407544/mixing-threads-fork-and-mutexes-what-should-i-watch-out-for

Go doesn't support forking (except for some very careful code that calls
exec in the child), for exactly this sort of reason.

Colin Ian King (colin-king) wrote :

I think I've found the simplest solution that avoids costly locking overhead and seems to work in my tests. I've uploaded the debs for Xenial in:

http://kernel.ubuntu.com/~cking/lp-1672819/

Would you mind testing these to see whether they help?

Changed in linux (Ubuntu Xenial):
status: In Progress → Incomplete
John Lenton (chipaca) wrote :

With the kernel from #16 I am no longer able to reproduce the issue, neither with the simplified reproducers described in this bug nor with the original (slower and more convoluted) snapd reproducer.

In , colin.king (colin.king-linux-kernel-bugs) wrote :

Created attachment 256351
this fix solves the issue without any overhead of extra per thread locking and a simple lightweight retry

This has been extensively tested on a multi-proc Xeon server with the reproducers and fixes the issue. I've checked this out with a high contention of pthreads and the retry loop occurs very rarely, so the overhead of the retry is very small indeed.

John Lenton (chipaca)
Changed in linux (Ubuntu Xenial):
status: Incomplete → In Progress
description: updated
Seth Forshee (sforshee)
Changed in linux (Ubuntu):
status: Triaged → Fix Committed
Changed in linux (Ubuntu Zesty):
status: New → Fix Committed
Changed in linux (Ubuntu Yakkety):
status: New → Fix Committed
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Yakkety):
assignee: nobody → Colin Ian King (colin-king)
Changed in linux (Ubuntu Zesty):
assignee: nobody → Colin Ian King (colin-king)
importance: Undecided → High
Changed in linux (Ubuntu Yakkety):
importance: Undecided → High
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
tags: added: verification-needed-yakkety
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'. If the problem still exists, change the tag 'verification-needed-yakkety' to 'verification-failed-yakkety'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-zesty
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-zesty' to 'verification-done-zesty'. If the problem still exists, change the tag 'verification-needed-zesty' to 'verification-failed-zesty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

Changed in linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
Colin Ian King (colin-king) wrote :

tested on xenial, 4.4.0-80-generic #101-Ubuntu, passed the test.

tags: added: verification-done-xenial
removed: verification-needed-xenial
Colin Ian King (colin-king) wrote :

tested on yakkety, 4.8.0-55-generic #58-Ubuntu, passed the test.

tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Colin Ian King (colin-king) wrote :

tested on zesty, 4.10.0-23-generic #25-Ubuntu, passed the test.

tags: added: verification-done-zesty
removed: verification-needed-zesty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-58.63

---------------
linux (4.8.0-58.63) yakkety; urgency=low

  * linux: 4.8.0-58.63 -proposed tracker (LP: #1700533)

  * CVE-2017-1000364
    - Revert "UBUNTU: SAUCE: mm: Only expand stack if guard area is hit"
    - Revert "mm: do not collapse stack gap into THP"
    - Revert "mm: enlarge stack guard gap"
    - mm: vma_adjust: remove superfluous confusing update in remove_next == 1 case
    - mm: larger stack guard gap, between vmas
    - mm: fix new crash in unmapped_area_topdown()
    - Allow stack to grow up to address space limit

linux (4.8.0-57.62) yakkety; urgency=low

  * linux: 4.8.0-57.62 -proposed tracker (LP: #1699035)

  * CVE-2017-1000364
    - SAUCE: mm: Only expand stack if guard area is hit

  * CVE-2017-7374
    - fscrypt: remove broken support for detecting keyring key revocation

  * CVE-2017-1000363
    - char: lp: fix possible integer overflow in lp_setup()

  * CVE-2017-9242
    - ipv6: fix out of bound writes in __ip6_append_data()

  * CVE-2017-9075
    - sctp: do not inherit ipv6_{mc|ac|fl}_list from parent

  * CVE-2017-9074
    - ipv6: Prevent overrun when parsing v6 header options

  * CVE-2017-9076
    - ipv6/dccp: do not inherit ipv6_mc_list from parent

  * CVE-2017-9077
    - ipv6/dccp: do not inherit ipv6_mc_list from parent

  * CVE-2017-8890
    - dccp/tcp: do not inherit mc_list from parent

  * extend-diff-ignore should use exact matches (LP: #1693504)
    - [Packaging] exact extend-diff-ignore matches

  * APST quirk needed for Intel NVMe (LP: #1686592)
    - nvme: Quirk APST on Intel 600P/P3100 devices

  * regression: the 4.8 hwe kernel does not create the
    /sys/block/*/device/enclosure_device:* symlinks (LP: #1691899)
    - scsi: ses: Fix SAS device detection in enclosure

  * datapath: Add missing case OVS_TUNNEL_KEY_ATTR_PAD (LP: #1676679)
    - openvswitch: Add missing case OVS_TUNNEL_KEY_ATTR_PAD

  * connection flood to port 445 on mounting cifs volume under kernel
    (LP: #1686099)
    - cifs: Do not send echoes before Negotiate is complete

  * Support IPMI system interface on Cavium ThunderX (LP: #1688132)
    - i2c: octeon: Rename driver to prepare for split
    - i2c: octeon: Split the driver into two parts
    - [Config] CONFIG_I2C_THUNDERX=m
    - i2c: thunderx: Add i2c driver for ThunderX SOC
    - i2c: thunderx: Add SMBUS alert support
    - i2c: octeon,thunderx: Move register offsets to struct
    - i2c: octeon: Sort include files alphabetically
    - i2c: octeon: Use booleon values for booleon variables
    - i2c: octeon: thunderx: Add MAINTAINERS entry
    - i2c: octeon: Fix set SCL recovery function
    - i2c: octeon: Avoid sending STOP during recovery
    - i2c: octeon: Fix high-level controller status check
    - i2c: octeon: thunderx: TWSI software reset in recovery
    - i2c: octeon: thunderx: Remove double-check after interrupt
    - i2c: octeon: thunderx: Limit register access retries
    - i2c: thunderx: Enable HWMON class probing

  * CVE-2017-5577
    - drm/vc4: Return -EINVAL on the overflow checks failing.

  * Merlin SGMII fail on Ubuntu Xenial HWE kernel (LP: #1686305)
    - net: phy: marvell: fix Marvell 88E1512 u...


Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-83.106

---------------
linux (4.4.0-83.106) xenial; urgency=low

  * linux: 4.4.0-83.106 -proposed tracker (LP: #1700541)

  * CVE-2017-1000364
    - Revert "UBUNTU: SAUCE: mm: Only expand stack if guard area is hit"
    - Revert "mm: do not collapse stack gap into THP"
    - Revert "mm: enlarge stack guard gap"
    - mm: vma_adjust: remove superfluous confusing update in remove_next == 1 case
    - mm: larger stack guard gap, between vmas
    - mm: fix new crash in unmapped_area_topdown()
    - Allow stack to grow up to address space limit

linux (4.4.0-82.105) xenial; urgency=low

  * linux: 4.4.0-82.105 -proposed tracker (LP: #1699064)

  * CVE-2017-1000364
    - SAUCE: mm: Only expand stack if guard area is hit

  * linux-aws/linux-gke incorrectly producing and using linux-*-tools-
    common/linux-*-cloud-tools-common (LP: #1688579)
    - [Config] make linux-tools-common and linux-cloud-tools-common protection
      consistent

  * CVE-2017-9242
    - ipv6: fix out of bound writes in __ip6_append_data()

  * CVE-2017-9075
    - sctp: do not inherit ipv6_{mc|ac|fl}_list from parent

  * CVE-2017-9074
    - ipv6: Prevent overrun when parsing v6 header options

  * CVE-2017-9076
    - ipv6/dccp: do not inherit ipv6_mc_list from parent

  * CVE-2017-9077
    - ipv6/dccp: do not inherit ipv6_mc_list from parent

  * CVE-2017-8890
    - dccp/tcp: do not inherit mc_list from parent

  * Module signing exclusion for staging drivers does not work properly
    (LP: #1690908)
    - SAUCE: Fix module signing exclusion in package builds

  * extend-diff-ignore should use exact matches (LP: #1693504)
    - [Packaging] exact extend-diff-ignore matches

  * Dell XPS 9360 wifi 5G performance is poor (LP: #1692836)
    - SAUCE: ath10k: fix the wifi speed issue for kill 1535

  * Upgrade Redpine WLAN/BT driver to ver. 1.2.RC12 (LP: #1694607)
    - SAUCE: Redpine: Upgrade to ver. 1.2.RC12

  * [DP MST] No audio output through HDMI/DP/mDP ports in Dell WD15 and TB15
    docking stations (LP: #1694665)
    - drm/i915: Store port enum in intel_encoder
    - drm/i915: Eliminate redundant local variable definition
    - drm/i915: Switch to using port stored in intel_encoder
    - drm/i915: Move audio_connector to intel_encoder
    - drm/i915/dp: DP audio API changes for MST
    - drm/i915: abstract ddi being audio enabled
    - drm/i915/audio: extend get_saved_enc() to support more scenarios
    - drm/i915: enable dp mst audio

  * Xenial update to 4.4.70 stable release (LP: #1694621)
    - usb: misc: legousbtower: Fix buffers on stack
    - usb: misc: legousbtower: Fix memory leak
    - USB: ene_usb6250: fix DMA to the stack
    - watchdog: pcwd_usb: fix NULL-deref at probe
    - char: lp: fix possible integer overflow in lp_setup()
    - USB: core: replace %p with %pK
    - ARM: tegra: paz00: Mark panel regulator as enabled on boot
    - tpm_crb: check for bad response size
    - infiniband: call ipv6 route lookup via the stub interface
    - dm btree: fix for dm_btree_find_lowest_key()
    - dm raid: select the Kconfig option CONFIG_MD_RAID0
    - dm bufio: avoid a possible ABBA deadlock
    - dm bufio: check ...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.10.0-26.30

---------------
linux (4.10.0-26.30) zesty; urgency=low

  * linux: 4.10.0-26.30 -proposed tracker (LP: #1700528)

  * CVE-2017-1000364
    - Revert "UBUNTU: SAUCE: mm: Only expand stack if guard area is hit"
    - Revert "mm: do not collapse stack gap into THP"
    - Revert "mm: enlarge stack guard gap"
    - mm: larger stack guard gap, between vmas
    - mm: fix new crash in unmapped_area_topdown()
    - Allow stack to grow up to address space limit

linux (4.10.0-25.29) zesty; urgency=low

  * linux: 4.10.0-25.29 -proposed tracker (LP: #1699028)

  * CVE-2017-1000364
    - SAUCE: mm: Only expand stack if guard area is hit

  * CVE-2017-9074
    - ipv6: Prevent overrun when parsing v6 header options
    - ipv6: Check ip6_find_1stfragopt() return value properly.

  * [Zesty] QDF2400 ARM64 server - NMI watchdog: BUG: soft lockup - CPU#8 stuck
    for 22s! (LP: #1680549)
    - iommu/dma: Stop getting dma_32bit_pfn wrong
    - iommu/dma: Implement PCI allocation optimisation
    - iommu/dma: Convert to address-based allocation
    - iommu/dma: Clean up MSI IOVA allocation
    - iommu/dma: Plumb in the per-CPU IOVA caches
    - iommu/iova: Fix underflow bug in __alloc_and_insert_iova_range

  * Zesty update to 4.10.17 stable release (LP: #1692898)
    - xen: adjust early dom0 p2m handling to xen hypervisor behavior
    - target: Fix compare_and_write_callback handling for non GOOD status
    - target/fileio: Fix zero-length READ and WRITE handling
    - iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement
    - usb: xhci: bInterval quirk for TI TUSB73x0
    - usb: host: xhci: print correct command ring address
    - USB: serial: ftdi_sio: add device ID for Microsemi/Arrow SF2PLUS Dev Kit
    - USB: Proper handling of Race Condition when two USB class drivers try to
      call init_usb_class simultaneously
    - USB: Revert "cdc-wdm: fix "out-of-sync" due to missing notifications"
    - staging: vt6656: use off stack for in buffer USB transfers.
    - staging: vt6656: use off stack for out buffer USB transfers.
    - staging: gdm724x: gdm_mux: fix use-after-free on module unload
    - staging: wilc1000: Fix problem with wrong vif index
    - staging: comedi: jr3_pci: fix possible null pointer dereference
    - staging: comedi: jr3_pci: cope with jiffies wraparound
    - usb: misc: add missing continue in switch
    - usb: gadget: legacy gadgets are optional
    - usb: Make sure usb/phy/of gets built-in
    - usb: hub: Fix error loop seen after hub communication errors
    - usb: hub: Do not attempt to autosuspend disconnected devices
    - x86/boot: Fix BSS corruption/overwrite bug in early x86 kernel startup
    - selftests/x86/ldt_gdt_32: Work around a glibc sigaction() bug
    - x86, pmem: Fix cache flushing for iovec write < 8 bytes
    - um: Fix PTRACE_POKEUSER on x86_64
    - perf/x86: Fix Broadwell-EP DRAM RAPL events
    - KVM: x86: fix user triggerable warning in kvm_apic_accept_events()
    - KVM: arm/arm64: fix races in kvm_psci_vcpu_on
    - arm64: KVM: Fix decoding of Rt/Rt2 when trapping AArch32 CP accesses
    - block: fix blk_integrity_register to use templ...

Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Changed in golang-1.6 (Ubuntu):
status: New → Invalid
Changed in golang-1.6 (Ubuntu Yakkety):
status: New → Invalid
Changed in golang-1.6 (Ubuntu Zesty):
status: New → Invalid
Changed in golang-1.6 (Ubuntu Xenial):
assignee: nobody → Michael Hudson-Doyle (mwhudson)
importance: Undecided → High
description: updated
Changed in golang-1.6 (Ubuntu Xenial):
status: New → In Progress
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello John, or anyone else affected,

Accepted golang-1.6 into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/golang-1.6/1.6.2-0ubuntu5~16.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in golang-1.6 (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-xenial
removed: verification-done-xenial
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

I've verified the fix in the way I suspected I'd have to, with one extra wrinkle.

1) In a trusty VM, I verified that the C test case from the gist failed. (It did).
2) I launched a xenial lxd container on the VM and built the Go test case with version 1.6.2-0ubuntu5~16.04.2 of golang-1.6-go.
3) It did not fail in the lxd container, for reasons I couldn't understand, but it did fail when copied out of the container onto the trusty VM (563 failures out of 100k runs).
4) I then installed golang-1.6-go version 1.6.2-0ubuntu5~16.04.3 in the container and rebuilt the Go test case with the new compiler.
5) This did not fail when run directly on the trusty VM (0 failures out of 100k runs)

So I'm confident the fix has helped.
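
The "N failures out of 100k runs" figures above come from running a test binary many times and counting the runs that went wrong. A minimal sketch of just that counting step is below. It is not the test case from the gist: `count_failures` is a hypothetical helper, it runs from a single-threaded parent, and it assumes the command under test exits non-zero when it detects a problem (for example, a setuid helper that checks its effective UID). Reproducing the race itself additionally needs a multithreaded parent and a setuid target.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] `runs` times and return how many runs
 * did not exit with status 0. Returns -1 if fork() or waitpid() fails.
 */
static int count_failures(char *const argv[], int runs)
{
    int failures = 0;

    assert(argv != NULL && argv[0] != NULL && runs >= 0);

    for (int i = 0; i < runs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Child: replace ourselves with the command under test. */
            execvp(argv[0], argv);
            _exit(127); /* only reached if the exec itself failed */
        }

        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        /* Count abnormal termination and non-zero exit codes alike. */
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failures++;
    }
    return failures;
}
```

Called with an argv for the test binary and a run count of 100000, the return value corresponds to the failure counts quoted in steps 3 and 5, assuming the binary signals failure through its exit status.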

tags: added: verification-done-xenial
removed: verification-needed verification-needed-xenial
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for golang-1.6 has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package golang-1.6 - 1.6.2-0ubuntu5~16.04.3

---------------
golang-1.6 (1.6.2-0ubuntu5~16.04.3) xenial; urgency=medium

  * Backport workaround for execve issue that causes the setuid bit to be
    ignored when losing a race in the kernel. (LP: #1672819)

 -- Michael Hudson-Doyle <email address hidden> Mon, 03 Jul 2017 11:53:56 +1200

Changed in golang-1.6 (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
Vasily Averin (vvs.at.openvz.org) wrote :

Guys,
your commit d6572202d986 ("UBUNTU:SAUCE: exec: ensure file system accounting in check_unsafe_exec is correct") looks wrong to me:
it can lead to an endless loop in check_unsafe_exec().

fs/exec.c:: check_unsafe_exec()
...
recheck:
        fs_recheck = false;
        t = p;
        n_fs = 1;
        spin_lock(&p->fs->lock);
        rcu_read_lock();
        while_each_thread(p, t) {
                if (t->fs == p->fs)
                        n_fs++;
                if (t->flags & (PF_EXITING | PF_FORKNOEXEC))
                        fs_recheck = true;
        }
        rcu_read_unlock();

        if (p->fs->users > n_fs) {
                if (fs_recheck) {
                        spin_unlock(&p->fs->lock);
                        goto recheck; <<<<<< cycles forever
                }
                bprm->unsafe |= LSM_UNSAFE_SHARE;
        } else
                p->fs->in_exec = 1;
        spin_unlock(&p->fs->lock);

We have seen a few soft lockups inside VMs running Ubuntu 16.04, where a process was stuck cycling in this loop.
Should I file a separate bug for this problem?
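
To make the control flow in the excerpt above easier to follow, here is a small user-space model of the recheck loop. It is only a sketch under stated assumptions: it is not kernel code, the `MODEL_*` flag values and all `model_*` names are hypothetical stand-ins, locking is ignored, and it assumes the thread flags and the fs user count do not change between passes. A `max_passes` cap replaces the unbounded `goto recheck` so that the stuck case can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's PF_EXITING / PF_FORKNOEXEC bits. */
#define MODEL_PF_EXITING    0x1u
#define MODEL_PF_FORKNOEXEC 0x2u

/* One sibling thread of the exec'ing task (the task itself is not listed). */
struct model_thread {
    bool shares_fs;      /* models t->fs == p->fs */
    unsigned int flags;  /* models t->flags */
};

/*
 * Models the quoted loop. Returns the pass on which a decision was reached,
 * or -1 if the loop was still retrying after max_passes. On a decision,
 * *unsafe_share tells whether LSM_UNSAFE_SHARE would have been set.
 */
static int model_check_unsafe_exec(int fs_users,
                                   const struct model_thread *others,
                                   size_t n_others, int max_passes,
                                   bool *unsafe_share)
{
    for (int pass = 1; pass <= max_passes; pass++) {
        bool fs_recheck = false;
        int n_fs = 1; /* the exec'ing thread itself */

        for (size_t i = 0; i < n_others; i++) {
            if (others[i].shares_fs)
                n_fs++;
            if (others[i].flags & (MODEL_PF_EXITING | MODEL_PF_FORKNOEXEC))
                fs_recheck = true;
        }

        if (fs_users > n_fs) {
            if (fs_recheck)
                continue; /* models "goto recheck" */
            *unsafe_share = true;
            return pass;
        }
        *unsafe_share = false; /* models p->fs->in_exec = 1 */
        return pass;
    }
    return -1; /* never reached a decision within the cap */
}
```

In this model, an input where the fs user count exceeds the counted threads and at least one sibling keeps a flagged state never leaves the loop; with no cap, that corresponds to the endless cycle described in the comment.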

Revision history for this message
In , benh (benh-linux-kernel-bugs) wrote :

Did you ever submit the fix to the mailing list?

Changed in linux:
importance: Unknown → High
status: Unknown → Confirmed
Revision history for this message
In , shaoyi (shaoyi-linux-kernel-bugs) wrote :

Hi Colin,

May I ask for the latest update on your fix? This race condition can still be reproduced on the current Linux mainline, v5.19.0-rc6. I found that Ubuntu picked this patch up (https://lists.ubuntu.com/archives/kernel-team/2017-May/084102.html), but a soft lockup issue with it was also reported on kernel 4.4 (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1876856). So I'm wondering whether you have an updated version of the patch and whether you plan to submit it upstream. I would really appreciate a reply if you're still tracking this issue!
