exec'ing a setuid binary from a threaded program sometimes fails to setuid

Bug #1672819 reported by John Lenton on 2017-03-14
This bug affects 2 people
Affects: linux (Ubuntu)
Assigned to: Colin Ian King

Bug Description

This can be reproduced with

With that, and go 1.8, if you run “make” and then

for i in `seq 99`; do ./a_go; done

you'll see a variable number of “GOT 1000” lines (or whatever your user ID is). If you don't, add one or two more 9s to the seq argument.

That's a simple go reproducer. You can also use “a_p” instead of “a_go” to see one that only uses pthreads. “a_c” is a C version that does *not* reproduce the issue.

But it's not pthreads: if in a_go.go you comment out the “import "C"”, you'll still see the “GOT 1000” messages, in a static binary that uses no pthreads, just clone(2). You'll also see a bunch of warnings because it's not properly handling an EAGAIN from clone, but that's unrelated.

If you pin the process to a single CPU using taskset, you don't get the issue from a_go; a_p continues to reproduce the issue. In some virtualized environments we haven't been able to reproduce the issue either (e.g. some AWS instances), but it does reproduce under KVM (you need -smp to see the issue from a_go).

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-64-generic 4.4.0-64.85
ProcVersionSignature: Ubuntu 4.4.0-64.85-generic 4.4.44
Uname: Linux 4.4.0-64-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: amd64
 /dev/snd/pcmC0D0p: john 2354 F...m pulseaudio
 /dev/snd/controlC0: john 2354 F.... pulseaudio
CurrentDesktop: Unity
Date: Tue Mar 14 17:17:23 2017
HibernationDevice: RESUME=UUID=b9fd155b-dcbe-4337-ae77-6daa6569beaf
InstallationDate: Installed on 2014-04-27 (1051 days ago)
InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Release amd64 (20140417)
MachineType: Dell Inc. Latitude E6510
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-64-generic root=/dev/mapper/ubuntu--vg-root ro enable_mtrr_cleanup mtrr_spare_reg_nr=8 mtrr_gran_size=32M mtrr_chunk_size=32M quiet splash
 linux-restricted-modules-4.4.0-64-generic N/A
 linux-backports-modules-4.4.0-64-generic N/A
 linux-firmware 1.157.8
SourcePackage: linux
SystemImageInfo: Error: command ['system-image-cli', '-i'] failed with exit code 2:
UpgradeStatus: Upgraded to xenial on 2015-06-18 (634 days ago)
dmi.bios.date: 12/05/2013
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A16
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA16:bd12/05/2013:svnDellInc.:pnLatitudeE6510:pvr0001:rvnDellInc.:rn:rvr:cvnDellInc.:ct9:cvr:
dmi.product.name: Latitude E6510
dmi.product.version: 0001
dmi.sys.vendor: Dell Inc.

John Lenton (chipaca) wrote :

I also tried this in 4.10.0-11-generic, same results.

Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
status: New → Triaged
importance: Undecided → High
tags: added: kernel-key
Kamal Mostafa (kamalmostafa) wrote :

I can reproduce this with the simple pthreads-only reproducer (loop of ./a_p running setuid binary ./b) running 4.4.0-57-generic on bare metal.

$ for i in `seq 10`; do ./a_p; done
GOT 1000
GOT 1000

$ for i in `seq 1000`; do ./a_p; done | wc -l

Kamal Mostafa (kamalmostafa) wrote :

An AWS instance (t2.xlarge with 4 vCPUs) running 4.4.0-1001-aws reproduces the problem:

$ for i in `seq 10000`; do ./a_p; done | wc -l

Michael Hudson-Doyle (mwhudson) wrote :

I had a bit of a stare at the kernel source and suspected that the downgrade of uid is happening here: https://github.com/torvalds/linux/blob/v4.4/security/commoncap.c#L547-L548

I added a "WARN(1, "downgrading in subprocess %d %d\n", bprm->unsafe, (int)capable(CAP_SETUID))" which revealed that bprm->unsafe is 1 aka LSM_UNSAFE_SHARE.
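
To make the effect of that flag concrete, here is a small userspace model of the downgrade decision as I read it. It is a sketch, not kernel code: model_exec_euid is a hypothetical name, and the flag values other than LSM_UNSAFE_SHARE == 1 are written from memory of the v4.4 headers, so treat them as assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define LSM_UNSAFE_SHARE        1u  /* matches the WARN output above */
#define LSM_UNSAFE_PTRACE       2u  /* assumption, from memory of v4.4 */
#define LSM_UNSAFE_PTRACE_CAP   4u  /* assumption, from memory of v4.4 */
#define LSM_UNSAFE_NO_NEW_PRIVS 8u  /* assumption, from memory of v4.4 */

/*
 * Hypothetical model: returns the effective uid a setuid binary ends up
 * with after exec, given the caller's uid, the file owner, the "unsafe"
 * flags and whether the caller has CAP_SETUID.
 */
static unsigned model_exec_euid(unsigned caller_uid, unsigned file_owner,
                                unsigned unsafe, bool has_cap_setuid)
{
    unsigned euid = file_owner;     /* setuid bit: start from the owner */

    assert((unsafe & ~0xfu) == 0);  /* only the four flags above are modelled */

    /* Any unsafe flag other than PTRACE_CAP can trigger the downgrade. */
    if ((unsafe & ~LSM_UNSAFE_PTRACE_CAP) &&
        (!has_cap_setuid || (unsafe & LSM_UNSAFE_NO_NEW_PRIVS)))
        euid = caller_uid;          /* setuid is silently dropped */

    return euid;
}
```

With unsafe set to LSM_UNSAFE_SHARE and no CAP_SETUID, model_exec_euid(1000, 0, LSM_UNSAFE_SHARE, false) returns the caller's uid, which is the same shape as the “GOT 1000” output from the reproducer.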

The only place (I can find) that bprm->unsafe is set to LSM_UNSAFE_SHARE is this check in check_unsafe_exec here (from https://github.com/torvalds/linux/blob/v4.4/fs/exec.c#L1281):

 t = p;
 n_fs = 1;
 while_each_thread(p, t) {
  if (t->fs == p->fs)
   n_fs++;
 }

 if (p->fs->users > n_fs)
  bprm->unsafe |= LSM_UNSAFE_SHARE;
 else
  p->fs->in_exec = 1;

So I think (and here it gets a bit sketchy) we're racing with copy_process in kernel/fork.c: that calls copy_fs (which is what increments p->fs->users) some way before it does the stuff necessary to make the new thread be included in the while_each_thread(p, t) loop. So n_fs is too low, the check triggers and the setuid bits get ignored.
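
The suspected race can be sketched in userspace. This is only a model of the counting check under the assumption described above (copy_fs has bumped fs->users for a thread that is not yet visible in the thread list); struct model_fs, struct model_task and model_unsafe_share are hypothetical names, not kernel types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_fs   { int users; };            /* stands in for fs_struct */
struct model_task { struct model_fs *fs; };  /* stands in for task_struct */

/*
 * Mirrors the counting loop: n_fs starts at 1 for the exec'ing task,
 * and every other *visible* thread sharing the same fs adds one.
 * Returns true when the check would set LSM_UNSAFE_SHARE.
 */
static bool model_unsafe_share(const struct model_fs *fs,
                               const struct model_task *others,
                               size_t n_others)
{
    int n_fs = 1;

    assert(fs != NULL);
    for (size_t i = 0; i < n_others; i++)
        if (others[i].fs == fs)
            n_fs++;

    return fs->users > n_fs;
}
```

In the steady state (two threads, users == 2, one other thread visible) the function returns false. If users has already been incremented to 3 for a third thread that the loop cannot see yet, it returns true, which in this model is the window where the setuid bit would be ignored.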

No idea at all how to fix this of course.

tags: added: kernel-da-key
removed: kernel-key
tags: added: kernel-key
removed: kernel-da-key
Changed in linux (Ubuntu Xenial):
assignee: nobody → Colin Ian King (colin-king)
status: Triaged → In Progress
Colin Ian King (colin-king) wrote :

The following seems to fix it, but I need to exercise this a bit more to be 100% certain it is rock solid:

diff --git a/fs/fs_struct.c b/fs/fs_struct.c
index 7dca743..cd7175e2 100644
--- a/fs/fs_struct.c
+++ b/fs/fs_struct.c
@@ -98,8 +98,10 @@ void exit_fs(struct task_struct *tsk)
                int kill;
+               rcu_read_lock();
                tsk->fs = NULL;
                kill = !--fs->users;
+               rcu_read_unlock();
                if (kill)

Colin Ian King (colin-king) wrote :

Nope, that fails too.

Colin Ian King (colin-king) wrote :

So the thread's fs has already been torn down, leaving t->fs NULL, and that triggers the miscounting of n_fs. I'm speculating we may need to try:

 while_each_thread(p, t) {
  if (t->fs == p->fs || !t->fs)
   n_fs++;
 }

Colin Ian King (colin-king) wrote :

With the change mentioned in comment #8 I now cannot reproduce the issue.
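
For anyone following along, the difference between the original condition and the one from comment #8 can be shown with a small userspace model. It only illustrates how the two conditions count threads. It does not show that the change is the right fix. The names model_fs, model_task and model_count_n_fs are hypothetical, and the example assumes (as the comment hypothesizes) that fs->users still accounts for a thread whose fs pointer is already NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_fs   { int users; };            /* stands in for fs_struct */
struct model_task { struct model_fs *fs; };  /* stands in for task_struct */

/*
 * Counts n_fs the way the loop does. With count_null_fs == false this is
 * the original condition (t->fs == p->fs); with count_null_fs == true it
 * is the proposed one (t->fs == p->fs || !t->fs).
 */
static int model_count_n_fs(const struct model_fs *fs,
                            const struct model_task *others,
                            size_t n_others, bool count_null_fs)
{
    int n_fs = 1;  /* the exec'ing task itself */

    assert(fs != NULL);
    for (size_t i = 0; i < n_others; i++) {
        const struct model_fs *tfs = others[i].fs;

        if (tfs == fs || (count_null_fs && tfs == NULL))
            n_fs++;
    }
    return n_fs;
}
```

With users == 2 and one other thread whose fs is NULL, the original condition gives n_fs == 1 (so users > n_fs and the exec is treated as unsafe), while the proposed condition gives n_fs == 2 and the check passes.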

Zygmunt Krynicki (zyga) wrote :

This also happens on Fedora 25 running 4.10.8-200.fc25.x86_64.

Colin Ian King (colin-king) wrote :

This bug has been around since at least 2009.

Kernel Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195453

tags: added: kernel-da-key
removed: kernel-key