Regression due to CVE patches in kwallet-pam (processes not inheriting user's supplementary groups)

Bug #1784964 reported by TJ on 2018-08-01
This bug affects 4 people
Affects: policykit-1 (Ubuntu); Importance: Undecided; Assigned to: Unassigned

Bug Description

This report is tracking a possible regression caused by the recent CVE-2018-1116 patches to policykit-1.

On 18.04, since package upgrades on July 23rd, and after the first reboot since then on Aug 1st, I hit an issue with the primary (sudo, adm, etc...) user getting Permission Denied trying to do:

tail -f /var/log/syslog

when that file is owned by syslog:adm and is g=r.

I then found that "groups" reports only the $USER and not the entire list, but "groups $USER" reports all the groups correctly.
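
The difference between the two commands can be checked directly: "groups" with no argument reports the credentials of the calling process, while "groups $USER" looks the user up in the group database. A minimal sketch, assuming a POSIX shell; missing_groups is a hypothetical helper name, not part of any package:

```shell
# Print every group name listed in $2 that is absent from $1.
# $1 = groups held by the current process (space separated)
# $2 = groups the user database says the user should have
missing_groups() {
    current=" $1 "
    for g in $2; do
        case "$current" in
            *" $g "*) ;;                 # process already has this group
            *) printf '%s\n' "$g" ;;     # configured but not inherited
        esac
    done
}

# Compare the process credentials with the database entry for the same user.
missing_groups "$(id -Gn)" "$(id -Gn "$(id -un)")"
```

In a healthy session this prints nothing; in an affected one it would list the supplementary groups the process failed to inherit.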

The user shell is set to /usr/bin/tmux and /etc/tmux.conf has "set -g default-shell /bin/bash"

After changing the user's shell back to /bin/bash and logging in on tty1 the list of groups shows correctly for the /bin/bash process running on tty1.

I investigated and found that for the affected processes, such as the tmux process, /proc/$PID/loginuid = 4294967295 whereas the /bin/bash process on tty1 correctly reported 1000. The same with the respective gid_map and uid_map.

4294967295 == -1 == 0xFFFFFFFF
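
The equivalence above is just -1 reinterpreted as an unsigned 32-bit value. A small sketch, assuming a shell with 64-bit arithmetic (as bash and dash have on amd64); as_u32 and loginuid_is_unset are hypothetical helper names:

```shell
# Reinterpret a signed integer as an unsigned 32-bit value.
as_u32() {
    printf '%d\n' "$(( $1 & 0xFFFFFFFF ))"
}

# Succeed when a loginuid value equals (unsigned 32-bit) -1,
# i.e. no login UID was ever recorded for the process.
loginuid_is_unset() {
    [ "$1" -eq "$(as_u32 -1)" ]
}

as_u32 -1    # prints 4294967295

# Possible use on a live system (left commented out because the file
# is not present on every kernel configuration):
# loginuid_is_unset "$(cat /proc/self/loginuid)" && echo "loginuid unset"
```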

The recent CVE patch to policykit has several functions where it does "uid = -1" which seems to tie in to my findings so far.

I also noticed Ubuntu is still based on version 0.105 which was released in 2012 - upstream released 0.115 with the CVE patch.

I suspect the backporting has missed something.

The Ubuntu backport patch is:

https://git.launchpad.net/ubuntu/+source/policykit-1/commit/?h=applied/ubuntu/bionic-devel&id=840c50182f5ab1ba28c1d20cce4c207364852935

TJ (tj) on 2018-08-02
description: updated
Tom Reynolds (tomreyn) wrote :

I observe what is likely the same problem on XUbuntu 16.04.5, running these commands in xfce4-terminal:

user1@mysystem:~$ lsb_release -ds;cat /proc/version;echo $SHELL;groups;groups $(whoami)
Ubuntu 16.04.5 LTS
Linux version 4.15.0-29-generic (buildd@lcy01-amd64-024) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)) #31~16.04.1-Ubuntu SMP Wed Jul 18 08:54:04 UTC 2018
/bin/bash
user1
user1 : user1 adm disk fax cdrom sudo dip plugdev users lxd lpadmin sambashare libvirtd vboxusers

user1@mysystem:~$ ps f
  PID TTY STAT TIME COMMAND
 3544 pts/2 Ss 0:00 bash
 3582 pts/2 R+ 0:00 \_ ps f
user1@mysystem:~$ cat /proc/3544/loginuid;echo
4294967295

Everything behaves correctly on tty1 or after sudo login + login as user1 on the terminal.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in policykit-1 (Ubuntu):
status: New → Confirmed
TJ (tj) wrote :

journalctl shows the problem with the auid and session values being 0xFFFFFFFF (-1) when calling a sudo command:

Aug 02 01:18:20 hephaestion.lan.iam.tj audit[5094]: USER_AUTH pid=5094 uid=1000 auid=4294967295 ses=4294967295 msg='op=PAM:authentication acct="tj" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/1 res=success'

and trying to tail syslog:

tj  ~  tail -f /var/log/syslog
tail: cannot open '/var/log/syslog' for reading: Permission denied
tail: no files remaining

tj  ~  ls -ld /var /var/log /var/log/syslog
drwxr-xr-x 16 root root 4096 Apr 2 13:02 /var
drwxrwxr-x 25 root syslog 4096 Aug 2 01:16 /var/log
-rw-r----- 1 syslog adm 235432 Aug 2 01:31 /var/log/syslog

tj  ~  groups $USER
tj : tj root adm disk lp dialout cdrom floppy sudo audio video plugdev users netdev lpadmin kvm libvirtd wireshark lxd libvirtd

tj  ~  groups
tj

TJ (tj) wrote :

I've discovered another quirk:

If my first log-in after booting is at the TTY console (not GUI) the groups show up correctly there *and* in a terminal in the Xorg GUI session afterwards.

But if I first log-in to the GUI then log-in to the TTY console both show only the user group.

TJ (tj) wrote :

The quirk is more nuanced than I reported above.

This reports groups correctly:

1. GUI login
2. Switch to TTY, login
3. "groups"
4. Switch to GUI
5. Launch Terminal
6. "groups"

This only reports the username:

1. GUI Login
2. Launch Terminal
3. "groups"
4. Switch to TTY, login
5. "groups"

Tests done with /usr/bin/tmux as the user's shell and tmux default-shell = /bin/bash which has been my standard configuration for several years now with no problems.

TJ (tj) wrote :

Looking at the diff between Ubuntu and upstream I noticed Ubuntu 0.105 code isn't adapted for "systemd --user" as described in

https://bugs.freedesktop.org/show_bug.cgi?id=76358

and in the source for the function: polkit_backend_session_monitor_is_session_active()

Alex Murray (alexmurray) wrote :

I can't reproduce this myself, but I am using the default shell (bash provided by dash) and gnome-terminal. My understanding of the change to policykit-1 https://git.launchpad.net/ubuntu/+source/policykit-1/commit/?h=applied/ubuntu/bionic-devel&id=840c50182f5ab1ba28c1d20cce4c207364852935 is that previously the UID was supplied by the caller and used without validation. Now policykit instead tries to validate the supplied UID against the actual UID of the subject. However, that UID could change over time, so the check is racy, and this change introduces a number of failure modes in cases that would previously have worked.

Alex Murray (alexmurray) wrote :

@TJ re comment:6 that fix is already in for both xenial and bionic as far as I can see.

TJ (tj) wrote :

Looking at the diff between upstream 0.105 and Ubuntu's I happened to notice the CKDB_PATH (ConsoleKit database path) /var/run/ConsoleKit/database which seems to be consulted on some occasions.

On the affected PC, which was upgraded from 16.04 with do-release-upgrade, ConsoleKit 0.4.6-5 is still installed and that database is present. On another PC that has a clean 18.04 install, the database doesn't exist because consolekit is purely virtual now.

I'm not sure if this is relevant or not but worth reporting.

The database contains (for the *good* session) :

[Seat /org/freedesktop/ConsoleKit/Seat1]
kind=0
sessions=/org/freedesktop/ConsoleKit/Session1
devices=

[Session /org/freedesktop/ConsoleKit/Session1]
uid=1000
seat=/org/freedesktop/ConsoleKit/Seat1
login_session_id=3
display_device=/dev/tty1
remote_host_name=
is_active=false
is_local=true
creation_time=2018-08-02T01:18:34.707417Z

[SessionLeader /org/freedesktop/ConsoleKit/Session1]
session=/org/freedesktop/ConsoleKit/Session1
uid=0
pid=2494
service_name=:1.73

Note it references "display_device=/dev/tty1"

That is, I think, a reference to the TTY1 console login I did first. I'm going to restart after writing this and check what is in the file if I open the GUI Terminal shell first.

TJ (tj) wrote :

It seems that the /var/run/ConsoleKit directory and its database are only created by console TTY log-ins, not by the GUI. There was no directory after the GUI Terminal shell started; it appeared only after switching to TTY1.

So it would seem this isn't the cause since PCs without ConsoleKit work fine.

Alex Murray (alexmurray) wrote :

I've tried replicating your setup in a fresh bionic VM (ie. using tmux as default shell which then launches bash) and I can't replicate this:

amurray@sec-bionic-amd64:~$ grep amurray /etc/passwd
amurray:x:1000:1000:Ubuntu,,,:/home/amurray:/usr/bin/tmux
amurray@sec-bionic-amd64:~$ echo $SHELL
/bin/bash
amurray@sec-bionic-amd64:~$ cat /etc/tmux.conf
set -g default-shell /bin/bash
amurray@sec-bionic-amd64:~$ groups
amurray adm cdrom sudo dip plugdev lpadmin sambashare

This is all from within a graphic gnome-terminal launched after logging into the desktop (see picture which I will attach separately).

Can you perhaps try and provide more details on how I could try and replicate this?

A couple things to try

1. I've rebuilt polkit-1 with some extra debugging to try and flag when UIDs mismatch - this should end up in the following PPA https://launchpad.net/~alexmurray/+archive/ubuntu/lp1784964 which you could try installing from and seeing if journalctl shows anything?

2. Can you try downgrading polkit-1 and see if that resolves the issue?

TJ (tj) wrote :

Tom tried those things in a VM last night and could reproduce it. On a suggestion by Robbie Basak the downgrade was tried too, but it didn't solve it, which made me suggest that something in the configuration is being permanently changed.

I'm not going to downgrade the package because I am debugging it and don't want to disturb the system.

As Tom reproduced in a VM and on 16.04.5 I wonder if this is a timing issue that the _racy_ detection is truly detecting.

The common thread is Tom and myself are both using Xubuntu/XFCE.

TJ (tj) wrote :

I've awk-ed a list of the packages Upgraded or Installed on July 28th on the affected PC (previous upgrade was on July 8th). I've put a ? in front of those that could be suspect. That list is short:

grep '^?' Hacking/bug-groups-packages-updated.log
? gir1.2-polkit-1.0:amd64 (0.105-20, 0.105-20ubuntu0.18.04.1),
? libpam-systemd:amd64 (237-3ubuntu10, 237-3ubuntu10.3),
? libpolkit-agent-1-0:amd64 (0.105-20, 0.105-20ubuntu0.18.04.1),
? libpolkit-backend-1-0:amd64 (0.105-20, 0.105-20ubuntu0.18.04.1),
? libpolkit-gobject-1-0:amd64 (0.105-20, 0.105-20ubuntu0.18.04.1),
? libsystemd0:amd64 (237-3ubuntu10, 237-3ubuntu10.3),
? libsystemd0:i386 (237-3ubuntu10, 237-3ubuntu10.3),
? policykit-1:amd64 (0.105-20, 0.105-20ubuntu0.18.04.1),
? systemd:amd64 (237-3ubuntu10, 237-3ubuntu10.3),

The entire list is attached in case I've missed something.

The command used to generate it was:

zcat history.log.1.gz | awk '/^Start-Date:.*2018-07-28/{FOUND=1; print $0} FOUND && /^(Install|Upgrade): / { LIST=gensub( /), /, "),\n", "g", $0) } { if(LIST != "") {gsub(/^(Install|Upgrade): /, "", LIST); print "---"; print LIST | "sort"; print "---"; LIST=""}}' > ~/Hacking/bug-groups-packages-updated.log

Tom Reynolds (tomreyn) wrote :

TJ is right, I also confirmed this issue on a freshly installed 18.04.1 x86_64 Desktop VM last night. After enabling 'proposed' and installing all pending updates, 'groups' in a terminal returned just the user's primary group. I then restored a snapshot taken right after the 18.04 installation (but with 'proposed' already enabled), and installed all pending updates again, this time one by one, but could not reproduce it then. I don't have any indication that the outcome would have been any different without 'proposed'.

So it remains unclear to me how to reproduce this reliably. It is clear that it is possible to reproduce this (occasionally) on a fresh 18.04.1 installation. And also on 16.04.5. So I do think it will affect many.

TJ (tj) on 2018-08-02
summary: - Regression due to CVE-2018-1116 (processes not inheriting user ID or
- groups )
+ Regression due to CVE-2018-1116 (processes not inheriting user's groups
+ )

Marc Deslauriers (mdeslaur) wrote :

/proc/*/loginuid is set by the pam_loginuid module when you log in. Policykit isn't involved in that process at all.

Are you using gdm to log into the graphical session?

TJ (tj) wrote :

I think this Debian-reported bug is closely related. The description certainly sounds very like what I've experienced so far. I'm not linking it to this bug report until any relationship is clearer.

"policykit-1: please treat background processes (user bus) as part of active GUI session"

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=779988

Marc Deslauriers (mdeslaur) wrote :

What's the output of "id" in a broken shell?

Marc Deslauriers (mdeslaur) wrote :

Are you using local passwd/shadow/group files, or are you authenticating using something else?

H Geerts (hgeerts) wrote :

I experience this same behaviour using lightdm + KDE plasma.
I've also tested lightdm + unity which did not trigger this behaviour.
This install uses local passwd/shadow/group files.
Both tests were after a fresh boot.

harm@harm-XPS-13-9360:~$ lsb_release -a; cat /proc/version
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.1 LTS
Release: 18.04
Codename: bionic
Linux version 4.15.0-29-generic (buildd@lgw01-amd64-057) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018

harm@harm-XPS-13-9360:~$ groups; groups $(whoami)
harm
harm : harm adm dialout cdrom sudo dip plugdev netdev lpadmin sambashare libvirt docker

harm@harm-XPS-13-9360:~$ id; id $(whoami)
uid=1000(harm) gid=1000(harm) groups=1000(harm)
uid=1000(harm) gid=1000(harm) groups=1000(harm),4(adm),20(dialout),24(cdrom),27(sudo),30(dip),46(plugdev),109(netdev),113(lpadmin),128(sambashare),135(libvirt),999(docker)

harm@harm-XPS-13-9360:~$ cat /etc/nsswitch.conf
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality. ...


TJ (tj) wrote :

Marc: regular stand-alone install, local authentication via passwd/shadow/group.

Here's what I see with the 'broken' sequence GUI terminal:

 tj  ~  id
uid=1000(tj) gid=1000(tj) groups=1000(tj)
 tj  ~  groups
tj
 tj  ~  groups $USER
tj : tj root adm disk lp dialout cdrom floppy sudo audio video plugdev users netdev lpadmin kvm libvirtd wireshark lxd libvirtd

$ pid=$BASHPID; while [[ $pid -ne 0 ]]; do ids=$(grep '^([P]*id\|.*id:\|Groups:)' /proc/$pid/status); echo -e "cmdline: $(cat /proc/$pid/cmdline) \n $ids" 2>/dev/null; pid=$(echo $ids | awk '{print $8}'); done > Hacking/bug-groups-parent-process-tree.log

cmdline: -bash
 Tgid: 3610
Ngid: 0
Pid: 3610
PPid: 3548
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
NStgid: 3610
NSpid: 3610
NSpgid: 3610
NSsid: 3610

cmdline: tmux
 Tgid: 3548
Ngid: 0
Pid: 3548
PPid: 1
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
NStgid: 3548
NSpid: 3548
NSpgid: 3548
NSsid: 3548

cmdline: /sbin/init
 Tgid: 1
Ngid: 0
Pid: 1
PPid: 0
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
NStgid: 1
NSpid: 1
NSpgid: 1
NSsid: 1

TJ (tj) wrote :

Marc:

Are you using gdm to log into the graphical session?

lightdm - this is Xubuntu

TJ (tj) wrote :

I am beginning to suspect this is a systemd-logind issue. I had been thinking it was logind, and have now checked the upgraded packages' from/to versions and cross-checked them against the changelogs.

? systemd:amd64 (237-3ubuntu10, 237-3ubuntu10.3),

And we have a major change to logind included in that:

systemd (237-3ubuntu10.2) bionic; urgency=medium

  * logind: backport v238/v239 fixes for handling DRM devices.
    These changes introduce all the fixes that correct handling of open fd's
    related to the DRM devices, as used by for example NVIDIA GPUs. This backport
    includes some refactoring, corrections, and comment updates. This to insure
    that correct history is preserved, code comments match reality, and to ease
    backporting logind fixes in the future SRUs. (LP: #1777099)

TJ (tj) wrote :

Just noticed the PID tree trace didn't match the Groups: line from /proc/$PID/status. Here's the corrected output.

$ pid=$BASHPID; while [[ $pid -ne 0 ]]; do ids=$(grep '^\(.*id:\|Group\)' /proc/$pid/status); echo -e "cmdline: $(cat /proc/$pid/cmdline) \n $ids" 2>/dev/null; pid=$(echo $ids | awk '{print $8}'); done

cmdline: -bash
 Tgid: 3610
Ngid: 0
Pid: 3610
PPid: 3548
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
Groups:
NStgid: 3610
NSpid: 3610
NSpgid: 3610
NSsid: 3610

cmdline: tmux
 Tgid: 3548
Ngid: 0
Pid: 3548
PPid: 1
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
Groups:
NStgid: 3548
NSpid: 3548
NSpgid: 3548
NSsid: 3548

cmdline: /sbin/init
 Tgid: 1
Ngid: 0
Pid: 1
PPid: 0
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
Groups:
NStgid: 1
NSpid: 1
NSpgid: 1
NSsid: 1
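
The parent-process walk above can also be written so that it picks fields by name rather than by position in the grep output. This is only a sketch under assumptions: status_field and walk_groups are hypothetical helper names, and the second argument of walk_groups exists only so the function can be pointed at a directory other than /proc:

```shell
# Print one field (e.g. PPid or Groups) from a file in /proc/<pid>/status format.
# $1 = field name without the colon, $2 = path to the status file
status_field() {
    awk -v key="$1:" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); print; exit }' "$2"
}

# Walk from a PID up through its ancestors, printing "<pid>: <supplementary groups>".
# $1 = starting PID, $2 = proc root (defaults to /proc)
walk_groups() {
    pid=$1
    root=${2:-/proc}
    while [ -n "$pid" ] && [ "$pid" -ne 0 ] && [ -r "$root/$pid/status" ]; do
        printf '%s: %s\n' "$pid" "$(status_field Groups "$root/$pid/status")"
        pid=$(status_field PPid "$root/$pid/status")   # move to the parent
    done
}

# Example: start from the current shell.
# walk_groups "$$"
```

An empty list after the colon for a user process corresponds to the empty "Groups:" lines in the broken trace above.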

TJ (tj) wrote :

And this is the same output using the 'correct' scenario by logging into the TTY console first.

cmdline: -bash
 Tgid: 3516
Ngid: 0
Pid: 3516
PPid: 3488
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
Groups: 0 4 6 7 20 24 25 27 29 44 46 100 108 114 132 134 134 137 142 1000
NStgid: 3516
NSpid: 3516
NSpgid: 3516
NSsid: 3516

cmdline: -tmux
 Tgid: 3488
Ngid: 0
Pid: 3488
PPid: 1
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
Groups: 0 4 6 7 20 24 25 27 29 44 46 100 108 114 132 134 134 137 142 1000
NStgid: 3488
NSpid: 3488
NSpgid: 3488
NSsid: 3488

cmdline: /sbin/init
 Tgid: 1
Ngid: 0
Pid: 1
PPid: 0
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
Groups:
NStgid: 1
NSpid: 1
NSpgid: 1
NSsid: 1

TJ (tj) wrote :

Just noticed in $HOME/.xsession-errors the following:

(polkit-gnome-authentication-agent-1:4029): polkit-gnome-1-WARNING **: 15:04:54.498: Failed to register client: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.gnome.SessionManager was not provided by any .service files

TJ (tj) on 2018-08-04
summary: - Regression due to CVE-2018-1116 (processes not inheriting user's groups
- )
+ Regression due to CVE-2018-1116 (processes not inheriting user's
+ supplementary groups )
tags: added: regression-update

This appears to be due to CVE patches in kwallet-pam not policykit-1. Marking this as a duplicate of

Bug #1781418 "User not being initialized correctly on login"

It affects lightdm due to its /etc/pam.d/lightdm including libpam_kwallet{4,5}.so. Solution is to comment/remove those.

See comment #5 of the kwallet-pam bug.
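
The workaround can be sketched as a filter that comments out every active line mentioning pam_kwallet. This is an illustration only: comment_out_kwallet is a hypothetical helper name, the exact lines present in /etc/pam.d/lightdm are not shown in this report, and the function writes to standard output rather than editing the file in place:

```shell
# Print the PAM file given as $1 with every not-yet-commented line that
# mentions pam_kwallet prefixed by '#'. The input file is left untouched.
comment_out_kwallet() {
    sed '/^[[:space:]]*#/!s/^.*pam_kwallet.*$/#&/' "$1"
}

# Possible use (review the result before replacing the real file as root):
# comment_out_kwallet /etc/pam.d/lightdm > /tmp/lightdm.pam.new
```

Writing to a separate file first keeps the original PAM configuration intact until the result has been reviewed, which matters because a broken PAM file can prevent logins.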

summary: - Regression due to CVE-2018-1116 (processes not inheriting user's
- supplementary groups )
+ Regression due to CVE patches in kwallet-pam (processes not inheriting
+ user's supplementary groups )