Apparmor related regression on access to unix sockets on a candidate 3.16 backport kernel

Bug #1390223 reported by Stéphane Graber on 2014-11-06
This bug affects 9 people
Affects          Status         Importance   Assigned to     Milestone
linux (Ubuntu)   Fix Released   Medium       John Johansen
Utopic           Won't Fix      Medium       John Johansen
Vivid            Won't Fix      Medium       John Johansen

Bug Description

I recently noticed a bunch of containers failing in a rather odd way when running postfix.

The most visible example is running mailq on an empty queue. Without apparmor (unconfined container) I see that the queue is empty; with apparmor, I get Permission denied.

That's all running as root so the permission denied looks a tiny bit odd. Also, running the 3.13 kernel, I don't get any of that weirdness.

My guess is that it has to do with the work that went into the 3.16 kernel for socket mediation. In theory only systems that run the utopic apparmor (which I DO NOT) should be seeing that kind of behavior, but it looks like some code path isn't checking things properly :)

== strace in unconfined container ==
chdir("/var/spool/postfix") = 0
rt_sigaction(SIGPIPE, {SIG_IGN, [PIPE], SA_RESTORER|SA_RESTART, 0x7f8963a62c30}, {SIG_IGN, [], 0}, 8) = 0
getuid() = 0
socket(PF_LOCAL, SOCK_STREAM, 0) = 4
fcntl(4, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(4, F_SETFL, O_RDWR) = 0
connect(4, {sa_family=AF_LOCAL, sun_path="public/showq"}, 110) = 0
poll([{fd=4, events=POLLIN}], 1, 3600000) = 1 ([{fd=4, revents=POLLIN|POLLHUP}])
read(4, "Mail queue is empty\n", 4096) = 20
poll([{fd=4, events=POLLIN}], 1, 3600000) = 1 ([{fd=4, revents=POLLIN|POLLHUP}])
read(4, "", 4096) = 0
write(1, "Mail queue is empty\n", 20Mail queue is empty
) = 20
close(4) = 0
exit_group(0) = ?
+++ exited with 0 +++

== strace in confined container ==
chdir("/var/spool/postfix") = 0
rt_sigaction(SIGPIPE, {SIG_IGN, [PIPE], SA_RESTORER|SA_RESTART, 0x7ffe62de4c30}, {SIG_IGN, [], 0}, 8) = 0
getuid() = 0
socket(PF_LOCAL, SOCK_STREAM, 0) = 4
fcntl(4, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(4, F_SETFL, O_RDWR) = 0
connect(4, {sa_family=AF_LOCAL, sun_path="public/showq"}, 110) = 0
poll([{fd=4, events=POLLIN}], 1, 3600000) = 1 ([{fd=4, revents=POLLIN|POLLHUP}])
read(4, 0x7ffe65b35c00, 4096) = -1 EACCES (Permission denied)
close(4) = 0
write(2, "postqueue: warning: close: Permi"..., 45postqueue: warning: close: Permission denied
) = 45
sendto(3, "<20>Nov 6 20:40:42 postfix/post"..., 78, MSG_NOSIGNAL, NULL, 0) = 78
exit_group(0) = ?
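
For anyone who wants to exercise the same call sequence outside postfix, here is a minimal Python sketch of what the traces above show: socket, connect, poll, then read until EOF. The function names, the temporary socket path, and the throwaway listener are illustrative assumptions, not part of postfix; the listener only stands in for showq so the client can run anywhere. In the confined trace, the read is the call that returns EACCES, which corresponds to the recv call below.

```python
import os
import select
import socket
import tempfile
import threading


def serve_once(listener, payload):
    """Accept one client, send payload, and close right away."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(payload)


def query(path, timeout_ms=5000):
    """Mirror the traced sequence: socket, connect, poll, read until EOF."""
    chunks = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        poller = select.poll()
        poller.register(sock, select.POLLIN)
        while True:
            if not poller.poll(timeout_ms):
                raise TimeoutError("no reply from " + path)
            data = sock.recv(4096)  # the read() that fails in the confined trace
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Run the client against a throwaway listener in a temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "showq")  # hypothetical stand-in for public/showq
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
            listener.bind(path)
            listener.listen(1)
            worker = threading.Thread(
                target=serve_once, args=(listener, b"Mail queue is empty\n")
            )
            worker.start()
            reply = query(path)
            worker.join()
    return reply


if __name__ == "__main__":
    print(demo().decode(), end="")
```

Pointing query() at the real showq socket from inside a confined container should follow the same path as mailq, though that has not been verified here.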

Kernel is a slightly outdated version of the kernel from the kernel team PPA:
Linux shell01 3.16.0-23-generic #31-Ubuntu SMP Thu Oct 23 20:13:35 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

If you think the latest build will improve this, I can test it, but seeing how this is a production server, I can't just flip kernels every 5 minutes (I'm running 3.16 to avoid a nasty btrfs bug on 3.13).

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1390223

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: utopic
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Incomplete → Confirmed
tags: added: kernel-da-key
Changed in linux (Ubuntu):
assignee: nobody → John Johansen (jjohansen)
Teemu Torma (teemu-torma) wrote :

I am seeing the same kind of behaviour after upgrading to utopic: mailq fails with permission denied.

The odd thing is that it doesn't happen every time. When running mailq in a loop, it sometimes works and sometimes doesn't. Sometimes it works fine for a period of time and then starts failing again.

mailq does not have an apparmor profile.
audit.log does not show any denied apparmor requests.

If I remove all postfix apparmor profiles by apparmor_parser -R, the problem appears to go away.

The kernel is 3.16.0-24-generic.

Andy Whitcroft (apw) on 2014-12-11
Changed in linux (Ubuntu Utopic):
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → John Johansen (jjohansen)
Andy Whitcroft (apw) on 2015-01-06
Changed in linux (Ubuntu Vivid):
status: Confirmed → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.18.0-8.9

---------------
linux (3.18.0-8.9) vivid; urgency=low

  [ Leann Ogasawara ]

  * Release Tracking Bug
    - LP: #1407692
  * rebase to v3.18.1
  * ubuntu: AUFS -- Resolve build failure union has no member named
    'd_child'

  [ Upstream Kernel Changes ]

  * arm64: optimized copy_to_user and copy_from_user assembly code
    - LP: #1400349
  * x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
    - LP: #1400314
    - CVE-2014-8134
  * rebase to v3.18.1
 -- Leann Ogasawara <email address hidden> Mon, 05 Jan 2015 09:12:32 -0800

Changed in linux (Ubuntu Vivid):
status: Fix Committed → Fix Released
Michael Heča (orgoj) wrote :

I have the same problem on Trusty with the Utopic kernel (linux-generic-lts-utopic 3.16.0.30.23) and an nginx + uwsgi fastrouter socket. Sometimes access is denied and a partial HTML file is sent.

From /var/log/nginx/error.log:
2015/02/11 08:39:27 [alert] 474#0: *148 readv() failed (13: Permission denied) while reading upstream, client: 10.199.1.1, server: ~^(www\.)?(?<project>[^\.]+)(\.mp.*)?, request: "GET / HTTP/1.1", upstream: "uwsgi://unix://var/local/mp/fastrouter.sock:", host: "www.wdt-cz.mpvirt:8086"

Kernel 3.13 from Trusty is OK.

zoolook (nbensa) wrote :

I have some containers that authenticate through kerberos/ldap, and I couldn't ssh into the containers while using utopic's kernels.

My fix was to temporarily go back to trusty's kernel.

zoolook (nbensa) wrote :

I went ahead and upgraded to vivid. My kerberos/ldap problems are solved but I still get:

$ mailq
postqueue: warning: close: Permission denied

guest is trusty:

ii postfix 2.11.0-1ubuntu1 amd64 High-performance mail transport agent

host is vivid:

ii lxc 1.1.0-0ubuntu1 amd64 Linux Containers userspace tools
ii apparmor 2.9.1-0ubuntu7 amd64 User-space parser utility for AppArmor

Linux venkman 3.19.0-9-generic #9-Ubuntu SMP Wed Mar 11 17:50:03 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

As in comment #2, I get "Permission denied" in ~997/1000 tries (in a loop).

ubuntu@test:~$ for i in {1..1000}; do mailq; done 2>&1 | grep -v Permission
Mail queue is empty
Mail queue is empty
Mail queue is empty
Mail queue is empty
ubuntu@test:~$
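
The same loop can be sketched in Python to get a count rather than filtered output. This is only an illustration: count_outcomes and its marker parameter are made-up names, and the check is a plain substring match on the combined stdout and stderr.

```python
import subprocess
import sys


def count_outcomes(command, runs, marker="Permission denied"):
    """Run command `runs` times; count runs whose output contains marker."""
    denied = 0
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # same effect as 2>&1 in the shell loop
            text=True,
            check=False,
        )
        if marker in result.stdout:
            denied += 1
    return {"denied": denied, "ok": runs - denied}
```

Calling count_outcomes(["mailq"], 1000) inside the container would give the denied/ok split directly, assuming mailq is on PATH.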

mike Bernson (mike-mlb) wrote :

I am having the same problem with puppet in a container on 14.04 using HWE kernel version 3.16.0-31.
The installed kernel package is linux-image-generic-lts-utopic.

Switching back to kernel 3.13.0-46 fixed the problem for me.

I have the same problem on 15.04 using Linux smtp01 3.19.0-9-generic.

Just to add, getting this error in the dmesg:

[97661.056052] audit: type=1400 audit(1426952275.541:2120): apparmor="DENIED" operation="file_perm" profile="lxc-container-default" name="public/showq" pid=25035 comm="postqueue" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
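
To compare denials across reports, the key=value fields of an audit line like the one above can be pulled apart with a small parser. A rough Python sketch, assuming the fields are either bare tokens or double-quoted strings without embedded quotes (parse_apparmor_audit is a hypothetical helper, not an AppArmor tool):

```python
import re

# key=value pairs; a value is either double-quoted or a run of non-space characters
_PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_apparmor_audit(line):
    """Return the key=value fields of an AppArmor audit line as a dict of strings."""
    fields = {}
    for key, quoted, bare in _PAIR.findall(line):
        # Exactly one of the two value groups matches; the other comes back empty.
        fields[key] = quoted if bare == "" else bare
    return fields
```

All values come back as strings, so numeric fields such as pid or fsuid need converting by the caller if they are to be compared as numbers.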

zoolook (nbensa) wrote :

$ uname -a
Linux venkman 3.19.0-11-generic #11-Ubuntu SMP Tue Mar 31 22:17:56 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Up-to-date Vivid.

Same problem.

Joseph Salisbury (jsalisbury) wrote :

This bug may have regressed again in the 3.19 kernel. Does this issue go away if you boot into the 3.18.0-8.9 kernel?

Changed in linux (Ubuntu Vivid):
status: Fix Released → Confirmed
Andy Whitcroft (apw) on 2015-05-07
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Andy Whitcroft (apw) on 2015-05-21
Changed in linux (Ubuntu Vivid):
status: Confirmed → Fix Committed
Lee Lists (lists-jave) wrote :

Hi,

Where is the "Fix Committed", is there a package we could try or build ?

Regards,
Lee

rufflove (nat-hulse) wrote :

I'm seeing the same thing with postscreen on the current kernel:

audit: type=1400 audit(1433863724.358:39): apparmor="DENIED" operation="file_perm" profile="lxc-container-default" name="private/dnsblog" pid=21627 comm="postscreen" requested_mask="r" denied_mask="r" fsuid=100104 ouid=0

zoolook (nbensa) wrote :

I waited two releases after the "Fix Committed" announcement. Still not fixed. Where's the fix? What version of what package?

This is current vivid.

zoolook@venkman:~$ sudo lxc-attach -n dana

root@dana:~# mailq
postqueue: warning: close: Permission denied

root@dana:~# uname -a
Linux dana 3.19.0-20-generic #20-Ubuntu SMP Fri May 29 10:10:47 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

root@dana:~# dmesg | tail -n1
[ 4323.625556] audit: type=1400 audit(1433983568.106:2631): apparmor="DENIED" operation="file_perm" profile="lxc-container-default" name="public/showq" pid=29637 comm="postqueue" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

zoolook (nbensa) wrote :

And again:

zoolook@venkman:~$ sudo lxc-attach -n dana

root@dana:~# mailq
postqueue: warning: close: Permission denied

root@dana:~# uname -a
Linux dana 3.19.0-21-generic #21-Ubuntu SMP Sun Jun 14 18:31:11 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

root@dana:~# dmesg | tail -n1
[ 3637.172284] audit: type=1400 audit(1434420248.087:2162): apparmor="DENIED" operation="file_perm" profile="lxc-container-default" name="public/showq" pid=29657 comm="postqueue" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

Seriously. Is anyone working on this?

rufflove (nat-hulse) wrote :

smtp and qmgr suffer similar denials:

kernel: [33252.627322] audit: type=1400 audit(1434961302.532:240): apparmor="DENIED" operation="file_perm" profile="lxc-container-default" name="private/trace" pid=11752 comm="qmgr" requested_mask="r" denied_mask="r" fsuid=100104 ouid=0

kernel: [33252.626415] audit: type=1400 audit(1434961302.532:238): apparmor="DENIED" operation="file_perm" profile="lxc-container-default" name="private/defer" pid=32222 comm="smtp" requested_mask="r"

Wolfgang Bumiller (wbumiller) wrote :

We encountered the same problem and noticed it only happened on 64 bit containers, while 32 bit containers seemed to work. We also tested upstream kernels 4.1 and 4.2.3, same result.

Wolfgang Bumiller (wbumiller) wrote :

Ah those "upstream" kernels weren't pure... Just tested manually compiled kernel master branch and tag 4.2, no issues. Copied over a packaged one: problem reappears.
It's not a container problem though. Running this on a host has the same effect:
# aa-exec -p $pick_your_favorite_profile -- socat UNIX:/var/spool/postfix/public/showq -
also gets an EPERM.
Also not 32/64 bit dependent, however we do have some containers where it always works, and some where it always fails, and that was their only obvious distinction, which now seems unrelated.

John Johansen (jjohansen) wrote :

So confirming that this bug is two separate issues.

There is the committed fix for a bug around the bad unix_addr_fs macro that was causing a failure. The remaining bug is around a socket that is being shut down and revalidated. It can manifest itself as a race, so there are cases where it appears to fail at random and others where it appears to fail reliably.
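
The traces in the description show poll returning POLLIN|POLLHUP before the failing read, i.e. the peer had already closed its end while data was still queued. Below is a minimal Python sketch of that state, using a socketpair instead of postfix. read_after_peer_close is a hypothetical name, and whether this exact state is what triggers the revalidation path is an assumption, not something confirmed here.

```python
import select
import socket


def read_after_peer_close(payload):
    """Return (poll event mask, bytes read) after the peer has already closed."""
    reader, writer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with reader:
        writer.sendall(payload)
        writer.close()  # peer is gone before the reader polls

        poller = select.poll()
        poller.register(reader, select.POLLIN)
        events = poller.poll(1000)
        mask = events[0][1] if events else 0
        data = reader.recv(4096)  # queued data is still readable after the close
    return mask, data
```

On an unaffected kernel the mask has both POLLIN and POLLHUP set and the queued bytes are returned, matching the unconfined trace.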

John Johansen (jjohansen) wrote :

This bug will be used for tracking the bad unix_addr_fs macro issue, whose fix has already been committed.

The other part of the reported bug, the denial for a socket being revalidated on shutdown (deleted entry), will be tracked under bug 1446906.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.3.0-1.10

---------------
linux (4.3.0-1.10) xenial; urgency=low

  [ Andy Whitcroft ]

  * [Config] make IBMVETH consistent on powerpc/ppc64el
    - LP: #1521712
  * [Config] follow ibmvscsi name change
    - LP: #1521712
  * [Config] move ibm disk and ethernet drivers to linux-image
    - LP: #1521712
  * [Config] include ibmveth in nic-modules for ppc64el
    - LP: #1521712
  * [Config] s390x -- disable abi/module checks for s390x

  [ Tim Gardner ]

  * [Config] Add spl/zfs provides to generic and powerpc64-smp
  * [Config] Add zfs to d-i fs-core-modules

  [ Upstream Kernel Changes ]

  * KVM: x86: work around infinite loop in microcode when #AC is delivered
  * KVM: svm: unconditionally intercept #DB
  * Btrfs: fix truncation of compressed and inlined extents
  * staging/dgnc: fix info leak in ioctl
  * [media] media/vivid-osd: fix info leak in ioctl
  * crypto: asymmetric_keys - remove always false comparison
  * X.509: Fix the time validation [ver #2]
  * isdn_ppp: Add checks for allocation failure in isdn_ppp_open()
  * ppp, slip: Validate VJ compression slot parameters completely

 -- Andy Whitcroft <email address hidden> Tue, 01 Dec 2015 21:37:13 +0000

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Rolf Leggewie (r0lf) wrote :

utopic has seen the end of its life and is no longer receiving any updates. Marking the utopic task for this ticket as "Won't Fix".

Changed in linux (Ubuntu Utopic):
status: Confirmed → Won't Fix

This bug was nominated against a series that is no longer supported, i.e. vivid. The bug task representing the vivid nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Vivid):
status: Fix Committed → Won't Fix