Please re-enable container support in apport

Bug #1732518 reported by Stéphane Graber
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
apport (Ubuntu)
Fix Released
High
Unassigned
Xenial
Fix Released
High
Unassigned
Zesty
Fix Released
High
Unassigned
Artful
Fix Released
High
Unassigned
Bionic
Fix Released
High
Unassigned

Bug Description

The latest security update for apport disabled container crash forwarding, this is a feature which users do rely on in production and while it may have been appropriate to turn it off to put a security update out, this needs to be re-enabled ASAP.

I provided a patch which fixed the security issue before the security issue was publicly disclosed so pushing an SRU to all Ubuntu releases re-enabling this code should be pretty trivial.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Tagged as regression-update since the security update technically regressed this (albeit on purpose).

Changed in apport (Ubuntu Xenial):
importance: Undecided → High
Changed in apport (Ubuntu Zesty):
importance: Undecided → High
Changed in apport (Ubuntu Artful):
importance: Undecided → High
Changed in apport (Ubuntu Bionic):
importance: Undecided → High
Changed in apport (Ubuntu Xenial):
status: New → Triaged
Changed in apport (Ubuntu Zesty):
status: New → Triaged
Changed in apport (Ubuntu Artful):
status: New → Triaged
Changed in apport (Ubuntu Bionic):
status: New → Triaged
tags: added: regression-update
Revision history for this message
Tyler Hicks (tyhicks) wrote :

The patch in comment #4 of bug 1726372 was mostly complete but issues were discovered late as we were approached the CRD for the CVEs described in that bug:

1) The patch should be updated to forward the new dump_mode argument into the container. This is a trivial change.
2) The patch changed the functionality of apport so that it processes, in the host, all crashes that come from a "non-full" container. The PoC in the description of bug 1726372 simply creates a PID namespace, without a new mount namespace, and then calls abort(). The behavioral change introduced by the patch resulted in apport writing the core dump to /tmp/core when it didn't do that before because it ignored such crashes.
3) The combination of the patch and the fix for CVE-2017-14177, which added a new required dump_mode command line option to Apport, made it potentially dangerous for an updated Apport in the host to forward a crash to a non-updated Apport in a container as the dump_mode parameter would be treated as the global_pid in the container's Apport.

These three issues are why we had to make the decision to (temporarily) drop container crash forwarding.

I won't be directly involved in re-enabling the container crash forwarding support but please feel free to ping me for a review, if needed.

Revision history for this message
Stéphane Graber (stgraber) wrote :

This debdiff re-introduces the forwarding code, it also cleans a number of things up:

 - It fixes a regression of apport on systems using upstart
 - It replaces the is_container logic with a is_same_ns function that lets us check things more carefully.
 - If the pidns differs but mntns doesn't, apport will process the crash locally, using the global pid.
 - If the mntns differs but pidns doesn't, then the crash is just plain ignored
 - If pidns and mntns differ and an apport receiver socket can be found, the crash is forwarded. If none can be found, the crash is ignored.
 - All arguments except the first and last are now sent to the receiver.
 - The receiver has a check for the number of received arguments, ignoring any crash that doesn't match its view of the world.
 - The ucred is used for pid passing, translating the pid across pidns.

I've done the following tests:
 - Standard crash on host => crash in /var/crash of host
 - Crash on host in a different pidns => crash in /var/crash of host
 - Crash on host in a different mtnns => no crash file
 - Crash on host in a different pidns and mntns => no crash file (no receiver found)
 - Crash in container with receiver setup => crash in /var/crash of container

Revision history for this message
Tyler Hicks (tyhicks) wrote :

Do we have a strong reason to start handling crashes inside of "non-full" containers on stable Ubuntu releases? I'm specifically talking about when this conditional evaluates to True:

  elif not is_same_ns(host_pid, "pid") and is_same_ns(host_pid, "mnt"):

If there's no strong reason, can we only enable that in Bionic?

Also, did you test that with the the PoC in bug 1726372? I'm fairly certain that it'll create a core dump in /tmp (/tmp/core) which is new/undesired.

Revision history for this message
Tyler Hicks (tyhicks) wrote :

Going back to point #3 in comment 2, I don't see anything that will protect against an updated apport in the host from forwarding a crash to a non-updated apport in a container, causing the container's apport to confuse dump_mode as a global_pid. Am I missing something that protects against that or do not consider it to be a problem worth worrying about?

Revision history for this message
Stéphane Graber (stgraber) wrote :

I strongly suspect that us ignoring crashes coming from a different pidns but same mntns is effectively making us drop crashes from various web browsers and other piece of server software (I remember some ftp server using a pidns per connection at some point).

There doesn't seem to be a good reason to drop those as they are legitimate crashes that we can handle just fine.

Revision history for this message
Tyler Hicks (tyhicks) wrote :

I suspect that you're correct but I'd rather not widen the attack surface of apport without having a strong reason to do so. If there's not strong justification, maybe enabling the handling of those crashes in the dev release and seeing how it plays out would be a better approach.

Revision history for this message
Stéphane Graber (stgraber) wrote :

As for the forwarding issue, there is a check now in place to prevent such things from happening in the future, that's the argument check in the receiver.

For the case where as user is running the pre-security upload version of apport in a container and this post-security upload version on the host, then the container will indeed receive one more argument than it needs but I don't think there's much we can do about this.

In this case, the host would send "<pid> <signal> <ulimit> <dump mode>" to the container.
The container would then set its sys.argv to match, effectively putting the dump mode as the global pid.

This is obviously not going to work well and will result in apport crashing in the container.
As far as I can tell this isn't exploitable and will get resolved as soon as the container is upgraded. The check I put in place will prevent this from happening again and once we get named arguments, the problem will go away for good while retaining backward compatibility.

Revision history for this message
Stéphane Graber (stgraber) wrote :

As for the reproducer from the other bug causing a /tmp/core file to be written, my understanding is that this is normal apport behavior for when a core file size is set.

Revision history for this message
Stéphane Graber (stgraber) wrote :
Revision history for this message
Stéphane Graber (stgraber) wrote :

Fixed an issue with the mntns check (replaced else with elif).

Revision history for this message
Stéphane Graber (stgraber) wrote :

stgraber@castiana:~$ cat crash.c
#include <stdlib.h>

int main() {
    abort();
    return 0;
}
stgraber@castiana:~$ gcc crash.c -o crash
stgraber@castiana:~$ ulimit -c unlimited
stgraber@castiana:~$ ./crash
Aborted (core dumped)
stgraber@castiana:~$ ls -lh core
-rw------- 1 stgraber domain admins 260K Nov 15 18:12 core
stgraber@castiana:~$ sudo tail /var/log/apport.log
ERROR: apport (pid 10082) Wed Nov 15 18:12:31 2017: called for pid 10081, signal 6, core limit 18446744073709551615, dump mode 1
ERROR: apport (pid 10082) Wed Nov 15 18:12:31 2017: ignoring implausibly big core limit, treating as unlimited
ERROR: apport (pid 10082) Wed Nov 15 18:12:31 2017: executable: /home/stgraber/crash (command line "./crash")
ERROR: apport (pid 10082) Wed Nov 15 18:12:31 2017: executable does not belong to a package, ignoring
ERROR: apport (pid 10082) Wed Nov 15 18:12:31 2017: writing core dump to /home/stgraber/core (limit: -1)
stgraber@castiana:~$

Revision history for this message
Stéphane Graber (stgraber) wrote :

Anyway, for the pidns thing, yes, we can change the condition to trip on either a different pidns or a different mntns and log a message that this isn't supported, I'm just not seeing much risk with just supporting it and it would certainly qualify as a bugfix.

The handling in this case isn't particularly controversial either, if we see we're in the same mntns but pidns differs we just use the global pid and move on. We know the filesystem will be the same, so it's fine to call the hooks and write wherever apport usually writes.

Revision history for this message
Tyler Hicks (tyhicks) wrote :

If you don't run the `ulimit -c unlimited` command, your crash program will not result in apport writing out a core file.

However, even if you don't run that command, the reproducer in bug 1726372 will cause apport to write out a core file.

Revision history for this message
Tyler Hicks (tyhicks) wrote :

The reason I'm being picky about the pidns thing is because I think this update needs to go through -security since it fixes regressions caused by the security update. We try to be as conservative as possible with those updates.

Revision history for this message
Stéphane Graber (stgraber) wrote :

The reproducer does the ulimit itself, that's why you're getting a core.

The example code in 1726372 specifically does:

   corelimit.rlim_cur = RLIM_INFINITY;
   corelimit.rlim_max = RLIM_INFINITY;
   setrlimit(RLIMIT_CORE, &corelimit);

Which is equivalent to "ulimit -c unlimited"

Revision history for this message
Tyler Hicks (tyhicks) wrote :

Sigh... Thanks for being patient with me on that. I think my brain just wrote everything at the top of main() off as setting up the namespace for some reason. That's embarrassing... :)

tags: added: patch
Revision history for this message
Tyler Hicks (tyhicks) wrote :

@Brian did you have any thoughts on the debdiff?

Revision history for this message
Brian Murray (brian-murray) wrote :

I'm good with these changes.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apport - 2.20.4-0ubuntu4.8

---------------
apport (2.20.4-0ubuntu4.8) zesty-security; urgency=medium

  [ Stéphane Graber ]
  * REGRESSION UPDATE: Fix regression in previous upload by re-enabling
    container support. (LP: #1732518)
  * REGRESSION UPDATE: Fix the core_pattern for upstart based systems to
    include the dump mode.
  * Add code preventing a user from confusing apport by using
    a manually crafted filesystem inside a combination of a user and mount
    namespace.
  * Add a check in apport receiver for the number of arguments so that
    should another argument be added later, the receiver will simply ignore
    the crash until it itself gets updated.

 -- Tyler Hicks <email address hidden> Fri, 17 Nov 2017 15:55:58 +0000

Changed in apport (Ubuntu Zesty):
status: Triaged → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apport - 2.20.7-0ubuntu3.5

---------------
apport (2.20.7-0ubuntu3.5) artful-security; urgency=medium

  [ Stéphane Graber ]
  * REGRESSION UPDATE: Fix regression in previous upload by re-enabling
    container support. (LP: #1732518)
  * Add code preventing a user from confusing apport by using
    a manually crafted filesystem inside a combination of a user and mount
    namespace.
  * Add a check in apport receiver for the number of arguments so that
    should another argument be added later, the receiver will simply ignore
    the crash until it itself gets updated.

 -- Tyler Hicks <email address hidden> Fri, 17 Nov 2017 15:58:36 +0000

Changed in apport (Ubuntu Artful):
status: Triaged → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apport - 2.20.1-0ubuntu2.13

---------------
apport (2.20.1-0ubuntu2.13) xenial-security; urgency=medium

  * REGRESSION UPDATE: Fix regression in previous upload by re-enabling
    container support. (LP: #1732518)
  * REGRESSION UPDATE: Fix the core_pattern for upstart based systems to
    include the dump mode.
  * Add code preventing a user from confusing apport by using
    a manually crafted filesystem inside a combination of a user and mount
    namespace.
  * Add a check in apport receiver for the number of arguments so that
    should another argument be added later, the receiver will simply ignore
    the crash until it itself gets updated.

 -- Stéphane Graber <email address hidden> Wed, 15 Nov 2017 17:00:24 -0500

Changed in apport (Ubuntu Xenial):
status: Triaged → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apport - 2.20.8-0ubuntu2

---------------
apport (2.20.8-0ubuntu2) bionic; urgency=medium

  [ Stéphane Graber ]
  * REGRESSION UPDATE: Fix regression in previous upload by re-enabling
    container support. (LP: #1732518)
  * Add code preventing a user from confusing apport by using
    a manually crafted filesystem inside a combination of a user and mount
    namespace.
  * Add a check in apport receiver for the number of arguments so that
    should another argument be added later, the receiver will simply ignore
    the crash until it itself gets updated.

 -- Brian Murray <email address hidden> Mon, 20 Nov 2017 08:46:52 -0800

Changed in apport (Ubuntu Bionic):
status: Triaged → Fix Released
tags: added: id-5a0f2b531bfe92138cbe0a65
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.