"System program problem" report button does nothing

Bug #1821412 reported by Brian Murray
This bug affects 10 people
Affects                   Status  Importance  Assigned to   Milestone
update-notifier (Ubuntu)          High        Dan Bungert
Bionic                            Undecided   Unassigned
Focal                             High        Brian Murray
Hirsute                           High        Brian Murray
Impish                            High        Brian Murray

Bug Description

[Impact]
  * Users, already annoyed that software has crashed and they have an
    unwanted dialog instead, are unable to click the report button to
    let us know about the crash.
  * Two identical-looking dialogs exist - one on the system-crash
    path, and one spawned by the background update-notifier process.
    The system crash one is the failing one, the update-notifier one
    is fine. The system crash one shows first, and users may not
    bother to approve the second one given that the first didn't do
    anything.
  * We receive fewer crash reports

[Test Plan]
  * sudo xeyes &
  * sudo kill -11 $PID (where $PID is the PID reported by the command above)
  * receive crash notification
  * click "Report problem..."
  * We should see the report procedure start

[Where problems could occur]
  * By the nature of the supplied fix, if the report process hangs
    around, the normal systemd cleanup procedures would fail to clean
    it up. This same risk is present for report processes initiated
    on the non-system-crash report code path.
  * Wayland is still affected even with this fix, for different reasons
    (LP: #1947929)

[Other Info]
Q: Systemd says KillMode=process is not recommended
https://www.freedesktop.org/software/systemd/man/systemd.kill.html

A: In this case, killing the other processes in the control group is
part of the problem.

The system crash dialog is a relatively simple bit of code that shows
a dialog, then runs a report process. This dialog process is spawned
as part of a chain of processes downstream from
update-notifier-crash.service.

This dialog runs the report process using g_spawn_async(). After
spawning the report process, there isn't any particular reason to keep
the dialog process around, so it exits. This exit is triggering the
KillMode behavior, and because the report process is in the control
group, the report process is killed.

Another possibility for addressing the process management is to run
the report process synchronously - keep the dialog process around
until the report is done and just make the dialog not visible. This
is not a workable answer in this case because, with the current GTK
code usage, the dialog sticks around - even after the call to
gtk_widget_destroy()! This appears to be due to usage of
gtk_dialog_run(). Long term, I recommend we modernize this code.
My initial fix for this LP was in that direction:
https://git.launchpad.net/~dbungert/update-notifier/commit/?id=06058d5705ed7cd636206c1ee72c376ec903fe74
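
For illustration only, the synchronous alternative described above can be sketched with plain POSIX calls. This is not the update-notifier code, which uses GLib's g_spawn_async(); the helper name run_report_sync() and the commands passed to it are assumptions made for the example. The point is that the launching process stays alive until the report process has finished, so its exit cannot trigger control-group cleanup while the report is still running.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start argv[0] and block until it exits.
 * Returns the child's exit status, or -1 on failure or if the child
 * was terminated by a signal. */
static int run_report_sync(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: become the report process. */
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: stay alive until the report process is done. */
    int status = 0;
    while (waitpid(child, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The asynchronous path in the bug corresponds to skipping the waitpid() loop and exiting right after the fork; whether the child then survives depends on the KillMode of the service it runs under.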


Brian Murray (brian-murray) wrote :
Changed in update-notifier (Ubuntu):
importance: Undecided → High
tags: added: rls-dd-notfixing rls-ee-incoming
removed: rls-dd-incoming
tags: added: id-5c9cf7c38b53cd82590a3144
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in update-notifier (Ubuntu):
status: New → Confirmed
Matt Robbins (mtr-w) wrote :

It's happening to me with Ubuntu 18.04 LTS Desktop.

tags: removed: rls-ee-incoming
Changed in update-notifier (Ubuntu Eoan):
status: Confirmed → Won't Fix
tags: added: fr-25
tags: added: focal groovy
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package update-notifier - 3.192.35

---------------
update-notifier (3.192.35) hirsute; urgency=medium

  * src/system-crash.c:
    - Wrap pkexec call in /bin/sh to workaround requirement
      that PPID not be 1. (LP: #1821412)
    - Don't use G_SPAWN_DO_NOT_REAP_CHILD in order to avoid
      zombie processes.

 -- Tiago Stürmer Daitx <email address hidden> Wed, 21 Oct 2020 19:55:37 +0000

Changed in update-notifier (Ubuntu):
status: Confirmed → Fix Released
Dan Bungert (dbungert) wrote :

Hi Brian,

I advise against SRUing the current patch to focal, based on the following:
1) I have not been able to reproduce the original problem on focal - but could
see it on disco
2) The changelog for the fix states
  - Wrap pkexec call in /bin/sh to workaround requirement
    that PPID not be 1. (LP: #1821412)
  - Don't use G_SPAWN_DO_NOT_REAP_CHILD in order to avoid
    zombie processes.
3) PPID == 1 is a valid concern, given this code from polkit-1 pkexec.c:
  /* Figure out the parent process */
  pid_of_caller = getppid ();
  if (pid_of_caller == 1)
    {
      /* getppid() can return 1 if the parent died (meaning that we are reaped
       * by /sbin/init); In that case we simpy bail.
       */
      g_printerr ("Refusing to render service to dead parents.\n");
      goto out;
    }
4) Without the patch, ppid is the systemd pid (not 1)
5) With the patch, ppid is still systemd (and still not 1), which suggests that
the change doesn't make the situation any better - and extracting the
before/after usage of g_spawn_async into a test program provides equivalent
results.
                       ...
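
As a rough way to check points 3 to 5 independently, the sketch below forks a stand-in "dialog" process that forks a stand-in "report" process and exits at once, then returns the PPID the orphaned process observes. The helper names pkexec_would_refuse() and ppid_seen_by_orphan() are hypothetical; only the comparison against 1 is taken from the quoted pkexec code. Which PID the orphan sees depends on the environment: it is 1 when init adopts it, or the PID of a process registered as a child subreaper otherwise.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Mirrors the quoted pkexec check: a caller PPID of 1 is treated as
 * a dead parent. */
static bool pkexec_would_refuse(pid_t ppid)
{
    assert(ppid >= 0);
    return ppid == 1;
}

/* Hypothetical helper: returns the PPID seen by a process whose parent
 * exited without waiting for it, or -1 on error. */
static pid_t ppid_seen_by_orphan(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t dialog = fork();
    if (dialog < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (dialog == 0) {
        pid_t dialog_pid = getpid();
        pid_t report = fork();
        if (report == 0) {
            close(fds[0]);
            /* Wait (at most about one second) until reparented. */
            struct timespec tick = { 0, 1000000 }; /* 1 ms */
            for (int i = 0; i < 1000 && getppid() == dialog_pid; i++)
                nanosleep(&tick, NULL);
            pid_t seen = getppid();
            ssize_t n = write(fds[1], &seen, sizeof seen);
            _exit(n == (ssize_t)sizeof seen ? 0 : 1);
        }
        _exit(0); /* the "dialog" exits without waiting for the report */
    }

    /* Original process: collect the value reported by the orphan. */
    close(fds[1]);
    pid_t seen = -1;
    if (read(fds[0], &seen, sizeof seen) != (ssize_t)sizeof seen)
        seen = -1;
    close(fds[0]);
    waitpid(dialog, NULL, 0);
    return seen;
}
```

Calling pkexec_would_refuse(ppid_seen_by_orphan()) inside and outside a desktop session is one way to see whether the "dead parents" branch could be reached at all in a given setup.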


Brian Murray (brian-murray) wrote :

I was testing this on Hirsute today and nothing was happening when I clicked the "Report" button. I then ran update-notifier with --debug-misc and discovered the following in the output:

** (update-notifier:2610): DEBUG: 15:44:59.075: crashreport_check
** (update-notifier:2610): DEBUG: 15:44:59.304: crashreport found running apport now
** (update-notifier:2610): DEBUG: 15:44:59.304: fire up the crashreport tool
** (update-notifier:2610): DEBUG: 15:44:59.417: no pkexec for system reports under wayland

So it seems this may not be working because hirsute now uses Wayland. That said, there is also odd behavior when not using Wayland: after killing xeyes I instantly receive a crash report, but clicking "Report" does nothing. Later on, however, when update-notifier checks for a crash report and raises a dialog, clicking "Report" behaves as expected. So perhaps the fix is incomplete for Xorg and not working for Wayland.

Brian Murray (brian-murray) wrote :

I think the pkexec Wayland bug is https://bugs.launchpad.net/ubuntu/+source/backintime/+bug/1713313 but it's a long read.

Brian Murray (brian-murray) wrote :

It's possible that the first dialog is coming from crash.c and system-crash.c.

Changed in update-notifier (Ubuntu):
status: Fix Released → Triaged
Dan Bungert (dbungert)
Changed in update-notifier (Ubuntu):
assignee: nobody → Dan Bungert (dbungert)
status: Triaged → In Progress
Norbert (nrbrtx)
tags: removed: disco groovy rls-dd-notfixing
Dan Bungert (dbungert) wrote :

Moving the Wayland portion to LP: #1947929

no longer affects: update-notifier (Ubuntu Eoan)
Brian Murray (brian-murray) wrote :

 $ dput update-notifier_3.192.46_source.changes
Trying to upload package to ubuntu
Checking signature on .changes
gpg: /home/bdmurray/source-trees/update-notifier/update-notifier_3.192.46_source.changes: Valid signature from 1E918B66765B3E31
Checking signature on .dsc
gpg: /home/bdmurray/source-trees/update-notifier/update-notifier_3.192.46.dsc: Valid signature from 1E918B66765B3E31
Uploading to ubuntu (via ftp to upload.ubuntu.com):
  Uploading update-notifier_3.192.46.dsc: done.
  Uploading update-notifier_3.192.46.tar.xz: done.
  Uploading update-notifier_3.192.46_source.buildinfo: done.
  Uploading update-notifier_3.192.46_source.changes: done.
Successfully uploaded packages.

Dan Bungert (dbungert)
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package update-notifier - 3.192.46

---------------
update-notifier (3.192.46) jammy; urgency=medium

  [ Dan Bungert ]
  * Adjust update-notifier-crash.service so that the report process is not
    killed before it has a chance to act (LP: #1821412)

 -- Brian Murray <email address hidden> Thu, 21 Oct 2021 14:49:48 -0700

Changed in update-notifier (Ubuntu):
status: In Progress → Fix Released
Changed in update-notifier (Ubuntu Focal):
status: New → In Progress
Changed in update-notifier (Ubuntu Hirsute):
status: New → In Progress
Changed in update-notifier (Ubuntu Impish):
status: New → In Progress
Changed in update-notifier (Ubuntu Focal):
assignee: nobody → Brian Murray (brian-murray)
Changed in update-notifier (Ubuntu Hirsute):
assignee: nobody → Brian Murray (brian-murray)
Changed in update-notifier (Ubuntu Impish):
assignee: nobody → Brian Murray (brian-murray)
Changed in update-notifier (Ubuntu Focal):
importance: Undecided → High
Changed in update-notifier (Ubuntu Hirsute):
importance: Undecided → High
Changed in update-notifier (Ubuntu Impish):
importance: Undecided → High
Changed in update-notifier (Ubuntu Bionic):
status: New → Won't Fix