unity-gtk-module.service is racy; session services don't stop if session terminates

Bug #1618886 reported by Martin Pitt on 2016-08-31
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
gnome-session (Ubuntu)
Medium
Iain Lane
unity-gtk-module (Ubuntu)
High
Martin Pitt
upstart (Ubuntu)
Medium
Iain Lane

Bug Description

Sometimes on session start unity-gtk-module.service runs too late or something, and $GTK_MODULES does not include "unity". It is in "systemctl --user show-environment" but not in a terminal bash.

Related branches

Martin Pitt (pitti) on 2016-08-31
Changed in unity-gtk-module (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Martin Pitt (pitti)
importance: Medium → High
milestone: none → ubuntu-16.09

More than this unit, it seems related to the fact that it doesn't get deactivated on events such as Xorg being killed.
And this is not the only module with this issue. Also gnome-keyring-ssh.

In general any script which is on graphical-session doesn't get stopped. And in fact:

sudo killall -9 Xorg
marco@ubuntu-vmware:~:0$ systemctl --user status graphical-session.target
● graphical-session.target - Current graphical user session
   Loaded: loaded (/usr/lib/systemd/user/graphical-session.target; static; vendor preset: enabled)
   Active: active since mer 2016-08-31 13:48:21 CEST; 54s ago
     Docs: man:systemd.special(7)

ago 31 13:48:21 ubuntu-vmware systemd[6159]: Reached target Current graphical user session.

Martin Pitt (pitti) wrote :

Without the PPA this is because upstart runs unity and there is no synchronization between unity-gtk-module.service and unity7.conf.

With the PPA (i. e. converted services) this happens because unity-gtk-module.service stays running after killing X (systemctl restart lightdm) if there is another login. We need to restart it on login, or even better, ensure that it stops when X gets killed.

Iain Lane (laney) wrote :

To restart everything when the session starts, add

  systemctl --user stop graphical-session.target

to the systemd-graphical-session.conf and run-systemd-session, after the block where all failed units are reset.

For the unity-in-upstart case, we might want to not override unity-gtk-module? Maybe it then becomes racy with setting environment variables though.

Iain Lane (laney) wrote :

s/restart everything/stop everything from a previous graphical session/

If unity is set as required by ubuntu-session, then stopping unity7 causes the graphical session not to be stopped either...

Martin Pitt (pitti) on 2016-09-13
Changed in unity-gtk-module (Ubuntu):
status: Triaged → In Progress
Martin Pitt (pitti) wrote :

The unity-gtk-module problem is simple enough to fix (bring back the upstart job, and restrict the systemd unit to dbus+systemd to avoid calling initctl set-env twice). Opening a second gnome-session task to investigate the more general "services don't go down when the session stops" issue.

Changed in gnome-session (Ubuntu):
assignee: nobody → Martin Pitt (pitti)
status: New → Triaged
summary: - unity-gtk-module.service is racy
+ unity-gtk-module.service is racy; session services don't stop if session
+ terminates
Martin Pitt (pitti) on 2016-09-13
Changed in unity-gtk-module (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package unity-gtk-module - 0.0.0+16.10.20160913-0ubuntu1

---------------
unity-gtk-module (0.0.0+16.10.20160913-0ubuntu1) yakkety; urgency=medium

  * Fix $GTK_MODULES race condition (LP: #1618886)

 -- Martin Pitt <email address hidden> Tue, 13 Sep 2016 15:27:01 +0000

Changed in unity-gtk-module (Ubuntu):
status: Fix Committed → Fix Released
Martin Pitt (pitti) on 2016-09-16
Changed in gnome-session (Ubuntu):
status: Triaged → In Progress
milestone: none → ubuntu-16.09
Martin Pitt (pitti) wrote :

> sudo killall -9 Xorg

This causes some havoc -- as this does not actually stop anything in the session, random services just start failing the next time they try to talk to X. Due to our Restart=on-failure, these units then try to restart a few times until they hit the restart limit and go to "failed".

So this is rather rude -- just log out instead, or do "kill -9 -1" to kill your user processes.

That said, I think it's ok that services go to "failed" as long as the next login properly restarts them, which brings us to..

> In general any script which is on graphical-session doesn't get stopped.

Indeed this does kill ubuntu-session.target and its associated services like unity7.service. But we don't configure a requirement in the other direction, i. e. graphical-session.target does not stop when ubuntu-session.target stops.

/usr/lib/gnome-session/run-systemd-session has a loop to clean up failed units, but appparently that's not wide enough yet.

Martin Pitt (pitti) wrote :

Argh!

    if [ "$(systemctl --user show -p PartOf --value)" = "graphical-session.target" ]; then

The systemctl --user call is missing a $unit argument, so this loop never actually worked.

Martin Pitt (pitti) wrote :

The same "missing $unit" is in /usr/share/upstart/systemd-session/upstart/systemd-graphical-session.conf.

Changed in upstart (Ubuntu):
assignee: nobody → Martin Pitt (pitti)
status: New → In Progress
Martin Pitt (pitti) on 2016-09-16
Changed in upstart (Ubuntu):
status: In Progress → Fix Committed
Martin Pitt (pitti) on 2016-09-16
Changed in gnome-session (Ubuntu):
status: In Progress → Won't Fix
status: Won't Fix → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gnome-session - 3.20.2-1ubuntu3

---------------
gnome-session (3.20.2-1ubuntu3) yakkety; urgency=medium

  [ Dmitry Shachnev ]
  * Backport upstream patch to allow users to override $QT_QPA_PLATFORMTHEME
    (allow_overriding_qt_platformtheme.patch).

  [ Martin Pitt ]
  * debian/data/run-systemd-session: Add missing $unit to systemctl show call,
    so that the cleanup of failed units actually works. (LP: #1618886)

 -- Martin Pitt <email address hidden> Fri, 16 Sep 2016 15:27:56 +0200

Changed in gnome-session (Ubuntu):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package upstart - 1.13.2-0ubuntu31

---------------
upstart (1.13.2-0ubuntu31) yakkety; urgency=medium

  * init/tests/test_conf.c: Disable previous check harder (on powerpc).

upstart (1.13.2-0ubuntu30) yakkety; urgency=medium

  * init/tests/test_conf.c: Disable check in "Invalid .conf file does not stop
    ConfFile being serialised" test on powerpc, there is just no pleasing it
    and the numerous build retries are unnerving.

upstart (1.13.2-0ubuntu29) yakkety; urgency=medium

  * debian/systemd-graphical-session.conf: Drop workaround for killing
    gnome-keyring, this was fixed properly now.
  * debian/systemd-graphical-session.conf: Add missing $unit to systemctl show
    call, so that the cleanup of failed units actually works. (LP: #1618886)

 -- Martin Pitt <email address hidden> Mon, 19 Sep 2016 12:26:08 +0200

Changed in upstart (Ubuntu):
status: Fix Committed → Fix Released
Iain Lane (laney) wrote :

This isn't quite enough in this situation.

- Log in over SSH or in a vt (so there is a logind session alive for your user)
- Log into the same user in Unity
- Simulate an Xorg crash (pkill -f -SEGV Xorg)
- Log back in

You don't have GTK_MODULES set properly. The previous gnome-session is lingering.

To fix that, something like comment #3 is needed. Such as looping over all 'active' units that are PartOf graphical-session.target and stopping them all.

gnome-session needs a fix to its ExecStopPost to not kill the one we just started.

I'll upload this and it can be reviewed in the queue.

Changed in gnome-session (Ubuntu):
status: Fix Released → Triaged
importance: Undecided → High
Changed in upstart (Ubuntu):
status: Fix Released → Triaged
importance: Undecided → Medium
Changed in gnome-session (Ubuntu):
importance: High → Medium
assignee: Martin Pitt (pitti) → Iain Lane (laney)
Changed in upstart (Ubuntu):
assignee: Martin Pitt (pitti) → Iain Lane (laney)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gnome-session - 3.20.2-1ubuntu4

---------------
gnome-session (3.20.2-1ubuntu4) yakkety; urgency=medium

  * debian/data/run-systemd-session: Stop any lingering active units when
    logging in. (LP: #1618886)
  * debian/data/gnome-session.service: Save the XDG SESSION we were started
    with, and only use that one to end the session. Otherwise, if we are
    stopped by the above codepath, we risk stopping the session that we are
    logging into and not the previous (crashed) one.
  * debian/patches/50_ubuntu_sessions.patch: Set the desktop names to
    Unity:Unity7. "Unity" is shared between Unity 7 and Unity 8, but
    applications should be able to Only/NotShowIn either one of these if they
    want.

 -- Iain Lane <email address hidden> Wed, 28 Sep 2016 12:09:16 +0100

Changed in gnome-session (Ubuntu):
status: Triaged → Fix Released
Martin Pitt (pitti) wrote :

> To fix that, something like comment #3 is needed. Such as looping over all 'active' units that are PartOf graphical-session.target and stopping them all.

I think we would only need to "systemctl stop graphical-session.target" for this, otherwise a unit forgets the PartOf= and that loop would not work anyway.

> gnome-session needs a fix to its ExecStopPost to not kill the one we just started.

This is similar to waiting for "deactivating" units after *-session.target goes down. On the systemd sprint we just figured out a better scheme for this which solves that waiting, does not require this session ID tracking, and also gets rid of the manual starting of graphical-session-pre.target: Eventually we just want to declare in *-session.target that it comes After=graphical-session-pre.target and have this propagate to the dependencies (https://github.com/systemd/systemd/issues/3750). Until then we can just have a mini-generator do this for us. After we have the Requires/After=graphical-session-pre.target, then "systemctl stop graphical-session-pre.target" properly blocks until all "later" units are completely stopped.

On Sat, Oct 01, 2016 at 01:53:56PM -0000, Martin Pitt wrote:
> > To fix that, something like comment #3 is needed. Such as looping over
> all 'active' units that are PartOf graphical-session.target and stopping
> them all.
>
> I think we would only need to "systemctl stop graphical-session.target"
> for this, otherwise a unit forgets the PartOf= and that loop would not
> work anyway.

I tried this, but I think that it might have only been before I worked
out the below ExecStopPost issue - so it's worth testing again.

>
> > gnome-session needs a fix to its ExecStopPost to not kill the one we
> just started.
>
> This is similar to waiting for "deactivating" units after
> *-session.target goes down. On the systemd sprint we just figured out a
> better scheme for this which solves that waiting, does not require this
> session ID tracking, and also gets rid of the manual starting of
> graphical-session-pre.target: Eventually we just want to declare in
> *-session.target that it comes After=graphical-session-pre.target and
> have this propagate to the dependencies
> (https://github.com/systemd/systemd/issues/3750). Until then we can just
> have a mini-generator do this for us. After we have the Requires/After
> =graphical-session-pre.target, then "systemctl stop graphical-session-
> pre.target" properly blocks until all "later" units are completely
> stopped.

I remember this problem.

It would only solve this specific issue if it lets us get rid of the
stop or restart in the session-runner script. If you ever end up
stopping gnome-keyring from within a new session then its ExecStopPost
kills the upstart session of this new one that we are starting up, *not*
the previous one that it was started up under. That's a mismatch vs.
session and user specific semantics of upstart and systemd --user.

Hmm. Maybe this is saying that we should bind ubuntu-session.target to
something else - like unity7? You can't log in again with an active
unity7, so in theory (if stop is propagated down to graphical-session
and graphical-session-pre) there wouldn't be a need to stop/restart in
the script since everything would be stopped by definition if you're
trying to start unity7 again.

--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package upstart - 1.13.2-0ubuntu32

---------------
upstart (1.13.2-0ubuntu32) yakkety; urgency=medium

  * debian/systemd-graphical-session.conf: Stop any lingering active units
    when logging in. (LP: #1618886)
  * debian/control: Break old versions of gnome-session, which take the
    session down when stopped in the above way.

 -- Iain Lane <email address hidden> Tue, 27 Sep 2016 17:47:27 +0100

Changed in upstart (Ubuntu):
status: Triaged → Fix Released
Martin Pitt (pitti) wrote :

> If you ever end up stopping gnome-keyring from within a new session then its ExecStopPost kills the upstart session of this new one that we are starting up, *not* the previous one that it was started up under.

I assume you meant to say "gnome-session.service" here, not keyring (as its ExecStopPost is harmless). gnome-session is the session leader, so stopping it by definition means to end the current session.

I now cleaned up the transaction handling to avoid having to wait for "deactivating" units after *-session.target ends (both at the top of the script to clean up lingering sessions and the end to cleanly exit without prematurely stopping X).

Martin Pitt (pitti) wrote :

Iain Lane [2016-10-01 19:26 -0000]:
> Hmm. Maybe this is saying that we should bind ubuntu-session.target to
> something else - like unity7? You can't log in again with an active
> unity7, so in theory (if stop is propagated down to graphical-session
> and graphical-session-pre) there wouldn't be a need to stop/restart in
> the script since everything would be stopped by definition if you're
> trying to start unity7 again.

I like that idea in principle -- picking a session leader which
automatically dies when X goes away would be a lot simpler indeed (and
gnome-session does not do that). I picked gnome-session as the session
leader because it traditionally has had that role, and it still has
the API for logging out.

Another point is that we want to make unity/compiz robust against
crashes and provide auto-restart -- if it becomes the session leader,
then the whole session will go down on any crash. So picking
unity7.service in particular is a bad choice.

Of course over time we want to get rid of gnome-session -- we don't
need its service management any more, thus the only thing it provides
is that D-Bus API for logout/user switching etc. This can/should be
replaced with calls to stop units or calls to the DM, and then we
should actually be able to completely drop that concept of a "session
leader" (or rather, foo-session.target then *is* the leader, as we
really intend). But we aren't there yet, so until then I think
BindsTo=gnome-session.service is still conceptually correct IMHO.

Iain Lane (laney) wrote :

On Sun, Oct 02, 2016 at 08:39:01AM -0000, Martin Pitt wrote:
> > If you ever end up stopping gnome-keyring from within a new session
> then its ExecStopPost kills the upstart session of this new one that we
> are starting up, *not* the previous one that it was started up under.
>
> I assume you meant to say "gnome-session.service" here, not keyring (as
> its ExecStopPost is harmless). gnome-session is the session leader, so
> stopping it by definition means to end the current session.

Yes I did. But the problem is that it is *not* always stopped when the
session dies, so it is a bad choice of leader while that is true.

--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]

Iain Lane (laney) wrote :

On Sun, Oct 02, 2016 at 10:27:43AM -0000, Martin Pitt wrote:
> Iain Lane [2016-10-01 19:26 -0000]:
> > Hmm. Maybe this is saying that we should bind ubuntu-session.target to
> > something else - like unity7? You can't log in again with an active
> > unity7, so in theory (if stop is propagated down to graphical-session
> > and graphical-session-pre) there wouldn't be a need to stop/restart in
> > the script since everything would be stopped by definition if you're
> > trying to start unity7 again.
>
> I like that idea in principle -- picking a session leader which
> automatically dies when X goes away would be a lot simpler indeed (and
> gnome-session does not do that). I picked gnome-session as the session
> leader because it traditionally has had that role, and it still has
> the API for logging out.
>
> Another point is that we want to make unity/compiz robust against
> crashes and provide auto-restart -- if it becomes the session leader,
> then the whole session will go down on any crash. So picking
> unity7.service in particular is a bad choice.

It would be good if it were possible to say that the BindsTo shouldn't
trigger the target to stop until after systemd gives up restarting the
unit.

--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.