lightdm 1.18.2 breaks D-Bus with second sessions

Bug #1599478 reported by Yves-Alexis Perez
This bug affects 1 person
Affects                 Status         Importance   Assigned to   Milestone
Light Display Manager   Fix Released   High         Unassigned
lightdm (Debian)        Fix Released   Unknown

Bug Description

Hi,

after upgrading LightDM to 1.18.2 in Debian, I had multiple reports (confirmed by me) that it broke every login after the first one.

When someone logs in, then logs out, all subsequent logins have no D-Bus session running, which usually breaks the session completely.

The downstream bug report is at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=829557; don't hesitate to ask for more information.

I have to admit this bug puzzles me a little. I don't see anything obvious in the diff between the two versions, so any help is appreciated.

Changed in lightdm (Debian):
status: Unknown → Confirmed
Robert Ancell (robert-ancell) wrote :

Also I can't see anything related that has changed...

Some ideas to try:
- Try rebuilding 1.18.1 and see if it has any issues (in case the way 1.18.2 was built has changed something)
- Try lp:lightdm/1.18 in case the few fixes on that make a difference.
- Does lp:lightdm also cause the same problem?

Yves-Alexis Perez (corsac) wrote :

I've just rebuilt 1.18.1 and it behaves the same as before (it works fine). I'll try the lp branches, but I'm not sure how to do that without messing with package-installed files too much.

Yves-Alexis Perez (corsac) wrote :

I've added some investigation to the downstream bug report. It seems that dbus-daemon dies early at the second login with lightdm 1.18.2. Still no clue why.

Yves-Alexis Perez (corsac) wrote :

So, a bit more investigation: it seems that dbus-daemon dies because epoll_ctl returns EINVAL, which means epfd is not an epoll file descriptor. Not too sure why.

Also, it seems that installing lightdm 1.19.2 from Ubuntu fixes the problem for one user.
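
For reference, the EINVAL condition described above can be triggered on purpose. The following is only a minimal sketch, assuming Linux and Python 3; it does not reproduce the dbus-daemon failure itself. The function name and the use of a pipe as the "wrong" descriptor are illustrative assumptions, not something taken from dbus or lightdm.

```python
import errno
import os
import select


def epoll_ctl_on_non_epoll_fd():
    """Call epoll_ctl() with an epfd that is really a pipe; return the errno name."""
    fake_epfd, w1 = os.pipe()  # stands in for an fd that is not an epoll instance
    target, w2 = os.pipe()     # a pollable fd we try to register
    # fromfd() wraps the given fd number without checking what it refers to.
    ep = select.epoll.fromfd(fake_epfd)
    try:
        # Equivalent to epoll_ctl(fake_epfd, EPOLL_CTL_ADD, target, ...)
        ep.register(target, select.EPOLLIN)
        return None
    except OSError as exc:
        return errno.errorcode[exc.errno]
    finally:
        ep.close()             # this also closes fake_epfd
        for fd in (w1, target, w2):
            os.close(fd)


if __name__ == "__main__":
    print(epoll_ctl_on_non_epoll_fd())
```

A program can end up in this state if the number it believes to be its epoll descriptor has been closed and reused by something else in the meantime.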

Yves-Alexis Perez (corsac) wrote :

With some help from git-remote-bzr I did a git bisect, and it seems the first bad commit is:

https://bazaar.launchpad.net/~lightdm-team/lightdm/1.18/revision/2319

(refactor GreeterSession…)

Yves-Alexis Perez (corsac) wrote : Re: [Pkg-xfce-devel] Bug#829557: Bug#829557: firefox: error box at start-up / D-BUS related issue

[adding Robert and the launchpad bug to CC]

On Mon, 2016-07-11 at 13:03 +0200, Yves-Alexis Perez wrote:
> On Mon, 2016-07-11 at 11:46 +0200, Yves-Alexis Perez wrote:
> > > I wonder whether there are other reasons why epoll_ctl can report
> > > EINVAL?
> >
> > The syscall source code is at
> > http://lxr.free-electrons.com/source/fs/eventpoll.c#L1849 and it seems
> > EINVAL is used as a default error case at various places, so maybe.
> > >
> > > I also wonder whether the new lightdm is starting dbus-launch with a
> > > different value for some arbitrary kernel limit, or whether your
> > > previous session leaked some fds resulting in dbus-launch coming up
> > > with 90% of an arbitrary limit already in use, or something like that?
> >
> > For what it's worth, after closing the first session there's no process
> > running under my uid. I'll try to check the limits in 75dbus to see if
> > they differ.
>
> Some more investigation: I've done a bisect in lightdm and the offending
> commit is
> https://bazaar.launchpad.net/~lightdm-team/lightdm/1.18/revision/2319
> which is a somewhat large refactoring. I haven't yet identified what could
> be the problem there (but reported that upstream as well).
>
> I've checked the currently open file descriptors when starting the
> session (I've added an ls -l /proc/self/fd in 75dbus) and here are the
> results (don't bother about the PIDs, the “first” login was after a lightdm
> restart after the “second” login).
>
> For the first login:
>
> + ls -l /proc/self/fd
> total 0
> lr-x------ 1 corsac corsac 64 Jul 11 12:56 0 -> /dev/null
> l-wx------ 1 corsac corsac 64 Jul 11 12:56 1 -> /home/corsac/.xsession-errors
> l-wx------ 1 corsac corsac 64 Jul 11 12:56 2 -> /home/corsac/.xsession-errors
> lr-x------ 1 corsac corsac 64 Jul 11 12:56 3 -> /proc/30014/fd
>
> For the second:
>
> + ls -l /proc/self/fd
> total 0
> lr-x------ 1 corsac corsac 64 Jul 11 12:56 0 -> /proc/29846/fd
> l-wx------ 1 corsac corsac 64 Jul 11 12:56 1 -> /home/corsac/.xsession-errors
> l-wx------ 1 corsac corsac 64 Jul 11 12:56 2 -> /home/corsac/.xsession-errors
>
> So it seems stdin is closed for the second login. Could it break
> dbus-launch/dbus-daemon somehow?
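
For anyone who wants to run the same check from inside a program rather than from a session script, here is a rough Python equivalent of the ls -l /proc/self/fd call quoted above. It is a sketch that assumes Linux with /proc mounted; the function name list_open_fds is made up for illustration.

```python
import os


def list_open_fds():
    """Return {fd: target} for this process, like `ls -l /proc/self/fd`."""
    fds = {}
    for name in os.listdir("/proc/self/fd"):
        try:
            fds[int(name)] = os.readlink(f"/proc/self/fd/{name}")
        except OSError:
            # The descriptor listdir() used to read the directory is
            # already closed by now, so its link can no longer be resolved.
            continue
    return fds


if __name__ == "__main__":
    for fd, target in sorted(list_open_fds().items()):
        print(fd, "->", target)
```

In the quoted listings, the entry pointing at /proc/<pid>/fd is presumably the descriptor ls itself opened to read the directory. In the second login it got number 0, which is consistent with stdin being closed: the lowest free number was reused.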

Looking at the 2319 revision in lightdm it seems that greeter_start() closes
the two file descriptors and that looks spurious. Digging a little bit more
reveals https://bazaar.launchpad.net/~lightdm-team/lightdm/1.18/revision/2327
which seems to fix the problem indeed.

I'm uploading a package including the patch asap.

Robert, you might want to release a 1.18.3 soon, I guess we're not the only
ones impacted.

Simon, I'm not sure if it's something worth investigating/fixing in dbus, so
I'll let you handle from there, I guess.

Thanks everyone for the help.

Regards,
--
Yves-Alexis

Simon McVittie (smcv) wrote :

Control: clone 829557 -2
Control: severity -2 important
Control: reassign -2 dbus
Control: retitle -2 dbus-launch/dbus-daemon behave badly if stdin is closed?

On Mon, 11 Jul 2016 at 13:23:38 +0200, Yves-Alexis Perez wrote:
> > So it seems stdin is closed for the second login. Could it break
> > dbus-launch/dbus-daemon somehow?

Probably. dbus-launch polls its stdin under certain circumstances (which
is silly, but it's backwards-compatibility with code from long before I
got involved), and dbus-daemon might close it or end up with fds that it
opens unexpectedly becoming stdin.
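
The "fds that it opens unexpectedly becoming stdin" effect can be shown in isolation. This is a minimal sketch, assuming a POSIX system with fork() and Python 3; it is not dbus code, and the function name is hypothetical. It relies only on the POSIX rule that open() returns the lowest-numbered unused descriptor.

```python
import os


def fd_opened_after_closing_stdin():
    """Fork a child, close fd 0 in it, open a file, and report the fd it got."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: simulate a process that was started with stdin closed.
        try:
            os.close(r)
            try:
                os.close(0)
            except OSError:
                pass  # fd 0 was already closed
            fd = os.open(os.devnull, os.O_RDONLY)
            os.write(w, str(fd).encode())
        finally:
            os._exit(0)
    # Parent: read the number the child reported.
    os.close(w)
    data = os.read(r, 16)
    os.close(r)
    os.waitpid(pid, 0)
    return int(data)


if __name__ == "__main__":
    print(fd_opened_after_closing_stdin())
```

Any code in such a process that later treats fd 0 as "stdin" (polls it, closes it, redirects it) is then operating on an unrelated file, socket or epoll instance.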

Hopefully the fact that you have discovered this will make it easier for
me to make a smaller-than-lightdm reproducer.

> Simon, I'm not sure if it's something worth investigating/fixing in dbus, so
> I'll let you handle from there, I guess.

dbus-daemon is designed to be launchable from X11 "autolaunching" in
arbitrarily precarious circumstances, so yes this is a bug, and I'm
opening a clone to track that. Thank you for making it non-RC so I can
carry on with my holiday :-)

    S

Changed in lightdm (Debian):
status: Confirmed → Fix Released
Bruce Fowler (brf) wrote :

Ctrl-Alt-Backspace cleans this up, if you need a temporary workaround.

Changed in lightdm:
status: New → Fix Released
milestone: none → 1.18.3
importance: Undecided → Critical
importance: Critical → High