Race condition in lightdm greeter setup
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Light Display Manager |
Triaged
|
High
|
Unassigned |
Bug Description
Tested lightdm versions: 1.4.0-0ubuntu2 and 1.6.0-0ubuntu2 on Ubuntu 12.10
I come across a bug in lightdm that causes lightdm to die randomly when starting or after logout. It happens only on some machines and on some machines it never happens. My best guess here is a race condition.
During greeter setup "lightdm --session-child" is spawned twice. The first call to session_start() that does fork+execlp() is done from greeter_start() and right after that handle_login() calls session_stop() + session_start(). session_stop() sends SIGTERM to the child process, but if the child has not managed to call execlp() yet, it still has signal handler set for SIGTERM, because fork copies the parent's signal handlers to the child. Now when session_stop() sends SIGTERM to the child, it uses the signal handler of the parent which causes the signal go to signal_cb() that then signals the main lightdm process to die.
I managed to get rid of the problem by switching fork() -> vfork() in session_start() which blocks the parent before execlp() is run and signal handlers are cleared. This ensures that the signal sent from session_stop() always ends up only to the child.
The signal handling in lightdm looks quite tricky, so I'm not really sure if this is the best way to handle this. Maybe it would be possible to get rid of the first "lightdm --session-child" spawning to make all this unnecessary?
Relevant pieces of code:
The fork() is called here and the calling process has signal handlers set.
-----------> src/session.c <--------------
gboolean
session_start (Session *session, const gchar *service, const gchar *username, gboolean do_authenticate, gboolean is_interactive, gboolean is_guest)
{
...
/* Run the child */
session-
if (session->priv->pid < 0)
{
g_debug ("Failed to fork session child process: %s", strerror (errno));
return FALSE;
}
if (session->priv->pid == 0)
{
/* Run us again in session child mode */
execlp ("lightdm",
_exit (EXIT_FAILURE);
}
...
}
session_stop (Session *session)
{
g_return_
if (session->priv->pid > 0)
{
g_debug ("Session %d: Sending SIGTERM", session-
kill (session-
// FIXME: Handle timeout
}
}
-----------> src/session.c <--------------
session_start() is called first time from greeter_start()
-----------> src/greeter.c <--------------
gboolean
greeter_start (Greeter *greeter, const gchar *service, const gchar *username)
{
...
result = session_start (greeter-
...
}
-----------> src/greeter.c <--------------
Almost immediately after session_start() is called, reset_session() is callend which then calls session_stop(). After that session_start() is called again which spawns a new "lightdm --session-child" process.
-----------> src/greeter.c <--------------
static void
handle_login (Greeter *greeter, guint32 sequence_number, const gchar *username)
{
...
reset_session (greeter);
...
/* Run the session process */
session_start (greeter-
...
}
-----------> src/greeter.c <--------------
Because sometimes fork+execlp() has not finished before SIGTERM is sent, the SIGTERM ends up here instead of the callback in session-child.c:
-----------> src/process.c <--------------
signal_cb (int signum, siginfo_t *info, void *data)
{
/* NOTE: Using g_printerr as can't call g_warning from a signal callback */
if (write (signal_pipe[1], &info->si_signo, sizeof (int)) < 0 ||
write (signal_pipe[1], &info->si_pid, sizeof (pid_t)) < 0)
g_printerr ("Failed to write to signal pipe: %s", strerror (errno));
}
static gboolean
handle_signal (GIOChannel *source, GIOCondition condition, gpointer data)
{
int signo;
pid_t pid;
Process *process;
if (read (signal_pipe[0], &signo, sizeof (int)) < 0 ||
read (signal_pipe[0], &pid, sizeof (pid_t)) < 0)
{
g_warning ("Error reading from signal pipe: %s", strerror (errno));
return TRUE;
}
g_debug ("Got signal %d from process %d", signo, pid);
process = g_hash_table_lookup (processes, GINT_TO_POINTER (pid));
if (process == NULL)
process = process_get_current ();
if (process)
return TRUE;
}
-----------> src/process.c <--------------
Related branches
Changed in lightdm: | |
importance: | Undecided → High |
Changed in lightdm: | |
status: | Confirmed → Incomplete |
Not to comment on lightdm which I'm not familiar with, but for the benefit of doubt it'd be useful to know that this also happens after the just released different race condition fix in lightdm 1.6.0-0ubuntu2.1 and plymouth 0.8.8-0ubuntu6.1, bug #982889. That one happens on intel graphics.
The lightdm patch is at http:// bazaar. launchpad. net/~ubuntu- branches/ ubuntu/ raring/ lightdm/ raring- proposed/ revision/ 91 bazaar. launchpad. net/~tjaalton/ plymouth/ race-fix/ revision/ 1447
And plymouth patch seems to be this one http://