Comment 3 for bug 1297248

Tetsuo Handa (9-launchpad-i-love-sakura-ne-jp) wrote :

All the details are in Bug #1276705.

(1) Currently, finit_module() for the mptsas kernel module needs more
    than 30 seconds to initialize the LSI SAS1068E disk.

(2) Currently, systemd-udevd unconditionally sends SIGKILL after a
    hardcoded 30-second timeout. As a result, finit_module() for the
    mptsas kernel module receives SIGKILL while waiting for the error
    handler thread to be started.

(3) Before commit 786235ee was applied, finit_module() receiving SIGKILL
    was not a problem, because kthread_create() ignored SIGKILL while
    waiting for the error handler thread to be started. After commit
    786235ee, it is a problem, because kthread_create() no longer
    ignores SIGKILL during that wait. As a result, finit_module() for
    the mptsas kernel module fails to initialize the LSI SAS1068E disk,
    leading to a boot failure.
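
As a rough userspace analogy for point (2), the sketch below shows a
supervisor that sends SIGKILL to a worker once a fixed timeout expires,
no matter what the worker is doing. It is not systemd-udevd or kernel
code: run_with_kill_timeout(), the slow worker command, and the short
0.5-second timeout are all hypothetical, chosen only so that the example
runs quickly on a POSIX system.

```python
import signal
import subprocess
import sys


def run_with_kill_timeout(argv, timeout_s):
    """Run argv; send SIGKILL if it has not exited after timeout_s seconds.

    Returns the exit status as reported by subprocess: on POSIX, a
    negative value -N means the child was terminated by signal N.
    """
    proc = subprocess.Popen(argv)
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; the worker cannot catch or block it
        proc.wait()  # reap the child so returncode is populated
    return proc.returncode


# Hypothetical slow worker standing in for a long-running module load.
slow_worker = [sys.executable, "-c", "import time; time.sleep(5)"]
rc = run_with_kill_timeout(slow_worker, timeout_s=0.5)
killed = (rc == -signal.SIGKILL)
print("worker was killed by SIGKILL:", killed)
```

The point of the analogy is only that the supervisor's timeout is
unconditional: the worker gets no chance to finish or to clean up.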

Commit 786235ee was meant to help the OOM killer terminate the victim
process immediately when the victim cannot otherwise be terminated
because it is waiting for the kthreadd process to complete a memory
allocation.

Kernel developers consider this a systemd bug, because any thread that
receives SIGKILL has a right to terminate immediately. Therefore,
reverting commit 786235ee is not acceptable to kernel developers.

On the other hand, systemd developers consider this a kernel bug,
because finit_module() should return within 30 seconds. Therefore,
changing to a longer timeout is not acceptable to systemd developers.

Since there was no time to wait for systemd to allow a longer timeout,
Bug #1276705 used a SAUCE patch that allows kthread_create() to ignore
SIGKILL for up to 10 seconds. We used this SAUCE patch for Ubuntu 14.04,
but we don't want to carry it forever.
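
The three behaviours discussed above can be compared with a small
model. This is only a sketch of the decision logic, not the kernel
implementation: the Policy names and thread_create_succeeds() are
hypothetical, and it assumes that the 10 seconds of the SAUCE patch are
counted from the moment SIGKILL arrives, which this comment does not
state. The 35-second and 45-second start-up times are made-up inputs.

```python
from enum import Enum


class Policy(Enum):
    IGNORE_SIGKILL = "before commit 786235ee"
    ABORT_ON_SIGKILL = "after commit 786235ee"
    GRACE_AFTER_SIGKILL = "SAUCE patch (assumed semantics)"


def thread_create_succeeds(policy, ready_at, sigkill_at, grace=10.0):
    """Model whether the waiter is still waiting when the thread starts.

    ready_at:   seconds until the new thread has been started
    sigkill_at: seconds until SIGKILL is delivered to the waiter
    grace:      extra seconds the waiter keeps waiting after SIGKILL
                (assumption: measured from SIGKILL arrival)
    """
    if ready_at <= sigkill_at:
        return True  # thread started before SIGKILL arrived
    if policy is Policy.IGNORE_SIGKILL:
        return True  # waiter keeps waiting regardless of SIGKILL
    if policy is Policy.ABORT_ON_SIGKILL:
        return False  # waiter gives up as soon as SIGKILL arrives
    return ready_at <= sigkill_at + grace


# SIGKILL after the 30-second timeout; thread ready at 35 s or 45 s.
for policy in Policy:
    print(policy.value,
          thread_create_succeeds(policy, ready_at=35, sigkill_at=30),
          thread_create_succeeds(policy, ready_at=45, sigkill_at=30))
```

Under these assumptions the grace period only helps when the thread
starts within 10 seconds of the SIGKILL; a slower start still fails.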