Hard lockup in 2 CPUs due to deadlock in cpu_stoppers

Bug #1821259 reported by Mauricio Faria de Oliveira on 2019-03-21

                 Status         Importance   Assigned to
linux (Ubuntu)
  Xenial         Fix Released   Undecided    Unassigned
  Bionic         Fix Released   Undecided    Unassigned

Bug Description

[Impact]

* This problem hard locks up 2 CPUs in a deadlock, which in
  turn soft locks up other CPUs; the system becomes unusable.

* This is relatively rare / difficult to hit because it is a
  corner case in scheduling/load balancing that must line up
  in time with the CPU stopper code, and it requires an SMP
  plus _NUMA_ system. (It can, however, be hit with the
  synthetic test case attached to this LP bug.)

* Since SMP plus NUMA usually means _servers_, it is worth
  fixing this bug to prevent hard lockups and the resulting
  reboots.

* The fix resolves the potential deadlock by moving one of
  the calls required for the deadlock out from under the
  locked code.

[Test Case]

* There's a synthetic test case to reproduce this problem
(although without the stack traces - just a system hang)
attached to this LP bug.

* It uses kprobes/mdelay/cpu stopper calls to force the code
to execute and force the timing/locking condition to occur.

* $ sudo insmod kmod-stopper.ko

Some dmesg logging occurs, and the system either hangs or not.
See examples in comments.

[Regression Potential]

* These are patches to the cpu stop_machine.c code, and they
  change a bit how it works; however, no further upstream
  fixes to these patches exist, and they are still at the top
  of the 'git log --oneline -- kernel/stop_machine.c' output.

* These patches have been verified with the synthetic test case
  and 'stress-ng --class scheduler --sequential 0' (no regressions)
  on a guest with 2 CPUs and on a physical system with 24 CPUs.

[Other Info]

* The patches are required on Xenial and later.
* There are 4 patches for Xenial, and 2 patches pending for Bionic.
* All patches are applied from Cosmic onwards.

[Original Description]

These 2 hard lockups appeared suddenly in the logs, and many soft lockups followed as fallout.

    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.477086] NMI watchdog: Watchdog detected hard LOCKUP on cpu 10
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.483800] Modules linked in: <...>
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484066] CPU: 10 PID: 58 Comm: migration/10 Not tainted 4.4.0-116-generic #140~14.04.1-Ubuntu
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484068] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 02/17/2017
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484070] task: ffff883ff2a76200 ti: ffff883ff2110000 task.ti: ffff883ff2110000
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484071] RIP: 0010:[<ffffffff810c8cb0>] [<ffffffff810c8cb0>] native_queued_spin_lock_slowpath+0x160/0x170
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484079] RSP: 0000:ffff883ff2113c58 EFLAGS: 00000002
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484080] RAX: 0000000000000101 RBX: 0000000000000086 RCX: 0000000000000001
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484081] RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffff881fff991ba8
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484083] RBP: ffff883ff2113c58 R08: 0000000000000101 R09: ffff883ff082e200
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484084] R10: 0000000000002e04 R11: 0000000000002e04 R12: ffff881fff997c60
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484085] R13: ffff881fff991ba8 R14: 0000000000000000 R15: ffff881fff997300
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484087] FS: 0000000000000000(0000) GS:ffff883fff000000(0000) knlGS:0000000000000000
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484088] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484090] CR2: 00007f7caaa23020 CR3: 0000001f46740000 CR4: 0000000000160670
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484091] Stack:
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484092] ffff883ff2113c68 ffffffff811870eb ffff883ff2113c80 ffffffff81819907
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484094] ffff881fff991ba0 ffff883ff2113cb0 ffffffff8111c600 ffff881fff997300
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484096] ffff881fff997c90 ffff881ff03dd400 0000000000000000 ffff883ff2113cc0
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484098] Call Trace:
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484105] [<ffffffff811870eb>] queued_spin_lock_slowpath+0xb/0xf
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484109] [<ffffffff81819907>] _raw_spin_lock_irqsave+0x37/0x40
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484113] [<ffffffff8111c600>] cpu_stop_queue_work+0x30/0x80
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484116] [<ffffffff8111ccd0>] stop_one_cpu_nowait+0x30/0x40
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484119] [<ffffffff810bbb5b>] load_balance+0x71b/0x940
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484122] [<ffffffff810bbff5>] pick_next_task_fair+0x275/0x4b0
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484126] [<ffffffff81816166>] __schedule+0x6c6/0x7f0
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484132] [<ffffffff810a2560>] ? sort_range+0x30/0x30
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484134] [<ffffffff818162c5>] schedule+0x35/0x80
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484136] [<ffffffff810a262d>] smpboot_thread_fn+0xcd/0x180
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484139] [<ffffffff8109f138>] kthread+0xd8/0xf0
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484141] [<ffffffff8109f060>] ? kthread_park+0x60/0x60
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484143] [<ffffffff81819ff5>] ret_from_fork+0x55/0x80
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603802.484144] [<ffffffff8109f060>] ? kthread_park+0x60/0x60

    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.644471] NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651086] Modules linked in: <...>
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651342] CPU: 6 PID: 204932 Comm: ceph-osd Not tainted 4.4.0-116-generic #140~14.04.1-Ubuntu
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651344] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 02/17/2017
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651345] task: ffff881ff03dd400 ti: ffff883cda77c000 task.ti: ffff883cda77c000
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651347] RIP: 0010:[<ffffffff810aacb6>] [<ffffffff810aacb6>] try_to_wake_up+0x86/0x3f0
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651353] RSP: 0000:ffff883cda77fa78 EFLAGS: 00000002
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651354] RAX: 0000000000000001 RBX: ffff883ff2a76200 RCX: 0000000000000000
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651355] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff883ff2a768d4
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651356] RBP: ffff883cda77fab8 R08: 000000000000000a R09: ffff881ff03dd400
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651357] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000017300
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651359] R13: ffff883ff2a768d4 R14: 0000000000000046 R15: 0000000000000000
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651360] FS: 00007ff8ecbc9700(0000) GS:ffff881fff980000(0000) knlGS:0000000000000000
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651362] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651363] CR2: 0000000014583550 CR3: 0000003d4ac96000 CR4: 0000000000160670
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651364] Stack:
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651365] 0000000000000202 ffff883cda77fa98 0000000000000003 0000000000000006
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651368] 000000000000000a ffff883cda77fb70 ffff883fff011ba0 ffff881fff991ba0
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651370] ffff883cda77fac8 ffffffff810ab035 ffff883cda77fbc8 ffffffff8111cc22
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651372] Call Trace:
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651375] [<ffffffff810ab035>] wake_up_process+0x15/0x20
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651379] [<ffffffff8111cc22>] stop_two_cpus+0x1b2/0x230
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651382] [<ffffffff8111c650>] ? cpu_stop_queue_work+0x80/0x80
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651384] [<ffffffff810b5d15>] ? dequeue_entity+0x455/0x8a0
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651386] [<ffffffff8111c650>] ? cpu_stop_queue_work+0x80/0x80
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651388] [<ffffffff810aaa70>] ? __migrate_swap_task.part.83+0x80/0x80
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651390] [<ffffffff810ab18e>] migrate_swap+0xae/0x130
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651392] [<ffffffff810b4e44>] task_numa_migrate+0x504/0x930
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651394] [<ffffffff810b52e9>] numa_migrate_preferred+0x79/0x80
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651396] [<ffffffff810b9373>] task_numa_fault+0x923/0xcd0
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651400] [<ffffffff8175e407>] ? tcp_recvmsg+0x6b7/0xbd0
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651404] [<ffffffff811da9be>] ? mpol_misplaced+0x14e/0x190
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651408] [<ffffffff811b7836>] handle_pte_fault+0x5a6/0x1440
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651411] [<ffffffff816f6693>] ? sock_recvmsg+0x43/0x50
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651413] [<ffffffff811b9540>] handle_mm_fault+0x250/0x540
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651417] [<ffffffff81069e1a>] __do_page_fault+0x19a/0x430
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651419] [<ffffffff8106a0d2>] do_page_fault+0x22/0x30
    Nov 23 15:48:33 SYSTEM_NAME kernel: [4603805.651423] [<ffffffff8181c5a8>] page_fault+0x28/0x30

Mauricio Faria de Oliveira (mfo) wrote on 2019-03-21:

Analysis
--------

The 1st hard lockup is harder to get the interesting data out of, as apparently the registers with variables
related to the cpu number have been clobbered by more recent calls in the spinlock path.

Looking at the 2nd hard lockup:

addr2line + code shows us that try_to_wake_up() in line 1997 is indeed looping with IRQs disabled in line 1939 (thus a hard lockup):

$ addr2line -pifae ddeb-116.140/usr/lib/debug/boot/vmlinux-4.4.0-116-generic 0xffffffff810aacb6
0xffffffff810aacb6: try_to_wake_up at /build/linux-lts-xenial-ozsla7/linux-lts-xenial-4.4.0/kernel/sched/core.c:1997

    1926 static int
    1927 try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
    1928 {
    ...
    1939         raw_spin_lock_irqsave(&p->pi_lock, flags);
    ...
    1993         /*
    1994          * If the owning (remote) cpu is still in the middle of schedule() with
    1995          * this task as prev, wait until its done referencing the task.
    1996          */
    1997         while (p->on_cpu)
    1998                 cpu_relax();
    ...
    2027         raw_spin_unlock_irqrestore(&p->pi_lock, flags);
    2028
    2029         return success;
    2030 }

The objdump disassembly of try_to_wake_up() in vmlinux for the RIP instruction address (ffffffff810aacb6),
shows a while loop that just checks for non-zero 'p->on_cpu' and calls cpu_relax() (which translates to the 'pause' instruction):

    ffffffff810aacb1: f3 90 pause
    ffffffff810aacb3: 8b 43 28 mov 0x28(%rbx),%eax
    ffffffff810aacb6: 85 c0 test %eax,%eax
    ffffffff810aacb8: 75 f7 jne ffffffff810aacb1 <try_to_wake_up+0x81>

So, it checks for the value in pointer in RBX + offset 0x28, which according to the 'pahole' tool, is indeed the 'on_cpu' field:

$ pahole --hex -C task_struct ddeb-116.140/usr/lib/debug/boot/vmlinux-4.4.0-116-generic | grep on_cpu
int on_cpu; /* 0x28 0x4 */

So, the task_struct pointer is in RBX, which is:

RBX: ffff883ff2a76200

And that matches the other hard locked up task on CPU 10 (see its 'task:' field).
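
The cross-checks in this analysis can be repeated mechanically. The following is a minimal Python sketch, assuming the lines below are copied verbatim from the logs and tool output quoted in this bug; cross_check() and hexint() are hypothetical helper names, not part of any tool used here.

```python
import re

# Lines copied from the logs and tool output quoted in this bug report.
CPU10_TASK = "task: ffff883ff2a76200 ti: ffff883ff2110000 task.ti: ffff883ff2110000"
CPU6_RIP = "RIP: 0010:[<ffffffff810aacb6>] [<ffffffff810aacb6>] try_to_wake_up+0x86/0x3f0"
CPU6_RBX = "RAX: 0000000000000001 RBX: ffff883ff2a76200 RCX: 0000000000000000"
DISASM_MOV = "ffffffff810aacb3: 8b 43 28 mov 0x28(%rbx),%eax"
DISASM_JNE = "ffffffff810aacb8: 75 f7 jne ffffffff810aacb1 <try_to_wake_up+0x81>"
PAHOLE = "int on_cpu; /* 0x28 0x4 */"


def hexint(text):
    """Parse a hexadecimal string without a 0x prefix."""
    return int(text, 16)


def cross_check():
    """Return named consistency checks over the quoted data (hypothetical helper)."""
    rip = re.search(r"\[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/", CPU6_RIP)
    jne = re.search(r"jne ([0-9a-f]+) <(\w+)\+0x([0-9a-f]+)>", DISASM_JNE)
    mov = re.search(r"mov 0x([0-9a-f]+)\(%rbx\)", DISASM_MOV)
    field = re.search(r"on_cpu;\s*/\*\s*0x([0-9a-f]+)", PAHOLE)
    rbx = re.search(r"RBX: ([0-9a-f]+)", CPU6_RBX)
    task = re.search(r"task: ([0-9a-f]+)", CPU10_TASK)

    # Function start address = instruction address - offset into the function.
    base_from_rip = hexint(rip.group(1)) - hexint(rip.group(3))
    base_from_jne = hexint(jne.group(1)) - hexint(jne.group(3))

    return {
        "RIP and loop target are in the same function":
            rip.group(2) == jne.group(2) and base_from_rip == base_from_jne,
        "mov displacement equals the on_cpu offset":
            hexint(mov.group(1)) == hexint(field.group(1)),
        "CPU 6 RBX equals the CPU 10 task pointer":
            rbx.group(1) == task.group(1),
    }


if __name__ == "__main__":
    for name, ok in cross_check().items():
        print(("ok  " if ok else "FAIL"), name)
```

All three checks hold for the values quoted here, which is what ties the spinning CPU 6 to the migration/10 task on CPU 10.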

Per the stack trace on CPU 10, the identical syslog timestamp of the two hard lockup messages (their kernel timestamps are about 3 seconds apart), and the fact that both stack traces are cpu_stopper related,
it does look like CPU 10 is waiting on the spinlock of one of the 2 cpu stoppers held by CPU 6, which is exactly the scenario in the suggested patch.

The problem/fix has been verified with a synthetic test-case (attached).

commit 0b26351b910fb8fe6a056f8a1bbccabe50c0e19f
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Fri Apr 20 11:50:05 2018 +0200

    stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
    
    Matt reported the following deadlock:
    
    CPU0                                    CPU1
    
    schedule(.prev=migrate/0)               <fault>
      pick_next_task()                        ...
        idle_balance()                          migrate_swap()
          active_balance()                        stop_two_cpus()
                                                    spin_lock(stopper0->lock)
                                                    spin_lock(stopper1->lock)
                                                    ttwu(migrate/0)
                                                      smp_cond_load_acquire() -- waits for schedule()
            stop_one_cpu(1)
              spin_lock(stopper1->lock) -- waits for stopper lock
    
    Fix this deadlock by taking the wakeups out from under stopper->lock.
    This allows the active_balance() to queue the stop work and finish the
    context switch, which in turn allows the wakeup from migrate_swap() to
    observe the context and complete the wakeup.
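
The deadlock and the fix described in this commit message can be illustrated with a wait-for graph. This is a minimal Python sketch, not kernel code: it assumes a simplified model in which CPU0 "holds" the on_cpu state of migrate/0 until its context switch finishes, and find_deadlock() and the resource names are hypothetical.

```python
def find_deadlock(holds, waits_for):
    """Return the CPUs forming a wait-for cycle, or [] if there is none.

    holds:     dict mapping a CPU to the set of resources it currently holds
    waits_for: dict mapping a CPU to the single resource it is blocked on
    """
    owner = {res: cpu for cpu, resources in holds.items() for res in resources}
    for start in waits_for:
        path, cpu = [], start
        # Follow "blocked on a resource owned by" edges until they end or repeat.
        while cpu is not None and cpu not in path:
            path.append(cpu)
            cpu = owner.get(waits_for.get(cpu))
        if cpu is not None:
            return path[path.index(cpu):]
    return []


# Before the fix: CPU1 wakes migrate/0 while still holding both stopper locks.
BEFORE_HOLDS = {
    "CPU0": {"migrate/0 on_cpu"},
    "CPU1": {"stopper0->lock", "stopper1->lock"},
}
BEFORE_WAITS = {
    "CPU0": "stopper1->lock",    # stop_one_cpu(1) from active_balance()
    "CPU1": "migrate/0 on_cpu",  # ttwu(migrate/0) waits for schedule()
}

# After the fix: the wakeup happens after the stopper locks are dropped,
# so CPU0 can take stopper1->lock and finish its context switch.
AFTER_HOLDS = {"CPU0": {"migrate/0 on_cpu"}, "CPU1": set()}
AFTER_WAITS = {"CPU1": "migrate/0 on_cpu"}

if __name__ == "__main__":
    print("before fix:", find_deadlock(BEFORE_HOLDS, BEFORE_WAITS))
    print("after fix: ", find_deadlock(AFTER_HOLDS, AFTER_WAITS))
```

In this model the cycle exists only while the wakeup is issued under stopper->lock; once the wakeup is moved out, CPU1 still waits for CPU0, but CPU0 no longer waits for CPU1.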
<...>

The stop_two_cpus() call can only happen in a NUMA system, per its caller chain:
  stop_two_cpus() <- migrate_swap() <- task_numa_migrate() <- numa_migrate_preferred() <- [task_numa_placement()] <- task_numa_fault()
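
Most of this chain can be read directly from the CPU 6 call trace in the original description. A minimal Python sketch, assuming the trace lines are copied from that log with the syslog prefix removed; reliable_frames() is a hypothetical helper. Frames marked '?' are addresses found on the stack that the unwinder could not confirm as part of the call chain, so they are skipped.

```python
import re

# First frames of the CPU 6 call trace, copied from the log in this bug.
CPU6_TRACE = """\
[<ffffffff810ab035>] wake_up_process+0x15/0x20
[<ffffffff8111cc22>] stop_two_cpus+0x1b2/0x230
[<ffffffff8111c650>] ? cpu_stop_queue_work+0x80/0x80
[<ffffffff810b5d15>] ? dequeue_entity+0x455/0x8a0
[<ffffffff8111c650>] ? cpu_stop_queue_work+0x80/0x80
[<ffffffff810aaa70>] ? __migrate_swap_task.part.83+0x80/0x80
[<ffffffff810ab18e>] migrate_swap+0xae/0x130
[<ffffffff810b4e44>] task_numa_migrate+0x504/0x930
[<ffffffff810b52e9>] numa_migrate_preferred+0x79/0x80
[<ffffffff810b9373>] task_numa_fault+0x923/0xcd0
"""

# Address, optional '?' marker, then symbol+offset.
FRAME = re.compile(r"\[<[0-9a-f]+>\] (\?)?\s*([\w.]+)\+0x")


def reliable_frames(trace):
    """Return the symbol names of frames not marked with '?' (innermost first)."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.search(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


if __name__ == "__main__":
    print(" <- ".join(reliable_frames(CPU6_TRACE)))
```

task_numa_placement() does not appear in the trace, which is consistent with it being shown in brackets in the chain above.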

Mauricio Faria de Oliveira (mfo) wrote on 2019-03-21:

kmod-stopper.c (8.0 KiB, text/x-csrc)

Test-case (kmod-stopper.c)
---------

$ sudo apt-get -y install gcc make libelf-dev linux-headers-$(uname -r)

$ touch Makefile # fake it, and use this make line:
$ make -C /lib/modules/$(uname -r)/build M=$(pwd) obj-m=kmod-stopper.o modules

$ echo 9 | sudo tee /proc/sys/kernel/printk

$ sudo insmod kmod-stopper.ko
<watch console for messages>
<it either hangs / finishes>

$ sudo rmmod kmod-stopper

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote on 2019-03-21: Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1821259

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status:	New → Incomplete
tags:	added: xenial

Mauricio Faria de Oliveira (mfo) wrote on 2019-03-21:

Test-case on Xenial:

$ ls -1d /sys/devices/system/cpu/cpu[0-9]*
/sys/devices/system/cpu/cpu0
/sys/devices/system/cpu/cpu1

Original
--------

$ uname -rv
4.4.0-144-generic #170-Ubuntu SMP Thu Mar 14 11:56:20 UTC 2019

$ sudo insmod kmod-stopper/kmod-stopper.ko
[ 74.198379] mod_init() :: this cpu = 0x1, that cpu = 0x0
[ 74.199613] mod_init() :: that_cpu_stopper_task = ffff88003d80e600, comm = migration/0
[ 74.206194] kp2/stop_two_cpus() :: this cpu = 0x1, that cpu = 0x0
[ 74.206196] do_nothing() :: this cpu = 0x0, that cpu = 0x1
[ 74.206201] kp1/pick_next_task_fair() :: this cpu = 0x0, that cpu = 0x1
[ 74.206203] kp1/pick_next_task_fair() :: before sleep (1000 msecs)
[ 74.212759] kp2/stop_two_cpus() :: before sleep (500 msecs)
[ 74.710138] kp2/stop_two_cpus() :: after sleep (500 msecs)
[ 75.198324] kp1/pick_next_task_fair() :: after sleep (1000 msecs)
[ 75.199814] kp1/pick_next_task_fair() :: stopping other cpu...
<hang>

The test-case only failed 2 out of 50+ tests.

Patched:
-------

$ uname -rv
4.4.0-144-generic #170+test20190320b1 SMP Wed Mar 20 18:35:06 UTC 2019

$ sudo insmod kmod-stopper/kmod-stopper.ko
[ 85.958527] mod_init() :: this cpu = 0x1, that cpu = 0x0
[ 85.965876] mod_init() :: that_cpu_stopper_task = ffff88003d80e600, comm = migration/0
[ 85.993446] kp2/stop_two_cpus() :: this cpu = 0x1, that cpu = 0x0
[ 85.993471] do_nothing() :: this cpu = 0x0, that cpu = 0x1
[ 85.993477] kp1/pick_next_task_fair() :: this cpu = 0x0, that cpu = 0x1
[ 85.993480] kp1/pick_next_task_fair() :: before sleep (1000 msecs)
[ 86.019469] kp2/stop_two_cpus() :: before sleep (500 msecs)
[ 86.521688] kp2/stop_two_cpus() :: after sleep (500 msecs)
[ 86.987662] kp1/pick_next_task_fair() :: after sleep (1000 msecs)
[ 86.989427] kp1/pick_next_task_fair() :: stopping other cpu...
[ 86.991109] do_nothing() :: this cpu = 0x1, that cpu = 0x0
[ 86.992615] do_nothing() :: this cpu = 0x1, that cpu = 0x0
<finished>

It passes every time (50+ tests).
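
When repeating the test many times, the two outcomes above can be told apart from the captured console output. A minimal Python sketch, assuming the message texts shown in the logs above and a capture taken long enough after insmod for the run to finish; classify_run() is a hypothetical helper, not part of kmod-stopper.

```python
MARKER = "stopping other cpu..."


def classify_run(dmesg_lines):
    """Classify one test run as 'finished', 'hang', or 'incomplete'.

    A run counts as finished only if a do_nothing() message appears after
    the 'stopping other cpu...' message; earlier do_nothing() lines are ignored.
    """
    for index, line in enumerate(dmesg_lines):
        if MARKER in line:
            tail = dmesg_lines[index + 1:]
            if any("do_nothing()" in later for later in tail):
                return "finished"
            return "hang"
    return "incomplete"


# Excerpts copied from the original and patched kernel runs above.
HANG_RUN = [
    "[ 74.206196] do_nothing() :: this cpu = 0x0, that cpu = 0x1",
    "[ 75.198324] kp1/pick_next_task_fair() :: after sleep (1000 msecs)",
    "[ 75.199814] kp1/pick_next_task_fair() :: stopping other cpu...",
]
FINISHED_RUN = [
    "[ 85.993471] do_nothing() :: this cpu = 0x0, that cpu = 0x1",
    "[ 86.987662] kp1/pick_next_task_fair() :: after sleep (1000 msecs)",
    "[ 86.989427] kp1/pick_next_task_fair() :: stopping other cpu...",
    "[ 86.991109] do_nothing() :: this cpu = 0x1, that cpu = 0x0",
    "[ 86.992615] do_nothing() :: this cpu = 0x1, that cpu = 0x0",
]

if __name__ == "__main__":
    print("original:", classify_run(HANG_RUN))
    print("patched: ", classify_run(FINISHED_RUN))
```

Since a hung system cannot report its own state, the capture would have to come from a serial or virtual console log rather than from dmesg on the machine under test.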

Mauricio Faria de Oliveira (mfo) wrote on 2019-03-21:

Since Bionic already has the fix commit applied,
the original kernel version doesn't hit the problem.

Mauricio Faria de Oliveira (mfo) wrote on 2019-03-21:

Both xenial and bionic original/patched kernels
were tested with stress-ng scheduler class, and
no regressions were observed.

$ stress-ng --version
stress-ng, version 0.09.56 (gcc 8.3, x86_64 Linux 4.15.0-47-generic) 💻🔥

$ sudo stress-ng --class scheduler --sequential 0

$ uname -rv
4.4.0-144-generic #170-Ubuntu SMP Thu Mar 14 11:56:20 UTC 2019

$ uname -rv
4.4.0-144-generic #170+test20190320b1 SMP Wed Mar 20 18:35:06 UTC 2019

$ uname -rv
4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019

$ uname -rv
4.15.0-47-generic #50+test20190320b1 SMP Wed Mar 20 20:08:03 UTC 2019

Mauricio Faria de Oliveira (mfo) wrote on 2019-03-21:

[X][PATCH 0/4] LP#1821259 Fix for deadlock in cpu_stopper
https://lists.ubuntu.com/archives/kernel-team/2019-March/099427.html

[B][PATCH 0/2] Fix for LP#1821259 (pending patches for) Fix for deadlock in cpu_stopper
https://lists.ubuntu.com/archives/kernel-team/2019-March/099432.html

no longer affects:	linux (Ubuntu)
Changed in linux (Ubuntu Bionic):
status:	New → Confirmed
Changed in linux (Ubuntu Xenial):
status:	New → Confirmed

Khaled El Mously (kmously) on 2019-03-28

Changed in linux (Ubuntu Xenial):
status:	Confirmed → Fix Committed
Changed in linux (Ubuntu Bionic):
status:	Confirmed → Fix Committed

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote on 2019-04-04:

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags:	added: verification-needed-bionic
tags:	added: verification-needed-xenial

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote on 2019-04-04:

#10

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

Mauricio Faria de Oliveira (mfo) wrote on 2019-04-05:

#11

Verification done on xenial-proposed.
With the -proposed kernel (4.4.0-146-generic), the test case could not hang the system in 10 runs;
the previous kernel (4.4.0-145-generic) still hangs.

4.4.0-145-generic

$ sudo insmod kmod-stopper.ko
[ 324.201942] kmod_stopper: loading out-of-tree module taints kernel.
[ 324.205604] kmod_stopper: module verification failed: signature and/or required key missing - tainting kernel
[ 324.213641] mod_init() :: this cpu = 0x2, that cpu = 0x3
[ 324.214802] mod_init() :: that_cpu_stopper_task = ffff88013a97b300, comm = migration/3
[ 324.224825] kp2/stop_two_cpus() :: this cpu = 0x2, that cpu = 0x3
[ 324.224834] do_nothing() :: this cpu = 0x3, that cpu = 0x2
[ 324.224839] kp1/pick_next_task_fair() :: this cpu = 0x3, that cpu = 0x2
[ 324.224841] kp1/pick_next_task_fair() :: before spin (1000 msecs)
[ 324.230226] kp2/stop_two_cpus() :: before spin (500 msecs)
[ 324.727963] kp2/stop_two_cpus() :: after spin (500 msecs)
[ 325.217499] kp1/pick_next_task_fair() :: after spin (1000 msecs)
[ 325.218596] kp1/pick_next_task_fair() :: stopping other cpu...
<hangs>

4.4.0-146-generic

$ sudo insmod kmod-stopper.ko
[ 512.306797] mod_init() :: this cpu = 0x0, that cpu = 0x1
[ 512.308267] mod_init() :: that_cpu_stopper_task = ffff88013a913300, comm = migration/1
[ 512.318288] kp2/stop_two_cpus() :: this cpu = 0x0, that cpu = 0x1
[ 512.318298] do_nothing() :: this cpu = 0x1, that cpu = 0x0
[ 512.318335] kp1/pick_next_task_fair() :: this cpu = 0x1, that cpu = 0x0
[ 512.318337] kp1/pick_next_task_fair() :: before spin (1000 msecs)
[ 512.325303] kp2/stop_two_cpus() :: before spin (500 msecs)
[ 512.823132] kp2/stop_two_cpus() :: after spin (500 msecs)
[ 513.312125] kp1/pick_next_task_fair() :: after spin (1000 msecs)
[ 513.313440] kp1/pick_next_task_fair() :: stopping other cpu...
[ 513.314708] do_nothing() :: this cpu = 0x0, that cpu = 0x1
[ 513.315908] do_nothing() :: this cpu = 0x0, that cpu = 0x1

tags:

added: verification-done-xenial
removed: verification-needed-xenial

Mauricio Faria de Oliveira (mfo) wrote on 2019-04-08:

#12

Verification done on bionic-proposed.

The Bionic kernel already has the main fix patch,
the new patches are just to bring it up with the
incremental fixes upstream for the main fix patch.

No regressions observed between 4.15.0-{47,48}-generic
with `sudo stress-ng --class scheduler --sequential 0`.

tags:

added: verification-done-bionic
removed: verification-needed-bionic

Launchpad Janitor (janitor) wrote on 2019-04-23:

#13

This bug was fixed in the package linux - 4.4.0-146.172

---------------
linux (4.4.0-146.172) xenial; urgency=medium

* linux: 4.4.0-146.172 -proposed tracker (LP: #1822834)

* Packaging resync (LP: #1786013)
    - [Packaging] update helper scripts
    - [Packaging] resync retpoline extraction

* 3b080b2564287be91605bfd1d5ee985696e61d3c in ubuntu_btrfs_kernel_fixes
    triggers system hang on i386 (LP: #1812845)
    - btrfs: raid56: properly unmap parity page in finish_parity_scrub()

* Xenial update: 4.4.177 upstream stable release (LP: #1822271)
    - ceph: avoid repeatedly adding inode to mdsc->snap_flush_list
    - numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES
    - KEYS: allow reaching the keys quotas exactly
    - mfd: ti_am335x_tscadc: Use PLATFORM_DEVID_AUTO while registering mfd cells
    - mfd: twl-core: Fix section annotations on {,un}protect_pm_master
    - mfd: db8500-prcmu: Fix some section annotations
    - mfd: ab8500-core: Return zero in get_register_interruptible()
    - mfd: qcom_rpm: write fw_version to CTRL_REG
    - mfd: wm5110: Add missing ASRC rate register
    - mfd: mc13xxx: Fix a missing check of a register-read failure
    - net: hns: Fix use after free identified by SLUB debug
    - MIPS: ath79: Enable OF serial ports in the default config
    - scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_param
    - scsi: isci: initialize shost fully before calling scsi_add_host()
    - MIPS: jazz: fix 64bit build
    - isdn: i4l: isdn_tty: Fix some concurrency double-free bugs
    - atm: he: fix sign-extension overflow on large shift
    - leds: lp5523: fix a missing check of return value of lp55xx_read
    - isdn: avm: Fix string plus integer warning from Clang
    - RDMA/srp: Rework SCSI device reset handling
    - KEYS: user: Align the payload buffer
    - KEYS: always initialize keyring_index_key::desc_len
    - batman-adv: fix uninit-value in batadv_interface_tx()
    - net/packet: fix 4gb buffer limit due to overflow check
    - team: avoid complex list operations in team_nl_cmd_options_set()
    - sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach()
    - net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames
    - ARCv2: Enable unaligned access in early ASM code
    - Revert "bridge: do not add port to router list when receives query with
      source 0.0.0.0"
    - libceph: handle an empty authorize reply
    - drm/msm: Unblock writer if reader closes file
    - ASoC: Intel: Haswell/Broadwell: fix setting for .dynamic field
    - ALSA: compress: prevent potential divide by zero bugs
    - thermal: int340x_thermal: Fix a NULL vs IS_ERR() check
    - usb: dwc3: gadget: Fix the uninitialized link_state when udc starts
    - usb: gadget: Potential NULL dereference on allocation error
    - ASoC: dapm: change snprintf to scnprintf for possible overflow
    - ASoC: imx-audmux: change snprintf to scnprintf for possible overflow
    - ARC: fix __ffs return value to avoid build warnings
    - mac80211: fix miscounting of ttl-dropped frames
    - serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling
    - scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state()
    - net: altera_tse: fix connect_local_phy error path
    - ibmveth: Do not process frames after calling napi_reschedule
    - mac80211: don't initiate TDLS connection if station is not associated to AP
    - cfg80211: extend range deviation for DMG
    - KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting
      to L1
    - arm/arm64: KVM: Feed initialized memory to MMIO accesses
    - KVM: arm/arm64: Fix MMIO emulation data handling
    - powerpc: Always initialize input array when calling epapr_hypercall()
    - mmc: spi: Fix card detection during probe
    - x86/uaccess: Don't leak the AC flag into __put_user() value evaluation
    - USB: serial: option: add Telit ME910 ECM composition
    - USB: serial: cp210x: add ID for Ingenico 3070
    - USB: serial: ftdi_sio: add ID for Hjelmslund Electronics USB485
    - cpufreq: Use struct kobj_attribute instead of struct global_attr
    - sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
    - ncpfs: fix build warning of strncpy
    - isdn: isdn_tty: fix build warning of strncpy
    - staging: lustre: fix buffer overflow of string buffer
    - net-sysfs: Fix mem leak in netdev_register_kobject
    - team: Free BPF filter when unregistering netdev
    - bnxt_en: Drop oversize TX packets to prevent errors.
    - net: nfc: Fix NULL dereference on nfc_llcp_build_tlv fails
    - xen-netback: fix occasional leak of grant ref mappings under memory pressure
    - net: Add __icmp_send helper.
    - net: avoid use IPCB in cipso_v4_error
    - net: phy: Micrel KSZ8061: link failure after cable connect
    - x86/CPU/AMD: Set the CPB bit unconditionally on F17h
    - applicom: Fix potential Spectre v1 vulnerabilities
    - MIPS: irq: Allocate accurate order pages for irq stack
    - hugetlbfs: fix races and page leaks during migration
    - netlabel: fix out-of-bounds memory accesses
    - net: dsa: mv88e6xxx: Fix u64 statistics
    - ip6mr: Do not call __IP6_INC_STATS() from preemptible context
    - media: uvcvideo: Fix 'type' check leading to overflow
    - vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel
    - perf tools: Handle TOPOLOGY headers with no CPU
    - IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM
    - ipvs: Fix signed integer overflow when setsockopt timeout
    - iommu/amd: Fix IOMMU page flush when detach device from a domain
    - xtensa: SMP: fix ccount_timer_shutdown
    - xtensa: SMP: fix secondary CPU initialization
    - xtensa: smp_lx200_defconfig: fix vectors clash
    - xtensa: SMP: mark each possible CPU as present
    - xtensa: SMP: limit number of possible CPUs by NR_CPUS
    - net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case
    - net: hns: Fix wrong read accesses via Clause 45 MDIO protocol
    - net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup()
    - gpio: vf610: Mask all GPIO interrupts
    - nfs: Fix NULL pointer dereference of dev_name
    - scsi: libfc: free skb when receiving invalid flogi resp
    - platform/x86: Fix unmet dependency warning for SAMSUNG_Q10
    - cifs: fix computation for MAX_SMB2_HDR_SIZE
    - x86/kexec: Don't setup EFI info if EFI runtime is not enabled
    - x86_64: increase stack size for KASAN_EXTRA
    - mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
    - mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone
    - fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
    - autofs: drop dentry reference only when it is never used
    - autofs: fix error return in autofs_fill_super()
    - ARM: pxa: ssp: unneeded to free devm_ allocated data
    - irqchip/mmp: Only touch the PJ4 IRQ & FIQ bits on enable/disable
    - dmaengine: at_xdmac: Fix wrongfull report of a channel as in use
    - dmaengine: dmatest: Abort test in case of mapping error
    - s390/qeth: fix use-after-free in error path
    - perf symbols: Filter out hidden symbols from labels
    - MIPS: Remove function size check in get_frame_info()
    - Input: wacom_serial4 - add support for Wacom ArtPad II tablet
    - Input: elan_i2c - add id for touchpad found in Lenovo s21e-20
    - iscsi_ibft: Fix missing break in switch statement
    - futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()
    - ARM: dts: exynos: Add minimal clkout parameters to Exynos3250 PMU
    - Revert "x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls"
    - ARM: dts: exynos: Do not ignore real-world fuse values for thermal zone 0 on
      Exynos5420
    - udplite: call proper backlog handlers
    - netfilter: x_tables: enforce nul-terminated table name from getsockopt
      GET_ENTRIES
    - netfilter: nfnetlink_log: just returns error for unknown command
    - netfilter: nfnetlink_acct: validate NFACCT_FILTER parameters
    - netfilter: nf_conntrack_tcp: Fix stack out of bounds when parsing TCP
      options
    - KEYS: restrict /proc/keys by credentials at open time
    - l2tp: fix infoleak in l2tp_ip6_recvmsg()
    - net: hsr: fix memory leak in hsr_dev_finalize()
    - net: sit: fix UBSAN Undefined behaviour in check_6rd
    - net/x25: fix use-after-free in x25_device_event()
    - net/x25: reset state in x25_connect()
    - pptp: dst_release sk_dst_cache in pptp_sock_destruct
    - ravb: Decrease TxFIFO depth of Q3 and Q2 to one
    - route: set the deleted fnhe fnhe_daddr to 0 in ip_del_fnhe to fix a race
    - tcp: handle inet_csk_reqsk_queue_add() failures
    - net/mlx4_core: Fix reset flow when in command polling mode
    - net/mlx4_core: Fix qp mtt size calculation
    - net/x25: fix a race in x25_bind()
    - mdio_bus: Fix use-after-free on device_register fails
    - net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255
    - missing barriers in some of unix_sock ->addr and ->path accesses
    - ipvlan: disallow userns cap_net_admin to change global mode/flags
    - vxlan: test dev->flags & IFF_UP before calling gro_cells_receive()
    - vxlan: Fix GRO cells race condition between receive and link delete
    - net/hsr: fix possible crash in add_timer()
    - gro_cells: make sure device is up in gro_cells_receive()
    - tcp/dccp: remove reqsk_put() from inet_child_forget()
    - ALSA: bebob: use more identical mod_alias for Saffire Pro 10 I/O against
      Liquid Saffire 56
    - fs/9p: use fscache mutex rather than spinlock
    - It's wrong to add len to sector_nr in raid10 reshape twice
    - media: videobuf2-v4l2: drop WARN_ON in vb2_warn_zero_bytesused()
    - 9p: use inode->i_lock to protect i_size_write() under 32-bit
    - 9p/net: fix memory leak in p9_client_create
    - ASoC: fsl_esai: fix register setting issue in RIGHT_J mode
    - stm class: Fix an endless loop in channel allocation
    - crypto: caam - fixed handling of sg list
    - crypto: ahash - fix another early termination in hash walk
    - gpu: ipu-v3: Fix i.MX51 CSI control registers offset
    - gpu: ipu-v3: Fix CSI offsets for imx53
    - s390/dasd: fix using offset into zero size array error
    - ARM: OMAP2+: Variable "reg" in function omap4_dsi_mux_pads() could be
      uninitialized
    - Input: matrix_keypad - use flush_delayed_work()
    - i2c: cadence: Fix the hold bit setting
    - Input: st-keyscan - fix potential zalloc NULL dereference
    - ARM: 8824/1: fix a migrating irq bug when hotplug cpu
    - assoc_array: Fix shortcut creation
    - net: systemport: Fix reception of BPDUs
    - pinctrl: meson: meson8b: fix the sdxc_a data 1..3 pins
    - net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
    - ASoC: topology: free created components in tplg load error
    - arm64: Relax GIC version check during early boot
    - tmpfs: fix link accounting when a tmpfile is linked in
    - ARC: uacces: remove lp_start, lp_end from clobber list
    - phonet: fix building with clang
    - mac80211_hwsim: propagate genlmsg_reply return code
    - net: set static variable an initial value in atl2_probe()
    - tmpfs: fix uninitialized return value in shmem_link
    - stm class: Prevent division by zero
    - crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling
    - CIFS: Fix read after write for files with read caching
    - tracing: Do not free iter->trace in fail path of tracing_open_pipe()
    - ACPI / device_sysfs: Avoid OF modalias creation for removed device
    - regulator: s2mps11: Fix steps for buck7, buck8 and LDO35
    - regulator: s2mpa01: Fix step values for some LDOs
    - clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
    - clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
    - s390/virtio: handle find on invalid queue gracefully
    - scsi: virtio_scsi: don't send sc payload with tmfs
    - scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock
    - m68k: Add -ffreestanding to CFLAGS
    - btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
    - Btrfs: fix corruption reading shared and compressed extents after hole
      punching
    - crypto: pcbc - remove bogus memcpy()s with src == dest
    - cpufreq: tegra124: add missing of_node_put()
    - cpufreq: pxa2xx: remove incorrect __init annotation
    - ext4: fix crash during online resizing
    - ext2: Fix underflow in ext2_max_size()
    - clk: ingenic: Fix round_rate misbehaving with non-integer dividers
    - dmaengine: usb-dmac: Make DMAC system sleep callbacks explicit
    - mm/vmalloc: fix size check for remap_vmalloc_range_partial()
    - kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv
    - intel_th: Don't reference unassigned outputs
    - parport_pc: fix find_superio io compare code, should use equal test.
    - i2c: tegra: fix maximum transfer size
    - perf bench: Copy kernel files needed to build mem{cpy,set} x86_64 benchmarks
    - serial: 8250_pci: Fix number of ports for ACCES serial cards
    - serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954
      chip use the pci_pericom_setup()
    - jbd2: clear dirty flag when revoking a buffer from an older transaction
    - jbd2: fix compile warning when using JBUFFER_TRACE
    - powerpc/32: Clear on-stack exception marker upon exception return
    - powerpc/wii: properly disable use of BATs when requested.
    - powerpc/powernv: Make opal log only readable by root
    - powerpc/83xx: Also save/restore SPRG4-7 during suspend
    - ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify
    - dm: fix to_sector() for 32bit
    - NFS41: pop some layoutget errors to application
    - perf intel-pt: Fix CYC timestamp calculation after OVF
    - perf auxtrace: Define auxtrace record alignment
    - perf intel-pt: Fix overlap calculation for padding
    - md: Fix failed allocation of md_register_thread
    - NFS: Fix an I/O request leakage in nfs_do_recoalesce
    - NFS: Don't recoalesce on error in nfs_pageio_complete_mirror()
    - nfsd: fix memory corruption caused by readdir
    - nfsd: fix wrong check in write_v4_end_grace()
    - PM / wakeup: Rework wakeup source timer cancellation
    - rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
    - media: uvcvideo: Avoid NULL pointer dereference at the end of streaming
    - drm/radeon/evergreen_cs: fix missing break in switch statement
    - KVM: nVMX: Sign extend displacements of VMX instr's mem operands
    - KVM: nVMX: Ignore limit checks on VMX instructions using flat segments
    - KVM: X86: Fix residual mmio emulation request to userspace
    - Linux 4.4.177

* sky2 ethernet card doesn't work after returning from suspend
    (LP: #1807259) // sky2 ethernet card link not up after suspend
    (LP: #1809843) // Xenial update: 4.4.177 upstream stable release
    (LP: #1822271)
    - sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79

* [CONFIG] please enable highdpi font FONT_TER16x32 (LP: #1819881)
    - lib/fonts/Kconfig: keep non-Sparc fonts listed together
    - Fonts: New Terminus large console font
    - [Config]: enable highdpi Terminus 16x32 font support

* Hard lockup in 2 CPUs due to deadlock in cpu_stoppers (LP: #1821259)
    - stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
    - stop_machine: Disable preemption when waking two stopper threads
    - stop_machine: Disable preemption after queueing stopper threads
    - stop_machine: Atomically queue and wake stopper threads

-- Khalid Elmously <khalid.elmously@canonical.com>  Tue, 02 Apr 2019 23:03:42 -0400

Changed in linux (Ubuntu Xenial):
status:	Fix Committed → Fix Released

Launchpad Janitor (janitor) wrote on 2019-04-24:

#14

This bug was fixed in the package linux - 4.15.0-48.51

---------------
linux (4.15.0-48.51) bionic; urgency=medium

* linux: 4.15.0-48.51 -proposed tracker (LP: #1822820)

* Packaging resync (LP: #1786013)
    - [Packaging] update helper scripts
    - [Packaging] resync retpoline extraction

* 3b080b2564287be91605bfd1d5ee985696e61d3c in ubuntu_btrfs_kernel_fixes
    triggers system hang on i386 (LP: #1812845)
    - btrfs: raid56: properly unmap parity page in finish_parity_scrub()

* [P9][LTCTest][Opal][FW910] cpupower monitor shows multiple stop Idle_Stats
    (LP: #1719545)
    - cpupower : Fix header name to read idle state name

* [amdgpu] screen corruption when using touchpad (LP: #1818617)
    - drm/amdgpu/gmc: steal the appropriate amount of vram for fw hand-over (v3)
    - drm/amdgpu: Free VGA stolen memory as soon as possible.

* [SRU][B/C/OEM]IOMMU: add kernel dma protection (LP: #1820153)
    - ACPICA: AML parser: attempt to continue loading table after error
    - ACPI / property: Allow multiple property compatible _DSD entries
    - PCI / ACPI: Identify untrusted PCI devices
    - iommu/vt-d: Force IOMMU on for platform opt in hint
    - iommu/vt-d: Do not enable ATS for untrusted devices
    - thunderbolt: Export IOMMU based DMA protection support to userspace
    - iommu/vt-d: Disable ATS support on untrusted devices

* Add basic support to NVLink2 passthrough (LP: #1819989)
    - powerpc/powernv/npu: Do not try invalidating 32bit table when 64bit table is
      enabled
    - powerpc/powernv: call OPAL_QUIESCE before OPAL_SIGNAL_SYSTEM_RESET
    - powerpc/powernv: Export opal_check_token symbol
    - powerpc/powernv: Make possible for user to force a full ipl cec reboot
    - powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn
    - powerpc/powernv: Move npu struct from pnv_phb to pci_controller
    - powerpc/powernv/npu: Move OPAL calls away from context manipulation
    - powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculation
    - powerpc/pseries/npu: Enable platform support
    - powerpc/pseries: Remove IOMMU API support for non-LPAR systems
    - powerpc/powernv/npu: Check mmio_atsd array bounds when populating
    - powerpc/powernv/npu: Fault user page into the hypervisor's pagetable

* Huawei Hi1822 NIC has poor performance (LP: #1820187)
    - net-next: hinic: fix a problem in free_tx_poll()
    - hinic: remove ndo_poll_controller
    - net-next/hinic: add checksum offload and TSO support
    - hinic: Fix l4_type parameter in hinic_task_set_tunnel_l4
    - net-next/hinic:replace multiply and division operators
    - net-next/hinic:add rx checksum offload for HiNIC
    - net-next/hinic:fix a bug in set mac address
    - net-next/hinic: fix a bug in rx data flow
    - net: hinic: fix null pointer dereference on pointer hwdev
    - hinic: optmize rx refill buffer mechanism
    - net-next/hinic:add shutdown callback
    - net-next/hinic: replace disable_irq_nosync/enable_irq

* [CONFIG] please enable highdpi font FONT_TER16x32 (LP: #1819881)
    - Fonts: New Terminus large console font
    - [Config]: enable highdpi Terminus 16x32 font support

* [19.04 FEAT] qeth: Enhanced link speed - kernel part (LP: #1814892)
    - s390/qeth: report 25Gbit link speed

* CVE-2017-5754
    - x86/nmi: Fix NMI uaccess race against CR3 switching
    - x86/mm: Fix documentation of module mapping range with 4-level paging
    - x86/pti: Enable global pages for shared areas
    - x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image
    - x86/pti: Leave kernel text global for !PCID
    - x86/pti: Fix boot problems from Global-bit setting
    - x86/pti: Fix boot warning from Global-bit setting
    - x86/pti: Reduce amount of kernel text allowed to be Global
    - x86/pti: Disallow global kernel text with RANDSTRUCT
    - x86/entry/32: Add explicit 'l' instruction suffix
    - x86/asm-offsets: Move TSS_sp0 and TSS_sp1 to asm-offsets.c
    - x86/entry/32: Rename TSS_sysenter_sp0 to TSS_entry2task_stack
    - x86/entry/32: Load task stack from x86_tss.sp1 in SYSENTER handler
    - x86/entry/32: Put ESPFIX code into a macro
    - x86/entry/32: Unshare NMI return path
    - x86/entry/32: Split off return-to-kernel path
    - x86/entry/32: Enter the kernel via trampoline stack
    - x86/entry/32: Leave the kernel via trampoline stack
    - x86/entry/32: Introduce SAVE_ALL_NMI and RESTORE_ALL_NMI
    - x86/entry/32: Handle Entry from Kernel-Mode on Entry-Stack
    - x86/entry/32: Simplify debug entry point
    - x86/entry/32: Add PTI cr3 switch to non-NMI entry/exit points
    - x86/entry/32: Add PTI CR3 switches to NMI handler code
    - x86/entry: Rename update_sp0 to update_task_stack
    - x86/pgtable: Rename pti_set_user_pgd() to pti_set_user_pgtbl()
    - x86/pgtable/pae: Unshare kernel PMDs when PTI is enabled
    - x86/pgtable/32: Allocate 8k page-tables when PTI is enabled
    - x86/pgtable: Move pgdp kernel/user conversion functions to pgtable.h
    - x86/pgtable: Move pti_set_user_pgtbl() to pgtable.h
    - x86/pgtable: Move two more functions from pgtable_64.h to pgtable.h
    - x86/mm/pae: Populate valid user PGD entries
    - x86/mm/pae: Populate the user page-table with user pgd's
    - x86/mm/pti: Add an overflow check to pti_clone_pmds()
    - x86/mm/pti: Define X86_CR3_PTI_PCID_USER_BIT on x86_32
    - x86/mm/pti: Clone CPU_ENTRY_AREA on PMD level on x86_32
    - x86/mm/pti: Make pti_clone_kernel_text() compile on 32 bit
    - x86/mm/pti: Keep permissions when cloning kernel text in
      pti_clone_kernel_text()
    - x86/mm/pti: Introduce pti_finalize()
    - x86/mm/pti: Clone entry-text again in pti_finalize()
    - x86/mm/dump_pagetables: Define INIT_PGD
    - x86/pgtable/pae: Use separate kernel PMDs for user page-table
    - x86/ldt: Reserve address-space range on 32 bit for the LDT
    - x86/ldt: Define LDT_END_ADDR
    - x86/ldt: Split out sanity check in map_ldt_struct()
    - x86/ldt: Enable LDT user-mapping for PAE
    - x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32
    - [Config] Update PAGE_TABLE_ISOLATION annotations
    - x86/mm/pti: Add Warning when booting on a PCID capable CPU
    - x86/entry/32: Add debug code to check entry/exit CR3
    - x86/pti: Check the return value of pti_user_pagetable_walk_p4d()
    - x86/pti: Check the return value of pti_user_pagetable_walk_pmd()
    - perf/core: Make sure the ring-buffer is mapped in all page-tables
    - x86/entry/32: Check for VM86 mode in slow-path check
    - x86/mm: Remove in_nmi() warning from vmalloc_fault()
    - x86/kexec: Allocate 8k PGDs for PTI
    - x86/mm/pti: Clear Global bit more aggressively
    - mm: Allow non-direct-map arguments to free_reserved_area()
    - x86/mm/init: Pass unconverted symbol addresses to free_init_pages()
    - x86/mm/init: Add helper for freeing kernel image pages
    - x86/mm/init: Remove freed kernel image areas from alias mapping
    - x86/mm/pti: Fix 32 bit PCID check
    - x86/mm/pti: Don't clear permissions in pti_clone_pmd()
    - x86/mm/pti: Clone kernel-image on PTE level for 32 bit
    - x86/relocs: Add __end_rodata_aligned to S_REL
    - x86/mm/pti: Move user W+X check into pti_finalize()
    - x86/efi: Load fixmap GDT in efi_call_phys_epilog()
    - x86/efi: Load fixmap GDT in efi_call_phys_epilog() before setting %cr3
    - x86/mm/doc: Clean up the x86-64 virtual memory layout descriptions
    - x86/mm/doc: Enhance the x86-64 virtual memory layout descriptions
    - x86/entry/32: Clear the CS high bits
    - x86/mm: Move LDT remap out of KASLR region on 5-level paging
    - x86/ldt: Unmap PTEs for the slot before freeing LDT pages
    - x86/ldt: Remove unused variable in map_ldt_struct()
    - x86/mm: Fix guard hole handling
    - x86/dump_pagetables: Fix LDT remap address marker

* Avoid potential memory corruption on HiSilicon SoCs (LP: #1819546)
    - iommu/arm-smmu-v3: Avoid memory corruption from Hisilicon MSI payloads

* Ubuntu18.04.01: [Power9] power8 Compat guest(RHEL7.6) crashes during guest
    boot with > 256G of memory (kernel/kvm) (LP: #1818645)
    - [PATCH] KVM: PPC: Book3S HV: Don't truncate HPTE index in xlate function

* Fix for dual Intel NVMes (LP: #1821961)
    - SAUCE: nvme: Merge two quirk entries into one for Intel 760p/Pro 7600p

* CVE-2017-5715
    - tools headers: Synchronize prctl.h ABI header
    - x86/spectre: Add missing family 6 check to microcode check
    - x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
    - x86/speculation: Apply IBPB more strictly to avoid cross-process data leak
    - x86/speculation: Propagate information about RSB filling mitigation to sysfs
    - x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC
      variant
    - x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support
    - x86/retpoline: Remove minimal retpoline support
    - x86/speculation: Update the TIF_SSBD comment
    - x86/speculation: Clean up spectre_v2_parse_cmdline()
    - x86/speculation: Remove unnecessary ret variable in cpu_show_common()
    - x86/speculation: Move STIPB/IBPB string conditionals out of
      cpu_show_common()
    - x86/speculation: Disable STIBP when enhanced IBRS is in use
    - x86/speculation: Rename SSBD update functions
    - x86/speculation: Reorganize speculation control MSRs update
    - sched/smt: Make sched_smt_present track topology
    - x86/Kconfig: Select SCHED_SMT if SMP enabled
    - sched/smt: Expose sched_smt_present static key
    - x86/speculation: Rework SMT state change
    - x86/l1tf: Show actual SMT state
    - x86/speculation: Reorder the spec_v2 code
    - x86/speculation: Mark string arrays const correctly
    - x86/speculataion: Mark command line parser data __initdata
    - x86/speculation: Unify conditional spectre v2 print functions
    - x86/speculation: Add command line control for indirect branch speculation
    - x86/speculation: Prepare for per task indirect branch speculation control
    - x86/process: Consolidate and simplify switch_to_xtra() code
    - x86/speculation: Avoid __switch_to_xtra() calls
    - x86/speculation: Prepare for conditional IBPB in switch_mm()
    - ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS
    - x86/speculation: Split out TIF update
    - x86/speculation: Prevent stale SPEC_CTRL msr content
    - x86/speculation: Prepare arch_smt_update() for PRCTL mode
    - x86/speculation: Add prctl() control for indirect branch speculation
    - x86/speculation: Enable prctl mode for spectre_v2_user
    - x86/speculation: Add seccomp Spectre v2 user space protection mode
    - x86/speculation: Provide IBPB always command line options
    - kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb
    - x86/speculation: Change misspelled STIPB to STIBP
    - x86/speculation: Add support for STIBP always-on preferred mode
    - x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE
    - s390: remove closung punctuation from spectre messages
    - x86/speculation: Simplify the CPU bug detection logic

* CVE-2018-3639
    - x86/bugs: Add AMD's variant of SSB_NO
    - x86/bugs: Add AMD's SPEC_CTRL MSR usage
    - x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features
    - x86/bugs: Update when to check for the LS_CFG SSBD mitigation
    - x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR
    - KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled

* [Ubuntu] vfio-ap: add subsystem to matrix device to avoid libudev failures
    (LP: #1818854)
    - s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem

* Kernel regularly logs: Bluetooth: hci0: last event is not cmd complete
    (0x0f) (LP: #1748565)
    - Bluetooth: Fix unnecessary error message for HCI request completion

* HiSilicon HNS ethernet broken in 4.15.0-45 (LP: #1818294)
    - net: hns: Fix WARNING when hns modules installed

* rtl8723be wifi does not work under linux-modules-extra-4.15.0-33-generic
    (LP: #1788997)
    - SAUCE: Revert "rtlwifi: cleanup 8723be ant_sel definition"

* Crash from :i915 module with 4.15.0-46-generic using multi-display
    (LP: #1819486)
    - SAUCE: Revert "drm/i915: Fix hotplug irq ack on i965/g4x"

* kernel linux-image-4.15.0-44 not booting on Hyperv Server 2008R2
    (LP: #1814069)
    - hv/netvsc: fix handling of fallback to single queue mode
    - hv/netvsc: Fix NULL dereference at single queue mode fallback

* Lenovo ideapad 330-15ICH Wifi rfkill hard blocked (LP: #1811815)
    - platform/x86: ideapad: Add ideapad 330-15ICH to no_hw_rfkill

* Qualcomm Atheros QCA9377 wireless does not work (LP: #1818204)
    - platform/x86: ideapad-laptop: Add Ideapad 530S-14ARR to no_hw_rfkill list

* fscache: jobs might hang when fscache disk is full (LP: #1821395)
    - fscache: fix race between enablement and dropping of object

* hns3: fix oops in hns3_clean_rx_ring() (LP: #1821064)
    - net: hns3: add dma_rmb() for rx description

* Hard lockup in 2 CPUs due to deadlock in cpu_stoppers (LP: #1821259)
    - stop_machine: Disable preemption after queueing stopper threads
    - stop_machine: Atomically queue and wake stopper threads

* tcm_loop.ko: move from modules-extra into main modules package
    (LP: #1817786)
    - [Packaging] move tcm_loop.lo to main linux-modules package

* tcmu user space crash results in kernel module hang. (LP: #1819504)
    - scsi: tcmu: delete unused __wait
    - scsi: tcmu: track nl commands
    - scsi: tcmu: simplify nl interface
    - scsi: tcmu: add module wide block/reset_netlink support

* Intel XL710 - i40e driver does not work with kernel 4.15 (Ubuntu 18.04)
    (LP: #1779756)
    - i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled
    - i40e: prevent overlapping tx_timeout recover

* some codecs stop working after S3 (LP: #1820930)
    - ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec

* i40e xps management broken when > 64 queues/cpus (LP: #1820948)
    - i40e: Do not allow use more TC queue pairs than MSI-X vectors exist
    - i40e: Fix the number of queues available to be mapped for use

* 4.15 s390x kernel BUG at
    /build/linux-Gycr4Z/linux-4.15.0/drivers/block/virtio_blk.c:565!
    (LP: #1788432)
    - virtio/s390: avoid race on vcdev->config
    - virtio/s390: fix race in ccw_io_helper()

* [SRU][B/B-OEM/C/D] Fix AMD IOMMU NULL dereference (LP: #1820990)
    - iommu/amd: Fix NULL dereference bug in match_hid_uid

* New Intel Wireless-AC 9260 [8086:2526] card not correctly probed in Ubuntu
    system (LP: #1821271)
    - iwlwifi: add new card for 9260 series

* Add support for MAC address pass through on RTL8153-BD (LP: #1821276)
    - r8152: Add support for MAC address pass through on RTL8153-BD
    - r8152: Fix an error on RTL8153-BD MAC Address Passthrough support

-- Andrea Righi <andrea.righi@canonical.com>  Tue, 02 Apr 2019 18:31:55 +0200

Changed in linux (Ubuntu Bionic):
status:	Fix Committed → Fix Released

Jianan Wang (wangjianan-zju) wrote on 2020-09-04:

#15

Hi there. Is this fix applied to kernel version 4.18.0-25? We upgraded to this kernel version on Ubuntu 18.04 and hit this issue, so we would like to know which stable version will contain the fix. Thanks.

Mauricio Faria de Oliveira (mfo) wrote on 2020-09-04:

#16

Hi Jianan,

The 4.18 kernel is no longer a supported kernel in Ubuntu,
since Ubuntu Cosmic/18.10 reached 'End of Life' a long time ago.

You have to upgrade to the current HWE kernel, which is now 5.4
(4.18 was the HWE/hardware enablement kernel in the Cosmic
timeframe), with something like:

$ sudo apt install linux-generic-hwe-18.04

Hope this helps,
Mauricio

Jianan Wang (wangjianan-zju) wrote on 2020-09-07:

#17

Hi Mauricio, that’s very helpful and we will try that, thanks for your input on this!
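
For readers with the same question as comment #15, a quick check of the running kernel can be sketched in Python. This is a minimal sketch, not an official tool: the function name bionic_has_stopper_fix and the constant FIXED_BIONIC_ABI are hypothetical, and the only fact taken from this report is that the Bionic fix was released in linux 4.15.0-48.51. It assumes the ABI number in the `uname -r` string (the 48 in 4.15.0-48-generic) is enough to identify that release, and it deliberately makes no claim about other kernel series such as 4.18 or the HWE kernels.

```python
import platform
import re

# Assumption: the Bionic fix first shipped in linux 4.15.0-48.51 (per the
# changelog in this report), so only 4.15.0-<abi> kernels are judged here.
FIXED_BIONIC_ABI = 48


def bionic_has_stopper_fix(release):
    """Return True/False for 4.15.0-<abi> kernels, None for any other series."""
    match = re.match(r"^4\.15\.0-(\d+)(?:\D|$)", release)
    if match is None:
        return None  # not a Bionic 4.15 kernel; this sketch makes no claim
    return int(match.group(1)) >= FIXED_BIONIC_ABI


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    print(running, bionic_has_stopper_fix(running))
```

With this sketch, 4.15.0-47-generic yields False, 4.15.0-48-generic and later kernels in the same series yield True, and 4.18.0-25-generic yields None, meaning the changelog in this report has to be consulted some other way for that series.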

Bug attachments

kmod-stopper.c