CPU offlining can trigger hangs with HP ProLiant DL360p Gen8.

Bug #1744163 reported by Vinson Lee on 2018-01-18
This bug affects 1 person
Affects            Importance   Assigned to
linux (Ubuntu)     High         Unassigned
  Artful           High         Unassigned
  Bionic           High         Unassigned

Bug Description

CPU offlining can trigger hangs with HP ProLiant DL360p Gen8 on Ubuntu 17.10.

The hangs also occur with the latest mainline kernel, Linux 4.15-rc8.

I bisected the upstream mainline kernel. The regression was introduced in Linux 4.13-rc1 by commit c5cb83bb337c25caae995d992d1cdf9b317f83de ("genirq/cpuhotplug: Handle managed IRQs on CPU hotplug"). https://github.com/torvalds/linux/commit/c5cb83bb337c25caae995d992d1cdf9b317f83de

The following test case, which offlines every CPU except CPU 0, can be used to trigger a hang:

for i in $(seq 1 $(expr $(nproc --all) - 1)); do sudo sh -c "echo 0 > /sys/devices/system/cpu/cpu$i/online"; done
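When narrowing down which offline step precedes the hang, it may help to offline the CPUs one at a time with a progress line before each write, and to have a way to bring them back online afterwards. The following is only a sketch of that idea, not part of the original report: the CPU_ROOT variable and both function names are hypothetical, and the functions write directly to the sysfs files, so they would need to run as root.

```shell
#!/bin/sh
# Sketch only: CPU_ROOT and both function names are hypothetical.
# CPU_ROOT defaults to the real sysfs path; override it to try the
# functions against a scratch directory first.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# Offline every CPU that exposes an "online" file, except cpu0, one at a
# time. The CPU name is printed before each write, so the last line
# printed identifies the step at which a hang occurred.
offline_cpus_stepwise() {
    for f in "$CPU_ROOT"/cpu[0-9]*/online; do
        [ -e "$f" ] || continue            # glob matched nothing
        cpu=$(basename "$(dirname "$f")")
        [ "$cpu" = "cpu0" ] && continue    # always leave CPU 0 online
        echo "offlining $cpu"
        echo 0 > "$f"
    done
}

# Bring every CPU that is currently offline back online.
online_all_cpus() {
    for f in "$CPU_ROOT"/cpu[0-9]*/online; do
        [ -e "$f" ] || continue
        if [ "$(cat "$f")" = "0" ]; then
            echo 1 > "$f"
        fi
    done
}
```

After sourcing the file in a root shell, calling offline_cpus_stepwise reproduces the test case step by step, and online_all_cpus undoes it if the machine is still responsive. CPUs are visited in glob order (cpu1, cpu10, cpu11, ...), not numeric order.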

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1744163

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Vinson Lee (vlee) on 2018-01-18
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Is there a kernel trace in the logs from the hang? I'd like to see if this is related to bug 1733662, which has a test kernel here if you want to try it:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1733662/comments/62

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Changed in linux (Ubuntu Artful):
status: New → Triaged
importance: Undecided → High
Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key
Vinson Lee (vlee) wrote :

kernel: [ 242.381420] INFO: task kworker/u129:0:6 blocked for more than 120 seconds.
kernel: [ 242.417101] Not tainted 4.15.0-041500rc8-generic #201801142030
kernel: [ 242.447495] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: [ 242.482715] kworker/u129:0 D 0 6 2 0x80000000
kernel: [ 242.482729] Workqueue: writeback wb_workfn (flush-8:0)
kernel: [ 242.482731] Call Trace:
kernel: [ 242.482744] __schedule+0x291/0x880
kernel: [ 242.482747] schedule+0x2c/0x80
kernel: [ 242.482751] io_schedule+0x16/0x40
kernel: [ 242.482756] get_request+0x2a4/0x7f0
kernel: [ 242.482761] ? wait_woken+0x80/0x80
kernel: [ 242.482765] blk_queue_bio+0x128/0x440
kernel: [ 242.482767] generic_make_request+0x11f/0x300
kernel: [ 242.482769] submit_bio+0x73/0x140
kernel: [ 242.482770] ? submit_bio+0x73/0x140
kernel: [ 242.482836] xfs_submit_ioend+0x87/0x1c0 [xfs]
kernel: [ 242.482861] xfs_do_writepage+0x377/0x690 [xfs]
kernel: [ 242.482868] write_cache_pages+0x209/0x4d0
kernel: [ 242.482891] ? xfs_vm_writepages+0xf0/0xf0 [xfs]
kernel: [ 242.482913] xfs_vm_writepages+0xbe/0xf0 [xfs]
kernel: [ 242.482915] do_writepages+0x48/0xe0
kernel: [ 242.482919] ? check_preempt_curr+0x2d/0x90
kernel: [ 242.482920] ? ttwu_do_wakeup+0x1e/0x140
kernel: [ 242.482924] __writeback_single_inode+0x45/0x330
kernel: [ 242.482926] ? __writeback_single_inode+0x45/0x330
kernel: [ 242.482927] ? try_to_wake_up+0x59/0x480
kernel: [ 242.482929] writeback_sb_inodes+0x1e1/0x510
kernel: [ 242.482932] __writeback_inodes_wb+0x67/0xb0
kernel: [ 242.482934] wb_writeback+0x26b/0x300
kernel: [ 242.482937] wb_workfn+0x180/0x410
kernel: [ 242.482939] ? wb_workfn+0x180/0x410
kernel: [ 242.482943] process_one_work+0x1ea/0x410
kernel: [ 242.482945] worker_thread+0x32/0x410
kernel: [ 242.482948] kthread+0x11e/0x140
kernel: [ 242.482949] ? process_one_work+0x410/0x410
kernel: [ 242.482951] ? kthread_create_worker_on_cpu+0x70/0x70
kernel: [ 242.482956] ret_from_fork+0x32/0x40
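To check whether other tasks were reported as blocked in the same log, the hung-task warnings can be filtered out of the kernel messages. This is a minimal sketch that assumes the message format shown in the trace above; the function name hung_tasks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on standard input and print
# "<task> <seconds>" for each hung-task warning of the form
# "INFO: task <task> blocked for more than <seconds> seconds."
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2/p'
}
```

For example, `dmesg | hung_tasks` or `journalctl -k | hung_tasks` would list each blocked task with its reported timeout.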

This bug was nominated against a series that is no longer supported, i.e. artful. The bug task representing the artful nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Artful):
status: Triaged → Won't Fix