systemd-udevd hung in blk_mq_freeze_queue_wait testing unpartitioned NVMe drive
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Fix Released | Undecided | Tim Gardner |
Yakkety | Fix Released | Undecided | Tim Gardner |
Zesty | Fix Released | Undecided | Unassigned |

Bug Description
For reference, here is the stack of systemd-udevd seen in the hang:
[ 1558.214013] INFO: task systemd-udevd:1778 blocked for more than 120 seconds.
[ 1558.214318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1558.214556] systemd-udevd D 00003fff8dbdf7a0 0 1778 1 0x00040000
[ 1558.214637] Call Trace:
[ 1558.214673] [c000000004ad3790] [c0000000007aac20] schedule_timeout+0x180/0x2f0 (unreliable)
[ 1558.214779] [c000000004ad3960] [c0000000000158d0] __switch_to+0x200/0x350
[ 1558.214870] [c000000004ad39c0] [c0000000007adbb4] __schedule+0x414/0x9e0
[ 1558.214961] [c000000004ad3a90] [c0000000003b4e54] blk_mq_freeze_queue_wait+0x64/0xd0
[ 1558.215107] [c000000004ad3af0] [d000000034011964] nvme_revalidate_disk+0xd4/0x3a0 [nvme]
[ 1558.215386] [c000000004ad3b90] [c0000000003c2398] rescan_partitions+0x98/0x390
[ 1558.215508] [c000000004ad3c60] [c0000000003bb7ac] __blkdev_reread_part+0x9c/0xd0
[ 1558.215599] [c000000004ad3c90] [c0000000003bb818] blkdev_reread_part+0x38/0x70
[ 1558.215935] [c000000004ad3cc0] [c0000000003bc334] blkdev_ioctl+0x3b4/0xb80
[ 1558.216016] [c000000004ad3d20] [c0000000002cbcd0] block_ioctl+0x70/0x90
[ 1558.216114] [c000000004ad3d40] [c000000000296b38] do_vfs_ioctl+0x458/0x740
[ 1558.216192] [c000000004ad3dd0] [c000000000296ee4] SyS_ioctl+0xc4/0xe0
[ 1558.216275] [c000000004ad3e30] [c00000000000a17c] system_call+0x38/0xb4
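When triaging logs like the one above, it can help to pull out the blocked task and the function names of the stack frames mechanically. The following is only a minimal sketch: the helper names `parse_hung_task` and `frame_function` are hypothetical, and the regular expressions assume the exact line layout seen in this report (other kernels or architectures may format frames differently).

```python
import re

# Assumed layout: "INFO: task <comm>:<pid> blocked for more than <N> seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds\."
)

# Assumed layout: "[<stack addr>] [<return addr>] <function>+0x<off>/0x<size>"
FRAME_RE = re.compile(
    r"\[[0-9a-f]+\] \[[0-9a-f]+\] (?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_hung_task(line):
    """Return (comm, pid, seconds) for a hung-task warning line, else None."""
    match = HUNG_TASK_RE.search(line)
    if match is None:
        return None
    return match.group("comm"), int(match.group("pid")), int(match.group("secs"))


def frame_function(line):
    """Return the function name from a complete stack-frame line, else None."""
    match = FRAME_RE.search(line)
    return match.group("func") if match else None
```

Frame lines whose function name or offset is cut off will not match and yield `None`, so truncated log excerpts are skipped rather than misreported.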
It appears that systemd-udevd is triggered every time HTX writes to the boot sector (partition table) of the raw drive, and this causes the revalidate calls that expose the issue with the block driver mq freeze. With a partition table on each drive, HTX no longer writes to the partition table, so it no longer triggers systemd-udevd to re-read the partition table and try to freeze I/O.
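The `blkdev_ioctl` → `blkdev_reread_part` → `rescan_partitions` frames in the trace correspond to a partition-table re-read request, which user space normally issues with the `BLKRRPART` ioctl. As a rough illustration of that request (not a reproducer from this report), the sketch below issues it from Python. The helper names `blkrrpart_request` and `reread_partition_table` are hypothetical, and the request number is computed by hand as `_IO(0x12, 95)` under the assumption that powerpc, mips, sparc and alpha set the "no data" direction bit at bit 29 while other Linux architectures leave it clear.

```python
import fcntl
import os
import platform


def blkrrpart_request(machine=None):
    """Return the BLKRRPART ioctl number, _IO(0x12, 95), for a Linux machine type."""
    machine = machine or platform.machine()
    # Assumption: these architectures encode _IOC_NONE as 1 << 29; others use 0.
    alt_encoding = machine.startswith(("ppc", "mips", "sparc", "alpha"))
    none_bits = 0x20000000 if alt_encoding else 0
    return none_bits | (0x12 << 8) | 95


def reread_partition_table(device):
    """Ask the kernel to re-read the partition table of a whole-disk device."""
    fd = os.open(device, os.O_RDONLY)
    try:
        # Raises OSError if the device is busy or is not a block device.
        fcntl.ioctl(fd, blkrrpart_request())
    finally:
        os.close(fd)
```

Calling `reread_partition_table()` requires root and a whole-disk device node; on an affected kernel it could block in the same way as the trace, so it should only be tried on a test system.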
The fix for this is provided by the following upstream commit:
966d2b0 percpu-refcount: fix reference leak during percpu-atomic transition
which needs to be pulled into 16.04 (as well as newer releases).
tags: | added: architecture-ppc64le bugnameltc-148242 severity-critical targetmilestone-inin16042 |
Changed in ubuntu: | |
assignee: | nobody → Taco Screen team (taco-screen-team) |
affects: | ubuntu → linux (Ubuntu) |
Changed in linux (Ubuntu Yakkety): | |
status: | In Progress → Fix Committed |
tags: | added: verification-done-yakkety removed: verification-needed-yakkety |
Leann,
This looks like one for the Kernel team.
On 02/07/2017 01:19 PM, Launchpad Bug Tracker wrote:
> bugproxy (bugproxy) has assigned this bug to you for Ubuntu:
>
> For reference, here is the stack of systemd-udevd seen in the hang:
>
> [ 1558.214013] INFO: task systemd-udevd:1778 blocked for more than 120 seconds.
> [ 1558.214318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1558.214556] systemd-udevd D 00003fff8dbdf7a0 0 1778 1 0x00040000
> [ 1558.214637] Call Trace:
> [ 1558.214673] [c000000004ad3790] [c0000000007aac20] schedule_timeout+0x180/0x2f0 (unreliable)
> [ 1558.214779] [c000000004ad3960] [c0000000000158d0] __switch_to+0x200/0x350
> [ 1558.214870] [c000000004ad39c0] [c0000000007adbb4] __schedule+0x414/0x9e0
> [ 1558.214961] [c000000004ad3a90] [c0000000003b4e54] blk_mq_freeze_queue_wait+0x64/0xd0
> [ 1558.215107] [c000000004ad3af0] [d000000034011964] nvme_revalidate_disk+0xd4/0x3a0 [nvme]
> [ 1558.215386] [c000000004ad3b90] [c0000000003c2398] rescan_partitions+0x98/0x390
> [ 1558.215508] [c000000004ad3c60] [c0000000003bb7ac] __blkdev_reread_part+0x9c/0xd0
> [ 1558.215599] [c000000004ad3c90] [c0000000003bb818] blkdev_reread_part+0x38/0x70
> [ 1558.215935] [c000000004ad3cc0] [c0000000003bc334] blkdev_ioctl+0x3b4/0xb80
> [ 1558.216016] [c000000004ad3d20] [c0000000002cbcd0] block_ioctl+0x70/0x90
> [ 1558.216114] [c000000004ad3d40] [c000000000296b38] do_vfs_ioctl+0x458/0x740
> [ 1558.216192] [c000000004ad3dd0] [c000000000296ee4] SyS_ioctl+0xc4/0xe0
> [ 1558.216275] [c000000004ad3e30] [c00000000000a17c] system_call+0x38/0xb4
>
> It appears that systemd-udevd is triggering every time HTX writes to the
> boot sector (partition table) of the raw drive, and this is causing the
> revalidate calls which expose the issue with the block driver mq freeze.
> With a partition table on each drive, HTX will no longer be writing the
> partition table and no longer triggering systemd to re-read the
> partition table and try to freeze I/O.
>
> The fix for this is provided by the following upstream commit:
>
> 966d2b0 percpu-refcount: fix reference leak during percpu-atomic
> transition
>
> which needs to be pulled into 16.04 (as well as newer releases).
>
> ** Affects: ubuntu
> Importance: Undecided
> Assignee: Taco Screen team (taco-screen-team)
> Status: New
>
>
> ** Tags: architecture-ppc64le bugnameltc-148242 severity-critical targetmilestone-inin16042
--
Michael Hohnbaum
OIL Program Manager
Power (ppc64el) Development Project Manager
Canonical, Ltd.