Comment 1 for bug 1662673

Michael Hohnbaum (hohnbaum) wrote : Re: [Bug 1662673] [NEW] systemd-udevd hung in blk_mq_freeze_queue_wait testing unpartitioned NVMe drive

Leann,

This looks like one for the Kernel team.

                   Michael

On 02/07/2017 01:19 PM, Launchpad Bug Tracker wrote:
> bugproxy (bugproxy) has assigned this bug to you for Ubuntu:
>
> For reference, here is the stack of systemd-udevd seen in the hang:
>
> [ 1558.214013] INFO: task systemd-udevd:1778 blocked for more than 120 seconds.
> [ 1558.214318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1558.214556] systemd-udevd D 00003fff8dbdf7a0 0 1778 1 0x00040000
> [ 1558.214637] Call Trace:
> [ 1558.214673] [c000000004ad3790] [c0000000007aac20] schedule_timeout+0x180/0x2f0 (unreliable)
> [ 1558.214779] [c000000004ad3960] [c0000000000158d0] __switch_to+0x200/0x350
> [ 1558.214870] [c000000004ad39c0] [c0000000007adbb4] __schedule+0x414/0x9e0
> [ 1558.214961] [c000000004ad3a90] [c0000000003b4e54] blk_mq_freeze_queue_wait+0x64/0xd0
> [ 1558.215107] [c000000004ad3af0] [d000000034011964] nvme_revalidate_disk+0xd4/0x3a0 [nvme]
> [ 1558.215386] [c000000004ad3b90] [c0000000003c2398] rescan_partitions+0x98/0x390
> [ 1558.215508] [c000000004ad3c60] [c0000000003bb7ac] __blkdev_reread_part+0x9c/0xd0
> [ 1558.215599] [c000000004ad3c90] [c0000000003bb818] blkdev_reread_part+0x38/0x70
> [ 1558.215935] [c000000004ad3cc0] [c0000000003bc334] blkdev_ioctl+0x3b4/0xb80
> [ 1558.216016] [c000000004ad3d20] [c0000000002cbcd0] block_ioctl+0x70/0x90
> [ 1558.216114] [c000000004ad3d40] [c000000000296b38] do_vfs_ioctl+0x458/0x740
> [ 1558.216192] [c000000004ad3dd0] [c000000000296ee4] SyS_ioctl+0xc4/0xe0
> [ 1558.216275] [c000000004ad3e30] [c00000000000a17c] system_call+0x38/0xb4
>
> It appears that systemd-udevd is triggered every time HTX writes to the
> boot sector (partition table) of the raw drive, and this causes the
> revalidate calls that expose the issue with the blk-mq queue freeze.
> With a partition table on each drive, HTX no longer writes to the
> partition table, so systemd-udevd is no longer triggered to re-read the
> partition table and try to freeze I/O.
>
> The fix for this is provided by the following upstream commit:
>
> 966d2b0 percpu-refcount: fix reference leak during percpu-atomic transition
>
> which needs to be pulled into 16.04 (as well as newer releases).
>
> ** Affects: ubuntu
>      Importance: Undecided
>      Assignee: Taco Screen team (taco-screen-team)
>      Status: New
>
>
> ** Tags: architecture-ppc64le bugnameltc-148242 severity-critical targetmilestone-inin16042
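
For anyone trying to reproduce the path in the trace above (blkdev_ioctl -> blkdev_reread_part -> rescan_partitions), the partition re-read is what the BLKRRPART ioctl requests. The following is only a minimal sketch and is not taken from the bug report: the helper names are made up for illustration, and it assumes the standard Linux ioctl encodings, where BLKRRPART is _IO(0x12, 95) and powerpc sets a "no direction" bit that the generic layout leaves at zero.

```python
import fcntl
import os
import platform


def blkrrpart_request(machine=None):
    """Return the BLKRRPART ioctl number, _IO(0x12, 95), for a machine type.

    Hypothetical helper: only the powerpc and generic encodings are handled.
    """
    machine = machine or platform.machine()
    # powerpc encodes "no data direction" as 1 in a 3-bit field at bit 29;
    # the generic layout (x86, arm64, ...) encodes it as 0.
    direction = (1 << 29) if machine.startswith("ppc") else 0
    return direction | (0x12 << 8) | 95


def reread_partition_table(device):
    """Ask the kernel to re-read the partition table of a whole-disk device."""
    fd = os.open(device, os.O_RDONLY)
    try:
        # Assumption: this is the same request that leads to
        # blkdev_reread_part() in the stack trace quoted above.
        fcntl.ioctl(fd, blkrrpart_request())
    finally:
        os.close(fd)
```

Calling reread_partition_table() on a whole-disk node (for example a path such as /dev/nvme0n1, used here only as a placeholder) requires root. On a kernel without the fix, the call may block in the same way as the systemd-udevd task above, so it should only be tried on a test system.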

--
Michael Hohnbaum
OIL Program Manager
Power (ppc64el) Development Project Manager
Canonical, Ltd.