blk-mq: possible deadlock on CPU hot(un)plug

Bug #1670634 reported by bugproxy on 2017-03-07
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Ubuntu on IBM z Systems
High
Canonical Kernel Team
linux (Ubuntu)
High
Juerg Haefliger
Xenial
Undecided
Unassigned

Bug Description

== Comment: #0 - Carsten Jacobi <email address hidden> - 2017-03-07 03:35:31 ==
I'm evaluating Ubuntu-Xenial on z for development purposes, the test system is installed in an LPAR with one FCP-LUN which is accessable by 4 pathes (all pathes are configured).
The system hangs regularly when I make packages with "pdebuild" using the pbuilder packaging suit.
The local Linux development team helped me out with a pre-analysis that I can post here (thanks a lot for that):

With the default settings and under a certain workload,
blk_mq seems to get into a presumed "deadlock".
Possibly this happens on CPU hot(un)plug.

After the I/O stalled, a dump was pulled manually.
The following information is from the crash dump pre-analysis.

$ zgetdump -i dump.0
General dump info:
  Dump format........: elf
  Version............: 1
  UTS node name......: mclint
  UTS kernel release.: 4.4.0-65-generic
  UTS kernel version.: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
  System arch........: s390x (64 bit)
  CPU count (online).: 2
  Dump memory range..: 8192 MB
Memory map:
  0000000000000000 - 00000001b831afff (7043 MB)
  00000001b831b000 - 00000001ffffffff (1149 MB)

Things look similarly with HWE kernel ubuntu16.04-4.8.0-34.36~16.04.1.

      KERNEL: vmlinux.full
    DUMPFILE: dump.0
        CPUS: 2
        DATE: Fri Mar 3 14:31:07 2017
      UPTIME: 02:11:20
LOAD AVERAGE: 13.00, 12.92, 11.37
       TASKS: 411
    NODENAME: mclint
     RELEASE: 4.4.0-65-generic
     VERSION: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
     MACHINE: s390x (unknown Mhz)
      MEMORY: 7.8 GB
       PANIC: ""
         PID: 0
     COMMAND: "swapper/0"
        TASK: bad528 (1 of 2) [THREAD_INFO: b78000]
         CPU: 0
       STATE: TASK_RUNNING (ACTIVE)
        INFO: no panic task found

crash> dev -d
MAJOR GENDISK NAME REQUEST_QUEUE TOTAL ASYNC SYNC DRV
...
    8 1e1d6d800 sda 1e1d51210 0 23151 4294944145 N/A(MQ)
    8 1e4e06800 sdc 2081b18 0 23148 4294944148 N/A(MQ)
    8 1f07800 sdb 20c7568 0 23195 4294944101 N/A(MQ)
    8 1e4e06000 sdd 1e4e31210 0 23099 4294944197 N/A(MQ)
  252 1e1d6c800 dm-0 1e1d51b18 9 1 8 N/A(MQ)
...

So both dm-mpath and sd have requests pending in their block multiqueue.
The large numbers of sd look strange and seem to be the unsigned formatting of the values shown for async multiplied by -1.

[ 0.798256] Linux version 4.4.0-65-generic (buildd@z13-011) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017 (Ubuntu 4.4.0-65.86-generic 4.4.49)
[ 0.798262] setup: Linux is running natively in 64-bit mode
[ 0.798290] setup: Max memory size: 8192MB
[ 0.798298] setup: Reserving 196MB of memory at 7996MB for crashkernel (System RAM: 7996MB)

[ 0.836923] Kernel command line: root=/dev/mapper/mclint_vg-root rootflags=subvol=@ crashkernel=196M BOOT_IMAGE=0

[ 5281.179428] INFO: task xfsaild/dm-11:1604 blocked for more than 120 seconds.
[ 5281.179437] Not tainted 4.4.0-65-generic #86-Ubuntu
[ 5281.179438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 5281.179440] xfsaild/dm-11 D 00000000007bcf52 0 1604 2 0x00000000
[ 5281.179444] 00000001e931c230 00000000001a6964 00000001e6f9b958 00000001e6f9b9d8
                      00000001e15795f0 00000001e6f9b988 0000000000ce8c00 00000001ea805c70
                      00000001ea805c00 0000000000ba5ed0 00000001e931c1d0 00000001e1579b20
                      00000001ea805c00 00000001e15795f0 00000001ea805c00 0000000000000000
                      00000000007d3978 00000000007bc9f8 00000001e6f9b9d8 00000001e6f9ba40
[ 5281.179454] Call Trace:
[ 5281.179461] ([<00000000007bc9f8>] __schedule+0x300/0x810)
[ 5281.179462] [<00000000007bcf52>] schedule+0x4a/0xb0
[ 5281.179465] [<00000000007c02aa>] schedule_timeout+0x232/0x2a8
[ 5281.179466] [<00000000007bde50>] wait_for_common+0x110/0x1c8
[ 5281.179472] [<000000000017b602>] flush_work+0x42/0x58
[ 5281.179564] [<000003ff805e14ba>] xlog_cil_force_lsn+0x7a/0x238 [xfs]
[ 5281.179589] [<000003ff805dee82>] _xfs_log_force+0x9a/0x2e8 [xfs]
[ 5281.179615] [<000003ff805df114>] xfs_log_force+0x44/0x100 [xfs]
[ 5281.179640] [<000003ff805ec668>] xfsaild+0x170/0x798 [xfs]
[ 5281.179643] [<000000000018335a>] kthread+0x10a/0x110
[ 5281.179645] [<00000000007c0ff6>] kernel_thread_starter+0x6/0xc
[ 5281.179646] [<00000000007c0ff0>] kernel_thread_starter+0x0/0xc

see below

[ 5281.179664] INFO: task cpuplugd:2260 blocked for more than 120 seconds.
[ 5281.179665] Not tainted 4.4.0-65-generic #86-Ubuntu
[ 5281.179666] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 5281.179668] cpuplugd D 00000000007bcf52 0 2260 1 0x00000000
[ 5281.179670] 00000001e782e318 00000000001a6964 000000007bc4ba58 000000007bc4bad8
                      00000001d4076db0 000000007bc4ba88 0000000000ce8c00 00000001ea805c70
                      00000001ea805c00 0000000000ba5ed0 00000001e782e2b8 00000001d40772e0
                      00000001ea805c00 00000001d4076db0 00000001ea805c00 0000000000000000
                      00000000007d3978 00000000007bc9f8 000000007bc4bad8 000000007bc4bb40
[ 5281.179678] Call Trace:
[ 5281.179680] ([<00000000007bc9f8>] __schedule+0x300/0x810)
[ 5281.179681] [<00000000007bcf52>] schedule+0x4a/0xb0
[ 5281.179685] [<0000000000516cc2>] blk_mq_freeze_queue_wait+0x62/0xc8
[ 5281.179687] [<0000000000519412>] blk_mq_queue_reinit_notify+0x11a/0x240
[ 5281.179690] [<00000000001844c6>] notifier_call_chain+0x56/0x98
[ 5281.179692] [<000000000018466a>] __raw_notifier_call_chain+0x2a/0x38
[ 5281.179696] [<00000000001605ac>] _cpu_up+0x10c/0x1b0
[ 5281.179698] [<0000000000160738>] cpu_up+0xe8/0x108
[ 5281.179700] [<00000000005d08be>] cpu_subsys_online+0x56/0xb0
[ 5281.179703] [<00000000005ca1c2>] device_online+0x82/0xc0
[ 5281.179704] [<00000000005ca28a>] online_store+0x8a/0x98
[ 5281.179710] [<00000000003a4d12>] kernfs_fop_write+0x13a/0x190
[ 5281.179712] [<000000000031218c>] vfs_write+0x94/0x1a0
[ 5281.179714] [<0000000000312e9e>] SyS_write+0x66/0xd8
[ 5281.179715] [<00000000007c0e3e>] system_call+0xd6/0x264
[ 5281.179718] [<000003ff803df478>] zlib_tr_flush_block+0x650/0x830 [zlib_deflate]

Cpuplugd performs CPU hot(un)plug based on configurable rules.
https://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.ludd/ludd_r_cpuplugdcmd.html
https://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.ludd/ludd_t_cpu_act.html
https://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.ludd/ludd_r_numa_know_cpu.html

[ 5281.179769] INFO: task kworker/0:2:23669 blocked for more than 120 seconds.
[ 5281.179770] Not tainted 4.4.0-65-generic #86-Ubuntu
[ 5281.179771] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 5281.179773] kworker/0:2 D 00000000007bcf52 0 23669 2 0x00000000
[ 5281.179801] Workqueue: xfs-cil/dm-11 xlog_cil_push_work [xfs]
[ 5281.179802] 00000001c925e318 00000000001a6964 0000000068c5f9f8 0000000068c5fa78
                      00000001da99b6d8 0000000068c5fa10 0000000000ce8c00 00000001ea805c70
                      00000001ea805c00 00000001e782e94c 00000001c925e2b8 00000001da99bc08
                      00000001ea805c00 00000001da99b6d8 00000001ea805c00 0000000000000000
                      00000000007d3978 00000000007bc9f8 0000000068c5fa78 0000000068c5fae0
[ 5281.179810] Call Trace:
[ 5281.179812] ([<00000000007bc9f8>] __schedule+0x300/0x810)
[ 5281.179813] [<00000000007bcf52>] schedule+0x4a/0xb0
[ 5281.179839] [<000003ff805de144>] xlog_state_get_iclog_space+0x124/0x338 [xfs]
[ 5281.179864] [<000003ff805de702>] xlog_write+0x1ea/0x800 [xfs]
[ 5281.179890] [<000003ff805e09a6>] xlog_cil_push+0x286/0x508 [xfs]
[ 5281.179891] [<000000000017c400>] process_one_work+0x1a0/0x4f8
[ 5281.179893] [<000000000017c7a2>] worker_thread+0x4a/0x530
[ 5281.179894] [<000000000018335a>] kthread+0x10a/0x110
[ 5281.179896] [<00000000007c0ff6>] kernel_thread_starter+0x6/0xc
[ 5281.179898] [<00000000007c0ff0>] kernel_thread_starter+0x0/0xc

While above kworker executes a work item for a long duration, other processes block in turn on flush_work for a long duration.

[ 5281.179728] INFO: task kworker/0:1:4454 blocked for more than 120 seconds.
[ 5281.179730] Not tainted 4.4.0-65-generic #86-Ubuntu
[ 5281.179731] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 5281.179732] kworker/0:1 D 00000000007bcf52 0 4454 2 0x00000000
[ 5281.179738] Workqueue: events vmstat_shepherd
[ 5281.179739] 00000001c925ac40 00000000001a6964 000000007eb8bb38 000000007eb8bbb8
                      00000001e782e2b8 000000007eb8bb50 0000000000ce8c00 00000001ea805c70
                      00000001c925abe0 00000001c925b274 00000001c925abe0 00000001e782e7e8
                      00000001ea805c00 00000001e782e2b8 00000001ea805c00 0000000000000000
                      00000000007d3978 00000000007bc9f8 000000007eb8bbb8 000000007eb8bc20
[ 5281.179747] Call Trace:
[ 5281.179749] ([<00000000007bc9f8>] __schedule+0x300/0x810)
[ 5281.179750] [<00000000007bcf52>] schedule+0x4a/0xb0
[ 5281.179752] [<00000000007bd39a>] schedule_preempt_disabled+0x2a/0x38
[ 5281.179753] [<00000000007becc4>] __mutex_lock_slowpath+0xcc/0x170
[ 5281.179755] [<00000000007bedc6>] mutex_lock+0x5e/0x78
[ 5281.179756] [<000000000015fba0>] get_online_cpus+0x40/0x68
[ 5281.179757] [<00000000002a3ccc>] vmstat_shepherd+0x44/0x168
[ 5281.179759] [<000000000017c400>] process_one_work+0x1a0/0x4f8
[ 5281.179761] [<000000000017c7a2>] worker_thread+0x4a/0x530
[ 5281.179762] [<000000000018335a>] kthread+0x10a/0x110
[ 5281.179764] [<00000000007c0ff6>] kernel_thread_starter+0x6/0xc
[ 5281.179765] [<00000000007c0ff0>] kernel_thread_starter+0x0/0xc

This work item cannot progress maybe because cpuplugd:2260 above "hangs" in the cpu hotplug notifier chain.

The low level device driver (here zfcp) is completely idle without any pending I/O after the lockup happened and all its paths are in good state and could service I/O but it simply does not get any new I/O requests from the upper layers (scsi / block). Zfcp does not implement blk_mq so dm or scsi translate which works in general but fails with above workload. There were no other undesired events, i.e. no path interruptions nor any recovery in zfcp.

CVE References

Default Comment by Bridge

tags: added: architecture-s39064 bugnameltc-152303 severity-high targetmilestone-inin16041

------- Comment From <email address hidden> 2017-03-07 06:33 EDT-------
[CAN] we have addl. informations (dumps. etc) where can be provide such information to you. approx 1GByte....

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1670634/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
tags: added: s390x
removed: bot-comment
affects: ubuntu → linux (Ubuntu)
Changed in ubuntu-z-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-03-07 11:44 EDT-------
Please provide Debug-Info to this IBM_BOX folder
https://ibm.box.com/s/y10o4u7bcvc6nk7rgk2gfmk039ii5d1i
After Debugging this folder will be deleted.

Douglas Miller (dougmill-ibm) wrote :

This looks a lot like the problem in LP #1662673, but that fix is supposed to be in kernel 4.4.0-65-generic. Might be worth confirming though. Or perhaps confirming that the fix actually works on the Z architecture (depends on how the architecture/compiler handles 'bool').

Tim Gardner (timg-tpi) wrote :

Douglas - The patch referred to in LP #1662673 ("percpu-refcount: fix reference leak during percpu-atomic transition") is in Ubuntu-4.4.0-65.86 which has yet to be released.

bugproxy (bugproxy) wrote :
Download full text (4.6 KiB)

------- Comment From <email address hidden> 2017-03-08 05:56 EDT-------
Just tried the newest Kernel 4.4.0-66, and I'm still running into the hang. Here the final statements in /var/log/syslog (the lines, that never make it out onto the disk):

Mar 8 11:26:31 mclint multipathd[955]: mpatha: sdb - tur checker timed out
Mar 8 11:26:31 mclint multipathd[955]: 8:16: reinstated
Mar 8 11:26:31 mclint multipathd[955]: mpatha: sdd - tur checker timed out
Mar 8 11:26:31 mclint rsyslogd-2007: action 'action 10' suspended, next retry is Wed Mar 8 11:27:01 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Mar 8 11:26:31 mclint multipathd[955]: 8:48: reinstated
Mar 8 11:26:31 mclint multipathd[955]: mpatha: sdc - tur checker timed out
Mar 8 11:26:31 mclint multipathd[955]: 8:32: reinstated
Mar 8 11:26:32 mclint multipathd[955]: mpatha: sda - tur checker timed out
Mar 8 11:26:32 mclint multipathd[955]: 8:0: reinstated

And this here shows up on the sclp_line console:

? 961.419327! INFO: task cpuplugd:2604 blocked for more than 120 seconds.
? 961.419337! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419338! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419404! INFO: task irqbalance:2651 blocked for more than 120 seconds.
? 961.419406! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419407! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419450! INFO: task kworker/0:4:3801 blocked for more than 120 seconds.
? 961.419451! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419452! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419494! INFO: task kworker/1:1:4548 blocked for more than 120 seconds.
? 961.419495! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419496! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419539! INFO: task kworker/0:0H:20302 blocked for more than 120 seconds.
? 961.419540! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419541! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419764! INFO: task kworker/0:0:66641 blocked for more than 120 seconds.
? 961.419766! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419767! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419895! INFO: task rm:81710 blocked for more than 120 seconds.
? 961.419896! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419897! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 1081.419024! INFO: task systemd:1 blocked for more than 120 seconds.
? 1081.419033! Not tainted 4.4.0-66-generic #87-Ubuntu
? 1081.419035! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 1081.419148! INFO: task cpuplugd:2604 blocked for more than 120 seconds.
? 1081.419150! Not tainted 4.4.0-66-generic #87-Ubuntu
? 1081.419151! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 1081.419186! INFO: task irqbalance:2651 blocked for more than 120 seconds.
? 1081.419187! Not tainted 4.4.0-66-generic #87-Ubuntu
? 1081.419188! "echo 0 > /proc/sys/kernel/hung_task_tim...

Read more...

Tim Gardner (timg-tpi) wrote :

Due to an emergency CVE rebase, that patch still hasn't made it into the wild. Here is a test kernel that definitely has the patch

http://kernel.ubuntu.com/~rtg/lp1670634/

Douglas Miller (dougmill-ibm) wrote :

I'm confused about the release mechanics, I guess. Looking at the git repository, I see tag "Ubuntu-4.4.0-65.86" (for example) and that tag commit does contain the fix. Is it then possible for a kernel labeled "4.4.0-65-generic #86" to NOT contain that patch? Am I making a gross assumption that these tags reflect what was released?

Douglas Miller (dougmill-ibm) wrote :

Carsten, I am currently thinking there are two possibilities here. Maybe three.

1) The fix I submitted is not in the kernel(s) you are running.
2) The s390 compiler does not produce the necessary code to implicitly convert long int to bool.
3) You are hitting a different bug that just happens to look the same.

For #2, a simple compiler test could be done to check what code is produced when assigning a long int value to a bool (GCC _Bool). If you want to pursue that let me know. I am not familiar with s390 object code, so we might need someone to interpret the objdump. As far as I can tell, the s390 Linux kernel does use GCC _Bool for the data type "bool", so it would then be a matter of what code the s390 GCC produces in this case.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-03-09 11:28 EDT-------
Ok, I just tested the kernel from http://kernel.ubuntu.com/~rtg/lp1670634/ and so far this looks good! I was able to Make the OpenAFS Ubuntu package three times with pdebuild without running into a hang, and this was a very good candidate to run into the hang scenario.
I'd like to activate OpenAFS on that system so that I'll again be able to also start to run jobs as users other than root and also to build more packages for Ubuntu, and at the moment there is only one subtle obstacle:

root@mclint:/var/lib/dkms/openafs/1.6.20.1/build# file /usr/src/linux-headers-4.4.0-67-generic/scripts/basic/fixdep
/usr/src/linux-headers-4.4.0-67-generic/scripts/basic/fixdep: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=4f746ae15cb57aa0f264c965a8061844f5f21fa2, not stripped
root@mclint:/var/lib/dkms/openafs/1.6.20.1/build# dpkg -S /usr/src/linux-headers-4.4.0-67-generic/scripts/basic/fixdep
linux-headers-4.4.0-67-generic: /usr/src/linux-headers-4.4.0-67-generic/scripts/basic/fixdep

Somehow, you put me the x86_64 version of "fixdep" into the linux-headers package in the ~rtg/lp1670634 folder. I checked the same file on the other linux-headers packages on my system and they were present and for the s390x-architecture. I'm a little puzzled here, as the linux-headers package is an "_all.deb" I thought those packages should be free of binary files for a specific architecture. I'll try to replace the fixdep program and try on ...

Tim Gardner (timg-tpi) wrote :

An x86-64 fixdep is an artifact of cross compiling.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-03-13 09:57 EDT-------
Next iteration: I was able to make OpenAFS for the proposed kernel fixed and so I was able to start the next pdebuild job .... and again I run into a hang scenario:

Mar 13 14:30:32 mclint multipathd[881]: mpatha: sda - tur checker timed out
Mar 13 14:30:32 mclint multipathd[881]: 8:0: reinstated
Mar 13 14:30:32 mclint multipathd[881]: mpatha: sdc - tur checker timed out
Mar 13 14:30:32 mclint multipathd[881]: 8:32: reinstated
Mar 13 14:30:33 mclint multipathd[881]: mpatha: sdb - tur checker timed out
Mar 13 14:30:33 mclint multipathd[881]: 8:16: reinstated
Mar 13 14:30:36 mclint multipathd[881]: mpatha: sdd - tur checker timed out
Mar 13 14:30:36 mclint rsyslogd-2007: action 'action 10' suspended, next retry is Mon Mar 13 14:32:06 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Mar 13 14:30:36 mclint multipathd[881]: 8:48: reinstated

I just pulled a dump and compress it, eventually I'll upload it to Box. I'll go on and try to reproduce this without OpenAFS and I'm very confident that this hang is not related to AFS at all ...

General dump info:
Dump format........: s390mv
Version............: 5
Dump created.......: Mon, 13 Mar 2017 14:42:39 +0100
Dump ended.........: Mon, 13 Mar 2017 14:43:17 +0100
Dump CPU ID........: 9efc729648000
UTS node name......: mclint
UTS kernel release.: 4.4.0-67-generic
UTS kernel version.: #88 SMP Wed Mar 8 14:48:51 UTC 2017
Build arch.........: s390x (64 bit)
System arch........: s390x (64 bit)
CPU count (online).: 2
CPU count (real)...: 4
Dump memory range..: 8192 MB
Real memory range..: 8192 MB

Memory map:
0000000000000000 - 00000001b831afff (7043 MB)
00000001b831b000 - 00000001ffffffff (1149 MB)

Dump device info:
Volume 0: 0.0.8409 (online/active)
Volume 1: 0.0.840a (online/active)

bugproxy (bugproxy) wrote :
Download full text (3.5 KiB)

------- Comment From <email address hidden> 2017-03-14 12:07 EDT-------
I was able to run into the hand now also without openafs, /var/log/syslog:

Mar 14 15:10:46 mclint multipathd[887]: mpatha: sda - tur checker timed out
Mar 14 15:10:46 mclint multipathd[887]: 8:0: reinstated
Mar 14 15:10:46 mclint multipathd[887]: mpatha: sdb - tur checker timed out
Mar 14 15:10:46 mclint multipathd[887]: 8:16: reinstated
Mar 14 15:10:46 mclint multipathd[887]: mpatha: sdc - tur checker timed out
Mar 14 15:10:46 mclint multipathd[887]: 8:32: reinstated
Mar 14 15:10:46 mclint multipathd[887]: mpatha: sdd - tur checker timed out
Mar 14 15:10:46 mclint multipathd[887]: 8:48: reinstated

On the sclp_line console:

? 9841.149452! INFO: task btrfs-transacti:634 blocked for more than 120 seconds.

? 9841.149459! Not tainted 4.4.0-67-generic #88
? 9841.149461! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 9841.149627! INFO: task cpuplugd:2409 blocked for more than 120 seconds.
? 9841.149628! Not tainted 4.4.0-67-generic #88
? 9841.149629! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 9841.149674! INFO: task irqbalance:2515 blocked for more than 120 seconds.
? 9841.149675! Not tainted 4.4.0-67-generic #88
? 9841.149676! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 9841.149715! INFO: task kworker/0:2:3661 blocked for more than 120 seconds.
? 9841.149716! Not tainted 4.4.0-67-generic #88
? 9841.149717! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 9841.149752! INFO: task tar:16648 blocked for more than 120 seconds.
? 9841.149753! Not tainted 4.4.0-67-generic #88
? 9841.149754! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 9961.149482! INFO: task btrfs-transacti:634 blocked for more than 120 seconds.

? 9961.149489! Not tainted 4.4.0-67-generic #88
? 9961.149490! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 9961.149640! INFO: task rs:main Q:Reg:1995 blocked for more than 120 seconds.
? 9961.149642! Not tainted 4.4.0-67-generic #88
? 9961.149642! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 9961.149727! INFO: task cpuplugd:2409 blocked for more than 120 seconds.
? 9961.149729! Not tainted 4.4.0-67-generic #88
? 9961.149729! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 9961.149772! INFO: task irqbalance:2515 blocked for more than 120 seconds.
? 9961.149773! Not tainted 4.4.0-67-generic #88
? 9961.149773! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 9961.149811! INFO: task kworker/0:2:3661 blocked for more than 120 seconds.
? 9961.149812! Not tainted 4.4.0-67-generic #88
? 9961.149812! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.

I just made a new dump ->
General dump info:
Dump format........: s390mv
Version............: 5
Dump created.......: Tue, 14 Mar 2017 15:45:29 +0100
Dump ended.........: Tue, 14 Mar 2017 15:46:07 +0100
Dump CPU ID........: 9efc729648000
UTS node name......: mclint
UTS kernel release.: 4.4.0-...

Read more...

bugproxy (bugproxy) wrote :
Download full text (3.6 KiB)

------- Comment From <email address hidden> 2017-03-28 12:23 EDT-------
Just tried Kernel 4.4.0-70 ->

/var/log/syslog:

Mar 28 18:07:46 mclint multipathd[888]: mpatha: sda - tur checker timed out
Mar 28 18:07:46 mclint multipathd[888]: 8:0: reinstated
Mar 28 18:07:46 mclint multipathd[888]: mpatha: sdb - tur checker timed out
Mar 28 18:07:46 mclint multipathd[888]: 8:16: reinstated
Mar 28 18:07:46 mclint multipathd[888]: mpatha: sdc - tur checker timed out
Mar 28 18:07:46 mclint multipathd[888]: 8:32: reinstated
Mar 28 18:07:46 mclint multipathd[888]: mpatha: sdd - tur checker timed out
Mar 28 18:07:46 mclint multipathd[888]: 8:48: reinstated

/dev/sclp_line0:

? 459.779353! BTRFS error (device dm-1): bdev /dev/dm-1 errs: wr 0, rd 1, flush
0, corrupt 0, gen 0
? 459.779503! BTRFS error (device dm-1): bdev /dev/dm-1 errs: wr 0, rd 2, flush
0, corrupt 0, gen 0
? 481.287452! INFO: task xfsaild/dm-11:1727 blocked for more than 120 seconds.
? 481.287459! Not tainted 4.4.0-70-generic #91-Ubuntu
? 481.287461! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.287647! INFO: task cpuplugd:2402 blocked for more than 120 seconds.
? 481.287648! Not tainted 4.4.0-70-generic #91-Ubuntu
? 481.287649! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.287696! INFO: task irqbalance:2508 blocked for more than 120 seconds.
? 481.287697! Not tainted 4.4.0-70-generic #91-Ubuntu
? 481.287698! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.287740! INFO: task kworker/0:19:22353 blocked for more than 120 seconds.
? 481.287741! Not tainted 4.4.0-70-generic #91-Ubuntu
? 481.287742! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.287769! INFO: task kworker/0:21:22355 blocked for more than 120 seconds.
? 481.287770! Not tainted 4.4.0-70-generic #91-Ubuntu
? 481.287771! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.287875! INFO: task tar:22484 blocked for more than 120 seconds.
? 481.287876! Not tainted 4.4.0-70-generic #91-Ubuntu
? 481.287877! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 601.288111! INFO: task systemd:1 blocked for more than 120 seconds.
? 601.288118! Not tainted 4.4.0-70-generic #91-Ubuntu
? 601.288119! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 601.288207! INFO: task xfsaild/dm-11:1727 blocked for more than 120 seconds.
? 601.288208! Not tainted 4.4.0-70-generic #91-Ubuntu
? 601.288209! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 601.288372! INFO: task rs:main Q:Reg:2002 blocked for more than 120 seconds.
? 601.288374! Not tainted 4.4.0-70-generic #91-Ubuntu
? 601.288374! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 601.288496! INFO: task cpuplugd:2402 blocked for more than 120 seconds.
? 601.288497! Not tainted 4.4.0-70-generic #91-Ubuntu
? 601.288497! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.

crashdump-info:

KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-70-...

Read more...

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-03-30 08:50 EDT-------
New Kernel, new hang ...

? 961.242228! INFO: task kworker/1:1:38 blocked for more than 120 seconds.
? 961.242235! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.242236! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.242480! INFO: task xfsaild/dm-11:1742 blocked for more than 120 seconds.
? 961.242481! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.242482! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.242933! INFO: task rs:main Q:Reg:2043 blocked for more than 120 seconds.
? 961.242934! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.242935! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.243407! INFO: task cpuplugd:2355 blocked for more than 120 seconds.
? 961.243409! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.243410! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.243544! INFO: task irqbalance:2447 blocked for more than 120 seconds.
? 961.243546! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.243546! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.243617! INFO: task kworker/0:2H:3385 blocked for more than 120 seconds.
? 961.243618! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.243619! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.243911! INFO: task kworker/0:1H:6035 blocked for more than 120 seconds.
? 961.243912! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.243913! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.244405! INFO: task kworker/0:2:22440 blocked for more than 120 seconds.
? 961.244406! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.244407! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.244543! INFO: task kworker/0:4:22938 blocked for more than 120 seconds.
? 961.244545! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.244545! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.245404! INFO: task dpkg:24617 blocked for more than 120 seconds.
? 961.245405! Not tainted 4.4.0-71-generic #92-Ubuntu
? 961.245406! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.

And the dump:

KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-71-generic
DUMPFILE: mclint_20170330_kernel_4_4_0-71_without_openafs.dump
CPUS: 1
DATE: Thu Mar 30 12:14:27 2017
UPTIME: 00:24:42
LOAD AVERAGE: 14.32, 12.51, 7.23
TASKS: 407
NODENAME: mclint
RELEASE: 4.4.0-71-generic
VERSION: #92-Ubuntu SMP Fri Mar 24 13:03:47 UTC 2017
MACHINE: s390x (unknown Mhz)
MEMORY: 7.8 GB
PANIC: ""
PID: 0
COMMAND: "swapper/0"
TASK: bad528 [THREAD_INFO: b78000]
CPU: 0
STATE: TASK_RUNNING
INFO: no panic task found

The dump is already uploaded to Box

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-key
bugproxy (bugproxy) wrote :
Download full text (3.4 KiB)

------- Comment From <email address hidden> 2017-04-06 11:53 EDT-------
Same procedure as every kernel, /var/log/syslog:

Apr 6 16:17:29 mclint multipathd[881]: mpatha: sda - tur checker timed out
Apr 6 16:17:29 mclint multipathd[881]: 8:0: reinstated
Apr 6 16:17:29 mclint multipathd[881]: mpatha: sdb - tur checker timed out
Apr 6 16:17:29 mclint multipathd[881]: 8:16: reinstated
Apr 6 16:17:29 mclint multipathd[881]: mpatha: sdd - tur checker timed out
Apr 6 16:17:29 mclint multipathd[881]: 8:48: reinstated
Apr 6 16:17:29 mclint multipathd[881]: mpatha: sdc - tur checker timed out
Apr 6 16:17:29 mclint multipathd[881]: 8:32: reinstated

/dev/sclp_line0:

? 361.418628! INFO: task kworker/1:4:860 blocked for more than 120 seconds.
? 361.418635! Not tainted 4.4.0-72-generic #93-Ubuntu
? 361.418637! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 361.418722! INFO: task cpuplugd:2310 blocked for more than 120 seconds.
? 361.418723! Not tainted 4.4.0-72-generic #93-Ubuntu
? 361.418723! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 361.418766! INFO: task irqbalance:2416 blocked for more than 120 seconds.
? 361.418767! Not tainted 4.4.0-72-generic #93-Ubuntu
? 361.418768! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 361.418806! INFO: task kworker/0:2H:3403 blocked for more than 120 seconds.
? 361.418807! Not tainted 4.4.0-72-generic #93-Ubuntu
? 361.418808! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 361.418990! INFO: task kworker/0:9:4449 blocked for more than 120 seconds.
? 361.418991! Not tainted 4.4.0-72-generic #93-Ubuntu
? 361.418992! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.420013! INFO: task kworker/1:4:860 blocked for more than 120 seconds.
? 481.420020! Not tainted 4.4.0-72-generic #93-Ubuntu
? 481.420021! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.420091! INFO: task systemd-timesyn:1766 blocked for more than 120 seconds
.
? 481.420093! Not tainted 4.4.0-72-generic #93-Ubuntu
? 481.420093! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.420136! INFO: task rs:main Q:Reg:2023 blocked for more than 120 seconds.
? 481.420137! Not tainted 4.4.0-72-generic #93-Ubuntu
? 481.420138! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.420250! INFO: task cpuplugd:2310 blocked for more than 120 seconds.
? 481.420251! Not tainted 4.4.0-72-generic #93-Ubuntu
? 481.420252! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 481.420291! INFO: task irqbalance:2416 blocked for more than 120 seconds.
? 481.420292! Not tainted 4.4.0-72-generic #93-Ubuntu
? 481.420293! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.

Dump:

KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-72-generic
DUMPFILE: mclint_20170406_kernel_4_4_0-72_without_openafs.dump
CPUS: 1
DATE: Thu Apr 6 16:43:18 2017
UPTIME: 00:29:55
LOAD AVERAGE: 9.99, 9.56, 7.51
TASKS: 403
NODENAME: mclint
RELEASE: 4.4.0-7...

Read more...

Frank Heimes (frank-heimes) wrote :

I copied over the latest 4.4-72 dump from IBM Box to our Canonical private file share into my home:
/~fheimes/mclint_20170406_kernel_4_4_0-72_without_openafs.dump.bz2
also reachable via https ...

Changed in linux (Ubuntu):
status: New → Triaged
Joseph Salisbury (jsalisbury) wrote :

Is there an update on this bug?

Joseph Salisbury (jsalisbury) wrote :

@<email address hidden> It would be good to know if this bug is already fixed in the mainline kernel. Would it be possible for you to test 4.11-rc8? It can be downloaded from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc8/

tags: added: kernel-da-key
removed: kernel-key
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-06-06 12:31 EDT-------
@jsalisbury

Sorry for the late late late answer, but finally I've found time and resources to test the 4.11-rc8 Kernel. And yes, this one looks promising :-)

I was able to build OpenAFS with the pbuilder environment and also trying to build the firefox packages with pbuilder did not drive me into the hang scenario I usually face.
I'll do some more tests but this looks promising.
Hmm, if the solution for this bug is Kernel 4.11 than we may have to speak out a warning to whoever is utilizing multipathing on Kernels with an earlier version.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-06-06 12:46 EDT-------
@<email address hidden>

Just to prevent confusion, its probably just a typo, but I think you mean multi-queue (blk-mq and scsi-mq) - thats a different feature. Multipathing (dm-multipath) should certainly work regardless.

Changed in ubuntu-z-systems:
status: New → Triaged
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-06-07 07:59 EDT-------
Hello Benjamin,

you're right and let me rephrase in more detail:

The multiqueue feature is questioned here, but the feature has an impact to multipathing, because Ubuntu-Xenial boots up and has the multiqueue feature turned on as default(!!). You can run "multipathing" but you will have to turn off the multiqueue feature explicitely if you don't want to potentially run into the hang scenario described here.
And my recommendation would be to turn the "multiqueue" feature off by default for all Kernel versions prior to 4.11 .... that's what I wanted to express with my previous post.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-06-08 10:28 EDT-------
I shut down the test system I used for the repros here yesterday. And all of a sudden during the shutdown process the system ran into a hang scenario again. I don't know whether that hang is related to the problem addressed here (blk_mq()), but I took a dump and uploaded it to Box as a precaution though ->

mclint_20170607_kernel_4_11_0-041100rc8_without_openafs.dump.bz2

General dump info:
Dump format........: s390mv
Version............: 5
Dump created.......: Wed, 07 Jun 2017 16:56:28 +0200
Dump ended.........: Wed, 07 Jun 2017 16:57:06 +0200
Dump CPU ID........: 9efc729648000
UTS node name......: mclint
UTS kernel release.: 4.11.0-041100rc8-generic
UTS kernel version.: #201704232131 SMP Mon Apr 24 02:10:15 UTC 2017
Build arch.........: s390x (64 bit)
System arch........: s390x (64 bit)
CPU count (online).: 2
CPU count (real)...: 4
Dump memory range..: 8192 MB
Real memory range..: 8192 MB

Frank Heimes (frank-heimes) wrote :

I copied over the latest 4_11_0-041100rc8 dump from IBM Box to our Canonical private file share into my home:
~fheimes/mclint_20170607_kernel_4_11_0-041100rc8_without_openafs.dump.bz2

tags: added: kernel-key
removed: kernel-da-key
Joseph Salisbury (jsalisbury) wrote :

@<email address hidden>

Can you confirm if v4.11-rc8 fixed the bug or not? Per comment #24 it ts un-clear if it does or not.

If we find that it is fixed in the v4.11-rc8 kernel, or any other newer kernel, we can perform a "Reverse" bisect to identify the commit that fixes the bug, then backport it to prior releases.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-07-10 04:34 EDT-------
@jsalisbury:

I tried the kernel and didn't experience a hang as the system ran, but I ran into a hang when I shut the system down after my test. The dump of that very hang was uploaded from me to our Box account and Frank Heimes downloaded it to forward it to the Kernel team.
I'd like to have their (the kernel team's) statement whether in the dump they see a relation to blk_mq() or not before we close this item!

Joseph Salisbury (jsalisbury) wrote :

Thanks for the update. Can you test v4.11-rc7? It is available from:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc7/

We can perform a "Reverse" kernel bisect if we can identify the last bad kernel version and the first good one.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-07-12 10:45 EDT-------
@jsalisbury:

Pardon to object in some respect, but I can't get the point why I should try another 4.11rc-Kernel and what you refer to with a "bad kernel" and a "good kernel".
The 4.11rc8 kernel seemed to be fine and I just got into a hang when the system was shut down by me. I just want to know whether that hang is related to blk_mq() or not and one glance into the crash-dump I uploaded to box will answer that question. I could also look into the crash-dump by myself, I just don't have the debug symbols for kernel 4.11rc8-s390x.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-07-13 10:26 EDT-------
Just wanted to try the 4.11rc7 kernel and my DKMS-OpenAFS packages don't build on it. I already experienced this once with a package from the kernel-ppa repository:

root@mclint:~# file /usr/src/linux-headers-4.11.0-041100rc7-generic/arch/s390/tools/gen_facilities
/usr/src/linux-headers-4.11.0-041100rc7-generic/arch/s390/tools/gen_facilities: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=4590a42ae8b1bf9b2bd8d05e332e59bc7f47aa93, not stripped

You put x86-ELFs in your s390x linux-headers packages!!!! I need appropriate linux-headers packages to make repros.

Joseph Salisbury (jsalisbury) wrote :

The purpose of testing v4.12-rc7 is to narrow down that last kernel version that had the hang bug(The bad kernel) and the first kernel version that did not(The good kernel). This will allow us to identify the exact commit that fixes the hang bug. This can be accomplished by performing a "Reverse" bisect[0]. Once we know the commit that fixes the bug, we can SRU it to all the previous Ubuntu releases.

Are you not able to test for the hang bug without compiling the DKMS-OpenAFS packages? If so, did they compile okay when you tested v4.12-rc8?

[0] https://wiki.ubuntu.com/Kernel/KernelBisection

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-07-17 12:10 EDT-------
@jsalisbury:

To get your point: If 4.11rc8 was a "good kernel" and 4.11rc7 the last "bad kernel" you could try to get that very diff that fixes this problem and could apply it to the stable 4.4-Kernel line, I agree that's a feasible approach.
However, we must first make sure that 4.11rc8 is a "good kernel" and thatfore someone must look into the dumps I uploaded.
And the arch specific linux-headers-x.x.x-x-generic packages must be fixed for s390x. Without that package I can't build my DKMS-packages.

Juerg Haefliger (juergh) wrote :

I've tried to take a look at at the dump files but the BOX link referenced in comment #4 doesn't work for me. Either I don't have access or it has been removed.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-08-22 03:42 EDT-------
@Juerg. Now you should have access to the box folder...

Juerg Haefliger (juergh) wrote :

Sorry for the late reply, I'm just getting around looking at this. Yes, I do have access to the folder now, thanks!

Questions: Is this setup currently working with another distribution and you're just experiencing issues when running Ubuntu? If so, what's that other distro and kernel? Also, can you give me a high-level overview of your storage architecture?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-08-30 05:15 EDT-------
This function worked with SLES 12 SP2 kernel 4.4.74.

Juerg Haefliger (juergh) wrote :

Are you running specific CPU (un)plug tests in parallel with pdebuild? Can you post the contents of /etc/cpuplugd.conf?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-04 04:52 EDT-------
no specific cpu (un)plug tests where executed....

------- Comment (attachment only) From <email address hidden> 2017-09-04 04:51 EDT-------

Juerg Haefliger (juergh) wrote :

I'm able to reproduce the issue locally.

Juerg Haefliger (juergh) wrote :

Test passes with 4.13 and 4.12 and SUSE's SLE12 SP2 4.4.21-69-generic kernel.
Test fails with 4.11.

Changed in linux (Ubuntu):
assignee: nobody → Juerg Haefliger (juergh)
Juerg Haefliger (juergh) wrote :

The difference between the Xenial and SUSE kernel is that Xenial has:
  CONFIG_SCSI_MQ_DEFAULT=y
  CONFIG_DM_MQ_DEFAULT=y
but SUSE:
  # CONFIG_SCSI_MQ_DEFAULT is not set
  # CONFIG_DM_MQ_DEFAULT is not set

If I disable blk-mq in the Xenial kernel, the test passes.

The easiest 'fix' would be to simply disable blk-mq. This can be accomplished via the kernel commandline parameters: scsi_mod.use_blk_mq=0 dm_mod.use_blk_mq=0.

I also noticed that s390x is the only architecture where these options are enabled in the Xenial kernel. Is there a specific requirement for this?

tags: added: kernel-da-key
removed: kernel-key

------- Comment From <email address hidden> 2017-09-11 09:52 EDT-------
No specific requirement for s390 know, for enabled this config options.
They can be set to N....

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-11 10:46 EDT-------
Ok, I think then the discussion is back at the point I addressed with my comment from June 6th (comment 36 or 38)! If the Multi-Queue feature doesn't work with Kernel 4.4 yet, then you must not deliver it (that's exactly why SUSE did not activate that feature in their 4.4-Kernel package)!
And then again with Kernel 4.11 or 4.12 you can turn that option for those kernel packages on as finally there the feature seems to be mature and can be rolled out.

Juerg Haefliger (juergh) wrote :

Ok, we will turn these options off for Xenial 4.4 and bring s390x in line with the other architectures. Note that we *think* that the reason that they're enabled for s390x is that we initially received a config file from IBM that we used as a base.

Juerg Haefliger (juergh) wrote :

For the record, the following commit fixes the issue for later kernels (>4.11). What we're seeing with the 4.4 kernel is most likely a different issue though.

commit ba74b6f7fcc07355d087af6939712eed4a454821 (refs/bisect/new)
Author: Christoph Hellwig <email address hidden>
Date: Thu Aug 24 18:07:02 2017 +0200

    virtio_pci: fix cpu affinity support

    Commit 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for
    virtqueues"") removed the adjustment of the pre_vectors for the virtio
    MSI-X vector allocation which was added in commit fb5e31d9 ("virtio:
    allow drivers to request IRQ affinity when creating VQs"). This will
    lead to an incorrect assignment of MSI-X vectors, and potential
    deadlocks when offlining cpus.

    Signed-off-by: Christoph Hellwig <email address hidden>
    Fixes: 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for virtqueues")
    Reported-by: YASUAKI ISHIMATSU <email address hidden>
    Cc: <email address hidden>
    Signed-off-by: Michael S. Tsirkin <email address hidden>

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 007a4f366086..1c4797e53f68 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -107,6 +107,7 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
 {
        struct virtio_pci_device *vp_dev = to_vp_device(vdev);
        const char *name = dev_name(&vp_dev->vdev.dev);
+ unsigned flags = PCI_IRQ_MSIX;
        unsigned i, v;
        int err = -ENOMEM;

@@ -126,10 +127,13 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
                                        GFP_KERNEL))
                        goto error;

+ if (desc) {
+ flags |= PCI_IRQ_AFFINITY;
+ desc->pre_vectors++; /* virtio config vector */
+ }
+
        err = pci_alloc_irq_vectors_affinity(vp_dev->pci_dev, nvectors,
- nvectors, PCI_IRQ_MSIX |
- (desc ? PCI_IRQ_AFFINITY : 0),
- desc);
+ nvectors, flags, desc);
        if (err < 0)
                goto error;
        vp_dev->msix_enabled = 1;

Changed in linux (Ubuntu Xenial):
status: New → Fix Committed
Changed in linux (Ubuntu):
status: Triaged → Fix Committed
Changed in ubuntu-z-systems:
status: Triaged → Fix Committed
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-22 04:36 EDT-------
Just checked kernel 4.4.0-96. blk_mq() is still activated as default:

root@xxxxx:~# systool -v -m dm_mod
Module = "dm_mod"

Attributes:
uevent = <store method only>

Parameters:
reserved_bio_based_ios= "16"
reserved_rq_based_ios= "256"
stats_current_allocated_bytes= "0"
use_blk_mq = "Y"
root@xxxxx:~# systool -v -m scsi_mod
Module = "scsi_mod"

Attributes:
uevent = <store method only>

Parameters:
default_dev_flags = "0"
eh_deadline = "-1"
inq_timeout = "20"
max_luns = "512"
scan = "async"
scsi_logging_level = "0"
use_blk_mq = "Y"

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-26 07:41 EDT-------
Ok, just checked kernel 4.4.0-97 ... that looks much better:

root@xxxxx:~# uname -a
Linux mclint 4.4.0-97-generic #120-Ubuntu SMP Tue Sep 19 17:27:01 UTC 2017 s390x s390x s390x GNU/Linux
root@xxxxx:~# systool -v -m scsi_mod
Module = "scsi_mod"

Attributes:
uevent = <store method only>

Parameters:
default_dev_flags = "0"
eh_deadline = "-1"
inq_timeout = "20"
max_luns = "512"
scan = "async"
scsi_logging_level = "0"
use_blk_mq = "N"
root@xxxxx:~# systool -v -m dm_mod
Module = "dm_mod"

Attributes:
uevent = <store method only>

Parameters:
reserved_bio_based_ios= "16"
reserved_rq_based_ios= "256"
stats_current_allocated_bytes= "0"
use_blk_mq = "N"

blk_mq() is now turned off by default and that was our main concern! On top, I also started two "big" pdebuild processes (firefox) that so far had the potential to drive the system right into the hang scenario that was the origin cause to write this ticket. The build did not succeed, but the point is that the system did not run into the typical hang -> so I think you can consider this problem being solved!

Hi @jacobi,

Thank you very much for verifying the fix!

Kleber

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :
Download full text (7.8 KiB)

This bug was fixed in the package linux - 4.4.0-97.120

---------------
linux (4.4.0-97.120) xenial; urgency=low

  * linux: 4.4.0-97.120 -proposed tracker (LP: #1718149)

  * blk-mq: possible deadlock on CPU hot(un)plug (LP: #1670634)
    - [Config] s390x -- disable CONFIG_{DM, SCSI}_MQ_DEFAULT

  * Xenial update to 4.4.87 stable release (LP: #1715678)
    - irqchip: mips-gic: SYNC after enabling GIC region
    - i2c: ismt: Don't duplicate the receive length for block reads
    - i2c: ismt: Return EMSGSIZE for block reads with bogus length
    - ceph: fix readpage from fscache
    - cpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configs
    - cpuset: Fix incorrect memory_pressure control file mapping
    - alpha: uapi: Add support for __SANE_USERSPACE_TYPES__
    - CIFS: remove endian related sparse warning
    - wl1251: add a missing spin_lock_init()
    - xfrm: policy: check policy direction value
    - drm/ttm: Fix accounting error when fail to get pages for pool
    - kvm: arm/arm64: Fix race in resetting stage2 PGD
    - kvm: arm/arm64: Force reading uncached stage2 PGD
    - epoll: fix race between ep_poll_callback(POLLFREE) and ep_free()/ep_remove()
    - crypto: algif_skcipher - only call put_page on referenced and used pages
    - Linux 4.4.87

  * Xenial update to 4.4.86 stable release (LP: #1715430)
    - scsi: isci: avoid array subscript warning
    - ALSA: au88x0: Fix zero clear of stream->resources
    - btrfs: remove duplicate const specifier
    - i2c: jz4780: drop superfluous init
    - gcov: add support for gcc version >= 6
    - gcov: support GCC 7.1
    - lightnvm: initialize ppa_addr in dev_to_generic_addr()
    - p54: memset(0) whole array
    - lpfc: Fix Device discovery failures during switch reboot test.
    - arm64: mm: abort uaccess retries upon fatal signal
    - x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outsl
    - arm64: fpsimd: Prevent registers leaking across exec
    - scsi: sg: protect accesses to 'reserved' page array
    - scsi: sg: reset 'res_in_use' after unlinking reserved array
    - drm/i915: fix compiler warning in drivers/gpu/drm/i915/intel_uncore.c
    - Linux 4.4.86

  * Xenial update to 4.4.85 stable release (LP: #1714298)
    - af_key: do not use GFP_KERNEL in atomic contexts
    - dccp: purge write queue in dccp_destroy_sock()
    - dccp: defer ccid_hc_tx_delete() at dismantle time
    - ipv4: fix NULL dereference in free_fib_info_rcu()
    - net_sched/sfq: update hierarchical backlog when drop packet
    - ipv4: better IP_MAX_MTU enforcement
    - sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
    - tipc: fix use-after-free
    - ipv6: reset fn->rr_ptr when replacing route
    - ipv6: repair fib6 tree in failure case
    - tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
    - irda: do not leak initialized list.dev to userspace
    - net: sched: fix NULL pointer dereference when action calls some targets
    - net_sched: fix order of queue length updates in qdisc_replace()
    - mei: me: add broxton pci device ids
    - mei: me: add lewisburg device ids
    - Input: trackpoint - add new trackpoint firmware ID
    - Input: elan_i2c...

Read more...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-10-11 03:36 EDT-------
IBM Bugzilla status -> closed, Fix Released within Xenial

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers