Activity log for bug #1572630

Date Who What changed Old value New value Message
2016-04-20 15:22:12 Dale Hamel bug added bug
2016-04-20 15:22:12 Dale Hamel attachment added loopcrash https://bugs.launchpad.net/bugs/1572630/+attachment/4640780/+files/loopcrash
2016-04-20 15:22:36 Dale Hamel attachment added dmidecode https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/+attachment/4640781/+files/dmidecode
2016-04-20 15:24:10 Dale Hamel attachment added lspci https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/+attachment/4640782/+files/lspci
2016-04-20 15:27:04 Dale Hamel attachment added lshw https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/+attachment/4640783/+files/lshw
2016-04-21 17:31:48 Launchpad Janitor linux-lts-xenial (Ubuntu): status New Confirmed
2016-05-19 05:38:03 Alberto Salvia Novella linux-lts-xenial (Ubuntu): importance Undecided Critical
2016-06-28 06:11:31 Proton attachment added Screenshot https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/+attachment/4691400/+files/Snip20160628_3.png
2016-07-06 16:45:52 Eric Desrochers bug added subscriber Eric Desrochers
2016-07-13 14:11:33 Eric Desrochers linux-lts-xenial (Ubuntu): assignee Eric Desrochers (slashd)
2016-07-13 14:12:04 Eric Desrochers linux-lts-xenial (Ubuntu): assignee Eric Desrochers (slashd)
2016-07-13 14:12:29 Eric Desrochers linux-lts-xenial (Ubuntu): assignee Eric Desrochers (slashd)
2016-07-21 13:37:01 Eric Desrochers linux-lts-xenial (Ubuntu): status Confirmed In Progress
2016-07-21 13:39:15 Eric Desrochers linux-lts-xenial (Ubuntu): importance Critical Medium
2016-08-22 08:57:30 Arne de Bruijn bug added subscriber Arne de Bruijn
2016-08-29 17:16:01 Eric Desrochers description We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical. We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation. When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel. On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed. I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal. [Impact] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it.
[ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 [Test Case] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. [Regression Potential] * Fix implemented upstream starting with v4.6-rc1 * The fix is fairly straightforward given the stack trace and the upstream fix. * The fix is hard to verify, but user "Proton" was able to confirm that the test kernel including the fix solves this particular problem. [Other Info] * https://lkml.org/lkml/2016/3/16/40 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7 [Original Description] We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical. We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation. When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel.
On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed. I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal.
2016-08-29 17:16:45 Eric Desrochers description [Impact] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 [Test Case] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. [Regression Potential] * Fix implemented upstream starting with v4.6-rc1 * The fix is fairly straightforward given the stack trace and the upstream fix. * The fix is hard to verify, but user "Proton" was able to confirm that the test kernel including the fix solves this particular problem. [Other Info] * https://lkml.org/lkml/2016/3/16/40 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7 [Original Description] We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical. We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation.
When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel. On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed. I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal. [Impact] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 [Test Case] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. [Regression Potential]  * Fix implemented upstream starting with v4.6-rc1  * The fix is fairly straightforward given the stack trace and the upstream fix.
* The fix is hard to verify, but user "Proton" was able to confirm that the test kernel including the fix solves this particular problem: https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/comments/23 [Other Info]  * https://lkml.org/lkml/2016/3/16/40  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7 [Original Description] We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical. We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation. When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel. On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed. I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal.
2016-08-29 17:23:21 Eric Desrochers description [Impact] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 [Test Case] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. [Regression Potential]  * Fix implemented upstream starting with v4.6-rc1  * The fix is fairly straightforward given the stack trace and the upstream fix.  * The fix is hard to verify, but user "Proton" was able to confirm that the test kernel including the fix solves this particular problem: https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/comments/23 [Other Info]  * https://lkml.org/lkml/2016/3/16/40  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7 [Original Description] We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical.
We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation. When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel. On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed. I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal. [Impact] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 [Test Case] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. [Regression Potential]  * Fix implemented upstream starting with v4.6-rc1  * The fix is fairly straightforward given the stack trace and the upstream fix.  * The fix is hard to verify, but user "Proton" was able to confirm that upstream mainline 4.6-rc1 solves the situation and that the test kernel I have provided including the fix solves this particular problem as well.
Confirmation by Proton: https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/comments/23 [Other Info]  * https://lkml.org/lkml/2016/3/16/40  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7 [Original Description] We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical. We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation. When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel. On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed. I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal.
2016-08-29 17:25:52 Eric Desrochers description [Impact] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 [Test Case] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. [Regression Potential]  * Fix implemented upstream starting with v4.6-rc1  * The fix is fairly straightforward given the stack trace and the upstream fix.  * The fix is hard to verify, but user "Proton" was able to confirm that upstream mainline 4.6-rc1 solves the situation and that the test kernel I have provided including the fix solves this particular problem as well. Confirmation by Proton: https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/comments/23 [Other Info]  * https://lkml.org/lkml/2016/3/16/40  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7 [Original Description] We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it.
[ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical. We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation. When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel. On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed. I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal. [Impact] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 [Test Case] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached.
[Regression Potential]  * Fix implemented upstream starting with v4.6-rc1  * The fix is hard to verify, but user "Proton" was able to confirm that upstream mainline 4.6-rc1 solves the situation and that the test kernel I have provided including the fix solves this particular problem as well. Confirmation by Proton: https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/comments/23 [Other Info]  * https://lkml.org/lkml/2016/3/16/40  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7 [Original Description] We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical. We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation. When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel. On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed.
I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal.
2016-08-29 17:26:10 Eric Desrochers description [Impact] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 [Test Case] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. [Regression Potential]  * Fix implemented upstream starting with v4.6-rc1  * The fix is hard to verify, but user "Proton" was able to confirm that upstream mainline 4.6-rc1 solves the situation and that the test kernel I have provided including the fix solves this particular problem as well. Confirmation by Proton: https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/comments/23 [Other Info]  * https://lkml.org/lkml/2016/3/16/40  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7 [Original Description] We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical.
We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation. When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel. On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed. I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal. [Impact] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 [Test Case] At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. [Regression Potential]  * Fix implemented upstream starting with v4.6-rc1 * The fix is fairly straightforward given the stack trace.  * The fix is hard to verify, but user "Proton" was able to confirm that upstream mainline 4.6-rc1 solves the situation and that the test kernel I have provided including the fix solves this particular problem as well.
Confirmation by Proton: https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/comments/23 [Other Info]  * https://lkml.org/lkml/2016/3/16/40  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9  * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7 [Original Description] We discovered a pretty serious regression introduced in 4.4.0-18. At boot-time, the kernel will panic somewhere in 'blk_mq_register_disk', a snippet of the trace is below, and full panic dump is attached. The panic dump was collected via serial console, as the kernel panics so early that we cannot kdump it. [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160 [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160 [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490 [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270 This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but the trace is not identical. We discovered this issue when we were experimenting with linux-generic-lts-xenial from trusty-updates on a 14.04 installation. When we installed it, 4.4.0-15 was the current package, and it worked fine and provided a large amount of improvements for us. Background security updates installed 4.4.0-18, and this updated grub and became the default kernel. On a reboot, the node panics about 2 seconds in, resulting in a machine in a dead state. We were able to boot a rescue image and roll back to 4.4.0-15, which works nicely. We currently have pinning on 4.4.0-15 to prevent this problem from coming back, but would prefer to see the problem fixed. I'll attach lspci, lshw, and dmidecode for our hardware as well, but this is happening on pretty vanilla supermicro nodes. We are able to consistently reproduce it on our hardware. It is not reproducible in EC2, only on metal.
2016-08-31 16:52:43 Tim Gardner bug task added linux (Ubuntu)
2016-08-31 16:53:03 Tim Gardner nominated for series Ubuntu Xenial
2016-08-31 16:53:03 Tim Gardner bug task added linux (Ubuntu Xenial)
2016-08-31 16:53:03 Tim Gardner bug task added linux-lts-xenial (Ubuntu Xenial)
2016-08-31 16:53:03 Tim Gardner nominated for series Ubuntu Yakkety
2016-08-31 16:53:03 Tim Gardner bug task added linux (Ubuntu Yakkety)
2016-08-31 16:53:03 Tim Gardner bug task added linux-lts-xenial (Ubuntu Yakkety)
2016-08-31 16:53:54 Tim Gardner linux (Ubuntu Xenial): status New Fix Committed
2016-08-31 16:54:06 Tim Gardner linux (Ubuntu Yakkety): status New Fix Released
2016-08-31 16:54:24 Tim Gardner linux-lts-xenial (Ubuntu Xenial): status New In Progress
2016-08-31 16:54:47 Tim Gardner linux-lts-xenial (Ubuntu Xenial): status In Progress Invalid
2016-08-31 16:55:05 Tim Gardner linux-lts-xenial (Ubuntu Yakkety): status In Progress Invalid
2016-08-31 16:55:18 Tim Gardner nominated for series Ubuntu Trusty
2016-08-31 16:55:18 Tim Gardner bug task added linux (Ubuntu Trusty)
2016-08-31 16:55:18 Tim Gardner bug task added linux-lts-xenial (Ubuntu Trusty)
2016-08-31 16:55:29 Tim Gardner linux-lts-xenial (Ubuntu Trusty): status New In Progress
2016-08-31 16:55:38 Tim Gardner linux (Ubuntu Trusty): status New Invalid
2016-08-31 16:55:55 Tim Gardner linux-lts-xenial (Ubuntu Yakkety): assignee Eric Desrochers (slashd)
2016-08-31 16:56:39 Tim Gardner linux (Ubuntu Xenial): assignee Eric Desrochers (slashd)
2016-08-31 16:57:08 Tim Gardner linux-lts-xenial (Ubuntu Trusty): assignee Eric Desrochers (slashd)
2016-09-13 20:48:55 Eric Desrochers linux (Ubuntu Xenial): importance Undecided Medium
2016-09-13 20:49:02 Eric Desrochers linux-lts-xenial (Ubuntu Trusty): importance Undecided Medium
2016-09-26 19:17:28 Brad Figg tags kernel kernel-bug panic xenial kernel kernel-bug panic verification-needed-xenial xenial
2016-09-28 14:43:08 Eric Desrochers linux-lts-xenial (Ubuntu Xenial): assignee Eric Desrochers (slashd)
2016-09-28 14:43:25 Eric Desrochers linux-lts-xenial (Ubuntu Xenial): status Invalid In Progress
2016-09-28 14:43:42 Eric Desrochers linux-lts-xenial (Ubuntu Xenial): importance Undecided Medium
2016-09-28 17:01:44 Arne de Bruijn removed subscriber Arne de Bruijn
2016-10-09 15:29:47 Eric Desrochers tags kernel kernel-bug panic verification-needed-xenial xenial kernel kernel-bug panic verification-done-xenial xenial
2016-10-10 17:37:52 Launchpad Janitor linux (Ubuntu Xenial): status Fix Committed Fix Released
2016-10-10 17:37:52 Launchpad Janitor cve linked 2016-6828
2016-10-10 17:37:52 Launchpad Janitor cve linked 2016-7039
2016-10-10 17:37:53 Launchpad Janitor linux (Ubuntu Xenial): status Fix Committed Fix Released
2016-10-10 17:51:13 Launchpad Janitor linux-lts-xenial (Ubuntu Trusty): status In Progress Fix Released
2016-10-10 17:51:14 Launchpad Janitor linux-lts-xenial (Ubuntu Trusty): status In Progress Fix Released
2016-10-11 03:25:29 Eric Desrochers linux-lts-xenial (Ubuntu Xenial): assignee Eric Desrochers (slashd)