Kernel with virtio-blk oops

Bug #1592541 reported by Ryan Harper on 2016-06-14
This bug affects 1 person
Affects                    Status      Importance  Assigned to
linux (Ubuntu)             Incomplete  Medium      Joseph Salisbury
  Trusty                   Incomplete  Medium      Joseph Salisbury
linux-lts-utopic (Ubuntu)  Incomplete  Medium      Joseph Salisbury

Bug Description

The following oops was found in a Trusty cloud image while doing storage operations with software RAID, LVM and various filesystems.

[ 85.327298] general protection fault: 0000 [#1] SMP
[ 85.327806] Modules linked in: bcache btrfs jfs xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear cirrus syscopyarea sysfillrect sysimgblt psmouse virtio_scsi ttm drm_kms_helper drm pata_acpi floppy
[ 85.328008] CPU: 0 PID: 6 Comm: kworker/u2:0 Not tainted 3.16.0-71-generic #92~14.04.1-Ubuntu
[ 85.328008] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 85.328008] Workqueue: writeback bdi_writeback_workfn (flush-251:0)
[ 85.328008] task: ffff88003c2732f0 ti: ffff88003c2a0000 task.ti: ffff88003c2a0000
[ 85.328008] RIP: 0010:[<ffffffff813631ce>] [<ffffffff813631ce>] __blk_bios_map_sg+0x1be/0x3d0
[ 85.328008] RSP: 0018:ffff88003c2a38d8 EFLAGS: 00010206
[ 85.328008] RAX: 3355167b09fe31e4 RBX: 0000000000000c00 RCX: 0000000000000000
[ 85.328008] RDX: 3355167b09fe31e5 RSI: ffffea0000bd2a00 RDI: 0000000000000000
[ 85.328008] RBP: ffff88003c2a3958 R08: ffff880028d86520 R09: 0000000000000080
[ 85.328008] R10: 0000000000000000 R11: 000000002ed79000 R12: 0000000000000000
[ 85.328008] R13: 0000000000000c00 R14: 0000000000000000 R15: ffff88003c2a3968
[ 85.328008] FS: 0000000000000000(0000) GS:ffff88003e200000(0000) knlGS:0000000000000000
[ 85.328008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 85.328008] CR2: 00000000025b5000 CR3: 0000000028d8a000 CR4: 00000000000006f0
[ 85.328008] Stack:
[ 85.328008] ffff88002313e958 ffff8800001dc4e0 ffff880037370f80 0100000000000000
[ 85.328008] ffff880028d86520 ffff880000000080 0000000000000000 ffffea0000bd2a00
[ 85.328008] 0000000000000c00 ffffea0000bb5e00 0000000000001000 ffff8800001dc340
[ 85.328008] Call Trace:
[ 85.328008] [<ffffffff81363415>] blk_rq_map_sg+0x35/0x170
[ 85.328008] [<ffffffff814e4950>] virtio_queue_rq+0xa0/0x240
[ 85.328008] [<ffffffff813671e7>] __blk_mq_run_hw_queue+0x1c7/0x320
[ 85.328008] [<ffffffff81367885>] blk_mq_run_hw_queue+0x65/0x80
[ 85.328008] [<ffffffff813685df>] blk_mq_insert_requests+0xcf/0x150
[ 85.328008] [<ffffffff81369139>] blk_mq_flush_plug_list+0x129/0x140
[ 85.328008] [<ffffffff8135ef11>] blk_flush_plug_list+0xd1/0x220
[ 85.328008] [<ffffffff8135f434>] blk_finish_plug+0x14/0x40
[ 85.328008] [<ffffffff8116ed7d>] generic_writepages+0x4d/0x60
[ 85.328008] [<ffffffff8116ff3e>] do_writepages+0x1e/0x40
[ 85.328008] [<ffffffff811ff660>] __writeback_single_inode+0x40/0x2a0
[ 85.328008] [<ffffffff8120021a>] writeback_sb_inodes+0x26a/0x440
[ 85.328008] [<ffffffff8120048f>] __writeback_inodes_wb+0x9f/0xd0
[ 85.328008] [<ffffffff81200743>] wb_writeback+0x283/0x320
[ 85.328008] [<ffffffff81202ea9>] bdi_writeback_workfn+0x1e9/0x4a0
[ 85.328008] [<ffffffff8108b748>] process_one_work+0x178/0x470
[ 85.328008] [<ffffffff8108bfb1>] worker_thread+0x121/0x570
[ 85.328008] [<ffffffff8108be90>] ? rescuer_thread+0x380/0x380
[ 85.328008] [<ffffffff81092ca2>] kthread+0xd2/0xf0
[ 85.328008] [<ffffffff81092bd0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 85.328008] [<ffffffff81776358>] ret_from_fork+0x58/0x90
[ 85.328008] [<ffffffff81092bd0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 85.328008] Code: 3f 44 89 4c 24 28 48 89 4c 24 30 e8 8d 99 03 00 8b 7c 24 44 48 8b 74 24 38 4c 8b 44 24 20 44 8b 4c 24 28 48 8b 4c 24 30 49 89 07 <48> 8b 10 83 e2 03 40 f6 c6 03 0f 85 a0 01 00 00 48 09 f2 89 78
[ 85.328008] RIP [<ffffffff813631ce>] __blk_bios_map_sg+0x1be/0x3d0
[ 85.328008] RSP <ffff88003c2a38d8>
[ 85.355829] ---[ end trace 12bf400b01eb42cf ]---
[ 85.356327] BUG: unable to handle kernel paging request at ffffffffffffffd8
[ 85.356926] IP: [<ffffffff81093380>] kthread_data+0x10/0x20
[ 85.357393] PGD 1c16067 PUD 1c18067 PMD 0
[ 85.357778] Oops: 0000 [#2] SMP
[ 85.358067] Modules linked in: bcache btrfs jfs xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear cirrus syscopyarea sysfillrect sysimgblt psmouse virtio_scsi ttm drm_kms_helper drm pata_acpi floppy
[ 85.360215] CPU: 0 PID: 6 Comm: kworker/u2:0 Tainted: G D 3.16.0-71-generic #92~14.04.1-Ubuntu
[ 85.360215] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 85.360215] task: ffff88003c2732f0 ti: ffff88003c2a0000 task.ti: ffff88003c2a0000
[ 85.360215] RIP: 0010:[<ffffffff81093380>] [<ffffffff81093380>] kthread_data+0x10/0x20
[ 85.360215] RSP: 0018:ffff88003c2a3690 EFLAGS: 00010002
[ 85.360215] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000f
[ 85.360215] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003c2732f0
[ 85.360215] RBP: ffff88003c2a3690 R08: 0000000000000000 R09: 000000018027001c
[ 85.360215] R10: ffffffff813621da R11: ffffea0000dcb540 R12: ffff88003e2130c0
[ 85.360215] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88003c2732f0
[ 85.360215] FS: 0000000000000000(0000) GS:ffff88003e200000(0000) knlGS:0000000000000000
[ 85.360215] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 85.360215] CR2: 0000000000000028 CR3: 000000003f360000 CR4: 00000000000006f0
[ 85.360215] Stack:
[ 85.360215] ffff88003c2a36a8 ffffffff8108c8d1 ffff88003c273828 ffff88003c2a3708
[ 85.360215] ffffffff8177213e ffff88003c2732f0 ffff88003c2a3fd8 00000000000130c0
[ 85.360215] 00000000000130c0 ffff88003c2732f0 ffff88003c273a48 ffff88003c2732e0
[ 85.360215] Call Trace:
[ 85.360215] [<ffffffff8108c8d1>] wq_worker_sleeping+0x11/0x90
[ 85.360215] [<ffffffff8177213e>] __schedule+0x57e/0x7c0
[ 85.360215] [<ffffffff817723a9>] schedule+0x29/0x70
[ 85.360215] [<ffffffff81071987>] do_exit+0x6e7/0xa70
[ 85.360215] [<ffffffff81017999>] oops_end+0xa9/0x150
[ 85.360215] [<ffffffff81017d4b>] die+0x4b/0x70
[ 85.360215] [<ffffffff810149c6>] do_general_protection+0x126/0x1b0
[ 85.360215] [<ffffffff817783c8>] general_protection+0x28/0x30
[ 85.360215] [<ffffffff813631ce>] ? __blk_bios_map_sg+0x1be/0x3d0
[ 85.360215] [<ffffffff813631b3>] ? __blk_bios_map_sg+0x1a3/0x3d0
[ 85.360215] [<ffffffff81363415>] blk_rq_map_sg+0x35/0x170
[ 85.360215] [<ffffffff814e4950>] virtio_queue_rq+0xa0/0x240
[ 85.360215] [<ffffffff813671e7>] __blk_mq_run_hw_queue+0x1c7/0x320
[ 85.360215] [<ffffffff81367885>] blk_mq_run_hw_queue+0x65/0x80
[ 85.360215] [<ffffffff813685df>] blk_mq_insert_requests+0xcf/0x150
[ 85.360215] [<ffffffff81369139>] blk_mq_flush_plug_list+0x129/0x140
[ 85.360215] [<ffffffff8135ef11>] blk_flush_plug_list+0xd1/0x220
[ 85.360215] [<ffffffff8135f434>] blk_finish_plug+0x14/0x40
[ 85.360215] [<ffffffff8116ed7d>] generic_writepages+0x4d/0x60
[ 85.360215] [<ffffffff8116ff3e>] do_writepages+0x1e/0x40
[ 85.360215] [<ffffffff811ff660>] __writeback_single_inode+0x40/0x2a0
[ 85.360215] [<ffffffff8120021a>] writeback_sb_inodes+0x26a/0x440
[ 85.360215] [<ffffffff8120048f>] __writeback_inodes_wb+0x9f/0xd0
[ 85.360215] [<ffffffff81200743>] wb_writeback+0x283/0x320
[ 85.360215] [<ffffffff81202ea9>] bdi_writeback_workfn+0x1e9/0x4a0
[ 85.360215] [<ffffffff8108b748>] process_one_work+0x178/0x470
[ 85.360215] [<ffffffff8108bfb1>] worker_thread+0x121/0x570
[ 85.360215] [<ffffffff8108be90>] ? rescuer_thread+0x380/0x380
[ 85.360215] [<ffffffff81092ca2>] kthread+0xd2/0xf0
[ 85.360215] [<ffffffff81092bd0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 85.360215] [<ffffffff81776358>] ret_from_fork+0x58/0x90
[ 85.360215] [<ffffffff81092bd0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 85.360215] Code: 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 c8 04 00 00 55 48 89 e5 <48> 8b 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
[ 85.360215] RIP [<ffffffff81093380>] kthread_data+0x10/0x20
[ 85.360215] RSP <ffff88003c2a3690>
[ 85.360215] CR2: ffffffffffffffd8
[ 85.360215] ---[ end trace 12bf400b01eb42d0 ]---
[ 85.360215] Fixing recursive fault but reboot is needed!

(The full boot/install log will be attached).
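
One possible reading of the two faults above: the first is reported as a general protection fault rather than a page fault, which is what x86-64 raises when an instruction dereferences a non-canonical address. The marked instruction bytes `<48> 8b 10` appear to decode as a 64-bit load through RAX, and RAX holds 3355167b09fe31e4. The sketch below checks that value, and the CR2 of the second oops, for canonical form. The helper name `is_canonical` and the 48-bit virtual address width are assumptions made for illustration; neither comes from the report.

```python
# Sketch: classify addresses taken from the oops as canonical or not.
# Assumption: 48-bit virtual addresses (4-level paging), so bits 63..47
# of a valid address must all be equal.
VA_BITS = 48


def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """Return True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top in (0, all_ones)


# Register values copied from the two oops dumps in this report.
SAMPLES = {
    "RAX, first oops": 0x3355167B09FE31E4,
    "RSP, first oops": 0xFFFF88003C2A38D8,
    "CR2, second oops": 0xFFFFFFFFFFFFFFD8,
}

if __name__ == "__main__":
    for name, addr in SAMPLES.items():
        kind = "canonical" if is_canonical(addr) else "non-canonical"
        print(f"{name}: {addr:#018x} is {kind}")
```

Under that assumption RAX is non-canonical, which would be consistent with a garbage pointer being dereferenced in __blk_bios_map_sg. The second fault address is canonical but unmapped, which matches the ordinary "unable to handle kernel paging request" message.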

The oops looks similar to the following issue, which is fixed in newer kernels:

https://bugzilla.novell.com/show_bug.cgi?id=888259

That issue resulted in the following patch being sent and accepted (from Canonical!):

https://lkml.org/lkml/2014/10/9/339

The fix is in vivid onward but is still missing from trusty. The oops is fairly hard to recreate.

Ryan Harper (raharper) wrote :

Here's the qemu launch command used, in case that's useful.

 qemu-system-x86_64 -enable-kvm \
   -device virtio-scsi-pci,id=virtio-scsi-xkvm \
   -device virtio-net-pci,netdev=net00 -netdev type=user,id=net00 \
   -m 1024 \
   -serial file:/var/lib/jenkins/slaves/venonat/workspace/curtin-vmtest-venonat-devel/output/TrustyHWEUTestRaid5Bcache/logs/install-serial.log \
   -nographic \
   -drive file=/tmp/launch.Cnc0wx/boot.img,if=none,cache=unsafe,format=qcow2,id=boot,index=0 \
   -device virtio-blk,drive=boot \
   -drive file=/var/lib/jenkins/slaves/venonat/workspace/curtin-vmtest-venonat-devel/output/TrustyHWEUTestRaid5Bcache/disks/install_disk.img,if=none,cache=unsafe,format=raw,id=drv2,index=2 \
   -device virtio-blk,drive=drv2,serial=dev2,logical_block_size=512,physical_block_size=512,min_io_size=512 \
   -drive file=/var/lib/jenkins/slaves/venonat/workspace/curtin-vmtest-venonat-devel/output/TrustyHWEUTestRaid5Bcache/disks/extra_disk_0.img,if=none,cache=unsafe,format=raw,id=drv3,index=3 \
   -device virtio-blk,drive=drv3,serial=dev3,logical_block_size=512,physical_block_size=512,min_io_size=512 \
   -drive file=/var/lib/jenkins/slaves/venonat/workspace/curtin-vmtest-venonat-devel/output/TrustyHWEUTestRaid5Bcache/disks/extra_disk_1.img,if=none,cache=unsafe,format=raw,id=drv4,index=4 \
   -device virtio-blk,drive=drv4,serial=dev4,logical_block_size=512,physical_block_size=512,min_io_size=512 \
   -drive file=/var/lib/jenkins/slaves/venonat/workspace/curtin-vmtest-venonat-devel/output/TrustyHWEUTestRaid5Bcache/disks/extra_disk_2.img,if=none,cache=unsafe,format=raw,id=drv5,index=5 \
   -device virtio-blk,drive=drv5,serial=dev5,logical_block_size=512,physical_block_size=512,min_io_size=512 \
   -drive file=/var/lib/jenkins/slaves/venonat/workspace/curtin-vmtest-venonat-devel/output/TrustyHWEUTestRaid5Bcache/disks/extra_disk_3.img,if=none,cache=unsafe,format=raw,id=drv6,index=6 \
   -device virtio-blk,drive=drv6,serial=dev6,logical_block_size=512,physical_block_size=512,min_io_size=512 \
   -kernel /srv/images/trusty/amd64/20160606/utopic/generic/boot-kernel \
   -initrd /srv/images/trusty/amd64/20160606/utopic/generic/boot-initrd \
   -append "root=/dev/vda ds=nocloud-net;seedfrom=http://10.100.0.103:40646/ console=ttyS0 "
kvm pid=6128. my pid=6123
QEMU 2.0.0 monitor - type 'help' for more information
(qemu)
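
The launch command is long enough that it is easy to miss which disks actually sit behind virtio-blk (the virtio-scsi-pci controller is present but no drive references it). As a rough aid, the sketch below pairs each `-device virtio-blk,drive=ID` with the matching `-drive ...,id=ID` and reports the backing file. The function name `virtio_blk_disks` is hypothetical, and the sample command is an abbreviated form of the one above with the Jenkins workspace paths shortened.

```python
import shlex


def virtio_blk_disks(cmdline: str) -> dict:
    """Map the drive id of each virtio-blk device to its backing file.

    Limitation: options are split on plain commas, so file names that
    contain QEMU's escaped ",," sequence are not handled.
    """
    args = shlex.split(cmdline)
    drives = {}    # drive id -> file path, collected from -drive options
    attached = []  # drive ids referenced by -device virtio-blk
    for flag, value in zip(args, args[1:]):
        opts = value.split(",")
        if flag == "-drive":
            kv = dict(o.split("=", 1) for o in opts if "=" in o)
            if "id" in kv:
                drives[kv["id"]] = kv.get("file")
        elif flag == "-device" and opts[0] == "virtio-blk":
            kv = dict(o.split("=", 1) for o in opts[1:] if "=" in o)
            if "drive" in kv:
                attached.append(kv["drive"])
    return {drive_id: drives.get(drive_id) for drive_id in attached}


# Abbreviated sample; the disk path is shortened for illustration.
SAMPLE = (
    "qemu-system-x86_64 -enable-kvm "
    "-device virtio-scsi-pci,id=virtio-scsi-xkvm "
    "-drive file=/tmp/launch.Cnc0wx/boot.img,if=none,cache=unsafe,"
    "format=qcow2,id=boot,index=0 "
    "-device virtio-blk,drive=boot "
    "-drive file=disks/install_disk.img,if=none,cache=unsafe,"
    "format=raw,id=drv2,index=2 "
    "-device virtio-blk,drive=drv2,serial=dev2,logical_block_size=512"
)

if __name__ == "__main__":
    for drive_id, path in virtio_blk_disks(SAMPLE).items():
        print(f"{drive_id}: {path}")
```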

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1592541

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: utopic

Having posted, I see that this is actually a utopic kernel that oopsed (3.16.0-71-generic).

However, the same oops stack is still possible on trusty: the blk-merge code in 3.13 doesn't include the fix that appears in the vivid kernel.
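
The kernel release can be read straight from the "CPU:" line of the oops, which is how the utopic (3.16) kernel shows up despite the Trusty image. A minimal sketch of that lookup follows. The function name `kernel_release` and the mapping table are illustrative assumptions: 3.13 for trusty and 3.16 for utopic come from this thread, while 3.19 for vivid is added from general knowledge of Ubuntu releases. A base version alone does not prove whether a fix is present, since patches can be backported.

```python
import re

# Assumed mapping of upstream base version to Ubuntu series.
SERIES_BY_VERSION = {
    (3, 13): "trusty",
    (3, 16): "utopic",
    (3, 19): "vivid",
}


def kernel_release(oops_text: str):
    """Return (release string, series name or None) found in oops text.

    Returns None when no "<x>.<y>.<z>-<abi>-generic" release is present.
    """
    match = re.search(r"\b(\d+)\.(\d+)\.\d+-\d+-generic\b", oops_text)
    if match is None:
        return None
    base = (int(match.group(1)), int(match.group(2)))
    return match.group(0), SERIES_BY_VERSION.get(base)


# Line copied from the first oops in this report.
CPU_LINE = (
    "[ 85.328008] CPU: 0 PID: 6 Comm: kworker/u2:0 Not tainted "
    "3.16.0-71-generic #92~14.04.1-Ubuntu"
)

if __name__ == "__main__":
    print(kernel_release(CPU_LINE))
```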

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
summary: - trusty kernel with virtio-blk oops
+ Utopic kernel with virtio-blk oops
Changed in linux (Ubuntu):
status: Incomplete → Triaged
Changed in linux (Ubuntu Trusty):
status: New → Triaged
importance: Undecided → Medium
no longer affects: linux-lts-utopic (Ubuntu Trusty)
Changed in linux-lts-utopic (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
summary: - Utopic kernel with virtio-blk oops
+ Kernel with virtio-blk oops
tags: added: trusty
Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
status: Triaged → In Progress
Changed in linux (Ubuntu Trusty):
status: Triaged → In Progress
Changed in linux-lts-utopic (Ubuntu):
status: Triaged → In Progress
Changed in linux (Ubuntu Trusty):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux-lts-utopic (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I built an lts-utopic test kernel with a cherry-pick of commit 764f612. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1592541/

Can you test this kernel and see if it resolves this bug?

Thanks in advance!

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Changed in linux (Ubuntu Trusty):
status: In Progress → Incomplete
Changed in linux-lts-utopic (Ubuntu):
status: In Progress → Incomplete