bcache crash

Bug #1411734 reported by Frank Banul on 2015-01-16
This bug affects 1 person
Affects: linux (Ubuntu)
Importance: High
Assigned to: Unassigned

Bug Description

I am using bcache with a 512GB SSD as the cache device and a 4TB HDD as the backing device. Backing up the bcache device sometimes results in a kernel crash. Below is an OCR of the crash log.

lsb_release -rd
Description: Ubuntu 14.04.1 LTS
Release: 14.04

Linux goliad 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

I have a picture of the screen as well if needed.

[127032.260022] R13: 0000000000000002 R14: ffff8800d5e27ff0 R15: ffff8800d5e28000
[127032.260952] FS: 0000000000000000(0000) GS:ffff88021fc80000(0000) knlGS:0000000000000000
[127032.261880] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[127032.262813] CR2: 0000000000000028 CR3: 000000021ec0e000 CR4: 00000000000007e0
[127032.263755] Stack:
[127032.264683] ffff8800d724f830 ffffffff81084f51 ffff8800d5e28000 ffff8800d724f890
[127032.265629] ffffffff81724f39 ffff8800d5e28000 ffff8800d724ffd8 0000000000013480
[127032.266577] 0000000000013480 ffff8800d5e28000 ffff8800d5e28650 ffff8800d5e27ff0
[127032.267523] Call Trace:
[127032.268453] [<ffffffff81084f51>] wq_worker_sleeping+0x11/0x90
[127032.269386] [<ffffffff81724f39>] __schedule+0x589/0x7d0
[127032.270323] [<ffffffff817251a9>] schedule+0x29/0x70
[127032.271260] [<ffffffff8106a15f>] do_exit+0x6df/0xa50
[127032.272194] [<ffffffff8172a329>] oops_end+0xa9/0x150
[127032.273115] [<ffffffff8171996d>] no_context+0x27e/0x28b
[127032.274038] [<ffffffff817199ed>] __bad_area_nosemaphore+0x73/0x1ca
[127032.274960] [<ffffffff81719b57>] bad_area_nosemaphore+0x13/0x15
[127032.275878] [<ffffffff8172ccf7>] __do_page_fault+0xa7/0x560
[127032.276791] [<ffffffff81197793>] ? alloc_pages_current+0xa3/0x160
[127032.277708] [<ffffffff811538fe>] ? __get_free_pages+0xe/0x50
[127032.278630] [<ffffffff8117083e>] ? kmalloc_order_trace+0x2e/0xa0
[127032.279553] [<ffffffff8172d1ca>] do_page_fault+0x1a/0x70
[127032.280474] [<ffffffff811a2941>] ? __kmalloc+0x211/0x230
[127032.281390] [<ffffffff81729628>] page_fault+0x28/0x30
[127032.282275] [<ffffffffa009a976>] ? bch_btree_node_read+0x176/0x550 [bcache]
[127032.283108] [<ffffffffa009bf45>] bch_btree_node_get+0x165/0x290 [bcache]
[127032.283933] [<ffffffffa009c108>] bch_btree_map_nodes_recurse+0x98/0x140 [bcache]
[127032.284760] [<ffffffffa009eb90>] ? bch_btree_insert_check_key+0x190/0x190 [bcache]
[127032.285593] [<ffffffff81151947>] ? mempool_free_slab+0x17/0x20
[127032.286418] [<ffffffff81151bb9>] ? mempool_free+0x49/0x90
[127032.287233] [<ffffffffa009f3c9>] __bch_btree_map_nodes+0x139/0x1c0 [bcache]
[127032.288053] [<ffffffffa009eb90>] ? bch_btree_insert_check_key+0x190/0x190 [bcache]
[127032.288873] [<ffffffff81151bb9>] ? mempool_free+0x49/0x90
[127032.289678] [<ffffffffa009f504>] bch_btree_insert+0xb4/0x120 [bcache]
[127032.290476] [<ffffffffa00a97ba>] bch_data_insert_keys+0x3a/0x160 [bcache]
[127032.291258] [<ffffffff81083a52>] process_one_work+0x182/0x450
[127032.292029] [<ffffffff81084841>] worker_thread+0x121/0x410
[127032.292786] [<ffffffff81084720>] ? rescuer_thread+0x430/0x430
[127032.293523] [<ffffffff8108b572>] kthread+0xd2/0xf0
[127032.294246] [<ffffffff8108b4a0>] ? kthread_create_on_node+0x1c0/0x1c0
[127032.294961] [<ffffffff817317bc>] ret_from_fork+0x7c/0xb0
[127032.295672] [<ffffffff8108b4a0>] ? kthread_create_on_node+0x1c0/0x1c0
[127032.296374] Code: 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 87 c0 03 00 00 55 48 89 e5 <48> 8b 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
[127032.298061] RIP [<ffffffff8108bc10>] kthread_data+0x10/0x20
[127032.298853] RSP <ffff8800d724f818>
[127032.299630] CR2: ffffffffffffffd8
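
For anyone decoding the tail of this log, the final fault address can be checked by hand. The following is a minimal Python sketch, assuming the `Code:` bytes and the last `CR2` value were transcribed correctly by the OCR. The bytes marked `<48> 8b 40 d8` encode a load from `-0x28(%rax)`, and CR2 `ffffffffffffffd8` is -0x28 as a signed 64-bit value, which is consistent with `%rax` being NULL inside `kthread_data`. The helper name `to_signed` is illustrative and not part of any kernel tool.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret an unsigned integer as a two's-complement signed value."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


# Values copied from the log above (assumed to be OCR-correct).
cr2 = 0xFFFFFFFFFFFFFFD8
faulting_insn = bytes.fromhex("488b40d8")  # bytes starting at <48> in the Code: line

# 48 = REX.W prefix, 8b = MOV r64, r/m64,
# 40 = ModRM (mod=01, reg=rax, rm=rax), d8 = 8-bit displacement.
rex, opcode, modrm, disp8 = faulting_insn
assert rex == 0x48 and opcode == 0x8B
mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7

displacement = to_signed(disp8, 8)
fault_offset = to_signed(cr2, 64)

# Faulting address = base register + displacement, so solve for the base.
base_register = (cr2 - displacement) & 0xFFFFFFFFFFFFFFFF

print(f"mod={mod} reg={reg} rm={rm} disp={displacement:#x}")
print(f"CR2 as signed: {fault_offset:#x}")
print(f"implied base register value: {base_register:#x}")
```

This only accounts for the last fault shown. The `page_fault` frame directly above the bcache frames in the call trace appears to be where the original oops occurred, with the `kthread_data` fault happening afterwards while the worker was exiting.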

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-44-generic 3.13.0-44.73
ProcVersionSignature: Ubuntu 3.13.0-44.73-generic 3.13.11-ckt12
Uname: Linux 3.13.0-44-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.6
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/dsp', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/hwC0D3', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CurrentDmesg: Error: command ['sh', '-c', 'dmesg | comm -13 --nocheck-order /var/log/dmesg -'] failed with exit code 1: comm: /var/log/dmesg: Permission denied
Date: Fri Jan 16 10:04:07 2015
InstallationDate: Installed on 2011-08-25 (1240 days ago)
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release amd64 (20110427.1)
IwConfig:
 eth0 no wireless extensions.

 lo no wireless extensions.
Lsusb:
 Bus 001 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
 Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 003: ID 0557:2220 ATEN International Co., Ltd
MachineType: Dell Inc. OptiPlex 980
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-44-generic root=UUID=63c581a7-3055-4192-88c2-af192e84130d ro quiet splash crashkernel=384M-:128M vt.handoff=7
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RfKill:

SourcePackage: linux
UpgradeStatus: Upgraded to trusty on 2014-12-24 (23 days ago)
WifiSyslog:

dmi.bios.date: 01/21/2011
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A07
dmi.board.name: 0D441T
dmi.board.vendor: Dell Inc.
dmi.board.version: A03
dmi.chassis.type: 6
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA07:bd01/21/2011:svnDellInc.:pnOptiPlex980:pvr:rvnDellInc.:rn0D441T:rvrA03:cvnDellInc.:ct6:cvr:
dmi.product.name: OptiPlex 980
dmi.sys.vendor: Dell Inc.
---
ApportVersion: 2.14.1-0ubuntu3.6
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/dsp', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/hwC0D3', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CurrentDmesg:
 Error: command ['sh', '-c', 'dmesg | comm -13 --nocheck-order /var/log/dmesg -'] failed with exit code 1: comm: /var/log/dmesg: Permission denied
 dmesg: write failed: Broken pipe
DistroRelease: Ubuntu 14.04
InstallationDate: Installed on 2011-08-25 (1240 days ago)
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release amd64 (20110427.1)
IwConfig:
 eth0 no wireless extensions.

 lo no wireless extensions.
Lsusb:
 Bus 001 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
 Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 003: ID 0557:2220 ATEN International Co., Ltd
MachineType: Dell Inc. OptiPlex 980
Package: linux (not installed)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-44-generic root=UUID=63c581a7-3055-4192-88c2-af192e84130d ro quiet splash crashkernel=384M-:128M vt.handoff=7
ProcVersionSignature: Ubuntu 3.13.0-44.73-generic 3.13.11-ckt12
RfKill:

Tags: trusty
Uname: Linux 3.13.0-44-generic x86_64
UpgradeStatus: Upgraded to trusty on 2014-12-24 (23 days ago)
UserGroups:

WifiSyslog:

_MarkForUpload: True
dmi.bios.date: 01/21/2011
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A07
dmi.board.name: 0D441T
dmi.board.vendor: Dell Inc.
dmi.board.version: A03
dmi.chassis.type: 6
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA07:bd01/21/2011:svnDellInc.:pnOptiPlex980:pvr:rvnDellInc.:rn0D441T:rvrA03:cvnDellInc.:ct6:cvr:
dmi.product.name: OptiPlex 980
dmi.sys.vendor: Dell Inc.

Frank Banul (frank-banul) wrote :

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1411734

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: apport-collected
description: updated
Frank Banul (frank-banul) wrote :

Sorry if I was not supposed to confirm the bug. The email instructed me to, yet the description of the 'Confirmed' status says it should be set by someone other than the reporter.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.19 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc4-vivid/
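
The tagging rules above amount to a three-way mapping from test outcome to tag. Here is a small Python sketch of that mapping. The outcome names (`fixed`, `still_broken`, `untestable`) and the function name are made up for illustration, while the tag strings are the ones quoted in this comment.

```python
# Tag strings come from the comment above; the outcome keys are hypothetical.
UPSTREAM_TAGS = {
    "fixed": "kernel-fixed-upstream",
    "still_broken": "kernel-bug-exists-upstream",
    "untestable": "kernel-unable-to-test-upstream",
}


def tag_for(outcome: str) -> str:
    """Return the tag to add for a given mainline-kernel test outcome."""
    try:
        return UPSTREAM_TAGS[outcome]
    except KeyError:
        raise ValueError(f"unknown outcome: {outcome!r}") from None


print(tag_for("fixed"))
```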

Changed in linux (Ubuntu):
importance: Undecided → High
Frank Banul (frank-banul) wrote :

This issue happened on the 3.13.0-43 (original install) kernel as well as on the current 3.13.0-44. I have not previously tested other kernel versions.

I have installed kernel 3.19.0-031900rc4-generic x86_64. The problem does not manifest immediately on either 3.13.0-44 or 3.19, so I will keep testing and report the results back.

Frank Banul (frank-banul) wrote :

Kernel 3.19.0-031900rc4-generic has run for a week without crashing, with the bcache device used in the same way as under the previous kernel. I have added the kernel-fixed-upstream tag.

tags: added: kernel-fixed-upstream
Peter Maloney (peter-maloney) wrote :

You probably shouldn't be using bcache on kernels without this patch set: https://lkml.org/lkml/2015/12/22/154

That patch set is in the Ubuntu xenial kernel 4.4, and also vanilla 4.9. You can install it on 14.04 with: `apt-get install linux-image-generic-lts-xenial`
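
A quick way to check a machine against the versions mentioned above is to compare the running kernel release with 4.4. This is a minimal Python sketch, assuming the 4.4 threshold from this comment. It only compares version numbers and cannot tell whether a distribution has backported the patch set to an older kernel. The function names are illustrative.

```python
import platform
import re
from typing import Tuple

# Threshold taken from the comment above (Ubuntu xenial kernel 4.4).
MIN_KERNEL = (4, 4)


def parse_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '3.13.0-44-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def at_least_min_kernel(release: str) -> bool:
    """True if the release is at or above MIN_KERNEL by version number alone."""
    return parse_release(release) >= MIN_KERNEL


if __name__ == "__main__":
    running = platform.release()
    verdict = "at or above" if at_least_min_kernel(running) else "below"
    print(f"{running} is {verdict} {MIN_KERNEL[0]}.{MIN_KERNEL[1]}")
```

Applied to the kernels in this report, `at_least_min_kernel("3.13.0-44-generic")` is false, so that kernel falls below the threshold.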

Frank Banul (frank-banul) wrote :

Thanks, I’ve been using linux-image-generic-lts-xenial for quite some time.

Frank

