kernel oops in bcache module

Bug #1793901 reported by step21 on 2018-09-22
This bug affects 2 people
Affects         Importance  Assigned to
linux (Ubuntu)  Medium      Unassigned
Trusty          Medium      Unassigned
Xenial          Medium      Unassigned
Bionic          Medium      Unassigned
Cosmic          Medium      Unassigned

Bug Description

SRU Justification
=================

[Impact]

Some users see panics like the following when performing fstrim on a bcached volume:

[ 529.803060] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 530.183928] #PF error: [normal kernel read fault]
[ 530.412392] PGD 8000001f42163067 P4D 8000001f42163067 PUD 1f42168067 PMD 0
[ 530.750887] Oops: 0000 [#1] SMP PTI
[ 530.920869] CPU: 10 PID: 4167 Comm: fstrim Kdump: loaded Not tainted 5.0.0-rc1+ #3
[ 531.290204] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
[ 531.693137] RIP: 0010:blk_queue_split+0x148/0x620
[ 531.922205] Code: 60 38 89 55 a0 45 31 db 45 31 f6 45 31 c9 31 ff 89 4d 98 85 db 0f 84 7f 04 00 00 44 8b 6d 98 4c 89 ee 48 c1 e6 04 49 03 70 78 <8b> 46 08 44 8b 56 0c 48 8b 16 44 29 e0 39 d8 48 89 55 a8 0f 47 c3
[ 532.838634] RSP: 0018:ffffb9b708df39b0 EFLAGS: 00010246
[ 533.093571] RAX: 00000000ffffffff RBX: 0000000000046000 RCX: 0000000000000000
[ 533.441865] RDX: 0000000000000200 RSI: 0000000000000000 RDI: 0000000000000000
[ 533.789922] RBP: ffffb9b708df3a48 R08: ffff940d3b3fdd20 R09: 0000000000000000
[ 534.137512] R10: ffffb9b708df3958 R11: 0000000000000000 R12: 0000000000000000
[ 534.485329] R13: 0000000000000000 R14: 0000000000000000 R15: ffff940d39212020
[ 534.833319] FS: 00007efec26e3840(0000) GS:ffff940d1f480000(0000) knlGS:0000000000000000
[ 535.224098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 535.504318] CR2: 0000000000000008 CR3: 0000001f4e256004 CR4: 00000000001606e0
[ 535.851759] Call Trace:
[ 535.970308] ? mempool_alloc_slab+0x15/0x20
[ 536.174152] ? bch_data_insert+0x42/0xd0 [bcache]
[ 536.403399] blk_mq_make_request+0x97/0x4f0
[ 536.607036] generic_make_request+0x1e2/0x410
[ 536.819164] submit_bio+0x73/0x150
[ 536.980168] ? submit_bio+0x73/0x150
[ 537.149731] ? bio_associate_blkg_from_css+0x3b/0x60
[ 537.391595] ? _cond_resched+0x1a/0x50
[ 537.573774] submit_bio_wait+0x59/0x90
[ 537.756105] blkdev_issue_discard+0x80/0xd0
[ 537.959590] ext4_trim_fs+0x4a9/0x9e0
[ 538.137636] ? ext4_trim_fs+0x4a9/0x9e0
[ 538.324087] ext4_ioctl+0xea4/0x1530
[ 538.497712] ? _copy_to_user+0x2a/0x40
[ 538.679632] do_vfs_ioctl+0xa6/0x600
[ 538.853127] ? __do_sys_newfstat+0x44/0x70
[ 539.051951] ksys_ioctl+0x6d/0x80
[ 539.212785] __x64_sys_ioctl+0x1a/0x20
[ 539.394918] do_syscall_64+0x5a/0x110
[ 539.568674] entry_SYSCALL_64_after_hwframe+0x44/0xa9

[Fix]

Under certain conditions, the test for whether an operation should be written back to the underlying device was incorrect. Specifically, in should_writeback(), we were hitting a case where an optimisation for partial stripe conditions was returning true and so should_writeback() was returning true early. This caused the code to go down an incorrect path and create bios that contained NULL pointers.

To fix this issue, make sure that should_writeback() on a discard op never returns true.
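
The ordering problem can be illustrated with a small user-space model. This is only a sketch: the struct and enum names below (dev_model, request_model, OP_DISCARD and so on) are hypothetical stand-ins invented for illustration, not the real bcache types, and the remaining checks in the real function are reduced to a single "sync" flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures (assumption, not bcache code). */
enum op_model { OP_WRITE, OP_DISCARD };
enum mode_model { MODE_WRITETHROUGH, MODE_WRITEBACK };

struct dev_model {
    enum mode_model mode;
    bool partial_stripes_expensive;
};

struct request_model {
    enum op_model op;
    bool sync;              /* stands in for the remaining writeback criteria */
    bool hits_dirty_stripe; /* stands in for the partial stripe test */
};

/* Before the fix: the partial stripe shortcut can return true for a discard. */
bool should_writeback_before(const struct dev_model *dc,
                             const struct request_model *rq)
{
    assert(dc && rq);
    if (dc->mode != MODE_WRITEBACK)
        return false;
    if (dc->partial_stripes_expensive && rq->hits_dirty_stripe)
        return true;
    return rq->sync;
}

/* After the fix: a discard is rejected before the shortcut is considered. */
bool should_writeback_after(const struct dev_model *dc,
                            const struct request_model *rq)
{
    assert(dc && rq);
    if (dc->mode != MODE_WRITEBACK)
        return false;
    if (rq->op == OP_DISCARD)
        return false;
    if (dc->partial_stripes_expensive && rq->hits_dirty_stripe)
        return true;
    return rq->sync;
}
```

In this model, a discard that touches a dirty stripe on a writeback device is accepted by the "before" variant and rejected by the "after" variant, while ordinary writes are treated identically by both.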

[Test Case]

We have observed it on some systems where both:
1) LVM/devmapper is involved (bcache backing device is LVM volume) and
2) writeback cache is involved (bcache cache_mode is writeback)

Not every machine exhibits the bug. On one machine that does exhibit the bug, we can reliably reproduce it with:

 # echo writeback > /sys/block/bcache0/bcache/cache_mode
 # mount /dev/bcache0 /test
 # for i in {0..10}; do file="$(mktemp /test/zero.XXX)"; dd if=/dev/zero of="$file" bs=1M count=256; sync; rm $file; done; fstrim -v /test

[Regression Potential]

This could affect any device where bcache is used.

In mitigation, however, the patch is simple and only changes the handling of discard operations. The patch has been accepted upstream [1] and the maintainer will be including it in SuSE kernels [2]. A Gentoo user validated the upstream patch independently [3].

[1] https://www.spinics.net/lists/linux-bcache/msg06997.html
[2] https://www.spinics.net/lists/linux-bcache/msg06998.html
[3] https://bugzilla.kernel.org/show_bug.cgi?id=196103#c3

[Original Description]

This was on an 18.04.1 install running the 4.15-34 generic kernel image, running from a normal ext4 root device.
I had just a short while before created a new bcache device that was mounted but to which no data had been written yet. Then without any apparent particular reason, an apport error popped up to inform of a bcache kernel oops. Crash log was uploaded but no idea how to link it, so I attach it as well.
Mostly I would like to know how concerned I should be as after a previous, successful test I wanted to move the whole install to bcache. Ideally, if this is a bug or similar, it would be nice if it could get fixed.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-34-generic 4.15.0-34.37
ProcVersionSignature: Ubuntu 4.15.0-34.37-generic 4.15.18
Uname: Linux 4.15.0-34-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
ApportVersion: 2.20.9-0ubuntu7.3
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Sat Sep 22 18:20:22 2018
HibernationDevice: RESUME=UUID=6bcbe7fa-85b7-4baf-9b69-0558a668bcdd
InstallationDate: Installed on 2014-07-29 (1515 days ago)
InstallationMedia: It
IwConfig:
 zthnhe3w6d no wireless extensions.

 eth1 no wireless extensions.

 lo no wireless extensions.
MachineType: System manufacturer System Product Name
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcFB: 0 EFI VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-34-generic root=UUID=ebbab625-f14e-44ba-84d5-025ed92a5b2a ro quiet splash
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-34-generic N/A
 linux-backports-modules-4.15.0-34-generic N/A
 linux-firmware 1.173.1
RfKill:
 0: hci0: Bluetooth
  Soft blocked: yes
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: Upgraded to bionic on 2018-09-07 (15 days ago)
dmi.bios.date: 10/22/2015
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0604
dmi.board.asset.tag: Default string
dmi.board.name: H170I-PLUS D3
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0604:bd10/22/2015:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnH170I-PLUSD3:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: Default string
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

step21 (step22) wrote :

repost with ubuntu-bug and for different package as this one might be more generally applicable. This is also reproducible, happens each time sometime after mount, even after a reboot. New crash log is attached.

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
step21 (step22) wrote :

It seems that this might be only a problem with linux-image-4.15.0-34-generic
I tested it now with linux-image-4.15.0-34-generic and 4.18.0 (self compiled) and in both cases the oops didn't trigger after a reasonable amount of time.
When checking the changelog for -34 it seems there was a change regarding bcache: bcache: fix kcrashes with fio in RAID5 backend dev
Could this be related to that?

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.19 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.19-rc5

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: needs-bisect
step21 (step22) wrote :

My system has changed somewhat, as I moved my root / to bcache with a kernel that worked.
Still, I installed the mainline build, and the oops did not occur so far. However I also cannot run that kernel properly as it does not seem to work well with Nvidia graphics. (very low resolution and probably wrong driver)
However I am not sure that ensures that the problem is fixed, as when I rebooted with the kernel where the problem did occur, it also did not occur anymore (or at least it didn't say) but as stated above, the setup is not the same as the bcache device is now /. I could maybe make a separate bcache device to test, and there were still a lot of errors about device already being registered (more in the older 'buggy' kernel than in the newer one). It is probably still possible to undelete the previous partition or recreate it where the problem did occur, but it takes some work and time to do that.

step21 (step22) wrote :

My system has changed somewhat, as I moved my root / to bcache with a kernel that worked.
Still, I installed the mainline build, and the oops did not occur so far. However I also cannot run that kernel properly as it does not seem to work well with Nvidia graphics. (very low resolution and probably wrong driver)
However, I am not sure that this ensures the problem is fixed: when I rebooted with the kernel where the problem did occur, it also did not occur anymore (or at least it didn't say), but as stated above, the setup is not the same as the bcache device is now /. I made a separate, similar bcache device (on the same devices, with the same settings), but there the problem also hasn't occurred so far. The only thing I couldn't replicate (apart from the partitions being at different places) was that when the error occurred, at first the caching device didn't seem to be properly registered, but when adding it manually it said it was. Finally, after running partprobe and/or rebooting, this was remedied. When now recreating a fresh bcache device, there were no issues with registering/recognizing, and I am not sure how to replicate this.

step21 (step22) on 2018-09-27
tags: added: invalid
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Lacyc3 (lacyc3) wrote :

Hello step21,

I can replicate your bug with the fstrim -a command. Can you please try it (save your work first)?
In fact, the bug reproduces itself on a weekly basis thanks to the fstrim systemd timer. The timer can be disabled with: sudo systemctl disable fstrim.timer

As I understand it, fstrim tries to trim the cache drive, but it is locked by bcache.

OS: Ubuntu 18.10
Kernel: 4.18.0-12-generic #13-Ubuntu

Can you please confirm?

Changed in linux (Ubuntu):
status: Expired → Incomplete
Vladimir Grevtsev (vlgrevtsev) wrote :

We have a reproducer now:

$ uname -a
Linux ln-sv-infr01 4.15.0-42-generic #45-Ubuntu SMP Thu Nov 15 19:32:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 3.7T 0 disk
├─sda1 8:1 0 512M 0 part /boot/efi
└─sda2 8:2 0 3.7T 0 part
  ├─ln--sv--infr01--vg-root 253:0 0 301G 0 lvm /
  ├─ln--sv--infr01--vg-swap_1
  │ 253:1 0 976M 0 lvm [SWAP]
  └─ln--sv--infr01--vg-var 253:2 0 3.4T 0 lvm
    └─bcache0 252:0 0 3.4T 0 disk /var
nvme0n1 259:0 0 1.5T 0 disk
└─bcache0 252:0 0 3.4T 0 disk /var

$ sudo fstrim /var
# at this point it immediately fails with a kernel panic ... oops

Daniel Axtens (daxtens) wrote :

I think I have discovered the cause: https://<email address hidden>/

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Daniel Axtens (daxtens) wrote :

Hi,

I have a patch which I believe fixes your issue: https://www.spinics.net/lists/linux-bcache/msg06997.html

It looks like it will go into the 5.1 kernel, and I will propose it for backporting to earlier Ubuntu kernels.

Regards,
Daniel

Daniel Axtens (daxtens) on 2019-01-24
description: updated
tags: removed: amd64 bionic invalid needs-bisect
Seth Forshee (sforshee) on 2019-01-28
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
step21 (step22) wrote :

So any idea when the backport will drop?

Stefan Bader (smb) on 2019-01-30
Changed in linux (Ubuntu Xenial):
importance: Undecided → Medium
Changed in linux (Ubuntu Trusty):
importance: Undecided → Medium
Changed in linux (Ubuntu Xenial):
status: New → Fix Committed
Changed in linux (Ubuntu Trusty):
status: New → Fix Committed
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu Cosmic):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium
Changed in linux (Ubuntu Cosmic):
importance: Undecided → Medium
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.19.0-12.13

---------------
linux (4.19.0-12.13) disco; urgency=medium

  * linux: 4.19.0-12.13 -proposed tracker (LP: #1813664)

  * kernel oops in bcache module (LP: #1793901)
    - SAUCE: bcache: never writeback a discard operation

  * Disco update: 4.19.18 upstream stable release (LP: #1813611)
    - ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address
    - mlxsw: spectrum: Disable lag port TX before removing it
    - mlxsw: spectrum_switchdev: Set PVID correctly during VLAN deletion
    - net: dsa: mv88x6xxx: mv88e6390 errata
    - net, skbuff: do not prefer skb allocation fails early
    - qmi_wwan: add MTU default to qmap network interface
    - ipv6: Take rcu_read_lock in __inet6_bind for mapped addresses
    - net: clear skb->tstamp in bridge forwarding path
    - netfilter: ipset: Allow matching on destination MAC address for mac and
      ipmac sets
    - gpio: pl061: Move irq_chip definition inside struct pl061
    - drm/amd/display: Guard against null stream_state in set_crc_source
    - drm/amdkfd: fix interrupt spin lock
    - ixgbe: allow IPsec Tx offload in VEPA mode
    - platform/x86: asus-wmi: Tell the EC the OS will handle the display off
      hotkey
    - e1000e: allow non-monotonic SYSTIM readings
    - usb: typec: tcpm: Do not disconnect link for self powered devices
    - selftests/bpf: enable (uncomment) all tests in test_libbpf.sh
    - of: overlay: add missing of_node_put() after add new node to changeset
    - writeback: don't decrement wb->refcnt if !wb->bdi
    - serial: set suppress_bind_attrs flag only if builtin
    - bpf: Allow narrow loads with offset > 0
    - ALSA: oxfw: add support for APOGEE duet FireWire
    - x86/mce: Fix -Wmissing-prototypes warnings
    - MIPS: SiByte: Enable swiotlb for SWARM, LittleSur and BigSur
    - crypto: ecc - regularize scalar for scalar multiplication
    - arm64: perf: set suppress_bind_attrs flag to true
    - drm/atomic-helper: Complete fake_commit->flip_done potentially earlier
    - clk: meson: meson8b: fix incorrect divider mapping in cpu_scale_table
    - samples: bpf: fix: error handling regarding kprobe_events
    - usb: gadget: udc: renesas_usb3: add a safety connection way for
      forced_b_device
    - fpga: altera-cvp: fix probing for multiple FPGAs on the bus
    - selinux: always allow mounting submounts
    - ASoC: pcm3168a: Don't disable pcm3168a when CONFIG_PM defined
    - scsi: qedi: Check for session online before getting iSCSI TLV data.
    - drm/amdgpu: Reorder uvd ring init before uvd resume
    - rxe: IB_WR_REG_MR does not capture MR's iova field
    - efi/libstub: Disable some warnings for x86{,_64}
    - jffs2: Fix use of uninitialized delayed_work, lockdep breakage
    - clk: imx: make mux parent strings const
    - pstore/ram: Do not treat empty buffers as valid
    - media: uvcvideo: Refactor teardown of uvc on USB disconnect
    - powerpc/xmon: Fix invocation inside lock region
    - powerpc/pseries/cpuidle: Fix preempt warning
    - media: firewire: Fix app_info parameter type in avc_ca{,_app}_info
    - ASoC: use dma_ops of parent device for acp_audio_dma
    - media: ve...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-cosmic' to 'verification-done-cosmic'. If the problem still exists, change the tag 'verification-needed-cosmic' to 'verification-failed-cosmic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-cosmic
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'. If the problem still exists, change the tag 'verification-needed-trusty' to 'verification-failed-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic