This is the warning trigger line:
int bch_cached_dev_writeback_start(struct cached_dev *dc) ... WARN_ON(test_and_set_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags)); ...
This commit fixes this problem:
commit 1f0ffa67349c56ea54c03ccfd1e073c990e7411e Author: Coly Li <email address hidden> Date: Mon Jun 10 06:13:35 2019 +0800
bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached
It is included in linux 5.2, so for now only Ubuntu Eoan has the fix; I could reproduce the problem with Ubuntu Disco (5.0-based) kernel.
The problem can Oops the kernel (worse error) if the cache is not attached soon (reproduced below).
I'll provide a test kernel with the fix soon.
Steps to reproduce: ------------------
With attach (last line in script):
$ sudo ./setup-bcache-wb_percent-before-attach.sh ... [ 36.130659] WARNING: CPU: 2 PID: 1288 at drivers/md/bcache/writeback.c:837 bch_cached_dev_writeback_start+0x156/0x164 [bcache] ... [ 36.155742] RIP: 0010:bch_cached_dev_writeback_start+0x156/0x164 [bcache] ... [ 36.182455] Call Trace: [ 36.183387] bch_cached_dev_attach+0x356/0x480 [bcache] [ 36.185104] ? bch_keybuf_init+0x70/0x70 [bcache] [ 36.186653] __cached_dev_store+0x59f/0x8b0 [bcache] [ 36.188274] bch_cached_dev_store+0x39/0xc0 [bcache] [ 36.189887] sysfs_kf_write+0x3b/0x40 [ 36.191142] kernfs_fop_write+0x12e/0x1b0 [ 36.192502] __vfs_write+0x1b/0x40 [ 36.193686] vfs_write+0xab/0x1b0 [ 36.194842] ksys_write+0x5c/0xd0 [ 36.196026] __x64_sys_write+0x1a/0x20 [ 36.197296] do_syscall_64+0x5a/0x110 [ 36.198532] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 36.200158] RIP: 0033:0x7f69a1d09154 ... [ 36.222967] ---[ end trace ec65e906cd03afa6 ]--- [ 36.225017] bcache: bch_cached_dev_attach() Caching loop0 as bcache0 on set 57322d7a-1421-4dfc-a8cb-9c7b4e60c50a
Without attach (last line in script): in a few seconds:
$ sudo ./setup-bcache-wb_percent-before-attach.sh ... [ 30.623677] BUG: unable to handle kernel NULL pointer dereference at 0000000000000340 [ 30.627527] #PF error: [normal kernel read fault] [ 30.629825] PGD 0 P4D 0 [ 30.631194] Oops: 0000 [#1] SMP PTI ... [ 30.638818] Workqueue: events update_writeback_rate [bcache] [ 30.640543] RIP: 0010:update_writeback_rate+0x36/0x330 [bcache] ... [ 30.668496] Call Trace: [ 30.670003] ? __schedule+0x2d8/0x840 [ 30.671998] process_one_work+0x20f/0x410 [ 30.674163] worker_thread+0x34/0x400 [ 30.676160] kthread+0x120/0x140 [ 30.678014] ? process_one_work+0x410/0x410 [ 30.680238] ? __kthread_parkme+0x70/0x70 [ 30.682411] ret_from_fork+0x35/0x40 ... [ 30.704646] ---[ end trace 1bbc1b34c704f1c5 ]--- [ 30.707086] RIP: 0010:update_writeback_rate+0x36/0x330 [bcache]
Test script:
$ cat setup-bcache-wb_percent-before-attach.sh #!/bin/sh
#set -ex
dmesg -C echo 9 > /proc/sys/kernel/printk
FILE_BACKING=/backing.img FILE_CACHING=/caching.img
dd if=/dev/zero of=$FILE_BACKING bs=1G seek=1 count=0 dd if=/dev/zero of=$FILE_CACHING bs=1G seek=1 count=0
dd if=/dev/zero of=$FILE_BACKING bs=1M count=1 conv=notrunc dd if=/dev/zero of=$FILE_CACHING bs=1M count=1 conv=notrunc
LOOP_BACKING=$(losetup -f --show $FILE_BACKING) LOOP_CACHING=$(losetup -f --show $FILE_CACHING)
make-bcache -B $LOOP_BACKING make-bcache -C $LOOP_CACHING sleep 3
CSET_UUID=$(bcache-super-show $LOOP_CACHING | awk '/cset\.uuid/ { print $2 }')
# Change writeback_percent before attaching echo 11 > /sys/block/$(basename $LOOP_BACKING)/bcache/writeback_percent sleep 1
# Comment this line to Oops in a few seconds echo $CSET_UUID > /sys/block/$(basename $LOOP_BACKING)/bcache/attach
This is the warning trigger line:
int bch_cached_ dev_writeback_ start(struct cached_dev *dc)
WARN_ON( test_and_ set_bit( BCACHE_ DEV_WB_ RUNNING, &dc->disk.flags));
...
...
This commit fixes this problem:
commit 1f0ffa67349c56e a54c03ccfd1e073 c990e7411e
Author: Coly Li <email address hidden>
Date: Mon Jun 10 06:13:35 2019 +0800
bcache: only set BCACHE_ DEV_WB_ RUNNING when cached device attached
It is included in linux 5.2, so for now only Ubuntu Eoan has the fix;
I could reproduce the problem with Ubuntu Disco (5.0-based) kernel.
The problem can Oops the kernel (worse error) if the cache is not attached soon (reproduced below).
I'll provide a test kernel with the fix soon.
Steps to reproduce:
------------------
With attach (last line in script):
$ sudo ./setup- bcache- wb_percent- before- attach. sh md/bcache/ writeback. c:837 bch_cached_ dev_writeback_ start+0x156/ 0x164 [bcache] cached_ dev_writeback_ start+0x156/ 0x164 [bcache] dev_attach+ 0x356/0x480 [bcache] init+0x70/ 0x70 [bcache] dev_store+ 0x59f/0x8b0 [bcache] dev_store+ 0x39/0xc0 [bcache] write+0x3b/ 0x40 fop_write+ 0x12e/0x1b0 0x1b/0x40 0xab/0x1b0 0x5c/0xd0 write+0x1a/ 0x20 64+0x5a/ 0x110 64_after_ hwframe+ 0x44/0xa9 dev_attach( ) Caching loop0 as bcache0 on set 57322d7a- 1421-4dfc- a8cb-9c7b4e60c5 0a
...
[ 36.130659] WARNING: CPU: 2 PID: 1288 at drivers/
...
[ 36.155742] RIP: 0010:bch_
...
[ 36.182455] Call Trace:
[ 36.183387] bch_cached_
[ 36.185104] ? bch_keybuf_
[ 36.186653] __cached_
[ 36.188274] bch_cached_
[ 36.189887] sysfs_kf_
[ 36.191142] kernfs_
[ 36.192502] __vfs_write+
[ 36.193686] vfs_write+
[ 36.194842] ksys_write+
[ 36.196026] __x64_sys_
[ 36.197296] do_syscall_
[ 36.198532] entry_SYSCALL_
[ 36.200158] RIP: 0033:0x7f69a1d09154
...
[ 36.222967] ---[ end trace ec65e906cd03afa6 ]---
[ 36.225017] bcache: bch_cached_
Without attach (last line in script):
in a few seconds:
$ sudo ./setup- bcache- wb_percent- before- attach. sh writeback_ rate [bcache] writeback_ rate+0x36/ 0x330 [bcache] 0x2d8/0x840 one_work+ 0x20f/0x410 thread+ 0x34/0x400 one_work+ 0x410/0x410 parkme+ 0x70/0x70 fork+0x35/ 0x40 writeback_ rate+0x36/ 0x330 [bcache]
...
[ 30.623677] BUG: unable to handle kernel NULL pointer dereference at 0000000000000340
[ 30.627527] #PF error: [normal kernel read fault]
[ 30.629825] PGD 0 P4D 0
[ 30.631194] Oops: 0000 [#1] SMP PTI
...
[ 30.638818] Workqueue: events update_
[ 30.640543] RIP: 0010:update_
...
[ 30.668496] Call Trace:
[ 30.670003] ? __schedule+
[ 30.671998] process_
[ 30.674163] worker_
[ 30.676160] kthread+0x120/0x140
[ 30.678014] ? process_
[ 30.680238] ? __kthread_
[ 30.682411] ret_from_
...
[ 30.704646] ---[ end trace 1bbc1b34c704f1c5 ]---
[ 30.707086] RIP: 0010:update_
Test script:
$ cat setup-bcache- wb_percent- before- attach. sh
#!/bin/sh
#set -ex
dmesg -C kernel/ printk
echo 9 > /proc/sys/
FILE_BACKING= /backing. img /caching. img
FILE_CACHING=
dd if=/dev/zero of=$FILE_BACKING bs=1G seek=1 count=0
dd if=/dev/zero of=$FILE_CACHING bs=1G seek=1 count=0
dd if=/dev/zero of=$FILE_BACKING bs=1M count=1 conv=notrunc
dd if=/dev/zero of=$FILE_CACHING bs=1M count=1 conv=notrunc
LOOP_BACKING= $(losetup -f --show $FILE_BACKING) $(losetup -f --show $FILE_CACHING)
LOOP_CACHING=
make-bcache -B $LOOP_BACKING
make-bcache -C $LOOP_CACHING
sleep 3
CSET_UUID= $(bcache- super-show $LOOP_CACHING | awk '/cset\.uuid/ { print $2 }')
# Change writeback_percent before attaching $(basename $LOOP_BACKING) /bcache/ writeback_ percent
echo 11 > /sys/block/
sleep 1
# Comment this line to Oops in a few seconds $(basename $LOOP_BACKING) /bcache/ attach
echo $CSET_UUID > /sys/block/