I/O Error Test 1
================
commit "bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags"
Problem: the cache set is not retired immediately on I/O errors on the
cache device if I/O requests keep coming.
Original kernel: bcache device remains on top of the caching device,
and 'fio' never finishes.
Modified kernel: bcache device is removed from the caching device,
and 'fio' finishes.
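The "finishes" / "never finishes" distinction above can be turned into a scripted check. The following is only a sketch and was not part of the original test run: run_with_deadline is a hypothetical helper name, and it assumes GNU timeout(1), which exits with status 124 when it has to stop the command.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original test scripts).
# Runs a command under a deadline and prints:
#   "finished" if the command exited by itself (whatever its exit status),
#   "hung"     if timeout(1) had to stop it (GNU timeout exits with 124).
run_with_deadline() {
    deadline=$1
    shift
    rc=0
    timeout "$deadline" "$@" >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo hung
    else
        echo finished
    fi
}
```

For example, `run_with_deadline 330 fio --name=write ...` with the fio arguments used in this test; 330 seconds is an assumed value chosen to sit a little above --runtime=300s. A process stuck in uninterruptible sleep may not react to the signal, so the helper itself can block in that case.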
Original
--------
# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019
# ./setup.sh >/dev/null 2>&1
[ 285.677682] bcache: register_bdev() registered backing device dm-0
[ 285.697006] bcache: run_cache_set() invalidating existing data
[ 285.710938] bcache: register_cache() registered cache device dm-1
[ 287.686924] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set c589879b-b1c3-49b3-9603-9795ddc750f5
# lsblk -e 252
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1G 0 loop
└─fake-loop0 253:0 0 1024M 0 dm
└─bcache0 251:0 0 1024M 0 disk
loop1 7:1 0 1G 0 loop
└─fake-loop1 253:1 0 1024M 0 dm
└─bcache0 251:0 0 1024M 0 disk
# ./dm_fake_dev.sh /dev/loop1 bad
[ 766.102586] Buffer I/O error on dev dm-1, logical block 262128, async page read
[ 766.107602] Buffer I/O error on dev dm-1, logical block 262128, async page read
[ 766.113889] bcache: register_bcache() error /dev/dm-1: device already registered
On another shell:
# fio --name=write --rw=randwrite --filename=/dev/bcache0 --bs=4k --iodepth=8 --ioengine=libaio --runtime=300s --continue_on_error=all
write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=8
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [f(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
< fio never finishes, keeps showing the last line above >
< console prints as below when fio starts running .... >
[ 777.451177] bcache: bch_count_io_errors() dm-1: IO error on writing btree, recovering
[ 777.490882] bcache: error on c589879b-b1c3-49b3-9603-9795ddc750f5:
[ 777.490885] journal io error
[ 777.494087] , disabling caching
[ 807.900700] bcache: bch_count_io_errors() dm-1: IO error on writing btree, recovering
(error msgs looping)
bcache0 still present on top of cache device (fake-loop1)
# lsblk -e 252
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1G 0 loop
└─fake-loop0 253:0 0 1024M 0 dm
└─bcache0 251:0 0 1024M 0 disk
loop1 7:1 0 1G 0 loop
fake-loop1 253:1 0 1G 0 dm
└─bcache0 251:0 0 1024M 0 disk
Modified
--------
# uname -rv
4.15.0-55-generic #60+test20190703build1bcache1-Ubuntu SMP Wed Jul 3 21:41:37 UTC
# ./setup.sh >/dev/null 2>&1
[ 60.542088] bcache: register_bdev() registered backing device dm-0
[ 60.550509] bcache: run_cache_set() invalidating existing data
[ 60.560109] bcache: register_cache() registered cache device dm-1
[ 62.548849] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set f6833a2c-53e6-468e-bf1d-a9f48b73d783
# lsblk -e 252
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1G 0 loop
└─fake-loop0 253:0 0 1024M 0 dm
└─bcache0 251:0 0 1024M 0 disk
loop1 7:1 0 1G 0 loop
└─fake-loop1 253:1 0 1024M 0 dm
└─bcache0 251:0 0 1024M 0 disk
# ./dm_fake_dev.sh /dev/loop1 bad
[ 72.639185] Buffer I/O error on dev dm-1, logical block 262128, async page read
[ 72.644876] Buffer I/O error on dev dm-1, logical block 262128, async page read
[ 72.650707] bcache: register_bcache() error /dev/dm-1: device already registered
On another shell:
# fio --name=write --rw=randwrite --filename=/dev/bcache0 --bs=4k --iodepth=8 --ioengine=libaio --runtime=300s --continue_on_error=all
[ 97.858468] bcache: bch_count_io_errors() dm-1: IO error on writing btree, recovering
[ 97.868519] bcache: error on f6833a2c-53e6-468e-bf1d-a9f48b73d783:
[ 97.868520] journal io error
[ 97.869998] , disabling caching
[ 97.871441] bcache: conditional_stop_bcache_device() stop_when_cache_set_failed of bcache0 is "auto" and cache is clean, keep it alive.
[ 97.874423] Buffer I/O error on dev bcache0, logical block 2814, lost async page write
[ 97.878697] Buffer I/O error on dev bcache0, logical block 2816, lost async page write
[ 97.881702] Buffer I/O error on dev bcache0, logical block 2817, lost async page write
[ 97.884790] Buffer I/O error on dev bcache0, logical block 2818, lost async page write
[ 97.887709] Buffer I/O error on dev bcache0, logical block 2819, lost async page write
[ 97.890558] Buffer I/O error on dev bcache0, logical block 2820, lost async page write
[ 97.892419] Buffer I/O error on dev bcache0, logical block 2821, lost async page write
[ 97.894228] Buffer I/O error on dev bcache0, logical block 2822, lost async page write
[ 97.896107] Buffer I/O error on dev bcache0, logical block 2823, lost async page write
[ 97.897900] Buffer I/O error on dev bcache0, logical block 2824, lost async page write
[ 97.916818] bcache: cached_dev_detach_finish() Caching disabled for dm-0
[ 97.918511] bcache: bch_count_io_errors() dm-1: IO error on writing btree, recovering
[ 97.920581] bcache: cache_set_free() Cache set f6833a2c-53e6-468e-bf1d-a9f48b73d783 unregistered
fio finished:
write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=8
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [f(1)][100.0%][r=0KiB/s,w=135MiB/s][r=0,w=34.7k IOPS][eta 00m:00s]
...
Run status group 0 (all jobs):
WRITE: bw=219MiB/s (229MB/s), 219MiB/s-219MiB/s (229MB/s-229MB/s), io=1024MiB (1074MB), run=4685-4685msec
bcache not on top of caching device:
# lsblk -e 252
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1G 0 loop
└─fake-loop0 253:0 0 1024M 0 dm
└─bcache0 251:0 0 1024M 0 disk
loop1 7:1 0 1G 0 loop
fake-loop1 253:1 0 1G 0 dm
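The two observations that distinguish the modified kernel (the cache_set_free() message in the kernel log, and no bcache device listed under the caching device) could be checked mechanically. This is a sketch only: cache_set_retired and has_bcache_child are hypothetical helper names, not part of the original test scripts, and both read their input from stdin so they can be fed from dmesg and lsblk.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the original test scripts).

# Succeeds if the kernel log on stdin contains the message that
# cache_set_free() prints when a cache set is unregistered.
cache_set_retired() {
    grep -Eq 'bcache: cache_set_free\(\) Cache set [0-9a-f-]+ unregistered'
}

# Succeeds if the device list on stdin (one name per line) contains
# a bcacheN entry, i.e. a bcache device is still stacked on the device.
has_bcache_child() {
    grep -Eqx 'bcache[0-9]+'
}
```

Possible usage, assuming the dm device is reachable as /dev/mapper/fake-loop1: `dmesg | cache_set_retired` should succeed and `lsblk -nlo NAME /dev/mapper/fake-loop1 | has_bcache_child` should fail on the modified kernel, and the other way round on the original one.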