Comment 26 for bug 1829563

Mauricio Faria de Oliveira (mfo) wrote :

Verification done for Disco (one patch change only).

Only one of the two bcache devices stops working when its backing device fails
(see comment #21 for details).

# uname -rv
5.0.0-22-generic #23-Ubuntu SMP Tue Jul 23 17:23:54 UTC 2019

# ./setup-two-bcache-one-cache.sh >/dev/null 2>&1
[ 25.748828] bcache: register_bdev() registered backing device dm-1
[ 25.759145] bcache: register_bdev() registered backing device dm-0
[ 25.767247] bcache: run_cache_set() invalidating existing data
[ 25.778928] bcache: register_cache() registered cache device dm-2
[ 26.768350] bcache: bch_cached_dev_attach() Caching dm-0 as bcache1 on set 2bf1e70a-6f20-4680-bc63-f803142f294d
[ 26.795147] bcache: bch_cached_dev_attach() Caching dm-1 as bcache0 on set 2bf1e70a-6f20-4680-bc63-f803142f294d

# lsblk -e 252
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1G 0 loop
└─fake-loop0 253:0 0 1024M 0 dm
  └─bcache1 251:128 0 1024M 0 disk
loop1 7:1 0 1G 0 loop
└─fake-loop1 253:1 0 1024M 0 dm
  └─bcache0 251:0 0 1024M 0 disk
loop2 7:2 0 1G 0 loop
└─fake-loop2 253:2 0 1024M 0 dm
  ├─bcache0 251:0 0 1024M 0 disk
  └─bcache1 251:128 0 1024M 0 disk

# echo writeback | tee /sys/block/bcache*/bcache/cache_mode
writeback

# echo always | tee /sys/block/bcache*/bcache/stop_when_cache_set_failed
always
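
The two sysfs writes above can be confirmed by reading the attributes back. As far as I know, bcache prints the full option list on read and marks the active choice with square brackets. The following is a minimal sketch under that assumption; `selected_option` is a hypothetical helper and is not part of any bcache tooling.

```shell
# selected_option: print the bracketed (active) choice from a sysfs-style
# option list such as "writethrough [writeback] writearound none".
selected_option() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Read back both attributes for every bcache device present. On a machine
# without bcache devices the glob does not match and the body is skipped.
for attr in cache_mode stop_when_cache_set_failed; do
    for f in /sys/block/bcache*/bcache/"$attr"; do
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$f" "$(selected_option "$(cat "$f")")"
    done
done
```

With the settings used in this test, each bcache device would be expected to report `writeback` and `always`.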

# ./dm_fake_dev.sh /dev/loop0 bad
[ 42.723192] Buffer I/O error on dev dm-0, logical block 262128, async page read
[ 42.730031] Buffer I/O error on dev dm-0, logical block 262128, async page read
[ 42.736198] bcache: register_bcache() error /dev/dm-0: device already registered (emitting change event)
[ 42.738697] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 42.742277] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
# [ 42.746748] Buffer I/O error on dev bcache1, logical block 262112, async page read
[ 42.752642] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 42.755650] Buffer I/O error on dev bcache1, logical block 262112, async page read
[ 42.758209] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 42.760642] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 42.762860] Buffer I/O error on dev bcache1, logical block 1, async page read

# dd if=/dev/zero of=/dev/bcache1 bs=4k & dd if=/dev/zero of=/dev/bcache0 bs=4k &
[1] 1557
[2] 1558
# [ 58.982340] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 58.984076] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 58.985718] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 58.987382] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 58.989011] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 58.990645] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 58.992293] Buffer I/O error on dev bcache1, logical block 0, lost async page write
[ 58.993733] Buffer I/O error on dev bcache1, logical block 1, lost async page write
[ 58.995201] Buffer I/O error on dev bcache1, logical block 2, lost async page write
[ 58.996651] Buffer I/O error on dev bcache1, logical block 3, lost async page write
...
[ 59.096950] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 59.098669] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 59.100621] bcache: bch_cached_dev_error() stop bcache1: too many IO errors on backing device dm-0
[ 59.100621]
dd: error writing '/dev/bcache1': No space left on device
262142+0 records in
262141+0 records out

[ 60.111733] bcache: bcache_device_free() bcache1 stopped

1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.10457 s, 510 MB/s
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 4.67245 s, 230 MB/s

# lsblk -e 252
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1G 0 loop
loop1 7:1 0 1G 0 loop
└─fake-loop1 253:1 0 1024M 0 dm
  └─bcache0 251:0 0 1024M 0 disk
loop2 7:2 0 1G 0 loop
└─fake-loop2 253:2 0 1024M 0 dm
  └─bcache0 251:0 0 1024M 0 disk
fake-loop0 253:0 0 1G 0 dm

Only bcache1 was stopped; bcache0 keeps working.
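
The same result can be checked without reading the lsblk tree, by listing the bcache block devices the kernel still exposes. This is a minimal sketch; `list_bcache` is a hypothetical helper, and its optional directory argument exists only so the function can be tried against a scratch directory.

```shell
# list_bcache: print the names of bcache block devices found under a
# sysfs-style directory (defaults to /sys/block), one per line.
list_bcache() {
    dir=${1:-/sys/block}
    for d in "$dir"/bcache*; do
        # An unmatched glob stays literal, so skip entries that do not exist.
        if [ -e "$d" ]; then
            basename "$d"
        fi
    done
}

list_bcache
```

At this point in the run above, the call would be expected to list bcache0 only.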

# reboot

# ./setup-two-bcache-one-cache.reboot.sh >/dev/null 2>&1
[ 17.606164] bcache: register_bdev() registered backing device dm-0
[ 17.672177] bcache: register_bdev() registered backing device dm-1
[ 17.752456] bcache: bch_journal_replay() journal replay done, 4936 keys in 6 entries, seq 207
[ 17.760279] bcache: bch_cached_dev_attach() Caching dm-1 as bcache1 on set 2bf1e70a-6f20-4680-bc63-f803142f294d
[ 17.766759] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set 2bf1e70a-6f20-4680-bc63-f803142f294d
[ 17.771989] bcache: register_cache() registered cache device dm-2

# lsblk -e 252
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1G 0 loop
└─fake-loop0 253:0 0 1024M 0 dm
  └─bcache0 251:0 0 1024M 0 disk
loop1 7:1 0 1G 0 loop
└─fake-loop1 253:1 0 1024M 0 dm
  └─bcache1 251:128 0 1024M 0 disk
loop2 7:2 0 1G 0 loop
└─fake-loop2 253:2 0 1024M 0 dm
  ├─bcache0 251:0 0 1024M 0 disk
  └─bcache1 251:128 0 1024M 0 disk

Both bcache devices were reattached after the reboot.