Bcache doesn't allow full unregistering without rebooting

Bug #1377142 reported by Kick In on 2014-10-03
This bug affects 2 people
Affects                Status      Importance  Assigned to  Milestone
bcache-tools (Ubuntu)  Invalid     Undecided   Unassigned
linux (Ubuntu)         Incomplete  Medium      Unassigned

Bug Description

If you create a bcache device, you can't reuse all of its disks/partitions without a reboot.

You can reproduce the case this way:

Start a VM with 2 disks (the caching device must be at least as large as the backing device, cf. bug 1377130) and an ISO of Utopic desktop amd64.

create the bcache device:

  make-bcache --writeback --discard -C /dev/sda -B /dev/sdb

  UUID: b245150d-cfbe-4f90-836a-343e0e1a4c55
  Set UUID: c990a31a-f531-4231-9603-d40230dc6504
  version: 0
  nbuckets: 16384
  block_size: 1
  bucket_size: 1024
  nr_in_set: 1
  nr_this_dev: 0
  first_bucket: 1
  UUID: cc31e0bb-db29-4115-a1b2-e9ff54e5f127
  Set UUID: c990a31a-f531-4231-9603-d40230dc6504
  version: 1
  block_size: 1
  data_offset: 16

  ******************************
  command:
  lsblk

  result:
  NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
  sda 8:0 0 8G 0 disk
  └─bcache0 251:0 0 16G 0 disk
  sdb 8:16 0 16G 0 disk
  └─bcache0 251:0 0 16G 0 disk
  sr0 11:0 1 1,1G 0 rom /cdrom
  loop0 7:0 0 1,1G 1 loop /rofs
  ******************************

All is good

  lsmod | grep bcache:
  bcache 227884 3

format the bcache device:

  ******************************
  command:
  mkfs.ext4 /dev/bcache0

  result:
  Discarding device blocks: 4096/4194302 done
  Creating filesystem with 4194302 4k blocks and 1048576 inodes
  Filesystem UUID: 587d2249-3eaf-4590-a00d-42939f257e99
  Superblock backups stored on blocks:
   32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
   4096000

  Allocating group tables: 0/128 done
  Writing inode tables: 0/128 done
  Creating journal (32768 blocks): done
  Writing superblocks and filesystem accounting information: 0/128 2/128 done

  ******************************

Now mount it:
  mount /dev/bcache0 /mnt/
  mkdir /mnt/test_dir
  touch /mnt/test_file

state of: /sys/fs/bcache/
  ls /sys/fs/bcache/

  c990a31a-f531-4231-9603-d40230dc6504
  register
  register_quiet

bcache-super-show /dev/sda
  sb.magic ok
  sb.first_sector 8 [match]
  sb.csum E6A8D9AC496B0B04 [match]
  sb.version 3 [cache device]

  dev.label (empty)
  dev.uuid b245150d-cfbe-4f90-836a-343e0e1a4c55
  dev.sectors_per_block 1
  dev.sectors_per_bucket 1024
  dev.cache.first_sector 1024
  dev.cache.cache_sectors 16776192
  dev.cache.total_sectors 16777216
  dev.cache.ordered yes
  dev.cache.discard yes
  dev.cache.pos 0
  dev.cache.replacement 0 [lru]

  cset.uuid c990a31a-f531-4231-9603-d40230dc6504
  ******************************

bcache-super-show -f /dev/sdb

  sb.magic ok
  sb.first_sector 8 [match]
  sb.csum 9600B159F36A67DD [match]
  sb.version 1 [backing device]

  dev.label (empty)
  dev.uuid cc31e0bb-db29-4115-a1b2-e9ff54e5f127
  dev.sectors_per_block 1
  dev.sectors_per_bucket 1024
  dev.data.first_sector 16
  dev.data.cache_mode 1 [writeback]
  dev.data.cache_state 2 [dirty]

  cset.uuid c990a31a-f531-4231-9603-d40230dc6504
  ******************************

mount:
  /dev/bcache0 on /mnt type ext4 (rw)

we will unregister the bcache:
  echo 1 > /sys/fs/bcache/c990a31a-f531-4231-9603-d40230dc6504/unregister

check the content of /sys/fs/bcache
  ls /sys/fs/bcache/

  register
  register_quiet

So bcache is unregistered, but here is the status of /dev/sda:

  bcache-super-show -f /dev/sda
  sb.magic ok
  sb.first_sector 8 [match]
  sb.csum E6A8D9AC496B0B04 [match]
  sb.version 3 [cache device]

  dev.label (empty)
  dev.uuid b245150d-cfbe-4f90-836a-343e0e1a4c55
  dev.sectors_per_block 1
  dev.sectors_per_bucket 1024
  dev.cache.first_sector 1024
  dev.cache.cache_sectors 16776192
  dev.cache.total_sectors 16777216
  dev.cache.ordered yes
  dev.cache.discard yes
  dev.cache.pos 0
  dev.cache.replacement 0 [lru]

  cset.uuid c990a31a-f531-4231-9603-d40230dc6504

  command:
  bcache-super-show -f /dev/sdb

  result:
  sb.magic ok
  sb.first_sector 8 [match]
  sb.csum D41799F794675FA8 [match]
  sb.version 1 [backing device]

  dev.label (empty)
  dev.uuid cc31e0bb-db29-4115-a1b2-e9ff54e5f127
  dev.sectors_per_block 1
  dev.sectors_per_bucket 1024
  dev.data.first_sector 16
  dev.data.cache_mode 1 [writeback]
  dev.data.cache_state 0 [detached]

  cset.uuid 00000000-0000-0000-0000-000000000000
  ******************************

Maybe I'm wrong, but I would expect the caching device's cset.uuid to be zeroed, not the backing device's, as we may still want to use the data on the backing device.
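
The comparison above can be scripted rather than read by eye. The following is only a sketch: sb_field and sb_is_detached are hypothetical helper names, and they assume the "key value" layout of the bcache-super-show output shown in this report.

```shell
# Print the first value found for a given key in bcache-super-show output
# read from stdin (assumes one "key value ..." pair per line, as above).
sb_field() {
    awk -v key="$1" '$1 == key { print $2; exit }'
}

# Succeed when the superblock read from stdin has an all-zero cset.uuid,
# i.e. the device no longer points at any cache set.
sb_is_detached() {
    sb_uuid=$(sb_field cset.uuid)
    [ "$sb_uuid" = "00000000-0000-0000-0000-000000000000" ]
}
```

For example, "bcache-super-show -f /dev/sdb | sb_is_detached && echo detached" would report the backing device as detached after the unregister step, while the same check on /dev/sda would not.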

But data is still accessible on the mount point.

  So we wipe /dev/sda now:
  ******************************
  command:
  wipefs -a /dev/sda

  result:
  /dev/sda: 16 bytes were erased at offset 0x00001018 (bcache): c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81
  ******************************

mount:
  /dev/bcache0 on /mnt type ext4 (rw)

data still there:
  ******************************
  command:
  ls /mnt/

  result:
  lost+found
  test_dir
  test_file
  ******************************

  ******************************
  command:
  lsblk

  result:
  NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
  sda 8:0 0 8G 0 disk
  sdb 8:16 0 16G 0 disk
  └─bcache0 251:0 0 16G 0 disk /mnt
  sr0 11:0 1 1,1G 0 rom /cdrom
  loop0 7:0 0 1,1G 1 loop /rofs
  ******************************

OK, now we can back up our data if we need to. Next, we unmount:
  umount /mnt/

ok no errors.

  lsblk:
  NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
  sda 8:0 0 8G 0 disk
  sdb 8:16 0 16G 0 disk
  └─bcache0 251:0 0 16G 0 disk
  sr0 11:0 1 1,1G 0 rom /cdrom
  loop0 7:0 0 1,1G 1 loop /rofs
  ******************************

bcache0 is still there. OK, but how do we delete it or unregister it for good?

  ls /sys/fs/bcache/

  result:
  register
  register_quiet
  **************

  lsmod

  result:
  Module Size Used by
  bcache 227884 1

We can't run wipefs -a /dev/sdb because the device is in use, so we reboot the machine (NOT OK).
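
Before rebooting, it can help to see what still holds the disk. The sketch below reads the holders directory that sysfs exposes for each block device. list_holders is a hypothetical helper, and its optional second argument (an alternative sysfs root) is an assumption added only so the function can be tried against a fake tree.

```shell
# List the devices that still hold DISK, one name per line.
# Usage: list_holders DISK [SYSFS_ROOT]   (SYSFS_ROOT defaults to /sys)
list_holders() {
    lh_dir="${2:-/sys}/block/$1/holders"
    [ -d "$lh_dir" ] || return 1          # unknown disk
    for lh_entry in "$lh_dir"/*; do
        # An empty directory leaves the glob unexpanded; skip that case.
        [ -e "$lh_entry" ] || [ -L "$lh_entry" ] || continue
        basename "$lh_entry"
    done
    return 0
}
```

In the state described here, "list_holders sdb" would be expected to still print bcache0, matching the lsblk tree above, while "list_holders sda" would print nothing.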

  in the vm:
  ls /sys/fs/bcache/

  result:
  register
  register_quiet

update packages and reinstall bcache-tools:

  bcache-super-show -f /dev/sda

  result:
  Invalid superblock (bad magic)
  sb.magic bad magic

  bcache-super-show -f /dev/sdb

  result:
  sb.magic ok
  sb.first_sector 8 [match]
  sb.csum D41799F794675FA8 [match]
  sb.version 1 [backing device]

  dev.label (empty)
  dev.uuid cc31e0bb-db29-4115-a1b2-e9ff54e5f127
  dev.sectors_per_block 1
  dev.sectors_per_bucket 1024
  dev.data.first_sector 16
  dev.data.cache_mode 1 [writeback]
  dev.data.cache_state 0 [detached]

  cset.uuid 00000000-0000-0000-0000-000000000000

Now we can wipe /dev/sdb because we rebooted and bcache doesn't use the device anymore:
  wipefs -a /dev/sdb

  result:
  /dev/sdb: 16 bytes were erased at offset 0x00001018 (bcache): c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81

  **********************************************************

Now I can reuse my device, but I needed a reboot. Maybe we also need better bcache management tools.

Kick In (kick-d) on 2014-10-03
Changed in linux (Ubuntu):
milestone: none → ubuntu-14.10
summary: - Bcache don't allow full unregistering without rebooting
+ Bcache doesn't allow full unregistering without rebooting
Kick In (kick-d) on 2014-10-03
description: updated

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1377142

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.17 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.17-rc7-utopic/

Changed in linux (Ubuntu):
importance: Undecided → Medium
Chris J Arges (arges) wrote :

bug 1377130 is the other bug referenced by this one

Stefan Bader (smb) wrote :

There does not seem to be clear documentation on this. Just a note that from your description I was not completely sure whether you did unmount before starting the release procedure. I would say unmount is a must.
Doing so, I also ended up in the state where bcache0 still existed. But I was able to get rid of that by

echo 1 >/sys/block/bcache0/bcache/stop

Kick In (kick-d) wrote :

Yes, you can get rid of bcache0, but you can't reuse the device for
anything (bcache again, or plain fdisk/parted) unless you reboot.

Kick In (kick-d) wrote :

In fact, whether or not you unmount before stopping the bcache doesn't
change the behaviour. I did it this way in the description to show
the process. You just stop the caching device with the echo 1 > ...
But the backing device is still in use, and you have no control over it.

Stefan Bader (smb) wrote :

Hm, it helped in my case. Here is my re-run with a utopic KVM host:

# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 16G 0 disk
└─bcache0 251:0 0 8G 0 disk /mnt
sdb 8:16 0 8G 0 disk
└─bcache0 251:0 0 8G 0 disk /mnt

# bcache-super-show /dev/sda
sb.magic ok
sb.first_sector 8 [match]
sb.csum 29533C7B4D0F16EA [match]
sb.version 3 [cache device]

dev.label (empty)
dev.uuid 91706fc8-39ae-4cbd-b0a0-8a202ee6a377
dev.sectors_per_block 1
dev.sectors_per_bucket 1024
dev.cache.first_sector 1024
dev.cache.cache_sectors 33553408
dev.cache.total_sectors 33554432
dev.cache.ordered yes
dev.cache.discard yes
dev.cache.pos 0
dev.cache.replacement 0 [lru]

cset.uuid a8f70bd1-48df-462f-9a2c-ca4b8af9059c

# bcache-super-show /dev/sdb
sb.magic ok
sb.first_sector 8 [match]
sb.csum 81BB0342C7270559 [match]
sb.version 1 [backing device]

dev.label (empty)
dev.uuid 4ae46f48-d11d-45ca-a1ec-905eaf8e1da8
dev.sectors_per_block 1
dev.sectors_per_bucket 1024
dev.data.first_sector 16
dev.data.cache_mode 1 [writeback]
dev.data.cache_state 1 [clean]

cset.uuid a8f70bd1-48df-462f-9a2c-ca4b8af9059c

Stefan Bader (smb) wrote :

# umount /mnt
# echo 1 >/sys/fs/bcache/a8f70bd1-48df-462f-9a2c-ca4b8af9059c/unregister
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 16G 0 disk
sdb 8:16 0 8G 0 disk
└─bcache0 251:0 0 8G 0 disk

# echo 1 >/sys/block/bcache0/bcache/stop
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 16G 0 disk
sdb 8:16 0 8G 0 disk

# wipefs -a /dev/sda
/dev/sda: 16 bytes were erased at offset 0x00001018 (bcache): c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81
# wipefs -a /dev/sdb
/dev/sdb: 16 bytes were erased at offset 0x00001018 (bcache): c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81
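
The two sysfs writes in the transcript above can be wrapped in one helper. This is only a sketch of that sequence, not a bcache-tools command: bcache_release is a hypothetical name, the third argument (an alternative sysfs root) is an assumption for dry runs, and unmounting beforehand and running wipefs afterwards are left to the caller.

```shell
# Unregister a cache set, then stop the bcache device that sits on the
# backing disk, in the same order as the transcript above.
# Usage: bcache_release CSET_UUID BCACHE_DEV [SYSFS_ROOT]
bcache_release() {
    br_root=${3:-/sys}
    br_unreg="$br_root/fs/bcache/$1/unregister"
    br_stop="$br_root/block/$2/bcache/stop"
    # Each write is skipped when the file is absent (already released).
    if [ -e "$br_unreg" ]; then
        echo 1 > "$br_unreg" || return 1
    fi
    if [ -e "$br_stop" ]; then
        echo 1 > "$br_stop" || return 1
    fi
    return 0
}
```

With the values from this run, the call would be "bcache_release a8f70bd1-48df-462f-9a2c-ca4b8af9059c bcache0" as root, after umount /mnt.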

Stefan Bader (smb) wrote :

Interesting note: "echo 1 >/sys/block/bcache0/bcache/stop" from a point where the bcache is registered and running (but not mounted) directly releases both devices for me.

Stefan Bader (smb) wrote :

Ok, but that leaves sda in a state which looks unclaimed but is still in use.

Stefan Bader (smb) wrote :

which can be fixed by unregistering via /sys/fs/bcache... what a horrible interface.
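
To check whether a cache set is still registered, and so still claiming the caching device, the directories under /sys/fs/bcache can be listed. The sketch below assumes, as in the ls output earlier in this report, that every directory there is a cache set UUID and that register and register_quiet are plain files. list_cache_sets is a hypothetical helper, and its optional sysfs root argument is an assumption for testing.

```shell
# Print the UUID of every registered cache set, one per line.
# Usage: list_cache_sets [SYSFS_ROOT]   (SYSFS_ROOT defaults to /sys)
list_cache_sets() {
    lcs_root=${1:-/sys}
    for lcs_entry in "$lcs_root/fs/bcache"/*; do
        # Skip plain files such as register and register_quiet.
        [ -d "$lcs_entry" ] || continue
        basename "$lcs_entry"
    done
    return 0
}
```

Any UUID printed here can then be released by writing 1 to its unregister file.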

Kick In (kick-d) wrote :

Thanks,

I didn't write to /sys/block/bcache0/bcache/stop. I missed this because I was only looking in /dev/bcache* and in /sys/fs/bcache, which explains why I was left with a device still in use.

I agree that the interface is not very user-friendly; we may need to improve the bcache-tools commands to simplify this.

Ryan Harper (raharper) wrote :

While the interface is annoying, it *is* possible to have bcache let go of both backing store and cache devices via the sysfs interfaces.

Changed in bcache-tools (Ubuntu):
status: New → Invalid