"btrfs device delete /dev/sdaX /" fails with error "ERROR: error removing the device '/dev/sdaX'"

Bug #880645 reported by Erik B. Andersen on 2011-10-24
This bug affects 9 people
Affects: btrfs-tools (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

What I did to cause this:

Installed Ubuntu with a btrfs root.
Added a device with 'btrfs device add /dev/sda9 /'
Balanced the file system with 'btrfs filesystem balance /'
Tried to remove a device with 'btrfs device delete /dev/sda9 /'. Noticed that I got the error "ERROR: error removing the device '/dev/sda9'".

(Unfortunately, I'm not sure how to get more information on the problem.)

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: btrfs-tools 0.19+20100601-3ubuntu3
ProcVersionSignature: Ubuntu 3.0.0-12.20-generic 3.0.4
Uname: Linux 3.0.0-12-generic x86_64
NonfreeKernelModules: nvidia wl
ApportVersion: 1.23-0ubuntu3
Architecture: amd64
Date: Mon Oct 24 00:44:42 2011
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111010)
ProcEnviron:
 LC_CTYPE=C
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/zsh
SourcePackage: btrfs-tools
UpgradeStatus: No upgrade log present (probably fresh install)

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in btrfs-tools (Ubuntu):
status: New → Confirmed
Erik B. Andersen (azendale) wrote :

According to #btrfs on freenode, this problem occurs because balancing the filesystem converts the metadata to raid1 (mirrored). After that you are not allowed to remove a device, because with only one device left the metadata could no longer be raid1 mirrored. The odd question this raises is why you are allowed to create a raid1 btrfs filesystem that has only one device in the first place.

It seems you can avoid the raid1 problem by creating your btrfs partitions with 'mkfs.btrfs -m single -d single /dev/sdxx', which specifies that neither the data (-d) nor the metadata (-m) is raid1 mirrored.
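
One way to check whether this applies to a given filesystem is to look at the metadata profile that 'btrfs filesystem df' reports. The following is a minimal sketch, assuming the "Metadata, RAID1: total=..." line format printed by recent btrfs-progs (older releases may format the line differently). The helper name metadata_profile is made up for illustration.

```shell
#!/bin/sh
# metadata_profile: read `btrfs filesystem df` output on stdin and print the
# profile of every "Metadata" line (for example RAID1, DUP or single).
# Hypothetical helper; the parsed line format is an assumption.
metadata_profile() {
    awk -F'[,:]' '$1 == "Metadata" { gsub(/ /, "", $2); print $2 }'
}

# Usage (needs a mounted btrfs filesystem, typically as root):
#   btrfs filesystem df / | metadata_profile
```

If this prints RAID1, removing a device so that only one remains is expected to be refused until the metadata has been converted to another profile.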

Tom (doyenguy) wrote :

I can also reproduce this bug, both with 10.10 (kernel 2.6.35) and 11.10 (which has 3.x).

It also fails even if you use "mkfs.btrfs -m single -d single /dev/sdxx" when you create the initial device for the btrfs filesystem.

It seems to work OK if none of the devices have data on them. For example, if you create several small loopback devices and add no data to the filesystem, you will have no trouble adding and removing devices willy-nilly.
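
A loopback experiment of this kind could look roughly like the sketch below. It is not the attached testcase; the function name, the DRY_RUN switch and the image paths in the comments are assumptions for illustration. The commands themselves need root and will overwrite whatever is on the devices passed in, so only scratch loop devices should be used.

```shell
#!/bin/sh
# run: execute a command, or just print it when DRY_RUN=1 (illustrative helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# loop_add_remove DEV1 DEV2 MOUNTPOINT
# DEV1 and DEV2 are assumed to be loop devices already attached to scratch
# image files, for example:
#   truncate -s 1G /tmp/a.img && losetup --find --show /tmp/a.img
loop_add_remove() {
    dev1=$1
    dev2=$2
    mnt=$3
    # Create a single-profile filesystem on the first device and mount it.
    run mkfs.btrfs -m single -d single "$dev1" || return 1
    run mount "$dev1" "$mnt" || return 1
    # Add the second device, flush, then try to remove it again.
    run btrfs device add "$dev2" "$mnt" || return 1
    run sync
    run btrfs device delete "$dev2" "$mnt"
}
```

Running it first with DRY_RUN=1 prints the commands without touching any device.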

Here's how it pops up for me: I have a 2 TB btrfs filesystem (1 drive) about halfway full. I then added a 2nd 2TB drive to the pool/filesystem. I did a sync and then immediately tried to remove the 2nd device. No go. You can issue "btrfs device delete ..." all you like. It won't remove, even though no data has been written to it yet. It's locked in place it seems :-(

I am attaching a testcase that reproduces the bug.

Ketil Malde (ketil-ii) wrote :

I get an error when trying to remove one disk from a set of three (so metadata could still be mirrored). This is on Precise Pangolin. I also include the output of "filesystem show", as something clearly happened here, although I'm unsure how to interpret it.

% sudo btrfs fi show
Label: 'scratch' uuid: 184706ea-89f5-438a-a9f5-b5e91b3ce267
        Total devices 3 FS bytes used 140.62GB
        devid 1 size 2.73TB used 138.02GB path /dev/sda
        devid 2 size 2.73TB used 137.01GB path /dev/sdb
        devid 3 size 2.73TB used 138.01GB path /dev/sdc

% sudo btrfs dev del /dev/sdc /scratch
ERROR: error removing the device '/dev/sdc'

% sudo btrfs fi show
Label: 'scratch' uuid: 184706ea-89f5-438a-a9f5-b5e91b3ce267
        Total devices 3 FS bytes used 141.49GB
        devid 1 size 2.73TB used 75.03GB path /dev/sda
        devid 2 size 2.73TB used 75.01GB path /dev/sdb
        devid 3 size 2.73TB used 138.01GB path /dev/sdc

Btrfs Btrfs v0.19

jperez (jmperezbeth) wrote :

I'm seeing the same problem on a 4-disk setup with raid1 metadata and raid0 data. There is enough free space on the filesystem. I added a fourth disk to replace one with pending sector reallocations (only 56 to be exact, on a 1.5 TB disk), tried to delete the bad one, and had no luck. The system exercises the disks for a couple of hours and then pop... error removing the device.

I'm actually concerned about the old version of btrfs-tools that ships with 11.10; its btrfs command doesn't support scrub etc. And 12.04 ships the same version! So waiting a month won't update the tools. :-/

Thorsten Zitterell (hik) wrote :

I installed the mainline kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-precise/ and compiled the btrfs tools from git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git.

Then, I could remove the second drive after converting raid1 back to raid0:

% ./btrfs balance start --force -mconvert=raid0 /

Thomas Mayer (thomas303) wrote :

With Ubuntu 14.04, Kernel 3.16 (from HES), I could btrfs-balance a non-raid /dev/sda2 btrfs fs with

cd /mnt
mkdir fulldisk
mount /dev/sda2 fulldisk
btrfs device add /dev/sdb fulldisk
btrfs balance start fulldisk
btrfs device delete /dev/sdb fulldisk

But with Ubuntu 14.04, Kernel 4.2.0-30-generic, btrfs-delete gives me the message:

"error removing the device '/dev/sdb' - unable to go below two devices on raid1"

It does not make sense to me why btrfs is reporting a raid 1 (mirroring) instead of a raid 0 (for adding a device).

I consider this to be a regression (or at least a change in behaviour) somewhere between kernel 3.16 and kernel 4.2. Note that I'm still using the same btrfs-tools (3.12), so the change in behaviour should have happened in the kernel.

-----------

I have:

btrfs filesystem show
Label: none uuid: 16120d81-8cde-4e81-87cd-f55f65a4923b
        Total devices 2 FS bytes used 1.99TiB
        devid 1 size 2.73TiB used 1.81TiB path /dev/sda2
        devid 2 size 465.76GiB used 200.03GiB path /dev/sdb

Rolf Leggewie (r0lf) wrote :

The error as originally reported is due to btrfs converting metadata and system data to RAID1 when a second device is added to a single-device filesystem and the filesystem is rebalanced. You can then no longer go below 2 devices. Check dmesg or /var/log/syslog as suggested. The remedy is to convert the metadata to single with -mconvert, remove the device, and then convert the metadata to DUP with another -mconvert. Cumbersome indeed, but not a bug.
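
The remedy could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: it assumes a btrfs-progs version that has 'btrfs balance start' with convert filters, that the data profile is already single, and that --force is needed for the first conversion because it reduces metadata redundancy (as reported elsewhere in this thread). The function name and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# run: execute a command, or just print it when DRY_RUN=1 (illustrative helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# remove_second_device MOUNTPOINT DEVICE
remove_second_device() {
    mnt=$1
    dev=$2
    # 1. Convert metadata from raid1 to single so one device is enough.
    run btrfs balance start --force -mconvert=single "$mnt" || return 1
    # 2. Remove the device; its chunks are relocated to the remaining one.
    run btrfs device delete "$dev" "$mnt" || return 1
    # 3. Restore metadata redundancy on the single remaining device.
    run btrfs balance start -mconvert=dup "$mnt"
}
```

Calling it with DRY_RUN=1 first, for example 'DRY_RUN=1; remove_second_device / /dev/sda9', prints the three commands for review before anything is changed.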

Most comments from #8 onwards are about different issues and should be dealt with in their own tickets.

Changed in btrfs-tools (Ubuntu):
status: Confirmed → Invalid
Aeaeaeaeaeae (aeaeaeaeaeae) wrote :

Should btrfs maybe print a hint to add the "--force" option when it refuses to run

"btrfs balance start -mconvert=raid0 /" (maybe "single" instead of "raid0" works too)?

When I tried this command without --force, I just got "invalid argument"; only the log file suggested using "--force".

Similarly, it would be helpful if "btrfs device delete" printed a hint to do the -mconvert=raid0 (or single) in order to get back to a setup with only one disk.

This would help to avoid hours of pointless googling...

Thomas Mayer (thomas303) wrote :

I agree that this issue is not a bug as long as the behaviour I experienced is intentional.

I fully agree that in case the behaviour is intentional, it also should be documented somewhere (including kernel versions in case of breaking changes).

I also agree that the btrfs tools' output should be more verbose and explain what is happening in the background, why an operation no longer works, and how to fix it. I think I came quite close to data loss, which is why I think this issue is still relevant and somewhat critical (from a usability perspective at least).
