fscking btrfs consisting of multiple partitions fails

Bug #1447879 reported by Roger Binns on 2015-04-24
This bug affects 1 person
Affects: systemd (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

I have two SSDs in a raid0 (striped) configuration, with my root and home as separate subvolumes. This setup has worked fine for several Ubuntu releases, but fails with systemd on 15.04. systemd does manage to mount root, but then it looks like it is trying to fsck the "disk" for home, which is nonsensical. After 90 seconds it drops into a useless shell where I can't fix things.

Relevant lines from /etc/fstab:

UUID=3ff68715-0daa-4e44-8de2-0997f36d8ab6 / btrfs defaults,device=/dev/disk/by-id/ata-Crucial_CT960M500SSD1_1347095B97B4-part1,device=/dev/disk/by-id/ata-Crucial_CT960M500SSD1_1338094EA3CA-part1,autodefrag,compress=lzo,subvol=@ 0 1
UUID=3ff68715-0daa-4e44-8de2-0997f36d8ab6 /home btrfs defaults,device=/dev/disk/by-id/ata-Crucial_CT960M500SSD1_1347095B97B4-part1,device=/dev/disk/by-id/ata-Crucial_CT960M500SSD1_1338094EA3CA-part1,autodefrag,compress=lzo,subvol=@home 0 2

Note how the lines are identical except for the subvol value. I originally didn't have the device= bits either, but had to add them for root because of how `btrfs device scan` is run. Whether or not they are present makes no difference for home.
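For context, a minimal sketch of what has to happen before a multi-device btrfs can be mounted (the by-id paths are the ones from the fstab above; running these by hand is an illustration, not something the report prescribes):

  # register every member device with the btrfs module; mounting by UUID
  # only works once all members have been seen
  btrfs device scan
  # or register the two members explicitly (same effect):
  btrfs device scan /dev/disk/by-id/ata-Crucial_CT960M500SSD1_1347095B97B4-part1
  btrfs device scan /dev/disk/by-id/ata-Crucial_CT960M500SSD1_1338094EA3CA-part1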

There is no such problem booting with upstart.

ProblemType: Bug
DistroRelease: Ubuntu 15.04
Package: systemd 219-7ubuntu3
ProcVersionSignature: Ubuntu 3.19.0-15.15-generic 3.19.3
Uname: Linux 3.19.0-15-generic x86_64
ApportVersion: 2.17.2-0ubuntu1
Architecture: amd64
CurrentDesktop: XFCE
Date: Thu Apr 23 19:45:38 2015
MachineType: System manufacturer System Product Name
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.19.0-15-generic root=UUID=3ff68715-0daa-4e44-8de2-0997f36d8ab6 ro rootflags=subvol=@ nomdmonddf nomdmonisw init=/sbin/upstart
SourcePackage: systemd
SystemdDelta:
 [EXTENDED] /lib/systemd/system/systemd-timesyncd.service → /lib/systemd/system/systemd-timesyncd.service.d/disable-with-time-daemon.conf

 1 overridden configuration files found.
UpgradeStatus: Upgraded to vivid on 2015-04-23 (0 days ago)
dmi.bios.date: 08/13/2013
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2104
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: P8Z77-V PRO
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2104:bd08/13/2013:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnP8Z77-VPRO:rvrRev1.xx:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Martin Pitt (pitti) wrote :

I use the same btrfs layout (our installer creates @ and @home subvolumes automatically), with a similar fstab except for these weird device= options, and it works just fine. So I can't reproduce this yet. Can you please boot up to the point where you get into the rescue shell, then do

  systemctl status -l systemd-fsck-root.service > /root/fsck-root-status.txt
  journalctl -b > /root/journal.txt

then reboot back to upstart and attach the two files here? Thanks!

summary: - Systemd outsmarts itself with btrfs
+ fscking btrfs mount fails
Changed in systemd (Ubuntu):
status: New → Incomplete

The "weird device=" lines were me trying to work around a different regression. I have two hard drive controllers on the motherboard - a 4 port one from Intel and a 4 port one from ASMedia. I also have 6 devices plugged in (8 at one point). Before 15.04 I placed the drives randomly, and it turned out that one half of root was on the Intel controller and the other half was on the ASMedia controller. That worked fine before 15.04. With 15.04 the btrfs device scan step was run after the first controller was detected, but before/during the second one. That meant that root couldn't be mounted by initramfs because the second half hadn't been found by the earlier btrfs device scan.

In theory, adding the device= bits would make that work. In practice, they didn't. Since I needed a working system, I had to change cables so that both devices were on the same, first-detected controller, at which point mounting root worked. I've attached lshw output from right now (i.e. after moving devices).

Roger Binns (ubuntu-rogerbinns) wrote :

The problem is actually the mounting of /home, not root. As you can see from the fstab, they are identical except for the subvol= parameter. Note that I also tried removing the device= bits, updating the initramfs and trying again. It made no difference.

Roger Binns (ubuntu-rogerbinns) wrote :

I also got the status for dev-disk-by\x2duuid-3ff68715\x2d0daa\x2d4e44\x2d8de2\x2d0997f36d8ab6.device/start, which just had messages about a file/directory not being found, without saying which ones it was looking for. (To be clear, the status for that job was retrieved just fine - it is the contents of the status that complained. I lost the file due to rebooting etc.)

The UUID it is stuck on is the one of the filesystem containing subvols @ and @home, and it is shared by both parts, /dev/sda1 and /dev/sdb1. Also note that reverse lookups such as /dev/disk/by-uuid/3ff68715-0daa-4e44-8de2-0997f36d8ab6 can only point to one of those two (sdb1 right now). And just to make life more interesting, the SSDs are one set of two identical 256GB drives and one set of two identical 960GB drives.
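To illustrate the reverse lookup (the output is a sketch, consistent with the statement above that the symlink currently resolves to sdb1):

  $ ls -l /dev/disk/by-uuid/3ff68715-0daa-4e44-8de2-0997f36d8ab6
  lrwxrwxrwx 1 root root 10 ... -> ../../sdb1
  # a single symlink per UUID, so the other member (sda1) is unreachable this way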

Roger Binns (ubuntu-rogerbinns) wrote :

$ sudo btrfs fi show
Label: 'main' uuid: 3ff68715-0daa-4e44-8de2-0997f36d8ab6
 Total devices 2 FS bytes used 408.16GiB
 devid 2 size 894.25GiB used 319.03GiB path /dev/sdb1
 devid 3 size 894.25GiB used 319.03GiB path /dev/sda1

Label: 'newspace' uuid: 9c0be64e-1841-445a-ac85-b2b46e92e5d8
 Total devices 1 FS bytes used 2.00TiB
 devid 1 size 3.58TiB used 2.33TiB path /dev/sdc2

Label: 'backups' uuid: b02cc605-dd78-40bc-98a5-8f5543d83b66
 Total devices 1 FS bytes used 662.07GiB
 devid 1 size 1.58TiB used 667.06GiB path /dev/sdd2

Label: none uuid: a877962c-c242-4517-8f5a-ae8da82bab64
 Total devices 2 FS bytes used 113.36GiB
 devid 1 size 207.74GiB used 65.03GiB path /dev/sdf3
 devid 2 size 208.67GiB used 65.03GiB path /dev/sde2

Martin Pitt (pitti) wrote :

Hmm, non-unique UUIDs? That makes them fairly useless :-) Do you know whether that is actually part of how the btrfs built-in RAID-0 mode works? I'm familiar with md, where you get one virtual device with the actual file system, and the underlying RAID components don't have a mountable file system and still have separate UUIDs, which is a lot simpler to handle. Do you have a pointer to documentation on how to set up such a system?

summary: - fscking btrfs mount fails
+ fscking btrfs mount fails with non-unique UUIDs

Roger Binns (ubuntu-rogerbinns) wrote :

Apologies for misleading you - I thought you were more familiar with btrfs. The summary change is incorrect: btrfs is doing the right thing. Quite simply, the filesystem is made up of more than one partition, so more than one partition correctly has the same UUID. But that also means /dev/disk/by-uuid/UUID can only point to one of the partitions making up the filesystem with that UUID.

The btrfs wiki talks about multiple devices: https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
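For reference, creating such a filesystem looks roughly like this (a sketch with hypothetical device names, not the commands used on this machine):

  # stripe data across two devices (raid0) and mirror metadata (raid1)
  mkfs.btrfs -L main -d raid0 -m raid1 /dev/sdX1 /dev/sdY1
  # both members now report the same filesystem UUID
  blkid -s UUID /dev/sdX1 /dev/sdY1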

@pitti: can you please change the title to something that isn't so misleading? That would let others realise this same bug affects them.

With btrfs, if you make a filesystem consist of multiple partitions, then each of those partitions gets the same UUID, because they are all parts of the same filesystem. This is not incorrect behaviour. PARTUUID and UUID_SUB will be unique per partition as expected, but those are not what you use in /etc/fstab.
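A sketch of how blkid separates the two (the UUID_SUB value below is a placeholder, not from this machine):

  $ sudo blkid -o export /dev/sda1
  DEVNAME=/dev/sda1
  LABEL=main
  UUID=3ff68715-0daa-4e44-8de2-0997f36d8ab6
  UUID_SUB=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
  TYPE=btrfs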

I did a fresh install in VirtualBox with a raid0 btrfs and @ and @home subvolumes, and it did work. My real system, with considerably more drives, is more complicated.

How do I troubleshoot this? I need to know what systemd thinks it is doing and why.
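(For reference, a hedged starting point, assuming the unit names that systemd-fstab-generator derives from this fstab:)

  systemctl cat home.mount                 # the unit generated from the /home fstab line
  systemctl list-dependencies home.mount   # the fsck and .device units it waits on
  journalctl -b -u home.mount              # log output for the mount attempt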

Martin Pitt (pitti) on 2015-05-26
summary: - fscking btrfs mount fails with non-unique UUIDs
+ fscking btrfs consisting of multiple partitions fails
Changed in systemd (Ubuntu):
status: Incomplete → New
Martin Pitt (pitti) wrote :

Quick answer: it's probably better to ask that on http://lists.freedesktop.org/mailman/listinfo/systemd-devel ; there are a bunch of people there familiar with both btrfs and systemd. Also, systemd-fsck-root.service apparently succeeded, so it's a different partition that fails. What does this say?

  sudo systemctl status systemd-fsck@*.service

Roger Binns (ubuntu-rogerbinns) wrote :

I will ask on the mailing list. It is failing to mount /home. The lines in /etc/fstab for root and /home are identical, except that one has subvol=@ and the other subvol=@home.

Root mounts just fine. When it comes time to mount /home, systemd tries to check the filesystem, but that isn't going to succeed, as all the parts are already mounted for root.

There are some logs included in earlier responses. No clues as to the actual problem though. I will reboot and get the logs for your request.
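(As an aside not made in the report: fsck.btrfs, shipped by btrfs-tools, does no actual checking; it exists only to satisfy the fsck interface and always exits 0, so the boot-time check on /home should never be doing real work anyway:)

  $ fsck.btrfs -a /dev/sda1; echo $?
  0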

These are outputs from the root shell you get dumped into on an unsuccessful boot. I did reboot back into it at one point because it kept saying that logs had been rotated.

The mailing list thread eventually went nowhere, even with someone helping me out privately. He seemed to think it was a udev issue, which was highly amusing, as the udev (and systemd) binaries were running from the very devices they believed weren't ready yet.

Here is the final scorecard. I have 3 systems, all using btrfs with raid0 across two devices. On each, root and home are separate subvolumes on that filesystem.

System 1 (workstation - the system in this report): systemd brings the system up but times out waiting to mount home. Setting /home to nofail in fstab (sketched below, after the scorecard) and then running mount -a from a console works just fine. This is the only system where the two drives are identical - exactly the same SSD models and firmware versions, on the same controller. (Note the system fails to boot if they are on different controllers, which wasn't a problem with earlier Ubuntu versions.) Booting with upstart works fine and is how I have now configured the system.

System 2 (laptop): the two devices are wrapped in LUKS/dm-crypt. They are decrypted at boot, then there is a pause, and then finally everything comes up. Most of the time that pause is 3 minutes, but sometimes it is only a few seconds. Again, no problems in earlier Ubuntu releases.

System 3 (server): like system 1 it uses bare devices, but in this case they are about the same capacity and from different vendors. Everything works perfectly, and if anything it boots a bit faster than with earlier Ubuntu releases.
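For anyone wanting the System 1 workaround, here is the nofail variant of the /home line, sketched from the fstab above (device= options dropped, since they reportedly made no difference):

  UUID=3ff68715-0daa-4e44-8de2-0997f36d8ab6 /home btrfs defaults,nofail,autodefrag,compress=lzo,subvol=@home 0 2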
