btrfs seed sprout based installation

Bug #1842198 reported by Chris Murphy
This bug affects 1 person
Affects: subiquity
Status: New
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

This is not a bug, just an idea for an enhancement, so please file it accordingly.

The current image, e.g. ubuntu-19.04-live-server-amd64.iso, contains an embedded squashfs image. If the image were instead btrfs based, created with the compress=zstd:15 mount option, you could do a few extra things for free.

a. You get close to the same compression you're used to with the squashfs image (it's a bit bigger, not quite as efficient as squashfs)

b. Most users now skip the media checker. But since the btrfs image contains metadata and data checksums, any corruption results in EIO rather than propagating to the installation. The check is non-optional. It doesn't cover corruption outside of the root image, but it's definitely better than no checking; plus it's a persistent check rather than a one-time check.

c. It's still possible to cp/rsync/tar pipe the source to a non-btrfs destination. But if the destination is also chosen to be btrfs, then instead of running mkfs, just add the destination device as a new device for the btrfs volume and delete the seed device, which causes the data to be replicated to the destination.
https://btrfs.wiki.kernel.org/index.php/Seed-device

Example: in a rudimentary test with a Fedora live image, a conventional rsync install takes ~5 minutes. Using seed sprout replication, it's done 20 seconds after the removal of the seed device is issued. It's fast because compressed block groups, not files, are copied directly from source to destination, so there is no decompression + recompression.

d. Reboot is optional. Once the seed is removed, there's no dependency on the install media, so the media can be removed. Of course the current environment is overlayfs based, so the no-reboot feature would take a bit of work to make it use the btrfs overlay method instead, but if you've made it this far...

This is similar in concept to LVM2's pvmove command, but quite a bit faster, and it uses transparent compression and checksumming.
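A rough sketch of the seed sprout step from (c), written as a dry run that only builds the command list and executes nothing. The device paths and mount point are hypothetical, and it assumes the image was already flagged as a seed when it was built (e.g. with `btrfstune -S 1`). The explicit remount to read-write before removing the seed follows the usual seed-device procedure and is not spelled out in this report.

```python
import shlex
from typing import List


def seed_sprout_commands(seed_dev: str, target_dev: str, mountpoint: str) -> List[List[str]]:
    """Build the command sequence for a seed sprout install (dry run only)."""
    return [
        # A seed filesystem can only be mounted read-only.
        ["mount", "-o", "ro", seed_dev, mountpoint],
        # Add the install target as a new device of the same volume.
        ["btrfs", "device", "add", target_dev, mountpoint],
        # With a writable device present, the volume can go read-write.
        ["mount", "-o", "remount,rw", mountpoint],
        # Removing the seed moves its block groups onto the target.
        ["btrfs", "device", "remove", seed_dev, mountpoint],
    ]


if __name__ == "__main__":
    # Hypothetical devices: a loop device for the image, a partition as target.
    for cmd in seed_sprout_commands("/dev/loop0", "/dev/sda2", "/mnt"):
        print(shlex.join(cmd))
```

An installer would run these in order as root and stop at the first failure; once the last command returns, the volume lives only on the target device.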

Some other optimizations are possible. Servers often have gobs of memory and decently fast internal storage. Check for that, and if there's enough memory, start this replication right away using e.g. a /dev/zram0 device as the destination. The user can be making installation choices while this is happening. Whether the payload is then copied to a non-btrfs destination, or installed with 'btrfs dev add' + 'btrfs dev del', it's way faster when it comes from a RAM disk rather than from the stick.
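The "check for that" step could look something like the sketch below, which reads MemAvailable from /proc/meminfo and compares it with the image size. The function names and the 2x headroom factor are illustrative assumptions, not anything the installer currently does.

```python
def mem_available_bytes(meminfo_text: str) -> int:
    """Extract MemAvailable from /proc/meminfo contents, in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # The value is reported in kB (units of 1024 bytes).
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found")


def should_stage_in_ram(image_bytes: int, available_bytes: int, headroom: float = 2.0) -> bool:
    """True if the image fits in available memory with the given headroom.

    The headroom factor is an arbitrary safety margin (assumption).
    """
    return available_bytes >= image_bytes * headroom


def check_host(image_bytes: int) -> bool:
    """Apply the check to the running system."""
    with open("/proc/meminfo") as f:
        return should_stage_in_ram(image_bytes, mem_available_bytes(f.read()))
```

If the check passes, the installer could set up the zram device and start the replication in the background while the user answers questions.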

Of course, it can apply to desktop systems also.

Michael Hudson-Doyle (mwhudson) wrote :

These are certainly interesting ideas! I would guess we'd be more likely to do something zfs based than btrfs based but I'm not completely sure.

For d, having to reboot after the install is more of a feature than a bug IMO. If the installer has screwed up the bootloader configuration somehow, it's much better to find out immediately than three weeks later after a power cut! (Also once the install media is more than about three weeks old, you'll usually be installing a newer kernel than the one running the live session).

Opportunistically copying the rootfs to RAM if there is a lot of it is certainly an interesting idea. That said, I installed my NVMe laptop with the server installer the other day, and the entire curtin invocation took only about 80s, of which the rsync took only 20s, so there's not a whole heap of room for improvement when the disks are this fast anyway.

Changed in subiquity:
importance: Undecided → Wishlist