Ubuntu 12.04 LTS, 14.04 LTS, 16.04 LTS do not support ext4 metadata checksumming

Bug #1365874 reported by Qoo Seven on 2014-09-05
This bug affects 14 people
Affects: e2fsprogs (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

In the Trusty release notes "Metadata checksumming" is listed as one of the tech highlights of kernel 3.13.

https://wiki.ubuntu.com/TrustyTahr/ReleaseNotes#New_features_in_14.04_LTS

However, the userland tools do not support this kernel feature. To the best of my knowledge this will be supported in 1.43, and won't be backported to 1.42.

IMHO this is very misleading. It's like a car salesperson selling you a sports car with a V8 engine: after you drive it home, you open the hood and realize only 6 cylinders are working because 2 of the spark plugs were not included, and you have to buy aftermarket ones.

<rant>BTW, I've been following the development of e2fsprogs for over a year. 1.43 release has been in limbo for God knows how long :( </rant>

Theodore Ts'o (tytso) wrote :

Canonical doesn't employ any ext4 engineers, and as far as I know, no Ubuntu developers contribute to ext4 development. So the fact the release notes might get a few things wrong shouldn't be surprising. Canonical simply doesn't have any file system developers on staff, as far as I know.

If you are willing to try out Metadata checksumming, testers who report bugs are always appreciated. The way to get a feature out sooner is to get more people to contribute to those features. If you are ranting about the lack of a feature, I wonder how much more you would rant if the feature was released before it was ready and your data got damaged or lost?

Well, Ubuntu is based on Debian (mostly the testing stage).

Looking at https://packages.qa.debian.org/e/e2fsprogs.html, we can see that Debian has no support for the extended ext4 features yet, and it looks like that will take a while. It was probably a little optimistic to add such features to the release notes.

Oleksij Rempel (olerem) wrote :

I would love to help with testing metadata_csum as much as possible, especially because I have been waiting for this option for such a long time. But there are two problems which make it hard to do:
- no e2fsprogs 1.43 package even on ppa.
- all systems are still created without 64bit enabled.

Before testing we need:
- add 64bit option to mke2fs.conf by default.
- make 1.42 not complain if metadata_csum is enabled.
- provide ppa package with big fat warning: "don't cry if you test it"

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in e2fsprogs (Ubuntu):
status: New → Confirmed
Theodore Ts'o (tytso) wrote :

Note that there were bugs found that necessitated format changes for metadata checksums in 3.18 --- which was released earlier this week.

E2fsprogs 1.43 hasn't been released yet, and e2fsprogs 1.42 has ***no*** support for metadata checksums.

People who are interested in testing metadata checksums are welcome, but it should be people who are willing to use bleeding edge kernels and it's not **that** hard to build e2fsprogs from the git tree.

Cheat sheet:

git clone git://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git
cd e2fsprogs
./debian/rules
dpkg-buildpackage -b -uc -rfakeroot

Someone who doesn't feel comfortable typing the above commands, subscribing to the linux-ext4 list, being prepared to potentially lose data, and sending bug reports to linux-ext4 probably shouldn't be experimenting with metadata_csum just yet.

I built a personal Ubuntu PPA for this tool collection (e2fsprogs).

I took a snapshot of git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git and used the master trunk maint. I will update my personal PPA monthly or if tytso tells me it's time to do so. :-)

I build deb packages for trusty, utopic and vivid, and/or you can use my bzr branch for these packages.

You can add my PPA to your Ubuntu system with:

sudo add-apt-repository ppa:daniel-mehrmann/admin
sudo apt-get update
sudo apt-get upgrade

Happy testing!!

(And don't forget to report bugs to linux-ext4 and help the developers make this feature more stable.)

Links:

PPA: https://launchpad.net/~daniel-mehrmann/+archive/ubuntu/admin
BZR branch: https://code.launchpad.net/~daniel-mehrmann/e2fsprogs/maint

Changed in e2fsprogs (Ubuntu):
assignee: nobody → Daniel Mehrmann (daniel-mehrmann)
status: Confirmed → In Progress

My e2fsprogs ppa got an update :-)

* New upstream snapshot from master branch (23-02-2015)

  - libext2fs: fix potential buffer overflow in closefs()
  - e2fsck: salvage under-sized dirents by removing them
  - e2fsck: improve the inline directory detector
  - e2fsck: inspect inline dir data as two directory blocks
  - e2fsck: decrement bad count _after_ remapping a duplicate block
  - e2fsck: handle multiple *ind block collisions with critical metadata
  - e2fsck: fix message when the journal is deleted and regenerated
  - e2fsck: on read error, don't rewrite blocks past the end of the fs
  - e2fsck: clear i_block[] when there are too many bad mappings on a special inode
  - tune2fs: direct user to resize2fs for 64bit conversion
  - tune2fs: abort when trying to enable/disable metadata_csum on mounted fs
  - tune2fs: disable csum verification before resizing inode
  - resize2fs: fix regression test to not depend on ext4.ko being loaded
  - libext2fs: fix tdb.c mmap leak
  - libext2fs: strengthen i_extra_isize checks when reading/writing xattrs
  - libext2fs: avoid pointless EA block allocation
  - libext2fs: initialize i_extra_isize when writing EAs
  - debugfs: fix crash in ea_set argument handling
  - debugfs: document new commands
  - misc: fix minor testcase problems
  - Reserve the codepoints for the new INCOMPAT feature ENCRYPT
  - buildsystem: use 'chmod a-w' instead of 'chmod -w'
  - e2fsck: fix corruption of Hurd filesystems
  - e2fuzz: fix clang warning
  - Fix clang warning and a resource leak
  - e2fsck: close the progress_fd in the logfile child process
  - libext2fs: add sanity check for an invalid itable_used value in inode scan code

New bzr branch is: https://code.launchpad.net/~daniel-mehrmann/e2fsprogs/master

Norbert (nrbrtx) wrote :

I recently installed Ubuntu 16.10 on my flash drive. When I tried to run fsck on it, I could not do so from any current Ubuntu LTS version (12.04, 14.04, 16.04). The only working solution was to scan from the 16.10 live install media.

So I think there are two solutions:
1. support the 'metadata_csum' feature on all currently supported Ubuntu LTS versions.
2. remove the 'metadata_csum' feature from the 16.10 default install options.

summary: - Ubuntu 14.04 does not support ext4 metadata checksumming
+ Ubuntu 12.04 LTS and 14.04 LTS does not support ext4 metadata
+ checksumming
tags: added: precise trusty xenial
summary: - Ubuntu 12.04 LTS and 14.04 LTS does not support ext4 metadata
+ Ubuntu 12.04 LTS, 14.04 LTS, 16.04 LTS do not support ext4 metadata
checksumming
Norbert (nrbrtx) wrote :

I reported bug 1601997 about setting 'metadata_csum' option on ext4 while installing fresh Ubuntu 16.10.

Changed in e2fsprogs (Ubuntu):
importance: Undecided → Wishlist
Richard Laager (rlaager) wrote :

This is fixed in 16.10. Is there a plan to backport this? I'm guessing not, because of the risk of regressions. If there are no plans to SRU, then this bug should probably be closed.

rich painter (painterengr) wrote :

I would REALLY like metadata_csum to be backported for 16.04 LTS

Norbert (nrbrtx) wrote :

Unable to check 16.10 ext4 file-system from 16.04 LTS (!).
Here is log:

   sudo fsck -fy /dev/sdc1
   fsck from util-linux 2.27.1
   e2fsck 1.42.13 (17-May-2015)
   /dev/sdc1 has unsupported feature(s): metadata_csum
   e2fsck: Get a newer version of e2fsck!
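
The failure in the log above comes down to a version question: the e2fsck that ships with 16.04 is 1.42.13, and according to this thread metadata_csum support only arrives with e2fsprogs 1.43. As a rough sketch of that comparison (assuming GNU `sort -V` is available, and treating 1.43 as the cutoff purely on the basis of the comments here), one could check a version banner like this. The helper names `e2fsck_version`, `version_ge` and `check_banner` are made up for illustration and are not part of e2fsprogs.

```sh
#!/bin/sh
# Sketch: is a given e2fsck new enough to handle metadata_csum?
# Assumption (from this thread): support starts with e2fsprogs 1.43.
MIN_VERSION="1.43"

# Extract the version from a banner line such as
# "e2fsck 1.42.13 (17-May-2015)", the format shown in the log above.
e2fsck_version() {
    printf '%s\n' "$1" | awk '$1 == "e2fsck" { print $2; exit }'
}

# Succeed if version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print "ok" or "too old" for a banner line.
check_banner() {
    v=$(e2fsck_version "$1")
    if [ -n "$v" ] && version_ge "$v" "$MIN_VERSION"; then
        echo "ok"
    else
        echo "too old"
    fi
}

check_banner "e2fsck 1.42.13 (17-May-2015)"    # prints: too old
check_banner "e2fsck 1.44.0 (made-up banner)"  # prints: ok
```

On a live system the banner could be taken from `e2fsck -V 2>&1 | head -n 1`, so the check runs before fsck is attempted on a filesystem created by a newer release.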

Norbert (nrbrtx) wrote :

Unable to check 17.10 ext4 file-system from 16.04 LTS (!).

Ken Sharp (kennybobs) wrote :

Aaaand... installing the hwe-edge packages on Xenial causes LVM EXT4 partitions to fail entirely because e2fsck cannot handle it.

A bit more of an issue than "wishlist".

Workaround: install Zesty e2fsck and e2libs and pray that they work.

TJ (tj) wrote :

I'm re-assigning the status and importance based on user reports in IRC and elsewhere.

See also the related "Ubuntu 16.10 installer sets metadata_csum option on ext4 partition which is incompatible with other LTS Ubuntu versions"

https://bugs.launchpad.net/ubuntu/+source/e2fsprogs/+bug/1601997

Changed in e2fsprogs (Ubuntu):
status: In Progress → Triaged
importance: Wishlist → Medium
assignee: Daniel Mehrmann (daniel-mehrmann) → nobody

Ubuntu 18.04 may well enable 64bit,metadata_csum by default (this is under review), thereby creating ext4 filesystems that are not compatible with e2fsck on Ubuntu 16.04 LTS (or 14.04).

This creates all sorts of problems for the compatibility/portability of filesystems, for example:
* dual-booting 18.04 and older LTS versions
* "ext4 portable disks" do not work
* Ken's hwe-edge/LVM issue above

I strongly support Xenial getting a backport of e2fsprogs 1.43 (as requested) so that this compatibility annoyance is at least ameliorated.
Debian has already done this, creating a "jessie-backports" package, e2fsprogs=1.43.3-1~bpo8+1.

Hopefully tytso can advise us on the best version of e2fsprogs to backport (18.04 currently has 1.43.9-2).

Theodore Ts'o (tytso) wrote :

I recently released e2fsprogs 1.44.0 (currently in Debian Unstable, should hopefully hit Debian Testing in three more days) which turns on Metadata Checksums for everyone.

https://packages.debian.org/sid/e2fsprogs

Pulling in 1.44.0 adds support for two new features, largedir and ea_inode. Neither is turned on by default yet, but the second in particular is very useful for Samba servers, since it was written specifically to support Windows ACLs and Security IDs more efficiently than any other file system by supporting xattr dedupe.

http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.44.0

In general e2fsprogs has a very conservative release process, and there have been a *large* number of bug fixes since e2fsprogs 1.42.13, including fixes for bugs that can cause horrible file system corruption and data loss if you do off-line resize2fs operations (resizing the file system while the file system is not mounted) under some circumstances. So if you are using a distribution that has e2fsprogs 1.42.13, on the misguided assumption that staying on an ancient version of e2fsprogs is "safer" --- that is simply not true.

One caveat: in 1.44.0 I started relying on dpkg build profiles in debian/rules. This means e2fsprogs 1.44.0 no longer builds out of the box for Debian 7 (wheezy, aka Debian old-old-stable) and Ubuntu Trusty (14.04 LTS) and older releases. It should be possible to backport e2fsprogs 1.44.0 to 14.04 LTS, but at the very least 14.04 LTS should go to 1.43.9 to fix a huge number of bugs. But for 16.04, e2fsprogs 1.44.0 should just drop right in. And with 14.04 LTS and older, e2fsprogs 1.43.9 should Just Work.

It wouldn't be that hard to make e2fsprogs 1.44.0 work on 14.04 LTS, but it won't be turn-key.

Given your other comment (which I think may have been posted to the wrong thread):-
[E2fsprogs 1.44.0 now depends on dpkg build-profiles, which means that getting it backported to 14.04 LTS would require adjusting debian/control and debian/rules a bit. For 14.04 LTS, I'd urge consideration of going to e2fsprogs 1.43.9. This will get you most of the latest bug fixes, including some that could cause massive file system corruption and data loss (relative to e2fsprogs 1.42.x) in the right (wrong) circumstances.]

Are you saying that 1.43.9/1.44.0 ought to go not only to trusty-backports and xenial-backports, but also into 'updates' to be 'pushed to all users', which needs some fiddly SRU process?
I note Debian haven't pushed such an update back to jessie/wheezy either.

I (suspect) Canonical will require significant 'evidence'/'bug-reports' for backports to become 'updates' in this circumstance... HOPEFULLY 1.44.0 into bionic will be easier.

Hope that helps,

FWIW, as was discussed, I checked into Grub2 and os-prober in older supported Ubuntu releases (which might have had an incompatibility similar to e2fsprogs, thereby creating a multibooting issue with 18.04).
After experimenting carefully with a 14.04+16.04+18.04 (GA kernels, no HWE specifically) BIOS-style triple boot, I can confirm that the grub ext4 support is all cross-compatible (14.04 can autodetect and boot a 64bit,metadata_csum 18.04 from its own grub2).

However, e2fsprogs is DEFINITELY an issue as above, definitely worth sorting-out whatever-happens to the 18.04 default filesystem options, in my view.

Theodore Ts'o (tytso) wrote :

The Metadata_csum feature is a RO_COMPAT feature. Hence, grub2 and os-prober shouldn't care about this feature (if properly implemented). This is because the guarantee is that even if the kernel (or a userspace application directly accessing the file system) doesn't know about a bit in the RO_COMPAT bitmask, it is safe to access the file system in read-only mode.

Now, the 64-bit feature is an INCOMPAT feature. This means that if the kernel (or a user-space program directly accessing the file system) doesn't understand a bit in the INCOMPAT feature set, it is *not* safe for it to try to understand the file system. This feature was only enabled in e2fsprogs 1.43.x if the file system was larger than 16TB. However, as of e2fsprogs 1.44.0, it is turned on by default even for file systems smaller than 16TB. That's because it was expected that by now, we had given grub2 and os-prober enough time to get with the program --- and if you create a file system without the 64-bit feature, it is not possible to online resize it beyond 16TB. So this is why we turn on the 64-bit feature out of the box with e2fsprogs 1.44.0 (there are failure modes with leaving it turned off for the sake of backwards compatibility with antique software versions). However, if an enterprise distribution decides that backwards compatibility is more important than new features, it can ship e2fsprogs with an edited version of misc/mke2fs.conf.in. Feel free to turn off 64-bit if you think breaking the 15TB->16TB online resize is an acceptable price for avoiding the 16.04 vs 18.04 multi-booting issue. That's why the customers pay the enterprise distro providers the big bucks. :-)

Or get really frustrated with the enterprise distro providers. Probably both. :-)
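
The RO_COMPAT / INCOMPAT rule described in this comment can be sketched as a small decision function. This is a simplified illustration, not the kernel's or e2fsprogs' actual code: the two bit values are the ones the ext4 on-disk format defines for metadata_csum and 64bit, while `access_mode` and its argument order are hypothetical, and a real filesystem carries many more feature bits than the two used here.

```sh
#!/bin/sh
# Feature bits from the ext4 on-disk format.
RO_COMPAT_METADATA_CSUM=$((0x0400))
INCOMPAT_64BIT=$((0x0080))

# access_mode FS_RO_COMPAT FS_INCOMPAT KNOWN_RO_COMPAT KNOWN_INCOMPAT
# Prints "refuse", "read-only" or "read-write" depending on which
# feature bits on the filesystem the tool does not recognise.
access_mode() {
    fs_ro=$1 fs_incompat=$2 known_ro=$3 known_incompat=$4
    if [ $(( fs_incompat & ~known_incompat )) -ne 0 ]; then
        # Unknown INCOMPAT bit: not safe to touch the filesystem at all.
        echo "refuse"
    elif [ $(( fs_ro & ~known_ro )) -ne 0 ]; then
        # Unknown RO_COMPAT bit: reading is safe, writing is not.
        echo "read-only"
    else
        echo "read-write"
    fi
}

# A tool that knows neither feature, looking at a metadata_csum-only fs:
access_mode "$RO_COMPAT_METADATA_CSUM" 0 0 0                  # prints: read-only
# The same tool, looking at a 64bit + metadata_csum fs:
access_mode "$RO_COMPAT_METADATA_CSUM" "$INCOMPAT_64BIT" 0 0  # prints: refuse
```

Under this rule a boot loader that only reads the filesystem can cope with an unknown RO_COMPAT bit, whereas a repair tool such as e2fsck needs write access, which fits the "unsupported feature(s): metadata_csum" refusal reported earlier in the thread.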

See linked bug 1601997; the response seems to be to accept your 'new' defaults for ext4 in 18.04. I note in particular that your request about 1.44.0 inclusion doesn't yet seem to be addressed [maybe it requires a separate bug/issue] ;-(. Do expand on that point if you can.

e2fsprogs 1.44.0 for bionic (18.04) has apparently been agreed:
https://bugs.launchpad.net/ubuntu/+source/e2fsprogs/+bug/1756177

[Launchpad build logs suggest it's been built for PROPOSED, but I have not yet seen it 'come through' as a package or appear on 'packages.ubuntu.com'...]

Hopefully that will go through. Is the first-step thereafter, to then "xenial-backports" 1.44.0 and "trusty-backports" 1.43.9 ? -- I believe devs spoke of accepting an SRU (full stable-release-update) for xenial e2fsprogs, but I suspect backporting in first instance may be a good approach [?].

OK, on further investigation, I have confirmed a lot of key facts (1 of 2 linked comments).

e2fsprogs 1.44.0-1 backports to xenial with no difficulty whatsoever, passes "make fullcheck" and works in every way I can tell, lots of resizing and checking and use within gparted, etc, all (apparently) behaving...

HOWEVER, for an official xenial-backport (and especially a xenial SRU), to minimize possible problems, I would highly recommend making a single change: restoring the 'default' mke2fs.conf EXACTLY (byte-for-byte) as it came with xenial e2fsprogs 1.42.13-1ubuntu1:
https://www.iremonger.me.uk/noidx/e2fs/mke2fs.conf.xenial
This (a) avoids prompting users who've customized their mke2fs.conf about merging-changes, and (b) avoids functional-change for those with automated-deployment-scripts etc. based upon ext4 creation.

I presume, a 'xenial-backport' -or- SRU "proposed" update can be started straight-away? xenial-SRU should DEFINITELY be considered for fsck compatibility with bionic-created FS.

HOWEVER, the Trusty e2fsprogs backporting situation seems to be that some change between 1.43.3-1~bpo8+1 and 1.43.4-2 introduced the trusty incompatibility; 1.43.9-2 does NOT build on trusty:
https://www.iremonger.me.uk/noidx/e2fs/e2fsprogs-1.43.9_trusty_build-fail.log

A simple patch to the 'debian/control' file adding the explicit '-dbg' package entries on the end *APPEARS* to solve all the problems, and allows 1.43.9 [or 1.44.0 for that matter!] to SEEMINGLY build fine, pass all tests, and work fine on trusty without any issues I can find so-far!:-
https://www.iremonger.me.uk/noidx/e2fs/e2fsprogs-1.43.9_trusty_debian-control.patch
https://www.iremonger.me.uk/noidx/e2fs/e2fsprogs-1.44.0_trusty_debian-control.patch

This all SEEMS to work fine, but I'd like tytso to comment on this: is this really a safe workaround or just 'fixing the symptom'? From what I can see, all the right programs are generated and work fine.

AGAIN, I'd highly recommend installing byte-for-byte the 'original' mke2fs.conf in any trusty-backport version of e2fsprogs, so as to avoid any unwanted behavioural changes or configuration-file-update prompts:
https://www.iremonger.me.uk/noidx/e2fs/mke2fs.conf.trusty

From what I can SEE, if doing a significant backport to trusty, I can't see why not to just go straight to 1.44.0 in this case [again, hopefully tytso can comment!].

Theodore Ts'o (tytso) wrote :

If you are trying to build from the unpacked debian sources for 1.43.x, you'll need to run the commands:

./debian/rules mrproper
./debian/rules

... to have the build system figure out which antique version of Debian build infrastructure you are using, and regenerate debian/control from debian/control.in. (Usually I just check

That may be what you are running into. Note that starting in 1.44.0, I've dropped all of the backwards compatibility stuff, on the theory that people who want to support ancient Linux distribution systems are paid the big bucks, and it wasn't worth my volunteer time to make it all work and do all of the testing on ancient systems. (And especially since most enterprise distro folks weren't taking the latest bug fixes anyway, and it's been over ten years since I've had to care about enterprise distro customers.)

The short version is that the backwards compatibility stuff was all about making things like the debug packages work. It's all packaging gunk, and so as long as you create packages that pass lintian checks, it's probably fine.

As a reminder, packaging the latest version of e2fsprogs for old back-level enterprise distros may lead the distro release folks to want to constrain the default ext4 file system features that are enabled, since the older versions of grub, the Linux kernel, et al. might not support metadata checksumming (for example). So I could easily see that Ubuntu might want to adjust the misc/mke2fs.conf.in file to keep certain file system features from being enabled by default in freshly created file systems. I also have a vague memory that Ubuntu had a slightly different convention for supporting debug symbol packages on older Ubuntu systems. If so, that may require more adjustments --- or maybe you'll just decide to disable the debug symbol packages and call it a day.

Marc Peña (pachulo) wrote :

I would like to test this configuration on my laptop with a slow rotational disk + a small SSD: https://raid6.com.au/posts/fs_ext4_external_journal_caveats/

But, if I understood correctly, I need to configure journal_async_commit for it to be reliable, and that option automatically enables journal_checksum, which is not available in the current e2fsprogs version that Ubuntu 16.04 has... am I missing something?

I also think that what Simon says about the SRU is quite sensible, but if getting an SRU is too hard, would it make sense to open a ticket requesting a backport from bionic to xenial?

Marc, briefly, just to let you know: I'm working on a PPA for a bionic e2fsprogs backport to xenial, and will update when that's ready. It turns out that:
(a) Ubuntu devs are rather tied up fixing bionic (18.04) bugs, and
(b) to do a good SRU would need much regression-testing and somebody to push it forwards.

Marc Peña (pachulo) wrote :

That's great Simon! I will test it once it's ready then.

Regarding (b): wouldn't it be easier to try to get the package in xenial-backports then?
That repo is activated by default, but as I read here: https://help.ubuntu.com/community/UbuntuBackports

On releases before (but not including) Ubuntu 11.04 (Natty Narwhal), apt defaulted to always installing packages from Backports. On later releases, apt only installs packages from Backports when they are explicitly requested.

So the risk would be much lower.

Here is the PPA for all architectures, please test :-
https://launchpad.net/~ubuntu-iremonger/+archive/ubuntu/e2fsprogs-xenial

That is currently a backport of the version in bionic release itself, but maintains the xenial mke2fs.conf defaults [creating filesystems without 64bit,metadata_csum] for compatibility.
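
For anyone who wants the same compatibility on a system whose mke2fs.conf already enables the newer features, the features can also be switched off per filesystem with the `-O` option of mke2fs, where a leading `^` disables a feature. The sketch below only builds and prints the command rather than running it; `disable_features` is a hypothetical helper and `/dev/sdXN` is a placeholder, not a real device.

```sh
#!/bin/sh
# Build an mke2fs -O argument that disables each named feature.
# In the -O list, a feature prefixed with "^" is turned off.
disable_features() {
    out=""
    for f in "$@"; do
        # Join entries with commas, adding the "^" prefix to each.
        out="${out:+$out,}^$f"
    done
    printf '%s\n' "$out"
}

opts=$(disable_features 64bit metadata_csum)

# Print the command instead of executing it: formatting destroys data,
# and /dev/sdXN is only a placeholder.
echo "mke2fs -t ext4 -O $opts /dev/sdXN"
# prints: mke2fs -t ext4 -O ^64bit,^metadata_csum /dev/sdXN
```

A filesystem created this way should remain checkable by the older e2fsck versions discussed in this bug, at the cost of the 16TB online-resize limit tytso describes for filesystems without 64bit.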

I notice e2fsprogs in cosmic [1.44.2] introduces an anti-crash fix [for filesystems designed to crash e2fsck!]. Exactly which versions should then be considered for xenial-backports and xenial-updates, and whether any updates to bionic should also be considered, is another matter!
Debian already has 1.44.2 backported to stretch (their current LTS release), for example.

"-updates" versions might not, for example, want to update the comerr development headers and other potential compatibility regression areas. Who knows?

Theodore Ts'o (tytso) wrote :

The blog post here: https://raid6.com.au/posts/fs_ext4_external_journal_caveats/

is not generally true. What journal_async_commit does is convert the sequence:

1. write journal blocks
2. cache flush
3. write journal commit block
4. cache flush

... to:

1. write journal blocks
2. write journal commit block
3. cache flush

This tends to make a lot of difference on HDDs from a performance perspective, because cache flush commands are so expensive. On an SSD with a competently implemented flash translation layer (FTL), it shouldn't make much of a difference from a performance perspective, and it should make hardly any difference from a write endurance perspective.

The way flash works is that flash chips are organized into erase blocks, which might be, say, 64k. This is the minimum size that must be erased as a unit. Once an erase block is cleared (which is the slow operation), it can be written a flash page (typically 4k in size) at a time. Once a flash page is written, it can't be erased except by erasing the entire erase block. If most of the erase blocks are filled, either with real data or with garbage (former data contents which have been superseded), then it might be necessary to copy the still-used data contents to other erase blocks, so that an erase block can be emptied and then erased. If it is necessary to do those extra copies before the erase block can be erased, this is the cause of the "write amplification" effect.

However, doing an extra CACHE FLUSH operation (which is what journal_async_commit eliminates) should not make any difference on any competently implemented FTL on a normal SSD.

The place where it makes a difference is on what gets referred to as "cost optimized" flash in polite company, or "crap flash" by more honest engineers. You will most often find this in eMMC flash or SD cards found in the cash register aisle of Micro Center (assuming of course, that you actually get honestly labelled flash as opposed to an SD card claiming to have 1G of flash, but which is only backed by 16 MB of flash --- such that the 16MB + 4k write will end up overwriting previously written data). In this "cost optimized" flash, the FTL may end up mapping each 64k erase page to a 64k LBA address space. In that case, a 4k write followed by a cache flush will end up being the equivalent of a 64k flash erase/write. In the even more awful "crappy flash", each 64k erase block is direct mapped to a 64k LBA address space. In that case, if you are constantly overwriting any portion of the flash (either the FAT table for FAT file systems, or the journal in ext4), then those erase blocks will get worn out first --- and once they are worn out, the flash device becomes broken.

But I emphasize that this is really only a problem for crap flash. For a normal SSD with a competent FTL, the use of journal_async_commit (or not using journal_async_commit) should not make any real difference to how long your flash device lasts.
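
The "4k write costs a 64k erase/write" case described in this comment is simple arithmetic, worked here as a rough sketch. It models only the worst case tytso describes for direct-mapped flash, where every flushed write forces whole erase blocks to be rewritten; the function name `amplification` is made up, and the result is truncated to an integer.

```sh
#!/bin/sh
# amplification ERASE_BLOCK_KIB WRITE_KIB
# KiB physically rewritten per KiB the host asked to write, assuming the
# device must erase and rewrite whole erase blocks for each flushed write.
amplification() {
    erase_kib=$1 write_kib=$2
    # Round the write up to a whole number of erase blocks.
    blocks=$(( (write_kib + erase_kib - 1) / erase_kib ))
    # Integer division: the ratio is truncated, not rounded.
    echo $(( blocks * erase_kib / write_kib ))
}

amplification 64 4    # prints: 16  (a 4k journal write rewrites 64k)
amplification 64 64   # prints: 1   (a full erase block has no overhead)
```

With the 64k erase block and 4k page sizes used as examples above, each small flushed write costs sixteen times its own size on such a device, which is why a frequently rewritten region like the journal wears out first.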

Marc Peña (pachulo) wrote :

I've been using e2fsprogs-xenial from Simon's PPA for a month without issues so far.

I'm using an external drive for the journal and these options in my fstab for the volume:

errors=remount-ro,data=journal,journal_async_commit,nobarrier,commit=60,noatime 0

Thanks for this!
