Probable regression with EXT3 file systems and CVE-2018-1093 patches

Bug #1789131 reported by Sarah Newman
This bug affects 6 people
Affects          Status         Importance   Assigned to        Milestone
linux (Ubuntu)   Fix Released   Critical     Joseph Salisbury
Trusty           Fix Released   Critical     Joseph Salisbury

Bug Description

== SRU Justification ==
Mainline commit 7dac4a1726a9 introduced a regression in v4.17-rc1, which
made its way into Trusty via upstream stable updates. This regression
is resolved by mainline commit 22be37acce25. This commit has been cc'd
to upstream stable, but has not made its way into Trusty as of yet.

== Fix ==
22be37acce25 ("ext4: fix bitmap position validation")

== Regression Potential ==
Low. This commit has been cc'd to upstream stable, so it has had
additional upstream review.

== Test Case ==
A test kernel was built with this patch and tested by the original bug reporter.
The bug reporter states the test kernel resolved the bug.

A customer reported that all of their ext3 file systems, and none of their ext4 file systems, were in read-only mode, I believe after rebooting from 3.13.0-156.206 into 3.13.0-157.207. Here is the output of tune2fs -l for one of the affected file systems:

tune2fs 1.42.12 (29-Aug-2014)
Last mounted on: /
Filesystem UUID: 748f503a-443d-4769-8dd2-45ff46b48555
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 1966080
Block count: 7863296
Reserved block count: 393164
Free blocks: 4568472
Free inodes: 1440187
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1022
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
RAID stride: 128
RAID stripe width: 512
Filesystem created: Thu Feb 25 21:54:24 2016
Last mount time: Fri Aug 24 07:40:51 2018
Last write time: Fri Aug 24 07:40:51 2018
Mount count: 1
Maximum mount count: 25
Last checked: Fri Aug 24 07:38:54 2018
Check interval: 15552000 (6 months)
Next check after: Wed Feb 20 07:38:54 2019
Lifetime writes: 7381 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: d6564a54-cd2a-4804-ad94-1e4e0e47933a
Journal backup: inode blocks
FS Error count: 210
First error time: Fri Aug 24 07:40:51 2018
First error function: ext4_validate_block_bitmap
First error line #: 376
First error inode #: 0
First error block #: 0
Last error time: Sun Aug 26 19:35:16 2018
Last error function: ext4_remount
Last error line #: 4833
Last error inode #: 0
Last error block #: 0
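
Only a handful of the fields above matter for this report. As a rough sketch, a small filter like the following can pull them out of tune2fs -l output. The function name relevant_fields and the device name in the usage comment are illustrative assumptions, not part of the original report.

```shell
# Filter `tune2fs -l` output (read from stdin) down to the fields that
# appear relevant to this bug. Field names are taken from the output above.
relevant_fields() {
    grep -E '^(Filesystem features|Filesystem state|Block size|Blocks per group|RAID stride|RAID stripe width|FS Error count|First error function):'
}

# Example (the device name is an assumption; substitute your own):
#   tune2fs -l /dev/xvda1 | relevant_fields
```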

Sarah Newman (srn-f) wrote :

I noticed the unusual thing here is:

RAID stride: 128
RAID stripe width: 512

I'm going to do some testing around this to see if it's related. It could be that this is the issue, rather than anything specific to EXT3 vs. EXT4.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1789131

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Sarah Newman (srn-f) wrote :

I confirmed that adding "-E stride=128,stripe_width=512" to the call to mkfs results in the following errors almost immediately at boot with 3.13.0-157.207:

[ 5.680480] EXT4-fs error (device xvda1): ext4_validate_block_bitmap:376: comm mountall: bg 213: block 7007360: invalid block bitmap
[ 5.681182] Aborting journal on device xvda1-8.
[ 5.681827] EXT4-fs (xvda1): Remounting filesystem read-only
[ 5.681850] EXT4-fs error (device xvda1) in ext4_free_blocks:4876: IO failure
[ 5.682538] EXT4-fs error (device xvda1) in ext4_reserve_inode_write:4967: Journal has aborted
[ 5.683254] EXT4-fs error (device xvda1) in ext4_orphan_del:2682: Journal has aborted
[ 5.683883] EXT4-fs error (device xvda1) in ext4_reserve_inode_write:4967: Journal has aborted

I also confirmed that this does not happen without stride=128,stripe_width=512, and that it does not happen with 3.13.0-156.206.
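
To relate the block number in the first error to the file system layout, the arithmetic can be sketched as follows. It uses the "Blocks per group: 32768" and "First block: 0" values from the tune2fs output in the description; the function name locate_block is illustrative.

```shell
# Locate a block within the block-group layout.
# Usage: locate_block BLOCK BLOCKS_PER_GROUP [FIRST_BLOCK]
locate_block() {
    block=$1
    per_group=$2
    first=${3:-0}
    group=$(( (block - first) / per_group ))
    offset=$(( (block - first) % per_group ))
    echo "group=$group offset=$offset"
}

locate_block 7007360 32768   # prints: group=213 offset=27776
```

The computed group matches the "bg 213" in the log. The offset within the group, 27776, equals 217 x 128, a multiple of the configured stride, and it is larger than the 4096-byte block size. That is consistent with the stride setting influencing where the bitmap block is placed, although nothing at this point in the thread confirms the mechanism.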

It does *not* happen with ext4 when "-E stride=128,stripe_width=512" is set. This file system mounted without errors:

Last mounted on: /
Filesystem UUID: 12df937e-795c-4d98-a90b-dca002109a34
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 1966080
Block count: 7863296
Reserved block count: 393164
Free blocks: 7358461
Free inodes: 1885947
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1022
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
RAID stride: 128
RAID stripe width: 512
Flex block group size: 16
Filesystem created: Sun Aug 26 21:06:58 2018
Last mount time: Sun Aug 26 21:14:39 2018
Last write time: Sun Aug 26 21:14:39 2018
Mount count: 3
Maximum mount count: 20
Last checked: Sun Aug 26 21:06:58 2018
Check interval: 15552000 (6 months)
Next check after: Fri Feb 22 21:06:58 2019
Lifetime writes: 2504 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 8d081584-5ec6-4446-a213-97f2012d8755
Journal backup: inode blocks
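
The two "Filesystem features" lines (the ext3 one from the description and the ext4 one above) can be compared mechanically. A minimal sketch, with an illustrative function name:

```shell
# Print the features present in the second list but not in the first,
# one per line, in the order they appear in the second list.
features_only_in_second() {
    first=" $1 "
    for f in $2; do
        case "$first" in
            *" $f "*) ;;                 # present in both lists
            *) printf '%s\n' "$f" ;;     # only in the second list
        esac
    done
}

ext3_features="has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file"
ext4_features="has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize"

features_only_in_second "$ext3_features" "$ext4_features"
```

This lists extent, flex_bg, huge_file, uninit_bg, dir_nlink and extra_isize. Which of these differences, if any, explains why the ext4 file system is unaffected is not established by this comparison alone.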

Sarah Newman (srn-f) wrote :

I believe a command sequence like this can be used to reliably reproduce the issue:

umount /mnt
truncate -s128m /tmp/test.img
cmd="/sbin/mkfs.ext3 -E stride=128,stripe_width=512 -F /tmp/test.img"
echo $cmd > /dev/kmsg
$cmd
mount -o loop /tmp/test.img /mnt/
while dd if=/dev/zero of=/mnt/$RANDOM bs=1M count=1; do true; done

I do not see any failures when ext4 is substituted for ext3 above.
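
To decide whether a run of the sequence above actually hit the bug, the kernel log can be checked for the error signature quoted earlier in this report. A small sketch; the function name has_bitmap_error is illustrative:

```shell
# Succeeds (exit status 0) if the kernel log text on stdin contains the
# "invalid block bitmap" error raised from ext4_validate_block_bitmap.
has_bitmap_error() {
    grep -q 'ext4_validate_block_bitmap.*invalid block bitmap'
}

# Example, after running the reproduction loop:
#   dmesg | has_bitmap_error && echo "regression reproduced"
```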

Sarah Newman (srn-f) wrote :

As for running

apport-collect 1789131

This doesn't appear to work well on a headless system with a minimal install. I believe I've given all the necessary details for reproducing the issue.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Sarah Newman (srn-f) wrote :

This problem doesn't show up in 4.4.0-134-generic.

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the proposed kernel and post back whether it resolves this bug?
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.

Thank you in advance!

Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu Trusty):
importance: Undecided → High
status: New → Incomplete
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-key trusty
Joseph Salisbury (jsalisbury) wrote :

The specific version to test is 3.13.0-158.

Sarah Newman (srn-f) wrote : Re: [Bug 1789131] Re: Probable regression with EXT3 file systems and CVE-2018-1093 patches

On 08/27/2018 11:16 AM, Joseph Salisbury wrote:
> The specific version to test is 3.13.0-158.
>

No, that doesn't fix it.

root@scratch2:~# uname -a
Linux scratch2 3.13.0-158-generic #208-Ubuntu SMP Fri Aug 24 17:07:38 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
root@scratch2:~# dmesg | grep mount
[ 2.353820] EXT4-fs (xvda1): mounting ext3 file system using the ext4 subsystem
[ 2.386459] EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts: barrier=0
[ 5.922761] EXT4-fs (xvda1): re-mounted. Opts: errors=remount-ro,barrier=0
[ 6.049433] EXT4-fs error (device xvda1): ext4_validate_block_bitmap:376: comm mountall: bg 72: block 2369024: invalid block bitmap
[ 6.051556] EXT4-fs (xvda1): Remounting filesystem read-only

Joseph Salisbury (jsalisbury) wrote :

Does 3.13.0-157.207 exhibit the bug and 3.13.0-156.206 does not? If that is the case, we can perform a bisect to identify the offending commit.

Changed in linux (Ubuntu Trusty):
status: Incomplete → Triaged
Changed in linux (Ubuntu):
status: Incomplete → Triaged
Sarah Newman (srn-f) wrote :

On 08/27/2018 01:21 PM, Joseph Salisbury wrote:
> Does 3.13.0-157.207 exhibit the bug and 3.13.0-156.206 does not? If
> that is the case, we can perform a bisect to identify the offending
> commit.

That is correct.

Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Trusty):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu):
status: Triaged → In Progress
Changed in linux (Ubuntu Trusty):
status: Triaged → In Progress
Changed in linux (Ubuntu):
importance: High → Critical
Changed in linux (Ubuntu Trusty):
importance: High → Critical
Joseph Salisbury (jsalisbury) wrote :

This is probably the fix:
22be37acce25 ext4: fix bitmap position validation

I built a test kernel with commit 22be37acce25. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1789131

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is prior to 4.15 (Bionic), you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15 (Bionic) or newer, you need to install the linux-modules, linux-modules-extra and linux-image-unsigned .deb packages.
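
The rule in the two bullets above can be sketched as a small helper that picks the package families from the kernel version. The function name test_kernel_packages is illustrative; the package names are the ones listed in the note.

```shell
# Print the .deb package families to install for a test kernel of the
# given version (e.g. "3.13" or "4.15.0"), following the note above.
test_kernel_packages() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 15 ]; }; then
        echo "linux-modules linux-modules-extra linux-image-unsigned"
    else
        echo "linux-image linux-image-extra"
    fi
}

test_kernel_packages 3.13   # prints: linux-image linux-image-extra
```

The downloaded .deb files for those families would then typically be installed with dpkg -i.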

Thanks in advance!

Sarah Newman (srn-f) wrote :

On 08/28/2018 09:49 AM, Joseph Salisbury wrote:
> This is probably the fix:
> 22be37acce25 ext4: fix bitmap position validation
>
> I built a test kernel with commit 22be37acce25. The test kernel can be downloaded from:
> http://kernel.ubuntu.com/~jsalisbury/lp1789131
>
> Can you test this kernel and see if it resolves this bug?

Yes, it does.

Out of curiosity, how do you think this was missed originally? It's in the 3.16 LTS tree.

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing, Sarah. I'll submit an SRU request to have this commit included in Trusty.

This regression was introduced by the following mainline commit:
7dac4a1726a9 ("ext4: add validity checks for bitmap block numbers")

This commit was added to mainline in version v4.17-rc1. However, the regression it introduced was not fixed until v4.17-rc3. The regression was fixed by commit 22be37acce25.

Commit 7dac4a1726a9 was added to Trusty in the 3.13.0-157 kernel, but commit 22be37acce25 has not made its way to Trusty via upstream stable updates as of yet. The SRU request I submit will get the fix into Trusty without having to wait for the updates from upstream stable.
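
For readers who want a feel for why a position check could reject a valid layout, here is a simplified model. It rests on an assumption drawn from a reading of the two upstream commits, not from anything stated in this thread: that the check introduced by 7dac4a1726a9 compared the bitmap block's offset within its group against the block size, and that 22be37acce25 changed the limit to the number of blocks (clusters) per group. The real kernel check also tests a bit in the bitmap, which is omitted here, and the function names are illustrative. Please verify against the kernel source before relying on it.

```shell
# Simplified model of a bitmap position check (ASSUMPTION, see above).
# Usage: bitmap_position_ok OFFSET LIMIT
bitmap_position_ok() {
    [ "$1" -ge 0 ] && [ "$1" -lt "$2" ]
}

# Report the outcome of one check. Usage: check LABEL OFFSET LIMIT
check() {
    if bitmap_position_ok "$2" "$3"; then
        echo "$1: accepted"
    else
        echo "$1: rejected"
    fi
}

# Block from the "bg 213" error reported earlier; 32768 blocks per group,
# first block 0, block size 4096 (all from the tune2fs output).
offset=$(( 7007360 % 32768 ))

check "limit = block size (4096)" "$offset" 4096          # rejected
check "limit = blocks per group (32768)" "$offset" 32768  # accepted
```

Under this model the offset 27776 fails a limit of 4096 but passes a limit of 32768, which would match the observed behaviour before and after the fix.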

gagzou (gagzou) wrote :

Also affected on 16.04 LTS (kernel 4.4.0-134-generic), on all ext4 partitions that have the meta_bg flag:

ext4_check_descriptors: block bitmap for group 0 overlaps block group descriptors

The meta_bg flag is activated automatically when a partition is grown above 60GB (with lvextend and resize2fs).

After going back to kernel 4.4.0-133-generic, all partitions mount correctly.

gagzou (gagzou) wrote :

I opened another bug for 16.04 because the cause there is not the CVE-2018-1093 patch:

1789653 : regression with EXT4 file systems and meta_bg flag

Joseph Salisbury (jsalisbury) wrote :
description: updated
Changed in linux (Ubuntu Trusty):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'. If the problem still exists, change the tag 'verification-needed-trusty' to 'verification-failed-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Msd (msd+launchpad) wrote :

I can confirm that the kernel 3.13.0-159-generic from trusty-proposed solves the problem.

tags: added: verification-done-trusty
removed: verification-needed-trusty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-160.210

---------------
linux (3.13.0-160.210) trusty; urgency=medium

  * CVE-2018-14633
    - iscsi target: Use hex2bin instead of a re-implementation

  * CVE-2018-14634
    - exec: Limit arg stack to at most 75% of _STK_LIM

linux (3.13.0-159.209) trusty; urgency=medium

  * linux: 3.13.0-159.209 -proposed tracker (LP: #1791754)

  * L1TF mitigation not effective in some CPU and RAM combinations
    (LP: #1788563) // CVE-2018-3620 // CVE-2018-3646
    - x86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit
    - x86/speculation/l1tf: Fix off-by-one error when warning that system has too
      much RAM
    - x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+

  * CVE-2018-15594
    - x86/paravirt: Fix spectre-v2 mitigations for paravirt guests

  * i40e NIC not recognized (LP: #1789215)
    - SAUCE: i40e_bpo: Import the i40e driver from Xenial 4.4
    - SAUCE: i40e_bpo: Add a compatibility layer
    - SAUCE: i40e_bpo: Don't probe for NICs supported by the in-tree driver
    - SAUCE: i40e_bpo: Rename the driver to i40e_bpo
    - SAUCE: i40e_bpo: Hook the driver into the kernel tree
    - [Config] Add CONFIG_I40E_BPO=m

  * Probable regression with EXT3 file systems and CVE-2018-1093 patches
    (LP: #1789131)
    - ext4: fix bitmap position validation

  * CVE-2018-3620 // CVE-2018-3646
    - mm: x86 pgtable: drop unneeded preprocessor ifdef
    - x86/asm: Move PUD_PAGE macros to page_types.h
    - x86/asm: Add pud/pmd mask interfaces to handle large PAT bit
    - x86/asm: Fix pud/pmd interfaces to handle large PAT bit
    - x86/mm: Fix regression with huge pages on PAE
    - SAUCE: x86/speculation/l1tf: Protect NUMA hinting PTEs against speculation
    - Revert "UBUNTU: [Config] disable NUMA_BALANCING"

  * CVE-2018-15572
    - x86/retpoline: Fill RSB on context switch for affected CPUs
    - x86/speculation: Protect against userspace-userspace spectreRSB

  * CVE-2018-6555
    - SAUCE: irda: Only insert new objects into the global database via setsockopt

  * CVE-2018-6554
    - SAUCE: irda: Fix memory leak caused by repeated binds of irda socket

  * BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:1119] (LP: #1788817)
    - drm/ast: Fixed system hanged if disable P2A

  * errors when scanning partition table of corrupted AIX disk (LP: #1787281)
    - partitions/aix: fix usage of uninitialized lv_info and lvname structures
    - partitions/aix: append null character to print data from disk

 -- Stefan Bader <email address hidden> Mon, 24 Sep 2018 19:38:31 +0200
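
Based only on the data points in this report (3.13.0-156 unaffected, 3.13.0-157 and 3.13.0-158 affected, 3.13.0-159 from -proposed verified as fixed, 3.13.0-160 released), a Trusty kernel can be classified roughly as follows. The function name and the output labels are illustrative, and treating releases older than 156 as unaffected is an assumption that follows from the regression having been added in 3.13.0-157.

```shell
# Classify a Trusty 3.13.0 kernel release string (as printed by `uname -r`,
# e.g. "3.13.0-158-generic") by its ABI number.
classify_trusty_kernel() {
    case "$1" in
        3.13.0-*) ;;
        *) echo "unknown"; return 0 ;;
    esac
    abi=${1#3.13.0-}
    abi=${abi%%[!0-9]*}   # keep only the leading digits
    if [ -z "$abi" ]; then
        echo "unknown"
    elif [ "$abi" -le 156 ]; then
        echo "not affected"
    elif [ "$abi" -le 158 ]; then
        echo "affected"
    else
        echo "fixed"
    fi
}

# Example:
#   classify_trusty_kernel "$(uname -r)"
```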

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: In Progress → Fix Released
Brad Figg (brad-figg)
tags: added: cscc