e2fsck crashes with Signal (11) SIGSEGV when using undo-file

Bug #1962789 reported by Felix E
This bug affects 1 person
Affects: e2fsprogs (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

The behavior is reproducible on 18.04 and 20.04 LTS with the following versions of e2fsck:
* e2fsck 1.44.1 (24-Mar-2018)
* e2fsck 1.45.5 (07-Jan-2020)

e2fsck behaves as expected in these cases:
* A dry run works fine: it finds quite a few errors and then completes normally.
* A "normal" run (without an undo file) also works, but I aborted it at the first error it found because I want to use an undo file.

The crash occurs reproducibly whenever I specify an undo file, on both Ubuntu 18.04 and 20.04.
The stack trace (see below) is identical every time I repeat the run.

Because someone suggested that a lack of RAM might be the cause, I also watched the process in top: e2fsck runs at 100% CPU and ~1% of RAM for a few seconds, then terminates with Signal (11).
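
To put numbers on that observation instead of eyeballing top, something like the following could be used. It is only a sketch and not part of the original report: the helper name run_and_report is made up for illustration, and it assumes Linux, where ru_maxrss is reported in kilobytes.

```python
import resource
import signal
import subprocess


def run_and_report(cmd):
    """Run cmd, then report how it ended and the peak RSS of child processes.

    run_and_report is a hypothetical helper name, not part of e2fsprogs.
    """
    proc = subprocess.run(cmd)

    # A negative return code means the child was killed by that signal number.
    sig_name = None
    if proc.returncode < 0:
        sig_name = signal.Signals(-proc.returncode).name

    # On Linux, ru_maxrss is in kilobytes. RUSAGE_CHILDREN covers every child
    # waited for so far, so the value is the largest peak among them.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "returncode": proc.returncode,
        "signal": sig_name,
        "max_rss_kib": usage.ru_maxrss,
    }
```

Called with the same argument list as the e2fsck command in this report, a crash would show up as signal == "SIGSEGV", and max_rss_kib would indicate whether memory use ever came close to the installed RAM.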

Stacktrace with Ubuntu 18.04:

# e2fsck -v -z /root/e2fsck.undo_2022-03-01_15.00 /dev/mapper/md125_crypt
e2fsck 1.44.1 (24-Mar-2018)
Overwriting existing filesystem; this can be undone using the command:
    e2undo /root/e2fsck.undo_2022-03-01_15.00 /dev/mapper/md125_crypt

ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
e2fsck: Group descriptors look bad... trying backup blocks...
Signal (11) SIGSEGV si_code=SEGV_MAPERR fault addr=0x178
e2fsck(+0x30d29)[0x55a7f46c4d29]
/lib/x86_64-linux-gnu/libc.so.6(+0x3f040)[0x7f091f069040]
/lib/x86_64-linux-gnu/libext2fs.so.2(ext2fs_close2+0x134)[0x7f091fc93534]
/lib/x86_64-linux-gnu/libext2fs.so.2(ext2fs_close_free+0x16)[0x7f091fc93596]
e2fsck(main+0x65d)[0x55a7f46a310d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f091f04bbf7]
e2fsck(_start+0x2a)[0x55a7f46a5afa]

Stacktrace with Ubuntu 20.04 (I did a dist-upgrade from 18.04 to 20.04 in the hope that this issue might have been fixed there):

# fsck.ext4 -v -z /root/fsck.ext4.undo_2022-03-01_17.53 /dev/mapper/md125_crypt
e2fsck 1.45.5 (07-Jan-2020)
Overwriting existing filesystem; this can be undone using the command:
    e2undo /root/fsck.ext4.undo_2022-03-01_17.53 /dev/mapper/md125_crypt

ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
fsck.ext4: Group descriptors look bad... trying backup blocks...
Signal (11) SIGSEGV si_code=SEGV_MAPERR fault addr=0x178
fsck.ext4(+0x34c21)[0x55d447822c21]
/lib/x86_64-linux-gnu/libc.so.6(+0x46210)[0x7f81371d6210]
/lib/x86_64-linux-gnu/libext2fs.so.2(ext2fs_close2+0x15c)[0x7f813740c6dc]
/lib/x86_64-linux-gnu/libext2fs.so.2(ext2fs_close_free+0x1a)[0x7f813740c71a]
fsck.ext4(main+0x12a2)[0x55d4477ffe22]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f81371b70b3]
fsck.ext4(_start+0x2e)[0x55d4478020be]

Background-Info:
On 18.04:
A filesystem of ~66 TB was to be grown (resize2fs) to ~77 TB.
Unfortunately, I accidentally shut down the box while the resize was still running (don't ask...).
After booting up again, a run of e2fsck found and fixed some errors (a few block bitmap differences and one occurrence each of free blocks count and free inodes count errors).
A second run of e2fsck came back clean.
The files inside the filesystem were fine (I had an sfv file for a major part of the data inside).
Then I started resize2fs again, as the FS had only grown to ~70 TB so far.
This time I got a myriad of errors like
> Illegal block number passed to ext2fs_mark_block_bitmap #17446378287038201972 for copy of block bitmap for /dev
and possibly other errors as well which had scrolled outside of my screen.
Then it finally completed with
> The filesystem on /dev/mapper/md125_crypt is now 20507814912 (4k) blocks long.

Now I'm trying to fix things but as I get errors like
> Inode table for group 585696 is not in group. (block 17480552796351627264)
> WARNING: SEVERE DATA LOSS POSSIBLE.
> Relocate<y>?
I really would prefer to proceed with an undo-file now.
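
A quick sanity check on the numbers above, as a sketch that does nothing but arithmetic on values quoted in this report (the helper name is_valid_block is made up for illustration): the final block count works out to roughly 76.4 TiB, which fits the ~77 TB target, while the block numbers in the error messages lie far beyond the end of the filesystem, so they look like garbage values rather than slightly-off ones.

```python
BLOCK_SIZE = 4096            # "4k" blocks, as printed by resize2fs
TOTAL_BLOCKS = 20507814912   # final size reported by resize2fs

# Block numbers quoted in the error messages in this report.
reported_blocks = {
    "ext2fs_mark_block_bitmap": 17446378287038201972,
    "inode table, group 585696": 17480552796351627264,
}

size_bytes = TOTAL_BLOCKS * BLOCK_SIZE
size_tib = size_bytes / 2**40


def is_valid_block(block):
    """A block number is only plausible if it lies inside the filesystem."""
    return 0 <= block < TOTAL_BLOCKS


print(f"filesystem size: {size_bytes} bytes (~{size_tib:.1f} TiB)")
for source, block in reported_blocks.items():
    ratio = block / TOTAL_BLOCKS
    print(f"{source}: block {block} valid={is_valid_block(block)}, "
          f"about {ratio:.1e} times the block count")
```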

Felix E (fehlerman) wrote :

A brief update:

I can also reproduce the crash on Ubuntu 21.10 / e2fsck 1.46.3:

# e2fsck -v -z e2fsck.undofile.2022-03-20 /dev/mapper/md125_crypt
e2fsck 1.46.3 (27-Jul-2021)
Overwriting existing filesystem; this can be undone using the command:
    e2undo e2fsck.undofile.2022-03-20 /dev/mapper/md125_crypt

ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
e2fsck: Group descriptors look bad... trying backup blocks...
Signal (11) SIGSEGV si_code=SEGV_MAPERR fault addr=0x178
e2fsck(+0x31e3e)[0x55deba41ce3e]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fc444c0a520]
/lib/x86_64-linux-gnu/libext2fs.so.2(ext2fs_close2+0x134)[0x7fc444e59b44]
/lib/x86_64-linux-gnu/libext2fs.so.2(ext2fs_close_free+0x1a)[0x7fc444e59bba]
e2fsck(main+0xebd)[0x55deba3fcb6d]
/lib/x86_64-linux-gnu/libc.so.6(+0x29fd0)[0x7fc444bf1fd0]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x7d)[0x7fc444bf207d]
e2fsck(_start+0x25)[0x55deba3feff5]

If I run "e2fsck -v /dev/mapper/md125_crypt" (without -z), e2fsck 1.46.3 does not crash.
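
To compare the three traces without reading hex by eye, the frames can be reduced to their symbol names. The following is a minimal sketch: parse_frames and symbol_chain are made-up names, and the regular expression only assumes the module(symbol+offset)[address] frame layout seen in the traces in this report. For all three e2fsck versions the named frames come out as ext2fs_close2, ext2fs_close_free, main, __libc_start_main, _start, which suggests the crash is always hit inside ext2fs_close2 on a call path starting in main.

```python
import re

# One frame as printed in the traces, e.g.
#   /lib/x86_64-linux-gnu/libext2fs.so.2(ext2fs_close2+0x134)[0x7fc444e59b44]
FRAME_RE = re.compile(
    r"^(?P<module>[^\s()]+)"
    r"\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)

# Frames copied from the e2fsck 1.46.3 trace in this report.
TRACE_1_46_3 = """\
e2fsck(+0x31e3e)[0x55deba41ce3e]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fc444c0a520]
/lib/x86_64-linux-gnu/libext2fs.so.2(ext2fs_close2+0x134)[0x7fc444e59b44]
/lib/x86_64-linux-gnu/libext2fs.so.2(ext2fs_close_free+0x1a)[0x7fc444e59bba]
e2fsck(main+0xebd)[0x55deba3fcb6d]
/lib/x86_64-linux-gnu/libc.so.6(+0x29fd0)[0x7fc444bf1fd0]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x7d)[0x7fc444bf207d]
e2fsck(_start+0x25)[0x55deba3feff5]
"""


def parse_frames(trace_text):
    """Return (module file name, symbol or None, offset) for each frame line."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            module = match["module"].rsplit("/", 1)[-1]
            symbol = match["symbol"] or None  # frames without a symbol name
            frames.append((module, symbol, int(match["offset"], 16)))
    return frames


def symbol_chain(trace_text):
    """Return only the named symbols, innermost frame first."""
    return [symbol for _, symbol, _ in parse_frames(trace_text) if symbol]


print(symbol_chain(TRACE_1_46_3))
```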
