resize2fs does not actually start to grow an ext4 filesystem

Bug #1321958 reported by Jörg on 2014-05-21
This bug affects 6 people
Affects                     Status        Importance  Assigned to  Milestone
e2fsprogs (Ubuntu)          Fix Released  High        Unassigned
Trusty                      Confirmed     High        Unassigned   trusty-updates

Bug Description

Ubuntu 14.04 LTS, all proposed updates done
Kernel: 3.13.0-24-generic,
Package: e2fsprogs 1.42.9-3ubuntu1,
System: Haswell i7, 32GB RAM, LSI SAS 9207-8i HBA and LSI SAS 9211-8i HBA

I tried to increase the size of an ext4 filesystem, from an old size of 20TB to a new size of 28TB. I tried an offline resize with "resize2fs -fp /dev/md2" and later an online resize using "resize2fs -f /dev/md2". In both cases, after issuing the command a resize2fs process is created that uses nearly 100% CPU according to top, but it does not perform any actual resize. It only prints its version and date and then does not finish for hours. I had it running for more than a day without it finishing:

root@marvin:~# resize2fs -f /dev/md2
    resize2fs 1.42.9 (4-Feb-2014)

There is never more terminal output than that. It looks to me as if resize2fs hangs in an endless calculation or loop, or something similar.

Some more info about the filesystem:

root@marvin:~# tune2fs -l /dev/md2
    tune2fs 1.42.9 (4-Feb-2014)
    Filesystem volume name: data
    Last mounted on: /media/data01
    Filesystem UUID: e3845e15-0336-47ae-8aec-df75acb217c5
    Filesystem magic number: 0xEF53
    Filesystem revision #: 1 (dynamic)
    Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
    Filesystem flags: signed_directory_hash
    Default mount options: user_xattr acl
    Filesystem state: clean
    Errors behavior: Continue
    Filesystem OS type: Linux
    Inode count: 305225728
    Block count: 4883606400
    Reserved block count: 0
    Free blocks: 22919919
    Free inodes: 302894731
    First block: 0
    Block size: 4096
    Fragment size: 4096
    Group descriptor size: 64
    Reserved GDT blocks: 1024
    Blocks per group: 32768
    Fragments per group: 32768
    Inodes per group: 2048
    Inode blocks per group: 128
    RAID stride: 128
    RAID stripe width: 640
    Flex block group size: 16
    Filesystem created: Fri Sep 20 19:45:01 2013
    Last mount time: Tue May 20 02:14:37 2014
    Last write time: Tue May 20 02:14:37 2014
    Mount count: 3
    Maximum mount count: -1
    Last checked: Tue May 20 01:34:05 2014
    Check interval: 0 (<none>)
    Lifetime writes: 34 TB
    Reserved blocks uid: 0 (user root)
    Reserved blocks gid: 0 (group root)
    First inode: 11
    Inode size: 256
    Required extra isize: 28
    Desired extra isize: 28
    Journal inode: 8
    Default directory hash: half_md4
    Directory Hash Seed: 569ec5fc-4d5e-4639-bef3-42cde5fbe948
    Journal backup: inode blocks

I also ran a filesystem check:

root@marvin:~# e2fsck -vfp /dev/md2

         2330890 inodes used (0.76%, out of 305225728)
           14882 non-contiguous files (0.6%)
             949 non-contiguous directories (0.0%)
                 # of inodes with ind/dind/tind blocks: 0/0/0
                 Extent depth histogram: 2317190/13041/651
      4868171016 blocks used (99.68%, out of 4883606400)
               0 bad blocks
            1654 large files

         2273776 regular files
           57105 directories
               0 character device files
               0 block device files
               0 fifos
               0 links
               0 symbolic links (0 fast symbolic links)
               0 sockets
    ------------
         2330881 files

The underlying device is an mdadm RAID6 that was grown from 7 to 9 disks. The grow operation finished without problems before I tried to increase the ext4 size.

Solution:
The solution for me was to downgrade to e2fsprogs 1.42.8. Then the resize did work and finished within a few minutes. I got the hint to do so from a user in a forum who had the same problem and solved it with the older version. I have not tested the new 1.42.10.

I think this must be a bug introduced in e2fsprogs 1.42.9, because everything works as expected with the older version.

I hope this helps to identify the problem. Best regards, Joerg

M D (mdrules) wrote :

I had a similar problem: a 24TB RAID 6 array growing to a 48TB RAID 6 array. I downgraded to 1.42.8 and was able to resize the filesystem. However, with a 20TB RAID 5 grown to a 24TB RAID 5, I did not encounter the issue.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in e2fsprogs (Ubuntu):
status: New → Confirmed
Theodore Ts'o (tytso) wrote :

If someone could try running the following tests, it would be very useful.

1) strace -o /tmp/resize.strace resize2fs /dev/XXX

2) resize2fs -d 31 /dev/XXX | tee /tmp/resize.debug

And then upload the files of the "hanging" resize2fs. It's not something I've been able to reproduce when doing a quick test.

Thanks!!

Jörg (joerg-niemoeller) wrote :

Here is the output of strace. The resize with -d 31 only produced an empty file.

Jörg (joerg-niemoeller) wrote :

I should maybe mention that both commands showed the same hanging behaviour again.

Thermaltaker (dennis-benndorf) wrote :

This bug also affects me. I have tested the new 1.42.11, but with no success. The strace is attached; the debug log is empty for my setup as well.
In addition, I added a strace/debug file from 1.42.8, which worked. I hope that might be helpful.

Theodore Ts'o (tytso) wrote :

Someone has reported what appears to be the same problem on the linux-ext4 list, and by correlating the observations from Dennis (Thermaltaker) with those on the linux-ext4 list, I believe we have a fix that should address this problem:

http://thread.gmane.org/gmane.comp.file-systems.ext4/44870

If someone wants to try the patch proposed by Azat applied to 1.42.11, that would be much appreciated.

Rudy Broersma (tozz) wrote :

I have just tried this patch (to resize2fs.c; there is another patch for resize.c, but I don't have a resize.c in the 1.42.11 src tree), but it didn't work out the way I'd like.

At first, everything seemed to go okay:

/dev/sdb 29T 26T 2.1T 93% /backups

root@dione:/usr/local/e2fsprogs/sbin# umount /backups/

root@dione:/usr/local/e2fsprogs/sbin# ./resize2fs -fp /dev/sdb
resize2fs 1.42.11 (09-Jul-2014)
Resizing the filesystem on /dev/sdb to 8775931648 (4k) blocks.
Begin pass 2 (max = 11703)
Relocating blocks XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 3 (max = 238018)
Scanning inode table XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 5 (max = 1)
Moving inode table XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/sdb is now 8775931648 blocks long.

root@dione:/# mount -a
root@dione:/# df -h

/dev/sdb 33T 26T 5.5T 83% /backups

However:

root@dione:/backups# ls -la
total 0

As shown, this disk contains 26 terabytes of data, but according to 'ls' the disk is empty. I am trying to see if this can be fixed with e2fsck, but this is scary.

Rudy Broersma (tozz) wrote :

I ran e2fsck on this disk, and it found quite a lot of problems with early inodes. After e2fsck completed, almost all of the root folders had been moved to lost+found. I managed to restore the data by moving them back from lost+found.

I'd like to note that although I ran resize2fs with the -f flag, I did run e2fsck prior to resizing the disk. However, resize2fs kept insisting that I needed to run e2fsck, which is why I used the -f flag.

TJ (tj) wrote :

Increasing importance due to possible denial of service during a resize of large 'complex' filesystems. In this case both reports involve md (mdadm) RAID 6.

Changed in e2fsprogs (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
tags: added: trusty
Changed in e2fsprogs (Ubuntu Trusty):
milestone: none → trusty-updates
Changed in e2fsprogs (Ubuntu Trusty):
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in e2fsprogs (Ubuntu Trusty):
status: New → Confirmed
Jacob Becker (jacob-becker-h) wrote :

According to the e2fsprogs release notes at
http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.42.13
this bug is fixed in e2fsprogs 1.42.12 or higher.

"Fix a 32/64-bit overflow bug that could cause resize2fs to loop forever. (Addresses-Launchpad-Bug: #1321958)"

Is it possible to backport the newest e2fsprogs package to trusty?
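
For context, here is a minimal, self-contained C sketch of the failure class that changelog entry describes; it is an illustration, not the actual e2fsprogs code. The filesystem in this report has 4,883,606,400 blocks, which is above 2^32 = 4,294,967,296, so any block calculation routed through a 32-bit type either truncates the count or, in a loop, wraps around and never reaches its exit condition:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* Block count taken from the tune2fs -l output above: larger than 2^32. */
        uint64_t fs_blocks = 4883606400ULL;

        /* Buggy pattern: a loop such as
         *     for (uint32_t blk = 0; blk < fs_blocks; blk++) { ... }
         * can never terminate, because even UINT32_MAX is still below fs_blocks;
         * the counter wraps back to 0 at 2^32 and the condition stays true forever. */
        uint32_t blk = UINT32_MAX;
        printf("fs_blocks            = %llu\n", (unsigned long long) fs_blocks);
        printf("largest 32-bit value = %u\n", blk);
        printf("blk < fs_blocks is %s even at UINT32_MAX\n",
               (uint64_t) blk < fs_blocks ? "still true" : "false");

        /* Truncating the count into a 32-bit variable instead silently drops
         * the high bits, so later arithmetic works on the wrong size. */
        uint32_t truncated = (uint32_t) fs_blocks;
        printf("(uint32_t) fs_blocks = %u (%llu blocks silently lost)\n",
               truncated, (unsigned long long) (fs_blocks - truncated));
        return 0;
    }

Either pattern is consistent with the behaviour reported above: resize2fs prints its version banner, then spins at 100% CPU without making progress.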

Mathew Hodson (mathew-hodson) wrote :

This should be fixed in Vivid and later then.

Changed in e2fsprogs (Ubuntu):
status: Triaged → Fix Released