resize2fs does not start to actually grow an ext4

Bug #1321958 reported by Jörg on 2014-05-21
This bug affects 6 people
Affects             Status        Importance  Assigned to  Milestone
e2fsprogs (Ubuntu)  Fix Released  High        Unassigned
Trusty              Confirmed     High        Unassigned   trusty-updates

Bug Description

Ubuntu 14.04 LTS, all proposed updates done
Kernel: 3.13.0-24-generic,
Package: e2fsprogs 1.42.9-3ubuntu1,
System: Haswell i7, 32GB RAM, LSI SAS 9207-8i HBA and LSI SAS 9211-8i HBA

I tried to increase the size of an ext4 filesystem from 20TB to 28TB. I first tried an offline resize with "resize2fs -fp /dev/md2" and later an online resize with "resize2fs -f /dev/md2". In both cases a resize2fs process starts and uses nearly 100% CPU according to top, but it never performs any actual resize. It prints only its version and date and then does not finish for hours. I let it run for more than a day without it finishing:

root@marvin:~# resize2fs -f /dev/md2
    resize2fs 1.42.9 (4-Feb-2014)

There is never more terminal output than that. It looks to me as if resize2fs hangs in an endless calculation or loop, or something similar.

Some more info about the filesystem:

root@marvin:~# tune2fs -l /dev/md2
    tune2fs 1.42.9 (4-Feb-2014)
    Filesystem volume name: data
    Last mounted on: /media/data01
    Filesystem UUID: e3845e15-0336-47ae-8aec-df75acb217c5
    Filesystem magic number: 0xEF53
    Filesystem revision #: 1 (dynamic)
    Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
    Filesystem flags: signed_directory_hash
    Default mount options: user_xattr acl
    Filesystem state: clean
    Errors behavior: Continue
    Filesystem OS type: Linux
    Inode count: 305225728
    Block count: 4883606400
    Reserved block count: 0
    Free blocks: 22919919
    Free inodes: 302894731
    First block: 0
    Block size: 4096
    Fragment size: 4096
    Group descriptor size: 64
    Reserved GDT blocks: 1024
    Blocks per group: 32768
    Fragments per group: 32768
    Inodes per group: 2048
    Inode blocks per group: 128
    RAID stride: 128
    RAID stripe width: 640
    Flex block group size: 16
    Filesystem created: Fri Sep 20 19:45:01 2013
    Last mount time: Tue May 20 02:14:37 2014
    Last write time: Tue May 20 02:14:37 2014
    Mount count: 3
    Maximum mount count: -1
    Last checked: Tue May 20 01:34:05 2014
    Check interval: 0 (<none>)
    Lifetime writes: 34 TB
    Reserved blocks uid: 0 (user root)
    Reserved blocks gid: 0 (group root)
    First inode: 11
    Inode size: 256
    Required extra isize: 28
    Desired extra isize: 28
    Journal inode: 8
    Default directory hash: half_md4
    Directory Hash Seed: 569ec5fc-4d5e-4639-bef3-42cde5fbe948
    Journal backup: inode blocks
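
For orientation, the figures in the tune2fs output above can be cross-checked with a little shell arithmetic. This is only a sketch using values copied from that output; the variable names are made up for illustration, and it assumes a shell with 64-bit integer arithmetic (bash or dash on a 64-bit system).

```shell
#!/bin/sh
# Values copied from the tune2fs output above.
block_count=4883606400
block_size=4096
blocks_per_group=32768

# Total size in bytes, in whole decimal TB (10^12) and whole TiB (2^40).
size_bytes=$((block_count * block_size))
size_tb=$((size_bytes / 1000000000000))
size_tib=$((size_bytes / 1099511627776))

# Number of block groups, rounded up to cover a partial last group.
group_count=$(( (block_count + blocks_per_group - 1) / blocks_per_group ))

# 2^32: the first block number that no longer fits in 32 bits.
limit_32bit=4294967296
if [ $((block_count > limit_32bit)) -eq 1 ]; then
    needs_64bit=yes
else
    needs_64bit=no
fi

echo "size: $size_bytes bytes (about $size_tb TB, $size_tib TiB)"
echo "block groups: $group_count"
echo "block numbers exceed 32 bits: $needs_64bit"
```

The group count agrees with "Inode count" divided by "Inodes per group" (305225728 / 2048), and the block count is already past 2^32, which is consistent with the 64bit feature flag listed above.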

I also ran a filesystem check:

root@marvin:~# e2fsck -vfp /dev/md2

         2330890 inodes used (0.76%, out of 305225728)
           14882 non-contiguous files (0.6%)
             949 non-contiguous directories (0.0%)
                 # of inodes with ind/dind/tind blocks: 0/0/0
                 Extent depth histogram: 2317190/13041/651
      4868171016 blocks used (99.68%, out of 4883606400)
               0 bad blocks
            1654 large files

         2273776 regular files
           57105 directories
               0 character device files
               0 block device files
               0 fifos
               0 links
               0 symbolic links (0 fast symbolic links)
               0 sockets
    ------------
         2330881 files

The underlying device is an mdadm RAID6 that was grown from 7 to 9 disks. The growing finished without problems before I tried to increase the ext4 size.

Solution:
The solution for me was to downgrade to e2fsprogs 1.42.8. The resize then worked and finished within a few minutes. I got the hint from a forum user who had the same problem and solved it with the older version. I have not tested the new 1.42.10.

I think this must be a bug introduced in e2fsprogs 1.42.9, because everything works as expected with the older version.

I hope this helps to identify the problem. Best regards, Joerg

M D (mdrules) wrote :

I had a similar problem growing a 24TB RAID 6 array to 48TB. I downgraded to 1.42.8 and was able to resize the filesystem. However, I did not encounter the issue when growing a 20TB RAID 5 array to 24TB.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in e2fsprogs (Ubuntu):
status: New → Confirmed
Theodore Ts'o (tytso) wrote :

If someone could try running the following tests, it would be very useful.

1) strace -o /tmp/resize.strace resize2fs /dev/XXX

2) resize2fs -d 31 /dev/XXX | tee /tmp/resize.debug

And then upload the files of the "hanging" resize2fs. It's not something I've been able to reproduce when doing a quick test.

Thanks!!

Jörg (joerg-niemoeller) wrote :

Here is the output of strace. The resize with -d 31 produced only an empty file.

Jörg (joerg-niemoeller) wrote :

I should maybe mention that both commands showed the same hanging behaviour again.

Thermaltaker (dennis-benndorf) wrote :

This bug also affects me. I tested the new 1.42.11, without success. The strace is attached; the debug log is empty for my setup as well.
In addition, I attached strace and debug files from 1.42.8, which worked. I hope they are helpful.

Theodore Ts'o (tytso) wrote :

Someone has reported what appears to be the same problem on the linux-ext4 list. After correlating the observations from Dennis (Thermaltaker) with those on the linux-ext4 list, I believe we have a fix that should address this problem:

http://thread.gmane.org/gmane.comp.file-systems.ext4/44870

If someone wants to try the patch proposed by Azat applied to 1.42.11, that would be much appreciated.

Rudy Broersma (tozz) wrote :

I have just tried this patch (to resize2fs.c; there is another patch for resize.c, but I don't have a resize.c in the 1.42.11 source tree), but it didn't work out the way I'd like.

It seemed to go all right:

/dev/sdb 29T 26T 2.1T 93% /backups

root@dione:/usr/local/e2fsprogs/sbin# umount /backups/

root@dione:/usr/local/e2fsprogs/sbin# ./resize2fs -fp /dev/sdb
resize2fs 1.42.11 (09-Jul-2014)
Resizing the filesystem on /dev/sdb to 8775931648 (4k) blocks.
Begin pass 2 (max = 11703)
Relocating blocks XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 3 (max = 238018)
Scanning inode table XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 5 (max = 1)
Moving inode table XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/sdb is now 8775931648 blocks long.

root@dione:/# mount -a
root@dione:/# df -h

/dev/sdb 33T 26T 5.5T 83% /backups

However:

root@dione:/backups# ls -la
total 0

As shown, this disk contains 26 terabytes of data, but according to 'ls' the disk is empty. I am trying to see whether this can be fixed with e2fsck, but this is scary.

Rudy Broersma (tozz) wrote :

I ran e2fsck on this disk, and it found quite a lot of problems with early inodes. After e2fsck completed, almost all the root folders had been moved to lost+found. I managed to restore the data by moving them back from lost+found.

I'd like to note that although I ran resize2fs with the -f flag, I did run e2fsck prior to resizing the disk. However, resize2fs kept insisting that I needed to run e2fsck, so I used the -f flag.
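
For comparison, an offline resize that does not need resize2fs -f might look like the sketch below. It only prints the commands and runs nothing. The function name plan_offline_resize is hypothetical, the device and mount point are the ones from the transcript above, and it is an assumption, not something confirmed in this report, that the earlier check was run without e2fsck's own -f (force) option, which is what resize2fs asks for before an offline resize.

```shell
#!/bin/sh
# plan_offline_resize DEVICE MOUNTPOINT
# Hypothetical helper: prints, but does not execute, one possible
# command sequence for an offline grow to the full device size.
plan_offline_resize() {
    dev=$1
    mnt=$2
    printf 'umount %s\n' "$mnt"        # take the filesystem offline
    printf 'e2fsck -f %s\n' "$dev"     # forced check, even if marked clean
    printf 'resize2fs -p %s\n' "$dev"  # grow with progress bars, no -f
    printf 'mount -a\n'                # remount via fstab, as in the transcript
}

plan_offline_resize /dev/sdb /backups
```

Printing the plan first leaves room to review each step before touching a filesystem of this size. It does not work around the hang discussed in this bug.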

TJ (tj) wrote :

Increasing importance due to a possible denial of service during a resize of large 'complex' file-systems. In this case both reports use DM RAID 6.

Changed in e2fsprogs (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
tags: added: trusty
Changed in e2fsprogs (Ubuntu Trusty):
milestone: none → trusty-updates
Changed in e2fsprogs (Ubuntu Trusty):
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in e2fsprogs (Ubuntu Trusty):
status: New → Confirmed
Jacob Becker (jacob-becker-h) wrote :

According to
http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.42.13
this bug is fixed in e2fsprogs 1.42.12 or later.

"Fix a 32/64-bit overflow bug that could cause resize2fs to loop forever. (Addresses-Launchpad-Bug: #1321958)"

Is it possible to backport the newest e2fsprogs package to Trusty?
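
To illustrate the class of bug the release note names, here is a rough shell sketch of what happens when the first block of a block group (group number times blocks per group) is computed in 32 bits. It is not the resize2fs code: the function names are invented, the 32-bit truncation is simulated with a mask, and the group numbers are chosen from the 20TB filesystem in the original report (32768 blocks per group, 149036 groups).

```shell
#!/bin/sh
# first_block_64 GROUP BLOCKS_PER_GROUP
# Product computed at the shell's full (64-bit) integer width.
first_block_64() {
    echo $(( $1 * $2 ))
}

# first_block_32 GROUP BLOCKS_PER_GROUP
# Same product truncated to 32 bits, simulating a 32-bit unsigned type.
first_block_32() {
    echo $(( ($1 * $2) & 4294967295 ))
}

blocks_per_group=32768
# Last group below 2^32 blocks, first group at 2^32, and the last group
# of the 149036-group filesystem from the report.
for group in 131071 131072 149035; do
    echo "group $group: 64-bit $(first_block_64 "$group" "$blocks_per_group")," \
         "32-bit $(first_block_32 "$group" "$blocks_per_group")"
done
```

With 32768 blocks per group, every group from 131072 upward starts at or beyond block 2^32, so the truncated result wraps back to a small number. How such a wrapped value turns into an endless loop inside resize2fs is not shown here.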

Mathew Hodson (mathew-hodson) wrote :

This should be fixed in Vivid and later then.

Changed in e2fsprogs (Ubuntu):
status: Triaged → Fix Released
Danny Mann (adayforgotten) wrote :

It doesn't look like the fixed version was ever pushed out to Trusty. If you are using Trusty and need this fix, simply download the source to your machine and compile it. If you have the build-essential and git packages installed, it's quite simple. Clone the git repo for the latest version, cd into it, and type:
./configure
make

After that, you will have the needed binaries in that folder. You can run them directly from there when you want to use that version rather than the one in Trusty.

I compiled 1.43.6 from source and used resize2fs from there to successfully resize from 21 -> 24 TB on a RAID6.
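
A quick way to tell which resize2fs you are about to run is to look at the banner line it prints, as in the transcripts above. The sketch below classifies a banner against the versions mentioned in this thread. The function names are hypothetical, it relies on GNU sort -V, and it assumes 1.42.10 is affected as well, which nobody here tested. Distribution packages can carry backported patches, so the upstream version number is only a hint.

```shell
#!/bin/sh
# version_from_banner "resize2fs 1.42.9 (4-Feb-2014)" prints 1.42.9
version_from_banner() {
    printf '%s\n' "$1" | awk '{print $2}'
}

# version_lt A B: succeeds if A is strictly older than B (GNU sort -V order).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# in_affected_range V: succeeds for 1.42.9 <= V < 1.42.12, the range
# reported as hanging in this thread (1.42.10 is assumed, not tested).
in_affected_range() {
    ! version_lt "$1" 1.42.9 && version_lt "$1" 1.42.12
}

for banner in "resize2fs 1.42.9 (4-Feb-2014)" "resize2fs 1.42.11 (09-Jul-2014)"; do
    v=$(version_from_banner "$banner")
    if in_affected_range "$v"; then
        echo "$v: in the range reported as hanging"
    else
        echo "$v: outside the reported range"
    fi
done
```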
