e2fsprogs segfaults when fsck-ing a FS that has been unmounted by Ubuntu

Bug #438248 reported by Ævar Arnfjörð Bjarmason on 2009-09-28
This bug affects 1 person
Affects: e2fsprogs (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

Binary package hint: e2fsprogs

When I run e2fsck on my damaged /dev/sdb2 filesystem in Ubuntu 9.04, using e2fsprogs 1.41.4-1ubuntu1, /dev/sdb2 gets unmounted for some unknown reason during the check. This causes e2fsck to segfault.

Here's a backtrace from GDB, using a debugging build (CFLAGS="-O0 -ggdb3" ./configure) of the source obtained with "apt-get source e2fsprogs":

Bad or non-existent /lost+found. Cannot reconnect.
Unconnected directory inode 27181065 (???)
Connect to /lost+found? yes

Bad or non-existent /lost+found. Cannot reconnect.
Unconnected directory inode 27181066 (???)
Connect to /lost+found? yes

Bad or non-existent /lost+found. Cannot reconnect.
Error while trying to find /lost+found: Ext2 inode is not a directory
/lost+found not found. Create? yes

Error creating /lost+found directory (ext2fs_link): Ext2 inode is not a directory
Pass 3A: Optimizing directories
[New Thread 0xb7dbf720 (LWP 11657)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7dbf720 (LWP 11657)]
0x08067990 in ino_cmp (a=0x930d4020, b=0x130d4040) at rehash.c:175
175 return (he_a->ino - he_b->ino);
(gdb)
(gdb) bt
#0 0x08067990 in ino_cmp (a=0x930d4020, b=0x130d4040) at rehash.c:175
#1 0xb7dee2db in ?? () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7def1b7 in qsort_r () from /lib/tls/i686/cmov/libc.so.6
#3 0xb7def2ce in qsort () from /lib/tls/i686/cmov/libc.so.6
#4 0x08068e7a in e2fsck_rehash_dir (ctx=0x9a63c38, ino=11) at rehash.c:746
#5 0x0806914a in e2fsck_rehash_directories (ctx=0x9a63c38) at rehash.c:855
#6 0x0805ba01 in e2fsck_pass3 (ctx=0x9a63c38) at pass3.c:130
#7 0x0804ec19 in e2fsck_run (ctx=0x9a63c38) at e2fsck.c:217
#8 0x0804e04d in main (argc=3, argv=0xbf879924) at unix.c:1292
(gdb) bt full
#0 0x08067990 in ino_cmp (a=0x930d4020, b=0x130d4040) at rehash.c:175
        he_a = (const struct hash_entry *) 0x930d4020
        he_b = (const struct hash_entry *) 0x130d4040
#1 0xb7dee2db in ?? () from /lib/tls/i686/cmov/libc.so.6
No symbol table info available.
#2 0xb7def1b7 in qsort_r () from /lib/tls/i686/cmov/libc.so.6
No symbol table info available.
#3 0xb7def2ce in qsort () from /lib/tls/i686/cmov/libc.so.6
No symbol table info available.
#4 0x08068e7a in e2fsck_rehash_dir (ctx=0x9a63c38, ino=11) at rehash.c:746
        fs = (ext2_filsys) 0x9a64188
        retval = 0
        inode = {i_mode = 0, i_uid = 0, i_size = 0, i_atime = 0, i_ctime = 0, i_mtime = 0, i_dtime = 0, i_gid = 0, i_links_count = 0,
  i_blocks = 0, i_flags = 0, osd1 = {linux1 = {l_i_version = 0}, hurd1 = {h_i_translator = 0}}, i_block = {0 <repeats 15 times>},
  i_generation = 0, i_file_acl = 0, i_dir_acl = 0, i_faddr = 0, osd2 = {linux2 = {l_i_blocks_hi = 0, l_i_file_acl_high = 0,
      l_i_uid_high = 0, l_i_gid_high = 0, l_i_reserved2 = 0}, hurd2 = {h_i_frag = 0 '\0', h_i_fsize = 0 '\0', h_i_mode_high = 0,
      h_i_uid_high = 0, h_i_gid_high = 0, h_i_author = 0}}}
        dir_buf = 0x13179020 "x\021�\030@\r\023\020"
        fd = {buf = 0x13179020 "x\021�\030@\r\023\020", inode = 0xbf879534, err = 0, ctx = 0x9a63c38, harray = 0x130d4020,
  max_array = 0, num_array = 0, dir_size = 0, compress = 1, parent = 0}
        outdir = {num = 0, max = 0, buf = 0x0, hashes = 0x0}
#5 0x0806914a in e2fsck_rehash_directories (ctx=0x9a63c38) at rehash.c:855
        pctx = {errcode = 0, ino = 0, ino2 = 0, dir = 11, inode = 0x0, dirent = 0x0, blk = 0, blk2 = 0, blkcount = -1, group = -1,
  num = 0, str = 0x0}
        rtrack = {time_start = {tv_sec = 1254151889, tv_usec = 341337}, user_start = {tv_sec = 17585, tv_usec = 807044},
  system_start = {tv_sec = 101, tv_usec = 150321}, brk_start = 0x17106000, bytes_read = 16380503040, bytes_written = 1243066368}
        dir = (struct dir_info *) 0x9a64188
        iter = (ext2_u32_iterate) 0x131dc030
        dirinfo_iter = (struct dir_info_iter *) 0x0
        ino = 11
        retval = 0
        cur = 0
        max = 192
        all_dirs = 0
        dir_index = 32
        first = 0
#6 0x0805ba01 in e2fsck_pass3 (ctx=0x9a63c38) at pass3.c:130
        fs = (ext2_filsys) 0x9a64188
        iter = (struct dir_info_iter *) 0x13260030
        rtrack = {time_start = {tv_sec = 1254151885, tv_usec = 452264}, user_start = {tv_sec = 17583, tv_usec = 950928},
  system_start = {tv_sec = 99, tv_usec = 506218}, brk_start = 0x17106000, bytes_read = 16354583552, bytes_written = 1229254656}
        pctx = {errcode = 0, ino = 27181066, ino2 = 0, dir = 0, inode = 0x0, dirent = 0x0, blk = 0, blk2 = 0, blkcount = -1,
  group = -1, num = 0, str = 0x0}
        dir = (struct dir_info *) 0x0
        maxdirs = 83576
        count = 1
#7 0x0804ec19 in e2fsck_run (ctx=0x9a63c38) at e2fsck.c:217
        i = 2
        e2fsck_pass = (pass_t) 0x805b768 <e2fsck_pass3>
#8 0x0804e04d in main (argc=3, argv=0xbf879924) at unix.c:1292
        retval = 0
        orig_retval = 0
        exit_value = 0
        fs = (ext2_filsys) 0x9a64188
        io_ptr = (io_manager) 0x809ed60
        sb = (struct ext2_super_block *) 0x9a673f0
        lib_ver_date = 0x809af58 "27-Jan-2009"
        my_ver = 141
        lib_ver = 141
        ctx = (e2fsck_t) 0x9a63c38
        pctx = {errcode = 0, ino = 0, ino2 = 0, dir = 0, inode = 0x0, dirent = 0x0, blk = 0, blk2 = 0, blkcount = -1, group = -1,
  num = 0, str = 0x0}
        flags = 81921
        run_result = 0
        journal_size = 128
        sysval = 4096
        sys_page_size = 4096
        features = {0, 0, 0}
        cp = 0x9a642dc ""

Theodore Ts'o (tytso) wrote :

Is the crash consistently repeatable?

How big is /dev/sdb2? Would you be willing to send me a compressed e2image file of the filesystem? (See the REPORTING BUGS section of the e2fsck man page for more details.)

I've had it crash twice now, once under GDB. I've since pulled the latest e2fsprogs from git and I'm running that build.

/dev/sdb2 is minus 5 GB now according to df -h :)

    $ df -h /media/Igla/
    Filesystem Size Used Avail Use% Mounted on
    /dev/sdc2 411G -5.1G 416G - /media/Igla

But actually it's around 280 GB. Aside from the Internet connectivity issues I'd currently have transferring something that big, there's private data on that partition, like a backup of my ~ directory.

I'm running it again with the git/master version and logging the full e2fsck/syslog/dmesg output produced in the process.

Theodore Ts'o (tytso) wrote :

Can you try running debugfs on the filesystem and sending me the output of the following two debugfs commands:

stat <11>
ls <11>

thanks!!

avar@aoeu:~/src/e2fsprogs$ sudo debugfs/debugfs /dev/sdc2
debugfs 1.41.9 (22-Aug-2009)
debugfs: stat <11>
Inode: 11 Type: directory Mode: 0700 Flags: 0x0
Generation: 0 Version: 0x00000000:00000000
User: 0 Group: 0 Size: 16384
File ACL: 0 Directory ACL: 0
Links: 2 Blockcount: 32
Fragment: Address: 0 Number: 0 Size: 0
 ctime: 0x49b57158:00000000 -- Mon Mar 9 19:43:20 2009
 atime: 0x4ab98bf4:00000000 -- Wed Sep 23 02:46:12 2009
 mtime: 0x49b57158:00000000 -- Mon Mar 9 19:43:20 2009
crtime: 0x49b57158:00000000 -- Mon Mar 9 19:43:20 2009
Size of extra inode fields: 28
BLOCKS:
(0-3):1540-1543
TOTAL: 4

debugfs: ls <11>
 11 (12) . 2 (4084) .. 0 (4096) 0 (4096) 0 (4096)

Changed in e2fsprogs (Ubuntu):
status: New → Triaged
importance: Undecided → Medium

Sorry for not getting back to you earlier. I ran another fsck a while ago, followed by a GDB session. The problem is that qsort() is being fed an invalid pointer, but I don't know why. Here's the most interesting bit; the other 70MB of the log are attached (bzipped).

(gdb) l
169 static EXT2_QSORT_TYPE ino_cmp(const void *a, const void *b)
170 {
171 const struct hash_entry *he_a = (const struct hash_entry *) a;
172 const struct hash_entry *he_b = (const struct hash_entry *) b;
173
174 return (he_a->ino - he_b->ino);
175 }
176
177 /* Used for sorting the hash entry */
178 static EXT2_QSORT_TYPE name_cmp(const void *a, const void *b)
(gdb) x he_a
0x89fbc1b8: Cannot access memory at address 0x89fbc1b8
(gdb) p he_a
$1 = (const struct hash_entry *) 0x89fbc1b8
(gdb) p *he_a
Cannot access memory at address 0x89fbc1b8
(gdb) p he_b
$2 = (const struct hash_entry *) 0x9fbc1d8
(gdb) p *he_b
$3 = {hash = 1768448865, minor_hash = 17, ino = 3087286648, dir = 0xb8044178}
