e2fsprogs wrongly identifies ext4 as mounted

Bug #711799 reported by udippel on 2011-02-02
This bug affects 8 people
Affects: e2fsprogs (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: e2fsprogs

After a crash of an Ubuntu netbook, the machine hung at the initramfs
prompt (I have a /boot and a /).
Booting the same system (Ubuntu 10.10) from a thumb drive, I cannot fsck it:
$ sudo fsck /dev/sda2
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
fsck.ext4: Device or resource busy while trying to open /dev/sda2
Filesystem mounted or opened exclusively by another program?

But it is not mounted:
$ cat /proc/mounts
shows that it is not mounted; and it can't be unmounted.

dmesg knows what is going on:
$ dmesg | grep sda2
[ 6.513953] sda: sda1 sda2 sda3 < sda5 > sda4
[ 9.300388] EXT4-fs (sda2): INFO: recovery required on readonly filesystem
[ 9.300398] EXT4-fs (sda2): write access will be enabled during recovery
[ 9.312706] EXT4-fs warning (device sda2): ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure
[ 9.312729] EXT4-fs warning (device sda2): ext4_clear_journal_err: Marking fs in need of filesystem check.
$
But this fsck never materialises, and can't be done manually.

Finally, I tried to delete the journal, but to no avail, the "Device
or resource busy" stays. Is there any way to trick fsck into believing
me that it is not mounted?
If not, I still consider the behaviour somewhat wrong: if it is not in
/proc/mounts, why does fsck say so?
And when I
sudo mount /dev/sda2 /mnt
it starts the mount process, but never finishes, and also it is
impossible to ever exit this process, I tried with Ctrl-C, Ctrl-Z, and
even with kill -9 from another console. Ubuntu isn't even able to shut
down then, but keeps trying forever.

In a nutshell, it is a bug in 10.10. I use the installer CD written to the thumb drive (Startup Disk Creator).
Confirmed: when I boot with a 9.04 thumb drive, I can easily open a terminal and run fsck. Done and over.

udippel (udippel) on 2011-02-02
tags: added: fsck.ext4 mount
Ian! D. Allen (idallen) wrote :

Same error, different circumstances:

# fsck.ext4 /dev/sde1
e2fsck 1.41.12 (17-May-2010)
fsck.ext4: Group descriptors look bad... trying backup blocks...
fsck.ext4: Bad magic number in super-block when using the backup blocks
fsck.ext4: going back to original superblock
fsck.ext4: Device or resource busy while trying to open /dev/sde1
Filesystem mounted or opened exclusively by another program?

The bug above is that the program opens /dev/sde1 once, forgets to close it,
then tries to open it a second time and fails. Found with strace:

[...]
write(1, "fsck.ext4: Group descriptors loo"..., 65) = 65
close(3) = 0
munmap(0x7f7d658b7000, 483328) = 0
open("/dev/sde1", O_RDWR|O_EXCL) = 3
ioctl(3, BLKROGET, 0x7fffe00b6bfc) = 0
uname({sys="Linux", node="idallen-oak.home.idallen.ca", ...}) = 0
lseek(3, 134217728, SEEK_SET) = 134217728
read(3, "a%\20\217\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024) = 1024
write(2, "fsck.ext4", 9) = 9
write(2, ": ", 2) = 2
write(2, "Bad magic number in super-block", 31) = 31
write(2, " ", 1) = 1
write(2, "when using the backup blocks", 28) = 28
ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "\n", 1) = 1
write(1, "fsck.ext4: going back to origina"..., 45) = 45
open("/dev/sde1", O_RDWR|O_EXCL) = -1 EBUSY (Device or resource busy)
write(2, "fsck.ext4", 9) = 9
write(2, ": ", 2) = 2
write(2, "Device or resource busy", 23) = 23
write(2, " ", 1) = 1
write(2, "while trying to open /dev/sde1", 30) = 30
ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "\n", 1) = 1
write(1, "Filesystem mounted or opened exc"..., 61) = 61
exit_group(8) = ?
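
The "Bad magic number in super-block" message in the trace follows a 1024-byte read at offset 134217728, which would correspond to block 32768 with a 4096-byte block size, a usual backup superblock location. As a rough sketch of that kind of check (not e2fsck's actual code), the following reads the two-byte magic field and compares it with 0xEF53. The function name and the default offset parameter are assumptions made for illustration; the field offset of 56 bytes and the little-endian encoding come from the ext2/3/4 on-disk layout.

```python
import struct

EXT_MAGIC = 0xEF53        # ext2/3/4 superblock magic number
MAGIC_OFFSET = 56         # byte offset of the magic field inside the superblock
PRIMARY_SB_OFFSET = 1024  # the primary superblock starts 1024 bytes into the device


def has_ext_magic(path, sb_offset=PRIMARY_SB_OFFSET):
    """Return True if a little-endian 0xEF53 sits where the magic field should be.

    Hypothetical helper for illustration; `path` may be a device node or an
    image file, and `sb_offset` is the byte offset of the superblock to probe.
    """
    with open(path, "rb") as dev:
        dev.seek(sb_offset + MAGIC_OFFSET)
        raw = dev.read(2)
    if len(raw) < 2:
        return False  # file or device too short to hold a superblock here
    (magic,) = struct.unpack("<H", raw)
    return magic == EXT_MAGIC
```

With read access to the device, has_ext_magic("/dev/sde1", sb_offset=134217728) would probe the backup location seen in the trace. It only reads, so it does not need the exclusive open that e2fsck uses.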

Ian! D. Allen (idallen) wrote :

This one-line patch - add ext2fs_close(fs) in unix.c - seems to fix it:

                        if ((orig_retval == 0) && retval != 0) {
                                com_err(ctx->program_name, retval,
                                        "when using the backup blocks");
                                printf(_("%s: going back to original "
                                         "superblock\n"), ctx->program_name);
+                               ext2fs_close(fs);
                                ctx->superblock = orig_superblock;
                                retval = try_open_fs(ctx, flags, io_ptr, &fs);
                        }

No more "device busy".
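
The failure mode (take exclusive access, forget to release it, then ask for it again) can be reproduced without a block device. The sketch below is only an analogy: it uses flock() on a temporary regular file as a stand-in for open() with O_EXCL on a block device, so the errno is EAGAIN rather than EBUSY, and the function name is made up for illustration. Linux is assumed.

```python
import fcntl
import os
import tempfile


def second_lock_errno(close_first):
    """Lock a file via one descriptor, then try again via a second one.

    Returns 0 if the second attempt succeeds, otherwise the errno it failed
    with. Hypothetical helper; flock() stands in for an exclusive open.
    """
    with tempfile.NamedTemporaryFile() as tmp:
        first = os.open(tmp.name, os.O_RDWR)
        fcntl.flock(first, fcntl.LOCK_EX | fcntl.LOCK_NB)
        if close_first:
            os.close(first)  # plays the role of the added ext2fs_close(fs)
            first = None
        second = os.open(tmp.name, os.O_RDWR)
        try:
            fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return 0
        except OSError as exc:
            return exc.errno
        finally:
            os.close(second)
            if first is not None:
                os.close(first)
```

Calling second_lock_errno(False) fails because the process is blocked by its own earlier lock, while second_lock_errno(True) succeeds, which mirrors the before and after of the one-line patch.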

udippel (udippel) wrote :

I can't change the Importance, though it looks like a 'major', since everyone with a crash will lose their installs. 99% will not know any better than to do a reinstall; and probably quite a number of users will already have done so by now; destroying the positive user experience of netbook 10.10.
Do we have a tag for 'showstopper for 11.04'?

udippel (udippel) wrote :

I have some problem with your patch in #2. It 'helps' in the short run; but the actual bug is elsewhere, and a tad more serious.
Look at my first post: Even if the fsck dies as both of us found out, the 'mount' starts (so it finds the partition to be not mounted, okay), but never continues, never finishes, and it is impossible to kill the mount process one way or another; including a 'halt'.
'mount' should know better than staying indefinitely, it should find that it actually is working on an *open* filesystem (if your theory/patch was correct, the filesystem *is* open!), and if it can't open it properly, it should say so, and it also should time out.
Or, we have to assume that the code of mount is totally crap. Hmm.

Theodore Ts'o (tytso) wrote :

The fix for this is in e2fsprogs 1.41.13, but I strongly recommend you fix this by going to e2fsprogs 1.41.14, which has additional bug fixes. From the RELEASE-NOTES file:

Fixed a bug in e2fsck where if both the original and backup superblock
are invalid in some way, e2fsck will fail going back to the original
superblock because it didn't close the backup superblock first, and
the exclusive open prevented the file system from being reopened.

So this is a problem that won't occur for a cases of file system corruption, but for a user that does have both the original and backup superblocks corrupted, the results are rather catastrophic, since the only way they will be able to fix it is to boot a rescue CD and then run e2fsck from that rescue CD.

I can fetch the specific git commit which fixed this bug, but if you're using e2fsprogs 1.41.12, you really should upgrade...

Theodore Ts'o (tytso) wrote :

Sigh, I was typing too fast.

The first sentence of the 3rd paragraph above should read:

So this is a problem that won't occur for _all_ cases of file system corruption, but for a user which does have both the original and backup superblocks corrupted, the results are rather catastrophic...

udippel (udippel) wrote :

"if you're using e2fsprogs 1.41.12, you really should upgrade..."

Better make that "if Ubuntu is using e2fsprogs 1.41.12, Ubuntu really should upgrade...". And since it is a bug affecting all of 10.10 (at least), it needs to enter the official upgrade path, urgently. What I used was the installer-CD, but the system in question was pretty much up to date (it actually crashed during a regular 'apt-get upgrade', from which it could never recover on its own). The current version offered by the repositories is 1.41.12-1ubuntu2, and I have no clue which actual release this reflects. (Does it make sense to file an RFE to make the upstream version visible in apt-get?)

DaveQB (david-dward) wrote :

I can confirm I have also been hit with this problem.

Ubuntu 10.10 using e2fsprogs 1.41.12
There doesn't seem to be a clear upgrade path to 1.41.14 due to unsatisfied dependencies. If anyone knows otherwise I would be happy to hear.

DaveQB (david-dward) wrote :

I have tried with e2fsprogs 1.41.14 and I am seeing the same result. So I am not sure if this is a different issue I have here....

DaveQB (david-dward) wrote :

This seems to me to be a bug from upstream that is filtering down through the distros.

Using Ubuntu 10.10 LiveCD's e2fsck from the e2fsprogs package 1.41.12 failed to check an ext4 disk with errors, giving the aforementioned error.

Using a systemrescuecd-x86-2.1.0 CD's e2fsck from the e2fsprogs package 1.41.14 failed to check an ext4 disk with errors saying the aforementioned error.

Using a Slax 6.1.2 CD's e2fsck from the e2fsprogs package 1.41.3 (2008) checked and fixed an ext4 disk with errors.

Even though the Slax CD could not mount or do anything with the partition as it does not have ext4 support.
Go figure.

This leads me to think that this has not been fixed

E2fsprogs 1.41.13 (December 13, 2010) states:
"Fixed a bug in e2fsck where if both the original and backup superblock are invalid in some way, e2fsck will fail going back to the original superblock because it didn't close the backup superblock first, and the exclusive open prevented the file system from being reopened."

Unless Gentoo's e2fsprogs 1.41.14 missed this patch??

Theodore Ts'o (tytso) wrote :

It definitely is fixed in 1.41.13 and 1.41.14. If you are using a systemrescuecd 2.1.0, are you sure it didn't try mounting the root file system before you ran the e2fsck manually? Can you check /proc/mounts to be absolutely sure it's not really mounted?

It's not that I don't trust you, but, well, in general users have reported the darned things...

Can you send me the exact output of e2fsck and command you issued to invoke e2fsck?

-- Ted

DaveQB (david-dward) wrote :

Thanks for the response Theodore.

Yes it was definitely not mounted. I checked /proc/mounts (grep sda1 /proc/mounts), the mount -l command itself and also issued several umount /dev/sda1 to be sure. (I do have 10 years of Linux experience.)

The output from e2fsck /dev/sda1 was the same as others are having:

e2fsck 1.41.12 (17-May-2010)
fsck.ext4: Device or resource busy while trying to open /dev/sda1
Filesystem mounted or opened exclusively by another program?

Also got this on the SystemRescueCD:

e2fsck 1.41.14 (22-Dec-2010)
fsck.ext4: Device or resource busy while trying to open /dev/sda1
Filesystem mounted or opened exclusively by another program?

I did a "equery list|grep e2fsprog" on the systemrescueCD to confirm it is ver 1.41.14. Maybe Gentoo has missed your upstream patch for this bug??

I ended up using Slax 6.1.2 with e2fsprogs 1.41.3 to fix the partition and allow the Ubuntu 10.10 system to boot normally. I couldn't delay to do any further testing as it was a laptop of a university Lecturer that had to prepare classes for this week coming (emergency situation).

So maybe I found a new, slightly different bug?

How can one purposely corrupt a FS/journal for testing?

Theodore Ts'o (tytso) wrote :

So what has been specifically fixed is this case:

# fsck.ext4 /dev/sde1
e2fsck 1.41.12 (17-May-2010)
fsck.ext4: Group descriptors look bad... trying backup blocks...
fsck.ext4: Bad magic number in super-block when using the backup blocks
fsck.ext4: going back to original superblock
fsck.ext4: Device or resource busy while trying to open /dev/sde1
Filesystem mounted or opened exclusively by another program?

What you are reporting is something else. Can you tell me what the older version of e2fsck said that it fixed?

DaveQB (david-dward) wrote :

Sorry for the slow response.

"What you are reporting is something else. Can you tell me what the older version of e2fsck said that it fixed?"

Sorry?

What ver 1.41.3 on Slax 6.1.2 said while fixing the partition/FS?

I didn't see it with my own eyes, but had it read out to me, and it was the usual issues you have: data at inode X does not match, etc. etc. ...fix<y>?

It had a bunch of them. And then some other error which I have seen before but can not remember right now.
Nothing unusual though.

I encountered the same problem and fixed it by booting with a 11.04 (Natty) LiveCD (e2fsprogs @ 1.41.14-1ubuntu3). This should really be backported into the 10.04 LTS livecd if possible.

markb (mark-blakeney) wrote :

I am suffering this bug on my 10.10 laptop. It just decided to not boot anymore last night. According to TT, I downloaded the latest e2fsprogs 1.41.14 from sourceforge, built and installed it, but still get the error:

% sudo fsck /dev/sda3
fsck 1.41.14 (22-Dec-2010)
e2fsck 1.41.14 (22-Dec-2010)
fsck.ext4: Device or resource busy while trying to open /dev/sda3
Filesystem mounted or opened exclusively by another program?

So I don't think the latest e2fsprogs fixes this at all, or it is another problem (no - /dev/sda3 is not mounted anywhere). I am going to have to find a live CD to try and fix this very annoying bug. How ubuntu could leave a serious bug like this open in 10.10 just shocks me?

Theodore Ts'o (tytso) wrote :

Can you send me a raw e2image dump of your /dev/sda2 file system? Please see the man page of e2image, under "RAW IMAGE FILES", for more details, but basically it's something like this:

            e2image -r /dev/sda2 - | bzip2 > /tmp/sda2.e2i.bz2

This will allow me to reproduce the problem. I'm sure it's only happening with a very specific file system corruption, because it doesn't happen normally for most users.

markb (mark-blakeney) wrote :

Sorry, I am not keen to disclose my private data. I am OS atm with only my failed Ubuntu 10.10 laptop and a 10.10 live cd. It seems this is a problem in 10.10 only (i.e. kernel related?). I just went down to the nearest bookshop, bought a UK linux user magazine which had a xubuntu 11.04 live cd. Booted from that, ran fsck on my root partition, and all ok now. I remember it prompted me once to fix a single issue but sorry, I do not remember what the wording was.

Theodore Ts'o (tytso) wrote :

If you had read the e2image man page, you would have seen that it doesn't send any data blocks; just the file system metadata. It does send the directory and file names, but there is a scramble option that scrambles the file names leaving only the directory topology.

I'll note that Ubuntu 11.04 seems to be shipping e2fsprogs 1.41.14, so it should have been no different from what you were trying. It makes me wonder if there was something else going on, such as some other user daemon that was holding the block device open, or some such.

Anyway, all I can say is this isn't a problem I can reproduce, and if I can't reproduce it, I can't help people try to fix it, especially if they refuse to give me data that might help me determine why it's not working for you when it works for me....

markb (mark-blakeney) wrote :

I did read the man page and noted the scramble option. In fact, I did save a scrambled image before I recovered my partition, but the man page is vague on how the names are (one-way?) ciphered and the e2image.c code has few comments to help explain it, so I am not willing to publish my image anywhere. Are there some debugfs type commands you would like me to try on your behalf? Please contact me privately as this is a terrible bug that I think warrants your attention.

I have 2 similar partitions, sda2 and sda3. Both are ext4, exactly the same size, and were created about the same time. I could run fsck fine on sda2, but (the corrupted) sda3 always gave the "device busy" error as above. The 11.04 e2fsprogs_1.41.14_1ubuntu3 does seem pretty much the same as the generic 1.41.14 sources I had already tried so there must be something else unique about the 10.10 kernel/environment?

Theodore Ts'o (tytso) wrote :

The directory names are not enciphered, they are replaced by unique filenames AAA, BAA, CAA, DAA, etc.

The relevant section of code is:

  memset(cp, 'A', dirent->name_len);
  len = dirent->name_len;
  id = name_id[len]++;
  while ((len > 0) && (id > 0)) {
    *cp += id % 26;
    id = id / 26;
    cp++;
    len--;
  }

Note that the first thing done is to replace the file name with 'AAAA...'. The rest is to change it up so the filenames are unique.
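
To see what that loop produces, here is a small Python port of just the quoted snippet. It is a sketch for illustration, not the e2image implementation: the function names are invented, and anything e2image does outside the quoted lines is not modelled. The original name is only used for its length; its characters never reach the output.

```python
from collections import defaultdict


def make_scrambler():
    """Return a function that maps names to 'AAA', 'BAA', 'CAA', ... style names.

    Port of the quoted loop only; make_scrambler and scramble are
    hypothetical names, not part of e2fsprogs.
    """
    name_id = defaultdict(int)  # one counter per name length, like name_id[len]

    def scramble(name: bytes) -> bytes:
        length = len(name)
        chars = [ord("A")] * length      # memset(cp, 'A', dirent->name_len)
        ident = name_id[length]          # id = name_id[len]++
        name_id[length] += 1
        pos = 0
        while pos < length and ident > 0:
            chars[pos] += ident % 26     # *cp += id % 26
            ident //= 26                 # id = id / 26
            pos += 1                     # cp++; len--
        return bytes(chars)

    return scramble
```

Feeding three different three-character names through one scrambler yields AAA, BAA and CAA in that order, regardless of what the names were, which matches the sequence described above.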

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in e2fsprogs (Ubuntu):
status: New → Confirmed
Philipp Gampe (pgampe) wrote :

I think I got hit by this problem too.

Fedora 16
# e2fsck -V
e2fsck 1.41.14 (22-Dec-2010)
 Using EXT2FS Library version 1.41.14, 22-Dec-2010

The device is definitely not mounted.

However, I can mount and use the device as usual. But once I umount it (and it is no longer listed in /proc/mounts), I cannot run e2fsck on it.

---------------------------------------------------------------------------------------

# tune2fs -l /dev/sdb1
tune2fs 1.41.14 (22-Dec-2010)
Filesystem volume name: win7
Last mounted on: /media/win7
Filesystem UUID: 7617b06c-9916-47ca-96bf-3f9ddc6fc592
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: ext_attr resize_inode dir_index filetype sparse_super large_file
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: not clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 9898080
Block count: 39731200
Reserved block count: 1986560
Free blocks: 3582970
Free inodes: 9833445
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1014
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8160
Inode blocks per group: 510
Filesystem created: Sun Jan 9 15:10:44 2011
Last mount time: Sat Apr 28 15:54:11 2012
Last write time: Sat Apr 28 15:55:08 2012
Mount count: 4
Maximum mount count: 28
Last checked: Sat Apr 28 15:01:42 2012
Check interval: 15552000 (6 months)
Next check after: Thu Oct 25 15:01:42 2012
Lifetime writes: 176 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Default directory hash: half_md4
Directory Hash Seed: 233e63e3-27e2-4b33-987f-b932f2c81aa0

---------------------------------------------------------------------------------------

I cannot guarantee that there are no physical errors, but smartctl still says PASSED.
# smartctl -a /dev/sdb1
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.3.2-6.fc16.x86_64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda LP
Device Model: ST31500541AS
Serial Number: 9XW0ECDZ
LU WWN Device Id: 5 000c50 0206609c0
Firmware Version: CC34
User Capacity: 1.500.301.910.016 bytes [1,50 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sun Apr 29 01:19:47 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
     was never started.
     Auto Offline Data ...

Theodore Ts'o (tytso) wrote :

Looking at the strace which Philipp provided, what is happening is that an exclusive open of the device is returning EBUSY from the kernel:

open("/dev/sdb1", O_RDONLY|O_EXCL) = -1 EBUSY (Device or resource busy)

This means that (a) the device is mounted, (b) some other program has the device open, (c) /dev/sdb1 is being used as part of a RAID device by md, or (d) it is being used as a physical volume by LVM.

In any case, this isn't an e2fsck bug. The kernel is returning EBUSY; e2fsck is refusing to check the file system because in any of the above situations, it is dangerous to proceed. To find out why that's the case, you would have to debug the system init scripts and/or any whacko upstartd or systemd setups. All I can tell you is that with Ubuntu 10.04 and Debian Testing using System V init scripts, it works just fine for me.

I would recommend using lsof to see what other daemon or program might be holding /dev/sdb1 open. And also double checking to make sure /dev/sdb1 isn't being used by md or lvm.
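
Cases (a), (c) and (d) can be checked from what the kernel already exposes. The sketch below is a minimal illustration, not a complete diagnostic: it looks for the device in /proc/mounts-style text and lists the entries under the device's sysfs "holders" directory, which is where stacked md and device-mapper devices show up. The function names and the sysfs_root parameter are assumptions for illustration. Note that /proc/mounts may list the same file system under a different device name (for example a /dev/disk/by-uuid path), which this simple string match would miss.

```python
import os


def mounted_at(device, mounts_text):
    """Return the mount points listed for `device` in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device:
            points.append(fields[1])
    return points


def holders(device, sysfs_root="/sys/class/block"):
    """List kernel devices (md arrays, device-mapper targets) stacked on `device`.

    Reads <sysfs_root>/<name>/holders; returns [] if that directory is absent.
    """
    path = os.path.join(sysfs_root, os.path.basename(device), "holders")
    try:
        return sorted(os.listdir(path))
    except FileNotFoundError:
        return []
```

On a live system one would pass the contents of /proc/mounts to mounted_at("/dev/sdb1", ...) and call holders("/dev/sdb1"). If both come back empty, case (b) remains: a user-space program holding the device open.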

Philipp Gampe (pgampe) wrote :

I now tried in single user mode and there I could run the check. It even said it was clean? (Looking at what tune2fs said, that is - for me - surprising).

I did run lsof on the partition, but it did not return anything.
It might have been an unclean shutdown of gparted, leaving either parted or ntfsresize as zombies on /dev/sdb. That happened today, but was visible in lsof. But yesterday I did not run fdisk -l, which is quite useful to see more information on what the kernel knows.

I also ran the check with the -c option, but no error was mentioned. Everything turned out to be clean.

I then removed /dev/sdb1 from /etc/fstab and continued to boot normally. Then I could change the size of the partition (that was my initial goal). Afterwards I added it back and all seems to run smoothly now.

So if anyone runs into the same problem, here is what to check:
0. umount /dev/sdb1; mount |grep sdb1
1. lsof /dev/sdb1
2. fdisk -l
3. cat /proc/mounts

4. remove the partition from /etc/fstab and boot into single user mode (press e in GRUB and add "single" at the end of the kernel line, then press ctrl-x to boot). There, check again.

5. If all fails, come back here and ask again :)
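
For step 1, roughly what lsof reports can also be gathered by hand from /proc. The following is a sketch under the assumption that a process holding the device open has a /proc/<pid>/fd symlink pointing at the exact device path; the function name and the proc_root parameter are made up for illustration. Run it as root, since other users' fd directories are otherwise unreadable. Holders inside the kernel (a mount, md, device-mapper) do not appear here.

```python
import os


def find_openers(device, proc_root="/proc"):
    """Return sorted PIDs that have `device` open, judged by /proc/<pid>/fd links."""
    pids = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "cpuinfo"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited meanwhile, or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed between listdir and readlink
            if target == device:
                pids.add(int(entry))
    return sorted(pids)
```

An empty result from find_openers("/dev/sdb1") while e2fsck still reports EBUSY would point towards a kernel-side holder rather than a stray user-space process.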

Theodore Ts'o (tytso) wrote :

Philipp, the reason why tune2fs showed that it was not clean is either (a) because in fact the file system was mounted, or (b) the system had previously crashed while the file system was mounted.

If in fact e2fsck later said that the file system was clean, then that's actually a pretty good indication that the file system *was* mounted. It may be that recent Ubuntu or Fedora scripts are buggy with respect to making sure /etc/mtab is properly updated, especially in single user mode. (Again, all of this worked fine using standard System V init scripts, before the various distributions started descending into the madness which is Upstart or Systemd.)

DaveQB (david-dward) wrote :

Just another thought that helped me with another issue...run:

/sbin/dmsetup table

If disk is listed there, it will be marked busy.

Philipp Gampe (pgampe) wrote :

@DaveQB nope, not listed there either (I now have the same problem again). Not in /etc/mtab or /proc/mounts. lsof does not return anything either. I will now go to single user mode again and look there, but I think this is a Fedora bug. Any hint where else the kernel stores information about mounted devices?

willdeans (william-deans) wrote :

#####################################################
# possible bug encountered running EXT2FS 1.42
# open call on device with no corresponding close
#####################################################

deans@deans2188:~$ sudo e2fsck -V
e2fsck 1.42 (29-Nov-2011)
 Using EXT2FS Library version 1.42, 29-Nov-2011
deans@deans2188:~$ sudo fsck.ext4 -V
e2fsck 1.42 (29-Nov-2011)
 Using EXT2FS Library version 1.42, 29-Nov-2011

deans@deans2188:~$ sudo fsck.ext4 -v /dev/sdb
e2fsck 1.42 (29-Nov-2011)
fsck.ext4: Group descriptors look bad... trying backup blocks...
fsck.ext4: Bad magic number in super-block when using the backup blocks
fsck.ext4: going back to original superblock
fsck.ext4: Group descriptors look bad... trying backup blocks...
fsck.ext4: Bad magic number in super-block when using the backup blocks
fsck.ext4: going back to original superblock
fsck.ext4: Device or resource busy while trying to open /dev/sdb
Filesystem mounted or opened exclusively by another program?

#####################################################
# additional (log) messages
#####################################################

#dmesg after boot
deans@deans2188:~$ sudo dmesg
...
[ 45.780347] EXT4-fs (sdb): ext4_check_descriptors: Block bitmap for group 0 not in group (block 3111236179)!
[ 45.780355] EXT4-fs (sdb): group descriptors corrupted!

#dmesg after un/replugging usb cord
...
[ 45.780347] EXT4-fs (sdb): ext4_check_descriptors: Block bitmap for group 0 not in group (block 3111236179)!
[ 45.780355] EXT4-fs (sdb): group descriptors corrupted!
[ 4111.995729] usb 2-1: USB disconnect, device number 2
[ 4128.712144] usb 2-2: new high-speed USB device number 3 using ehci_hcd
[ 4128.846614] scsi6 : usb-storage 2-2:1.0
[ 4129.845107] scsi 6:0:0:0: Direct-Access ST315003 41AS PQ: 0 ANSI: 2 CCS
[ 4129.846813] sd 6:0:0:0: Attached scsi generic sg1 type 0
[ 4129.847307] sd 6:0:0:0: [sdb] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
[ 4129.848657] sd 6:0:0:0: [sdb] Write Protect is off
[ 4129.848667] sd 6:0:0:0: [sdb] Mode Sense: 34 00 00 00
[ 4129.849406] sd 6:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4129.869964] sdb: unknown partition table
[ 4129.872645] sd 6:0:0:0: [sdb] Attached SCSI disk
[ 4130.080293] EXT4-fs (sdb): ext4_check_descriptors: Block bitmap for group 0 not in group (block 3111236179)!
[ 4130.080301] EXT4-fs (sdb): group descriptors corrupted!

#Ubuntu Popup Window Title
Unable to mount 1.5 TB Filesystem

#Ubuntu Popup Window Text
Error mounting: mount: wrong fs type, bad option, bad superblock on /dev/sdb,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so

#####################################################
# machine is up to date as per Ubuntu 12.04.1 LTS
#####################################################

deans@deans2188:~$ uname -a
Linux deans2188 3.2.0-35-generic #55-Ubuntu SMP Wed Dec 5 17:42:16 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
deans@deans2188:~$ cat /etc/issue
Ubuntu 12.04.1 LTS \n \l
deans@deans2188:~$ sudo apt-get update
... nothing
deans@deans2188:~$ sudo...

willdeans (william-deans) wrote :

"In any case, this isn't an e2fsck bug. The kernel is returning EBUSY;" <-- why couldn't the kernel be returning EBUSY because of an e2fsck bug which attempts to open the device twice (without first closing it)?

Why do I see in the E2fsprogs 1.41.13 (December 13, 2010) release notes:

http://e2fsprogs.sourceforge.net/e2fsprogs-release.html
http://git.kernel.org/?p=fs/ext2/e2fsprogs.git;a=blob;f=RELEASE-NOTES

Fixed a bug in e2fsck where if both the original and backup superblock
are invalid in some way, e2fsck will fail going back to the original
superblock because it didn't close the backup superblock first, and
the exclusive open prevented the file system from being reopened.

Notice I am having this problem on version 1.42 so either this isn't the same issue as fixed in 1.41.13, the fix didn't work, or it somehow isn't in 1.42.

Theodore Ts'o (tytso) wrote :

@willdeans: thanks for the strace. It was very useful indeed. Can you try e2fsprogs 1.42.6 and see if the problem still occurs with that version? The fact that you are seeing the "falling back to backup descriptors", "nope, that didn't work", etc., sequence twice was something that was fixed in 1.42.2, and 1.42.6 is the latest released version. It's after the second "nope, that didn't work, going back to the original superblock" that you're getting the EBUSY error, so the newest version of e2fsck should hopefully fix this for you.

Note to others: willdeans' problem may be different from others. If you are using some other version of e2fsprogs, ***please*** open a new bug and don't just glom onto this bug. It's one of the things that I loathe about Launchpad (or at least how most Ubuntu users tend to use Launchpad). People mix up bugs with superficially similar symptoms, and make it very hard to impossible to disentangle bugs that really should have been kept separate. Thanks!!!

It's always better to open a new bug, and if it turns out to be the same as another bug, we can always declare it a duplicate later, after we are 100% sure it is a duplicate.
