ext4 on mmcblk card causes major kernel problems

Bug #913860 reported by Stefan Bader
This bug affects 4 people
Affects          Status         Importance   Assigned to   Milestone
Debian           Fix Released   Unknown
linux (Ubuntu)   Won't Fix      Medium       Unassigned

Bug Description

Release: Precise
Architecture: i386

Testcase:
- Have an ext4-formatted partition on an mmcblk device mounted
- Start a shell and change into a directory on that sd card
- Suspend and Resume

Result: The kernel fails to forcefully unmount the partition and produces a lot of errors because re-discovery of the filesystem runs into duplicate proc and sysfs files. The fs cannot be mounted again until the shell leaves the directory.
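
To see which processes are holding the card's filesystem through their working directory (the shell in the testcase above), something like the following could be used. It is only a sketch: it relies on the /proc/<pid>/cwd symlinks, and the function name and the proc_root parameter are made up for illustration.

```python
import os


def pids_with_cwd_under(mount_point, proc_root="/proc"):
    """Return PIDs whose current working directory is at or below mount_point."""
    mount_point = os.path.normpath(mount_point)
    prefix = mount_point.rstrip("/") + "/"
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # only numeric entries are processes
        try:
            cwd = os.readlink(os.path.join(proc_root, name, "cwd"))
        except OSError:
            continue  # process exited, or we lack permission to read it
        if cwd == mount_point or cwd.startswith(prefix):
            pids.append(int(name))
    return sorted(pids)


if __name__ == "__main__":
    # /mnt is just the mount point used in the manual-mount case of this report.
    print(pids_with_cwd_under("/mnt"))
```

Running it as root gives the most complete picture, since the cwd links of other users' processes are otherwise unreadable.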

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-8-generic 3.2.0-8.14
ProcVersionSignature: Ubuntu 3.2.0-8.14-generic 3.2.0
Uname: Linux 3.2.0-8-generic i686
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC268 Analog [ALC268 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
ApportVersion: 1.90-0ubuntu1
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC268 Analog [ALC268 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: test 1445 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0x78540000 irq 45'
   Mixer name : 'Realtek ALC268'
   Components : 'HDA:10ec0268,1025015b,00100101'
   Controls : 13
   Simple ctrls : 8
Date: Mon Jan 9 17:02:53 2012
HibernationDevice: RESUME=UUID=a6633ae3-3737-4a80-b4af-ba0012580b02
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Alpha i386 (20110819)
MachineType: Acer AOA110
ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-8-generic root=UUID=9c8a3303-f560-4677-9c67-c970b7180082 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-8-generic N/A
 linux-backports-modules-3.2.0-8-generic N/A
 linux-firmware 1.67
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 10/06/2008
dmi.bios.vendor: Acer
dmi.bios.version: v0.3310
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.vendor: Acer
dmi.board.version: Base Board Version
dmi.chassis.type: 1
dmi.chassis.vendor: Chassis Manufacturer
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAcer:bvrv0.3310:bd10/06/2008:svnAcer:pnAOA110:pvr1:rvnAcer:rn:rvrBaseBoardVersion:cvnChassisManufacturer:ct1:cvrChassisVersion:
dmi.product.name: AOA110
dmi.product.version: 1
dmi.sys.vendor: Acer

Stefan Bader (smb) wrote :
Stefan Bader (smb)
Changed in linux (Ubuntu):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
importance: Undecided → Medium
Stefan Bader (smb)
Changed in linux (Ubuntu):
status: New → In Progress
Stefan Bader (smb) wrote :

Getting back to this issue: it still happens with a fairly recent upstream kernel (3.4-rc3). After experimenting a bit more, there seem to be several cases:

1. automatically mounted by udisk under /media and not having a shell changed into that mount dir.
   -> this seems to work, after resuming a new nautilus window appears as with a freshly inserted card
2. automatically mounted by udisk but having a shell changed into the mount dir.
   -> Shows the mount error, the mountpoint does _not_ show up in df or /proc/mounts and typing ls
      in the shell results in an IO error.
3. manually mounted (to /mnt) without shell changed into the mount dir.
   -> Shows the mount error, df and /proc/mounts do contain the mount point, typing ls results
      in the IO error.
4. manually mounted and changing into the mount dir seems the same as 3.
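
The difference between cases 2 and 3 (mount point gone from /proc/mounts versus still listed) can be checked after resume with a small helper along these lines. This is only a sketch; the function names are invented here, and it assumes the usual whitespace-separated /proc/mounts layout with the mount point in the second field.

```python
import os


def mounted_points(mounts_text):
    """Return the set of mount points found in /proc/mounts-style text."""
    points = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # The kernel escapes spaces in paths as the octal sequence \040.
        points.add(fields[1].replace("\\040", " "))
    return points


def is_mounted(mount_point, mounts_path="/proc/mounts"):
    """Check whether mount_point is currently listed in the mounts table."""
    with open(mounts_path) as f:
        return os.path.normpath(mount_point) in mounted_points(f.read())


if __name__ == "__main__":
    # /mnt is the manual mount point from case 3.
    print(is_mounted("/mnt"))
```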

Stefan Bader (smb) wrote :

Adding some debug info shows that the mount problem (/proc/fs/ext4/mmcblk0p1, /proc/fs/jbd2/mmcblk0p1-8, /sys/fs/ext4/mmcblk0p1 and an internal kobject named mmcblk0p1 being created while they still exist) is at least made worse by the fact that ext4 seems to create its /sys and /proc entries when a block device is accessed.
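
Whether those entries are still lingering after the card was removed can be checked from userspace, roughly like this. It is a sketch only: the function name and the root parameter are invented for illustration, and it assumes the ext4 sysfs directory lives under /sys/fs/ext4 and that the jbd2 entry is named after the device plus a numeric suffix, as in the paths above.

```python
import glob
import os


def leftover_entries(dev, root="/"):
    """List proc/sysfs entries that still exist for the block device name dev."""
    candidates = [
        os.path.join(root, "proc/fs/ext4", dev),
        os.path.join(root, "sys/fs/ext4", dev),
    ]
    found = [path for path in candidates if os.path.exists(path)]
    # The jbd2 entry carries a suffix (mmcblk0p1-8 in this report), so glob for it.
    pattern = os.path.join(root, "proc/fs/jbd2", glob.escape(dev) + "-*")
    found += sorted(glob.glob(pattern))
    return found


if __name__ == "__main__":
    for path in leftover_entries("mmcblk0p1"):
        print("still present:", path)
```

An empty result after the card has gone away would suggest the old fs instance was torn down; any remaining path is one that a fresh mount would collide with.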

So on the way to suspend, at least mmc_remove_card() gets called (as it is intended to protect against cards being changed). This removes the card from the mmc bus and causes mmc_blk_remove() to be called. This in turn calls mmc_blk_remove_req() for each partition and the main device and that theoretically would call del_gendisk() (but only if GENHD_FL_UP is set).

So next step is to try to find out whether this is true or not...

Stefan Bader (smb) wrote :

It seems I am about to end up at the same conclusion as before (I just failed to document it back then). The block device is removed as intended (del_gendisk is called). From there, though, there seems to be no way to handle it sanely.
I suspect the udisk-assisted mount under /media also creates a watch (fsnotify/inotify) that will try to do an unmount when the block device goes away. If anything is using the fs at that point, things get screwed. While the entries in /proc/mounts (and probably /etc/mtab as well) somehow go away, the fs instance remains dangling somewhere and causes a new mount to fail.
For manual mounts it's even worse, because there is no udev rule that would even try to unmount when a remove event takes place.

And despite the scary aspect of getting those pop-ups about the mount failure after suspend, it probably is safer to see immediately that there is a problem than to have the fs in use by some application and then only get unexpected IO errors.

There is a manual workaround (which also has its own dangers): the mmc core accepts a parameter removable (which is also settable through sysfs). If that is set to 0, then the card is really only suspended. Anybody replacing the card while in suspend will be in pain though...
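
For illustration, toggling that parameter at runtime could look roughly like this. It is a sketch under assumptions: the sysfs path /sys/module/mmc_core/parameters/removable is where I would expect the parameter on kernels of this era, it may not exist on every kernel or configuration, and the function names are made up. Writing to it needs root.

```python
# Assumed location of the mmc core "removable" parameter; verify it on the
# running kernel before relying on it.
MMC_REMOVABLE_PARAM = "/sys/module/mmc_core/parameters/removable"


def is_removable(path=MMC_REMOVABLE_PARAM):
    """Return True if the parameter currently reads as enabled."""
    with open(path) as f:
        # Boolean module parameters usually read back as Y or N.
        return f.read().strip() in ("Y", "y", "1")


def set_removable(enabled, path=MMC_REMOVABLE_PARAM):
    """Write 1 or 0 to the parameter."""
    with open(path, "w") as f:
        f.write("1" if enabled else "0")


if __name__ == "__main__":
    # Treat cards as non-removable so they are only suspended, not removed.
    # Do not swap the card while the machine is suspended with this setting.
    set_removable(False)
    print("removable:", is_removable())
```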

Stefan Bader (smb)
Changed in linux (Ubuntu):
status: In Progress → Won't Fix
assignee: Stefan Bader (stefan-bader-canonical) → nobody
Adam Porter (alphapapa) wrote :

Please explain why this was marked wontfix. I'm having this problem on Raring. I put an ext4 partition on an SDHC card that I'm testing for backup purposes, and I'm hitting kernel bugs that prevent access to the filesystem, prevent unmounting, and even cause unrelated I/O to fail, forcing a hard reset of the system. This needs to be fixed.

summary: - Suepend/resume causes problems with mounted and active mmcblk device
+ ext4 on mmcblk card causes major kernel problems
Stefan Bader (smb) wrote :

If you are trying to keep an SD card with ext3/4 on it in an mmc-controlled slot across suspend/resume, read comment #4. It is just not possible to cope with the driver's default behaviour of forcefully removing the drive. But you can avoid it by changing the removable option.
If you have issues that are not related to suspend/resume, please open a new bug.

Martin Pavelek (he29-hs)
Changed in debian:
importance: Undecided → Unknown
status: New → Unknown
Changed in debian:
status: Unknown → New
Changed in debian:
status: New → Fix Released