SDHC card not recognized

Bug #591941 reported by Mathieu Poirier
This bug affects 3 people
Affects         Status        Importance  Assigned to  Milestone
linux (Ubuntu)  Fix Released  High        Lee Jones
Maverick        Fix Released  High        Lee Jones

Bug Description

Maverick kernel fails to initialize the SDHC card properly upon cold startup. Here are a few facts:

 - started after the rebase with 2.6.34
 - some SD cards work, others don't.
 - when the initrd can't mount the root partition, you get dropped into a busybox shell. While waiting in the shell, if the card is removed and reinserted, the kernel recognizes the card properly. From there the root partition can be mounted manually.
 - when booting, the line "[ 2.283355] mmc0: error -110 whilst initialising SD card" indicates that the kernel can't recognize the SD card.
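
For anyone matching this against their own logs: kernel error codes are negated errno values, and on Linux errno 110 is ETIMEDOUT, which fits the timeout behaviour described later in this report. A minimal Python sketch for decoding such lines follows; the regular expression and the function name `decode_mmc_error` are illustrative assumptions, not part of any kernel tooling.

```python
import errno
import os
import re

# Matches kernel log lines such as:
#   [ 2.283355] mmc0: error -110 whilst initialising SD card
MMC_ERROR = re.compile(r"(mmc\d+): error (-\d+) whilst initialising")


def decode_mmc_error(line):
    """Return (host, errno_name, description) for an mmc init error line, else None."""
    match = MMC_ERROR.search(line)
    if match is None:
        return None
    code = -int(match.group(2))  # the kernel prints errno values negated
    name = errno.errorcode.get(code, "UNKNOWN")  # symbolic name on this platform
    return match.group(1), name, os.strerror(code)


print(decode_mmc_error("[ 2.283355] mmc0: error -110 whilst initialising SD card"))
```

The numeric mapping is platform specific, so this is best run on the target or another Linux machine.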

Changed in linux (Ubuntu):
status: New → Confirmed
Oliver Grawert (ogra)
tags: added: armel
Changed in linux (Ubuntu Maverick):
importance: Undecided → High
milestone: none → maverick-alpha-2
Mathieu Poirier (mathieu.poirier-deactivatedaccount) wrote :

- posted request for help on "<email address hidden>"
- got in contact with Jayabharath to get access to the original TI developers.
- have been working with TI developers on the problem.
- issue is related to DMA - transfers are coming back with a timeout error when preemption model is set to PREEMPT_VOLUNTARY.
- got a patch from TI to set timeouts to something overly long but still no resolution.

Changed in linux (Ubuntu Maverick):
milestone: maverick-alpha-2 → maverick-alpha-3
Changed in linux (Ubuntu Maverick):
status: Confirmed → Triaged
status: Triaged → In Progress
tags: added: iso-testing
Changed in linux (Ubuntu Maverick):
milestone: maverick-alpha-3 → ubuntu-10.10-beta
Lee Jones (lag)
Changed in linux (Ubuntu Maverick):
assignee: Mathieu Poirier (mathieu.poirier) → Lee Jones (lag)
Robert Nelson (robertcnelson) wrote :

Hi Lee,

Sorry i didn't get back on irc like i thought, some fun meetings..

Just confirmed with the image I downloaded this morning on my XM "p7" 256Mb beagle...

2.2G 2010-08-01 21:16 maverick-preinstalled-netbook-armel+omap.img

Same issue with my 4GB Sandisk..

dmesg log: (enabled serial port on boot) http://pastebin.com/R0tTmXjE

Interesting, this card works just fine with my kernel/config... So it may be a patch/config difference..?

For comparison's sake, I ran revision 52 of https://code.launchpad.net/~beagleboard-kernel/+junk/2.6.35-devel last night with the same 4GB Sandisk Micro SD card..

Regards,

Mathieu Poirier (mathieu.poirier-deactivatedaccount) wrote :

Robert,

I've seen this issue with cards of all types and sizes. Aside from the above-stated trick of configuring the kernel with PREEMPT_NONE rather than PREEMPT_VOLUNTARY, disabling the "CPU Power Management" options (CONFIG_CPU_FREQ and CONFIG_CPU_IDLE) will also fix the problem.

Experimenting with CONFIG_CPU_IDLE, I noticed the failure comes from switching from the "ladder" to the "menu" governor. On their own they both work fine. The problem happens when the kernel decides to switch to "menu" after finding that governor's higher rating (20).
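
The selection rule described here can be modelled in a few lines. This is a toy Python model, not kernel code: it assumes that a newly registered governor replaces the current one only when its rating is strictly higher, that "ladder" registers before "menu", and that "ladder" has a rating of 10 (only the rating of 20 for "menu" comes from this comment). The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Governor:
    name: str
    rating: int


class CpuidleModel:
    """Toy model: the highest-rated governor registered so far is the current one."""

    def __init__(self):
        self.current = None
        self.switches = []  # history of (old_name, new_name) transitions

    def register(self, gov):
        # Assumed rule: switch only when the newcomer has a strictly higher rating.
        if self.current is None or self.current.rating < gov.rating:
            old = self.current.name if self.current else None
            self.switches.append((old, gov.name))
            self.current = gov


model = CpuidleModel()
model.register(Governor("ladder", 10))  # rating 10 is an assumption
model.register(Governor("menu", 20))    # rating 20 is quoted in the comment above
print(model.current.name, model.switches)
```

In this model the second registration is the point where a live switch from "ladder" to "menu" happens, which is the transition the failure was traced to.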

Robert Nelson (robertcnelson) wrote :

Hi Mathieu,

It might be CONFIG_CPU_IDLE, as I have all the rest enabled in my config..

I have always disabled that in the past due to the serial port becoming corrupt... (well only tested 2.6.26 thru 2.6.34 on the beagle with it enabled..)

cat patches/lucid-defconfig | grep CONFIG_CPU_FREQ=
CONFIG_CPU_FREQ=y

cat patches/lucid-defconfig | grep PREEMPT=
CONFIG_PREEMPT=y
CONFIG_DEBUG_PREEMPT=y

Regards,

Lee Jones (lag) wrote :

Thanks for getting back to me Robert.

Would you mind enabling CONFIG_CPU_IDLE on your own kernel and testing?

It would be good to at least pin the issue down to one config option.

Ping me on IRC if you wish to discuss.

Robert Nelson (robertcnelson) wrote :

Hi Lee,

No dice..

I modified my config with:

-# CONFIG_CPU_IDLE is not set
+CONFIG_CPU_IDLE=y
+CONFIG_CPU_IDLE_GOV_LADDER=y
+CONFIG_CPU_IDLE_GOV_MENU=y

It finds the mmc card just fine, mounts card, login, etc...

Boot Log: http://pastebin.com/4vXH1yx4

Regards,

Mathieu Poirier (mathieu.poirier-deactivatedaccount) wrote : Re: [Bug 591941] Re: SDHC card not recognized

Ubuntu has a very complex kernel config file with most options enabled.
Enabling CONFIG_CPU_IDLE won't trigger the failure in upstream
2.6.35-rc6. In the Ubuntu tree it is enabled and seems to be related to
the failure.

Code-wise, there isn't much difference between the Ubuntu tree and
upstream. In my opinion we are faced with some sort of feature
interaction issue that is not seen by the community because the right
options (to cause the failure) aren't selected in the default
omap3_beagle.

I am currently working on a .config based on 2.6.35-rc6 that will
exhibit the failure with the hope of getting help from a broader range
of people.

Today I was able to reliably *fix* the problem by removing a printk in
one of the governors (drivers/cpuidle/governor.c:69). Since the exact
same line doesn't cause a problem in upstream, I assume this is a symptom
rather than the cause.

Mathieu Poirier (mathieu.poirier-deactivatedaccount) wrote :

This is the upstream version of the Ubuntu config file that causes the mmc card to fail at boot. It was generated with upstream 2.6.35-rc6. You should be able to simply check out the tree, run make oldconfig and make uImage, and get a binary that exhibits the behavior.

My u-boot "bootargs" looks like: 'console=tty0 console=ttyS2,115200n8 root=/dev/mmcblk0p2 rootwait ro vram=12M omapfb.mode=dvi:1280x720MR-16@60 fixrtc'

Mathieu Poirier (mathieu.poirier-deactivatedaccount) wrote :

This is a uImage that was generated by compiling 2.6.35-rc6 with the above posted config file. To know if your card is subject to the failure, simply boot with this kernel and watch for the "mmc0: error -110 whilst initialising SD card" error message.

Obviously, booting with this kernel will generate a lot of errors related to modules not being found. Depending on the rootfs you may not even get to a bash prompt but this isn't the goal of the exercise.

md5sum: 6d6dd595deb5ba0f6f114a62cee243fe

Robert Nelson (robertcnelson) wrote :

Thanks Mathieu for the updated config.

I've been config hunting since yesterday, my baseline ubuntu.config was just a little off: http://pastebin.com/dg4wiEZM

So far nothing obvious is causing it...

Robert Nelson (robertcnelson) wrote :

Losing my mind, so let's reset some things.. ;)

Mathieu, did you happen to boot the uImage posted in Message #9 on an XM?

I get this on my XM:

mmc1 is available
reading uImage

3606988 bytes read
## Booting kernel from Legacy Image at 80300000 ...
   Image Name: Linux-2.6.35-rc6
   Image Type: ARM Linux Kernel Image (uncompressed)
   Data Size: 3606924 Bytes = 3.4 MB
   Load Address: 80008000
   Entry Point: 80008000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.

(hard locks/stops/dies...)

Anyways, here's the reason for the reset, using the config from Message #8 with my 2.6.35 tree (which has the xm patches* i've linked over on irc to Lee, some are in ubuntu's tree) and using my angstrom 4.3.1** cross compiler:

[ 2.364532] mmc0: host does not support reading read-only switch. assuming write-enable.
[ 2.372741] mmc0: new high speed SDHC card at address c555
[ 2.378723] mmcblk0: mmc0:c555 SU04G 3.69 GiB
[ 2.384368] hub 1-2:1.0: USB hub found
[ 2.388305] mmcblk0:
[ 2.390930] hub 1-2:1.0: 5 ports detected
[ 2.395355] p1 p2

(same sd card that fails in Message #2)

* give me a couple hours to rebase it into: 2.6.35 + "Required" xm patches.. (my zippy1/2 patches in omap3beagle.c make it a mess at this exact point in time...)

** I'll download this: http://people.canonical.com/~hrw/ubuntu-maverick-armel-cross-compilers/ and retest tonight..

Regards,

Mathieu Poirier (mathieu.poirier-deactivatedaccount) wrote :

The uImage I posted is for a beagleboard C4 - I haven't tried with a XM.

Tobin Davis (gruemaster) wrote :

I tried the above kernel on my known-to-fail sd card and it failed to find the root partition. See attached log. Oddly enough, I didn't see the older -110 error.

Robert Nelson (robertcnelson) wrote :

Okay, finally replicated the -110 error on 2.6.35 with my XM

Minimal 2.6.35 tree, using Mathieu's config, on my XM:

https://code.launchpad.net/~beagleboard-kernel/+junk/2.6.35-XM

uImage: http://rcn-ee.homeip.net:81/testing/lp-591941/

PS. I was under the impression this "-110" error was only an XM problem; are users getting this on the Bx/Cx too?

Regards,

Tobin Davis (gruemaster) wrote :

I am seeing it on my C4 beagleboard. I have tried this kernel on an SD card with a working lucid image (lucid kernel boots to it just fine). In my testing, I found that anything other than my Class 4 SDHC cards exhibited this behavior. I have tried Class 2, Class 4 and Class 6. 2G, 4G, 8G, and 16G.

Mathieu Poirier (mathieu.poirier-deactivatedaccount) wrote :

Glad you were able to reproduce - the failure is indeed present on the
Cx models but I wouldn't know about the Bx.

Robert Nelson (robertcnelson) wrote :

Success... My Bx and XM work great with 6da20c89af64b75302399369a90b9d50c1a87665 reverted..

At first we looked at the configs; the kernel was bootable again after disabling these:

CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
# CONFIG_SND_SOC is not set

So it looked like the SND_SOC was messing with the mmc clock..

After ruling out 2.6.34-rc1 -> 2.6.35, I started a git bisect between 2.6.33 and 2.6.34-rc1, which led to:

 git bisect start
 git bisect good v2.6.33
 git bisect bad v2.6.34-rc1
 git bisect good 47871889c601d8199c51a4086f77eebd77c29b0b
 git bisect bad 1154fab73ccbab010cfaa272b6987c624cfd63c6
 git bisect good 94015f6e6ba11040f75f4b42aada8de23965290e
 git bisect bad 3ff1562ea48cddaa5ac1adcb8892227389a4c96c
 git bisect bad b610ec502376d915b76a62e22576c5d0462cc9c9
 git bisect bad e3d4d0a2385593e7873e7d7688eeffea949facff
 git bisect good 918cae14872c56446415299fc17cf98704c9a537
 git bisect good 97ec7d585b33bbcc2be92dafa05b540959b4ea47
 git bisect good c2798e9342a1394de966c31703e0410ee3988378
 git bisect good 1df58db8a25ec7656005f1dd161a9ede044551b7
 git bisect bad e0eb2424469ec2333885672d3db8bd07d322455d
 git bisect bad 6da20c89af64b75302399369a90b9d50c1a87665
 git bisect good 4380eea266940a82e5b8edd5c16ce0289679bcfe

Bisecting: 3129 revisions left to test after this (roughly 12 steps)
[47871889c601d8199c51a4086f77eebd77c29b0b] Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Bisecting: 1607 revisions left to test after this (roughly 11 steps)
[1154fab73ccbab010cfaa272b6987c624cfd63c6] SLUB: Fix per-cpu merge conflict
Bisecting: 760 revisions left to test after this (roughly 10 steps)
[94015f6e6ba11040f75f4b42aada8de23965290e] USB: BKL removal: cdc-wdm
Bisecting: 358 revisions left to test after this (roughly 9 steps)
[3ff1562ea48cddaa5ac1adcb8892227389a4c96c] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Bisecting: 187 revisions left to test after this (roughly 8 steps)
[b610ec502376d915b76a62e22576c5d0462cc9c9] Merge branch 'for_2.6.34_b' of git://git.pwsan.com/linux-2.6 into omap-for-linus
Bisecting: 106 revisions left to test after this (roughly 7 steps)
[e3d4d0a2385593e7873e7d7688eeffea949facff] AM35xx: Introduce am35xx.h file
Bisecting: 52 revisions left to test after this (roughly 6 steps)
[918cae14872c56446415299fc17cf98704c9a537] Merge branch 'for-tony' of git://gitorious.org/linux-omap-dss2/linux into omap-for-linus
Bisecting: 26 revisions left to test after this (roughly 5 steps)
[97ec7d585b33bbcc2be92dafa05b540959b4ea47] omap iommu: cleanup iommu page address mask and definitions
Bisecting: 13 revisions left to test after this (roughly 4 steps)
[c2798e9342a1394de966c31703e0410ee3988378] omap3: SDP: Introducing 'board-sdp-flash.c' for flash init
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[1df58db8a25ec7656005f1dd161a9ede044551b7] omap_hsmmc: Allow for power saving without going off
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[e0eb2424469ec2333885672d3db8bd07d322455d] omap_hsmmc: Allow for a shared VccQ
Bisecting: 0 revisions left to test after this (roughly 1 step)
[6da20c89af64b75302399369a90b9d50c1a87665] omap_hsmmc...
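
The step estimates in this log follow from bisection halving the candidate range on every test. The Python sketch below counts how many halvings are needed for a given number of remaining revisions and compares that with the figures git printed above. It is a simplified estimate under that halving assumption, not git's own estimator, and the function name `halving_steps` is made up for illustration.

```python
def halving_steps(remaining):
    """Number of halvings needed to narrow `remaining` candidates down to one."""
    steps = 0
    span = 1
    while span < remaining:
        span *= 2
        steps += 1
    return steps


# (revisions left, steps reported by git) pairs copied from the log above.
LOG = [
    (3129, 12), (1607, 11), (760, 10), (358, 9), (187, 8), (106, 7),
    (52, 6), (26, 5), (13, 4), (6, 3), (3, 2),
]

for remaining, reported in LOG:
    print(remaining, "revisions: git said", reported, "halving gives", halving_steps(remaining))
```

This is why a range of several thousand commits between two releases can be narrowed to a single commit in roughly a dozen test boots.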


papukaija (papukaija)
tags: added: maverick
Lee Jones (lag) wrote :

Good work gentlemen.

Mathieu, are you going to get this pushed today, or would you like me to?

Also, can you provide me with a test kernel, so I can test it for myself?

Robert Nelson (robertcnelson) wrote :

Sure,

Mainline 2.6.35 + Mathieu's config, - revert of 6da20c89

http://rcn-ee.homeip.net:81/testing/lp-591941/2.6.35-u2.uImage

Regards,

Lee Jones (lag) wrote :

Works for me.

Changed in linux (Ubuntu Maverick):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.35-17.23

---------------
linux (2.6.35-17.23) maverick; urgency=low

  [ Jeremy Kerr ]

  * [Config] build-in uinput module
    - LP: #584812

  [ Leann Ogasawara ]

  * Revert "[Config] [FTBS] ia64: Temporarily disable CONFIG_CEPH_FS"
  * Revert "[Config] [FTBS] ia64: Temporarily disable gpiolib"
  * Revert "[Config] [FTBS] sparc: Temporarily disable
    CONFIG_MTD_NAND_DENALI"
  * Revert "[Config] [FTBS] sparc: Temporarily disable
    CONFIG_MFD_JANZ_CMODIO"
  * Revert "[Config] [FTBS] sparc: Temporarily disable
    CONFIG_INFINIBAND_QIB"
  * [Config] Enable INTEL_IPS
    - LP: #601057
  * Remove ia64 support
  * [Config] Update portsconfigs after removing ia64 support
  * Remove sparc support
  * [Config] Update portsconfigs after removing sparc support

  [ Linus Torvalds ]

  * (pre-stable) mm: fix page table unmap for stack guard page properly

  [ Mathieu J. Poirier ]

  * SAUCE: (no-up) ARM: Resetting power_mode to its original value.
    - LP: #591941

  [ Upstream Kernel Changes ]

  * timer: add on-stack deferrable timer interfaces
    - LP: #601057
  * x86 platform driver: intelligent power sharing driver
    - LP: #601057
  * IPS driver: add GPU busy and turbo checking
    - LP: #601057
  * X86: intel_ips, check for kzalloc properly
    - LP: #601057
  * ips driver: make it less chatty
    - LP: #601057
 -- Leann Ogasawara <email address hidden> Tue, 17 Aug 2010 09:38:08 -0700

Changed in linux (Ubuntu Maverick):
status: Fix Committed → Fix Released
Aapo Rantalainen (aapo-rantalainen) wrote :

I know this is a two-year-old ticket.

I encountered this same 'sd card initialization error with low quality cards' on omap-linux.
Upstream kernels from 2.6.35 to 3.6 have "host->power_mode = MMC_POWER_OFF;" (drivers/mmc/host/omap_hsmmc.c), and with it initialization fails.

Ubuntu kernels from 2.6.35-17.23 to (at least) 3.2.0-23.36 have it changed (/reverted) to "host->power_mode = -1;".

I made this same change and got my SD card initialized and working. That was good, but I also got a kernel warning during boot:

+------------[ cut here ]------------
+WARNING: at drivers/regulator/core.c:1371 _regulator_disable+0x44/0x11c()
+unbalanced disables for VMMC2_IO_18
+Modules linked in:
+[<c005448c>] (unwind_backtrace+0x0/0x120) from [<c008109c>] (warn_slowpath_common+0x54/0x6c)
+[<c008109c>] (warn_slowpath_common+0x54/0x6c) from [<c008114c>] (warn_slowpath_fmt+0x34/0x44)
+[<c008114c>] (warn_slowpath_fmt+0x34/0x44) from [<c02b1c2c>] (_regulator_disable+0x44/0x11c)
+[<c02b1c2c>] (_regulator_disable+0x44/0x11c) from [<c02b1d38>] (regulator_disable+0x34/0x70)
+[<c02b1d38>] (regulator_disable+0x34/0x70) from [<c03523b4>] (omap_hsmmc_23_set_power+0xac/0x100)
+[<c03523b4>] (omap_hsmmc_23_set_power+0xac/0x100) from [<c0350c0c>] (omap_hsmmc_set_ios+0x78/0x2dc)
+[<c0350c0c>] (omap_hsmmc_set_ios+0x78/0x2dc) from [<c03449c4>] (mmc_power_off+0x9c/0xac)
+[<c03449c4>] (mmc_power_off+0x9c/0xac) from [<c0347048>] (mmc_start_host+0x14/0x24)
+[<c0347048>] (mmc_start_host+0x14/0x24) from [<c0347bb4>] (mmc_add_host+0x6c/0x80)
+[<c0347bb4>] (mmc_add_host+0x6c/0x80) from [<c0028020>] (omap_hsmmc_probe+0x4e8/0x6fc)
+[<c0028020>] (omap_hsmmc_probe+0x4e8/0x6fc) from [<c02e0024>] (platform_drv_probe+0x1c/0x20)
+[<c02e0024>] (platform_drv_probe+0x1c/0x20) from [<c02debe0>] (really_probe+0xd8/0x1d4)
+[<c02debe0>] (really_probe+0xd8/0x1d4) from [<c02dee78>] (driver_probe_device+0x88/0xac)
+[<c02dee78>] (driver_probe_device+0x88/0xac) from [<c02def04>] (__driver_attach+0x68/0x8c)
+[<c02def04>] (__driver_attach+0x68/0x8c) from [<c02ddc18>] (bus_for_each_dev+0x54/0x8c)
+[<c02ddc18>] (bus_for_each_dev+0x54/0x8c) from [<c02de4f4>] (bus_add_driver+0x180/0x30c)
+[<c02de4f4>] (bus_add_driver+0x180/0x30c) from [<c02df140>] (driver_register+0xb0/0x140)
+[<c02df140>] (driver_register+0xb0/0x140) from [<c02e0470>] (platform_driver_probe+0x20/0xb4)
+[<c02e0470>] (platform_driver_probe+0x20/0xb4) from [<c00493d8>] (do_one_initcall+0x3c/0x100)
+[<c00493d8>] (do_one_initcall+0x3c/0x100) from [<c00089b0>] (kernel_init+0x98/0x144)
+[<c00089b0>] (kernel_init+0x98/0x144) from [<c004fa98>] (kernel_thread_exit+0x0/0x8)
+---[ end trace 6b67a64bbd611c58 ]---

On the code side, this looks suspicious:
  unsigned char power_mode;
  host->power_mode = -1;
(a negative value assigned to an unsigned char)
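
To make the concern concrete: storing -1 in an unsigned char wraps to 255, so the field can never compare equal to -1 again, nor to any of the defined power modes. The Python sketch below reproduces the wraparound with `ctypes.c_ubyte`; the MMC_POWER_* values are assumed to mirror include/linux/mmc/host.h, and `mode_changed` is a hypothetical stand-in for a driver check that only acts when the requested mode differs from the stored one.

```python
import ctypes

# Assumed to mirror the kernel's definitions in include/linux/mmc/host.h.
MMC_POWER_OFF, MMC_POWER_UP, MMC_POWER_ON = 0, 1, 2


def store_as_unsigned_char(value):
    """Return what an unsigned char holds after being assigned `value`."""
    return ctypes.c_ubyte(value).value  # ctypes wraps modulo 256 without checking


def mode_changed(stored, requested):
    """Hypothetical driver check: act only when the requested mode differs."""
    return stored != requested


stored = store_as_unsigned_char(-1)
print("stored value:", stored)
print("equals -1:", stored == -1)
print("first MMC_POWER_OFF request seen as a change:",
      mode_changed(stored, MMC_POWER_OFF))
print("same request after initialising to MMC_POWER_OFF:",
      mode_changed(store_as_unsigned_char(MMC_POWER_OFF), MMC_POWER_OFF))
```

Under these assumptions, initialising to -1 makes the very first power-off request look like a transition, which would be consistent with the regulator_disable call in the backtrace above; initialising to MMC_POWER_OFF makes that first request a no-op.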

----
I got it working when I used the vanilla code ("host->power_mode = MMC_POWER_OFF;") and added several printks to the __init omap_hsmmc_probe. So, for me, that Ubuntu patch only works because it causes a kernel warning, which gives the card driver some more time to initialize. I think adding debug messages is as hacky as intentionally causing a kernel warning, but what is the root cause of this?

Is there pe...

