Panda board shuts down during boot

Bug #708883 reported by Paul Larson on 2011-01-27
This bug affects 4 people
Affects                      Assigned to
linux-linaro (Ubuntu)        John Rigby
linux-linaro-omap (Ubuntu)   John Rigby

Bug Description

Booting the daily netbook image/hwpack from 20110127 on panda, I get part way through the boot and the board just shuts down. It seems to be detecting USB devices at the time, and there is no panic or stack trace just before the failure; it just turns off. The only way to get it running again is to pull the power plug out and reconnect it.

Will attach serial log.

Paul Larson (pwlars) wrote :
Michael Hudson-Doyle (mwhudson) wrote :

With --verbose in bootargs, this is the tail end of the output:

Changed in linux-linaro (Ubuntu):
status: New → Confirmed
Alexander Sack (asac) wrote :

does this happen randomly or on every boot?

John Rigby (jcrigby) wrote :

Some wild guesses: 1. Since this happens while enumerating USB devices it could be a power supply issue. I have seen boards work fine until the voltage dips as a result of USB devices getting turned on. 2. This reminds me of how the mx51 kernel was behaving when u-boot was enabling the watchdog timer but the kernel was not disabling or updating it.

On Mon, 31 Jan 2011 23:49:36 -0000, Alexander Sack <email address hidden> wrote:
> does this happen randomly or on every boot?

Every boot that I've tried (maybe 4 or 5?).

The board worked fine with some random set up rsalveti provided me with,
but that didn't seem to have USB support so maybe it didn't probe for
USB devices at startup.

The fact that the reset button doesn't work -- you have to yank the
power out -- suggests that maybe it is power related. But just a guess.


Michael Hudson-Doyle (mwhudson) wrote :

On Mon, 31 Jan 2011 23:55:19 -0000, John Rigby <email address hidden> wrote:
> Some wild guesses: 1. Since this happens while enumerating USB devices
> it could be a power supply issue. I have seen boards work fine until
> the voltage dips as a result of USB devices getting turned on. 2. This
> reminds me of how the mx51 kernel was behaving when u-boot was enabling
> the watchdog timer but the kernel was not disabling or updating it.

I don't know if it's related, but I've just noticed that if I press
reset during boot (but before the hang), the board still powers down a
few seconds later (before the next attempt at booting gets going).

Resetting on line 19 of was OK,
whereas resetting on line 40 was not.


John Rigby (jcrigby) wrote :

Looks like this is known upstream:
However it is unclear after reading the thread if this is a hardware problem or x-loader/u-boot problem.

Paul Larson (pwlars) wrote :

Confirmed I still see this problem with 20110201-1 image, BUT, I do not see this problem when booting in the validation lab with old uboot/xloader + new everything else. So my bet is on this being a problem with xloader or uboot.

Ken Werner (kwerner) wrote :

I've tried various hwpacks and (headless) images and encountered exactly the same issue, while the Ubuntu 10.10 image works flawlessly on this board. To rule out faulty external components I used different power supplies and several SD cards, with no difference. My next step was to use an SD-card image that works on a colleague's PandaBoard (20110127 hwpack with 20110126 headless) to rule out differences during the creation of the image (version of l-m-c and such). On my board it shows the same symptoms as the image I created myself.

> I do not see this problem when booting in the validation lab with old uboot/xloader + new everything else
This is interesting. I took the Ubuntu 10.10 image which runs fine on my board and replaced only the kernel with git:// with the result that the board shuts down. Therefore I thought it's related to the Linaro kernel. Could you elaborate which uboot/xloader you used?

Ricardo Salveti (rsalveti) wrote :

This kernel works fine for me when installing it at the Ubuntu Natty Alpha-2 release. Was able to boot and use the board without any hang.

Seems I'm using a newer u-boot, so it would be good to test to make sure it's not related with it.

Texas Instruments X-Loader 1.4.4ss (Jan 26 2011 - 10:12:48)
Reading boot sector
Loading u-boot.bin from mmc

U-Boot 2010.12 (Jan 27 2011 - 17:59:04)

CPU : OMAP4430
Board: OMAP4 Panda
I2C: ready
Using default environment

In: serial
Out: serial
Err: serial
Hit any key to stop autoboot: 0
reading boot.scr

391 bytes read
Running bootscript from mmc0 ...
## Executing script at 82000000
reading uImage

3636192 bytes read
reading uInitrd

8919927 bytes read
## Booting kernel from Legacy Image at 80000000 ...
   Image Name: Linux
   Image Type: ARM Linux Kernel Image (uncompressed)
   Data Size: 3636128 Bytes = 3.5 MiB
   Load Address: 80008000
   Entry Point: 80008000
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 81600000 ...
   Image Name: initramfs
   Image Type: ARM Linux RAMDisk Image (uncompressed)
   Data Size: 8919863 Bytes = 8.5 MiB
   Load Address: 00000000
   Entry Point: 00000000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK

Ricardo Salveti (rsalveti) wrote :
warmcat (andy-warmcat) wrote :

I was trying to power my Panda board from the USB OTG connector. When connected to a laptop it dies early in boot, apparently in xloader or u-boot.

I did three tests

1) Short the USB OTG 5V pin to the DC Jack 5V pin in case something in the OTG 5V path makes trouble. It still dies in early boot.

2) Remove the D+ and D- wires from the USB cable, still plugged into the OTG socket. It still dies in early boot.

3) Completely remove the connection from the OTG socket, and bring the same laptop USB power to the DC power jack. Boots fine.

So it seems that you just need external power coming to the OTG power pin to make the early boot problem.

I also tested combinations of DC jack power and OTG power using different sequencing. DC jack power could not make the board recover once it was in the process of dying, and the board could not be started again until both power sources were removed.

Paul Larson (pwlars) wrote :

That is not consistent with my observations. I have been able to reproduce this with *nothing* plugged into the board other than the external 5V4A power - not even usb for serial. The board shuts down and the LEDs go off shortly after plugging in the power.

warmcat (andy-warmcat) wrote :

Well, there's no reason to expect it should be consistent since we didn't compare which version of Xloader and U-Boot I am running and you already have reason to believe the behaviour depends on that. However what I reported is repeatable on this board with

Texas Instruments X-Loader 1.41 (Oct 6 2010 - 17:27:48)
U-Boot 2010.09-rc1 (Sep 23 2010 - 11:24:55)

If that looks like your "old", working setup, maybe you can try it with USB OTG power to see if that provokes the messed-up boot.

warmcat (andy-warmcat) wrote :

AFAICT the latest X-loader amongst the various trees lives here

I built the current HEAD of it and got

Texas Instruments X-Loader 1.4.4ss (Feb 7 2011 - 21:09:54)

This boots fine under DC Jack power (derived from my laptop USB port) and continues to turn the status LEDs off after a few seconds if powered from the Panda OTG jack.

warmcat (andy-warmcat) wrote :

I examined the pandaboard schematics and found that the SYS_BOOT[7..0] pins are set to 10000101 by default. These control the actions of the ROM on the OMAP4. b5..b0 select the boot ordering; 000101 means try USB boot first and MMC1 second.
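
As a rough illustration of the bit layout described above, the following bash sketch extracts the b5..b0 boot-order field from a SYS_BOOT[7..0] pin string. The function name sys_boot_order_bits is made up for this example, and b7..b6 are deliberately left undecoded because their meaning is not described in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the boot-order field (b5..b0) from an
# 8-character SYS_BOOT pin string written b7 first, e.g. "10000101".
sys_boot_order_bits() {
    local pins=$1
    local value=$(( 2#$pins ))        # parse the string as a base-2 number
    local order=$(( value & 0x3F ))   # keep only b5..b0
    local out="" i
    # Re-print the field as six binary digits, most significant bit first.
    for (( i = 5; i >= 0; i-- )); do
        out+=$(( (order >> i) & 1 ))
    done
    printf '%s\n' "$out"
}
```

Calling sys_boot_order_bits 10000101 prints 000101, the boot-order field quoted above.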

In the case that no external power is connected to the USB OTG, the ROM skips initializing the USB unit and boot proceeds normally via MMC1.

In the case that there is external power coming to the USB OTG connector, the ROM spends a few seconds trying to bring that up and if it isn't satisfied then proceeds to the second boot option via MMC1. It starts the MMC1 card clock and pulls in X-loader and it starts.

However because of some side effect of the ROM having followed a boot path that started the OTG unit, X-loader is unable to operate normally at a couple of points during its initialization.

I have just sent some patches to linaro-dev that fix this for me so I am able to boot into Linux OK just from my laptop plugged into the OTG socket on Panda.

Ricardo Salveti (rsalveti) wrote :

What I still don't understand is why Paul Larson was able to reproduce this issue without ever touching the USB OTG and using just the external power supply?

The x-loader that Paul is using is already based on this upstream one, and it's the same one we're using at the Ubuntu images.

Paul, are you able to reproduce this issue even after updating both x-loader and u-boot to the latest version available at the archive? Also, can you reproduce the bug with Ubuntu's kernel for OMAP 4?

Aneesh V (aneesh) wrote :

Did your x-loader changes fix the above issue conclusively?

I had seen a similar issue (the one that you explained in your patch, not this bug) on Blaze and fixed it with this patch (on this tree it would also apply to Panda): ;a=commit;h=7ecbec096c300c7e71b663eae51945522329bfbd

However, in this case, it would crash in x-loader and it wouldn't go beyond that.
But from the comments above it looks like the crash is happening when kernel is booting?
Please let me know if my understanding is wrong.

BTW, I tried booting on my 8-layer Panda with 20110207 hw pack and 20110206 headless images.
It worked just fine for me.

warmcat (andy-warmcat) wrote :

@Aneesh... my problem was with USB OTG power of the board, it is fixed with the patches but you're right the bug in Launchpad is more about dying later. Like yourself my board is not dying later either. My patch also does prcm_init() earlier but after that there is another problem with using shadow updates of the clock register when EMIF IDLE is disabled; it only updates when the EMIFs are idle (DDR is not busy then). It sat for a long time (minutes at least) in the loop waiting to confirm the shadow update had gone through.

@Ricardo... I think there are two issues, mine was crashing early in boot if powered by USB OTG and that's solved by my patches. "Panda board shuts down in boot" described my issue too. I can't reproduce the other issue.

Torez Smith (lnxtorez) wrote :

interesting data point....running with Linaro image linaro-natty-headless-tar-20110202-0 and hwpack_linaro-panda_20110202-0_armel_supported
but switched kernel, initrd, boot loader and xloader out with that from ubuntu natty and I no longer see the board halting problem.

While still using the files from natty, if I switch back in the Linaro kernel, I see the board halting problem once more.

My board usually halts shortly after reaching the shell prompt and it's fairly consistent and easy to reproduce.

I've seen similar things to Torez too, including the problem with today's image/hwpack. So while the OTG-powered stuff was interesting, I'm not sure that it had any relation to the problem this bug report was originally about...

Paul Larson (pwlars) on 2011-02-09
Changed in linux-linaro (Ubuntu):
importance: Undecided → Critical
Aneesh V (aneesh) wrote :

We have been finding it difficult to reproduce this problem. Can somebody give the following details of a configuration that reproduces this problem:

Board revision: ES1.0/ES2.0/ES2.1/ES2.2
Board type: 6 layer vs 8 layer
(The 750-xxx-yyyy number on the back of the board will give us the above two details)
file system image:
power supply: USB vs power adapter

Ken Werner (kwerner) wrote :

Thanks for looking into this. I can reproduce this bug on my A1 ES2.1 8-layer 750-2152-010(D) board. Here are the details:
  $ for gpio in 171 101 182; do read gpio${gpio} < /sys/class/gpio/gpio${gpio}/value; done && echo "Revision: ${gpio171} ${gpio101} ${gpio182}"
  Revision: 0 1 1

  $ sudo ./devmem2 0x4A002204
  /dev/mem opened.
  Memory mapped at address 0x40155000.
  Value at address 0x4A002204 (0x40155204): 0x3B95C02F
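
The value read back above can be split into fields with a little shell arithmetic. This is only a sketch: it assumes that 0x4A002204 holds an ID code in the generic JTAG IDCODE layout (version, part number, manufacturer), which the thread does not state, and decode_idcode is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split a 32-bit ID code into fields, ASSUMING the
# generic JTAG IDCODE layout:
#   [31:28] version, [27:12] part number, [11:1] manufacturer, [0] fixed 1.
decode_idcode() {
    local v=$(( $1 ))   # accepts hex such as 0x3B95C02F
    printf 'version=0x%X part=0x%04X manufacturer=0x%03X\n' \
        $(( (v >> 28) & 0xF )) \
        $(( (v >> 12) & 0xFFFF )) \
        $(( (v >> 1) & 0x7FF ))
}
```

For 0x3B95C02F this prints version=0x3 part=0xB95C manufacturer=0x017. How the version field maps onto an ES revision is not covered by this sketch.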

I'm using the following PSU:

I've tried several linaro-n snapshots on this board - all showed the same behavior. Also the alpha-2 doesn't work for me. These are the files used for the image creation:
 * linaro-natty-headless-tar-20110203-1.tar.gz
 * hwpack_linaro-panda_20110203-0_armel_supported.tar.gz

I'm using l-m-c from lp:linaro-image-tools rev 287 with the qemu-kvm-extras-static from Natty (0.13.0+noroms-0ubuntu11_i386.deb) on a i686 Maverick installation.

Please let me know if you need further details.

Paul Larson (pwlars) wrote :

Still dies at the same place for me with today's image/hwpack.
PandaBoard Rev A1
hwpack: hwpack_linaro-panda_20110211-0_armel_supported.tar.gz
rootfs: linaro-natty-efl-tar-20110211-3.tar.gz
external power

I can also still reproduce this regardless of whether I'm booting with my usb hub, monitor, network, etc plugged in, or even if I am booting with nothing but the power plugged in

Aneesh V (aneesh) wrote :

@Ken, Paul
I tried 4 different boards having the serial number 750-2152-010(D) and one with 750-2151-002 (A). No luck in reproducing!
 * linaro-natty-headless-tar-20110203-1.tar.gz
 * hwpack_linaro-panda_20110203-0_armel_supported.tar.gz

My power supply is not the Digi-Key one you are using, but a GlobTek one with the same rating (5V, 4A)

Ken Werner (kwerner) wrote :

The latest (20110214) Linaro headless snapshot plus corresponding hwpack still powers down my pandaboard shortly after it gets to the bash prompt. As Paul said, this seems to be independent of the attached devices (HDMI, USB, Ethernet). To check if it's related to something that is started by upstart I added "init=/bin/bash" to the bootargs, but the board still dies (shortly after the regulator_init_complete msgs). Then I copied over the uImage + uInitrd from the Ubuntu Natty alpha2 and the system works fine. Unfortunately the Ubuntu kernel (2.6.35-1101-omap4) differs a lot from current Linaro kernels.

Ken Werner (kwerner) wrote :

To check if it's x-loader/u-boot related I replaced the MLO and u-boot.bin of the 20110214 Linaro snapshot with the files from the working Ubuntu Natty alpha2 installation but that didn't change anything - the board still goes off.

Paul Larson (pwlars) wrote :

Interesting data point today, I booted the stripped down image that Ken sent out. The first two times I was able to get all the way to a shell prompt and even type a few commands, however the leds on the board flickered out and a few seconds later I could not communicate with the board on the serial console (nothing on HDMI due to another bug).

The third time I booted, the lights on the board still went out, but this time the board never locked up. So it appears to lose power but isn't actually losing power; possibly it is just locking up and going dark.

warmcat (andy-warmcat) wrote :

Most of this confusion is coming because the LED trigger heartbeat is being built as a module (why?) and not inserted.

If you type

 modprobe ledtrig-heartbeat

at the prompt you will get the familiar flashing LED letting you know the board is ON. So I guess the confusion about ON or OFF will be removed anyway if this is added to an initscript.
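
Once the module is loaded, one way to see which trigger each LED is actually using is to read the LED class files under /sys/class/leds, where the active trigger is shown in square brackets. The sketch below is an illustration only: the two function names are invented here, and the LED names on a given board are not assumed.

```bash
#!/usr/bin/env bash
# Print the active trigger from the contents of a sysfs LED "trigger" file
# read on stdin. The active entry is the one in square brackets,
# e.g. "none mmc0 [heartbeat] timer" -> "heartbeat".
active_led_trigger() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# List every LED the kernel exposes together with its active trigger.
report_led_triggers() {
    local f
    for f in /sys/class/leds/*/trigger; do
        [ -r "$f" ] || continue   # skip if no LEDs are exposed or unreadable
        printf '%s: %s\n' "${f%/trigger}" "$(active_led_trigger < "$f")"
    done
}
```

Running report_led_triggers on the board should show heartbeat against a status LED only if that trigger has been selected for it.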

Aneesh V (aneesh) wrote :

Can anybody ship one of those faulty boards (preferably with power supply and an MMC card with the non-working image) to TI Dallas? I shall tell you who it should be addressed to later today.

warmcat (andy-warmcat) wrote :

I added the modprobe ledtrig-heartbeat to /etc/rc.local for now, although I'm sure there's a better place.

I cycled the power 10 times. 9 times it was normal, but one time there was a 30s delay before boot completed. If it delayed 30s with LEDs off, many people will think the board is OFF.

Normal -->

[ 2.144744] udev[65]: starting version 165
Begin: Loading e[ 2.153198] usb 1-1: new high speed USB device using ehci-omap and address 2
ssential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
[ 2.357147] hub 1-1:1.0: USB hub found
[ 2.361633] hub 1-1:1.0: 5 ports detected
Begin: Running /scripts/local-premount ... done.
[ 2.412078] EXT3-fs: barriers not enabled
[ 2.419006] kjournald starting. Commit interval 5 seconds
[ 2.425384] EXT3-fs (mmcblk0p2): mounted filesystem with ordered data mode
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[ 2.668670] usb 1-1.1: new high speed USB device using ehci-omap and address 3

Slow -->

Begin: Loading e[ 2.145019] usb 1-1: new high speed USB device using ehci-omap and address 2
ssential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
[ 2.341339] hub 1-1:1.0: USB hub found
[ 2.346069] hub 1-1:1.0: 5 ports detected
[ 2.824768] usb 1-1.1: new high speed USB device using ehci-omap and address 3

Begin: Running /scripts/local-premount ... done.
[ 32.391906] EXT3-fs: barriers not enabled

Notice that the difference is whether usb node 1-1.1 is enumerated before local-premount runs or after it. When the enumeration happens before local-premount, the 30s delay appears.

usb node 1-1.1 is the SMSC USB Ethernet PHY.

Is there a race between the USB Ethernet PHY enumeration and the rootfs being ready to use network layer? Eg, no /sys mounted at that time in the bad case or somesuch?
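
To pick out this kind of stall from a captured serial log without reading it line by line, something like the following could be used. It is a sketch only: find_boot_gaps is a made-up name, the 5 second default threshold is arbitrary, and it relies only on the "[ seconds.micros]" kernel timestamps visible in the logs above.

```bash
#!/usr/bin/env bash
# Read a console log on stdin and report any jump between consecutive
# kernel timestamps that exceeds the given number of seconds (default 5).
find_boot_gaps() {
    local threshold=${1:-5}
    awk -v limit="$threshold" '
        # Timestamps may appear mid-line because console output is interleaved.
        match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (seen && ts - prev > limit)
                printf "gap of %.1fs before: %s\n", ts - prev, $0
            prev = ts
            seen = 1
        }'
}
```

Usage would be along the lines of find_boot_gaps 5 < serial.log, where serial.log is whatever file the console capture was saved to.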

Paul Larson (pwlars) wrote :

@aneesh give me an address over email later today, and I'll get it shipped to you.

Ken Werner (kwerner) wrote :

Thanks for the LED heartbeat hint. I've added ledtrig-heartbeat to /etc/modules and I see that the Status1-LED begins to flash during boot. When the system dies (shortly after it gets to the bash prompt on my board) the LED stays off and the CPU cools down. The board works fine as soon as I replace the linaro kernel (uImage + uInitrd) with the files from the Ubuntu Natty alpha2.

Kurt Taylor (krtaylor) wrote :

I am seeing the same as Ken and others with my Panda specifics:

PandaBoard Rev A1
Assy: 750-2152-010(D)

I am also using a USB/serial for console. I have tried with/without kb/mouse, ethernet, usb hub, HDMI and usb serial, all with no success. I am not using LEDs to determine if the board is hung, and have waited much longer than 30 seconds.

Since Ubuntu netbook 10.10 worked for me, I doubt the power supply is a problem, but for completeness, I am using the Digikey V-Infinity switch mode power supply brick 5V 4A, model 3A-211DN05.

I have not yet tried swapping the Linaro and Natty alpha2 parts yet, but will try now and post results.

Kurt Taylor (krtaylor) wrote :

Swapping natty alpha 2 uImage and uInitrd worked with Linaro linaro-natty-alip-tar-20110215-0.tar.gz and hwpack_linaro-panda_20110215-0_armel_supported.tar.gz.

The output of :
 $ for gpio in 171 101 182; do read gpio${gpio} < /sys/class/gpio/gpio${gpio}/value; done && echo "Revision: ${gpio171} ${gpio101} ${gpio182}"

was also:
  Revision: 0 1 1

Kurt Taylor (krtaylor) wrote :

For what it's worth, just as Ken reported, the PWRON_RESET button works just as expected with the hybrid linaro/natty image described above, but does not do anything when pressed after booting Ken's test image (image that reproduces the symptoms of this defect).

warmcat (andy-warmcat) wrote :

After a lot of help from kenws testing, I have a small patch which should stop the Panda going OFF. I can explain why it goes off without the patch, but since TI won't let me have a twl6030 datasheet I can't get right to the end yet.

Basically the Panda PMIC is seeing an overcurrent on one of its regulators, and reporting it via b6 of its interrupt sources VXXX_SHORT. Like other PMICs, if something bad like that happens the PMIC gives the CPU a short time to acknowledge the report of the bad news, and if the CPU doesn't come and clear the interrupt within that time, the PMIC will force the device OFF for safety.

The mainline twl6030 driver takes the approach that it will mask off all interrupt sources except the ones it is interested in, just MMC card detect and USB OTG ID status at the moment. Therefore if the VXXX_SHORT condition occurs with the driver as it is, the interrupt is not cleared and the PMIC panics forcing the Panda OFF.

The patch enables all interrupt sources instead, so the condition is acknowledged.
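
The effect of the mask can be shown with a toy model in shell arithmetic. This is not the driver code: the mask polarity (1 = source masked off) and the example mask value 0xFC are assumptions made purely for illustration; only the b6 position of VXXX_SHORT comes from the description above.

```bash
#!/usr/bin/env bash
# Toy model of interrupt masking, NOT the twl6030 driver.
# Arguments: pending status byte, mask byte (ASSUMED: 1 = masked off).
# Prints the pending bits the driver would actually see and acknowledge.
acknowledged_bits() {
    local status=$(( $1 )) mask=$(( $2 ))
    printf '0x%02X\n' $(( status & ~mask & 0xFF ))
}

VXXX_SHORT=$(( 1 << 6 ))   # b6, as described in the comment above
```

With a hypothetical mask of 0xFC (everything masked except two sources), acknowledged_bits $VXXX_SHORT 0xFC prints 0x00, so the overcurrent report is never cleared. With nothing masked, acknowledged_bits $VXXX_SHORT 0x00 prints 0x40 and the condition can be acknowledged.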

The variation in boards showing the problem or not is explained by the process spread of the overcurrent comparator in the PMIC; I was able to reduce the threshold on mine by lowering Vin and reducing the PMIC temperature so I could control whether VXXX_SHORT was generated or not since the actual current was marginal vs the comparator limit. But since the comparator limit would normally be quite a little bit above the max legal current, it suggests that on one regulator the Panda really is pulling more than it should.

So the root cause is genuinely excessive current on one of the regulators, but I don't know which one because I don't have a datasheet (although I would look at the 2.1V one first). The final fix will be to identify the regulator and reduce the current taken. But in the meanwhile this patch should remove the symptom.

Marcin Juszkiewicz (hrw) wrote :

Kurt: the GPIO check is not guaranteed to be reliable on boards older than A1. I have an EA1 board whose GPIOs read the same as an A1 but which has an ES2.0 CPU.

Ken Werner (kwerner) wrote :

Your patch fixes the issue on my board. I've applied your patch to the linaro natty kernel and the system stays alive.


Torez Smith (lnxtorez) wrote :

interesting observation...
If I start with linaro-natty-headless image, build kernel mentioned in #39 with patch applied from #37, the kernel boots fine and the board stays up for hours.

If however I start with linaro-n-developer image, build kernel mentioned in #39 with patch applied from #37, I see the bug once more. That is, the board boots to shell prompt and within a few seconds the board halts.

tags: added: patch
Marcin Juszkiewicz (hrw) wrote :

If you have this issue, please try to boot the Ubuntu natty alpha3 image on panda. It worked for me. I will do some more testing during the week.

warmcat (andy-warmcat) wrote :

Since Sebastien Jan took my patch into his tree, and the Ubuntu Panda kernel is coming via his tree, it should mean that kernel has the patch in it. You can confirm it with

$ dmesg | grep "6030: Interrupt status"

if it shows anything it's using the patch.
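
The same check can be wrapped so that it gives a yes/no answer in a script. This is a small sketch around the grep above; kernel_has_twl6030_patch is an invented name, and the marker string is the one quoted in this comment.

```bash
#!/usr/bin/env bash
# Read a kernel log on stdin; succeed (exit status 0) only if the message
# printed by the patched driver is present.
kernel_has_twl6030_patch() {
    # -q: print nothing, just report via the exit status
    grep -q "6030: Interrupt status"
}
```

For example: dmesg | kernel_has_twl6030_patch && echo "patch appears to be present". As noted above, the message only shows up once the driver has logged an interrupt status, so a negative result on a freshly booted board may not be conclusive.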

Ken Werner (kwerner) wrote :

Is there any chance to put the workaround/patch into the Linaro omap4 kernel as well?

Ricardo Salveti (rsalveti) wrote :

On Mon, Mar 7, 2011 at 11:27 AM, warmcat <email address hidden> wrote:
> Since Sebastien Jan took my patch into his tree, and the Ubuntu Panda
> kernel is coming via his tree, it should mean that kernel has the patch
> in it.  You can confirm it with
> $ dmesg | grep "6030: Interrupt status"
> if it shows anything it's using the patch.

At Ubuntu we're still using the old 35 one, so I don't expect this
patch to be applied. Our 38 tree is currently available only at a PPA,
but should soon be the default.

John Rigby (jcrigby) on 2011-03-10
Changed in linux-linaro (Ubuntu):
assignee: nobody → John Rigby (jcrigby)
status: Confirmed → In Progress
John Rigby (jcrigby) on 2011-03-11
Changed in linux-linaro-omap (Ubuntu):
status: New → In Progress
Changed in linux-linaro (Ubuntu):
status: In Progress → Invalid
Changed in linux-linaro-omap (Ubuntu):
assignee: nobody → John Rigby (jcrigby)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-linaro-omap - 2.6.38-1001.2

linux-linaro-omap (2.6.38-1001.2) natty; urgency=low

  [ John Rigby ]

  * Rebase to new upstreams:
    Linux v2.6.38-rc6 -- same
    linaro-linux-2.6.38-upstream-1Mar2011 -- new
    Ubuntu-2.6.38-5.32 -- same
    - LP: #724377
  * Enable CONFIG_THUMB2_KERNEL for OMAP[34]
  * Bump ABI
  * Rebase to new upstreams:
    Linux v2.6.38-rc7
    ubuntu-natty master-next as of 4Mar2011
  * Re-enable display on OMAP4
    - LP: #728603
    - LP: #720055
  * Rebase to new upstreams:
    Linux v2.6.38-rc8
      rebased to 2.6.38-rc8
  * Remove generated file kernel-versions and sort
    - LP: #718677
  * Rebase to new upstreams:
    Linux v2.6.38 final
    - LP: #708883
    - LP: #723159
    ubuntu-natty Ubuntu-2.6.38-7.35
  * Enable CONFIG_IP_PNP and CONFIG_ROOT_NFS for all flavours
    - LP: #736429
  * mach-ux500: fix build error
    workaround a problem in linux-linaro-2.6.38
  * OMAP4:Fix -EINVAL for vana, vcxio, vdac
    from omap-linux mailing list pending ack
  * turn off ROOT_NFS for mx51
    it makes the kernel too large to boot with current hwpack settings
 -- John Rigby <email address hidden> Fri, 18 Mar 2011 07:36:33 -0600

Changed in linux-linaro-omap (Ubuntu):
status: In Progress → Fix Released