snapdragon uc18 image fails to boot (current stable)

Bug #1846397 reported by Kyle Nitzsche on 2019-10-02
This bug affects 2 people
Affects: snapd
Status: Confirmed
Importance: Critical
Assigned to: Ondrej Kubik

Bug Description

The current stable uc18 snapdragon arm64 doesn't boot.

http://us.cdimage.ubuntu.com/ubuntu-core/18/stable/current/

ubuntu-core-18-arm64+snapdragon.img.xz 2019-08-06 07:56

An earlier image boots fine: http://us.cdimage.ubuntu.com/ubuntu-core/18/stable/20190213/ubuntu-core-18-arm64+snapdragon.img.xz

Other than the usual quick flash of a blue LED, there is no apparent boot activity. HDMI shows none of the usual boot sequence. I have no UART so cannot see boot console.

Paul Larson (pwlars) wrote :

This image is used successfully many times a day in our lab, but those have a uart attached for capturing debug information and for control. I tried this without a uart connected and confirmed that it does not boot though. I also noticed that if you boot with the uart connected, then disconnect it, it remains booted and usable.

Changed in snapd:
status: New → Confirmed
Łukasz Zemczak (sil2100) wrote :

This seems like one of those bugs that one would never explicitly test for, because it doesn't make sense to treat it as a separate test case (i.e. testing with and without serial connected). Sadly, there's not much we could have done automated-testing-wise; we just need to take this as a lesson and dedicate one device with serial disconnected to make sure this test case is covered. From what Chris mentioned, this is now done, so we should be covered for the future!

For now I have asked Ondrej if he could take a look, since this might be something that got introduced with his latest dragonboard gadget update.

Ondrej: could you take a look? I'd like to know if this issue can be fixed easily; otherwise, per Chris Wayne's proposal, we'd probably want to revert the dragonboard images to the previous stable version, since an older working image is better than a newer one that doesn't boot (in some cases).

Changed in snapd:
assignee: nobody → Ondrej Kubik (ondrak)
importance: Undecided → Critical
Łukasz Zemczak (sil2100) wrote :

OK, for now we have reverted the snapdragon images to the old images. But we really need this fixed.

Ondrej Kubik (ondrak) wrote :

After some debugging, this seems to be caused by a u-boot change. By wiring the uart directly, without the 96boards mezzanine, I was able to test the boot sequence with RX and TX disconnected.
And I can now see u-boot stopping at:
DRAM: 986 MiB
MMC: sdhci@07824000: 0, sdhci@07864000: 1
Loading Environment from FAT... OK
In: serial@78b0000
Out: serial@78b0000
Err: serial@78b0000
## Error: Can't overwrite "serial#"
## Error inserting "serial#" variable, errno=1
Net: Net Initialization Skipped
No ethernet found.
Hit any key to stop autoboot: 0
dragonboard410c =>

I will see whether this is a bug in v2019.04 or whether we need an extra config flag... stay tuned

Ondrej Kubik (ondrak) wrote :

OK, correction and update.
This was a false lead, close though...
It seems u-boot "thinks" someone pressed a key to abort the boot when nothing is connected to the uart.
Now testing some patches to fix this.
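
A toy model of the behaviour described above may help. It is only a sketch of the idea, not U-Boot code: it assumes the autoboot countdown aborts as soon as any byte arrives on the console, and that a floating RX line can deliver a spurious byte. The names `autoboot`, `idle_line` and `floating_line` are invented for illustration.

```python
from typing import Callable, Optional


def autoboot(read_byte: Callable[[int], Optional[int]], ticks: int = 20) -> str:
    """Toy autoboot countdown: any received byte counts as a key press."""
    for tick in range(ticks):
        if read_byte(tick) is not None:
            return "prompt"  # countdown aborted, stay at the bootloader prompt
    return "boot"            # countdown expired, continue booting


def idle_line(tick: int) -> Optional[int]:
    """UART attached and quiet: nothing is ever received."""
    return None


def floating_line(tick: int) -> Optional[int]:
    """Hypothetical floating RX pin: a glitch shows up as a byte at tick 3."""
    return 0x00 if tick == 3 else None


print(autoboot(idle_line))      # boot
print(autoboot(floating_line))  # prompt
```

In this model the board with the quiet line boots normally, while the one with the glitching line ends up at the prompt, which matches the `dragonboard410c =>` prompt seen in the earlier log.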

Ondrej Kubik (ondrak) wrote :

I have narrowed it down to this commit:
https://gitlab.denx.de/u-boot/u-boot/commit/b460b889e28379014a7f951c08d93a151116b1ad
The question now is whether we revert it completely or debug further to find which part of the initialisation is broken there.

Paul Larson (pwlars) wrote :

As I understand it, the 20191008.2 core18 beta image should have this change included. I tried booting it on a system where serial is disconnected, and things are improved at least: it started booting the kernel after a few seconds. But then it died with "No init found".
After the resizing on first boot, I saw quite a few mount errors. I can get a picture of the screen if that helps.

I rebooted it without changing anything though, and it worked on the second boot.

Paul Larson (pwlars) wrote :

Here's a snapshot of the screen when I got the errors

Paul Larson (pwlars) wrote :

Also forgot to mention: the first two times I tested it, it failed for me (once on each of those images). Then, when I went back to reproduce it by writing a fresh image each time, the problem did not show up for the next 5 attempts, and then I got it to happen again. So it seems to be somewhat random.

Ondrej Kubik (ondrak) wrote :

That's interesting, as the resizing and the init run in general should be well past the bootloader.
I wonder if we can reproduce the same with the previous gadget snap revision.

On Tue, Oct 8, 2019 at 5:31 PM Paul Larson <email address hidden>
wrote:

> I also tried https://people.canonical.com/~okubik/ubuntu-core-dragonboard-20191008-00.img.xz with the same result

Ondrej Kubik (ondrak) wrote :

From the screenshot, this is indeed much further along the boot chain. We should
validate that this is not happening with the previous gadget. As the error happens
inside the initrd, the only relation to u-boot would be messed-up hw init. I can
revert the u-boot version to the one we used until now.
Also, is this happening both with and without the uart connected?

Paul Larson (pwlars) wrote :

I can try some more in the morning, but here are some more observations from the rest of my testing today:
1. with both the current beta image, and the one from your people.c.c, I had previously been rewriting the sd card each time to test. I was hitting the errors maybe 10-20% of the time or so. After a successful boot, I then tried rebooting the same sd several more times without rewriting it, and after about 6 more successful boots, I was able to reproduce the error again. It would be interesting to see if someone else with a dragonboard can reproduce this behavior. The SD card I'm using has never given me problems before, but given the nature of them, I don't think I could rule out the possibility of a media problem.

2. I also tried booting the current/stable image that is on cdimage now. This is the image from before this problem was detected. I've rebooted it at least 12 times so far, and have not yet been able to reproduce the problems. I'll still try this some more though, given how randomly I've been able to reproduce this so far.

To be fair, things are definitely *better* than they were before, but there may or may not be a second issue we are seeing.

Ondrej Kubik (ondrak) wrote :

So the image from cdimage does not give us a good reference, as it's running
a different kernel and this problem happens in early boot. Let me build
some test images to compare with.

Ondrej Kubik (ondrak) wrote :

I got the same error when I built the latest UC18 image with the gadget from stable.
So this is not related to the latest change.
It's probably more related to the fact that we do not always test a clean boot, or
do we?
My test image uses dragonboard_48.snap and dragonboard-kernel_114.snap.

Paul Larson (pwlars) wrote :

I just grepped through all the serial output since 9/22 that we have in the lab, and I don't see any occurrences of "No init found" showing up in the logs.
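
For anyone wanting to repeat this kind of check, a minimal sketch of such a scan follows. The directory layout, the `*.log` file pattern and the function name `count_marker` are assumptions for illustration; the lab's actual log storage is not described in this report.

```python
from pathlib import Path


def count_marker(log_dir: str, marker: str = "No init found") -> dict:
    """Return {file path: number of lines containing marker} for logs with hits."""
    hits = {}
    for path in sorted(Path(log_dir).rglob("*.log")):
        # Serial captures can contain stray bytes, so decode leniently.
        with path.open(errors="replace") as fh:
            n = sum(1 for line in fh if marker in line)
        if n:
            hits[str(path)] = n
    return hits


# Example call (hypothetical path): count_marker("/path/to/serial-logs")
```

An empty result means the marker string was not found in any of the scanned files.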

We always provision the system with a fresh image every time on dragonboard. However in my testing at home, I was able to reproduce this even by booting the same image over and over enough times, so I don't think it has to only show up on the first boot.

I'll try this new image also, and also try to see if I can get it to happen with serial unless you already did that.

Ondrej Kubik (ondrak) wrote :

I was able to reproduce it with serial connected.
To confirm, are you also able to reproduce it with the stable channel image at
home?

Paul Larson (pwlars) wrote :

No, exactly the opposite. I have *not* been able to reproduce the "No init found" error with the current stable image so far. I was about to try the new image from today that you have on people.c.c, but I can switch back to trying the current stable image instead. I've only run through it about 12 times so far, so it's still possible - just haven't seen it so far, and I usually do by that many times on the other images

Ondrej Kubik (ondrak) wrote :

Hmm, strange.
Are you testing an image you built yourself or an image from cdimage?
Let's see if you can reproduce it with the image I built.

Paul Larson (pwlars) wrote :

None of this is with an image I've built myself.
Using your image from 20191009, I was able to reproduce the "No init found" error after 6 attempts.
Using the current/stable image, I made 12 attempts yesterday, and 15 today, and I still have not managed to reproduce the "No init found" error
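
To put these counts in perspective, here is a rough back-of-the-envelope sketch. It assumes each boot fails independently with a fixed probability, using the 10-20% failure rate estimated earlier in this thread for the affected images. Both the independence assumption and the function name `p_all_clean` are illustrative, not something established in this report.

```python
def p_all_clean(p_fail: float, boots: int) -> float:
    """Probability of `boots` consecutive clean boots if each fails with p_fail."""
    return (1.0 - p_fail) ** boots


for boots in (12, 27):  # 12 attempts on one day, 12 + 15 in total
    for p_fail in (0.10, 0.20):
        print(f"{boots} boots, p_fail={p_fail:.2f}: {p_all_clean(p_fail, boots):.3f}")
```

Under these assumptions, 27 clean boots in a row would be unlikely (roughly 6% at a 10% failure rate, well under 1% at 20%) if the stable image had the same problem. That is consistent with the stable image not being affected, though it does not prove it.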
