arm64: Synchronous Exception at 0x00000000BBC129BC

Bug #1675522 reported by dann frazier
This bug affects 2 people
Affects        | Status    | Importance | Assigned to | Milestone
--------------------------------------------------------------------
edk2 (Ubuntu)  | Confirmed | Undecided  | Unassigned  |
grub2 (Ubuntu) | Invalid   | Undecided  | Unassigned  |
linux (Ubuntu) | Invalid   | Undecided  | Unassigned  |
qemu (Ubuntu)  | Invalid   | Undecided  | Unassigned  |

Bug Description

I installed a KVM VM with the zesty beta server ISO. The install went well but, upon reboot, GRUB entered a Synchronous Exception loop after selecting the Ubuntu boot option.

After killing qemu-system-aarch64 and restarting the VM, the system booted to a prompt w/o issue.

dann frazier (dannf) wrote :

This can also be reproduced with a reboot after booting from the installed system.

dann frazier (dannf) wrote :

Behavior is the same after reverting to yakkety grub. This issue follows the guest kernel that we are rebooting *from*:

Kernel running at reboot | Kernel we reboot into | Reboot success?
-------------------------------------------------------------------
4.10                     | 4.10                  | Failure
4.8                      | 4.8                   | Success
4.10                     | 4.8                   | Failure
4.8                      | 4.10                  | Success

This suggests a bug in QEMU/qemu-efi. I suppose it could also be a bug in the kernel if it were a long-standing issue with boottime init that existed in 4.8 but was never tickled until now.
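
The pattern in the table can be checked mechanically. Below is a minimal Python sketch, with the four rows transcribed from the table; the function names are illustrative and not part of any tool used in this report. It asks, for each column, whether the kernel version in that column alone determines the outcome.

```python
# Reboot results transcribed from the table in this comment:
# (kernel running at reboot, kernel rebooted into, reboot succeeded?)
results = [
    ("4.10", "4.10", False),
    ("4.8", "4.8", True),
    ("4.10", "4.8", False),
    ("4.8", "4.10", True),
]


def outcomes_by(column):
    """Group the success flags by the kernel version found in `column` (0 or 1)."""
    grouped = {}
    for row in results:
        grouped.setdefault(row[column], set()).add(row[2])
    return grouped


def determines_outcome(column):
    """True if each kernel version in `column` always gives the same outcome."""
    return all(len(flags) == 1 for flags in outcomes_by(column).values())


print("kernel rebooted from determines outcome:", determines_outcome(0))
print("kernel rebooted into determines outcome:", determines_outcome(1))
```

With only four data points this is a consistency check rather than proof, but it matches the observation that the failure follows the kernel being rebooted from.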

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1675522

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
dann frazier (dannf) wrote :

Upgrading the host from xenial's QEMU to zesty's QEMU seems to fix it. I can do several reboots of zesty w/o a hang.

Changed in edk2 (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu):
status: Incomplete → Invalid
Changed in grub2 (Ubuntu):
status: New → Invalid
Christian Ehrhardt (paelzer) wrote :

Hi Dann, thanks for the checks you've done already!
Do you intend to break this down further, e.g. by checking whether yakkety's QEMU is fixed as well, as zesty's seems to be?
And once we know whether yakkety is good, maybe even down to a bisect of QEMU?

Given your checks so far, the test seems to be as "easy" as "boot zesty + reboot = crash".

Do you happen to know:
- if it makes a difference what exact arm HW (more diverse than x86 at least) the host has?
- if it could be reproduced on qemu on x86 doing arm emulation?

dann frazier (dannf) wrote : Re: [Bug 1675522] Re: arm64: Synchronous Exception at 0x00000000BBC129BC

On Fri, Mar 24, 2017 at 2:01 AM, ChristianEhrhardt
<email address hidden> wrote:
> Hi Dann, thanks for the checks you've done already!
> Do you intend to break this down further, e.g. by checking whether yakkety's QEMU is fixed as well, as zesty's seems to be?
> And once we know whether yakkety is good, maybe even down to a bisect of QEMU?

hey Christian! Yep, that's my plan.

> Given your checks so far, the test seems to be as "easy" as "boot zesty
> + reboot = crash".
>
> Do you happen to know:
> - if it makes a difference what exact arm HW (more diverse than x86 at least) the host has?
> - if it could be reproduced on qemu on x86 doing arm emulation?

I don't have these data points yet, but I can collect them if the
bisection method doesn't turn up a quick answer.

  -dann

Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
http://iso.qa.ubuntu.com/qatracker/reports/bugs/1675522

tags: added: iso-testing
tags: added: zesty
dann frazier (dannf) wrote :

Now that I've found time to get back to this one, I can no longer reproduce it. The machine I originally observed this on has been redeployed, so I no longer have the artifacts necessary for comparison.

Christian Ehrhardt (paelzer) wrote :

Then I'll mark it incomplete for now. Thanks for your work on it, Dann!

Changed in qemu (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for qemu (Ubuntu) because there has been no activity for 60 days.]

Changed in qemu (Ubuntu):
status: Incomplete → Expired
dann frazier (dannf)
Changed in edk2 (Ubuntu):
status: Invalid → Confirmed
Changed in qemu (Ubuntu):
status: Expired → Invalid
dann frazier (dannf) wrote :

I've found that this is fairly reproducible when I run a hirsute guest on a bionic host, revert qemu-efi to xenial's version, and put the guest in a reboot loop. The problem therefore seems to follow edk2, so I'll see if a bisect will identify a fix.
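
Spotting a failed boot in such a reboot loop comes down to finding the firmware message from this bug's title in the guest's console output. Below is a minimal Python sketch, assuming the console output of each boot has already been captured as text; how it is captured is not covered, and the function names and sample log lines are hypothetical. Only the "Synchronous Exception at 0x..." message itself comes from this report.

```python
import re

# Matches the firmware message from this bug's title, e.g.
# "Synchronous Exception at 0x00000000BBC129BC".
EXCEPTION_RE = re.compile(r"Synchronous Exception at (0x[0-9A-Fa-f]+)")


def find_exceptions(console_text):
    """Return the faulting addresses reported in captured console output."""
    return EXCEPTION_RE.findall(console_text)


def count_failed_boots(console_logs):
    """Count how many captured boots (one string per boot) hit the exception."""
    return sum(1 for text in console_logs if find_exceptions(text))


# Hypothetical sample data: only the exception line is taken from the report;
# the surrounding lines are made-up placeholders.
sample_logs = [
    "booting...\nSynchronous Exception at 0x00000000BBC129BC\n",
    "booting...\nlogin prompt reached\n",
]

print(count_failed_boots(sample_logs), "of", len(sample_logs), "boots failed")
```

Collecting the reported addresses, rather than just a pass/fail flag, makes it easier to tell whether different crashes in the loop are the same failure.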

dann frazier (dannf) wrote :

Because the issue is difficult to hit, and therefore to bisect, I ended up running 10 VMs in a reboot loop. It turns out that even with the latest edk2 source, I could hit a boot-time crash. I dug a little deeper and found that the crash I'm seeing w/ the latest edk2 is actually in shim code. We didn't yet build shim for arm64 in zesty, so it's likely not the same bug I reported here[*]; I've reported it separately as bug 1928010.

[*] Although it is possible that the bug is in some common code that impacts both shim and GRUB (e.g. gnu-efi)
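
Why an intermittent failure is hard to bisect, and why running several VMs in parallel helps, can be illustrated with a little arithmetic. Below is a minimal Python sketch, assuming each reboot fails independently with a fixed probability. That probability is a hypothetical input; this report does not give a failure rate, and the function names are illustrative.

```python
import math


def clean_reboots_needed(p_fail, confidence):
    """Smallest n such that 1 - (1 - p_fail)**n >= confidence.

    That is, how many reboots without a crash are needed before a build
    can be called "good" with the given confidence, assuming independent
    reboots that each fail with probability p_fail (an assumption).
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


def reboots_per_vm(p_fail, confidence, vms):
    """Spread the required reboots evenly across `vms` parallel guests."""
    return math.ceil(clean_reboots_needed(p_fail, confidence) / vms)


# Hypothetical numbers: a 5% per-reboot failure rate and a 95% confidence
# target. Only the count of 10 VMs comes from the comment above.
total = clean_reboots_needed(0.05, 0.95)
print("clean reboots needed per bisect step:", total)
print("reboots per VM with 10 VMs:", reboots_per_vm(0.05, 0.95, 10))
```

Every bisect step has to pay this cost before a revision can be marked good, which is why a low failure rate makes bisection slow.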

dann frazier (dannf)
Changed in edk2 (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → dann frazier (dannf)
status: In Progress → Confirmed
assignee: dann frazier (dannf) → nobody