Can not boot Cavium ThunderX

Bug #1923230 reported by Diego Mascialino
This bug affects 1 person
Affects         Status    Importance   Assigned to   Milestone
linux (Ubuntu)  Invalid   Undecided    Unassigned

Bug Description

Hi, I am trying to deploy a Hirsute server (Ubuntu 5.11.0-13.14-generic 5.11.7) on:

arm64
    description: Computer
    product: Cavium ThunderX CN88XX board
    width: 64 bits

I'm getting a kernel panic during boot:

[ 8.417589] Code: a90153f3 aa0003f3 2a0103f4 37000074 (79437000)
[ 8.425298] ---[ end trace 2f743d3081d8a73b ]---
[ 8.431538] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 8.440888] SMP: stopping secondary CPUs
[ 8.446480] Kernel Offset: disabled
[ 8.451581] CPU features: 0x00040002,69101108
[ 8.457552] Memory Limit: none
[ 8.462197] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
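
For readers decoding the panic line: the `exitcode` value is in the usual wait-status layout (low 7 bits are the terminating signal, bit 7 is the core-dump flag, the next byte is the exit status). A minimal sketch, assuming that layout; the helper name `decode_init_exitcode` is made up for illustration:

```python
import re
import signal


def decode_init_exitcode(panic_line):
    """Decode the exitcode field of an 'Attempted to kill init!' panic line.

    Assumes the wait-status layout: signal in bits 0-6, core-dump flag in
    bit 7, exit status in bits 8-15. Returns None if no exitcode is found.
    """
    match = re.search(r"exitcode=0x([0-9a-fA-F]+)", panic_line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    sig = code & 0x7F
    try:
        sig_name = signal.Signals(sig).name
    except ValueError:
        # 0 (no signal) or a number this platform does not define
        sig_name = None
    return {
        "signal": sig,
        "signal_name": sig_name,
        "core_dumped": bool(code & 0x80),
        "exit_status": (code >> 8) & 0xFF,
    }


line = ("[ 8.431538] Kernel panic - not syncing: "
        "Attempted to kill init! exitcode=0x0000000b")
print(decode_init_exitcode(line))
```

Under that assumption, `0x0000000b` means init was killed by signal 11 (SIGSEGV), which is what you would expect when the kernel oopses while running in init's context.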

Diego Mascialino (dmascialino) wrote :

I executed `lspci -vnvn` from a Focal server install on the same machine.

Let me know if there is anything else I can help with.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1923230

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Diego Mascialino (dmascialino) wrote :

I'm not able to run apport-collect.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
dann frazier (dannf) wrote :

Strange, we have several of these systems, but I can reproduce it on only one of them.

On the systems that work, I see:

[ 8.226607] efifb: probing for efifb
[ 8.230224] efifb: No BGRT, not showing boot graphics
[ 8.235277] efifb: framebuffer at 0x881010000000, using 1200k, total 1200k
[ 8.242148] efifb: mode is 640x480x32, linelength=2560, pages=1
[ 8.248063] efifb: scrolling: redraw
[ 8.251631] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
[ 8.257458] Console: switching to colour frame buffer device 80x30
[ 8.340917] fb0: EFI VGA frame buffer device
[ 8.353028] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled

On the one that doesn't, I see:
[ 8.357963] efifb: probing for efifb
[ 8.361579] efifb: cannot reserve video memory at 0x10000000
[ 8.367282] efifb: No BGRT, not showing boot graphics
[ 8.372338] efifb: framebuffer at 0x10000000, using 1876k, total 1875k
[ 8.378864] efifb: mode is 800x600x32, linelength=3200, pages=1
[ 8.384783] efifb: scrolling: redraw
[ 8.388356] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
[ 8.394237] Console: switching to colour frame buffer device 100x37
[ 8.404419] fb0: EFI VGA frame buffer device
[ 8.408744] Unable to handle kernel read from unreadable memory at virtual address 0000000000000280

Notice that in the good case the framebuffer is at 0x881010000000, while in the bad case it is at 0x10000000 and efifb cannot reserve the video memory. This reminds me of LP: #1755304. Can you try updating your firmware?
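
If you want to check other machines for the same symptom, here is a rough sketch that pulls the two relevant facts out of a boot log: the framebuffer address and whether the reservation failed. The function name `summarize_efifb` is hypothetical, and the patterns only match the message wording seen in the logs above:

```python
import re

# Patterns follow the efifb messages quoted in this bug; other kernel
# versions may word them differently.
FB_RE = re.compile(r"efifb: framebuffer at (0x[0-9a-fA-F]+)")
RESERVE_FAIL_RE = re.compile(
    r"efifb: cannot reserve video memory at (0x[0-9a-fA-F]+)"
)


def summarize_efifb(dmesg_text):
    """Return the efifb framebuffer address and whether reserving it failed."""
    fb = FB_RE.search(dmesg_text)
    fail = RESERVE_FAIL_RE.search(dmesg_text)
    return {
        "framebuffer": int(fb.group(1), 16) if fb else None,
        "reserve_failed": fail is not None,
    }


good = "[ 8.235277] efifb: framebuffer at 0x881010000000, using 1200k, total 1200k"
bad = (
    "[ 8.361579] efifb: cannot reserve video memory at 0x10000000\n"
    "[ 8.372338] efifb: framebuffer at 0x10000000, using 1876k, total 1875k"
)
for name, log in (("good", good), ("bad", bad)):
    info = summarize_efifb(log)
    print(name, hex(info["framebuffer"]), info["reserve_failed"])
```

This only flags the symptom. It does not prove that the firmware is at fault.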

Your console.log shows you are running:
DMI: GIGABYTE R120-T33/MT30-GS1, BIOS T45 06/14/2017

A working one on my end shows:
DMI: GIGABYTE R120-T30-V6-A/MT30-GS3-V6, BIOS F02 08/06/2019
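
To compare firmware across several machines, the DMI line can be parsed mechanically. A minimal sketch, assuming the line always has the `DMI: <product>, BIOS <version> <MM/DD/YYYY>` shape seen above; `parse_dmi` is a name invented here:

```python
import re
from datetime import datetime

# Assumes the shape of the two DMI lines quoted in this bug.
DMI_RE = re.compile(
    r"DMI: (?P<product>.+?), BIOS (?P<version>\S+) (?P<date>\d{2}/\d{2}/\d{4})"
)


def parse_dmi(line):
    """Extract product, BIOS version and BIOS date from a kernel DMI line."""
    match = DMI_RE.search(line)
    if match is None:
        return None
    return {
        "product": match.group("product"),
        "bios_version": match.group("version"),
        "bios_date": datetime.strptime(match.group("date"), "%m/%d/%Y").date(),
    }


failing = parse_dmi("DMI: GIGABYTE R120-T33/MT30-GS1, BIOS T45 06/14/2017")
working = parse_dmi("DMI: GIGABYTE R120-T30-V6-A/MT30-GS3-V6, BIOS F02 08/06/2019")
print(failing["bios_version"], failing["bios_date"])
print(working["bios_version"], working["bios_date"])
print("failing system has older BIOS:", failing["bios_date"] < working["bios_date"])
```

Note that the two boards here are different models, so the BIOS date is only a hint about which firmware is older, not a like-for-like comparison.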

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Paolo Pisati (p-pisati) wrote :

I couldn't find a CN88XX board, so I tried to deploy Hirsute on a ThunderX2 (CN99XX?), and it works fine there:

ubuntu@apollo:~$ uname -a
Linux apollo 5.11.0-14-generic #15-Ubuntu SMP Thu Apr 8 21:35:05 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux

FWIW:
...
[ 37.570426] efifb: probing for efifb
[ 37.574037] efifb: No BGRT, not showing boot graphics
[ 37.579078] efifb: framebuffer at 0x40000000, using 1876k, total 1875k
[ 37.585595] efifb: mode is 800x600x32, linelength=3200, pages=1
[ 37.591505] efifb: scrolling: redraw
[ 37.595069] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
[ 37.600699] fbcon: Deferring console take-over
[ 37.605136] fb0: EFI VGA frame buffer device
...

dann frazier (dannf) wrote :

I found a system running old firmware and I was able to reproduce this. I updated the firmware, and the issue went away. Of course, it seems like a regression that the kernel now *crashes* with the bad firmware (see bug 1755304 for my reasoning for it being bad), when it used to just emit errors.

Diego Mascialino (dmascialino) wrote :

Thanks dannf,

We updated the BIOS to F02 and it's working well now.

Changed in linux (Ubuntu):
status: Incomplete → Invalid