qemu-system-riscv64 sbi_trap_error powering down VM riscv64

Bug #1905067 reported by Sean Feole
This bug affects 1 person
Affects        Status        Importance  Assigned to  Milestone
qemu (Ubuntu)  Fix Released  Undecided   Unassigned
Focal          Incomplete    Undecided   Unassigned
Groovy         Won't Fix     Undecided   Unassigned
Hirsute        Fix Released  Undecided   Unassigned

Bug Description

Host OS: Focal, 20.04(5.4.0-52-generic)
QEMU: 1:4.2-3ubuntu6.8
OpenSBI: 0.8.1

Affected series: Focal and Groovy (both 5.4 and 5.8 kernels)

Powering off a Groovy VM (GNU/Linux 5.8.0-7-generic riscv64) triggers an sbi_trap_error, which halts the VM. I have not tried an older version of OpenSBI.

root@riscv64-groovy:~# poweroff -f
Powering off.
[ 134.931728] reboot: Power down
sbi_trap_error: hart0: trap handler failed (error -2)
sbi_trap_error: hart0: mcause=0x0000000000000007 mtval=0x0000000000100000
sbi_trap_error: hart0: mepc=0x000000008000d4b0 mstatus=0x0000000000001822
sbi_trap_error: hart0: ra=0x00000000800098de sp=0x0000000080023c78
sbi_trap_error: hart0: gp=0xffffffe001722418 tp=0xffffffe1ed138b80
sbi_trap_error: hart0: s0=0x0000000080023c88 s1=0x0000000000000040
sbi_trap_error: hart0: a0=0x0000000000000000 a1=0x0000000080003f66
sbi_trap_error: hart0: a2=0x0000000080003f66 a3=0x0000000080003f66
sbi_trap_error: hart0: a4=0x0000000000100000 a5=0x0000000000005555
sbi_trap_error: hart0: a6=0x0000000000003f66 a7=0x00000000000110e8
sbi_trap_error: hart0: s2=0x0000000000000000 s3=0x0000000080024000
sbi_trap_error: hart0: s4=0x0000000000000000 s5=0x0000000000000000
sbi_trap_error: hart0: s6=0x0000000000000001 s7=0x0000000000000000
sbi_trap_error: hart0: s8=0x0000000000000000 s9=0x0000000000000000
sbi_trap_error: hart0: s10=0x0000000000000000 s11=0x0000000000000008
sbi_trap_error: hart0: t0=0x0000000000000000 t1=0x0000000000000000
sbi_trap_error: hart0: t2=0x0000000000000000 t3=0x0000000000000000
sbi_trap_error: hart0: t4=0x0000000000000000 t5=0x0000000000000000
sbi_trap_error: hart0: t6=0x0000000000000000

root@riscv64-groovy:~# sudo poweroff
         Stopping Session 1 of user root.
[ OK ] Removed slice system-modprobe.slice.
[ OK ] Stopped target Graphical Interface.
[ OK ] Stopped target Multi-User System.
[ OK ] Stopped target Login Prompts.
[ OK ] Stopped target Host and Network Name Lookups.
[ OK ] Stopped target Timers.
[ OK ] Stopped Daily apt upgrade and clean activities.
[ OK ] Stopped Daily apt download activities.
[ OK ] Stopped Periodic ext4 Onli…ata Check for All Filesystems.
[ OK ] Stopped Discard unused blocks once a week.
[ OK ] Stopped Daily rotation of log files.
[ OK ] Stopped Message of the Day.
[ OK ] Stopped Daily Cleanup of Temporary Directories.
[ OK ] Stopped target System Time Synchronized.
[ OK ] Stopped target System Time Set.
[ OK ] Closed Load/Save RF Kill Switch Status /dev/rfkill Watch.
         Stopping Regular background program processing daemon...
         Stopping Getty on tty1...
         Stopping Dispatcher daemon for systemd-networkd...
         Stopping System Logging Service...
         Stopping Serial Getty on ttyS0...
         Stopping Load/Save Random Seed...
[ OK ] Stopped Regular background program processing daemon.
[ OK ] Stopped Dispatcher daemon for systemd-networkd.
[ OK ] Stopped System Logging Service.
[ OK ] Stopped Serial Getty on ttyS0.
[ OK ] Stopped Getty on tty1.
[ OK ] Stopped Load/Save Random Seed.
[ OK ] Stopped Session 1 of user root.
[ OK ] Removed slice system-getty.slice.
[ OK ] Removed slice system-serial\x2dgetty.slice.
         Stopping User Login Management...
         Stopping User Manager for UID 0...
[ OK ] Stopped User Login Management.
[ OK ] Stopped User Manager for UID 0.
         Stopping User Runtime Directory /run/user/0...
[ OK ] Unmounted /run/user/0.
[ OK ] Stopped User Runtime Directory /run/user/0.
[ OK ] Removed slice User Slice of UID 0.
[ OK ] Reached target Unmount All Filesystems.
         Stopping D-Bus System Message Bus...
         Stopping Permit User Sessions...
[ OK ] Stopped D-Bus System Message Bus.
[ OK ] Stopped Permit User Sessions.
[ OK ] Stopped target Basic System.
[ OK ] Stopped target Network.
[ OK ] Stopped target Paths.
[ OK ] Stopped target Remote File Systems.
[ OK ] Stopped target Slices.
[ OK ] Removed slice User and Session Slice.
[ OK ] Stopped target Sockets.
[ OK ] Closed D-Bus System Message Bus Socket.
[ OK ] Stopped target System Initialization.
[ OK ] Stopped target Local Encrypted Volumes.
[ OK ] Stopped Dispatch Password …ts to Console Directory Watch.
[ OK ] Stopped Forward Password R…uests to Wall Directory Watch.
[ OK ] Stopped target Swap.
[ OK ] Closed Syslog Socket.
         Stopping Network Name Resolution...
         Stopping Network Time Synchronization...
         Stopping Update UTMP about System Boot/Shutdown...
[ OK ] Stopped Network Time Synchronization.
[ OK ] Stopped Network Name Resolution.
         Stopping Network Service...
[ OK ] Stopped Network Service.
[ OK ] Stopped Update UTMP about System Boot/Shutdown.
[ OK ] Stopped Apply Kernel Variables.
[ OK ] Stopped Load Kernel Modules.
[ OK ] Stopped Create Volatile Files and Directories.
[ OK ] Stopped target Local File Systems.
[ OK ] Stopped target Local File Systems (Pre).
[ OK ] Stopped Create Static Device Nodes in /dev.
[ OK ] Stopped Create System Users.
[ OK ] Stopped Remount Root and Kernel File Systems.
[ OK ] Reached target Shutdown.
[ OK ] Reached target Final Step.
[ OK ] Finished Power-Off.
[ OK ] Reached target Power-Off.
[ 77.560831] reboot: Power down
sbi_trap_error: hart0: trap handler failed (error -2)
sbi_trap_error: hart0: mcause=0x0000000000000007 mtval=0x0000000000100000
sbi_trap_error: hart0: mepc=0x000000008000d4b0 mstatus=0x0000000000001822
sbi_trap_error: hart0: ra=0x00000000800098de sp=0x0000000080023c78
sbi_trap_error: hart0: gp=0xffffffe001722418 tp=0xffffffe1f5bf5080
sbi_trap_error: hart0: s0=0x0000000080023c88 s1=0x0000000000000040
sbi_trap_error: hart0: a0=0x0000000000000000 a1=0x0000000080003f66
sbi_trap_error: hart0: a2=0x0000000080003f66 a3=0x0000000080003f66
sbi_trap_error: hart0: a4=0x0000000000100000 a5=0x0000000000005555
sbi_trap_error: hart0: a6=0x0000000000003f66 a7=0x00000000000110e8
sbi_trap_error: hart0: s2=0x0000000000000000 s3=0x0000000080024000
sbi_trap_error: hart0: s4=0x0000000000000000 s5=0x0000000000000000
sbi_trap_error: hart0: s6=0x0000000000000001 s7=0x0000000000000000
sbi_trap_error: hart0: s8=0x0000000000000000 s9=0x0000000000000000
sbi_trap_error: hart0: s10=0x0000000000000000 s11=0x0000000000000008
sbi_trap_error: hart0: t0=0x0000000000000000 t1=0x0000000000000000
sbi_trap_error: hart0: t2=0x0000000000000000 t3=0x0000000000000000
sbi_trap_error: hart0: t4=0x0000000000000000 t5=0x0000000000000000
sbi_trap_error: hart0: t6=0x0000000000000000
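
The trap dump above can be read a little more easily by decoding mcause. The following is a minimal sketch, not part of OpenSBI or QEMU: the helper names `parse_trap_line` and `decode_mcause` are made up for illustration, and the exception table follows the standard mcause encoding from the RISC-V privileged specification. For the dump in this report it gives a store/AMO access fault at address 0x100000, which, as far as I can tell, is where the SiFive test (poweroff) device sits on QEMU's virt machine.

```python
import re

# Exception codes for mcause when the interrupt bit (MSB) is clear,
# following the RISC-V privileged specification.
EXCEPTION_CODES = {
    0: "Instruction address misaligned",
    1: "Instruction access fault",
    2: "Illegal instruction",
    3: "Breakpoint",
    4: "Load address misaligned",
    5: "Load access fault",
    6: "Store/AMO address misaligned",
    7: "Store/AMO access fault",
    8: "Environment call from U-mode",
    9: "Environment call from S-mode",
    11: "Environment call from M-mode",
    12: "Instruction page fault",
    13: "Load page fault",
    15: "Store/AMO page fault",
}

XLEN = 64  # riscv64


def decode_mcause(mcause: int) -> str:
    """Return a human-readable description of an mcause value."""
    is_interrupt = bool(mcause >> (XLEN - 1))
    code = mcause & ((1 << (XLEN - 1)) - 1)
    if is_interrupt:
        return f"interrupt {code}"
    return EXCEPTION_CODES.get(code, f"reserved/unknown exception {code}")


def parse_trap_line(line: str) -> dict:
    """Parse 'sbi_trap_error: hart0: name=0x... name=0x...' into {name: int}."""
    pairs = re.findall(r"(\w+)=0x([0-9a-fA-F]+)", line)
    return {name: int(value, 16) for name, value in pairs}


# Line copied from the dump in this report.
line = "sbi_trap_error: hart0: mcause=0x0000000000000007 mtval=0x0000000000100000"
regs = parse_trap_line(line)
print(decode_mcause(regs["mcause"]))  # Store/AMO access fault
print(hex(regs["mtval"]))             # 0x100000
```

Both dumps in this report carry the same mcause, mtval and mepc, so the two poweroff paths appear to fail at the same store.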

lotuspsychje (lotuspsychje) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. Please execute the following command only once, as it will automatically gather debugging information, in a terminal:
apport-collect 1905067

When reporting bugs in the future please use apport by using 'ubuntu-bug' and the name of the package affected. You can learn more about this functionality at https://wiki.ubuntu.com/ReportingBugs.

Christian Ehrhardt  (paelzer) wrote :

Hi Sean,
The last time I ran riscv64 in VMs I still had to use a lot of hand-collected bits from [1]. Could you outline exactly which bits you used and where you got them?

Furthermore, I made qemu 5.1 [2] available in 21.04 a few days ago. Could you give it a try to see whether a fix already exists in this more recent version?

[1]: https://people.ubuntu.com/~wgrant/riscv64/
[2]: https://launchpad.net/ubuntu/+source/qemu/1:5.1+dfsg-4ubuntu1

Changed in qemu (Ubuntu):
status: New → Incomplete
Christian Ehrhardt  (paelzer) wrote :

This looks a lot like
https://mail.gnu.org/archive/html/qemu-devel/2020-09/msg00212.html

You'd think the offending commit mentioned there is only in 5.1 and not in earlier versions.
But it was backported to Groovy as part of
  Bug-Debian: https://bugs.debian.org/964793
  Bug-Debian: https://bugs.debian.org/964247
  https://bugs.launchpad.net/qemu/+bug/1886318
It already had one follow-on fix in
  d/p/riscv-allow-64-bit-access-to-SiFive-CLINT.patch

Focal has that as well via CVE fixes:
  d/p/ubuntu/hw-riscv-Allow-64-bit-access-to-SiFive-CLINT.patch
  debian/patches/ubuntu/CVE-2020-13754-1.patch

Chances are we need this later follow-on fix as well.

I wanted to check for stable patches for 4.2 in Focal (<email address hidden>) anyway, but there is no 4.2.2 yet. This would be one of them, but one step at a time.

I guess we need to backport https://git.qemu.org/?p=qemu.git;a=commit;h=ab3d207fe89bc0c63739db19e177af49179aa457

@Sean - if I built you a qemu with that fix, could you test it? If so, would you need qemu for both F & G?
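
A rough mental model of why a stricter access-size check could turn the poweroff write into a fault is sketched below. This is not QEMU's code: `RegionAccessLimits`, `access_is_valid`, and the 2-byte and 4-byte values are assumptions chosen purely for illustration, and the actual change is whatever the commit linked above contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionAccessLimits:
    """Hypothetical stand-in for a device's valid access-size window, in bytes."""
    min_access_size: int
    max_access_size: int


def access_is_valid(limits: RegionAccessLimits, size: int) -> bool:
    """Strict check: reject any access whose size falls outside the window."""
    return limits.min_access_size <= size <= limits.max_access_size


# Assumed values for illustration only: a device that accepts only 32-bit
# accesses, and the same device after its window is widened to 16-bit.
before_fix = RegionAccessLimits(min_access_size=4, max_access_size=4)
after_fix = RegionAccessLimits(min_access_size=2, max_access_size=4)

WRITE_SIZE = 2  # a 16-bit store issued by the firmware (assumption)

for label, limits in (("before", before_fix), ("after", after_fix)):
    ok = access_is_valid(limits, WRITE_SIZE)
    verdict = "accepted" if ok else "rejected (access fault)"
    print(f"{label} fix: 16-bit write {verdict}")
```

Under these assumptions, a firmware that issues a narrower store than the device declares as valid would see exactly the kind of store access fault shown in the trap dump, and widening the window would let the write through.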

Sean Feole (sfeole) wrote :

@Christian thanks for helping out. I could test both the F and G versions of QEMU for you, but the one I need on our production system is Focal. Let me know where to grab it from once you have it built.

Christian Ehrhardt  (paelzer) wrote :

The patch is in 5.2, so it would be needed in >=Focal - I marked the bug tasks accordingly.
But for now let us test it in one place (Focal, being the LTS and the furthest back) and, if it is confirmed to work, then prep the dev fix and SRUs.

Could you try the build 4.2-3ubuntu6.10~ppa1 at [1] and check whether it resolves the issue for you in Focal?

Also we'll really need these clear "where to get artifacts and how exactly to invoke" steps for the SRU process, so please add these as well.

[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4351/+packages

Changed in qemu (Ubuntu Hirsute):
status: Incomplete → Triaged
Changed in qemu (Ubuntu Groovy):
status: New → Triaged
Changed in qemu (Ubuntu Focal):
status: New → Triaged
Christian Ehrhardt  (paelzer) wrote :

@Sean - I guess you were on a Thanksgiving break. Once you are back and have had a chance to test, please let me know.

Sean Feole (sfeole) wrote :

I have some cycles to test this, will report back hopefully by EOD today.

Christian Ehrhardt  (paelzer) wrote :

Hmm, activity here seems to have died.
The PPA was removed by the end-of-cycle cleanup and Hirsute was released with 5.2.

Maybe by now all kernels have got 5d971f9e67 ("memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"") and so it no longer triggers?

Therefore I'm setting 5.2 to Fixed (as it has the change) and F/G to Incomplete, as this might no longer be important enough to SRU.

Please speak up if you think this still is an issue, then I'll provide a new PPA for you to have a look.

Changed in qemu (Ubuntu Hirsute):
status: Triaged → Fix Released
Changed in qemu (Ubuntu Groovy):
status: Triaged → Incomplete
Changed in qemu (Ubuntu Focal):
status: Triaged → Incomplete
Changed in qemu (Ubuntu):
status: Triaged → Fix Released
Brian Murray (brian-murray) wrote :

The Groovy Gorilla has reached end of life, so this bug will not be fixed for that release.

Changed in qemu (Ubuntu Groovy):
status: Incomplete → Won't Fix