ARM cpu emulation regression on QEMU 4.2.0

Bug #1882123 reported by Hajin Jang
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
QEMU
Expired
Undecided
Unassigned

Bug Description

[*] Summary

Latest QEMU has an ARM CPU emulation regression.
Regression is reproducible by building any C# project with .NET Core SDK 3.1.300 on Debian 10 armhf guest OS.

Releases affected: QEMU 4.2.0, 5.0.0
Releases not affected: QEMU 4.1.0, QEMU 4.1.1

[*] Detail

.NET Core SDK 3.1 fails to run on Debian 10 emulated by qemu-system-arm.

I occasionally test my C# projects on the virtual armhf/arm64 system emulated by QEMU. MSBuild, a build engine of the .NET Core SDK, crashes on QEMU 4.2.0 or later. The crash only happens when MSBuild tries to do any JIT compiling (dotnet build / dotnet test).

I attached the MSBuild crash logs. MSBuild always crashes with SEHException, which means it tried to call C binary from .NET binary.

I think the ARM CPU emulation regression happened between QEMU 4.1.1 ~ 4.2.0. The issue affects QEMU 4.2.0 and 5.0.0. QEMU 4.1.0, 4.1.1, and real Raspberry Pi 2 are not affected by this issue, and .NET Core SDK works completely fine.

[*] Environment

[Host OS]
Distribution: Linux Mint 19.3 amd64
CPU: AMD Ryzen 5 3600
Kernel: Ubuntu 5.3.0-51-generic

[QEMU Guest OS]
Distribution: Debian 10 Buster armhf
Kernel: Debian 4.19.0-9-armmp-lpae
.NET Core SDK: 3.1.300

[Raspberry Pi 2]
Distribution: Raspberry Pi OS Buster armhf
Kernel: 4.19.118-v7+

[Tested C# Projects]
This is a list of C# projects I have tested on QEMU and RPI2.
- https://github.com/ied206/Joveler.DynLoader
- https://github.com/ied206/Joveler.Compression
- https://github.com/ied206/ManagedWimLib

[QEMU Launch Arguments]
qemu-system-arm \
    -smp 3 -M virt -m 4096 \
    -kernel vmlinuz-4.19.0-9-armmp-lpae \
    -initrd initrd.img-4.19.0-9-armmp-lpae \
    -append "root=/dev/vda2" \
    -drive if=none,file=debian_arm.qcow2,format=qcow2,id=hd \
    -device virtio-blk-device,drive=hd \
    -netdev user,id=mynet,hostfwd=tcp::<PORT>-:22 \
    -device virtio-net-device,netdev=mynet \
    -device virtio-rng-device

[QEMU Configure Arguments]
./configure --enable-spice --enable-gtk --enable-vnc-jpeg --enable-vnc-png --enable-avx2 --enable-libusb --enable-opengl --enable-virglrenderer --enable-kvm --enable-system --enable-modules --audio-drv-list=pa

Revision history for this message
Hajin Jang (joveler) wrote :
description: updated
Hajin Jang (joveler)
description: updated
description: updated
Revision history for this message
Hajin Jang (joveler) wrote :

I have tested 4.2.0 release candidate versions to pinpoint which commit caused the regression.

- 4.2.0-rc2: Same with 4.2.0, dotnet command crashes with SEHException.
- 4.2.0-rc0, 4.2.0-rc1: Launching dotnet command with any argument crashes with illegal hardware instruction message.

$ dotnet build
[1] 658 illegal hardware instruction dotnet build
$ dotnet --version
[1] 689 illegal hardware instruction dotnet --version

So the issue is affected by some commits pushed between 4.1.0 ~ 4.2.0-rc0 and 4.2.0-rc1 ~ 4.2.0-rc2 period.

Revision history for this message
Peter Maydell (pmaydell) wrote :

Could you try current head of git as well please, just in case it's a bug we've already fixed since 5.0? Thanks!

Revision history for this message
Hajin Jang (joveler) wrote :

I pinpointed the exact commits which affected the regression.

[QEMU 4.2.0-rc0 : illegal hardware instruction]
- Introduced in commit af28822
https://github.com/qemu/qemu/commit/af2882289951e58363d714afd16f80050685fa29
The commit affected LDREX/STREX translation, and broke dotnet command from .NET Core SDK.

[QEMU 4.2.0-rc2 : .NET SEHException]
- Introduced in commit 655b026
https://github.com/qemu/qemu/commit/655b02646dc175dc10666459b0a1e4346fc8d46a
The commit fixes STREX a bit. As a result, dotnet command is now executable except JIT compiling.

I also tested lastest HEAD from the master, and it still has the SEHException regression.
(Tested commit is 66234fee9c2d37bfbc523aa8d0ae5300a14cc10e)

Revision history for this message
Hajin Jang (joveler) wrote :

Dear Peter Maydell (@pmaydell), is there any update on this bug?

Revision history for this message
Peter Maydell (pmaydell) wrote :

No, sorry. There would be a lot of effort required to track down exactly what goes wrong with the MS JIT engine...

Peter Maydell (pmaydell)
Changed in qemu:
status: New → Confirmed
Revision history for this message
Thomas Huth (th-huth) wrote : Moved bug report

This is an automated cleanup. This bug report has been moved to QEMU's
new bug tracker on gitlab.com and thus gets marked as 'expired' now.
Please continue with the discussion here:

 https://gitlab.com/qemu-project/qemu/-/issues/271

Changed in qemu:
status: Confirmed → Expired
Revision history for this message
Qiang Liu (cyruscyliu) wrote : Re: [PATCH] hw/usb/core: fix inconsistent ep and pid (UBS_TOKEN_SETUP)

Hi all,

I'm sure this patch will prevent the assertion failure due to the
inconsistent ep and pid (UBS_TOKEN_SETUP) (
https://lists.gnu.org/archive/html/qemu-devel/2021-06/msg07179.html).

For UHCI (https://gitlab.com/qemu-project/qemu/-/issues/119) and OHCI (
https://gitlab.com/qemu-project/qemu/-/issues/303), this patch may be
right.

For EHCI, I found another way to trigger this assertion even with my patch
because ehci_get_pid() returns 0 if qtd->token.QTD_TOKEN_PID is not
valid[0]. In this case, the patch cannot capture it because pid is zero[2].
This case is specific to EHCI as far as I know. It seems we want to drop
the operation if ehci_get_pid() returns 0.

```static int ehci_get_pid(EHCIqtd *qtd)
{
    switch (get_field(qtd->token, QTD_TOKEN_PID)) {
    case 0:
        return USB_TOKEN_OUT;
    case 1:
        return USB_TOKEN_IN;
    case 2:
        return USB_TOKEN_SETUP;
    default:
        fprintf(stderr, "bad token\n"); //
---------------------------------------------> [0]
        return 0;
    }
}

static int ehci_execute(EHCIPacket *p, const char *action)
{
    p->pid = ehci_get_pid(&p->qtd); //
--------------------------------------------> [1]
    p->queue->last_pid = p->pid;
    endp = get_field(p->queue->qh.epchar, QH_EPCHAR_EP);
    ep = usb_ep_get(p->queue->dev, p->pid/*=0*/, endp); //
-----------------------> [2]
```

A qtest sequence is like
```
writel 0x1011b000 0x10124000
writel 0x10124004 0x358cbd80
writel 0x10124018 0x9e4bba36
writel 0x10124014 0x10139000
writel 0xfebd5020 0x1c4a5135
writel 0x10139008 0x3d5c4b84
clock_step 0xb17b0
writel 0xfebd5064 0x5f919911
clock_step 0xa9229
writel 0xfebd5064 0x5431e207
writel 0xfebd5038 0x1b2034b5
writel 0x1b2034a0 0x10100000
writel 0x10100000 0x10109000
writel 0x10109000 0x1011b000
clock_step 0xa9229
```

Best,
Qiang

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.