AARCH64 to ARMv7 mistranslation in TCG
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
QEMU |
Fix Released
|
Undecided
|
Unassigned |
Bug Description
The following guest code:
implements, in hand-optimized aarch64 assembly, the CopyMem() edk2 (EFI
Development Kit II) library function. (CopyMem() basically has memmove()
semantics, to provide a standard C analog here.) The relevant functions
are InternalMemCopy
When TCG translates this aarch64 code to x86_64, everything works fine.
When TCG translates this aarch64 code to ARMv7, the destination area of
the translated CopyMem() function becomes corrupted -- it differs from
the intended source contents. Namely, in every 4096 byte block, the
8-byte word at offset 4032 (0xFC0) is zeroed out in the destination,
instead of receiving the intended source value.
I'm attaching two hexdumps of the same destination area:
- "good.txt" is a hexdump of the destination area when CopyMem() was
translated to x86_64,
- "bad.txt" is a hexdump of the destination area when CopyMem() was
translated to ARMv7.
In order to assist with the analysis of this issue, I disassembled the
aarch64 binary with "objdump". Please find the listing in
"DxeCore.objdump", attached. The InternalMemCopy
hex offset 2b2ec. The __memcpy() function starts at hex offset 2b180.
And, I ran the guest on the ARMv7 host with "-d
in_asm,
"tcg.in_
The TBs that correspond to (parts of) the InternalMemCopy
__memcpy() functions are scattered over the TCG log file, but the offset
between the "nice" disassembly from "DxeCore.objdump", and the in-RAM
TBs in the TCG log, can be determined from the fact that there is a
single prfm instruction in the entire binary. The instruction's offset
is 0x2b180 in "DxeCore.objdump" -- at the beginning of the __memcpy()
function --, and its RAM address is 0x472d2180 in the TCG log. Thus the
difference (= the load address of DxeCore.efi) is 0x472a7000.
QEMU was built at commit a4f667b67149 ("Merge remote-tracking branch
'remotes/
The reproducer command line is (on an ARMv7 host):
qemu-
-display none \
-machine virt,accel=tcg \
-nodefaults \
-nographic \
-drive if=pflash,
-drive if=pflash,
-cpu cortex-a57 \
-chardev stdio,signal=
-mon chardev=
-serial chardev:char0
The apparent symptom is an assertion failure *in the guest*, such as
> ASSERT [DxeCore]
> /home/lacos/
> Length < _gPcd_FixedAtBu
but that is only a (distant) consequence of the CopyMem()
mistranslation, and resultant destination area corruption.
Originally reported in the following two mailing list messages:
- http://<email address hidden>
- http://<email address hidden>
tags: | added: arm |
tags: | added: testcase |
Changed in qemu: | |
status: | New → Fix Committed |
Possibly related: /lists. gnu.org/ archive/ html/qemu- devel/2019- 05/msg07362. html
[Qemu-devel] "accel/tcg: demacro cputlb" break qemu-system-x86_64
https:/
(qemu-system-x86_64 fails to boot 64-bit kernel under TCG accel when QEMU is built for i686)
Note to self: try to reprodouce the present issue with QEMU built at eed5664238ea^ -- this LP has originally been filed about the tree at a4f667b67149, and that commit contains eed5664238ea. So checking at eed5664238ea^ might reveal a difference.