fc1120a7f5f2d4b601003205c598077d3eb11ad2 causes a kernel panic in vfp_init on a clang built kernel

Bug #1844597 reported by Nathan Chancellor
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Fix Released
Undecided
Unassigned

Bug Description

Commit 4cdabee7d6d2 ("ARM: configs: aspeed_g5: Enable AST2600") [1] in the Linux kernel enabled CONFIG_VFP. When building this config with Clang, the resulting kernel does not boot after commit fc1120a7f5 ("target/arm: Implement NSACR gating of floating point") [2] (present since the 4.1.0 release).

The QEMU command:

qemu-system-arm -m 512m \
                -machine romulus-bmc \
                -no-reboot \
                -dtb out/arch/arm/boot/dts/aspeed-bmc-opp-romulus.dtb \
                -initrd rootfs.cpio \
                -display none \
                -serial mon:stdio \
                -kernel ${KBF}/arch/arm/boot/zImage

If it is needed, the rootfs we are using is provided at a link below [3].

Debugging with QEMU reveals that the kernel panics in vfp_init, specifically at the line:

vfpsid = fmrx(FPSID);

in arch/arm/vfp/vfpmodule.c because of an illegal instruction:

[ 0.058685] VFP support v0.3:
[ 0.059159] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
[ 0.059525] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-next-20190918-dirty #1
[ 0.059547] Hardware name: Generic DT based system
[ 0.059702] PC is at vfp_init+0x50/0x1f4
[ 0.059721] LR is at vfp_init+0x4c/0x1f4
[ 0.059738] pc : [<80b0383c>] lr : [<80b03838>] psr: 60000153
[ 0.059756] sp : 9e497ec0 ip : 00000020 fp : 9e497ed8
[ 0.059773] r10: 00000000 r9 : ffffe000 r8 : 80c06048
[ 0.059792] r7 : 00000000 r6 : 80c0caac r5 : 80c6c418 r4 : 80b037ec
[ 0.059811] r3 : 00000000 r2 : 339aa372 r1 : 00000000 r0 : 00000012
[ 0.059859] Flags: nZCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment none
[ 0.059883] Control: 00c5387d Table: 80004008 DAC: 00000051
[ 0.059997] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
[ 0.060048] Stack: (0x9e497ec0 to 0x9e498000)
[ 0.060205] 7ec0: 80b037ec 80b6bf0c 80b037ec ffffffff 00000000 00000000 9e497f48 80b01100
[ 0.060310] 7ee0: 00000000 9eeff9e0 80a85734 809eb9be 00000000 8014b7f4 9eeff9e0 80a85734
[ 0.060408] 7f00: 9e497f48 8014b7f4 000000a4 00000001 00000001 00000000 80b0133c 9e497f38
[ 0.060509] 7f20: 00000000 9eeff9d5 339aa372 80b6be80 80b6bf0c 00000000 00000000 00000000
[ 0.060606] 7f40: 00000000 00000000 9e497f70 80b01864 00000001 00000001 00000000 80b0133c
[ 0.060703] 7f60: 00000001 8085d268 00000000 00000000 9e497f80 80b01758 00000000 00000000
[ 0.060800] 7f80: 9e497f90 80b015e4 00000000 8085d268 9e497fa8 8085d27c 00000000 8085d268
[ 0.060897] 7fa0: 00000000 00000000 00000000 801010e8 00000000 00000000 00000000 00000000
[ 0.060993] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 0.061090] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[ 0.061625] [<80b0383c>] (vfp_init) from [<80b01100>] (do_one_initcall+0xa8/0x1e0)
[ 0.061722] [<80b01100>] (do_one_initcall) from [<80b01864>] (do_initcall_level+0xfc/0x12c)
[ 0.061742] [<80b01864>] (do_initcall_level) from [<80b01758>] (do_basic_setup+0x2c/0x3c)
[ 0.061759] [<80b01758>] (do_basic_setup) from [<80b015e4>] (kernel_init_freeable+0x68/0x104)
[ 0.061777] [<80b015e4>] (kernel_init_freeable) from [<8085d27c>] (kernel_init+0x14/0x26c)
[ 0.061798] [<8085d27c>] (kernel_init) from [<801010e8>] (ret_from_fork+0x14/0x2c)
[ 0.061835] Exception stack(0x9e497fb0 to 0x9e497ff8)
[ 0.061896] 7fa0: 00000000 00000000 00000000 00000000
[ 0.061998] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 0.062080] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 0.062263] Code: e5860000 e59f0174 ebd9d8fc e59f5170 (eef04a10)
[ 0.062679] ---[ end trace 2d338c91e4e74562 ]---

Before fc1120a7f5:

[ 0.069418] VFP support v0.3: implementor 41 architecture 1 part 20 variant b rev 5

Should you need to reproduce this locally:

* clang 9.0.0 or later is needed to build this config. If you do not have easy access to such a build, we have a clang build script available [4] that can help with this:

% ./build-llvm.py --branch llvmorg-9.0.0-rc6 \
                  --build-stage1-only \
                  --projects clang \
                  --targets ARM

* Because of an unrelated build issue, linux-next needs to be used (or the singular patch that resolves it needs to be cherry-picked on top of 4cdabee7d6d2 [5]). The kernel make command used:

% make -j$(nproc) -s \
       ARCH=arm \
       CC=clang \
       CROSS_COMPILE=arm-linux-gnueabi- \
       O=out \
       distclean aspeed_g5_defconfig all

[1]: https://git.kernel.org/linus/4cdabee7d6d2e439fea726a101e448c4ca6837f4
[2]: https://git.qemu.org/?p=qemu.git;a=commit;h=fc1120a7f5f2d4b601003205c598077d3eb11ad2
[3]: https://github.com/ClangBuiltLinux/continuous-integration/blob/800d84bf8c55ee04c50ed4c78144a96d889a91c5/images/arm/rootfs.cpio
[4]: https://github.com/ClangBuiltLinux/tc-build
[5]: http://git.armlinux.org.uk/cgit/linux-arm.git/commit/?id=7b3948597372e5a6b314208ac320362c204b7f0f

Revision history for this message
Peter Maydell (pmaydell) wrote :

Hi; could you attach all the binaries needed to repro this (zImage, dtb, etc) to the bug please?

Revision history for this message
Nathan Chancellor (nathanchance) wrote :

Ugh, sorry, I forget that I can actually upload files to these platforms :(

Done, let me know if you need anything else!

Revision history for this message
Nathan Chancellor (nathanchance) wrote :
Revision history for this message
Peter Maydell (pmaydell) wrote :

Thanks. I've diagnosed the problem -- when we boot a kernel directly into non-secure state on an AArch32 CPU which implements EL3, we need to set the NSACR.{CP11,CP10} bits so that Non-Secure is allowed to use the FPU, but we weren't doing that. The omission didn't matter until commit fc1120a7f5 because before that point we were ignoring the NSACR trap bits entirely... Patch coming up shortly.

Revision history for this message
Peter Maydell (pmaydell) wrote :

Should be fixed by:
https://<email address hidden>/
(which allows me to boot the kernel you attached at least as far as "didn't find a root filesystem").

Changed in qemu:
status: New → In Progress
Revision history for this message
Nathan Chancellor (nathanchance) wrote :

I pulled down the fix, built locally, and can confirm that this resolves the issue. Thank you for the quick patch!

Revision history for this message
Peter Maydell (pmaydell) wrote :

Fixed in master in commit ece628fcf69cbbd, which will be in the 4.2 release (also tagged for stable so will end up on the 4.1.x branch at some point).

Changed in qemu:
status: In Progress → Fix Committed
Thomas Huth (th-huth)
Changed in qemu:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.