Constant Folding on 64-bit Subtraction causes SIGILL on linux-user glxgears ppc64le to x86_64 by way of generating bad shift instruction with c=-1

Bug #1877794 reported by Catherine A. Frederick
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Invalid
Undecided
Unassigned

Bug Description

Hello, I've been recently working on my own little branch of QEMU implementing the drm IOCTLs, when I discovered that glxgears seems to crash in GLXSwapBuffers(); with a SIGILL. I investigated this for about 2 weeks, manually trying to trace the call stack, only to find that we seemingly crash in a bad shift instruction. Originally intended to be an shr_i64 generated to an RLDICL, we end up with an all ones(-1) c value, which gets thrown to the macro for generating the MB, and replaces the instruction with mostly ones. This new instruction, FFFFFFE0 is invalid on ppc64le, and crashes in a host SIGILL in codegen_buffer. I tried to see if the output of translate.c had this bad instruction, but all I got were two (shr eax, cl) instructions, and upon creating a test program with shr (eax, cl) in it, nothing happened. Then figuring that there was nothing actually wrong with the instruction in the first place, I turned my eye to the optimizer, and completely disabled constant folding for arithmetic instructions. This seemed to actually resolve the issue, and then I slowly enabled constant folding again on various instructions only to find that enabling not on the shifts, but on subtraction seemed to cause the bug to reappear. I am bewildered and frankly at this point I'm not sure I have a chance in hell of figuring out what causes it, so I'm throwing it here.

tags: added: linux-user ppc64le
tags: added: tcg
tags: added: ppc64
removed: ppc64le
Revision history for this message
Laurent Vivier (laurent-vivier) wrote :

Which version of qemu and glxgears do you use?

Tested with qemu-4.2 and qemu-5.0 and debian/sid mesa-utils-8.4.0-1+b1 and it works fine.

Revision history for this message
Catherine A. Frederick (mptcultist) wrote :

I'm on 5.0-rc4 add a few patches implementing a subset of drm/amdgpu support, with mesa-utils 8.4.0-1. Important note I guess: I can't get the crash to trigger under llvmpipe/softrast, I can only get it on RadeonSI. I valgrind'd qemu only to not find anything on the host-side. I'm 99% sure this isn't a memory corruption bug from my patches, anyway, and it's not replicable on upstream qemu because DRM simply isn't functional. (Oh dear, I think I might be on my own until I can get this stuff upstreamed.)

Revision history for this message
Laurent Vivier (laurent-vivier) wrote :

QEMU has been selected to have a GSoC intern to work on new ioctl():
https://wiki.qemu.org/Google_Summer_of_Code_2020#Extend_linux-user_syscalls_and_ioctls

Perhaps he can help you by working on DRM ioctl()?
Perhaps you can send an RFC series to the QEMU mailing list?

Revision history for this message
Catherine A. Frederick (mptcultist) wrote :

I'm marking this invalid and moving on because it isn't replicable on upstream due to the lack of DRM support and because I'll probably just figure it out myself. (if anyone has somewhere better than tcg/README.md to learn TCG implementation details, I would appreciate it.

Changed in qemu:
status: New → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.