x86: singlestepping through SYSCALL instruction causes exception in kernelspace

Bug #1645355 reported by Rudolf Marek
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
QEMU
Fix Released
Undecided
Unassigned

Bug Description

Hi,

The bug was originally reported [1] and [2] here. There is a problem inside QEMU with singlestepping from userspace until SYSCALL instruction is reached. The OS has in FMASK TF bit set, therefore there should be no singlestepping exception when transitioning to kernelmode. But, inside QEMU there is (TF is clear seems FMASK is applied). See below for further details.

The reproducer is available at [2].

Here is the original text with some minor clarifications:

It seems that there is something wrong with QEMU with respect to handle the singlestepping and AMD64 syscall instruction.

The AMD "syscall" instruction will clear defined flag in the FMASK MSR. Normally the TF flag is set there, so the first instruction when kernel is entered after syscall won't cause single step exception in the kernel.

The observed scenario is a unhandled singlestep fault in the kernel or host reboot or QEMU crash.

The possible way how to reproduce it is to single step through any function (in userspace) which does "syscall" instruction. After syscall is entered QEMU will trigger singlestepping exception in the kernel despite that the TF is set in FMASK MSR. Real HW behaves correctly and does not trigger this exception.

What is interesting is that I was not able to trigger it if I just enabled TF and did the syscall instruction, perhaps for this bug is somewhat important to have TF set for previous few instruction.

I have stumbled to this problem while working with our custom OS. However after some googling I found out that the NetBSD guys (CCed) are having very similar problem and I asked them to prepare a ISO image where the problem ends with QEMU SIGSEGV or host reboot.

Thanks
Rudolf

[1] https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg02289.html
[2] http://gnats.netbsd.org/49603

Revision history for this message
Andreas Gustafsson (gson) wrote :

I believe this is fixed in the qemu git mainline (but not yet in any release)
by the following commits:

 commit c52ab08aee6f7d4717fc6b517174043126bd302f
 Author: Doug Evans <email address hidden>
 Date: Tue Dec 6 23:06:30 2016 +0000

     target-i386: Fix eflags.TF/#DB handling of syscall/sysret insns

     The syscall and sysret instructions behave a bit differently:
     TF is checked after the instruction completes.
     This allows the o/s to disable #DB at a syscall by adding TF to FMASK.
     And then when the sysret is executed the #DB is taken "as if" the
     syscall insn just completed.

 commit 410e98146ffde201ab4c778823ac8beaa74c4c3f
 Author: Doug Evans <email address hidden>
 Date: Sat Dec 24 20:29:33 2016 +0000

     target/i386: Fix bad patch application to translate.c

     In commit c52ab08aee6f7d4717fc6b517174043126bd302f,
     the patch snippet for the "syscall" insn got applied to "iret".

Revision history for this message
Rudolf Marek (r-marek) wrote :

Hi Andreas, man thanks for your reply. I can confirm that applying the two patches fixes my problem in the older version of the QEMU.

Changed in qemu:
status: New → Fix Committed
Revision history for this message
Thomas Huth (th-huth) wrote :

The two patches mentioned in comment #1 have been released with v2.9.0, so setting status to "Fix released" now.

Changed in qemu:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.