When trying to run wine (latest devel from git) regression tests for ntdll in a statically linked qemu-i386 (commit 392b9a74b9b621c52d05e37bc6f41f1bbab5c6f8) on arm32 (raspberry pi 4) in a debian buster chroot, the exception tests fail at the first test with an infinite exception loop.
WINEDEBUG=+seh wine wine/dlls/ntdll/tests/ntdll_test.exe exception
Working x86_64 system running 32-bit code
0024:warn:seh:dispatch_exception EXCEPTION_ACCESS_VIOLATION exception (code=c0000005) raised
0024:trace:seh:dispatch_exception eax=00000000 ebx=7ffc2000 ecx=004e0ef4 edx=003c0004 esi=003c0000 edi=00000000
0024:trace:seh:dispatch_exception ebp=0085fa08 esp=0085f9ac cs=0023 ds=002b es=002b fs=0063 gs=006b flags=00010246
0024:trace:seh:call_vectored_handlers calling handler at 7B00B460 code=c0000005 flags=0
0024:trace:seh:call_vectored_handlers handler at 7B00B460 returned 0
0024:trace:seh:call_stack_handlers calling handler at 004178B0 code=c0000005 flags=0
0024:trace:seh:call_stack_handlers handler at 004178B0 returned 0
0024:trace:seh:dispatch_exception call_stack_handlers continuing
0024:trace:seh:NtGetContextThread 0xfffffffe: dr0=42424240 dr1=00000000 dr2=126bb070 dr3=0badbad0 dr6=00000000 dr7=ffff0115
Non-working qemu
0024:warn:seh:dispatch_exception EXCEPTION_ACCESS_VIOLATION exception (code=c0000005) raised
0024:trace:seh:dispatch_exception eax=00000000 ebx=3ffe2000 ecx=004e0ef4 edx=003c0004 esi=003c0000 edi=00000000
0024:trace:seh:dispatch_exception ebp=0085fa08 esp=0085f9ac cs=0023 ds=002b es=002b fs=003b gs=0033 flags=00000246
0024:trace:seh:call_vectored_handlers calling handler at 7B00B460 code=c0000005 flags=0
0024:trace:seh:call_vectored_handlers handler at 7B00B460 returned 0
0024:trace:seh:call_stack_handlers calling handler at 004178B0 code=c0000005 flags=0
0024:trace:seh:call_stack_handlers handler at 004178B0 returned 0
0024:trace:seh:dispatch_exception call_stack_handlers continuing
0024:trace:seh:dispatch_exception call_stack_handlers ret status = 0
0024:trace:seh:dispatch_exception code=0 flags=1 addr=7BC2389C ip=7bc2389c tid=0024
The non-working verion is never managing to set the CPU context using NtContinue/SetContextThread back to the correct running thread stack and IP. It executes as if the context restore just returns to the function that called NtContinue() (dispatch_exception(), not the function that raised the exception or one of its parent exception handlers).
It looks like NtSetContextThread(), specifically the asm function set_full_cpu_context() is being handled incorrectly.
wine code below. note interesting use of iret with no previous interrupt call. The exception handler is called with a jmp.
/***********************************************************************
* set_full_cpu_context
*
* Set the new CPU context.
*/
extern void set_full_cpu_context( const CONTEXT *context );
__ASM_GLOBAL_FUNC( set_full_cpu_context,
"movl $0,%fs:0x1f8\n\t" /* x86_thread_data()->syscall_frame = NULL */
"movl 4(%esp),%ecx\n\t"
"movw 0x8c(%ecx),%gs\n\t" /* SegGs */
"movw 0x90(%ecx),%fs\n\t" /* SegFs */
"movw 0x94(%ecx),%es\n\t" /* SegEs */
"movl 0x9c(%ecx),%edi\n\t" /* Edi */
"movl 0xa0(%ecx),%esi\n\t" /* Esi */
"movl 0xa4(%ecx),%ebx\n\t" /* Ebx */
"movl 0xb4(%ecx),%ebp\n\t" /* Ebp */
"movw %ss,%ax\n\t"
"cmpw 0xc8(%ecx),%ax\n\t" /* SegSs */
"jne 1f\n\t"
/* As soon as we have switched stacks the context structure could
* be invalid (when signal handlers are executed for example). Copy
* values on the target stack before changing ESP. */
"movl 0xc4(%ecx),%eax\n\t" /* Esp */
"leal -4*4(%eax),%eax\n\t"
"movl 0xc0(%ecx),%edx\n\t" /* EFlags */
".byte 0x36\n\t"
"movl %edx,3*4(%eax)\n\t"
"movl 0xbc(%ecx),%edx\n\t" /* SegCs */
".byte 0x36\n\t"
"movl %edx,2*4(%eax)\n\t"
"movl 0xb8(%ecx),%edx\n\t" /* Eip */
".byte 0x36\n\t"
"movl %edx,1*4(%eax)\n\t"
"movl 0xb0(%ecx),%edx\n\t" /* Eax */
".byte 0x36\n\t"
"movl %edx,0*4(%eax)\n\t"
"pushl 0x98(%ecx)\n\t" /* SegDs */
"movl 0xa8(%ecx),%edx\n\t" /* Edx */
"movl 0xac(%ecx),%ecx\n\t" /* Ecx */
"popl %ds\n\t"
"movl %eax,%esp\n\t"
"popl %eax\n\t"
"iret\n"
/* Restore the context when the stack segment changes. We can't use
* the same code as above because we do not know if the stack segment
* is 16 or 32 bit, and 'movl' will throw an exception when we try to
* access memory above the limit. */
"1:\n\t"
"movl 0xa8(%ecx),%edx\n\t" /* Edx */
"movl 0xb0(%ecx),%eax\n\t" /* Eax */
"movw 0xc8(%ecx),%ss\n\t" /* SegSs */
"movl 0xc4(%ecx),%esp\n\t" /* Esp */
"pushl 0xc0(%ecx)\n\t" /* EFlags */
"pushl 0xbc(%ecx)\n\t" /* SegCs */
"pushl 0xb8(%ecx)\n\t" /* Eip */
"pushl 0x98(%ecx)\n\t" /* SegDs */
"movl 0xac(%ecx),%ecx\n\t" /* Ecx */
"popl %ds\n\t"
"iret" )
Be aware that most of the regression test failures are caused by lack of ptrace() support.
The wine traces above show one of these cases. I will provide traces of an actual relevant failure.
However the failure to restart the correct thread is in some way related to the use of iret in set_full_ cpu_context( ).