First breakpoint at AVX instruction with memory operand causes SIGSEGV when trying to continue execution

Bug #1850258 reported by Pauli
This bug affects 1 person
Affects: gdb (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone:

Bug Description

I noticed seemingly random SIGSEGVs in an application when trying to continue execution after the first breakpoint. I have now narrowed the issue down to a SIMD instruction with a memory operand being the first breakpoint location. I haven't managed to figure out why the SIGSEGV is delivered to the debugged application.

It is important that the first breakpoint is exactly at a problematic instruction. If I first break on a different instruction, later breakpoints won't reproduce the crash.

I haven't tested if this is a hardware specific issue.

I managed to write a simple test case which reproduces the crash when the breakpoint is set. I attached test.cc, which includes compilation and testing instructions. test.cc is supposed to generate a simple main function like:

Dump of assembler code for function main():
=> 0x0000555555554520 <+0>: vmovdqa 0x1af8(%rip),%xmm0 # 0x555555556020 <foo>
   0x0000555555554528 <+8>: vmovd %xmm0,%eax
   0x000055555555452c <+12>: retq

I set the breakpoint with:
b main

Then either continuing or stepping delivers a SIGSEGV to the debugged application.

This was already happening with disco (Ubuntu 19.04). I have only now figured out enough details to make a simple test case that is worth a bug report.

ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: gdb 8.3-0ubuntu1
ProcVersionSignature: Ubuntu 5.3.0-19.20-generic 5.3.1
Uname: Linux 5.3.0-19-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu8
Architecture: amd64
CurrentDesktop: GNOME
Date: Tue Oct 29 09:44:52 2019
InstallationDate: Installed on 2037-12-25 (-6632 days ago)
InstallationMedia: Lubuntu 16.04 LTS "Xenial Xerus" - Release amd64 (20160420.1)
SourcePackage: gdb
UpgradeStatus: Upgraded to eoan on 2019-10-27 (1 days ago)

Pauli (paniemin) wrote :
description: updated
Pauli (paniemin) wrote :

Actually, it seems the problem is restricted to a few instructions with a memory operand. But if a breakpoint is set on one of them at any point of execution, the application crashes.

That means that setting a breakpoint earlier and single-stepping onto a problematic breakpoint also crashes the application. But single-stepping over a problematic instruction that has no breakpoint on it doesn't crash. That added to my earlier confusion about why the crashes looked so random.

Pauli (paniemin) wrote :

After a bit more debugging I see that the signal comes from the kernel (si_code=0x80), but it claims a null pointer reference. I don't understand how it could be a null pointer. I would need a better understanding of what happens in gdb and the kernel to trigger the SIGSEGV.
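
For anyone decoding the value above: on Linux, si_code 0x80 is SI_KERNEL. A minimal sketch of mapping the common SIGSEGV codes to names follows. The helper name describe_segv_code is hypothetical and not part of test2.cc. As far as I understand, with SI_KERNEL the si_addr field need not hold a real fault address, which may be why it reads as (nil) here, but I have not confirmed that for this case.

```cpp
#include <csignal>
#include <string>

// Hypothetical helper (not from the attached test files): name the
// si_code values most often seen with SIGSEGV on Linux.
std::string describe_segv_code(int si_code) {
    switch (si_code) {
        case SEGV_MAPERR: return "SEGV_MAPERR: address not mapped to object";
        case SEGV_ACCERR: return "SEGV_ACCERR: invalid permissions for mapped object";
        case SI_KERNEL:   return "SI_KERNEL: sent by the kernel";
        case SI_USER:     return "SI_USER: sent by kill() or raise()";
        default:          return "unrecognized si_code " + std::to_string(si_code);
    }
}
```

Calling describe_segv_code(0x80) gives the SI_KERNEL description, which matches the reading that the kernel raised the signal itself rather than reporting an ordinary page fault.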

I also found a workaround: keep the breakpoint, but disable it after stopping on the problematic instruction. If the breakpoint isn't active, execution continues without issues. The only problem is that I don't know the full set of instructions which trigger this. I have had issues with some other memory-referencing VEX-encoded instructions, but I also have examples of memory-referencing instructions which don't trigger the bug.

I attached an updated test2.cc, which now has a signal handler that dumps siginfo.

Reading symbols from ./test2...
(gdb) b main
Breakpoint 1 at 0x650: file test2.cc, line 41.
(gdb) r
Starting program: /home/coren/project/test2

Breakpoint 1, main () at test2.cc:41
41 asm("\tvmovdqa %1, %0\n" : "=x" (bar) : "xm" (foo));
(gdb) disassemble
Dump of assembler code for function main():
=> 0x0000555555554650 <+0>: vmovdqa 0x19d8(%rip),%xmm0 # 0x555555556030 <foo>
   0x0000555555554658 <+8>: vmovd %xmm0,%eax
   0x000055555555465c <+12>: retq
End of assembler dump.
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
main () at test2.cc:41
41 asm("\tvmovdqa %1, %0\n" : "=x" (bar) : "xm" (foo));
(gdb)
Continuing.
sig: 11, ctx: 0x7fffffffd740
si_signo: 11, si_erron: 0, si_code: 128
si_addr: (nil), si_addr_lsb: 0, si_pid: 0, si_uid: 0

Breakpoint 1, main () at test2.cc:41
41 asm("\tvmovdqa %1, %0\n" : "=x" (bar) : "xm" (foo));
(gdb)
Continuing.

Program received signal SIGSEGV, Segmentation fault.
main () at test2.cc:41
41 asm("\tvmovdqa %1, %0\n" : "=x" (bar) : "xm" (foo));
(gdb)
Continuing.
sig: 11, ctx: 0x7fffffffd740
si_signo: 11, si_erron: 0, si_code: 128
si_addr: (nil), si_addr_lsb: 0, si_pid: 0, si_uid: 0

Breakpoint 1, main () at test2.cc:41
41 asm("\tvmovdqa %1, %0\n" : "=x" (bar) : "xm" (foo));
(gdb) dis 1
(gdb) c
Continuing.
[Inferior 1 (process 9091) exited with code 01]
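
For readers without the attachment, a siginfo dump like the one in the session above can be produced roughly as follows. This is a sketch under assumptions, not the contents of test2.cc: the names format_siginfo, dump_siginfo and install_segv_dump are hypothetical, and the output format is only similar to the one shown.

```cpp
#include <csignal>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <unistd.h>

// Hypothetical helper: format the siginfo fields of interest into buf.
int format_siginfo(char* buf, std::size_t len, const siginfo_t* si) {
    return std::snprintf(buf, len,
                         "si_signo: %d, si_errno: %d, si_code: %d, si_addr: %p\n",
                         si->si_signo, si->si_errno, si->si_code, si->si_addr);
}

// SA_SIGINFO-style handler. snprintf is not async-signal-safe, so this is
// a debugging shortcut only.
extern "C" void dump_siginfo(int, siginfo_t* si, void*) {
    char buf[160];
    int n = format_siginfo(buf, sizeof buf, si);
    if (n > 0) {
        std::size_t out = static_cast<std::size_t>(n);
        if (out >= sizeof buf) out = sizeof buf - 1;  // output was truncated
        ssize_t ignored = write(STDERR_FILENO, buf, out);
        (void)ignored;
    }
}

// Install the handler for SIGSEGV; returns true on success.
bool install_segv_dump() {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = dump_siginfo;
    sa.sa_flags = SA_SIGINFO;  // request the three-argument handler form
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

Because the handler simply returns, the interrupted instruction is retried afterwards, which would be consistent with Breakpoint 1 being hit again after each dump in the session.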

Pauli (paniemin) wrote :

Another instruction which appears to have issues with a memory operand and a breakpoint on the instruction is vpmuludq. This time it didn't crash, but the multiplication results were completely incorrect for the values going into the instruction. To me this indicates that vpmuludq read the memory operand from a wrong address which happened to be mapped.
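
To check whether a vpmuludq result is wrong for given inputs, a scalar reference can help. The sketch below models the 128-bit form: for each 64-bit lane, the low 32 bits of both sources are multiplied as unsigned values into a 64-bit result. The function name vpmuludq_ref is hypothetical, and this is only a comparison aid, not code from the test case.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of 128-bit vpmuludq: per 64-bit lane, multiply the low
// 32 bits of each source as unsigned integers, giving a 64-bit product.
std::array<std::uint64_t, 2> vpmuludq_ref(const std::array<std::uint64_t, 2>& a,
                                          const std::array<std::uint64_t, 2>& b) {
    std::array<std::uint64_t, 2> r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        const std::uint64_t lo_a = a[i] & 0xFFFFFFFFu;  // upper 32 bits ignored
        const std::uint64_t lo_b = b[i] & 0xFFFFFFFFu;
        r[i] = lo_a * lo_b;
    }
    return r;
}
```

Comparing the register contents after the instruction against this reference, with the memory operand read from the address gdb shows, should make it visible whether the instruction used a different operand.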
