Unable to debug any kernel on i386 qemu machine

Bug #1846557 reported by Vladislav K. Valtchev
This bug affects 10 people
Affects: gdb (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hi,
On my x86_64 machine [running Ubuntu 18.04.3 LTS] with gdb version 'Ubuntu 8.1-0ubuntu3', I could happily debug any kernel running on an i386 qemu VM (qemu-system-i386) by just doing the following:

> target remote localhost:1234
> b term.c:694

and then, when the breakpoint was hit I used to observe output like:

> Breakpoint 1, term_action_use_alt_buffer (t=0xc017514c <first_instance>, use_alt_buffer=true)
> at /home/vlad/dev/tilck/kernel/char/tty/term.c:694

And then I was able to do `s`, `si` or `c`, exactly like with regular user applications.

With the newest update of gdb, version 'Ubuntu 8.1-0ubuntu3.1', instead, something is broken.
By doing the same things I observe:

> (gdb) b term.c:693
> warning: Breakpoint address adjusted from 0xc01158fe to 0xffffffffc01158fe.

This seems to be (and actually is) a bad sign for what comes later. [Why does the address need to be changed? Why extend it to 64 bits for a 32-bit machine?]
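For reference, the adjusted address in the warning is exactly what sign-extending the 32-bit address to 64 bits produces, because bit 31 is set in kernel addresses such as 0xc01158fe. The sketch below only illustrates that arithmetic. The helper name `sign_extend_32_to_64` is made up for this example; it is not GDB code, and it does not show where in gdb the extension actually happens.

```python
def sign_extend_32_to_64(addr: int) -> int:
    """Sign-extend a 32-bit value to 64 bits (illustrative helper, not GDB code)."""
    addr &= 0xFFFFFFFF               # keep only the low 32 bits
    if addr & 0x80000000:            # bit 31 set: value is "negative" as a signed 32-bit int
        addr |= 0xFFFFFFFF00000000   # fill the upper 32 bits with ones
    return addr


# Kernel address from the warning above (bit 31 set).
print(hex(sign_extend_32_to_64(0xc01158fe)))  # 0xffffffffc01158fe

# A low address (bit 31 clear) is left unchanged.
print(hex(sign_extend_32_to_64(0x08048000)))  # 0x8048000
```

Only addresses at or above 0x80000000 change under sign extension. That would be consistent with the warning showing up for kernel addresses in the 0xc0000000 range, although this report does not establish the cause.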

GDB detects the breakpoint, but in a weird way:

Program received signal SIGTRAP, Trace/breakpoint trap.
term_action_use_alt_buffer (t=0xc017514c <first_instance>, use_alt_buffer=true)

At this point, I'm able to read the memory and the variables, BUT I can neither continue the execution nor perform any kind of step. The commands apparently don't get delivered to the remote side (QEMU), or they get delivered in a wrong way somehow. Example output:

(gdb) b 709
warning: Breakpoint address adjusted from 0xc0115a45 to 0xffffffffc0115a45.
Breakpoint 2 at 0xc0115a45: file /home/vlad/dev/tilck/kernel/char/tty/term.c, line 709.
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
term_action_use_alt_buffer (t=0xc017514c <first_instance>, use_alt_buffer=true)
    at /home/vlad/dev/tilck/kernel/char/tty/term.c:693
693 t->alt_buf = kmalloc(sizeof(u16) * t->rows * t->cols);
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
term_action_use_alt_buffer (t=0xc017514c <first_instance>, use_alt_buffer=true)
    at /home/vlad/dev/tilck/kernel/char/tty/term.c:693
693 t->alt_buf = kmalloc(sizeof(u16) * t->rows * t->cols);
(gdb) c
Continuing.

As you see, the whole QEMU VM is stuck until I quit GDB.

Note: I downgraded only GDB back to version 'Ubuntu 8.1-0ubuntu3' in order to check whether the problem would go away, and it did. I'm sure the problem was introduced in this specific version, 'Ubuntu 8.1-0ubuntu3.1', and that it's related *neither* to QEMU *nor* to the kernel being debugged. It's totally independent of both.

Final remark: note that I'm running gdb on an x86_64 machine, while I'm debugging a kernel running on an i386 (virtual) machine. I believe that the cross-arch scenario almost certainly has something to do with the bug, as similar problems have happened in the past on both sides (qemu and gdb).

Thanks a lot,
Vlad

description: updated
Robie Basak (racb) wrote :

@Manoj,

Please could you take a look?

Vladislav K. Valtchev (vvaltchev) wrote :

Hi guys,
any update on this?

Just to be sure, I tried the Linux kernel 4.19.16 in the same scenario and I got the same result. I built the kernel with buildroot and I launched QEMU with:

qemu-system-i386 -kernel bzImage -S -s -append 'nokaslr'

I know it needs an initrd and a hdd image in order to boot a full system, but for me it was enough to break on start_kernel and then try to do `stepi`. Exactly like with the other project, with the gdb version `Ubuntu 8.1-0ubuntu3` it worked perfectly, while with gdb `Ubuntu 8.1-0ubuntu3.1` I got the same problem:

(gdb) b start_kernel
warning: Breakpoint address adjusted from 0xc17257cd to 0xffffffffc17257cd.
Breakpoint 1 at 0xc17257cd
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0xc17257cd in start_kernel ()
(gdb) si
0xc17257cd in start_kernel ()
(gdb) si
0xc17257cd in start_kernel ()
(gdb) si
0xc17257cd in start_kernel ()
(gdb) si

Therefore, as expected, the bug _definitely_ affects any kind of 32-bit code when remote debugging is used and the client is 64-bit. I also checked whether the latest non-Ubuntu gdb is affected by this issue, and it's not.

In conclusion, I believe that the following patch introduced the regression:

http://launchpadlibrarian.net/431301516/gdb_8.1-0ubuntu3_8.1-0ubuntu3.1.diff.gz

And that the bug needs to get some attention. After all, people _cannot_ debug a 32-bit Linux kernel running on a VM anymore if they're using Ubuntu.

@Manoj could you please comment?

Thanks

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in gdb (Ubuntu):
status: New → Confirmed
Vladislav K. Valtchev (vvaltchev) wrote :

A fix was released after bug #1848200, reporting the same problem, was opened.

Changed in gdb (Ubuntu):
status: Confirmed → Fix Released
dann frazier (dannf) wrote :

The fix for bug #1848200 was ARM specific. This is about x86, so it's a different issue.

dann frazier (dannf) wrote :

Ignore my last comment - I took another look and it appears to be generic.
