Bug #615995 “gdb.base/watch-vfork.exp failures” : Bugs : Linaro GDB

Yao Qi (yao-codesourcery) wrote on 2010-08-11:

#1

[CodeSourcery#8250]

Ulrich Weigand (uweigand) on 2010-08-12

Changed in gdb-linaro:
status:	New → Confirmed

Yao Qi (yao-codesourcery) wrote on 2010-08-19:

#2

The debuggee hangs in libc's vfork(). Disassembling vfork shows that the instruction at +56 is a breakpoint instruction.

(gdb) disassemble vfork
Dump of assembler code for function vfork:
   0x400fa2d0 <+0>:     push    {lr}            ; (str lr, [sp, #-4]!)
   0x400fa2d4 <+4>:     mvn     r0, #61440      ; 0xf000
   0x400fa2d8 <+8>:     mov     lr, pc
   0x400fa2dc <+12>:    sub     pc, r0, #31 
   0x400fa2e0 <+16>:    pop     {lr}            ; (ldr lr, [sp], #4) 
   0x400fa2e4 <+20>:    mov     r2, r0
   0x400fa2e8 <+24>:    ldr     r3, [r2, #-1108]        ; 0x454
   0x400fa2ec <+28>:    rsbs    r0, r3, #0
   0x400fa2f0 <+32>:    moveq   r0, #-2147483648        ; 0x80000000
   0x400fa2f4 <+36>:    str     r0, [r2, #-1108]        ; 0x454
   0x400fa2f8 <+40>:    mov     r12, r7
   0x400fa2fc <+44>:    mov     r7, #190        ; 0xbe
   0x400fa300 <+48>:    svc     0x00000000
   0x400fa304 <+52>:    mov     r7, r12 
=> 0x400fa308 <+56>:                    ; <UNDEFINED> instruction: 0xe7f001f0
   0x400fa30c <+60>:    strne   r3, [r2, #-1108]        ; 0x454
   0x400fa310 <+64>:    cmn     r0, #4096       ; 0x1000
   0x400fa314 <+68>:    bxcc    lr  
   0x400fa318 <+72>:    b       0x400a05d0 <__syscall_error>
End of assembler dump.

With "set debug infrun 1", we can see that a single-step breakpoint is inserted twice at the same address:
infrun: TARGET_WAITKIND_FORKED
remove_single_step_breakpoint: 0
infrun: keep_going: ptid = process 12702, trap_expected = 0 -> 0
infrun: resume (step=1, signal=0), trap_expected=0, process 12702
insert_single_step_breakpoint: next_pc = 0x400fa308  // <--- [1]
insert_single_step_breakpoint: 0             
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun:   12702 [process 12702],
infrun:   status->kind = unknown???
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_VFORK_DONE
infrun: keep_going: ptid = process 12702, trap_expected = 0 -> 0
infrun: resume (step=1, signal=0), trap_expected=0, process 12702
insert_single_step_breakpoint: next_pc = 0x400fa308 // <--- [2]
insert_single_step_breakpoint: 1              
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun:   12702 [process 12702],
infrun:   status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x400fa308
infrun: software single step trap for process 12702
remove_single_step_breakpoint: 0
remove_single_step_breakpoint: remove 1
infrun: stepi/nexti

The log shows that a single-step breakpoint is inserted at both [1] and [2]. I created a patch to fix this:
----------------------------------------------------------------------------------------
diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index 6d59583..743247c 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -10829,7 +10829,19 @@ insert_single_step_breakpoint (struct gdbarch *gdbarch,
     }   
   else
     {   
+      struct bp_target_info *bp_tgt;
       gdb_assert (single_step_breakpoints[1] == NULL);
+
+      /* Avoid insert single step breakpoint twice in the same address.  When
+        GDB single step over vfork, for example, we don't need to insert single
+        step breakpoint again in NEXT_PC, because we've inserted one single step
+        breakpoint before step over.  */
+      bp_tgt = (struct bp_target_info *)single_step_breakpoints[0];
+      if (next_pc == bp_tgt->placed_address)
+       {
+         return;
+       }
+
       bpt_p = &single_step_breakpoints[1];
       single_step_gdbarch[1] = gdbarch;
     }   
----------------------------------------------------------------------------------------
After applying this patch to GDB CVS trunk, the failures go away:
Test Run By yao on Thu Sep  9 10:46:37 2010
Native configuration is arm-unknown-linux-gnueabi

=== gdb tests ===

Schedule of variations:
    unix

Running target unix
Running /home/yao/maverick/home/yao/cvs/src/gdb/testsuite/gdb.base/watch-vfork.exp ...
PASS: gdb.base/watch-vfork.exp: Watchpoint on global variable (hw)
PASS: gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (hw)
PASS: gdb.base/watch-vfork.exp: Watchpoint on global variable (sw)
PASS: gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw)

=== gdb Summary ===

# of expected passes            4

Changed in gdb-linaro:
status:	Confirmed → In Progress
assignee:	nobody → Yao Qi (yao-codesourcery)

Yao Qi (yao-codesourcery) wrote on 2010-08-19:

#3

With this patch applied, the test case passes on pavo1:
$ uname -a
Linux pavo1 2.6.32 #1 PREEMPT Mon May 3 22:40:09 CEST 2010 armv7l GNU/Linux

However, it fails on Loic's beagle board, and the failure there is different:
$ uname -a
Linux beagle 2.6.35-6-omap #11 Tue Jul 6 21:08:57 UTC 2010 armv7l GNU/Linux

(gdb)
infrun: clear_proceed_status_thread (process 6891)
infrun: proceed (addr=0xffffffff, signal=144, step=1)
infrun: resume (step=1, signal=0), trap_expected=0
insert_single_step_breakpoint: next_pc = 0x40103304
insert_single_step_breakpoint: [0]
LLR: Preparing to resume process 6891, 0, inferior_ptid process 6891
RC: Not resuming sibling process 6891 (not stopped)
LLR: PTRACE_CONT process 6891, 0 (resume event thread)
infrun: wait_for_inferior (treat_exec_as_sigtrap=0)
linux_nat_wait: [process -1]
LLW: waitpid 6891 received Trace/breakpoint trap (stopped)
LLW: Handling extended status 0x02057f
LLW: Candidate event Trace/breakpoint trap (stopped) in process 6891.
LLW: trap ptid is process 6891.
infrun: target_wait (-1, status) =
infrun: 6891 [process 6891],
infrun: status->kind = vforked
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_FORKED
remove_single_step_breakpoint: 0
Detaching after fork from child process 6894.
LCFF: waiting for VFORK_DONE on 6891
infrun: resume (step=1, signal=0), trap_expected=0
insert_single_step_breakpoint: next_pc = 0x40103308
insert_single_step_breakpoint: [0]
LLR: Preparing to resume process 6891, 0, inferior_ptid process 6891
RC: Not resuming sibling process 6891 (not stopped)
LLR: PTRACE_CONT process 6891, 0 (resume event thread)
infrun: prepare_to_wait
linux_nat_wait: [process -1]
LLW: waitpid 6891 received Unknown signal 0 (terminated)
LLW: Candidate event Unknown signal 0 (terminated) in process 6891.
infrun: target_wait (-1, status) =
infrun: 6891 [process 6891],
infrun: status->kind = signalled, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SIGNALLED
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.
infrun: stop_stepping

Ulrich Weigand (uweigand) wrote on 2010-08-19:

#4

I think the patch you posted doesn't really address the core problem. (For example, it might not help if there are already two single-step breakpoints outstanding.)

The underlying issue rather seems to be that handle_inferior_event does not always clean up single-step breakpoints as it should. I noticed and fixed a couple of such problems a while ago here:
http://sourceware.org/ml/gdb-patches/2010-06/msg00481.html

but this patch still did not handle the TARGET_WAITKIND_VFORK_DONE case. It seems that this case also should remove the single-step breakpoints.
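
For illustration, here is a toy model in C of why a missed cleanup can leave a breakpoint instruction stuck in memory. It is not GDB code. It assumes that each single-step slot saves whatever word is currently in memory as its shadow copy and writes that word back on removal; the thread does not spell this mechanism out. The names toy_insert, toy_remove_all and toy_step_over_vfork, and the ORIGINAL_INSN value, are hypothetical. Only 0xe7f001f0 comes from the disassembly above.

```c
#include <assert.h>
#include <stdint.h>

/* Breakpoint encoding seen at vfork+56 in the disassembly above.  */
#define BREAKPOINT_INSN 0xe7f001f0u
/* Hypothetical placeholder: the real instruction at +56 is not shown.  */
#define ORIGINAL_INSN   0x11111111u

/* One single-step slot: whether it is used and what it overwrote.  */
struct toy_slot
{
  int in_use;
  uint32_t shadow;
};

/* The single instruction word at next_pc, and the two slots.  */
static uint32_t toy_memory = ORIGINAL_INSN;
static struct toy_slot toy_slots[2];

/* Insert a breakpoint using the first free slot, saving the word that
   is in memory right now (assumed behaviour, see the text above).  */
static void
toy_insert (void)
{
  int i = toy_slots[0].in_use ? 1 : 0;

  assert (!toy_slots[i].in_use);
  toy_slots[i].in_use = 1;
  toy_slots[i].shadow = toy_memory;
  toy_memory = BREAKPOINT_INSN;
}

/* Remove every inserted breakpoint, slot 0 first, restoring shadows.  */
static void
toy_remove_all (void)
{
  for (int i = 0; i < 2; i++)
    if (toy_slots[i].in_use)
      {
        toy_memory = toy_slots[i].shadow;
        toy_slots[i].in_use = 0;
      }
}

/* Replay the sequence from the log: insert [1], the VFORK_DONE event,
   insert [2], then the final trap.  Returns the word left in memory.  */
static uint32_t
toy_step_over_vfork (int cleanup_on_vfork_done)
{
  toy_memory = ORIGINAL_INSN;
  toy_slots[0].in_use = 0;
  toy_slots[1].in_use = 0;

  toy_insert ();                /* resume after the vfork event: [1] */
  if (cleanup_on_vfork_done)
    toy_remove_all ();          /* cleanup on VFORK_DONE */
  toy_insert ();                /* resume after VFORK_DONE: [2] */
  toy_remove_all ();            /* the step finally traps */
  return toy_memory;
}
```

In this model, toy_step_over_vfork (0) leaves BREAKPOINT_INSN in memory, because slot 1 saved the breakpoint itself as its shadow and restores it last. toy_step_over_vfork (1) leaves ORIGINAL_INSN, which is the outcome a cleanup in the VFORK_DONE case is meant to give.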

Yao Qi (yao-codesourcery) wrote on 2010-08-20:

#5

Ulrich,
Thanks for your comments. Here is a new patch along the lines you suggested, tested on pavo1.
diff --git a/gdb/infrun.c b/gdb/infrun.c
index dd89e78..5e1f78b 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3304,6 +3304,13 @@ handle_inferior_event (struct execution_control_state *ecs)
       if (debug_infrun)
         fprintf_unfiltered (gdb_stdlog, "infrun: TARGET_WAITKIND_VFORK_DONE\n");

+      if (singlestep_breakpoints_inserted_p)
+        {
+          /* Pull the single step breakpoints out of the target. */
+          remove_single_step_breakpoints ();
+          singlestep_breakpoints_inserted_p = 0;
+        }
+
       if (!ptid_equal (ecs->ptid, inferior_ptid))
         context_switch (ecs->ptid);

However, the failure on Loic's beagle board is different. Please look at the last several lines of the log I posted in comment #3:

LLW: waitpid 6891 received Unknown signal 0 (terminated)
LLW: Candidate event Unknown signal 0 (terminated) in process 6891.
infrun: target_wait (-1, status) =
infrun: 6891 [process 6891],
infrun: status->kind = signalled, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SIGNALLED // <---- [1]
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.
infrun: stop_stepping

At [1], we expected TARGET_WAITKIND_VFORK_DONE but got TARGET_WAITKIND_SIGNALLED instead. It looks like the status is not correct.

Ulrich Weigand (uweigand) wrote on 2010-08-21:

#6

LLW: waitpid 6891 received Unknown signal 0 (terminated)

This is odd, but seems to be caused by a bug in the debug message printing code only (in linux-nat.c:status_to_str):

  else if (WIFSIGNALED (status))
    snprintf (buf, sizeof (buf), "%s (terminated)",
              strsignal (WSTOPSIG (status)));

To extract the signal number from a WIFSIGNALED status, you need to use WTERMSIG, not WSTOPSIG. However, since the code that actually processes the status does this correctly, it seems the process was killed on a SIGTRAP:

infrun: status->kind = signalled, signal = SIGTRAP

Now, the question is why this should happen. One way this can happen is if GDB has not attached (either not yet, or not anymore) with ptrace to a process while it is running into a breakpoint instruction, or some other event that causes a SIGTRAP to be generated ...
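
The WTERMSIG/WSTOPSIG distinction above can be sketched in a small standalone C example. It is not the linux-nat.c code: describe_wait_status and status_of_child_killed_by are hypothetical helpers written for this illustration. The second helper forks an untraced child that raises a signal on itself, which, for SIGTRAP with the default disposition, terminates the child much like the "signalled, signal = SIGTRAP" status in the log.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Format a waitpid status into BUF and return BUF.  Hypothetical
   helper: a stopped status is decoded with WSTOPSIG, a terminated
   status with WTERMSIG.  */
static const char *
describe_wait_status (int status, char *buf, size_t len)
{
  assert (buf != NULL && len > 0);

  if (WIFSTOPPED (status))
    snprintf (buf, len, "%s (stopped)", strsignal (WSTOPSIG (status)));
  else if (WIFSIGNALED (status))
    snprintf (buf, len, "%s (terminated)", strsignal (WTERMSIG (status)));
  else
    snprintf (buf, len, "%d (exited)", WEXITSTATUS (status));
  return buf;
}

/* Fork a child that raises SIG on itself with the default disposition
   and return the raw wait status, or -1 if fork or waitpid fails.  */
static int
status_of_child_killed_by (int sig)
{
  int status = -1;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      signal (sig, SIG_DFL);
      raise (sig);
      _exit (0);                /* only reached if SIG did not kill us */
    }
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return status;
}
```

Passing the result of status_of_child_killed_by (SIGTRAP) to describe_wait_status yields the system's SIGTRAP description followed by "(terminated)". Applying WSTOPSIG to the same status reads a different part of the status word, which is consistent with the misleading "Unknown signal 0" in the debug message.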

Yao Qi (yao-codesourcery) wrote on 2010-08-27:

#7

Attachment: single_step_vfork.patch (2.8 KiB, text/plain)

Ulrich, you are right. The single-step breakpoint is inserted in resume(), and the child process hits it; however, we have already detached from the child process.

I fixed this by removing the single-step breakpoint in resume() when GDB is stopped by a vfork event. The attached patch fixes the failures in this bug; it was tested with GDB CVS on armv7l-unknown-linux-gnueabi with no regressions.

Yao Qi (yao-codesourcery) wrote on 2010-09-01:

#8

After talking with Pedro, I created a smaller patch for this problem:

diff -u -p -r1.446 infrun.c
--- infrun.c 19 Jul 2010 07:55:43 -0000 1.446
+++ infrun.c 1 Sep 2010 02:11:22 -0000
@@ -1602,7 +1602,8 @@ a command like `return' or `jump' to con
       step = gdbarch_displaced_step_hw_singlestep (gdbarch,
                                                   displaced->step_closure);
     }
-
+  else if (current_inferior()->waiting_for_vfork_done)
+    step = 0;
   /* Do we need to do it the hard way, w/temp breakpoints? */
   else if (step)
     step = maybe_software_singlestep (gdbarch, pc);

------------------------------------------------------------------------------
This patch fixes the failures on ARM, but not the failures on x86. Since there are still some failures on x86, I am not sure upstream will accept this patch. Shall we send it to gdb-patches and give it a try, or look at the x86 failures first?

Michael Hope (michaelh1) wrote on 2010-09-01: Re: [Bug 615995] Re: gdb.base/watch-vfork.exp failures

#9

Spend up to half a day looking at it on x86, otherwise let's see what
upstream thinks.

Yao Qi (yao-codesourcery) wrote on 2010-09-01:

#10

Sent the patch to gdb-patches: http://www.cygwin.com/ml/gdb-patches/2010-09/msg00022.html

Ulrich Weigand (uweigand) on 2010-09-02

Changed in gdb-linaro:
importance:	Undecided → Medium

Yao Qi (yao-codesourcery) wrote on 2010-09-06:

#11

The patch is committed to GDB mainline: http://www.cygwin.com/ml/gdb-cvs/2010-09/msg00042.html

I will merge it to the GDB 7.2 branch after Pedro's revision of the comments.

Changed in gdb-linaro:
status:	In Progress → Fix Committed

Yao Qi (yao-codesourcery) wrote on 2010-09-08:

#12

Committed to GDB 7.2 branch.
http://www.cygwin.com/ml/gdb-cvs/2010-09/msg00054.html

Ulrich Weigand (uweigand) on 2010-10-08

Changed in gdb-linaro:
milestone:	none → 7.2-2010.10-0

Michael Hope (michaelh1) on 2010-10-12

Changed in gdb-linaro:
status:	Fix Committed → Fix Released
