gdb.base/watch-vfork.exp failures

Bug #615995 reported by Ulrich Weigand
This bug affects 1 person
Affects: Linaro GDB
Status: Fix Released
Importance: Medium
Assigned to: Yao Qi

Bug Description

FAIL: gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (hw) (timeout)
FAIL: gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout)

Further analysis needed.

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

[CodeSourcery#8250]

Changed in gdb-linaro:
status: New → Confirmed
Revision history for this message
Yao Qi (yao-codesourcery) wrote :

The debuggee hangs in libc's vfork(). Disassembling vfork shows that the instruction at <+56> is a breakpoint instruction.

(gdb) disassemble vfork
Dump of assembler code for function vfork:
   0x400fa2d0 <+0>: push {lr} ; (str lr, [sp, #-4]!)
   0x400fa2d4 <+4>: mvn r0, #61440 ; 0xf000
   0x400fa2d8 <+8>: mov lr, pc
   0x400fa2dc <+12>: sub pc, r0, #31
   0x400fa2e0 <+16>: pop {lr} ; (ldr lr, [sp], #4)
   0x400fa2e4 <+20>: mov r2, r0
   0x400fa2e8 <+24>: ldr r3, [r2, #-1108] ; 0x454
   0x400fa2ec <+28>: rsbs r0, r3, #0
   0x400fa2f0 <+32>: moveq r0, #-2147483648 ; 0x80000000
   0x400fa2f4 <+36>: str r0, [r2, #-1108] ; 0x454
   0x400fa2f8 <+40>: mov r12, r7
   0x400fa2fc <+44>: mov r7, #190 ; 0xbe
   0x400fa300 <+48>: svc 0x00000000
   0x400fa304 <+52>: mov r7, r12
=> 0x400fa308 <+56>: ; <UNDEFINED> instruction: 0xe7f001f0
   0x400fa30c <+60>: strne r3, [r2, #-1108] ; 0x454
   0x400fa310 <+64>: cmn r0, #4096 ; 0x1000
   0x400fa314 <+68>: bxcc lr
   0x400fa318 <+72>: b 0x400a05d0 <__syscall_error>
End of assembler dump.

With "set debug infrun 1", we can see that the single-step breakpoint is inserted twice at the same address:
infrun: TARGET_WAITKIND_FORKED
remove_single_step_breakpoint: 0
infrun: keep_going: ptid = process 12702, trap_expected = 0 -> 0
infrun: resume (step=1, signal=0), trap_expected=0, process 12702
insert_single_step_breakpoint: next_pc = 0x400fa308 // <--- [1]
insert_single_step_breakpoint: 0
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: 12702 [process 12702],
infrun: status->kind = unknown???
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_VFORK_DONE
infrun: keep_going: ptid = process 12702, trap_expected = 0 -> 0
infrun: resume (step=1, signal=0), trap_expected=0, process 12702
insert_single_step_breakpoint: next_pc = 0x400fa308 // <--- [2]
insert_single_step_breakpoint: 1
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: 12702 [process 12702],
infrun: status->kind = stopped, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_STOPPED
infrun: stop_pc = 0x400fa308
infrun: software single step trap for process 12702
remove_single_step_breakpoint: 0
remove_single_step_breakpoint: remove 1
infrun: stepi/nexti

The single-step breakpoint is inserted at both [1] and [2]. Here is a patch to fix this:
----------------------------------------------------------------------------------------
diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index 6d59583..743247c 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -10829,7 +10829,19 @@ insert_single_step_breakpoint (struct gdbarch *gdbarch,
     }
   else
     {
+ struct bp_target_info *bp_tgt;
       gdb_assert (single_step_breakpoints[1] == NULL);
+
+ /* Avoid insert single step breakpoint twice in the same address. When
+ GDB single step over vfork, for example, we don't need to insert single
+ step breakpoint again in NEXT_PC, because we've inserted one singl...

Changed in gdb-linaro:
status: Confirmed → In Progress
assignee: nobody → Yao Qi (yao-codesourcery)
Revision history for this message
Yao Qi (yao-codesourcery) wrote :

With this patch applied, the test case passes on pavo1:
$ uname -a
Linux pavo1 2.6.32 #1 PREEMPT Mon May 3 22:40:09 CEST 2010 armv7l GNU/Linux

However, it still fails on Loic's BeagleBoard, and the failure is a different one:
$ uname -a
Linux beagle 2.6.35-6-omap #11 Tue Jul 6 21:08:57 UTC 2010 armv7l GNU/Linux

(gdb)
infrun: clear_proceed_status_thread (process 6891)
infrun: proceed (addr=0xffffffff, signal=144, step=1)
infrun: resume (step=1, signal=0), trap_expected=0
insert_single_step_breakpoint: next_pc = 0x40103304
insert_single_step_breakpoint: [0]
LLR: Preparing to resume process 6891, 0, inferior_ptid process 6891
RC: Not resuming sibling process 6891 (not stopped)
LLR: PTRACE_CONT process 6891, 0 (resume event thread)
infrun: wait_for_inferior (treat_exec_as_sigtrap=0)
linux_nat_wait: [process -1]
LLW: waitpid 6891 received Trace/breakpoint trap (stopped)
LLW: Handling extended status 0x02057f
LLW: Candidate event Trace/breakpoint trap (stopped) in process 6891.
LLW: trap ptid is process 6891.
infrun: target_wait (-1, status) =
infrun: 6891 [process 6891],
infrun: status->kind = vforked
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_FORKED
remove_single_step_breakpoint: 0
Detaching after fork from child process 6894.
LCFF: waiting for VFORK_DONE on 6891
infrun: resume (step=1, signal=0), trap_expected=0
insert_single_step_breakpoint: next_pc = 0x40103308
insert_single_step_breakpoint: [0]
LLR: Preparing to resume process 6891, 0, inferior_ptid process 6891
RC: Not resuming sibling process 6891 (not stopped)
LLR: PTRACE_CONT process 6891, 0 (resume event thread)
infrun: prepare_to_wait
linux_nat_wait: [process -1]
LLW: waitpid 6891 received Unknown signal 0 (terminated)
LLW: Candidate event Unknown signal 0 (terminated) in process 6891.
infrun: target_wait (-1, status) =
infrun: 6891 [process 6891],
infrun: status->kind = signalled, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SIGNALLED
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.
infrun: stop_stepping

Revision history for this message
Ulrich Weigand (uweigand) wrote :

I think the patch you posted doesn't really address the core problem. (For example, it might not help if there are already two single-step breakpoints outstanding.)

The underlying issue rather seems to be that handle_inferior_event does not always clean up single-step breakpoints as it should. I noticed and fixed a couple of such problems a while ago here:
http://sourceware.org/ml/gdb-patches/2010-06/msg00481.html

but this patch still did not handle the TARGET_WAITKIND_VFORK_DONE case. It seems that this case also should remove the single-step breakpoints.
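
To illustrate why a missed cleanup can leave a breakpoint instruction behind in the debuggee, here is a toy model in C. It is not GDB code. It assumes that each inserted single-step breakpoint saves the memory contents it overwrites and writes them back on removal, and that slots are removed in order (0, then 1), as the "remove_single_step_breakpoint" lines in the log suggest. All names are hypothetical, and ORIG_INSN is a placeholder, not the real instruction at vfork+56.

```c
#include <assert.h>
#include <stdint.h>

#define BKPT_INSN 0xe7f001f0u   /* breakpoint encoding seen at vfork+56 */
#define ORIG_INSN 0x11111111u   /* placeholder for the real instruction */
#define NUM_SLOTS 2             /* two single-step breakpoint slots */

struct ss_slot { int inserted; uint32_t shadow; };
struct ss_model { uint32_t insn; struct ss_slot slot[NUM_SLOTS]; };

/* Insert a single-step breakpoint into the first free slot, saving
   whatever is currently in "memory" as the shadow contents.  */
static void
model_insert (struct ss_model *m)
{
  int i;

  for (i = 0; i < NUM_SLOTS && m->slot[i].inserted; i++)
    ;
  assert (i < NUM_SLOTS);
  m->slot[i].shadow = m->insn;
  m->slot[i].inserted = 1;
  m->insn = BKPT_INSN;
}

/* Remove all inserted breakpoints in slot order, restoring shadows.  */
static void
model_remove_all (struct ss_model *m)
{
  for (int i = 0; i < NUM_SLOTS; i++)
    if (m->slot[i].inserted)
      {
        m->insn = m->slot[i].shadow;
        m->slot[i].inserted = 0;
      }
}

/* Replay the event sequence from the log and return what is left in
   memory at the stepped-to address afterwards.  */
static uint32_t
final_insn_after_step (int cleanup_on_vfork_done)
{
  struct ss_model m = { ORIG_INSN, { { 0, 0 }, { 0, 0 } } };

  model_insert (&m);            /* resume after the vfork event, [1] */
  if (cleanup_on_vfork_done)
    model_remove_all (&m);      /* cleanup when VFORK_DONE is handled */
  model_insert (&m);            /* resume after VFORK_DONE, [2] */
  model_remove_all (&m);        /* SIGTRAP: remove 0, then remove 1 */
  return m.insn;
}
```

In this model, final_insn_after_step (0) leaves BKPT_INSN in memory, because the second slot's shadow is the first breakpoint rather than the original instruction. With cleanup at VFORK_DONE, the original instruction is restored.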

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

Ulrich,
Thanks for your comments. Here is a new patch along the lines you suggested, tested on pavo1.
diff --git a/gdb/infrun.c b/gdb/infrun.c
index dd89e78..5e1f78b 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3304,6 +3304,13 @@ handle_inferior_event (struct execution_control_state *ecs)
       if (debug_infrun)
        fprintf_unfiltered (gdb_stdlog, "infrun: TARGET_WAITKIND_VFORK_DONE\n");

+      if (singlestep_breakpoints_inserted_p)
+        {
+          /* Pull the single step breakpoints out of the target. */
+          remove_single_step_breakpoints ();
+          singlestep_breakpoints_inserted_p = 0;
+        }
+
       if (!ptid_equal (ecs->ptid, inferior_ptid))
        context_switch (ecs->ptid);

However, the failure on Loic's BeagleBoard is different. Please pay attention to the last several lines of the log I posted in comment #3.

LLW: waitpid 6891 received Unknown signal 0 (terminated)
LLW: Candidate event Unknown signal 0 (terminated) in process 6891.
infrun: target_wait (-1, status) =
infrun: 6891 [process 6891],
infrun: status->kind = signalled, signal = SIGTRAP
infrun: infwait_normal_state
infrun: TARGET_WAITKIND_SIGNALLED // <---- [1]
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.
infrun: stop_stepping

At [1], we expect TARGET_WAITKIND_VFORK_DONE, but we got TARGET_WAITKIND_SIGNALLED instead. It looks like the status is not correct.

Revision history for this message
Ulrich Weigand (uweigand) wrote :

LLW: waitpid 6891 received Unknown signal 0 (terminated)

This is odd, but seems to be caused by a bug in the debug message printing code only (in linux-nat.c:status_to_str):

  else if (WIFSIGNALED (status))
    snprintf (buf, sizeof (buf), "%s (terminated)",
              strsignal (WSTOPSIG (status)));

To extract the signal number from a WIFSIGNALED status, you need to use WTERMSIG, not WSTOPSIG. However, since the code that actually processes the status does this correctly, it seems the process was killed on a SIGTRAP:

infrun: status->kind = signalled, signal = SIGTRAP

Now, the question is why this should happen. One way this can happen is if GDB has not attached (either not yet, or not anymore) with ptrace to a process while it is running into a breakpoint instruction, or some other event that causes a SIGTRAP to be generated ...
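
As a sketch of both points above, the following C helpers format a waitpid() status with the macro that matches its kind, and produce a status from a child that is killed by a signal while nobody is tracing it. These are hypothetical helpers for illustration, not the actual linux-nat.c code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for strsignal () */
#endif

#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Describe STATUS in BUF.  WSTOPSIG is only meaningful for a stopped
   status; a status with WIFSIGNALED set must be decoded with WTERMSIG.  */
static const char *
describe_wait_status (int status, char *buf, size_t size)
{
  assert (buf != NULL && size > 0);

  if (WIFSTOPPED (status))
    snprintf (buf, size, "%s (stopped)", strsignal (WSTOPSIG (status)));
  else if (WIFSIGNALED (status))
    snprintf (buf, size, "%s (terminated)", strsignal (WTERMSIG (status)));
  else
    snprintf (buf, size, "%d (exited)", WEXITSTATUS (status));
  return buf;
}

/* Fork a child that is not traced and raises SIG on itself with the
   default disposition.  Return the raw wait status, or -1 on error.  */
static int
status_of_child_killed_by (int sig)
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      signal (sig, SIG_DFL);
      raise (sig);
      _exit (0);                /* only reached if SIG did not kill us */
    }
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return status;
}
```

Calling status_of_child_killed_by (SIGTRAP) gives a status for which WIFSIGNALED is true and WTERMSIG returns SIGTRAP. This is the same kind of event GDB reported for process 6891 in the log.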

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

Ulrich, you are right. The single-step breakpoint is inserted in resume(), and the child process hits it; however, we have already detached from the child process.

I fixed this by removing the single-step breakpoint in resume() when GDB is stopped by a vfork event. The attached patch fixes the failures in this bug. It was tested with GDB CVS on armv7l-unknown-linux-gnueabi, with no regressions.

Revision history for this message
Yao Qi (yao-codesourcery) wrote :

After talking with Pedro, I created a smaller patch for this problem:

diff -u -p -r1.446 infrun.c
--- infrun.c 19 Jul 2010 07:55:43 -0000 1.446
+++ infrun.c 1 Sep 2010 02:11:22 -0000
@@ -1602,7 +1602,8 @@ a command like `return' or `jump' to con
       step = gdbarch_displaced_step_hw_singlestep (gdbarch,
                                                   displaced->step_closure);
     }
-
+  else if (current_inferior()->waiting_for_vfork_done)
+    step = 0;
   /* Do we need to do it the hard way, w/temp breakpoints? */
   else if (step)
     step = maybe_software_singlestep (gdbarch, pc);

------------------------------------------------------------------------------
This patch fixes the failures on ARM, but not the failures on x86. Since some failures remain on x86, I am not sure upstream will accept this patch. Shall we send it to gdb-patches and see, or look at the x86 failures first?

Revision history for this message
Michael Hope (michaelh1) wrote : Re: [Bug 615995] Re: gdb.base/watch-vfork.exp failures

Spend up to half a day looking at it on x86, otherwise let's see what
upstream thinks.

Revision history for this message
Yao Qi (yao-codesourcery) wrote :
Changed in gdb-linaro:
importance: Undecided → Medium
Revision history for this message
Yao Qi (yao-codesourcery) wrote :

The patch is committed to GDB mainline: http://www.cygwin.com/ml/gdb-cvs/2010-09/msg00042.html

I will merge it to the GDB 7.2 branch after Pedro's revision to the comments.

Changed in gdb-linaro:
status: In Progress → Fix Committed
Revision history for this message
Yao Qi (yao-codesourcery) wrote :
Changed in gdb-linaro:
milestone: none → 7.2-2010.10-0
Michael Hope (michaelh1)
Changed in gdb-linaro:
status: Fix Committed → Fix Released