Long term timer thread failure under Linux kernel 2.4.21-20.ELsmp #1 SMP
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
EPICS Base | Fix Released | High | mrk |
Bug Description
If I start catime with a channel that will not be found and leave it running for a few days, I see a failure in one of the timer threads related to use of a bad event semaphore identifier. Attempts to debug the situation have not been successful, as the debugger also appears to be confused. Is the entire Linux process scrambled at this point?
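For reference, POSIX requires pthread_cond_timedwait to return EINVAL ("Invalid argument") when the tv_nsec field of the absolute timeout lies outside the range 0 to 999999999. An invalid condition variable or mutex is another possible source of the same error. The sketch below only illustrates the timeout case and is not taken from EPICS Base; the helper names deadline_after and timed_wait_status are hypothetical, and it assumes a POSIX system with the default CLOCK_REALTIME condition variable clock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper (not EPICS code): turn a relative timeout in seconds
 * into an absolute CLOCK_REALTIME deadline, keeping tv_nsec in the range
 * [0, NSEC_PER_SEC) as POSIX requires. */
static struct timespec deadline_after(double seconds)
{
    struct timespec ts;
    time_t whole;

    assert(seconds >= 0.0);
    clock_gettime(CLOCK_REALTIME, &ts);
    whole = (time_t)seconds;
    ts.tv_sec += whole;
    ts.tv_nsec += (long)((seconds - (double)whole) * 1e9);
    if (ts.tv_nsec >= NSEC_PER_SEC) {
        /* Carry the overflow into the seconds field. */
        ts.tv_sec += 1;
        ts.tv_nsec -= NSEC_PER_SEC;
    }
    return ts;
}

/* Hypothetical helper: wait on a fresh condition variable that nobody
 * signals and return the status code from pthread_cond_timedwait. */
static int timed_wait_status(struct timespec deadline)
{
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    int status;

    pthread_mutex_lock(&mutex);
    status = pthread_cond_timedwait(&cond, &mutex, &deadline);
    pthread_mutex_unlock(&mutex);
    pthread_cond_destroy(&cond);
    pthread_mutex_destroy(&mutex);
    return status;
}
```

With a deadline from deadline_after the wait should time out with ETIMEDOUT, since nothing signals the condition. Forcing tv_nsec to 1000000000 before the call should yield EINVAL instead.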
Also, why is the thread id -1218549152 below? Is that normal? I also wonder why the debugger segmentation faults when it terminates?
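On the thread id question, one likely explanation is that this gdb prints the 32-bit pthread_t value as a signed decimal, so an id that is really an address in the upper half of the address space shows up as negative. That is an assumption about the gdb and thread library in use, not something the transcript confirms. A minimal sketch of the reinterpretation, with the hypothetical helper name thread_id_as_address:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: take a thread id as printed in signed decimal by
 * gdb on a 32-bit target and return the same 32 bits read as unsigned. */
static uint32_t thread_id_as_address(long long printed_id)
{
    /* The printed value must fit in a signed 32-bit integer. */
    assert(printed_id >= INT32_MIN && printed_id <= INT32_MAX);
    return (uint32_t)(int32_t)printed_id;
}
```

Applied to -1218549152 this gives 0xB75E6A60, which looks like an ordinary user-space address rather than a corrupted value.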
Additional information:
(gdb) run fishy 1
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/hill/
[Thread debugging using libthread_db enabled]
[New Thread -1218549152 (LWP 23850)]
[New Thread -1218552912 (LWP 23851)]
[New Thread -1229042768 (LWP 23852)]
Testing with 1 channels named fishy
channel connect test
[New Thread -1239536720 (LWP 23853)]
Detaching after fork from child process 23854.
Program received signal SIGINT, Interrupt.
[Switching to Thread -1218549152 (LWP 23850)]
0x00e5440b in pthread_
from /lib/tls/
(gdb) catch throw
Catchpoint 1 (throw)
(gdb) cont
Continuing.
[New Thread -1250026576 (LWP 23855)]
CA client library is unable to contact CA repeater after 50 tries.
Silence this message by starting a CA repeater daemon
or by calling ca_pend_event() and or ca_poll() more often.
pthread_
[Switching to Thread -1229042768 (LWP 23852)]
Catchpoint 1 (exception thrown)
0x004d11e6 in __cxa_throw () from /usr/lib/
(gdb) info threads
Cannot fetch general-purpose registers for thread -1229042768: generic error
(gdb) bt
#0 0x00138cdf in raise () from /lib/tls/libc.so.6
(gdb) quit
The program is running. Exit anyway? (y or n) y
Segmentation fault
~/epicsR3.
Linux santana 2.4.21-20.ELsmp #1 SMP Wed Aug 18 20:46:40 EDT 2004 i686 i686 i386 GNU/Linux
Original Mantis Bug: mantis-139
http://
~/epicsR3.14/epics/base$ gdb catime
GNU gdb Red Hat Linux (6.1post-1.20040607.17rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".
(gdb) run fishy 1
Starting program: /home/hill/epicsR3.14/epics/base/bin/linux-x86/catime fishy 1
[Thread debugging using libthread_db enabled]
[New Thread -1218524576 (LWP 3995)]
[New Thread -1218528336 (LWP 3998)]
[New Thread -1229018192 (LWP 3999)]
Testing with 1 channels named fishy
channel connect test
[New Thread -1239512144 (LWP 4000)]
Detaching after fork from child process 4001.
Program received signal SIGINT, Interrupt.
[Switching to Thread -1218524576 (LWP 3995)]
0x0011840b in pthread_cond_timedwait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
(gdb) catch throw
Catchpoint 1 (throw)
(gdb) const
Undefined command: "const". Try "help".
(gdb) cont
Continuing.
[New Thread -1250002000 (LWP 4002)]
CA client library is unable to contact CA repeater after 50 tries.
Silence this message by starting a CA repeater daemon
or by calling ca_pend_event() and or ca_poll() more often.
pthread_cond_timedwait failed: error Invalid argument
[Switching to Thread -1229018192 (LWP 3999)]
Catchpoint 1 (exception thrown)
0x007521e6 in __cxa_throw () from /usr/lib/libstdc++.so.5
(gdb) bt
#0 0x007521e6 in __cxa_throw () from /usr/lib/libstdc++.so.5
#1 0x00d7e54a in epicsEvent::wait (this=0x89534dc,
timeOut=0.026991000000000001) at ../../../src/libCom/osi/epicsEvent.cpp:78
#2 0x00d8acf0 in timerQueueActive::run (this=0x8953488)
at ../../../src/libCom/timer/timerQueueActive.cpp:69
#3 0x00d7ca12 in epicsThreadCallEntryPoint (pPvt=0x89534e4)
at ../../../src/libCom/osi/epicsThread.cpp:41
#4 0x00d83174 in start_routine (arg=0x8953788)
at ../../../src/libCom/osi/os/posix/osdThread.c:294
#5 0x00115dec in start_thread () from /lib/tls/libpthread.so.0
#6 0x0028719a in clone () from /lib/tls/libc.so.6
(gdb) info threads
5 Thread -1250002000 (LWP 4002) 0x0011821d in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
4 Thread -1239512144 (LWP 4000) 0x00287dce in recvfrom ()
from /lib/tls/libc.so.6
* 3 Thread -1229018192 (LWP 3999) 0x007521e6 in __cxa_throw ()
from /usr/lib/libstdc++.so.5
2 Thread -1218528336 (LWP 3998) 0x0011821d in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
1 Thread -1218524576 (LWP 3995) 0x0011840b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
(gdb) thread 1
[Switching to thread 1 (Thread -1218524576 (LWP 3995))]#0 0x0011840b in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
(gdb) bt
#0 0x0011840b in pthread_cond_timedwait@@GLIBC_2.3.2 ()
from /lib/tls/libpthread.so.0
#1 0x0029425d in pthread_cond_timedwait@@GLIBC_2.3.2 ()
from /lib/tls/libc.so.6
#2 0x00d849b6 in epicsEventWaitWithTimeout (pev...