crash while exiting ca client
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
EPICS Base | Fix Released | High | Jeff Hill |
Bug Description
From Bob Soliday:
I have R3-14-2_branch from 2009-02-05 on linux-x86_64 built with HOST_OPT=NO and I compiled the included caputTest.c also with HOST_OPT=NO. I ran this inside gdb multiple times and got multiple problems. All the variations of gdb output are included in the CAissues file. Basically if I run it enough times I can get it to crash while exiting.
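Since the crash only shows up on some runs, a small harness that launches the test repeatedly and counts abnormal exits can be handy. The following is a minimal sketch in POSIX C, not part of the original report; `run_once` and `count_signal_exits` are hypothetical helper names, and the path passed in is whatever binary you built.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` once with no arguments and wait for it.
 * Returns the terminating signal number if the child was killed by a
 * signal (e.g. SIGSEGV or SIGABRT), 0 if it exited on its own (including
 * the case where exec failed), and -1 if fork or waitpid failed. */
static int run_once(const char *path)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execl(path, path, (char *)NULL);
        _exit(127);                 /* only reached if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Hypothetical helper: run `path` `runs` times and return how many of
 * those runs ended on a signal. */
static int count_signal_exits(const char *path, int runs)
{
    int crashes = 0;

    for (int i = 0; i < runs; i++) {
        int sig = run_once(path);
        if (sig > 0) {
            fprintf(stderr, "run %d: killed by signal %d\n", i + 1, sig);
            crashes++;
        }
    }
    return crashes;
}
```

Calling `count_signal_exits` with the path to the built test and a run count of a few hundred would report how often the exit path crashes. Note that this sketch does not detect hangs; a run that never exits would block the harness in `waitpid`.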
Additional information:
No problems with this run, but it looks like the last thread never exited.
Starting program: /home/helios1/
[Thread debugging using libthread_db enabled]
[New Thread 140204410697472 (LWP 8040)]
[New Thread 1084721488 (LWP 8050)]
[New Thread 1093708112 (LWP 8051)]
[New Thread 1081780560 (LWP 8052)]
[New Thread 1079662928 (LWP 8053)]
[New Thread 1078569296 (LWP 8054)]
[New Thread 1105955152 (LWP 8055)]
[New Thread 1088198992 (LWP 8056)]
[Thread 1081780560 (LWP 8052) exited]
[Thread 1079662928 (LWP 8053) exited]
[Thread 1105955152 (LWP 8055) exited]
[Thread 1078569296 (LWP 8054) exited]
[Thread 1093708112 (LWP 8051) exited]
[Thread 1084721488 (LWP 8050) exited]
[Thread 1088198992 (LWP 8056) exited]
[New Thread 1093708112 (LWP 8057)]
_______
No problems with this
Starting program: /home/helios1/
[Thread debugging using libthread_db enabled]
[New Thread 139738863658752 (LWP 8058)]
[New Thread 1098991952 (LWP 8068)]
[New Thread 1089534288 (LWP 8069)]
[New Thread 1106905424 (LWP 8070)]
[New Thread 1081514320 (LWP 8071)]
[New Thread 1077770576 (LWP 8072)]
[New Thread 1082042704 (LWP 8073)]
[New Thread 1091991888 (LWP 8074)]
[Thread 1106905424 (LWP 8070) exited]
[Thread 1081514320 (LWP 8071) exited]
[Thread 1082042704 (LWP 8073) exited]
[Thread 1077770576 (LWP 8072) exited]
[Thread 1091991888 (LWP 8074) exited]
[Thread 1089534288 (LWP 8069) exited]
[Thread 1098991952 (LWP 8068) exited]
[New Thread 1089534288 (LWP 8075)]
[Thread 1089534288 (LWP 8075) exited]
_______
Segmentation fault with this one
Starting program: /home/helios1/
[Thread debugging using libthread_db enabled]
[New Thread 139657383642880 (LWP 7791)]
[New Thread 1103264080 (LWP 7801)]
[New Thread 1100908880 (LWP 7802)]
[New Thread 1087953232 (LWP 7803)]
[New Thread 1100618064 (LWP 7804)]
[New Thread 1096583504 (LWP 7805)]
[New Thread 1077164368 (LWP 7806)]
[New Thread 1093613904 (LWP 7807)]
[Thread 1087953232 (LWP 7803) exited]
[Thread 1103264080 (LWP 7801) exited]
[Thread 1100908880 (LWP 7802) exited]
[Thread 1096583504 (LWP 7805) exited]
[Thread 1077164368 (LWP 7806) exited]
[Thread 1100618064 (LWP 7804) exited]
[New Thread 1091684688 (LWP 7808)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1093613904 (LWP 7807)]
0x0000000000436741 in tcpRecvWatchdog
229 this->timer.cancel ();
Current language: auto; currently c++
(gdb) bt
#0 0x0000000000436741 in tcpRecvWatchdog
#1 0x0000000000430d46 in ~tcpiiu (this=0x2222788) at ../tcpiiu.cpp:1015
#2 0x00000000004171f4 in cac::destroyIIU (this=0x21fdc50, iiu=@0x2222788) at ../cac.cpp:1152
#3 0x0000000000433756 in tcpSendThread::run (this=0x22228c8) at ../tcpiiu.cpp:228
#4 0x00000000004463c5 in epicsThreadCall
#5 0x000000000044cdb2 in start_routine (arg=0x22382e0) at ../../.
#6 0x00000031b1806407 in start_thread () from /lib64/
#7 0x00000031b0cd4b0d in clone () from /lib64/libc.so.6
_______
I don't know what happened here
Starting program: /home/helios1/
[Thread debugging using libthread_db enabled]
[New Thread 140484916070144 (LWP 7985)]
[New Thread 1097197904 (LWP 7995)]
[New Thread 1076332880 (LWP 7996)]
[New Thread 1092413776 (LWP 7997)]
[New Thread 1080375632 (LWP 7998)]
[New Thread 1078511952 (LWP 7999)]
[New Thread 1093278032 (LWP 8000)]
[New Thread 1081186640 (LWP 8001)]
[Thread 1092413776 (LWP 7997) exited]
[Thread 1078511952 (LWP 7999) exited]
[Thread 1081186640 (LWP 8001) exited]
[Thread 1093278032 (LWP 8000) exited]
[Thread 1080375632 (LWP 7998) exited]
[Thread 1076332880 (LWP 7996) exited]
[Thread 1097197904 (LWP 7995) exited]
[New Thread 1092413776 (LWP 8002)]
[Switching to Thread 1093278032 (LWP 8000)]
Cannot remove breakpoints because program is no longer writable.
It might be running in another process.
Further execution is probably impossible.
0x00000031b18065ac in start_thread () from /lib64/
ptrace: No such process.
_______
Not sure about this one either
Starting program: /home/helios1/
[Thread debugging using libthread_db enabled]
[New Thread 140095465068288 (LWP 8233)]
[New Thread 1095080272 (LWP 8243)]
[New Thread 1081846096 (LWP 8244)]
[New Thread 1090562384 (LWP 8245)]
[New Thread 1103030608 (LWP 8246)]
[New Thread 1080772944 (LWP 8247)]
[New Thread 1091090768 (LWP 8248)]
[New Thread 1085905232 (LWP 8249)]
[Thread 1090562384 (LWP 8245) exited]
[Thread 1080772944 (LWP 8247) exited]
[Thread 1103030608 (LWP 8246) exited]
[Thread 1095080272 (LWP 8243) exited]
[Thread 1081846096 (LWP 8244) exited]
[Thread 1085905232 (LWP 8249) exited]
[Thread 1091090768 (LWP 8248) exited]
[New Thread 1080772944 (LWP 8250)]
Couldn't get registers: No such process.
_______
It completely hung up with this and gdb had to be killed
Starting program: /home/helios1/
[Thread debugging using libthread_db enabled]
[New Thread 139927175259904 (LWP 8539)]
[New Thread 1086777680 (LWP 8551)]
[New Thread 1081743696 (LWP 8552)]
[New Thread 1097648464 (LWP 8553)]
[New Thread 1089816912 (LWP 8554)]
[New Thread 1106667856 (LWP 8555)]
[New Thread 1089272144 (LWP 8556)]
[New Thread 1103780176 (LWP 8557)]
[Thread 1097648464 (LWP 8553) exited]
[Thread 1106667856 (LWP 8555) exited]
[Thread 1089816912 (LWP 8554) exited]
[Thread 1103780176 (LWP 8557) exited]
[Thread 1089272144 (LWP 8556) exited]
[Thread 1086777680 (LWP 8551) exited]
[Thread 1081743696 (LWP 8552) exited]
[New Thread 1106667856 (LWP 8558)]
_______
For this one I added an abort() to osdMutex.c right after the
"epicsMutex pthread_mutex_lock failed: error Invalid argument"
error message so I could catch the backtrace.
Starting program: /home/helios1/
[Thread debugging using libthread_db enabled]
[New Thread 140616365545216 (LWP 9163)]
[New Thread 1088264528 (LWP 9173)]
[New Thread 1104468304 (LWP 9174)]
[New Thread 1088530768 (LWP 9175)]
[New Thread 1096276304 (LWP 9176)]
[New Thread 1086306640 (LWP 9177)]
[New Thread 1089059152 (LWP 9178)]
[New Thread 1092725072 (LWP 9179)]
[Thread 1088530768 (LWP 9175) exited]
[Thread 1089059152 (LWP 9178) exited]
[Thread 1096276304 (LWP 9176) exited]
[Thread 1092725072 (LWP 9179) exited]
[Thread 1088264528 (LWP 9173) exited]
[Thread 1104468304 (LWP 9174) exited]
[New Thread 1092725072 (LWP 9180)]
epicsMutex pthread_mutex_lock1 failed DEBUG: error Invalid argument
Program received signal SIGABRT, Aborted.
[Switching to Thread 1086306640 (LWP 9177)]
0x00000031b0c30ec5 in raise () from /lib64/libc.so.6
(gdb) bt full
#0 0x00000031b0c30ec5 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00000031b0c32970 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x000000000044e128 in epicsMutexOsdLock (pmutex=0x22b78a0) at ../../.
status = 22
#3 0x0000000000446793 in epicsMutexLock (pmutexNode=
status = epicsMutexLockOK
#4 0x0000000000442095 in freeListFree (pvt=0x22b7870, pmem=0x22e3710) at ../../.
status = epicsMutexLockOK
pfl = (FREELISTPVT *) 0x22b7870
ppnext = (void **) 0x40bfac70
#5 0x00000000004345e2 in cac::releaseSma
No locals.
#6 0x0000000000430da5 in ~tcpiiu (this=0x22da3f0) at ../tcpiiu.cpp:1024
No locals.
#7 0x00000000004171f4 in cac::destroyIIU (this=0x22b5c50, iiu=@0x22da3f0) at ../cac.cpp:1152
mgr = {<notifyGuard> = {notify = @0x22b5860}, cbGuard = {_pTargetMutex = 0x22b5930}}
guard = {_pTargetMutex = 0x22b5928}
addr = {ia = {sin_family = 2, sin_port = 51219, sin_addr = {s_addr = 2466395812}, sin_zero = {30 '\036', 195 'Ã', 66 'B', 0 '\0', 0 '\0',
0 '\0', 0 '\0', 0 '\0'}}, sa = {sa_family = 2, sa_data = "\023Ȥ
#8 0x0000000000433756 in tcpSendThread::run (this=0x22da530) at ../tcpiiu.cpp:228
guard = {_pTargetMutex = 0x22b5928}
#9 0x00000000004463c5 in epicsThreadCall
pThread = (epicsThread *) 0x22da538
waitRelease = true
#10 0x000000000044cdb2 in start_routine (arg=0x22d7e60) at ../../.
pthreadInfo = (epicsThreadOSD *) 0x22d7e60
status = 0
oldtype = 0
blockAllSig = {__val = {18446744067267
#11 0x00000031b1806407 in start_thread () from /lib64/
No symbol table info available.
#12 0x00000031b0cd4b0d in clone () from /lib64/libc.so.6
No symbol table info available.
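In the backtrace above, pthread_mutex_lock returns status = 22 (EINVAL on Linux) inside freeListFree, reached from ~tcpiiu on the TCP send thread. One plausible reading, not confirmed in this report, is that the mutex had already been torn down by the exiting main thread while the send thread was still running. The sketch below is a generic pthreads illustration of that ordering problem and of the usual remedy (join the worker before destroying what it uses). It is not the fix that went into EPICS Base, and every name in it is hypothetical.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for state shared between the main thread and a
 * worker thread; none of these names come from EPICS Base. */
struct shared {
    pthread_mutex_t lock;
    int stop;        /* set by the main thread to request shutdown */
    long work_done;  /* incremented by the worker under the lock */
};

/* Lock, and abort on failure so a debugger stops at the failing call
 * (the same idea as the abort() added to osdMutex.c in the report). */
static void lock_or_abort(pthread_mutex_t *m)
{
    int status = pthread_mutex_lock(m);
    if (status != 0) {
        fprintf(stderr, "pthread_mutex_lock failed: %s\n", strerror(status));
        abort();
    }
}

static void *worker(void *arg)
{
    struct shared *s = arg;

    for (;;) {
        lock_or_abort(&s->lock);
        int stop = s->stop;
        if (!stop)
            s->work_done++;
        pthread_mutex_unlock(&s->lock);
        if (stop)
            break;
    }
    return NULL;
}

/* Start a worker, request shutdown, and tear down in a safe order.
 * Returns 0 on success, -1 if the worker thread could not be created. */
static int run_and_shutdown(void)
{
    struct shared s;
    pthread_t tid;

    s.stop = 0;
    s.work_done = 0;
    pthread_mutex_init(&s.lock, NULL);
    if (pthread_create(&tid, NULL, worker, &s) != 0) {
        pthread_mutex_destroy(&s.lock);
        return -1;
    }

    lock_or_abort(&s.lock);
    s.stop = 1;                      /* ask the worker to finish */
    pthread_mutex_unlock(&s.lock);

    pthread_join(tid, NULL);         /* 1. wait until the worker is gone */
    pthread_mutex_destroy(&s.lock);  /* 2. only then destroy the mutex */
    return 0;
}
```

Swapping steps 1 and 2 would let the worker lock a destroyed mutex, which is undefined behavior under POSIX and could surface as an EINVAL return or as a crash, depending on timing.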
Original Mantis Bug: mantis-334
http://
Here is the source code for the test:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <cadef.h>
#include <epicsVersion.h>

int main(int argc, char **argv)
{
  char name1[40], name2[40];
  chid chid1, chid2;
  long result;

  sprintf(name1, "S30B:P1:msAve:AveEnbBO");
  sprintf(name2, "S31B:P1:msAve:AveEnbBO");

  result = ca_context_create(ca_disable_preemptive_callback);
  if (result != ECA_NORMAL) {
    fprintf(stderr, "CA error %s occurred while trying to start channel access.\n", ca_message(result));
    return(1);
  }

  result = ca_search(name1, &chid1);
  if (result != ECA_NORMAL) {
    fprintf(stderr, "error: problem doing search for %s\n", name1);
    return(1);
  }
  result = ca_search(name2, &chid2);
  if (result != ECA_NORMAL) {
    fprintf(stderr, "error: problem doing search for %s\n", name2);
    return(1);
  }
  result = ca_pend_io(10);
  if (result != ECA_NORMAL) {
    fprintf(stderr, "warning: problem doing search for PVs\n");
  }

  result = ca_state(chid1);
  if (result != cs_conn) {
    result = ca_put(DBR_STRING, chid1, "0");
    if (result != ECA_NORMAL) {
      fprintf(stderr, "error: problem doing put for %s\n", name1);
      return(1);
    }
  }
  result = ca_state(chid2);
  if (result != cs_conn) {
    result = ca_put(DBR_STRING, chid2, "0");
    if (result != ECA_NORMAL) {
      fprintf(stderr, "error: problem doing put for %s\n", name2);
      return(1);
    }
  }
  result = ca_pend_io(10);
  if (result != ECA_NORMAL) {
    fprintf(stderr, "error: problem doing put for PVs\n");
  }

  ca_context_destroy();
  return(0);
}