caget segmentation fault

Bug #729331 reported by Matthieu Bec
This bug report is a duplicate of: Bug #717252: local caput causes ioc crash on win32.
This bug affects 1 person

Affects: EPICS Base
Status: Triaged
Importance: Undecided
Assigned to: Jeff Hill

Bug Description

base= R3.14.12
host arch= linux-x86_64

Trying to caget an existing channel over a slow network yields a segmentation fault:

% caget lbb:heartbeat
Channel connect timed out: 'lbb:heartbeat' not found.
Segmentation fault (core dumped)

From the same terminal session, I can access the channel by increasing the wait time:

% caget -w5 lbb:heartbeat
lbb:heartbeat 58

Despite the message saying 'not found' when I get the core dump, channels that really do not exist are handled correctly:

% caget lbb:doesnotexist
Channel connect timed out: 'lbb:doesnotexist' not found.

A recompiled linux-x86_64-debug build also dumps core with the same example:

% gdb -e ./bin/linux-x86_64-debug/caget -c core.6231
GNU gdb (GDB) Fedora (7.2-41.fc14)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
[New Thread 6235]
[New Thread 6236]
[New Thread 6233]
[New Thread 6231]
[New Thread 6232]
Missing separate debuginfo for /export/home/mdcb/work/sbfsvn01/epics-base/base-3.14.12/lib/linux-x86_64-debug/libca.so.3.14
Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/60/35db7a40deaae380732e9150a092ceec849487
Missing separate debuginfo for /export/home/mdcb/work/sbfsvn01/epics-base/base-3.14.12/lib/linux-x86_64-debug/libCom.so.3.14
Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/86/208419a7cd12459c67e6a9ee14d0697fc407e3
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/cf/62b562b5e2a7fce9fc0fc1e72d9549366ac429
Reading symbols from /export/home/mdcb/work/sbfsvn01/epics-base/base-3.14.12/lib/linux-x86_64-debug/libca.so.3.14...done.
Loaded symbols for /export/home/mdcb/work/sbfsvn01/epics-base/base-3.14.12/lib/linux-x86_64-debug/libca.so.3.14
Reading symbols from /export/home/mdcb/work/sbfsvn01/epics-base/base-3.14.12/lib/linux-x86_64-debug/libCom.so.3.14...done.
Loaded symbols for /export/home/mdcb/work/sbfsvn01/epics-base/base-3.14.12/lib/linux-x86_64-debug/libCom.so.3.14
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /usr/lib64/libstdc++.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /lib64/libnss_mdns4_minimal.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_mdns4_minimal.so.2
Reading symbols from /lib64/libnss_dns.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_dns.so.2
Reading symbols from /lib64/libresolv.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libresolv.so.2
Core was generated by `./bin/linux-x86_64-debug/caget lbb:heartbeat'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f612b874538 in tsDLList<nciu>::remove (this=0x7f6124000b38, item=...) at ../../../include/tsDLList.h:230
230 nextNode.pPrev = theNode.pPrev;
Missing separate debuginfos, use: debuginfo-install glibc-2.13-1.x86_64 libgcc-4.5.1-4.fc14.x86_64 libstdc++-4.5.1-4.fc14.x86_64 nss-mdns-0.10-8.fc12.x86_64
(gdb) t apply all bt

Thread 5 (Thread 0x7f612b597700 (LWP 6232)):
#0 0x000000387760b3b4 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f612b5fdebc in condWait (condId=0x24f99a8, mutexId=0x24f9980) at ../../../src/libCom/osi/os/posix/osdEvent.c:75
#2 0x00007f612b5fe214 in epicsEventWait (pevent=0x24f9980) at ../../../src/libCom/osi/os/posix/osdEvent.c:137
#3 0x00007f612b5f6dfa in epicsEvent::wait (this=0x24f9838) at ../../../src/libCom/osi/epicsEvent.cpp:63
#4 0x00007f612b5f3707 in ipAddrToAsciiEnginePrivate::run (this=0x24f93f0) at ../../../src/libCom/misc/ipAddrToAsciiAsynchronous.cpp:305
#5 0x00007f612b5f542c in epicsThreadCallEntryPoint (pPvt=0x24f9848) at ../../../src/libCom/osi/epicsThread.cpp:85
#6 0x00007f612b5fc85e in start_routine (arg=0x24f9bb0) at ../../../src/libCom/osi/os/posix/osdThread.c:282
#7 0x0000003877606ccb in start_thread () from /lib64/libpthread.so.0
#8 0x0000003876ae0c2d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f612b599740 (LWP 6231)):
#0 0x000000387760b3b4 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f612b5fdebc in condWait (condId=0x24f92e8, mutexId=0x24f92c0) at ../../../src/libCom/osi/os/posix/osdEvent.c:75
#2 0x00007f612b5fe214 in epicsEventWait (pevent=0x24f92c0) at ../../../src/libCom/osi/os/posix/osdEvent.c:137
#3 0x00007f612b5f6dfa in epicsEvent::wait (this=0x24f8fd8) at ../../../src/libCom/osi/epicsEvent.cpp:63
#4 0x00007f612b860871 in cac::~cac (this=0x24f8de0, __in_chrg=<value optimized out>) at ../cac.cpp:312
#5 0x00007f612b860d0e in cac::~cac (this=0x24f8de0, __in_chrg=<value optimized out>) at ../cac.cpp:343
#6 0x00007f612b889889 in epics_auto_ptr<cacContext, (epics_auto_ptr_type)0>::destroyTarget() ()
   from /export/home/mdcb/work/sbfsvn01/epics-base/base-3.14.12/lib/linux-x86_64-debug/libca.so.3.14
#7 0x00007f612b888ddc in epics_auto_ptr<cacContext, (epics_auto_ptr_type)0>::reset(cacContext*) ()
   from /export/home/mdcb/work/sbfsvn01/epics-base/base-3.14.12/lib/linux-x86_64-debug/libca.so.3.14
#8 0x00007f612b8867de in ca_client_context::~ca_client_context (this=0x24f89f0, __in_chrg=<value optimized out>) at ../ca_client_context.cpp:188
#9 0x00007f612b886a60 in ca_client_context::~ca_client_context (this=0x24f89f0, __in_chrg=<value optimized out>) at ../ca_client_context.cpp:193
#10 0x00007f612b86b8bd in ca_context_destroy () at ../access.cpp:252
#11 0x0000000000403737 in ?? ()
#12 0x00007fff34f759f8 in ?? ()
#13 0x0000000200000100 in ?? ()
#14 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f612b516700 (LWP 6233)):
#0 0x000000387760b71e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f612b5fde78 in condTimedwait (condId=0x24fa1c8, mutexId=0x24fa1a0, time=0x7f612b515ce0) at ../../../src/libCom/osi/os/posix/osdEvent.c:65
#2 0x00007f612b5fe34a in epicsEventWaitWithTimeout (pevent=0x24fa1a0, timeout=1.7976931348623157e+308) at ../../../src/libCom/osi/os/posix/osdEvent.c:156
#3 0x00007f612b5f6e6c in epicsEvent::wait (this=0x24f9f70, timeOut=1.7976931348623157e+308) at ../../../src/libCom/osi/epicsEvent.cpp:72
#4 0x00007f612b605749 in timerQueueActive::run (this=0x24f9eb0) at ../../../src/libCom/timer/timerQueueActive.cpp:93
#5 0x00007f612b5f542c in epicsThreadCallEntryPoint (pPvt=0x24f9f80) at ../../../src/libCom/osi/epicsThread.cpp:85
#6 0x00007f612b5fc85e in start_routine (arg=0x24fa3d0) at ../../../src/libCom/osi/os/posix/osdThread.c:282
#7 0x0000003877606ccb in start_thread () from /lib64/libpthread.so.0
#8 0x0000003876ae0c2d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f612b1ca700 (LWP 6236)):
#0 0x000000387760b71e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f612b5fde78 in condTimedwait (condId=0x7f6124009e58, mutexId=0x7f6124009e30, time=0x7f612b1c9ba0) at ../../../src/libCom/osi/os/posix/osdEvent.c:65
#2 0x00007f612b5fe34a in epicsEventWaitWithTimeout (pevent=0x7f6124009e30, timeout=30) at ../../../src/libCom/osi/os/posix/osdEvent.c:156
#3 0x00007f612b5f6e6c in epicsEvent::wait (this=0x7f61240009d0, timeOut=30) at ../../../src/libCom/osi/epicsEvent.cpp:72
#4 0x00007f612b5f57cf in epicsThread::exitWait (this=0x7f61240009b0, delay=30) at ../../../src/libCom/osi/epicsThread.cpp:154
#5 0x00007f612b87c4aa in tcpRecvThread::exitWait (this=0x7f61240009a8, delay=30) at ../tcpiiu.cpp:402
#6 0x00007f612b87bc67 in tcpSendThread::run (this=0x7f6124000a00) at ../tcpiiu.cpp:205
#7 0x00007f612b5f542c in epicsThreadCallEntryPoint (pPvt=0x7f6124000a08) at ../../../src/libCom/osi/epicsThread.cpp:85
#8 0x00007f612b5fc85e in start_routine (arg=0x7f612400a250) at ../../../src/libCom/osi/os/posix/osdThread.c:282
#9 0x0000003877606ccb in start_thread () from /lib64/libpthread.so.0
#10 0x0000003876ae0c2d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f612b24b700 (LWP 6235)):
#0 0x00007f612b874538 in tsDLList<nciu>::remove (this=0x7f6124000b38, item=...) at ../../../include/tsDLList.h:230
#1 0x00007f612b8815f2 in tcpiiu::connectNotify (this=0x7f61240008c0, guard=..., chan=...) at ../tcpiiu.cpp:1944
#2 0x00007f612b863764 in cac::createChannelRespAction (this=0x24f8de0, mgr=..., iiu=..., hdr=...) at ../cac.cpp:1126
#3 0x00007f612b863b80 in cac::executeResponse (this=0x24f8de0, mgr=..., iiu=..., currentTime=..., hdr=..., pMshBody=0x7f612400a4b0 "") at ../cac.cpp:1193
#4 0x00007f612b87f006 in tcpiiu::processIncoming (this=0x7f61240008c0, currentTime=..., mgr=...) at ../tcpiiu.cpp:1252
#5 0x00007f612b87c961 in tcpRecvThread::run (this=0x7f61240009a8) at ../tcpiiu.cpp:519
#6 0x00007f612b5f542c in epicsThreadCallEntryPoint (pPvt=0x7f61240009b0) at ../../../src/libCom/osi/epicsThread.cpp:85
#7 0x00007f612b5fc85e in start_routine (arg=0x7f6124009ea0) at ../../../src/libCom/osi/os/posix/osdThread.c:282
#8 0x0000003877606ccb in start_thread () from /lib64/libpthread.so.0
#9 0x0000003876ae0c2d in clone () from /lib64/libc.so.6
(gdb)
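
For readers following the backtrace: the faulting statement in Thread 1, `nextNode.pPrev = theNode.pPrev;`, is the kind of line that fails when remove() is handed a node that is no longer linked into the list. Below is a minimal sketch of that failure mode, assuming a simplified intrusive list. `Node`, `List`, `removeWouldCrash` and `lateRemoveAfterDrain` are hypothetical names made up for illustration; this is not EPICS code, and the real tsDLList may differ in detail.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified intrusive doubly linked list (not EPICS code).
struct Node {
    Node* pNext = nullptr;
    Node* pPrev = nullptr;
};

struct List {
    Node* pFirst = nullptr;
    Node* pLast = nullptr;
    std::size_t count = 0;

    // Append a node at the tail.
    void add(Node& n) {
        n.pNext = nullptr;
        n.pPrev = pLast;
        if (pLast) pLast->pNext = &n; else pFirst = &n;
        pLast = &n;
        ++count;
    }

    // Pop the head, or return nullptr when the list is empty.
    Node* get() {
        Node* n = pFirst;
        if (!n) return nullptr;
        pFirst = n->pNext;
        if (pFirst) pFirst->pPrev = nullptr; else pLast = nullptr;
        n->pNext = n->pPrev = nullptr;
        --count;
        return n;
    }

    // remove() trusts the caller: the node MUST be a member of this list.
    void remove(Node& n) {
        if (pLast == &n) pLast = n.pPrev;
        else n.pNext->pPrev = n.pPrev;   // null dereference for a non-member
        if (pFirst == &n) pFirst = n.pNext;
        else n.pPrev->pNext = n.pNext;
        --count;
    }

    // Checks, without touching memory, the precondition remove() assumes.
    bool removeWouldCrash(const Node& n) const {
        return (pLast != &n && n.pNext == nullptr) ||
               (pFirst != &n && n.pPrev == nullptr);
    }
};

// Scenario: a shutdown path drains the list with get(), then a late
// response tries to remove() the same node. Returns true when that late
// remove() would dereference a null pointer.
bool lateRemoveAfterDrain() {
    List pending;
    Node chan;
    pending.add(chan);
    assert(!pending.removeWouldCrash(chan));  // still a member: safe
    pending.get();                            // drained during shutdown
    return pending.removeWouldCrash(chan);
}
```

Calling `lateRemoveAfterDrain()` returns true in this model: once the node has been popped, nothing in the node itself says it has left the list, so a caller that still believes it is a member walks into a null pointer.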

Jeff Hill (johill-lanl) wrote :

This appears to be similar to the second issue in bug 717252.

This crash may already be fixed by an earlier revision, 12173 (committed 2011-01-15), to ca/tcpiiu.cpp. Unfortunately, I am having trouble finding a bug report in Launchpad to link it to.

Changed in epics-base:
status: New → Triaged
assignee: nobody → Jeff Hill (johill-lanl)
Jeff Hill (johill-lanl) wrote :

revision 12173

C:\hill\epicsInBazaar\R3.14\trunk\src\ca>bzr diff -c12173 tcpiiu.cpp
=== modified file 'src/ca/tcpiiu.cpp'
--- src/ca/tcpiiu.cpp 2010-09-20 21:21:50 +0000
+++ src/ca/tcpiiu.cpp 2011-01-15 00:53:33 +0000
@@ -1866,10 +1866,14 @@
     guard.assertIdenticalMutex ( this->mutex );

     while ( nciu * pChan = this->createReqPend.get () ) {
+        pChan->channelNode::listMember =
+            channelNode::cs_none;
         pChan->serviceShutdownNotify ( cbGuard, guard );
     }

     while ( nciu * pChan = this->createRespPend.get () ) {
+        pChan->channelNode::listMember =
+            channelNode::cs_none;
         // we dont yet know the server's id so we cant
         // send a channel delete request and will instead
         // trust that the server can do the proper cleanup
@@ -1878,12 +1882,16 @@
     }

     while ( nciu * pChan = this->v42ConnCallbackPend.get () ) {
+        pChan->channelNode::listMember =
+            channelNode::cs_none;
         this->clearChannelRequest ( guard,
             pChan->getSID(guard), pChan->getCID(guard) );
         pChan->serviceShutdownNotify ( cbGuard, guard );
     }

     while ( nciu * pChan = this->subscripReqPend.get () ) {
+        pChan->channelNode::listMember =
+            channelNode::cs_none;
         pChan->disconnectAllIO ( cbGuard, guard );
         this->clearChannelRequest ( guard,
             pChan->getSID(guard), pChan->getCID(guard) );
@@ -1891,6 +1899,8 @@
     }

     while ( nciu * pChan = this->connectedList.get () ) {
+        pChan->channelNode::listMember =
+            channelNode::cs_none;
         pChan->disconnectAllIO ( cbGuard, guard );
         this->clearChannelRequest ( guard,
             pChan->getSID(guard), pChan->getCID(guard) );
@@ -1898,6 +1908,8 @@
     }

     while ( nciu * pChan = this->unrespCircuit.get () ) {
+        pChan->channelNode::listMember =
+            channelNode::cs_none;
         pChan->disconnectAllIO ( cbGuard, guard );
         // if we know that the circuit is unresponsive
         // then we dont send a channel delete request and
@@ -1907,6 +1919,8 @@
     }

      while ( nciu * pChan = this->subscripUpdateReqPend.get () ) {
+        pChan->channelNode::listMember =
+            channelNode::cs_none;
         pChan->disconnectAllIO ( cbGuard, guard );
         this->clearChannelRequest ( guard,
             pChan->getSID(guard), pChan->getCID(guard) );
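
The pattern in the diff above (reset the channel's list-membership tag whenever a shutdown loop pops it from a list) can be sketched in isolation. This is a simplified model under stated assumptions, not the EPICS implementation: `ListTag`, `Channel`, `Circuit`, `Outcome` and `lateResponseAfterShutdown` are hypothetical names, a `std::deque` stands in for the intrusive list, and it is an assumption that the late-response handler consults the tag before removing the channel.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

// Hypothetical stand-in for the per-channel membership tag.
enum class ListTag { none, createRespPend, subscripReqPend };

// What happened to a create-channel response.
enum class Outcome { moved, ignored, staleTag };

struct Channel {
    ListTag listMember = ListTag::none;
};

struct Circuit {
    std::deque<Channel*> createRespPend;
    std::deque<Channel*> subscripReqPend;

    // A create request was sent; the channel now awaits its response.
    void requestCreate(Channel& c) {
        createRespPend.push_back(&c);
        c.listMember = ListTag::createRespPend;
    }

    // Shutdown path: drain the pending list.
    // resetTag == true models the patched loops, false the unpatched ones.
    void disconnectAll(bool resetTag) {
        while (!createRespPend.empty()) {
            Channel* c = createRespPend.front();
            createRespPend.pop_front();
            if (resetTag) c->listMember = ListTag::none;
        }
    }

    // A create-channel response arrived for channel c.
    Outcome connectNotify(Channel& c) {
        if (c.listMember != ListTag::createRespPend)
            return Outcome::ignored;          // tag says: not pending
        auto it = std::find(createRespPend.begin(), createRespPend.end(), &c);
        if (it == createRespPend.end())
            return Outcome::staleTag;         // an intrusive remove() would crash here
        createRespPend.erase(it);
        subscripReqPend.push_back(&c);
        c.listMember = ListTag::subscripReqPend;
        return Outcome::moved;
    }
};

// Scenario from this report: the response arrives after shutdown has
// already drained the pending list.
Outcome lateResponseAfterShutdown(bool resetTag) {
    Circuit circuit;
    Channel chan;
    circuit.requestCreate(chan);
    circuit.disconnectAll(resetTag);
    return circuit.connectNotify(chan);
}
```

In this model `lateResponseAfterShutdown(false)` yields `Outcome::staleTag` (the tag still claims membership in a list the channel has left), while `lateResponseAfterShutdown(true)` yields `Outcome::ignored`. The second result is at least consistent with a late response being reported and discarded rather than acted on.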

Matthieu Bec (mbec) wrote :

The segfault is gone after applying the patch. Thank you very much.

The use case that would previously core dump now gives extra output. Not really a concern, but copied below for info:

./bin/linux-x86_64/caget lbb:heartbeat
Channel connect timed out: 'lbb:heartbeat' not found.
CA Client Library: Ignored duplicate create channel response from CA server?
