CA client lib failure when IOC 'quit' (exit?) and immediate recreation of context
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
EPICS Base | Fix Released | Wishlist | Jeff Hill | | |
Bug Description
From: Victor F E Pucknell
Sent: 16 January 2008 15:01
To: Duggan, AJ (Andrew)
Cc: Owens, PH (Peter); Letts, SC (Simon)
Subject: EPICS problem
Here is a simple test using the EPICS caget.
Normally it runs in a loop, calling caget every 2 seconds.
If you "quit" the EPICS server, I get a serious failure and the test program is lost somewhere in the CA library.
Here is the output from the test program:
#############
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
CA.Client.
Warning: "Virtual circuit disconnect"
Context: "nndhcp069.
Source File: ../cac.cpp line 1126
Current Time: Wed Jan 16 2008 14:46:01.574167550
.......
CA.Client.
Warning: "Virtual circuit disconnect"
Context: "op=0, channel=
Source File: ../getCopy.cpp line 86
Current Time: Wed Jan 16 2008 14:46:01.574311751
.......
ca_pend_io failed User specified timeout on IO operation expired
ca_get failed Virtual circuit disconnect
EPICS server failure - resetting
A call to "assert (_pTargetMutex == & mutexToVerify)" failed in ../../.
EPICS Release EPICS R3.14.9-3.14.9 $R3-14-9$ $2007/02/05 16:31:45$.
Current time Wed Jan 16 2008 14:46:08.582331945.
Please E-mail this message to Jeff Hill <email address hidden> or to <email address hidden>
Calling epicsThreadSusp
#######
However, if I kill the EPICS server using Control+C, the application does seem to recover and resumes once I restart the EPICS server.
##################
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
CA.Client.
Warning: "Virtual circuit disconnect"
Context: "nndhcp069.
Source File: ../cac.cpp line 1126
Current Time: Wed Jan 16 2008 14:50:41.049109570
.......
CA.Client.
Warning: "Virtual circuit disconnect"
Context: "op=0, channel=
ctx="nndhcp069.
Source File: ../getCopy.cpp line 86
Current Time: Wed Jan 16 2008 14:50:41.049255811
.......
ca_pend_io failed User specified timeout on IO operation expired
ca_get failed Virtual circuit disconnect
EPICS server failure - resetting
Attempting start....
calling ca_context_create
calling ca_create_channel
calling ca_pend_io
ca_pend_io failed User specified timeout on IO operation expired
Attempting start....
calling ca_context_create
calling ca_create_channel
calling ca_pend_io
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
ca_get returned with rc=0x1
ca_pend_io returned with rc=0x1: value=0x0
#######
This was not an exhaustive trial. I tried each only a couple of times.
However, the assert failure and the suspend are typical of what we see.
Vic
Additional information:
#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
#include <stddef.h>
#include <string.h>
#include "cadef.h"
#define EPICS_TIMEOUT 5.0
int main(int argc, char *argv[])
{
    char PVN[17] = "MEIS-B-
    int rc;
    chid CID;
    double timeout = EPICS_TIMEOUT;
    unsigned short value;
START:
    printf(
    sleep(5);
    printf("calling ca_context_
    rc = ca_context_
    if (rc != ECA_NORMAL) {
        goto START;
    }
    printf("calling ca_create_
    rc = ca_create_
    if (rc != ECA_NORMAL) {
        (void) ca_context_
        goto START;
    }
    printf("calling ca_pend_io\n");
    rc = ca_pend_
    if (rc != ECA_NORMAL) {
        goto START;
    }
    for(;;) {
        rc = ca_get(DBR_SHORT, CID, &value);
        if (rc != ECA_NORMAL) {
// if (CA_EXTRACT_
            (void) ca_context_
// }
            goto START;
        }
        printf("ca_get returned with rc=0x%x\n", rc);
        rc = ca_pend_
        if (rc != ECA_NORMAL) {
        } else {
        }
        sleep(2);
    }
    exit(0);
}
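A note on the rc=0x1 values in the output above: the test program compares each return code against ECA_NORMAL, and 0x1 is what the success case prints. The sketch below decodes such a status word, assuming the layout defined in caerr.h (severity in the low three bits, message number in the bits above them, success flagged by the lowest bit). The helper names are hypothetical and exist only for illustration; real code should use the CA_EXTRACT_* macros from caerr.h, which the commented-out line in the program appears to reference.

```c
#include <stdio.h>

/* Hypothetical helpers, not part of the CA API. They assume the status
 * word layout from caerr.h: bits 0-2 hold the severity, bits 3-15 hold
 * the message number, and bit 0 set means the call succeeded. */
unsigned status_severity(unsigned status)
{
    return status & 0x7u;
}

unsigned status_msg_no(unsigned status)
{
    return (status & 0xFFF8u) >> 3;
}

int status_is_success(unsigned status)
{
    return (status & 0x1u) != 0;
}

/* Print a status word in the same rc=0x.. style as the test program. */
void print_status(unsigned status)
{
    printf("rc=0x%x: msg_no=%u severity=%u success=%d\n",
           status,
           status_msg_no(status),
           status_severity(status),
           status_is_success(status));
}
```

Calling print_status(0x1) reports message number 0 with the success bit set, which matches the ECA_NORMAL comparisons in the program.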
Original Mantis Bug: mantis-306
http://
I snagged a stack trace this morning
#0 0x001183ad in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
#1 0x0036a006 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libc.so.6
#2 0x006471a2 in epicsEventWait (pevent=0x99850a0) at ../../../src/libCom/osi/os/posix/osdEvent.c:77
#3 0x00644f30 in epicsThreadSuspendSelf () at ../../../src/libCom/osi/os/posix/osdThread.c:486
#4 0x00643967 in epicsAssert (pFile=0x48b2f0 "../../../include/epicsGuard.h", line=84, pExp=0x48b4a4 "_pTargetMutex == & mutexToVerify", pAuthorName=0x653e90 "Calling epicsThreadSuspendSelf()n") at ../../../src/libCom/osi/os/default/osdAssert.c:71
#5 0x0046fb94 in nciu::setServerAddressUnknown (this=0x99a6198, newiiu=@0xfffffffc, guard=@0x1) at ../../../include/epicsGuard.h:84
#6 0x00470828 in nciu::serviceShutdownNotify (this=0x99a6198, callbackControlGuard=@0xbfffadc0, mutualExclusionGuard=@0xbfffadb0) at ../nciu.cpp:577
#7 0x0046ea3a in disconnectGovernorTimer::shutdown (this=0x99a48d8, cbGuard=@0xbfffadc0, guard=@0xbfffadb0) at ../disconnectGovernorTimer.cpp:61
#8 0x004732c1 in udpiiu::shutdown (this=0x9994470, cbGuard=@0xbfffadc0, guard=@0xbfffadb0) at ../udpiiu.cpp:286
#9 0x00460a6a in ~cac (this=0x998a790) at ../cac.cpp:243
#10 0x0047ffc6 in ~ca_client_context (this=0x998af90) at ../../../include/epicsMemory.h:55
#11 0x004666c1 in ca_context_destroy () at ../access.cpp:251
#12 0x08048812 in main (argc=1, argv=0xbfffaf04) at ../test.c:41
(gdb)
edited on: 2008-09-25 09:43