assert fail when running CA regression tests against in memory DB channel

Bug #541210 reported by Jeff Hill
Affects: EPICS Base
Status: Fix Released
Importance: Wishlist
Assigned to: Jeff Hill

Bug Description

I see the following assertion failure in a callback thread when the CA regression tests run against a local database channel on a Windows SMP system. The test running at the time is multiple_sg_requests (source code included below). I ran the test under Purify and nothing unusual was detected, which possibly rules out memory corruption or use of memory that has already been released back to the pool. I have also included the source code for the application's main function, which calls acctst and may help with reproducing the problem.

A call to "assert (ppn->state==putNotifyRestartCallbackRequested || ppn->state==putNotifyUserCallbackRequested)" failed in ..\dbNotify.c line 176.
EPICS Release EPICS R3.14.6 $Name: $ $Date: 2004/05/28 19:27:47 $.
Current time Tue Oct 19 2004 10:22:55.766769874.
Please E-mail this message to the author or to <email address hidden>
Calling epicsThreadSuspendSelf()

-----------------------------------------------

/* exMain.cpp */
/* Author: Marty Kraimer Date: 17MAR2000 */

#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

#include "epicsThread.h"
#include "iocsh.h"
#include "caDiagnostics.h"

int main(int argc,char *argv[])
{
    if(argc>=2) {
        iocsh(argv[1]);
        epicsThreadSleep(.2);
    }
    acctst ( "hillHost:xxxExample", 10, 1,
   1, ca_enable_preemptive_callback );

    iocsh(NULL);
    return(0);
}

-------------------------------------------------------------

void multiple_sg_requests ( chid chix, CA_SYNC_GID gid )
{
    int status;
    unsigned i;
    static dbr_float_t fvalput = 3.3F;
    static dbr_float_t fvalget;

    for ( i=0; i < 1000; i++ ) {
        if ( ca_write_access (chix) ){
            status = ca_sg_array_put ( gid, DBR_FLOAT, 1,
                        chix, &fvalput);
            SEVCHK ( status, NULL );
        }

        if ( ca_read_access (chix) ) {
            status = ca_sg_array_get ( gid, DBR_FLOAT, 1,
                    chix, &fvalget);
            SEVCHK ( status, NULL );
        }
    }
}

------------------------------------------------------------

Callback thread (is suspended by assert):
 NTDLL.DLL!NtSuspendThread() + 0xb
  Com.dll!epicsThreadSuspendSelf() Line 580 + 0x18 C
  Com.dll!epicsAssert(const char * pFile=0x00268f8c, const unsigned int line=0x000000b0, const char * pExp=0x00268f30, const char * pAuthorName=0x002df2ac) Line 72 C
> dbIoc.dll!notifyCallback(callbackPvt * pcallback=0x0146f468) Line 176 + 0x31 C
  dbIoc.dll!callbackTask(int * ppriority=0x0026b458) Line 125 + 0xb C
  Com.dll!epicsWin32ThreadEntry(void * lpParameter=0x00a8d568) Line 423 + 0xf C
  msvcr71d.dll!_threadstartex(void * ptd=0x00aac9d0) Line 241 + 0xd C
  KERNEL32.DLL!TlsSetValue() + 0xf0

-----------------------------------------------------------

Main thread is running acctst (the ca regression test):
 NTDLL.DLL!NtWaitForSingleObject() + 0xb
  KERNEL32.DLL!WaitForSingleObject() + 0xf
> Com.dll!epicsEvent::wait(double timeOut=1.0000000000000000) Line 72 + 0x14 C++
  dbIoc.dll!dbPutNotifyBlocker::initiatePutNotify(epicsGuard<epicsMutex> & guard={...}, cacWriteNotify & notify={...}, dbAddr & addr={...}, unsigned int type=0x00000002, unsigned long count=0x00000001, const void * pValue=0x00413a50) Line 148 + 0x1d C++
  dbIoc.dll!dbContext::initiatePutNotify(epicsGuard<epicsMutex> & guard={...}, dbChannelIO & chan={...}, dbAddr & addr={...}, unsigned int type=0x00000002, unsigned long count=0x00000001, const void * pValue=0x00413a50, cacWriteNotify & notifyIn={...}, unsigned int * pId=0x01475370) Line 267 C++
  dbIoc.dll!dbChannelIO::write(epicsGuard<epicsMutex> & guard={...}, unsigned int type=0x00000002, unsigned long count=0x00000001, const void * pValue=0x00413a50, cacWriteNotify & notify={...}, unsigned int * pId=0x01475370) Line 122 C++
  ca.dll!oldChannelNotify::write(epicsGuard<epicsMutex> & guard={...}, unsigned int type=0x00000002, unsigned long count=0x00000001, const void * pValue=0x00413a50, cacWriteNotify & notify={...}, unsigned int * pId=0x01475370) Line 624 + 0x2d C++
  ca.dll!syncGroupWriteNotify::begin(epicsGuard<epicsMutex> & guard={...}, unsigned int type=0x00000002, unsigned long count=0x00000001, const void * pValueIn=0x00413a50) Line 45 C++
  ca.dll!CASG::put(epicsGuard<epicsMutex> & guard={...}, oldChannelNotify * pChan=0x0140f010, unsigned int type=0x00000002, unsigned long count=0x00000001, const void * pValue=0x00413a50) Line 220 C++
  ca.dll!ca_sg_array_put(const unsigned int gid=0x00000001, long type=0x00000002, unsigned long count=0x00000001, oldChannelNotify * pChan=0x0140f010, const void * pValue=0x00413a50) Line 266 C++
  ex.exe!multiple_sg_requests(oldChannelNotify * chix=0x0140f010, unsigned int gid=0x00000001) Line 1257 + 0x19 C
  ex.exe!test_sync_groups(oldChannelNotify * chan=0x0140f010, unsigned int interestLevel=0x0000000a) Line 1287 + 0xd C
  ex.exe!acctst(const char * pName=0x0040f920, unsigned int interestLevel=0x0000000a, unsigned int channelCount=0x00000001, unsigned int repetitionCount=0x00000001, ca_preemptive_callback_select select=ca_enable_preemptive_callback) Line 2882 + 0xd C
  ex.exe!main(int argc=0x00000002, char * * argv=0x009ca8f0) Line 21 + 0x12 C++
  ex.exe!mainCRTStartup() Line 398 + 0x11 C
  KERNEL32.DLL!OpenEventA() + 0x63d

Original Mantis Bug: mantis-151
    http://www.aps.anl.gov/epics/mantis/view_bug_page.php?f_id=151

Tags: db 3.14
Revision history for this message
Jeff Hill (johill-lanl) wrote :

Here is the Makefile for the application

TOP=../..

include $(TOP)/configure/CONFIG
#----------------------------------------
# ADD MACRO DEFINITIONS AFTER THIS LINE
#=============================

#==================================================
# build a support library

LIBRARY_IOC += xxxSupport

# xxxRecord.h will be created from xxxRecord.dbd
DBDINC += xxxRecord
# install devXxxSoft.dbd into <top>/dbd
DBD += xxxSupport.dbd

# The following are compiled and added to the Support library
xxxSupport_SRCS += xxxRecord.c
xxxSupport_SRCS += devXxxSoft.c

xxxSupport_LIBS += $(EPICS_BASE_IOC_LIBS)

#=============================
# build an ioc application

PROD_IOC = ex
# <name>.dbd will be created from <name>Include.dbd
DBD += ex.dbd

# <name>_registerRecordDeviceDriver.cpp will be created from <name>.dbd
ex_SRCS += ex_registerRecordDeviceDriver.cpp
ex_SRCS_DEFAULT += exMain.cpp
ex_SRCS_vxWorks += -nil-

# Add locally compiled object code
ex_SRCS += dbSubExample.c devAiXxx.c

# The following adds support from base/src/vxWorks
ex_OBJS_vxWorks += $(EPICS_BASE_BIN)/vxComLibrary

ex_OBJS += $(EPICS_BASE_BIN)/acctst

ex_LIBS += xxxSupport

# NOTES:
# 1)It is not possible to build sncExample both as a component of ex
# and standalone. You must choose only one.
# 2)To build sncExample SNCSEQ must be defined in <top>/configure/RELEASE

# The following builds sncExample as a component of ex
# Also in exInclude.dbd uncomment #registrar(sncExampleRegistrar)
#ex_SRCS += sncExample.stt
#ex_LIBS += seq pv

ex_LIBS += $(EPICS_BASE_IOC_LIBS)

# The following builds sncExample as a standalone application
#PROD_HOST += sncExample
#sncExample_SNCFLAGS += +m
#sncExample_SRCS += sncExample.stt
#sncExample_LIBS += seq pv
#sncExample_LIBS += $(EPICS_BASE_HOST_LIBS)

#===========================

include $(TOP)/configure/RULES
#----------------------------------------
# ADD RULES AFTER THIS LINE

Jeff Hill (johill-lanl) wrote :

From Marty Kraimer:

I don't have any way to test on a Windows SMP system.

Can you add the statement shown in column 1 below and wait for the assert?

STATIC void notifyCallback(CALLBACK *pcallback)
{
    putNotify *ppn=NULL;
    dbCommon *precord;

    callbackGetUser(ppn,pcallback);
    precord = ppn->paddr->precord;
    dbScanLock(precord);
    epicsMutexMustLock(notifyLock);
    assert(precord->ppnr);
if(ppn->state!=putNotifyRestartCallbackRequested
&& ppn->state!=putNotifyUserCallbackRequested) {
printf("notifyCallback bad state %d\n",ppn->state);
}
    assert(ppn->state==putNotifyRestartCallbackRequested
          || ppn->state==putNotifyUserCallbackRequested);

Ralph Lange (ralph-lange) wrote :

There was a problem running the softcallback tests from mrkSoftTest on HP-UX, which looks very much like another symptom of this bug.

Marty was able to fix the problem on HP-UX - he said:
---------------------------------------------------------------------------
  By the way what I am doing is the following:

  If dbPutNotify finds that state==putNotifyUserCallbackActive then it

  ppn->userCallbackWait = 1;
  epicsEventWait(ppn->userCallbackEvent);

  And notifyCallback does the following AFTER the userCallback has completed.

    if(ppn->userCallbackWait) {
        ppn->userCallbackWait = 0;
        epicsEventSignal(ppn->userCallbackEvent);
    }
---------------------------------------------------------------------------
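
The handshake Marty describes can be sketched in portable C. This is only an illustration under stated assumptions, not the dbNotify.c code: a pthread mutex and condition variable stand in for the EPICS lock and userCallbackEvent, and the names notify_sketch, callback_thread, wait_for_user_callback and run_handshake_demo are hypothetical. Because a condition variable does not remember a signal the way an event does, the waiter re-checks the state in a loop.

```c
/* Sketch only: pthreads stand in for epicsMutex/epicsEvent.
 * All type and function names here are hypothetical. */
#include <assert.h>
#include <pthread.h>

typedef enum { stateIdle, stateUserCallbackActive } notify_state;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  callbackDone;     /* plays the role of userCallbackEvent */
    notify_state    state;
    int             userCallbackWait; /* set when a requester is waiting */
    int             callbacksRun;
} notify_sketch;

/* Callback thread: after the user callback has completed, mark the
 * request idle and wake a requester that asked to be told. */
static void *callback_thread(void *arg)
{
    notify_sketch *pn = arg;

    /* ... the user callback would run here, before the lock is taken ... */

    pthread_mutex_lock(&pn->lock);
    pn->callbacksRun++;
    pn->state = stateIdle;
    if (pn->userCallbackWait) {
        pn->userCallbackWait = 0;
        pthread_cond_signal(&pn->callbackDone);
    }
    pthread_mutex_unlock(&pn->lock);
    return NULL;
}

/* Requester: if the previous user callback is still active, wait for it
 * to finish instead of reusing the request while it is in flight. */
static void wait_for_user_callback(notify_sketch *pn)
{
    pthread_mutex_lock(&pn->lock);
    while (pn->state == stateUserCallbackActive) {
        pn->userCallbackWait = 1;
        pthread_cond_wait(&pn->callbackDone, &pn->lock);
    }
    pthread_mutex_unlock(&pn->lock);
}

/* Returns 0 if the requester resumed only after the callback finished. */
static int run_handshake_demo(void)
{
    notify_sketch n;
    pthread_t tid;
    int ok;

    pthread_mutex_init(&n.lock, NULL);
    pthread_cond_init(&n.callbackDone, NULL);
    n.state = stateUserCallbackActive; /* a callback is already in progress */
    n.userCallbackWait = 0;
    n.callbacksRun = 0;

    if (pthread_create(&tid, NULL, callback_thread, &n) != 0)
        return -1;
    wait_for_user_callback(&n);
    pthread_join(tid, NULL);

    ok = (n.state == stateIdle && n.callbacksRun == 1 && n.userCallbackWait == 0);
    pthread_cond_destroy(&n.callbackDone);
    pthread_mutex_destroy(&n.lock);
    return ok ? 0 : -1;
}
```

Calling run_handshake_demo() from a test program should return 0 whichever thread reaches the lock first, since the requester only proceeds once the state has left stateUserCallbackActive.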

So, Jeff, can you run your regression tests against a local database channel on a Windows SMP system again? If they also pass now, we can declare this bug fixed.

edited on: 2004-12-01 12:56

Jeff Hill (johill-lanl) wrote :

No change at this time.

D:\users\hill\R3.14.pcas_fix\epics\appl\iocBoot\iocex>..\..\bin\WIN32-x86\ex st.cmd
#!../../bin/WIN32-x86/ex
## You may have to change ex to something else
## everywhere it appears in this file
< envPaths
epicsEnvSet(ARCH,"WIN32-x86")
epicsEnvSet(IOC,"iocex")
epicsEnvSet(TOP,"D:/users/hill/R3.14.pcas_fix/epics/appl")
epicsEnvSet(EPICS_BASE,"D:/users/hill/R3.14.pcas_fix/epics/base")
cd D:/users/hill/R3.14.pcas_fix/epics/appl
## Register all support components
dbLoadDatabase("dbd/ex.dbd")
ex_registerRecordDeviceDriver(pdbbase)
## Load record instances
dbLoadRecords("db/dbExample1.db","user=hillHost")
dbLoadRecords("db/dbExample2.db","user=hillHost,no=1,scan=1 second")
dbLoadRecords("db/dbExample2.db","user=hillHost,no=2,scan=2 second")
dbLoadRecords("db/dbExample2.db","user=hillHost,no=3,scan=5 second")
dbLoadRecords("db/dbSubExample.db","user=hillHost")
## Set this to see messages from mySub
#var mySubDebug 1
cd D:/users/hill/R3.14.pcas_fix/epics/appl/iocBoot/iocex
iocInit()
Starting iocInit
############################################################################
### EPICS IOC CORE built on Oct 29 2004
### EPICS R3.14.6 $Name: $ $Date: 2004/05/28 19:27:47 $
############################################################################
iocInit: All initialization complete
## Start any sequence programs
#seq sncExample,"user=hillHost"
CA Client V4.11, channel name "hillHost:xxxExample", timeout 1e+020
Preemptive call back is enabled.
Waiting for test channel to connect..confirmed.
verifyImmediateTearDown {..........} 0.226868 sec
verifyTearDownWhenChannelConnected {} 0.002090 sec
unequalServerBufferSizeTest {...} 0.000316 sec
connecting to test channel {} 0.000087 sec
native type was DBF_DOUBLE, native count was 1
testing with a local channel
Canonical name for channel was "hillHost:xxxExample.VAL"
clearChannelInGetCallbackTest {} 0.000107 sec
monitorAddConnectionCallbackTest {} 0.102760 sec
verifyConnectWithDisconnectedChannels {..........} 11.050653 sec
grEnumTest {} 0.000185 sec
test_sync_groups {

A call to "assert (ppn->state==putNotifyRestartCallbackRequested || ppn->state==putNotifyUserCallbackRequested)" failed in ..\dbNotify.c line 176.
EPICS Release EPICS R3.14.6 $Name: $ $Date: 2004/05/28 19:27:47 $.
Current time Fri Dec 03 2004 09:30:55.705209189.
Please E-mail this message to the author or to <email address hidden>
Calling epicsThreadSuspendSelf()
filename="..\..\..\src\libCom\taskwd\taskwd.c" line number=170
task 00A70DF8 suspended

Jeff Hill (johill-lanl) wrote :

Oops, my path was wrong and I was using the wrong DLLs. I fixed that, and now my regression tests pass without incident against both a remote and an in-process PV.

Sorry about the confusion. I have a lot going on today, and I am also preparing for the Japan meeting.

Jeff Hill (johill-lanl) wrote :

Marty fixed this, but he requested that I close this out.

Andrew Johnson (anj) wrote :

R3.14.7 Released
