RTEMS IOC hangs if multiple CA clients start together while IOC is initializing

Bug #541195 reported by Jeff Hill
Affects: EPICS Base
Status: Fix Released
Importance: Critical
Assigned to: Jeff Hill
Milestone: (none listed)

Bug Description

From Till Straumann:

The caRepeater thread calls 'exit()', which is not really
what you want to do from the CA server.

Additional information:
*** base-3.14.6/src/ca/repeater.cpp.orig Mon Feb 2 22:32:00 2004
--- base-3.14.6/src/ca/repeater.cpp Wed Sep 15 18:31:58 2004
***************
*** 484,490 ****
  /*
   * ca_repeater ()
   */
! void ca_repeater ()
  {
      tsFreeList < repeaterClient, 0x20 > freeList;
      int size;
--- 484,490 ----
  /*
   * ca_repeater ()
   */
! static int ca_repeater_func ()
  {
      tsFreeList < repeaterClient, 0x20 > freeList;
      int size;
***************
*** 506,512 ****
          if ( SOCKERRNO == SOCK_EADDRINUSE ) {
              osiSockRelease ();
              debugPrintf ( ( "CA Repeater: exiting because a repeater is already running\n" ) );
!             exit (0);
          }
          char sockErrBuf[64];
          epicsSocketConvertErrnoToString (
--- 506,512 ----
          if ( SOCKERRNO == SOCK_EADDRINUSE ) {
              osiSockRelease ();
              debugPrintf ( ( "CA Repeater: exiting because a repeater is already running\n" ) );
!             return (0);
          }
          char sockErrBuf[64];
          epicsSocketConvertErrnoToString (
***************
*** 515,521 ****
              __FILE__, sockErrBuf );
          osiSockRelease ();
          delete [] pBuf;
!         exit(0);
      }

      debugPrintf ( ( "CA Repeater: Attached and initialized\n" ) );
--- 515,521 ----
              __FILE__, sockErrBuf );
          osiSockRelease ();
          delete [] pBuf;
!         return(0);
      }

      debugPrintf ( ( "CA Repeater: Attached and initialized\n" ) );
***************
*** 574,579 ****
--- 574,586 ----

          fanOut ( from, pMsg, size, freeList );
      }
+     return -1;
+ }
+
+ void ca_repeater(void)
+ {
+     if ( 0 == ca_repeater_func() )
+         exit(0);
  }

  /*
***************
*** 582,588 ****
  extern "C" void caRepeaterThread ( void * /* pDummy */ )
  {
      taskwdInsert ( epicsThreadGetIdSelf(), NULL, NULL );
!     ca_repeater ();
  }

--- 589,596 ----
  extern "C" void caRepeaterThread ( void * /* pDummy */ )
  {
      taskwdInsert ( epicsThreadGetIdSelf(), NULL, NULL );
!     if ( 0 == ca_repeater_func () )
!         taskwdRemove( epicsThreadGetIdSelf() );
  }
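
The pattern used in the patch above can be sketched in isolation. This is only an illustration under assumptions, not the EPICS code: the names repeater_func, repeater_process_main and repeater_thread_main are hypothetical, and the portAlreadyInUse flag stands in for the real socket bind check. The core routine reports its outcome through a return value, and each entry point decides what that outcome means for its own context.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the repeater's main routine. It returns 0 when
// it stops for an expected reason (here: another repeater already owns the
// port) and -1 otherwise. It never calls exit() itself.
static int repeater_func(bool portAlreadyInUse)
{
    if (portAlreadyInUse) {
        return 0;
    }
    // ... the real routine would loop here, forwarding messages ...
    return -1;
}

// Entry point for a standalone process: ending the whole process is the
// intended behaviour in this context.
void repeater_process_main(bool portAlreadyInUse)
{
    if (repeater_func(portAlreadyInUse) == 0) {
        std::exit(0);
    }
}

// Entry point for a thread inside a larger program: the result is recorded
// and the function returns, so only this thread ends.
void repeater_thread_main(bool portAlreadyInUse, int *status)
{
    assert(status != nullptr);
    *status = repeater_func(portAlreadyInUse);
}
```

A thread-creation call would be given repeater_thread_main, while a standalone executable's main() would call repeater_process_main. Keeping exit() out of the shared routine means the thread variant cannot take down its host, whatever exit() happens to do on a given OS.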

Original Mantis Bug: mantis-131
    http://www.aps.anl.gov/epics/mantis/view_bug_page.php?f_id=131

Tags: ca 3.14
Jeff Hill (johill-lanl) wrote :

This code was written a long time ago. With R3.13 it executes as a thread only on vxWorks, and on vxWorks exit() causes only the thread to exit.

Of course, now with R3.14 the code also runs on other OSes, where exit() may behave differently. In particular, on Linux, Solaris, HP-UX, Windows, etc., exit() causes the whole process to exit, but that behavior should not be observed because osiSpawnDetachedProcess() is implemented on these systems, and therefore the CA repeater always runs in an independent detached process. It does not run as a thread in the IOC, as it does on vxWorks.

That leaves RTEMS. The osiSpawnDetachedProcess() function is not (and probably can't be) implemented on RTEMS, so you end up with the CA repeater running as an RTEMS thread. However, perhaps on RTEMS exit() causes the RTEMS OS to shut down? That might be a more appropriate implementation of exit() than the vxWorks one.

What symptoms did you observe that led to finding this fix? I just read that Kate observed a hang (which I assume means that RTEMS shut down).

Jeff Hill (johill-lanl) wrote :

Fixed in R3.14.7

Andrew Johnson (anj) wrote :

R3.14.7 Released
