gddScalar::new() operator is not fully thread safe

Bug #541372 reported by Jeff Hill
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
EPICS Base
Fix Released
High
Jeff Hill

Bug Description

Dirk Wrote:

 From what I can see in the macro definition of gdd_NEWDEL_NEW(gdd) in gddNewDel.h, the gddScalar::new() operator is not fully thread safe.
Initialization of the freelist is not protected. Thus, calling gdd*::new() for the first time in two different threads may crash. Maybe using gdd*::new() the first time while still in single threaded context may cure the problem.

Dirk

Bruno Coudoin wrote:
> Hi,
>
> Tonight I found something odd. I am perhaps doing something wrong but
> after several test the results were consistant.
>
> I have a multithreaded application, at startup each thread creates
> several gdd. It ends up frequently in the following SIGSEGV:
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x98a35b90 (LWP 1020)]
> 0x00f2b2b8 in gdd::operator new (size=44) at ../gdd.cc:26
> 26 gdd_NEWDEL_NEW(gdd)
>
> My code to create the gdd is as simple as:
> gdd *pDD;
> pDD = new gddScalar ( gddAppType_value, aitEnumInt32 );
>
> It seems like my program crashes at startup but if it passed the first
> gdd creation of each threads, it becomes stable after that. More
> threads I have, more chance I have to see the crash.
>
> My configuration:
> Epics 3.14.10
> CentOS 5.3
> Multi core processor.
>
> Has anybody ever seen this issue? I'll try to dig further tomorrow, if
> someone has ideas on workarounds or tests to do to refine the issue,
> your welcome.
>
> Bruno.
>
>
>

Original Mantis Bug: mantis-343
    http://www.aps.anl.gov/epics/mantis/view_bug_page.php?f_id=343

Tags: cas 3.14
Revision history for this message
bcoudoin (bruno-coudoin-eads) wrote :

I found the issue, in gddNewDel.h:gdd_NEWDEL_NEW() there is a race condition.

In this method, a first test is made if there are gdd left in the pool. If not it creates 20 more gdds. Then it goes on and uses one gdd of the pool for the new object being created.

What happens is that 2 threads can run while there is a single gdd in the pool. Then none will create gdds at the first stage. At the second stage, the first one will succeed but the second one will crash because there is no gdd left in the pool.

The solution is to move the guard at the start of the method instead of being into the two inner cases.

Revision history for this message
bcoudoin (bruno-coudoin-eads) wrote :

I don't see a way to attach a file. In case here is the patch.

--- epics/base-3.14.10/src/gdd/gddNewDel.h
+++ epics/base-3.14.10/src/gdd/gddNewDel.h
@@ -93,18 +93,17 @@
     int tot; \
     clas *nn,*dn; \
     epicsThreadOnce ( &once, clas##_gddNewDelInit, 0 ); \
+ epicsGuard < epicsMutex > guard ( *clas::pNewdel_lock ); \
     if(!clas::newdel_freelist) { \
         tot=gdd_CHUNK_NUM; \
         nn=(clas*)malloc(gdd_CHUNK(clas)); \
         gddGlobalCleanupAdd (nn); \
         for(dn=nn;--tot;dn++) dn->newdel_setNext((char*)(dn+1)); \
- epicsGuard < epicsMutex > guard ( *clas::pNewdel_lock ); \
         (dn)->newdel_setNext(clas::newdel_freelist); \
         clas::newdel_freelist=(char*)nn; \
     } \
     if(size==sizeof(clas)) { \
         { \
- epicsGuard < epicsMutex > guard ( *clas::pNewdel_lock ); \
             dn=(clas*)clas::newdel_freelist; \
             clas::newdel_freelist=((clas*)clas::newdel_freelist)->newdel_next(); \
         } \

Revision history for this message
Jeff Hill (johill-lanl) wrote :

I applied this patch to the R3.14 branch

diff -u -b -w -b -r1.6.2.2 gddNewDel.h
--- gddNewDel.h 10 Dec 2008 21:51:19 -0000 1.6.2.2
+++ gddNewDel.h 1 Jul 2009 22:16:49 -0000
@@ -95,18 +95,17 @@
     int tot; \
     clas *nn,*dn; \
     epicsThreadOnce ( &once, clas##_gddNewDelInit, 0 ); \
+ epicsGuard < epicsMutex > guard ( *clas::pNewdel_lock ); \
     if(!clas::newdel_freelist) { \
         tot=gdd_CHUNK_NUM; \
         nn=(clas*)malloc(gdd_CHUNK(clas)); \
         gddGlobalCleanupAdd (nn); \
         for(dn=nn;--tot;dn++) dn->newdel_setNext((char*)(dn+1)); \
- epicsGuard < epicsMutex > guard ( *clas::pNewdel_lock ); \
         (dn)->newdel_setNext(clas::newdel_freelist); \
         clas::newdel_freelist=(char*)nn; \
     } \
     if(size==sizeof(clas)) { \
         { \
- epicsGuard < epicsMutex > guard ( *clas::pNewdel_lock ); \
             dn=(clas*)clas::newdel_freelist; \
             clas::newdel_freelist=((clas*)clas::newdel_freelist)->newdel_next(); \
         } \

Revision history for this message
Jeff Hill (johill-lanl) wrote :

fixed in R3.14.11

Revision history for this message
Andrew Johnson (anj) wrote :

R3.14.11 released.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.