epicsThread::exitWait() from epicsAtThreadExit() handler corrupts stack

Bug #1558206 reported by mdavidsaver
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
EPICS Base
Fix Released
Low
mdavidsaver
3.14
Fix Released
Low
mdavidsaver
3.15
Fix Released
Low
mdavidsaver
3.16
Fix Released
Low
mdavidsaver

Bug Description

The epicsThreadCallEntryPoint() function stores a pointer to a local variable in epicsThread::pWaitReleaseFlag. Calling epicsAtThreadExit::exitWait() from that thread's epicsAtThreadExit() handler writes to this pointer after epicsThreadCallEntryPoint() has returned. Thus corrupting the stack.

An obvious fix is for epicsThreadCallEntryPoint() to set pWaitReleaseFlag=NULL before returning. That said, I don't quite see the point of the waitRelease flag. Maybe it should be removed all together.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

I've committed the obvious fix to the 3.14 branch.

Revision history for this message
Andrew Johnson (anj) wrote :

Looking at the comments immediately above your addition, I'm not sure whether this is quite the right fix:

    if ( ! waitRelease ) {
        epicsGuard < epicsMutex > guard ( pThread->mutex );
        pThread->terminated = true;
        pThread->exitEvent.signal ();
        // once the terminated flag is set and we release the lock
        // then the "this" pointer must not be touched again
    }
    pThread->pWaitReleaseFlag = NULL;
}

If pThread is the "this" pointer mentioned in that comment (and I'm pretty sure it is), the implication of those comments is that in some circumstances the pThread object may have already been destroyed before the guard's destructor returns. Should the pointer be cleared just before releasing the mutex instead?

The libCom/test/epicsThreadTest.cpp file is not currently a comprehensive test of all supported features of the epicsThread class.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

I guess the idea is that an epicsThread may delete itself. So, how about just removing waitRelease altogether? From what I can tell it's function is to suppress calls to printLastChanceExceptionMessage() after a thread calls exitWait() on itself.

If this isn't acceptable we'll have to move the flag to a new struct w/ ref. counting to ensure it can outlive epicsThread and the call to epicsThreadCallEntryPoint().

> The libCom/test/epicsThreadTest.cpp file is not currently a comprehensive test of all supported features of the epicsThread class.

Ha, clearly.

Revision history for this message
Andrew Johnson (anj) wrote :

The self-deletion sounds right, I think Jeff uses that inside CA client code.

What's wrong with moving the pointer-clear line up inside the guard block? It appears that whole purpose of the waitRelease flag is to tell epicsThreadCallEntryPoint() whether the epicsThread object still exists after the pthread->runnable.run() routine has returned (or thrown). Given that, I don't see why applying this patch on top of your commit shouldn't be correct:

=== modified file 'src/libCom/osi/epicsThread.cpp'
--- src/libCom/osi/epicsThread.cpp 2016-03-16 18:40:37 +0000
+++ src/libCom/osi/epicsThread.cpp 2016-03-16 21:20:43 +0000
@@ -104,12 +104,12 @@
     }
     if ( ! waitRelease ) {
         epicsGuard < epicsMutex > guard ( pThread->mutex );
+ pThread->pWaitReleaseFlag = NULL;
         pThread->terminated = true;
         pThread->exitEvent.signal ();
         // once the terminated flag is set and we release the lock
         // then the "this" pointer must not be touched again
     }
- pThread->pWaitReleaseFlag = NULL;
 }

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

You're right. Fixed.

Revision history for this message
Andrew Johnson (anj) wrote :

I renamed the waitRelease flag to threadDestroyed and rewrote the comments in the epicsThread.cpp file to better explain what's going on. Also added Release Notes linking to this bug.

When committing bug fixes to Bazaar, please use the command-line flag --fixes=lp:<number> which will link that branch to the bug automatically. Just putting 'fixes lp:<number>' into the commit log is not sufficient for Launchpad to notice (but include that too as it's easier to search commit logs).

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.