Comment 5 for bug 697509

Revision history for this message
Alex Chen (alex-chen) wrote : Re: [Bug 697509] Re: Thread synchronization issue in libCom/osi/os/WIN32/osdTime.cpp

Hey Jeff,

Yes, they are all local variables, but the reason to move up the
'EnterCriticalSection' is not to protect race conditions, obviously there
is no race conditions for local variables. The reason is to preserve the
right order to update the 'curPerfCounter' and 'lastPerfCounter', we must
guarantee that the 'lastPerfCounter' is always updated at the end of the
last iteration, and 'curPerfCounter' is updated at the beginning of a new
iteration. But I found this issue:

1. Thread A executes currentTime::expire() from line 425 to line 439, it
has already filled the local variable 'curPerfCounter' (say 10001) and
wait to enter the critical section which is currently obtained by Thread
B.
2. Thread B executes currentTime::getCurrentTime(), it updates the
'lastPerfCounter' (say 10009) at line 402 within the critical section,
then thread B exits the critical section.
3. Thread A gets the chance to enter the critial section just released by
thread B, but it finds the 'lastPerfCounter' is bigger than the
'curPerfCounter', so it will handle it as a roll over case, but it's
actually not a real roll over case.

I found this issue on NI's PharLap platform(a sort of realtime Windows),
but I think it could show up on other platforms as well during concurrency
situations.

BTW, the big headache for me is actually not this bug, but Bug 697516
which throws a std::logic_error exception and causes system to crash.

Thanks,

Alex

From:
Jeff Hill <email address hidden>
To:
<email address hidden>
Date:
01/06/2011 09:45 AM
Subject:
[Bug 697509] Re: Thread synchronization issue in
libCom/osi/os/WIN32/osdTime.cpp
Sent by:
<email address hidden>

if we move the enter critical section up in the function it doesn't help
because that part of the code only accesses the auto (function local)
variable.

Nevertheless, I do see a bug. This is hopefully a proper fix.

>> bzr diff
=== modified file 'src/libCom/osi/os/WIN32/osdTime.cpp'
--- src/libCom/osi/os/WIN32/osdTime.cpp 2010-10-05 19:27:37 +0000
+++ src/libCom/osi/os/WIN32/osdTime.cpp 2011-01-06 01:27:14 +0000
@@ -409,8 +409,8 @@

     EnterCriticalSection ( & this->mutex );

- LONGLONG perfCounterDiff = curPerfCounter.QuadPart -
this->lastPerfCounterPLL;
- if ( curPerfCounter.QuadPart >= this->lastPerfCounter ) {
+ LONGLONG perfCounterDiff;
+ if ( curPerfCounter.QuadPart >= this->lastPerfCounterPLL ) {
         perfCounterDiff = curPerfCounter.QuadPart -
this->lastPerfCounterPLL;
     }
     else {

--
You received this bug notification because you are a direct subscriber
of the bug.
https://bugs.launchpad.net/bugs/697509

Title:
  Thread synchronization issue in libCom/osi/os/WIN32/osdTime.cpp

Status in EPICS Base:
  New

Bug description:
  EPICS Base version 3.14.9

File: /src/libCom/osi/os/WIN32/osdTime.cpp

Function: epicsTimerNotify::expireStatus currentTime::expire ( const
epicsTime & )

Line 444: if ( curPerfCounter.QuadPart >= this->lastPerfCounter )

Problem: There is a thread synchronization issue which causes the
'curPerfCounter.QuadPart' to be less than the 'this->lastPerfCounter', and
we will handle it as a roll over case. But actually this is not a roll
over case, the 'this->lastPerfCounter' is changed by another thread which
executes the 'void currentTime::getCurrentTime ( epicsTimeStamp & dest ) '

How to fix:
We can move line 441 ' EnterCriticalSection ( & this->mutex );' to the
beginning of this function.

To unsubscribe from this bug, go to:
https://bugs.launchpad.net/epics-base/+bug/697509/+subscribe