epicsMessageQueue lost messages
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
EPICS Base | Status tracked in 7.0 | | |
3.15 | Fix Released | Undecided | Andrew Johnson |
7.0 | Fix Released | High | Andrew Johnson |

Bug Description
Bug Description
https:/
Mark Rivers observed epicsMessageQueue losing messages.
https:/
> I think I see the logic error in how the eventSent flag is handled,
> specifically related to the fact that epicsEvent is a semaphore as
> opposed to a condition variable.
>
> This allows a "race" to occur if the first/only waiting receiver
> times out, and epicsEventWaitW
> is in epicsMessageQue
>
> This results in a situation where the sender has set the eventSent,
> and indeed copied a message to the buffer of, a thread which has
> decided to abort.
>
> After timing out, the receiver sees the timeout and returns -1
> even though eventSent has been set. This can be trapped with:
>
> > b osdMessageQueue
>
> So here is your lost message.
>
> Now when epicsMessageQue
> is waiting in the queue, so epicsEventWaitW
> Since the semaphore is already set, this returns immediately with status==0,
> but this is a spurious wakeup and the eventSent flag is not set.
>
> And here is the second "timeout".
Line numbers circa 7.0.3.1
This issue has been present in all versions of epicsMessageQueue.
I see two fixes which could combine to address this issue in myReceive().
Similar changes will likely be required in mySend() as it also supports timeout.
1. Return success if threadNode.eventSent is set, regardless of the timeout status.
2. Handle the case where getEventNode() returns an epicsEvent which
   has already been triggered. Perhaps getEventNode() could call epicsEventTryWait()?
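
The interleaving described above, and the effect of the two proposed fixes, can be replayed in a small single-threaded sketch. This is not the EPICS code: BinaryEvent, ReceiverNode, deliver(), finishReceive() and scenario() are hypothetical names invented for illustration, and the "buggy" return logic is an assumption based on the description in the quoted analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a binary semaphore such as epicsEvent:
// a signal stays latched until some wait consumes it.
struct BinaryEvent {
    bool signaled = false;
    void signal() { signaled = true; }
    // Models a wait whose timeout has already expired:
    // succeeds only if a signal is latched, and consumes it.
    bool tryWait() {
        bool was = signaled;
        signaled = false;
        return was;
    }
};

// Hypothetical per-receiver bookkeeping, loosely modelled on the
// threadNode mentioned in the report.
struct ReceiverNode {
    BinaryEvent *event = nullptr;
    char buf[32] = {0};
    int size = 0;
    bool eventSent = false;
};

// Sender side: hand a message directly to a waiting receiver.
void deliver(ReceiverNode &node, const char *msg) {
    std::size_t len = std::strlen(msg);
    assert(len < sizeof(node.buf));
    std::memcpy(node.buf, msg, len);
    node.size = static_cast<int>(len);
    node.eventSent = true;
    node.event->signal();
}

// Receiver side, after the wait has returned.
// Without fix 1 (assumed buggy behaviour) the wait status decides;
// with fix 1 only eventSent decides.
int finishReceive(const ReceiverNode &node, bool waitOk, bool applyFix1) {
    if (applyFix1)
        return node.eventSent ? node.size : -1;
    return (waitOk && node.eventSent) ? node.size : -1;
}

struct Outcome {
    int first;          // result of the first receive
    bool spuriousWake;  // second wait woke with no message delivered
};

Outcome scenario(bool fixed) {
    BinaryEvent ev;

    // First receive: the wait times out before anything is sent.
    ReceiverNode first;
    first.event = &ev;
    bool waitOk = ev.tryWait();

    // Race window: the sender runs before the receiver re-takes the lock.
    deliver(first, "hello");
    int r1 = finishReceive(first, waitOk, fixed);

    // Fix 2: drain any stale signal before the event is reused.
    if (fixed)
        ev.tryWait();

    // Second receive reuses the same event; nothing is sent this time.
    ReceiverNode second;
    second.event = &ev;
    bool waitOk2 = ev.tryWait();
    bool spurious = waitOk2 && !second.eventSent;

    return {r1, spurious};
}
```

In this model, scenario(false) reports the first receive as -1 (the lost message) followed by a spurious wakeup on the reused event, while scenario(true) returns the 5-byte message and leaves the event clear. A real fix would also need to cover mySend(), which this sketch does not attempt.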