CA monitoring of waveforms is unreliable because values are not buffered
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
EPICS Base | Won't Fix | Wishlist | Ralph Lange | |
Bug Description
When monitoring a waveform of more than one element, the IOC will not queue values in the event queue. Instead when an event is generated (db_post_
This limitation is mentioned in the code, see http://
In our case, we are using a waveform record with asynchronous processing to send commands to a PLC and receive the responses. A client issues a CA-Put-Notify to the waveform with the request. The device support sends the request to the PLC and sets PACT; when a response is later received, it completes asynchronous processing, writing the response data to that same record. The client that sent the request monitors this record, recognizes the response to its own command, and ignores responses to commands issued by other clients.
Because we are using Put-Notify, we expected this to work reliably even when multiple clients send commands at around the same time, since the requests would be serialized (as implemented in dbNotify.c). The serialization does indeed happen, but we still see a problem when two requests are issued at the same time: when monitoring this PV, in the place where we should see the data for the first response, we instead see the data for the second request (followed by the second response). Apparently the queued second request managed to start (and be written into the record) before the monitor event for the first response was picked up by the event thread.
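To make the suspected sequence concrete, here is a toy model in Python. It is not EPICS code: the `simulate` function, the dictionary standing in for the record, and the string values are all invented for illustration. It only assumes what is described above, namely that an array monitor event refers back to the record instead of carrying its own copy of the data.

```python
from collections import deque


def simulate(copy_on_post):
    """Replay the suspected ordering of writes and monitor deliveries.

    copy_on_post=False models the array case: an event only points at the
    record, so the client gets whatever is in the record at delivery time.
    copy_on_post=True models scalar-style buffering: the event carries a
    snapshot taken when it was posted.
    """
    record = {"VAL": []}   # toy stand-in for the waveform record
    queue = deque()        # toy stand-in for the event queue
    seen = []              # what the monitoring client receives

    def post():
        # Queue either a snapshot or just a marker meaning "read the record".
        queue.append(list(record["VAL"]) if copy_on_post else None)

    def event_thread():
        # Deliver every pending event to the client.
        while queue:
            snapshot = queue.popleft()
            seen.append(snapshot if snapshot is not None else list(record["VAL"]))

    record["VAL"] = ["response-1"]   # first request completes
    post()
    record["VAL"] = ["request-2"]    # queued put-notify starts immediately
    event_thread()                   # event thread wakes up only now
    record["VAL"] = ["response-2"]   # second request completes
    post()
    event_thread()
    return seen
```

In this model, `simulate(False)` delivers the second request followed by the second response, which is the symptom we observe, while `simulate(True)` delivers both responses in order.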
Changed in epics-base:
status: New → Confirmed
assignee: nobody → Ralph Lange (ralph-lange)
importance: Undecided → Wishlist
Changed in epics-base:
status: Confirmed → Won't Fix
The behavior you describe results from the fact that, in the process database, array data is stored directly in the record struct, so buffering it would entail making copies. The dbfl_type_rec references are an attempt to avoid making some of those copies. While arguably not documented clearly enough (what is?), this is a widely known, and much lamented, behavior.
For Base <=3.14 series that's the end of the story.
With >=3.15 the server side filters feature offers a way to force the extra copy being avoided, and get buffering as with scalar values. I'll see about putting together an example.
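As a rough sketch of what requesting a server-side filter looks like from the client side: filters are attached as a JSON modifier after the field name in the channel name. The Python helper below only builds such a name. The record name `plc:cmd` and the helper `filtered_channel` are hypothetical, and using the array (`arr`) filter for this purpose is an assumption on my part; this thread does not confirm which filter forces the copy.

```python
import json


def filtered_channel(record, field="VAL", start=0, end=-1):
    """Build a channel name that requests the array filter (Base >= 3.15).

    "s" and "e" are the filter's start and end indices; 0 and -1 select the
    whole array. Assumption: applying this filter makes the server copy the
    array for each monitor event.
    """
    modifier = {"arr": {"s": start, "e": end}}
    return f"{record}.{field}" + json.dumps(modifier, separators=(",", ":"))
```

For the hypothetical record above, `filtered_channel("plc:cmd")` yields `plc:cmd.VAL{"arr":{"s":0,"e":-1}}`, which a client would then monitor in place of the plain PV name (quoted appropriately when typed into a shell).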
The Base 3.15 series also adds the put-process-get feature, which I think may be a better fit, but which I don't know as much about.