Periodic scan thread delays are not accurate

Bug #597054 reported by Andrew Johnson
Affects      Status         Importance   Assigned to       Milestone
EPICS Base   Fix Released   Low          Andrew Johnson

Bug Description

Steve Hunt reported:

I notice that under Linux my (for instance) 1Hz scans are not at 1Hz, or at least that is what the timestamps say. It seems to gain 1ms or so each time it processes!!

Of course under VxWorks all is fine.

Tags: db
Andrew Johnson (anj) wrote :

This probably comes from the way we implement the periodic scan threads using timers (by "gain" I assume the scan periods are slightly too long, not too short). We do subtract the time it takes to process all the records from the timer delay, but the result still has to be quantized, and I'm pretty sure that the pthread_cond_timedwait() call this eventually makes on Linux only guarantees a minimum delay and is likely to wait longer than requested. It looks like we do *not* adjust for the overhead of calculating the delay itself, though, so it should be possible to be more accurate by enhancing the periodicTask() routine in dbScan.c to measure that overhead as well.
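To illustrate the drift described in this comment, here is a minimal simulation sketch. It is not the code in dbScan.c: the `scan_model` struct, the `relative_delay_start()` function and all the timing values are hypothetical, chosen only to show how an uncorrected overhead plus a late wake-up accumulate when each cycle requests a relative delay.

```c
#include <assert.h>

/* Simulated cost model. All fields are hypothetical values in seconds. */
typedef struct {
    double period;     /* nominal scan period */
    double process;    /* time spent processing records (measured) */
    double overhead;   /* time spent computing the delay (not measured) */
    double oversleep;  /* extra time the timed wait takes beyond the request */
} scan_model;

/* Return the simulated start time of cycle n when every cycle requests a
 * relative delay corrected only for the measured processing time. */
double relative_delay_start(const scan_model *m, int n)
{
    double now = 0.0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        double begin = now;
        now += m->process;                         /* process the records */
        double delay = m->period - (now - begin);  /* subtract measured time */
        now += m->overhead;                        /* unmeasured bookkeeping */
        now += delay + m->oversleep;               /* the wait returns late */
    }
    return now;
}
```

With an assumed model of `{ 1.0, 0.010, 0.0002, 0.0008 }`, every simulated cycle lasts about 1.001 s, so `relative_delay_start(&m, 5)` comes out near 5.005 s rather than 5.000 s. The error never cancels because each delay is computed relative to the previous late wake-up.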

Ralph Lange (ralph-lange) wrote :

Since the call to pthread_cond_timedwait() always happens at the same point in the cycle, wouldn't it be more appropriate to measure the time since the last round and run a simple control loop on it? That would compensate for additional overhead and adapt to changes without having to measure the overhead itself.
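One way to read this suggestion is as an integrating controller on the measured wake-up interval. The sketch below is an assumption about what such a loop could look like, not code from EPICS Base: `scan_ctrl`, `scan_ctrl_init()`, `scan_ctrl_next_delay()` and `simulate_intervals()` are hypothetical names, and the simulation assumes a constant hidden cost per cycle.

```c
#include <assert.h>

/* Hypothetical controller state; times are in seconds. */
typedef struct {
    double period;      /* nominal scan period */
    double gain;        /* fraction of the error fed back per cycle */
    double correction;  /* running estimate of the time lost per cycle */
} scan_ctrl;

void scan_ctrl_init(scan_ctrl *c, double period, double gain)
{
    assert(period > 0.0 && gain > 0.0 && gain <= 1.0);
    c->period = period;
    c->gain = gain;
    c->correction = 0.0;
}

/* Given the measured interval between the last two wake-ups, return the
 * delay to request next. The delay is clamped at zero on overrun. */
double scan_ctrl_next_delay(scan_ctrl *c, double measured_interval)
{
    double error = measured_interval - c->period;
    c->correction += c->gain * error;
    double delay = c->period - c->correction;
    return delay > 0.0 ? delay : 0.0;
}

/* Simulate cycles in which each wake-up interval equals the requested
 * delay plus a constant cost the controller cannot observe directly.
 * Returns the last simulated interval. */
double simulate_intervals(scan_ctrl *c, double hidden_cost, int cycles)
{
    double delay = c->period;
    double interval = 0.0;

    assert(cycles > 0);
    for (int i = 0; i < cycles; i++) {
        interval = delay + hidden_cost;
        delay = scan_ctrl_next_delay(c, interval);
    }
    return interval;
}
```

In this model the interval error shrinks by a factor of `1 - gain` each cycle, so with a gain of 0.5 and a hidden cost of 1 ms the simulated interval settles back to the nominal period within a few tens of cycles. The sketch only clamps the delay at zero; it does not otherwise address how a real implementation should behave when the CPU is too busy to keep up.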

Andrew Johnson (anj) wrote :

Right, but something I can't quite express bothers me about doing that. I'd want to make sure that this wouldn't affect the VxWorks and RTEMS timings, and that if a CPU is too busy to service all the periodic threads this doesn't cause some other nasty effect.

I wonder how hard it would be to write a db/test/scanTest.c unit test program? I think I'd want something like that in place before making this kind of change — we already have a db/test/callbackTest.c program.

Andrew Johnson (anj)
Changed in epics-base:
assignee: nobody → Andrew Johnson (anj)
Andrew Johnson (anj) wrote :

The tech-talk discussion at http://www.aps.anl.gov/epics/tech-talk/2013/msg00008.php includes a proposed patch from Eric Norum which resolves this issue. After some further modifications to his code and the addition of messages warning about repeated over-runs, I committed the fix to the 3.14 branch.

Changed in epics-base:
status: New → Fix Committed
Andrew Johnson (anj)
Changed in epics-base:
status: Fix Committed → Fix Released