Possible bug in ca_clear_channel
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
EPICS Base | Invalid | High | Unassigned |
Bug Description
From Benjamin Franksen:
I recently added a few more tests to the sequencer, mostly concerned with the
pvAssign function. This led to mysterious hang-ups: one of the new tests
sometimes hangs right in the middle of a call to pvAssign. Further
investigation revealed that the bug is not in the sequencer, at least not in
any apparent way. When pvAssign is called with a connected channel, it first
disconnects it by calling ca_clear_channel. What I see when instrumenting the
relevant parts of the code with printf statements is that ca_clear_channel
gets called but never returns. This does not happen every time, of course;
it is highly timing-dependent. I can reproduce it only about once in every
dozen runs of the test. It only happens if the channel is on another IOC (but in my
tests the other IOC runs on the same machine, in the background). When it
happens it is always the first call to ca_clear_channel in the program that
hangs. I looked at the code in src/ca/access.cpp and there has been a change
between 3.14.12.3 and 3.14.12.4 in ca_clear_channel and it looks as if this is
a regression because I cannot reproduce the problem with 3.14.12.3, but I can
with 3.14.12.4 (and 3.15, BTW).
The latest version 2.2 snapshot on the sequencer home page (seq-2-2-
snapshot-
reproduce, it is easiest to start the IOC with the database in the background
like this:
ben@sarun[1]: .../seq/branch-2-2 > cd test/validate/
ben@sarun[1]: .../validate/
[1] 29640
ben@sarun[1]: .../validate/
#######
## EPICS R3.14.12.3 $Date: Mon 2012-12-17 14:11:47 -0600$
## EPICS Base built Dec 28 2014
#######
iocRun: All initialization complete
Then start the real test program as often as it takes. This is how far it gets
when it hangs:
ben@sarun[1]: .../validate/
Sequencer release 2.2.0.3, compiled Sun Dec 28 22:40:58 2014
Spawning sequencer program "reassignTest", thread 0x1a77490: "reassignTest"
reassignTest[0]: all channels connected & received 1st monitor
1..30
# start
ok 1 - seq_pvChannelCo
ok 2 - seq_pvAssignCou
ok 3 - seq_pvConnectCo
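Since the hang shows up only in roughly one run out of a dozen, repeating the test by hand gets tedious. The following is a minimal sketch of automating the "as often as it takes" step, not part of the original report: it reruns a command until one run exceeds a time limit and treats that run as a hang. The function name `run_until_hang` and the default values for `max_runs` and `timeout_s` are assumptions chosen for illustration; the time limit in particular has to be set well above the duration of a normal run, or a slow run will be misreported as a hang.

```python
import subprocess


def run_until_hang(cmd, max_runs=50, timeout_s=60.0):
    """Run cmd repeatedly until one run exceeds timeout_s.

    Returns the 1-based index of the first run that timed out (treated
    as a hang), or None if all max_runs runs finished in time.
    The child's output is not redirected, so the last lines printed
    before a hang remain visible on the terminal.
    """
    for attempt in range(1, max_runs + 1):
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # when the time limit is exceeded.
            subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            return attempt
    return None
```

The command is passed as an argument list, for example `run_until_hang(["./my_test_program"])`, where `./my_test_program` is a placeholder for whatever command starts the test program. Note that the hung process is killed on timeout, so to inspect it with a debugger the loop would instead have to stop and leave the process running.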
Changed in epics-base:
milestone: 3.14.branch → none
Only one commit could have caused this change, to the 3.14 branch:
------------------------------------------------------------
revno: 12415
committer: Jeff Hill <email address hidden>
branch nick: trunk
timestamp: Thu 2013-05-16 12:33:31 -0600
message:
merged in fix for https://bugs.launchpad.net/epics-base/+bug/1179642
also merged in removal of c++ support for old HPUX compiler
------------------------------------------------------------
http://bazaar.launchpad.net/~epics-core/epics-base/3.14/revision/12415
Unfortunately this was a rather large patch. It introduced a new guard to prevent a race condition, but many of its other changes were unrelated and only deleted dead code.