disconnect behavior could be more robust in response to congestion

Bug #541129 reported by Jeff Hill
This bug affects 1 person
Affects      Status         Importance   Assigned to   Milestone
EPICS Base   Fix Released   Wishlist     Jeff Hill

Bug Description

When the CA client library times out an unresponsive virtual circuit, it closes the TCP socket and places the circuit's channels in an unknown-server-address state.

To avoid making the situation worse when the network or the IOC is heavily loaded, it would be better not to close the socket or remove the channels from the circuit when a circuit times out. Instead, the channels should disconnect while the circuit is unresponsive and reconnect when the next message arrives from the server.
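The proposed behavior can be sketched as a small state model. This is only an illustration, not the CA client library's implementation: the `Circuit` class, its `tick` and `on_message` methods, and the `timeout` field are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class ChannelState(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"


@dataclass
class Circuit:
    """Hypothetical model of one virtual circuit and its attached channels."""
    timeout: float            # seconds of silence before the circuit is unresponsive
    last_rx: float = 0.0      # time the last server message arrived
    socket_open: bool = True  # the proposal never closes the socket on timeout
    channels: dict = field(default_factory=dict)

    def add_channel(self, name):
        self.channels[name] = ChannelState.CONNECTED

    def tick(self, now):
        """Periodic check: on timeout, disconnect channels but keep them attached."""
        if now - self.last_rx > self.timeout:
            for name in self.channels:
                self.channels[name] = ChannelState.DISCONNECTED

    def on_message(self, now):
        """Any message from the server shows the circuit is responsive again."""
        self.last_rx = now
        for name in self.channels:
            self.channels[name] = ChannelState.CONNECTED
```

In this sketch a timeout changes only the channel state. The socket stays open and the channels stay on the circuit, so no new search or connection traffic is generated while the network or IOC is already overloaded.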

Additional information:
If this change is made, confusion may occur in the following situation:

o an IOC with PV xyz is abruptly turned off
o PV xyz is moved to an IOC with a different address

In these restricted circumstances, the client will wait the full duration of the TCP keepalive interval before the channel reconnects at the new address. A graceful IOC shutdown could be implemented to avoid this problem.
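The delay described above can be illustrated with a toy timing model. Everything here is an assumption made for the example: `OldCircuit`, `on_shutdown_notice`, and `seconds_until_released` are hypothetical names, the 30-second interval in the usage note is an arbitrary value, and the shutdown notice is just one way a graceful shutdown might be signaled.

```python
from dataclasses import dataclass


@dataclass
class OldCircuit:
    """Hypothetical model of the circuit to the IOC that was switched off."""
    keepalive_interval: float  # seconds before TCP keepalive gives up on the peer
    silent_since: float        # time the IOC stopped responding
    released: bool = False     # True once channels may search for a new server

    def poll(self, now):
        """Release the channels only after the keepalive interval has elapsed."""
        if now - self.silent_since >= self.keepalive_interval:
            self.released = True
        return self.released

    def on_shutdown_notice(self):
        """Assumed graceful-shutdown path: the IOC announces it is going away."""
        self.released = True


def seconds_until_released(circuit, step=1.0):
    """Step simulated time forward until the channels are released."""
    now = circuit.silent_since
    while not circuit.poll(now):
        now += step
    return now - circuit.silent_since
```

With `OldCircuit(keepalive_interval=30.0, silent_since=100.0)`, the model keeps the channel for PV xyz bound to the dead circuit for the whole interval. Calling `on_shutdown_notice()` first releases it immediately, which is the benefit a graceful IOC shutdown would provide.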

We are concerned about this type of confusion, but feel that the increased robustness probably justifies the change.

Original Mantis Bug: mantis-49
    http://www.aps.anl.gov/epics/mantis/view_bug_page.php?f_id=49

Tags: ca 3.14
Jeff Hill (johill-lanl) wrote :

Fixed in R3.14.5.

Andrew Johnson (anj) wrote :

R3.14.5 released.
