CAS becomes unresponsive when beacons are sent to port where search requests are expected

Bug #541332 reported by Dirk Zimoch
This bug affects 1 person
Affects: EPICS Base
Status: Fix Released
Importance: High
Assigned to: Jeff Hill

Bug Description

This problem occurs when an IOC is started with an EPICS_CA_ADDR_LIST entry that includes a port number, such as "gateway:5066", while EPICS_CAS_BEACON_ADDR_LIST is not set. The IOC then sends its beacons to port 5066, where the server (e.g. the gateway) expects only search requests. The CAS then starts printing error messages:
CAS: CAS Request: ? on pc3853.psi.ch:38210: cmd=13 cid=0 typ=11 cnt=5064 psz=0 avail=81818260
bad request code=13 in DG
filename="../../../../src/cas/generic/st/casDGIntfOS.cc" line number=498
protocol from client was invalid unexpected problem with UDP input from "pc3853.psi.ch:38210"

The CAS does not terminate, but once it is confused like this it no longer responds to any search requests.

Additional information:
The IOC should probably add 1 to the port number in EPICS_CA_ADDR_LIST before using it as the default value for EPICS_CAS_BEACON_ADDR_LIST.
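
The suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions: `deriveBeaconAddrList` is a hypothetical helper and not an EPICS Base function, the list is assumed to be whitespace-separated "host" or "host:port" entries, and entries without an explicit port are passed through unchanged.

```cpp
#include <exception>
#include <sstream>
#include <string>

// Hypothetical helper (not part of EPICS Base): builds a default beacon
// address list from a CA address list by adding 1 to every explicit port.
std::string deriveBeaconAddrList(const std::string& caAddrList) {
    std::istringstream in(caAddrList);
    std::string entry;
    std::string out;
    while (in >> entry) {  // entries are separated by whitespace
        const std::string::size_type colon = entry.rfind(':');
        if (colon != std::string::npos) {
            try {
                const int port = std::stoi(entry.substr(colon + 1));
                entry = entry.substr(0, colon) + ":" + std::to_string(port + 1);
            } catch (const std::exception&) {
                // Port is not a number: leave the entry untouched.
            }
        }
        if (!out.empty()) {
            out += ' ';
        }
        out += entry;
    }
    return out;
}
```

With this sketch, an entry such as "gateway:5066" would yield "gateway:5067" as the beacon destination, so beacons would no longer land on the search port.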

Original Mantis Bug: mantis-302
    http://www.aps.anl.gov/epics/mantis/view_bug_page.php?f_id=302

Tags: cas 3.14
Jeff Hill (johill-lanl) wrote :

I agree that it shouldn't do that.

I see the Mantis entry. At some point I will need to reproduce this and see what can be done, but at the moment I need to stay focused on other tasks (that are funded).

edited on: 2007-10-22 17:52

Jeff Hill (johill-lanl) wrote :

More details from Dirk:

I see strange crashes of the CA gateway which probably originate in the CAS code.

If I use an explicit port in EPICS_CA_ADDR_LIST like this

EPICS_CA_ADDR_LIST="gateway:5066"

and then start a softioc running a record with an INP link to a record behind the gateway (running on that port), the gateway starts printing errors and refuses to handle any more requests. It has to be restarted.

The gateway prints:

CAS: CAS Request: ? on pc3853.psi.ch:38210: cmd=13 cid=0 typ=11 cnt=5064 psz=0 avail=81818260
Oct 12 14:01:36 !!! Errlog message received (message is above)
bad request code=13 in DG
Oct 12 14:01:36 !!! Errlog message received (message is above)
filename="../../../../src/cas/generic/st/casDGIntfOS.cc" line number=498
protocol from client was invalid unexpected problem with UDP input from "pc3853.psi.ch:38210"
Oct 12 14:01:36 !!! Errlog message received (message is above)
CAS: CAS Request: ? on pc3853.psi.ch:38210: cmd=13 cid=0 typ=11 cnt=5064 psz=0 avail=81818260
Oct 12 14:01:36 !!! Errlog message received (message is above)
bad request code=13 in DG
Oct 12 14:01:36 !!! Errlog message received (message is above)
filename="../../../../src/cas/generic/st/casDGIntfOS.cc" line number=498
protocol from client was invalid unexpected problem with UDP input from "pc3853.psi.ch:38210"
Oct 12 14:01:36 !!! Errlog message received (message is above)
....

However, a simple caget works without problems, at least before the softioc crashes the gateway.

I tried 3.14.8 and 3.14.9 on the IOC side. The gateway runs with some pre-3.13.9 CVS snapshot, but I also tried it with the 3.14.9 libraries.

edited on: 2007-10-22 17:52

Jeff Hill (johill-lanl) wrote :

Fixed in R3.14.10

This patch fixes the issue.

@@ -831,6 +831,7 @@
             }
             status = ( this->*pHandler ) ();
             if ( status ) {
+                this->in.removeMsg ( this->in.bytesPresent() );
                 break;
             }
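
To illustrate why discarding the rest of the input on error might matter, here is a toy model. It assumes the failure mode is that a rejected message stays at the head of the input buffer and blocks everything queued behind it. That is one plausible reading of the patch, not a confirmed analysis. `ToyInBuf` and `processInput` are hypothetical names and not the CAS classes. Only `removeMsg` and `bytesPresent` mirror names from the patch, and here `bytesPresent` counts whole messages for simplicity. The command codes 6 and 13 are CA_PROTO_SEARCH and CA_PROTO_RSRV_IS_UP (the beacon).

```cpp
#include <cstddef>
#include <deque>

const int kSearch = 6;   // CA_PROTO_SEARCH
const int kBeacon = 13;  // CA_PROTO_RSRV_IS_UP (beacon), cmd=13 in the log

// Toy stand-in for the server's datagram input buffer.
class ToyInBuf {
public:
    void push(int cmd) { msgs.push_back(cmd); }
    // Simplification: counts pending messages, not bytes.
    std::size_t bytesPresent() const { return msgs.size(); }
    int peek() const { return msgs.front(); }
    void removeMsg(std::size_t n) {
        while (n > 0 && !msgs.empty()) {
            msgs.pop_front();
            --n;
        }
    }
private:
    std::deque<int> msgs;
};

// Processes pending messages and returns how many searches were handled.
// With discardOnError set, an unexpected message empties the buffer,
// which is the behaviour the patch adds.
int processInput(ToyInBuf& in, bool discardOnError) {
    int handled = 0;
    while (in.bytesPresent() > 0) {
        if (in.peek() != kSearch) {  // handler reports an error
            if (discardOnError) {
                in.removeMsg(in.bytesPresent());
            }
            break;
        }
        in.removeMsg(1);
        ++handled;
    }
    return handled;
}
```

In this model, without the discard a single beacon stays in the buffer and every later search queued behind it is never reached. With the discard, the next search is handled normally.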

Andrew Johnson (anj) wrote :

R3.14.10 released.
