ACS Status response can include incorrect timeout in Multiplex mode

Bug #1927542 reported by Galen Charlton
This bug affects 1 person
Affects    Status  Importance  Assigned to  Milestone
SIPServer  New     Medium      Unassigned

Bug Description

If the Multiplex Net::Server personality is in use, the ACS Status response message (98) can return 000 for both the timeout and the number of retries allowed after the SIP terminal has successfully authenticated to the ILS.

Per the SIP2 specification, reporting a zero timeout in the ACS Status response indicates that the ACS is offline. Some SIP2 clients instead interpret such a response as indicating that there's effectively no time for the ACS to respond before the SIP client should give up on the request. Either way, SIP2 clients that take the ACS Status timeout seriously are left out in the cold by this bug.
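To make the affected fields concrete, here is a minimal Python sketch of how a client might pull the timeout and retries values out of an ACS Status (98) response. It assumes the usual SIP2 fixed-field layout: the two-character message ID, six one-character Y/N flags, a three-character timeout period, then a three-character retries-allowed count. The names `AcsStatus`, `parse_acs_status` and `reports_offline` are hypothetical and are not part of SIPServer.

```python
from typing import NamedTuple


class AcsStatus(NamedTuple):
    online: bool   # first Y/N flag (on-line status)
    timeout: str   # raw three-character timeout period field
    retries: str   # raw three-character retries allowed field


def parse_acs_status(message: str) -> AcsStatus:
    """Decode the leading fixed fields of an ACS Status (98) response.

    Assumed layout: "98", six Y/N flags, 3-char timeout, 3-char retries.
    """
    if not message.startswith("98") or len(message) < 14:
        raise ValueError("not an ACS Status response")
    flags = message[2:8]
    return AcsStatus(
        online=(flags[0] == "Y"),
        timeout=message[8:11],
        retries=message[11:14],
    )


def reports_offline(status: AcsStatus) -> bool:
    """True when the timeout field is 000, which a client may read as 'ACS offline'."""
    return status.timeout == "000"


if __name__ == "__main__":
    # Header prefixes of a healthy response and of one showing the bug.
    for header in ("98YNNYNN600003", "98YNNNNN000000"):
        status = parse_acs_status(header)
        print(header, status.timeout, status.retries, reports_offline(status))
```

A client that applies a check like `reports_offline` would treat the second header as an offline ACS even though the first flag still says the ACS is on-line.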

Revision history for this message
Galen Charlton (gmc) wrote :

A patch is available in the working repository:

user/gmcharlt/lp1927542_fix_acs_status_in_multiplex / https://git.evergreen-ils.org/?p=working/SIPServer.git;a=shortlog;h=refs/heads/user/gmcharlt/lp1927542_fix_acs_status_in_multiplex

Changed in sipserver:
importance: Undecided → Medium
tags: added: multiplex pullrequest
Revision history for this message
Rogan Hamby (rogan-hamby) wrote :

Works for me. I never sent in my keys for the sip server repo so ...

I have tested this code and consent to signing off on it with my name, Rogan Hamby and my email address, <email address hidden>.

tags: added: signedoff
Revision history for this message
Jason Stephenson (jstephenson) wrote :

Rogan, you should be able to push to any of the working repositories with the key you submitted for Evergreen.

Revision history for this message
Jason Stephenson (jstephenson) wrote :

Since I have a program that can test this, I'll take this one and check it tonight, too.

I'll go ahead and add Rogan's signoff.

Changed in sipserver:
assignee: nobody → Jason Stephenson (jstephenson)
Revision history for this message
Jason Stephenson (jstephenson) wrote :

It looks to me like SIPServer returns a 000 timeout and 000 retries allowed no matter what, whether the terminal has authenticated or not, and I'm using the prefork server before trying multiplex.

Revision history for this message
Jason Stephenson (jstephenson) wrote :

Ok. Spoke too soon. My timeouts for the institutions were all set to 0. I'm going to update that and keep testing. Retries is working for prefork. I'll switch to multiplex and test Galen's patch properly.

Revision history for this message
Jason Stephenson (jstephenson) wrote :

Is this issue caused by a misconfiguration of the oils_sip.xml file? I ask because I have not been able to reproduce this bug on a server running a copy of our production.

I modified the file to use the Multiplex personality, added localhost as the cache server, and then updated our timeouts to be something other than 0.

I then hit this with a couple programs that I wrote recently while working on getting our load balancer to monitor our SIPServers. One of these programs simply sends the SC Status message while the other logs in and then sends the SC Status message.

With the timeout set in the policy block of the institution in use, I always got the values that I expected for timeout and retries with the Multiplex personality, with or without Galen's patch.

Am I missing something?

Changed in sipserver:
assignee: Jason Stephenson (jstephenson) → nobody
Revision history for this message
Jason Boyer (jboyer) wrote :

Jason, I don't have a lot of background on this issue, but I wonder if the mention of hash key randomness in Galen's patch might mean the bug is difficult to trigger on demand. If your config happens to perturb the keys enough to cause it, this patch should fix it. I also haven't dug around in the code enough to see how likely that is, but it sounds like a good guess. :)

Revision history for this message
Jason Stephenson (jstephenson) wrote :

Jason, thanks for the reply.

I'm going to have another look today, removing the timeouts in my own config to see what happens.

Changed in sipserver:
assignee: nobody → Jason Stephenson (jstephenson)
Revision history for this message
Jason Stephenson (jstephenson) wrote :

My experience still does not line up with the bug description, nor what is in the commit message.

However, this commit does not change what I see, whether I'm using the prefork or multiplex personalities.

Does this bug not apply to Evergreen 3.5?

Changed in sipserver:
assignee: Jason Stephenson (jstephenson) → nobody
Revision history for this message
Galen Charlton (gmc) wrote :

I do not believe that the bug depends on the Evergreen version.

Here is an expect(1) script I've been using for testing:

---
#!/usr/bin/expect

spawn telnet localhost 6001
sleep .1;
send "9900402.00\r";
expect "98";
puts "\n";
sleep .1;
send "9300CNadmin|COPASSWORD|CPgapines\r";
expect "941";
puts "\n";
sleep 6;
send "9900402.00\r";
expect "98";
puts "\ndone\n";
---

And the salient diff against the stock oils_sip.xml from Evergreen for the configuration I'm using to test with:

---

+ <server-params
+ min_servers='3'
+ min_spare_servers='1'
+ max_servers='50'
+ personality='Multiplex'
+ />
+
+
        <listeners>
                <service
                        port="0:8080/tcp"
@@ -23,11 +31,17 @@

                <service
                        port="127.0.0.1:6001/tcp"
+ allow_sc_status_then_login="enabled"
                        transport="RAW"
                        protocol="SIP/2.00"
+ worker-keepalive="65"
                        timeout="60" />
        </listeners>

+ <cache>
+ <server>127.0.0.1:11211</server>
+ </cache>
+
        <accounts>
                <!--
                Fine Item Detail returned by the Patron Information Request is
@@ -50,7 +64,7 @@
                'true' or 'false' as a login attribute, the login attribute will
                override whatever is set in the implementation_config.
                -->
- <login id="scclient" password="clientpwd" institution="gapines"/>
+ <login id="admin" password="demo123" institution="gapines"/>
        </accounts>
---

Perl version I'm testing with is 5.28.1.

Revision history for this message
Galen Charlton (gmc) wrote :

And here's a configuration that doesn't depend on Evergreen at all, just the test ILS driver, where I can reliably reproduce the bug:

oils_sip.xml / SIPConfig.xml
---
<acsconfig xmlns="http://openncip.org/acs-config/1.0/">

  <error-detect enabled="true" />

  <!-- Set Net::Server runtime parameters. "personality" may -->
  <!-- be either PreFork or Multiplex. -->
  <server-params
           personality='Multiplex'
           min_servers='3'
           min_spare_servers='1'
           max_servers='50' />

  <listeners>
    <service
      port="127.0.0.1:6001/tcp"
      transport="RAW"
      protocol="SIP/2.00"
      allow_sc_status_then_login="enabled"
      timeout="60" />
  </listeners>

  <!-- One or more Memcached servers are required for Multiplex mode. -->
  <!-- Cache server(s) are ignored in non-Multiplex mode -->
  <cache>
    <server>127.0.0.1:11211</server>
  </cache>

  <accounts>
      <login id="scclient" password="clientpwd" institution="UWOLS"></login>
  </accounts>

<!-- Institution tags will hold stuff used to interface to -->
<!-- the rest of the ILS: authentication parameters, etc. I -->
<!-- don't know what yet, so it'll just be blank. But there -->
<!-- needs to be one institution stanza for each institution -->
<!-- named in the accounts above. -->
<institutions>
    <institution id="UWOLS" implementation="ILS" parms="">
          <policy checkin="true" renewal="false"
                  status_update="false" offline="false"
                  timeout="600" client_location_code="true"
                  retries="3" />
          <relais_extensions_to_msg24 enabled="false" />
          <encoding>ascii</encoding>
    </institution>
</institutions>
</acsconfig>
---

The expect script:
---
#!/usr/bin/expect

spawn telnet 127.0.0.1 6001
sleep .1;
send "9900402.00\r";
expect "98";
puts "\n";
sleep .1;
send "9300CNscclient|COclientpwd|CPUWOLS\r";
expect "941";
puts "\n";
sleep 6;
send "9900402.00\r";
expect "98";
puts "\ndone\n";
---
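For anyone without expect(1) installed, the same message sequence can be sketched in Python with a plain socket. This is only an illustration under a few assumptions: the server ends each response with a carriage return, the host, port and credentials match the test configuration above, and the helper names (`login_message`, `read_response`, `run_sequence`) are made up for this example.

```python
import socket
import time

SC_STATUS = "9900402.00\r"  # same SC Status message the expect script sends


def login_message(user: str, password: str, location: str) -> str:
    """Build the Login (93) message in the form used by the expect script."""
    return f"9300CN{user}|CO{password}|CP{location}\r"


def read_response(sock: socket.socket) -> str:
    """Read until a carriage return arrives or the peer closes the connection."""
    chunks = []
    while True:
        data = sock.recv(1024)
        if not data:
            break
        chunks.append(data)
        if b"\r" in data:
            break
    return b"".join(chunks).decode("ascii", errors="replace").strip()


def run_sequence(host: str = "127.0.0.1", port: int = 6001, pause: float = 6.0):
    """Send SC Status, log in, pause, then send SC Status again."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(SC_STATUS.encode("ascii"))
        before = read_response(sock)
        sock.sendall(login_message("scclient", "clientpwd", "UWOLS").encode("ascii"))
        login = read_response(sock)
        time.sleep(pause)  # mirrors the "sleep 6" in the expect script
        sock.sendall(SC_STATUS.encode("ascii"))
        after = read_response(sock)
    return before, login, after
```

Calling `run_sequence()` against a running SIPServer returns the first 98 response, the 94 login response and the second 98 response, so the timeout and retries fields of the two status responses can be compared directly.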

Results without the patch:
---
spawn telnet 127.0.0.1 6001
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
9900402.00
98YNNYNN60000320210512 1449022.00AOUWOLS|BXYYYYYYYYYNYNYYYY|

9300CNscclient|COclientpwd|CPUWOLS
941

9900402.00
98YNNNNN00000020210512 1449082.00AOUWOLS|BXYYYYYYYYYNYNYYYY|
done
---

Results with the patch:
---
spawn telnet 127.0.0.1 6001
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
9900402.00
98YNNYNN60000320210512 1451122.00AOUWOLS|BXYYYYYYYYYNYNYYYY|

9300CNscclient|COclientpwd|CPUWOLS
941

9900402.00
98YNNYNN60000320210512 1451182.00AOUWOLS|BXYYYYYYYYYNYNYYYY|
done
---
