Gateway timeout on retrieving many shelving locations

Bug #1754164 reported by Jeff Davis
This bug affects 2 people
Affects     Status      Importance  Assigned to  Milestone
Evergreen   New         Medium      Unassigned
2.12        Won't Fix   Undecided   Unassigned
3.0         Won't Fix   Undecided   Unassigned
3.1         Won't Fix   Undecided   Unassigned

Bug Description

EG 2.12.1 and 3.0.3
OpenSRF 2.5 and 3.0

When requesting shelving locations (open-ils.circ.copy_location.retrieve.all) via the HTTP gateway, the request can time out prematurely if you have a large number of shelving locations.

To reproduce, add a large number of shelving locations to asset.copy_location (over 5,000 in my test environment). Then, request all shelving locations via the gateway:

https://example.com/osrf-gateway-v1?service=open-ils.circ&method=open-ils.circ.copy_location.retrieve.all

On my test server, this returns an empty payload after only a few seconds:

{"payload":[],"status":200}

The open-ils.circ.copy_location.retrieve.all API call itself is not failing: OpenSRF logs show no errors, and the request succeeds via srfsh. However, the gateway logs show a timeout error:

Returning NULL from app_request_recv after timeout: open-ils.circ.copy_location.retrieve.all [null]

The default timeout value is 60s, but we get our null response in only a few seconds. The gateway request succeeds if we supply a larger timeout value as a URL param, e.g.:

https://example.com/osrf-gateway-v1?service=open-ils.circ&method=open-ils.circ.copy_location.retrieve.all&timeout=300
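For anyone scripting the workaround, here is a minimal Python sketch that builds the gateway URL with an optional timeout parameter and flags the symptom described above (status 200 with an empty payload). The helper names build_gateway_url and looks_like_silent_timeout are hypothetical, chosen for illustration; only the URL shape and the response shape come from this report.

```python
from urllib.parse import urlencode

# Gateway path as it appears in the URLs in this report.
GATEWAY_PATH = "/osrf-gateway-v1"


def build_gateway_url(host, service, method, timeout=None):
    """Build a gateway URL; append timeout only when one is given."""
    params = [("service", service), ("method", method)]
    if timeout is not None:
        params.append(("timeout", str(timeout)))
    return f"https://{host}{GATEWAY_PATH}?{urlencode(params)}"


def looks_like_silent_timeout(response):
    """True for the symptom in this report: HTTP-level success, empty payload.

    `response` is the decoded JSON body, e.g. {"payload": [], "status": 200}.
    An empty payload can also be a legitimate result for other calls, so this
    is only a hint that the request may be worth retrying with a larger timeout.
    """
    return response.get("status") == 200 and not response.get("payload")
```

A caller could retry with timeout=300 when looks_like_silent_timeout returns True, which mirrors the manual workaround rather than fixing the underlying cause.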

So far I haven't been able to replicate the issue with other API calls that return large chunked responses, but I don't know why there would be anything special about open-ils.circ.copy_location.retrieve.all.

Jeff Davis (jdavis-sitka) wrote :

See attachment for osrfsys and gateway logs for a failed request.

description: updated
Jeff Davis (jdavis-sitka) wrote :
tags: added: pullrequest
Dan Wells (dbw2)
Changed in evergreen:
milestone: none → 3.1-rc
no longer affects: evergreen/3.1
Changed in evergreen:
importance: Undecided → Medium
Bill Erickson (berick) wrote :

Beware, changing an API from non-streaming to streaming changes how the client interacts with the call. Any clients (e.g. Open-ILS/web/js/dojo/openils/CopyLocation.js) that call the API will need to be taught to expect a stream instead of a single array of things.

Also, while we're addressing the issue of fetching too-large data blobs, I suggest we add {substream => 1} to the underlying cstore editor call so it also fetches the big list of stuff via streaming call from cstore.

Removing pullrequest until at least the first issue is resolved.
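To illustrate the client-side concern raised in this comment, here is a rough Python sketch of a gateway client that tolerates both response shapes. It rests on an assumption not confirmed in this report: that a non-streaming call puts a single array of locations in the payload, while a streaming call puts one payload entry per location, with each location decoded as a JSON object. The function name flatten_locations is hypothetical.

```python
def flatten_locations(payload):
    """Return a flat list of location objects from a gateway payload.

    Assumed shapes (not confirmed by the bug report):
      non-streaming: [[loc1, loc2, ...]]  (one entry holding the whole array)
      streaming:     [loc1, loc2, ...]    (one entry per location)
    Locations are assumed to decode as dicts, so a lone list entry is taken
    to mean the non-streaming shape.
    """
    if len(payload) == 1 and isinstance(payload[0], list):
        return list(payload[0])
    return list(payload)
```

A client written this way would keep working whether or not the API is switched to streaming, at the cost of relying on the shape heuristic above.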

tags: removed: pullrequest
Changed in evergreen:
milestone: 3.1-rc → none
Jeff Davis (jdavis-sitka) wrote :

Thanks for reviewing, Bill. I should have known it wouldn't be so easy. :)

I don't know what is required to handle a streaming response from this API. Any assistance would be appreciated.

Elaine Hardy (ehardy)
tags: added: copylocations performance
Andrea Neiman (aneiman)
tags: added: itemlocations
removed: copylocations
tags: added: cat-locations
removed: itemlocations
Galen Charlton (gmc) wrote :

Noting in discussion with Jeff that it was a non-Evergreen client that had been running into this issue.
