z39.50 searches include spurious keyword term

Bug #1662667 reported by Jason Stephenson on 2017-02-07
This bug affects 1 person
Affects: Evergreen
Importance: Undecided
Assigned to: Unassigned

Bug Description

Evergreen Versions: 2.10.7 & Master
OpenSRF Versions: 2.4.1 & Master
Pg Versions: 9.2 & 9.5

When doing a Z39.50 search against Evergreen, I've seen that a spurious keyword term is apparently included. Here is the log entry for an author search for "mozart" against the Concerto dataset on a freshly compiled master:

[2017-02-07 20:28:24] open-ils.search [INFO:21405:Biblio.pm:964:14864991682148510] compiled search is {"limit":10,"skip_check":0,"searches":{"keyword":{"term":"eg."},"author":{"term":"mozart"}},"estimation_strategy":"inclusion","depth":0,"check_limit":"1000","org_unit":1,"offset":0,"core_limit":10000}

The "keyword":{"term":"eg."} entry is the spurious search term.
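
For anyone checking their own logs for the same symptom, here is a minimal Python sketch that pulls the search classes out of a "compiled search is" log line. The function name compiled_search_terms and the idea that only the "author" class is expected are assumptions made for this illustration; the log line is the one quoted above.

```python
import json

# Log line copied from the report above.
LOG_LINE = (
    '[2017-02-07 20:28:24] open-ils.search '
    '[INFO:21405:Biblio.pm:964:14864991682148510] compiled search is '
    '{"limit":10,"skip_check":0,"searches":{"keyword":{"term":"eg."},'
    '"author":{"term":"mozart"}},"estimation_strategy":"inclusion","depth":0,'
    '"check_limit":"1000","org_unit":1,"offset":0,"core_limit":10000}'
)

MARKER = "compiled search is "


def compiled_search_terms(line):
    """Return {search_class: term} for a 'compiled search is' log line.

    Hypothetical helper for this illustration, not part of Evergreen.
    """
    _, _, payload = line.partition(MARKER)
    if not payload:
        return {}
    compiled = json.loads(payload)
    return {
        search_class: spec.get("term")
        for search_class, spec in compiled.get("searches", {}).items()
    }


terms = compiled_search_terms(LOG_LINE)
# The yaz-client query only asked for an author search, so any other
# class in the compiled search is unexpected.
unexpected = {cls: term for cls, term in terms.items() if cls != "author"}
print(terms)
print(unexpected)
```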

The above was produced by doing a search with yaz-client:
find @attr 1=1003 mozart

I have similar examples with production data on 2.10.7, both on a test VM and on a live production server. Every Z39.50 search apparently gets that extra search term added.

It does not appear to affect search results: the following SRU search reports 11 results, although in my test I counted only 10 <recordData> tags.

http://localhost/opac/extras/sru/CONS/holdings?version=1.1&operation=searchRetrieve&query=eg.author%3Dmozart&maximumRecords=0

The above is what I considered the equivalent SRU search.
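
To repeat the comparison between the reported result count and the number of <recordData> elements, a sketch along the following lines may help. It rebuilds the URL above with the Python standard library and summarizes a response body. The helper names sru_url and summarize are assumptions for this example, and SAMPLE is a made-up response fragment, not output from Evergreen.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base URL taken from the report; adjust host and org unit as needed.
BASE = "http://localhost/opac/extras/sru/CONS/holdings"


def sru_url(query, version="1.1", maximum_records=0):
    """Build a searchRetrieve URL like the one quoted in the report."""
    params = {
        "version": version,
        "operation": "searchRetrieve",
        "query": query,
        "maximumRecords": maximum_records,
    }
    return BASE + "?" + urlencode(params)


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def summarize(xml_text):
    """Return (claimed numberOfRecords, count of recordData elements)."""
    root = ET.fromstring(xml_text)
    claimed = None
    returned = 0
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "numberOfRecords" and claimed is None:
            claimed = int(elem.text)
        elif name == "recordData":
            returned += 1
    return claimed, returned


# Made-up fragment for illustration only (not real Evergreen output).
SAMPLE = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
<numberOfRecords>2</numberOfRecords>
<records><record><recordData/></record></records>
</searchRetrieveResponse>"""

print(sru_url("eg.author=mozart"))
print(summarize(SAMPLE))
```

Fetching the URL (for example with urllib.request.urlopen) and passing the body to summarize would show whether the two numbers disagree on a given system.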

The above tests should be reproducible by anyone using the concerto dataset.

description: updated
Jason Stephenson (jstephenson) wrote :

Here are log entries from the Z39.50 search. (Don't mind the timestamp differences. My VM apparently can't decide if it is in EST or UTC.)

Apache other_vhosts_access_log:

localhost:80 ::1 - - [07/Feb/2017:16:33:40 -0500] "GET /opac/extras/sru/CONS/holdings?version=1.2&operation=searchRetrieve&query=eg.author%20%3D%20mozart&startRecord=1&maximumRecords=0 HTTP/1.1" 200 129771 "-" "YAZ/4.2.30"

SRU conversion from osrfsys.log:

[2017-02-07 16:33:40] /usr/sbin/apache2 [INFO:1863:SuperCat.pm:1996:148650319118639] SRU search string [eg.author = mozart] converted to [eg.author:mozart site:CONS]

How Evergreen interpreted the search:

[2017-02-07 21:33:42] open-ils.search [INFO:1790:Biblio.pm:964:1486503191186310] compiled search is {"org_unit":1,"offset":0,"skip_check":0,"limit":10,"searches":{"keyword":{"term":"eg."},"author":{"term":"mozart"}},"estimation_strategy":"inclusion","check_limit":"1000","depth":0,"core_limit":10000}

Here are the same for the equivalent SRU search, i.e. the one in the main description:

Apache log:

localhost:80 192.168.122.1 - - [07/Feb/2017:16:51:21 -0500] "GET /opac/extras/sru/CONS/holdings?version=1.1&operation=searchRetrieve&query=eg.author%3Dmozart&maximumRecords=0 HTTP/1.1" 200 129797 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/55.0.2883.87 Chrome/55.0.2883.87 Safari/537.36"

SRU conversion:

[2017-02-07 16:51:21] /usr/sbin/apache2 [INFO:1889:SuperCat.pm:1996:148650319118899] SRU search string [eg.author=mozart] converted to [eg.author:mozart site:CONS]

The actual search:

[2017-02-07 21:51:22] open-ils.search [INFO:1790:Biblio.pm:964:1486503191188910] compiled search is {"check_limit":"1000","core_limit":10000,"depth":0,"searches":{"keyword":{"term":"eg."},"author":{"term":"mozart"}},"estimation_strategy":"inclusion","org_unit":1,"offset":0,"skip_check":0,"limit":10}

I want to add one more entry. Changing the version in the URL to 1.2 produces the following search log entry:

[2017-02-07 21:56:59] open-ils.search [INFO:1790:Biblio.pm:964:1486503191185910] compiled search is {"org_unit":1,"offset":0,"limit":10,"skip_check":0,"check_limit":"1000","core_limit":10000,"depth":0,"searches":{"keyword":{"term":"eg."},"author":{"term":"mozart"}},"estimation_strategy":"inclusion"}

I believe the original problem description is not quite accurate. The issue seems to be with specifying version 1.2 to SRU. I may update the bug title/initial description later.

Jason Stephenson (jstephenson) wrote :

And, I should read better. It seems to be a problem with SRU searching in general.

Mike Rylander (mrylander) wrote :

Jason,

This looks to be caused by legacy code in the open-ils.search application. Specifically, the loop starting at http://git.evergreen-ils.org/?p=working/Evergreen.git;a=blob;f=Open-ILS/src/perlmods/lib/OpenILS/Application/Search/Biblio.pm;h=5c077903f4abbee700ab245f31d7434a9809adb8#l857 that attempts to "fix" search class types. All that logic is removed in http://git.evergreen-ils.org/?p=working/Evergreen.git;a=commitdiff;h=d2bb2e67df67f1d0de6ea8decd8e621500d95b31 which is part of the last branch attached to https://bugs.launchpad.net/evergreen/+bug/1005040 that Kathy plans to commit some time this week. Could you test that branch, or retest after that's committed?

Thanks!

Jason Stephenson (jstephenson) wrote :

Mike,

I'll give that branch a spin.

Jason Stephenson (jstephenson) wrote :

Looks like it fixes it. The log entries are different now, but here's the Mozart search done again:

[2017-02-09 01:39:29] open-ils.storage [INFO:8733:Application.pm:159:1486604282883011] CALL: open-ils.storage open-ils.storage.biblio.multiclass.staged.search_fts.atomic estimation_strategy, inclusion, check_limit, 1000, core_limit, 10000, limit, 1000, return_query, 1, query, eg.author:mozart site:CONS, offset, 0, skip_check, 0
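
The new CALL entry passes the query as a flat list of alternating names and values, so checking it for a stray keyword term takes a slightly different approach than with the old compiled-search JSON. A rough sketch follows, assuming the arguments are separated by ", " and that no value contains a comma (true for this entry, not guaranteed in general). The helper name call_params is hypothetical.

```python
# Log entry from above, rejoined into a single line.
CALL_LINE = (
    "[2017-02-09 01:39:29] open-ils.storage "
    "[INFO:8733:Application.pm:159:1486604282883011] "
    "CALL: open-ils.storage "
    "open-ils.storage.biblio.multiclass.staged.search_fts.atomic "
    "estimation_strategy, inclusion, check_limit, 1000, core_limit, 10000, "
    "limit, 1000, return_query, 1, query, eg.author:mozart site:CONS, "
    "offset, 0, skip_check, 0"
)


def call_params(line):
    """Split a CALL log line into (method, {name: value}).

    Assumes ', ' separates arguments and no value contains a comma.
    """
    _, _, call = line.partition("CALL: ")
    _service, method, arg_text = call.split(" ", 2)
    parts = arg_text.split(", ")
    # Arguments alternate name, value, name, value, ...
    return method, dict(zip(parts[0::2], parts[1::2]))


method, params = call_params(CALL_LINE)
print(method)
print(params["query"])
```

Here the query value is the converted SRU string itself, with no separate keyword entry alongside it.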

Here's the Apache line:

[2017-02-08 20:39:29] /usr/sbin/apache2 [INFO:8830:SuperCat.pm:1997:1486604282883010] SRU search string [eg.author = mozart] converted to [eg.author:mozart site:CONS]
