Activity metric data isn't always retrieved as expected for searches

Bug #1781480 reported by Kathy Lussier on 2018-07-12
This bug affects 3 people
Affects: Evergreen; Importance: Medium; Assigned to: Unassigned
Affects: 3.0; Importance: Undecided; Assigned to: Unassigned
Affects: 3.1; Importance: Undecided; Assigned to: Unassigned

Bug Description

Evergreen version: 2.12+
OS: Ubuntu 16.04 and Debian Jessie

We have found problems where catalog searches frequently ignore the activity metric data associated with records in the search. When this problem occurs, we see the following behavior:

- None of the records in the search display the badge data, even though there are records in the result set with entries in rating.badge. The badge does not display in either the search results page or the record page.
- If the search should be sorting by popularity or poprel, the results are retrieved in bib relevance order.
- If I perform the exact same search while limiting to records with a particular badge that I know should be part of the result set, I will retrieve 0 results.

Basically, the search behaves as if there are no badges that are part of the result set.

There have been times when I've done a search showing the above behavior while a person performing the exact same search on a different computer at the same time is able to see the badge data. Other times, if I wait 5-10 minutes, a search that failed to retrieve badge data may suddenly become successful, or vice versa.

CW MARS has experienced this problem for several months on their system running Ubuntu 16.04, but I was unable to replicate it on any other catalog using activity metric sorting, so I thought it was a configuration issue. However, NOBLE recently implemented this feature and is also seeing this behavior on their system running Debian Jessie.

A couple of other things we've noticed.
- We have never seen this behavior on one of our single-server training/development systems. We've only seen it occur on our multi-brick production systems. I'm not saying it doesn't happen in a single-server setting, but we just haven't witnessed it.

- We also have verified that it doesn't just happen on one specific brick. For example, we've seen this behavior occur in search results retrieved on every brick in the CW MARS setup, and we've also seen successful searches on every brick.

This feature is such a great addition to Evergreen, and we would love to see it working more consistently! I'm not sure where else to look to find the source of the problem.

Martha Driscoll (mjdriscoll) wrote :

Here is a search as seen in the logs. The system searched has one badge defined with a scope of 1 (Consortia). The search is showing badge_orgs(18) which does not exist and therefore does not sort the result set as expected. A successful search would show badge_orgs(1).

2018-07-19 10:52:51 db-primary postgres[108955]: [30-1] 2018-07-19 10:52:51.497 EDT [172.29.120.92(43570)] [108955]: [28-1] user=evergreen,db=evergreen,e=00000 LOG: duration: 506.683 ms statement: -- bib search: #CD_documentLength #CD_meanHarmonic #CD_uniqueWords core_limit(10000) badge_orgs(18) estimation_strategy(inclusion) skip_check(0) check_limit(1000) sort(poprel) winnie the pooh depth(0)
2018-07-19 10:52:51 db-primary postgres[108955]: [30-6] #011 (to_tsquery('synonym_noble', COALESCE(NULLIF( '(' || btrim(regexp_replace(translate_isbn1013(split_date_range(naco_normalize($_80054$winnie$_80054$))),E'(?:\\s+|:)','&','g'),'&|') || ')', '()'), '')) || to_tsquery('simple', COALESCE(NULLIF( '(' || btrim(regexp_replace(translate_isbn1013(split_date_range(naco_normalize($_80054$winnie$_80054$))),E'(?:\\s+|:)','&','g'),'&|') || ')', '()'), '')))&&
2018-07-19 10:52:51 db-primary postgres[108955]: [30-9] #011 (to_tsquery('synonym_noble', COALESCE(NULLIF( '(' || btrim(regexp_replace(translate_isbn1013(split_date_range(naco_normalize($_80054$winnie$_80054$))),E'(?:\\s+|:)','&','g'),'&|') || ')', '()'), '')) || to_tsquery('simple', COALESCE(NULLIF( '(' || btrim(regexp_replace(translate_isbn1013(split_date_range(naco_normalize($_80054$winnie$_80054$))),E'(?:\\s+|:)','&','g'),'&|') || ')', '()'), ''))) ||

Mike Rylander (mrylander) wrote :

First, Martha's right on that the badge_orgs() filter (supplied by a wrapper function that looks at shelving location groups and the site() filter) is the proximate cause here. The code in question is around line 3381 of Open-ILS/src/perlmods/lib/OpenILS/Application/Storage/Publisher/metabib.pm.

It looks like a location group is being passed to the method, either embedded in the query string or as a hash parameter, but the query generated by the method does not reflect that. Still looking...

Michele Morgan (mmorgan) wrote :

I've been able to reproduce this issue pretty reliably on a single server test system with our production data.

This server has:

- Location groups owned by an org unit other than the consortium.
- No other search activity being done during the test.

Here are the steps that reproduced the problem with log entries:

1. Perform a search at the consortium level - check logs to confirm that the search contains "badge_orgs(1)"

[2018-07-27 19:42:25] open-ils.search [INFO:46526:Biblio.pm:1304:15327185504673420] Completed canonicalized search is: core_limit(100000) badge_orgs(1) estimation_strategy(inclusion) skip_check(0) check_limit(1000) location_groups(3) sort(poprel) depth(0) #CD_documentLength #CD_meanHarmonic #CD_uniqueWords (keyword: harry potter)

2. Perform a different search scoped to a location group that is not owned by the consortium. In this case the owner of the location group is org unit 74 - check logs to confirm that the search contains "badge_orgs(74)"

[2018-07-27 19:43:08] open-ils.search [INFO:46813:Biblio.pm:1304:15327185504673531] Completed canonicalized search is: core_limit(100000) badge_orgs(74) estimation_strategy(inclusion) skip_check(0) check_limit(1000) location_groups(22) sort(poprel) site(WAK) depth(2) #CD_documentLength #CD_meanHarmonic #CD_uniqueWords (keyword: harry potter) (keyword: -"jijjjjjjj")

3. Perform a different search at the consortium level - check logs and see that the search contains "badge_orgs(74)". Even a search done in a different browser will show the same results.

[2018-07-27 19:43:42] open-ils.search [INFO:46526:Biblio.pm:1304:15327185504672414] Completed canonicalized search is: core_limit(100000) badge_orgs(74) estimation_strategy(inclusion) skip_check(0) check_limit(1000) sort(poprel) depth(0) #CD_documentLength #CD_meanHarmonic #CD_uniqueWords (keyword: harry potter) (keyword: -"jijjkjjjjj")

Subsequent unrelated searches at the consortium level also showed "badge_orgs(74)" in the logs. I also found that executing a search in a location group owned at the consortium level reset the badge_orgs to badge_orgs(1).
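The check performed in each step above (looking for the badge_orgs() value in the logged search) can be scripted. The following is only a rough sketch: the helper name badge_orgs_in is hypothetical, the sample lines are abbreviated copies of the log entries above, and the handling of comma-separated IDs is an assumption, since every entry in this report shows a single ID.

```perl
use strict;
use warnings;

# Return the org unit IDs found in the badge_orgs() filter of one log line.
# Assumption: multiple IDs, if present, are separated by commas.
sub badge_orgs_in {
    my ($line) = @_;
    return () unless $line =~ /\bbadge_orgs\(([\d,\s]*)\)/;
    return grep { length } split /[,\s]+/, $1;
}

# Abbreviated copies of the three log entries from the steps above.
my @log_lines = (
    'core_limit(100000) badge_orgs(1) estimation_strategy(inclusion) location_groups(3) sort(poprel) depth(0)',
    'core_limit(100000) badge_orgs(74) estimation_strategy(inclusion) location_groups(22) sort(poprel) site(WAK) depth(2)',
    'core_limit(100000) badge_orgs(74) estimation_strategy(inclusion) sort(poprel) depth(0)',
);

for my $line (@log_lines) {
    my @orgs = badge_orgs_in($line);
    print "badge_orgs: @orgs\n";
}
```

The third line is the interesting one: it has no location_groups() or site() filter, yet still carries the org unit from the previous search.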

Changed in evergreen:
status: New → Confirmed

Kathy Lussier (klussier) wrote :

I'm confirming that I'm able to replicate the problem by following Michele's steps, but I also wanted to add an additional data point. When performing any search that is scoped to a copy location group owned by a branch or system, no activity metric data is retrieved. The activity metric works fine when scoping to a copy location group owned by the consortium.

It isn't just a problem where the badge_orgs filter is being mis-applied to searches that are not being scoped to the copy location group. It's also a problem where copy location group searching isn't working correctly with the activity metric data unless the copy location group is owned by the org unit at the top of the org tree.

Mike Rylander (mrylander) wrote :

All,

I have a branch that fixes the "leaking" location groups. It looks like it may be caused by a subtle change in how Perl instantiates variables in newer versions when a certain idiomatic expression is used inside of what eventually becomes a closure.

Specifically, you can't say:

  my @foo = @{$bar{x}} if (ref $bar{x});
   ...
  if ($some_condition) { @foo = qw/some list/; }

and expect @foo to be unset if the next call to an OpenSRF method does not pass something in $bar{x}.
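For comparison, here is a minimal sketch of one way to avoid that idiom: declare the variable unconditionally and apply the condition to the assignment only. This is not taken from the branch linked below; the sub name location_groups_for and the location_groups key are hypothetical. The perlsyn documentation describes the behaviour of a my declaration modified by a statement modifier as undefined, which is consistent with the stale values seen here.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the location groups to use for one call.
sub location_groups_for {
    my (%args) = @_;

    # Declare unconditionally so every call starts with a fresh, empty array.
    my @groups;

    # Assign only when the caller actually passed an array reference.
    @groups = @{ $args{location_groups} }
        if ref($args{location_groups}) eq 'ARRAY';

    return @groups;
}

my @first  = location_groups_for(location_groups => [22]);
my @second = location_groups_for();    # nothing passed this time

print scalar(@first),  " group(s) on the first call\n";
print scalar(@second), " group(s) on the second call\n";
```

Because the declaration is no longer conditional, the second call cannot see anything left over from the first.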

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/miker/lp-1781480-locations-are-sneaking-in

Separately, the display behavior of popularity data is different for location groups and direct org unit context searches. For org unit searches, we display the badge data for the ancestors of the context org unit (and its own, of course). However, because location groups may have different location owners, and more than one can be supplied, we use only the badges owned by the owners of the location groups requested.

Changing that is simple, but the impact will be potentially showing popularity data from many different org units unrelated to the user-chosen location group. If the community consensus is that we should show ancestor badges for all location group owners, across all results, I'm happy to make that change.

tags: added: pullrequest

Kathy Lussier (klussier) wrote :

Hi Mike,

I'm trying to understand the downside to including the badge data in copy location groups. I'm confused as to why badge data for another org unit will show up in copy location group searches.

As an example, we have a copy location group that allows patrons to search just our academic libraries. It contains copy locations owned by all of our academic libraries. Let's say we have a consortium with a structure like this:

- CONS
 -- Evergreen University
   -- EU Main Campus Library
   -- EU Satellite Campus
 -- Spruce College
   -- Spruce College Library
 -- Hemlock Public Library
   -- Hemlock Main Branch
   -- Hemlock West
   -- Hemlock East
 -- Douglas Public Library
   -- Douglas Library

We create an academic-themed copy location group that includes copy locations owned by EU Main Campus Library, EU Satellite Campus, and Spruce College Library.

When searching this copy location group, I would be okay with badge data appearing for any badges owned by the consortium, by any of the academic library systems, or by any of the three academic library branches.

Are you saying that badge data might display for badges owned by one of the public systems? What circumstances would lead to that happening?

In our case, all of the badges are owned by the consortium, so I think we would be okay with seeing the badge data for ancestor org units, but I want to understand the situations where other org unit data would be leaking in before I give a +1 to it.

Mike Rylander (mrylander) wrote :

Hi Kathy,

I'll use your (great) example consortium and theoretical location group.

The decision to make the behavior different came down to the fact that location groups are collections of copy locations that are not necessarily related in a hierarchical way, and are not limited to a single "context" value the way that the site() filter is. In your example, the contents of the group span three branches and two systems. What's unclear in the example is who owns the academic-themed group, but I'll posit that the top org unit owns the group for our purposes here.

Looking at things from the layer closest to the data, there are basically four choices when it comes to asking whose badges to show when we have a location_groups() filter but no site() filter:

 1) The badges belonging to the owners of the location groups requested via the location_groups() filter. This is what we do today.
 2) My suggested change, which was to include badges belonging to the ancestors of the group owners as well, though in this example that wouldn't make a difference because the top org unit owns the group.
 3) The badges belonging to the owners of the copy locations /within/ the groups requested.
 4) The badges belonging to the owners of the copy locations, as in (3), along with their ancestors. This would pull in badges from all the academic libraries, the systems, and the top of the tree.

Adjusting the example a little to consider a "Children's materials" group in a large public consortium, (4) could very well pull (basically) duplicated badges from /all/ libraries in a consortium. Maybe that's fine.

I'm fine with changing the location-group-search badge scoping to any of 2-4, or leaving it at 1. The nice thing about 1 is that one does have the ability, through configuration of badges at specific org units matching group owners, to control what is displayed more directly, but that may not be useful or necessary in practice.

Thanks, Kathy!

Kathy Lussier (klussier) wrote :

OK, I was confused earlier thinking that it would consider the owners of the copy locations within the group rather than the owner of the group.

If we went with your original suggestion (#2), then, and Hemlock Main Branch in our example owned a copy location group that pulled together all children's materials, it would display any badges owned by the branch or by the Hemlock Public Library System or by the consortium. Is that correct? Is there any case where it would show the badges owned by sibling branches? If not, I think this approach should work, but I may be missing something.

Mike Rylander (mrylander) wrote :

Yes, that's exactly right. Badges owned by the group owner and any of its ancestors would be displayed, but badges from sibling branches, and sibling systems and their branches, would not be displayed.

I've pushed a commit to the branch which makes that change.
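The scoping rule agreed on above (badges owned by the group owner and its ancestors, but not by siblings) can be illustrated with Kathy's example consortium. This is a sketch only, not the code from the branch: the %parent_of map and both sub names are hypothetical, and org units are keyed by name here purely for readability.

```perl
use strict;
use warnings;

# Hypothetical child => parent map for part of the example consortium.
my %parent_of = (
    'Evergreen University'   => 'CONS',
    'EU Main Campus Library' => 'Evergreen University',
    'EU Satellite Campus'    => 'Evergreen University',
    'Spruce College'         => 'CONS',
    'Spruce College Library' => 'Spruce College',
    'Hemlock Public Library' => 'CONS',
    'Hemlock Main Branch'    => 'Hemlock Public Library',
    'Hemlock West'           => 'Hemlock Public Library',
    'Hemlock East'           => 'Hemlock Public Library',
);

# Return the org unit followed by its ancestors, up to the top of the tree.
sub self_and_ancestors {
    my ($org) = @_;
    my @chain;
    while (defined $org) {
        push @chain, $org;
        $org = $parent_of{$org};
    }
    return @chain;
}

# Badge-owning org units for a list of location group owners:
# each owner plus its ancestors, de-duplicated, in first-seen order.
sub badge_orgs_for_owners {
    my (@owners) = @_;
    my (%seen, @orgs);
    for my $owner (@owners) {
        for my $org (self_and_ancestors($owner)) {
            push @orgs, $org unless $seen{$org}++;
        }
    }
    return @orgs;
}

my @orgs = badge_orgs_for_owners('Hemlock Main Branch');
print join(', ', @orgs), "\n";
```

For a group owned by Hemlock Main Branch, the result contains the branch, Hemlock Public Library, and CONS; Hemlock West and Hemlock East never appear because they are siblings rather than ancestors.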

Elizabeth Thomsen (et-8) wrote :

This is great -- this is exactly how we want things to work! Thanks, Mike!

Changed in evergreen:
milestone: none → 3.2-rc
assignee: nobody → Jason Stephenson (jstephenson)

Jeanette Lundgren (jlundgren) wrote :

Jason applied Mike's branch, and in my testing I have not been able to get the badges to drop off of search results (the pre-patch behavior).

I tried searching at multiple scope levels: consortium, library, branch, children's and shelving locations owned by individual libraries. The badge detail remained in every case and while switching scopes.

Thank you Mike and NOBLE for your work on this!

I'm adding a signoff based on my testing. I consent to signoff of this with my name jlundgren and my email <email address hidden>.

tags: added: signedoff
Changed in evergreen:
assignee: Jason Stephenson (jstephenson) → nobody

Jason Stephenson (jstephenson) wrote :

Added Jeanette's sign off and pushed to master, rel_3_1, and rel_3_0. Thanks, everyone!

Changed in evergreen:
status: Confirmed → Fix Committed
Changed in evergreen:
status: Fix Committed → Fix Released