Duplicate initial search results where copy circ lib/call number owning lib are different

Bug #1315552 reported by Brent Mills on 2014-05-02
This bug affects 14 people
Affects: Evergreen
Importance: Medium
Assigned to: Unassigned

Bug Description

-Evergreen 2.5.1
-OpenSRF 2.2.x

When items within a record have a circulating library different than their call number's owning library, the catalog search results for that record show duplicate rows in the initial search returns.

The item number counts are correct in the search result and record level displays. When you click on the record with the duplicate results, the copies are listed correctly, without duplicates.

Making the call number/circ libraries the same corrects the duplicate initial search issue.

A few libraries are trying to implement floating collections and would like the owning lib to remain constant for items within a system. For example, HR-HRCL is the owning branch of the HRDIST system, and the circ lib is either one of the other branches or HR-HRCL itself.

Attaching a few screen captures of the before (copies with different circ lib/same owning lib) and after (copies have same circ/owning lib).

Behavior is the same for the staff client and OPAC

Kathy Lussier (klussier) wrote :

Confirmed on a 2.5 system. I wasn't initially able to replicate the problem when I tried it on a record where the system only owned one copy of the title. The problem started to appear on records where there were multiple copies, at least one of which had an owning library that was different than the circulation library.

Changed in evergreen:
status: New → Confirmed
importance: Undecided → Medium
Mary Llewellyn (mllewell) wrote :

Confirmed on a 2.8 system, when items share the same call number and owning library, but have different circulating library values.

Blake GH (bmagic) wrote :

Here is a query that will identify the bibs in your database:

SELECT bmp.record AS part_record, acpm.target_copy, acn.record AS call_number_record
FROM asset.copy ac
JOIN asset.call_number acn ON acn.id = ac.call_number
JOIN asset.copy_part_map acpm ON acpm.target_copy = ac.id
JOIN biblio.monograph_part bmp ON bmp.id = acpm.part
JOIN biblio.record_entry bre ON bre.id = acn.record
WHERE bre.id != bmp.record
  AND NOT bre.deleted
  AND NOT acn.deleted
  AND NOT ac.deleted
  AND acn.record > 0
LIMIT 100;

These are copies with parts where one of the parts points to a bib that is not equal to the call_number.record for the copy. In these cases, they are repeated in the OPAC.

Running some experiments:

1. Pick a bib X
2. Pick a bib Y
3. Create a part for bib X (vol1)
4. Create a copy on bib X with part assigned vol1 and call number 1234
5. Repeat 3-4 with bib Y
6. Transfer volume from bib X to bib Y
7. The result will not move the part. Copy from bib X still points to the part for bib X
8. Refresh bib Y, right click the newly moved item and "Replace Barcode"
9. Select vol1 from the part dropdown and click "Re-Barcode / Update Items"
10. You will now have two rows in asset.copy_part_map, one pointing to the old bib and one pointing to the current bib

This will result in duplicate rows in the OPAC
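
A minimal sketch of why a second asset.copy_part_map row can double the display, assuming the display effectively joins each copy to its part mappings. The data and the joinCopiesToParts helper are hypothetical and only model the fan-out; this is not Evergreen's actual query.

```javascript
// Hypothetical in-memory stand-ins for asset.copy and asset.copy_part_map.
const copies = [{ id: 101, barcode: "3000123" }];
const copyPartMap = [
  { targetCopy: 101, part: 1 }, // stale mapping: part still belongs to bib X
  { targetCopy: 101, part: 2 }, // new mapping: part on bib Y
];

// Emulates an inner join: one output row per (copy, mapping) pair.
function joinCopiesToParts(copyList, maps) {
  return copyList.flatMap((copy) =>
    maps
      .filter((m) => m.targetCopy === copy.id)
      .map((m) => ({ copyId: copy.id, barcode: copy.barcode, part: m.part }))
  );
}

const rows = joinCopiesToParts(copies, copyPartMap);
console.log(rows.length); // 2 rows for a single physical copy
```

Removing the stale mapping (or deduplicating on copy id at display time) would bring this back to one row per copy.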

The issue with moving items is:

var robj = network.simple_request(
    'FM_ACP_FLESHED_BATCH_UPDATE',
    [ ses(), copies, true ],
    null,
    {
        'title' : $("catStrings").getString('staff.cat.util.transfer_copies.override_transfer_failure'),
        'overridable_events' : [
            1208 /* TITLE_LAST_COPY */,
            1227 /* COPY_DELETE_WARNING */,
        ]
    }
);

There needs to be logic to handle the part.

And here is the code for the volume transfer:

var robj = obj.network.simple_request(
    'FM_ACN_TRANSFER',
    [ ses(), { 'docid' : obj.data.marked_library.docid, 'lib' : obj.data.marked_library.lib, 'volumes' : list } ],
    null,
    {
        'title' : document.getElementById('catStrings').getString('staff.cat.copy_browser.transfer.override.failure'),
        'overridable_events' : [
            1208 /* TITLE_LAST_COPY */,
            1219 /* COPY_REMOTE_CIRC_LIB */,
        ],
    }
);

I think the Perl code could handle it instead of the JS:

Cat.pm
batch_volume_transfer

OR*

we just don't care that the part_map remains in the database and we handle the display better in the OPAC.

Blake GH (bmagic) wrote :

Watch out for holds on these parts!

SELECT *
FROM action.hold_request
WHERE target IN (
    SELECT acpm.part
    FROM asset.copy ac
    JOIN asset.call_number acn ON acn.id = ac.call_number
    JOIN asset.copy_part_map acpm ON acpm.target_copy = ac.id
    JOIN biblio.monograph_part bmp ON bmp.id = acpm.part
    JOIN biblio.record_entry bre ON bre.id = acn.record
    WHERE bre.id != bmp.record
      AND NOT bre.deleted
      AND NOT acn.deleted
      AND NOT ac.deleted
      AND acn.record > 0
)
  AND hold_type = 'P'
  AND capture_time IS NULL
  AND cancel_time IS NULL;

Run this first if you are considering deleting those stale part maps.

Michele Morgan (mmorgan) wrote :

Blake's comment addresses a different issue than what this bug describes. The duplication described here results from asset.copy.circ_lib being different from asset.call_number.owning_lib for a given copy. This can be normal for copies.

The duplication in the catalog display that Blake describes results from multiple rows for the copy in asset.copy_part_map, which is a bug in itself.

I'd say that Blake's multiple rows in asset.copy_part_map issue should be reported in a new bug.

Blake GH (bmagic) wrote :

I reported a bug here:

https://bugs.launchpad.net/evergreen/+bug/1411422

And then it was marked a duplicate of this bug. I think that there are some similarities and maybe some of the underlying code is related. I believe bug 1411422 is related to parts.

Michele Morgan (mmorgan) wrote :

I see I'm the one that made it a duplicate!

But I misunderstood the description; it was indeed a separate issue, so I'm removing the duplicate designation on bug 1411422.

Shae (shae-esilibrary) wrote :

Confirmed on a 2.9 system. It seems there are more conditions to the duplicates than originally described, and it's not specific to floating collections. Any time you have a mismatch between Owning and Circulating Library, there's potential for this display issue to show up. Here is what I did to recreate it on a 2.9 system:

1. Title is Dune Road by Jane Green. I'm searching at the System level (Lexington System). It has 3 copies and all 3 copies had an Owning and Circulating Library of Lexington Main. All is well with the "show more details" OPAC view.

2. I went to Holdings Maintenance and updated one of the 3 copies to have a Circulating Library of Oxford, keeping the Owning Library of Lexington Main. Oxford is also part of the Lexington system.

3. When I go back to my search, again at the System level, I see duplicates. It's now showing 6 copies in the detailed view although the copy summary is still correct showing "3 out of 3 copies are available in the Lexington System."

4. If I scope down to the Lexington Main branch, it looks good. I only see the 2 copies with a Circulating Library of Lexington Main.

5. If I scope down to the Oxford branch, it looks good. I only see the 1 copy with the Circulating Library of Oxford.

So, scoping makes a difference here. Also verifying that this is only with the "show more details" display. The copy summaries are fine.

I also found that if I have a bunch of copies, the issue only shows up if I have a mismatch with one of the copies that shows on the "show more details" screen. For instance, if there are 30 copies and the first 10 show up in the detailed view, I actually have to make one of those 10 copies a mismatch before I'll see the duplicates. If I happen to update a copy that wasn't showing in the initial list, I won't see the duplicates. This is why it appears larger library systems with more copies might not bump into this as much as smaller systems where all of their copies show up on that initial search results screen.

I would say the easiest workaround until the bug can be resolved is to turn off the "show more details" display in the OPAC. Not an ideal solution, but it's the only thing I have to offer until someone can resolve the display issue. Hope this helps.

Shae (shae-esilibrary) wrote :

And I'm attaching some screen prints to go with my original comment to illustrate how I made the duplicates appear.

Mike Rylander (mrylander) wrote :

The problem stems from the db stored proc called evergreen.ranked_volumes() and the fact that it no longer (and maybe never, but it's been wholly rewritten at least once) does what it says on the tin. It mixes volume-level information (id) with copy-level information (circ lib name), and is basically useless (now) for the internal uses to which it is currently being put. This is because it does not actually spit out ranked volumes at all, but volume+copy.circ_lib, and volume id is duplicated in the case of differing circ and owning libs.
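
A minimal sketch of the effect described above, using a simplified in-memory model rather than the real stored procedure. The data mirrors the Lexington/Oxford reproduction earlier in this report; the function names volumeRowsByCircLib and displayedCopies are hypothetical, and the assumption is that the caller expands each returned volume row into all copies on that volume.

```javascript
// Simplified model: one call number (volume 7) owned by Lexington Main,
// with three copies, one of which circulates at Oxford.
const copies = [
  { id: 1, volume: 7, circLib: "Lexington Main" },
  { id: 2, volume: 7, circLib: "Lexington Main" },
  { id: 3, volume: 7, circLib: "Oxford" },
];

// Emits one row per distinct (volume, circ lib) pair, so the same volume id
// appears once for every circ lib found among its copies.
function volumeRowsByCircLib(copyList) {
  const seen = new Map();
  for (const c of copyList) {
    seen.set(`${c.volume}|${c.circLib}`, { volume: c.volume, name: c.circLib });
  }
  return [...seen.values()];
}

// A caller that expands every returned volume row into all copies of that volume.
function displayedCopies(volumeRows, copyList) {
  return volumeRows.flatMap((row) => copyList.filter((c) => c.volume === row.volume));
}

const volumeRows = volumeRowsByCircLib(copies);
const shown = displayedCopies(volumeRows, copies);
console.log(volumeRows.length, shown.length); // 2 volume rows, 6 displayed copies
```

With all three copies at the same circ lib, the same model yields one volume row and three displayed copies, which matches the behaviour reported when circ lib and owning lib agree.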

There are a few ways to address this:

 1) We can notice the duplicates in the output of unapi.holdings_xml() while in the TPAC code.
 2) We can ignore duplicate volume IDs in unapi.holdings_xml() while preserving the true "volume" rank order displayed today by taking the "best" ranking for each volume. Then create a function that ranks copies of a volume, and use that in unapi.acn() to include copies.
 3) We can "fix" ranked_volumes to return more information, such as adding copy id and relevant library ids, so that direct callers can ... **mumble mumble** ... more complicated query.
 4) We can break the unapi structure and invert the volume->copy structure so that copies have volumes dangling off them. We'd still need (3) or something like it for this to work properly.
 5) Something else?

(4) is both the most disruptive and most "correct" solution, IMO. It could be made a bit less painful by leaving the "holdings_xml" include option as-is and adding a new include option that differs in shape. Perhaps "title_copies", since it would be nominally attached to the "title" (bre).

Thoughts?

Shae (shae-esilibrary) wrote :

I'm all for doing something correctly so if #4 is the most correct option,
that would be my vote. If there's an interim solution that's less
disruptive, I'm sure customers would also appreciate this until the more
complete solution can be put in place. My two cents.

--
Shae Tetterton

Mike Rylander (mrylander) wrote :

Any of the options I listed would be a non-trivial effort, but some would be more work than others.

Also, while (1) would be the least developer-time effort, it would also fail to maintain the overall ordering intended by the backend function. That's not to say that the ordering is being honored now (it's not guaranteed to do so today, per the code), but ordering would be made either systemically incorrect, or randomly less correct, all depending on the details of when copies were created and other variables. But duplicates could certainly be removed that way. IOW, trading one bug for another (or bigger) one.

(4) is the only real way I see to accomplish the intent of the original. We can make it not break existing templates by providing backward compatibility, though.

Dan Wells (dbw2) wrote :

I came here via the serials bug recently linked to this. Mike's right on with his diagnosis, but I think we are much closer to already having solution #2 than he realized. It appears copies are already re-sorted on a per-volume level in unapi.acn(), so ranked_volumes need only return volume information ranked according to the "best" copy on each volume, which I believe it already does.

With all that in place, the only thing left to do is actually return the owning_lib.name instead of the circ_lib.name, which is what everything above should expect anyway. Here is an experimental, lightly-tested branch:

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/dbwells/lp1315552_fix_duplicates_in_ranked_volumes

working/user/dbwells/lp1315552_fix_duplicates_in_ranked_volumes
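
As a rough illustration of the idea (not the code in the branch), the sketch below returns one row per volume, labelled with the owning library's name and ranked by the best copy on that volume. The data, the numeric rank values, and the rankVolumesByBestCopy name are all assumptions made for the example; lower rank is taken to mean a better copy.

```javascript
// Hypothetical data: lower rank means a "better" copy.
const volumes = [
  { id: 7, owningLib: "Lexington Main" },
  { id: 8, owningLib: "Oxford" },
];
const copies = [
  { id: 1, volume: 7, circLib: "Lexington Main", rank: 2 },
  { id: 2, volume: 7, circLib: "Oxford", rank: 4 },
  { id: 3, volume: 8, circLib: "Oxford", rank: 1 },
];

// One row per volume: owning lib name plus the best (lowest) copy rank.
function rankVolumesByBestCopy(volumeList, copyList) {
  return volumeList
    .map((v) => {
      const ranks = copyList.filter((c) => c.volume === v.id).map((c) => c.rank);
      return { id: v.id, name: v.owningLib, rank: Math.min(...ranks) };
    })
    .filter((row) => Number.isFinite(row.rank)) // drop volumes with no copies
    .sort((a, b) => a.rank - b.rank);
}

const ranked = rankVolumesByBestCopy(volumes, copies);
console.log(ranked.map((r) => r.id)); // [ 8, 7 ]
```

Because the row carries the owning library rather than a copy's circ lib, volume 7 appears once even though its copies circulate at two libraries.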

Thoughts?

Brent Mills (brent-5) wrote :

Dan,

I applied your patch on a test machine running our production data that has many of these duplicate results. From the initial searches on frequent offender titles (Harry Potter) the duplicates are indeed removed. Attaching example result screenshots.

Mike Rylander (mrylander) wrote :

Dan,

Your change absolutely does remove duplicates. I'm concerned, though, because the intent of the "more details" copy table is to show the "best" copies, as opposed to call numbers. The change as proposed (I think) will allow checked-out and otherwise "less good" copies to appear above better ones if they are on different call numbers. (More correctly, less-good copies could be included below more-good ones from the same call number and library, but above succeeding copies that are better but are on other call numbers.)

I'll note that what I describe is probably a more subtle issue than the duplicates, and may generally go unnoticed, but it's the kind of thing that, when it's noticed, generates questions that tend to be even harder to troubleshoot.

If we end up committing this, should we consider it a stop-gap until an inverted-relationship (or some other) solution is developed?

Thanks, Dan, for looking at this!

Dan Wells (dbw2) wrote :

I believe that "less-good copies could be included below more-good ones from the same call number and library, but above succeeding copies that are better but are on other call numbers" is currently true in releases (and perhaps has always been true?). In effect (for clarity), the ranking system will guarantee you see the "best" copy first, and depending on overall availability and copy/call-number ratio, you're pretty likely to see other "good" copies as well, but no guarantees.

I'm not discounting the more subtle goal of true best-copy order, but I agree that maybe we can consider it outside of the scope of this bug? Unless the proposed branch adds new problems, my vote would be that we solve the more immediate need ("stop-gap") and consider true best-copy ranking as an enhancement on a separate bug.

Thanks, Mike!

Dan Wells (dbw2) wrote :

P.S. Also, thanks, Brent, for testing and providing screenshots!

tags: added: needstest pullrequest
Kathy Lussier (klussier) wrote :

I haven't looked at the code, but I want to jump in on the best-copy order. I'm not sure the true best-copy ranking should be considered a separate enhancement bug since, from what I understand, there is a current ranking that will change as a result of this code.

Looking at http://git.evergreen-ils.org/?p=Evergreen.git;a=commit;h=2cce485ffc953176760b4972ffc7f2944661f4b6

The current sort behavior is intentional. I believe it has been adapted since that time, but there was a reason copy availability was part of that best copy order. It may appear to be subtle, but when you are working with public libraries that buy multiple copies of a bestseller, ignoring copy availability on the search results screen leads to a situation where all copies listed in the search results screen may show as unavailable, even though there is another copy available in the library. Those patrons may then skip over that title believing that it is not immediately available. It may not be a situation seen in academic libraries very often, but it is something we come across frequently in a consortium with many public libraries.

I understand that this bug needs to be fixed, but if you are breaking another library's feature to address this issue, then you are adding a new problem IMO. I worry that treating the fix for the new problem as an enhancement on a separate bug means that it will not be resolved any time soon. My preference is that we not create the new problem to begin with and see if there is a way to fix this bug without changing the current sort order.

Mike Rylander (mrylander) wrote :

I think you're right that it's possible today, and it's also entirely possible that I'm focusing on a minor part of the problem. I have been known to get stuck on a point when a name doesn't match its implied functionality... ;)

I'm happy with the code (it doesn't look like it will create any problem for existing callers in our code) and it is doubtful that 3rd party code is depending on the current functionality, so if others can confirm that they like the functionality and don't see issues at the user level, I'm happy to commit.

Just to clarify and add some background, the point of the function in question was, originally, to return call numbers that have copies that are in-scope while surfacing the in-scope org name and the call number info together, presumably for efficient display. This change hides the org on the copy that is used for the scope test, but that is OK for current callers as (I believe) they all currently pull in the copies under the call number for other purposes and can deduce the org(s) in question from there.

Thanks, Dan!

Dan Wells (dbw2) wrote :

Kathy, to address your concern, I just want to emphasize that the fix doesn't change the current sort order at all, so it doesn't break any of that intended behavior. It continues to display first the most available copy as it always has since that feature was implemented. From a usage perspective, it only removes the duplicates mentioned in the bug.

To illustrate, if we use letters to represent call numbers and numbers for copy rank, we currently get the following display in cases with mismatched owning/circ libs:

Q1
Q1
Q4
Q4
F2
F2
F5
F5
H3
H3

The duplicates listed above are not real copies, just display problems. The new code removes the dupes, keeping the same order:

Q1
Q4
F2
F5
H3

An idealized future would remove call number grouping to give us:

Q1
F2
H3
Q4
F5

The current code, both before and after the fix, orders by copy "rank" (the complex set of rules from the branch linked) but groups by call number. This means the best copy (the "1") is still always first, and in this case, the Qs come before the Fs come before the Hs. The more nuanced issue is how to handle Q4 and F5 (to shove them further down), but it is not (AFAICT) something the original code did.
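
The three listings above can be reproduced with a small sketch. It only models the orderings in this illustration (letter = call number, digit = copy rank); the helper names are hypothetical and none of this is the actual ranking code.

```javascript
// Copies from the illustration: letter = call number, digit = copy rank.
const copies = ["Q1", "Q4", "F2", "F5", "H3"].map((s) => ({
  callNumber: s[0],
  rank: Number(s.slice(1)),
}));
const label = (c) => `${c.callNumber}${c.rank}`;

// Grouped order: call numbers sorted by their best copy, copies by rank within each.
function groupedOrder(copyList) {
  const best = new Map();
  for (const c of copyList) {
    best.set(c.callNumber, Math.min(best.get(c.callNumber) ?? Infinity, c.rank));
  }
  return [...copyList].sort(
    (a, b) => best.get(a.callNumber) - best.get(b.callNumber) || a.rank - b.rank
  );
}

// Idealized order: ignore call number grouping and sort purely by copy rank.
const idealOrder = (copyList) => [...copyList].sort((a, b) => a.rank - b.rank);

const fixed = groupedOrder(copies).map(label);
const duplicated = fixed.flatMap((x) => [x, x]); // models the doubled display only
const ideal = idealOrder(copies).map(label);
console.log(duplicated.join(" ")); // Q1 Q1 Q4 Q4 F2 F2 F5 F5 H3 H3
console.log(fixed.join(" "));      // Q1 Q4 F2 F5 H3
console.log(ideal.join(" "));      // Q1 F2 H3 Q4 F5
```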

Michele Morgan (mmorgan) wrote :

I haven't actually tested this yet, but in looking at the proposed changes to the function, I agree that this fix appears only to remove the duplicate entries in the display, not change the sort in any way. I plan to give it a test shortly.

Mike Rylander (mrylander) wrote :

Thanks, Dan. It's very helpful to have that explained plainly. And Kathy, thanks for weighing in on the intent of the functionality point.

If the order labeled "idealized future" was the original intent, and Dan Scott may be able to illuminate this according to git, then a naive fix is to provide a copy-oriented view of the holdings. Proceeding under that assumption, I'm honestly still not sure if the age of the actual behavior turns this ticket into a bug or a feature request.

I'd like to hear more opinions.

Thanks again, both!

Kathy Lussier (klussier) wrote :

Thank you Dan for the follow-up explanation! I clearly need to look more closely at the feature before weighing in. I'll try to do so later this week. Thanks again!

Michele Morgan (mmorgan) wrote :

I tested the proposed changes and captured the before and after results for the same record. I also captured the Holdings Maintenance view showing where the circ lib and owning lib are different:

https://docs.google.com/a/noblenet.org/document/d/1c2o1Kjyt7CTuBbclB29udMBsytQr9C9IypHTeu72j5Y/edit?usp=sharing

In looking at the proposed changes, I couldn't see where the order of the copies would change, but looking at the screen captures, it looks to me to be sorted differently, not just deduped.

The copies listed first are equally available, though, so I'm not sure if this is really an issue.

I think it's important to fix the duplicate copy display issue. I would favor this as a bug fix with the "idealized future" being a separate Launchpad entry.

Dan Wells (dbw2) wrote :

I did overstate when I said the new branch didn't change the sort order "at all". What I meant to say is that it doesn't change the ranked order at all. In cases of equal ranking, the old code attempted to sort call numbers by circ_lib, and of course call numbers don't have circ_libs, and aren't guaranteed to have a single circ_lib within the attached copies, so this is at best confusing and at worst nonsensical.

The order of equal rank displayed copies *within* a given call number is currently random/arbitrary in both versions of the code. I believe it would be simple (and perhaps a good idea overall) to sort by circ_lib within each rank within each call number, if only to make the result list predictable in appearance.
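
A minimal sketch of what such a tie-breaker could look like, assuming each copy carries a numeric rank (lower is better) and a circ lib name. The compareCopies function and the final id tie-breaker are assumptions for the example, not part of the proposed branch.

```javascript
// Hypothetical copies on one call number; lower rank is better.
const copiesOnVolume = [
  { id: 3, rank: 1, circLib: "Oxford" },
  { id: 1, rank: 1, circLib: "Lexington Main" },
  { id: 2, rank: 2, circLib: "Lexington Main" },
];

// Sort by rank first, then by circ lib name, then by id so the result is
// fully deterministic even when rank and circ lib are both equal.
function compareCopies(a, b) {
  if (a.rank !== b.rank) return a.rank - b.rank;
  if (a.circLib !== b.circLib) return a.circLib < b.circLib ? -1 : 1;
  return a.id - b.id;
}

const ordered = [...copiesOnVolume].sort(compareCopies);
console.log(ordered.map((c) => c.id)); // [ 1, 3, 2 ]
```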

Kathy Lussier (klussier) wrote :

I haven't had a chance to look closely at the code, but I trust Michele's judgement in this arena and am willing to go with her assessment that it's important to fix the duplicate copy issue and to submit a separate LP entry on the "idealized future."

Mike Rylander (mrylander) wrote :

I've picked Dan's commit into master. Thanks, Dan!

Changed in evergreen:
status: Confirmed → Fix Committed
Dan Wells (dbw2) on 2016-08-24
Changed in evergreen:
milestone: none → 2.11-beta
Changed in evergreen:
status: Fix Committed → Fix Released