Use all subfield values to link authority records to bibs

Bug #1245944 reported by Mike Rylander on 2013-10-29
Affects     Importance  Assigned to
Evergreen   Critical    Unassigned
2.3         Undecided   Unassigned
2.4         Undecided   Unassigned

Bug Description

Given an Evergreen instance with two authority records loaded, one more specific than the other by virtue of a repeated subdivision subfield, we must use all of the bib-supplied subfield values when attempting to auto-link to the correct authority. Otherwise, the "shorter" authority record may be selected as the match, and data in the bib record would be lost.
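To make the failure mode concrete, here is a minimal Python sketch. It is not the linking script's code: headings are modeled as lists of (subfield code, value) pairs, and both function names are hypothetical. The "loose" matcher is only an assumed illustration of how a comparison that drops repeated subfield values could end up accepting the shorter authority.

```python
from collections import Counter


def matches_all_subfields(bib_subfields, auth_subfields):
    """Strict check: every (code, value) pair in the bib heading, including
    repeated codes, must be present in the authority heading, and vice versa.
    Subfield order is ignored."""
    return Counter(bib_subfields) == Counter(auth_subfields)


def matches_first_of_each_code(bib_subfields, auth_subfields):
    """Loose check, shown for contrast only (hypothetical): keeps just the
    first value seen for each subfield code in the bib heading."""
    first_values = {}
    for code, value in bib_subfields:
        first_values.setdefault(code, value)
    return sorted(first_values.items()) == sorted(auth_subfields)


# Illustrative data: the bib heading repeats the $x subdivision.
bib = [("a", "Art"), ("x", "History"), ("x", "20th century")]
short_auth = [("a", "Art"), ("x", "History")]
long_auth = [("a", "Art"), ("x", "History"), ("x", "20th century")]
```

With this data, the loose check accepts short_auth for the bib heading, while the strict check accepts only long_auth.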

I consider this pretty serious, as bib data is changed in a way that makes reverting particularly difficult, and the problem can go unnoticed until an authority ingest (forcing authority propagation) mangles a ton of data.

Additionally, if a previously linked "short" authority record has not yet asserted itself, and a re-run of the linking script does not find that previously linked record (it won't in the case described above), the script leaves the old $0 in place.

Here's a branch that (1) considers all subfield values when linking and (2) adds a --refresh flag to the authority linking script to strip target bib records of all $0 subfields before searching for a best match.
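The effect of stripping $0 before searching can be sketched as follows. This is an illustration in Python under stated assumptions, not the branch's implementation: a field is a list of (code, value) pairs, find_authority_id is a hypothetical lookup callback that returns a link value or None, and the "(EX)1" style link values are made up.

```python
def strip_control_numbers(subfields):
    """Return the field's subfields with every $0 removed."""
    return [(code, value) for code, value in subfields if code != "0"]


def relink(subfields, find_authority_id, refresh=False):
    """Re-link one controlled field.

    find_authority_id is a hypothetical callback: it receives the heading
    subfields (without $0) and returns a link value or None.
    """
    if refresh:
        # Refresh mode: drop any existing $0 before searching for a match.
        subfields = strip_control_numbers(subfields)
    heading = strip_control_numbers(subfields)
    auth_id = find_authority_id(heading)
    if auth_id is None:
        # No match: without refresh, a stale $0 is still present here.
        return subfields
    return heading + [("0", auth_id)]
```

Without refresh, a field whose old link no longer matches keeps its stale $0; with refresh=True the stale $0 is gone whether or not a new match is found.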

Top 2 commits of: http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/miker/link-using-all-subfield-values

Galen Charlton (gmc) wrote :

My commentary on the proposed branch upon eyeballing it: using all of the subfields in the bib field to look up a matching authority record is an improvement over the status quo.

It's still not perfect, though: the order of subfields in the bib and authority headings is ignored, and there is no check that the subject thesauri of the bib and authority headings match. Those concerns are long-standing, however, and reasonably the topic of a separate bug.
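For illustration only, a stricter comparison that addresses both remaining concerns might look like the Python sketch below. The function name, the (code, value) pair representation, and the plain-string thesaurus codes are all assumptions, not anything in the branch or in Evergreen.

```python
def same_heading_strict(bib_subfields, auth_subfields,
                        bib_thesaurus, auth_thesaurus):
    """Hypothetical stricter match: the thesauri must agree, and the
    (code, value) pairs must be equal in the same order."""
    return (bib_thesaurus == auth_thesaurus
            and list(bib_subfields) == list(auth_subfields))
```

Under this check, two headings with the same subdivisions in a different order, or from different thesauri, would not be treated as the same heading.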

Galen Charlton (gmc) wrote :

Mentioning a comment Mike made in a separate discussion -- a better approach in the long run is to base authority lookup on authority.simple_heading() or the like. Of course, that's tantamount to rewriting a good chunk of the linking script, so ... Captain! NEED MORE TUITS!

Mike Rylander (mrylander) wrote :

Heads up! I force-pushed a tiny bug fix for a think/type-o spotted by Dan Wells during testing. Pull again if you're planning to test, please and thank you.

Dan Wells (dbw2) on 2013-11-01
Changed in evergreen:
assignee: nobody → Dan Wells (dbw2)
Dan Wells (dbw2) wrote :

Tested with some help from Remington S., looks good. Pushed from master through rel_2_3. Thanks, Mike!

Changed in evergreen:
status: New → Fix Committed
assignee: Dan Wells (dbw2) → nobody
Dan Wells (dbw2) on 2013-11-11
Changed in evergreen:
status: Fix Committed → Fix Released