presence of relator terms or codes in $e or $4 can prevent authority headings linking

Bug #1465830 reported by Galen Charlton on 2015-06-16
This bug affects 3 people
Affects: Evergreen | Importance: High | Assigned to: Unassigned
Affects: 2.7 | Importance: High | Assigned to: Unassigned

Bug Description

Suppose an authority record with the following heading is present in the database:

100 $a Doe, Jane $d 1945-

and suppose a set of bib records that contain the following headings are loaded:

100 $a Doe, Jane $d 1945- $4 edt

and

700 $a Doe, Jane $d 1945- $e editor

If the authority_control_fields.pl script is run on those bibs, the Jane Doe headings are *not* linked to the authority.

Expected behavior: relator terms and codes (e.g., subfields $4 and $e in name headings) should be ignored when doing automatic linking of bib headings to authority headings.
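
To make the expected behavior concrete, here is a minimal Python sketch of heading comparison that skips relator subfields. It is not the code in authority_control_fields.pl (which is Perl); the names match_key, links, and RELATOR_SUBFIELDS are hypothetical, and headings are modeled simply as lists of (subfield code, value) pairs.

```python
# Hypothetical illustration only; not the logic of authority_control_fields.pl.
RELATOR_SUBFIELDS = {"e", "4"}  # $e relator term, $4 relator code


def match_key(subfields, ignored=RELATOR_SUBFIELDS):
    """Build a comparison key from (code, value) pairs, skipping ignored codes."""
    return tuple(
        (code, value.strip())
        for code, value in subfields
        if code not in ignored
    )


def links(bib_heading, auth_heading, ignored=RELATOR_SUBFIELDS):
    """Return True when the two headings match once ignored subfields are dropped."""
    return match_key(bib_heading, ignored) == match_key(auth_heading, ignored)


# The headings from the description above.
authority = [("a", "Doe, Jane"), ("d", "1945-")]
bib_100 = [("a", "Doe, Jane"), ("d", "1945-"), ("4", "edt")]
bib_700 = [("a", "Doe, Jane"), ("d", "1945-"), ("e", "editor")]
```

Under these assumptions, both bib headings match the authority when $e and $4 are ignored, while passing ignored=set() reproduces the reported failure to link.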

Evergreen master

Galen Charlton (gmc) on 2015-06-16
tags: added: authority cataloging
Yamil (ysuarez) wrote :

I started putting together a working branch with new values for authority_control_fields.pl.in and the authority.control_set_authority_field table, but I am not done.

Here is what I have so far...

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/ysuarez/berklee_proposed_authority_config_changes

Galen Charlton (gmc) wrote :

Yamil: indeed, the part of your patch that takes out references to $4 and $e matches what I had in mind as a fix for this bug as described.

Galen Charlton (gmc) wrote :

I consulted the folks on the Evergreen catalogers list and received confirmation that $4 and $e in name headings should be ignored when linking records:

http://list.evergreen-ils.org/pipermail/evergreen-catalogers/2015-June/000540.html

Yamil (ysuarez) wrote :

So my original plan was to address $4 and $e, but also a bunch of other subfields that should be deleted from or added to authority_control_fields.pl.in and the authority.control_set_authority_field DB table. Should I split my work and submit just the $4 and $e changes, or should I wrap up and push the changes to the seed values of the authority.control_set_authority_field DB table?

Note 1: The seed values of the authority.control_set_authority_field DB table include $e too. I will soon be writing out new seed values. In addition, there are some new subfields that need to be added to the authority.control_set_authority_field DB table.

Note 2: Last night I made some more edits to my working branch; I had to update some additional values in authority_control_fields.pl.in.

Galen Charlton (gmc) wrote :

I think it would be most expedient if you split the patch and opened a new bug for the rest of it; I can test and review the $e/$4 stuff right quick.

Yamil (ysuarez) wrote :

@Galen: I will create a new branch with that fix only, will try to do it today.

I will remove subfields $4 and $e from the following bib tag definitions in authority_control_fields.pl.in:

100, 110, 600, 610, 700, 710

Yamil (ysuarez) on 2015-06-19
tags: added: pullrequest
Galen Charlton (gmc) on 2015-06-19
Changed in evergreen:
status: New → Confirmed
Galen Charlton (gmc) on 2015-06-19
Changed in evergreen:
importance: Undecided → Medium
milestone: none → 2.8.3
Galen Charlton (gmc) wrote :

Pushed to master, rel_2_8, and rel_2_7. Thanks, Yamil!

Changed in evergreen:
status: Confirmed → Fix Committed
Dan Wells (dbw2) wrote :

I think this change might be incomplete in a dangerous way. If we no longer look at $e for linking, but we leave $e in authority.control_set_authority_field rows for tag 100 (etc.), then the $e will be removed during future authority propagation. I think this is part of what Yamil was saying in note #1 of comment #4.

I could be wrong, but this looked true in some quick testing. Also, if you test, you will hit bug #712490 as well. I am reopening this bug mostly for expediency, but if someone thinks it deserves a new bug, feel free.
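
The concern can be illustrated with a simplified Python model of propagation. This is a sketch under stated assumptions, not Evergreen's actual propagation code: the function propagate and the idea that propagation replaces every controlled subfield with the authority's values are a deliberately simplified reading of the behavior described above, and the updated $d value is made up for the example.

```python
# Simplified, hypothetical model of authority propagation; not Evergreen code.
def propagate(bib_subfields, auth_subfields, controlled):
    """Replace the controlled subfields of a bib heading with the authority's.

    Bib subfields whose code is in `controlled` are discarded and replaced by
    the authority's subfields with those codes; all other bib subfields stay.
    """
    kept = [(c, v) for c, v in bib_subfields if c not in controlled]
    from_auth = [(c, v) for c, v in auth_subfields if c in controlled]
    return from_auth + kept


bib_700 = [("a", "Doe, Jane"), ("d", "1945-"), ("e", "editor")]
# Assumed edit to the authority record that triggers propagation.
updated_authority = [("a", "Doe, Jane"), ("d", "1945-2015")]

# $e still listed as controlled: the authority has no $e, so it is lost.
with_e = propagate(bib_700, updated_authority, controlled={"a", "d", "e"})

# $e no longer controlled: the relator term survives the update.
without_e = propagate(bib_700, updated_authority, controlled={"a", "d"})
```

In this model, with_e ends up without the "editor" relator term, while without_e keeps it, which is why the controlled-subfield rows would need to change along with the linking script.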

Changed in evergreen:
status: Fix Committed → Confirmed
importance: Medium → High
Galen Charlton (gmc) wrote :

I think this warrants a new bug. I'll close this one after you open it.

Yamil (ysuarez) wrote :

I can create the new bug to get the ball rolling. How about something like...

"...

title: authority data may be deleted during propagation with current values of authority.control_set_authority_field

Since subfield $e in bib tags 100/110/600/610/700/710 is no longer used for authority linking through authority_control_fields.pl.in, but we left $e in the authority.control_set_authority_field rows for tag 100 (etc.), the $e will be removed on the bib side during future authority propagation. That is not a desired outcome.

New values need to be set in Open-ILS/src/sql/Pg/950.data.seed-values.sql, and an upgrade script and a pgTAP test are needed as well.

..."

Feel free to take my draft bug and edit it yourself or suggest edits to me.

BTW, I also started looking at the changes that have to be made to Open-ILS/src/sql/Pg/950.data.seed-values.sql, but I will be submitting some questions to the devs about the changes.

Thanks, Yamil

Yamil (ysuarez) wrote :

I just created a new bug for the behavior that Dan noticed in comment #9

LP 1465830: authority data may be deleted during propagation with current values of authority.control_set_authority_field
https://bugs.launchpad.net/bugs/1465830

Yamil (ysuarez) wrote :

My bad, I linked the wrong bug and gave the wrong bug number above in comment #12

LP 1484281: authority data may be deleted during propagation with current values of authority.control_set_authority_field

https://bugs.launchpad.net/evergreen/+bug/1484281

Jason Stephenson (jstephenson) wrote :

Setting to fix released per Galen's comment in #10.

I am presently looking at the fix for the bug Yamil opened.

Changed in evergreen:
status: Confirmed → Fix Released