We should expand the effect of the fix for 1308090 and remove more ISBD punctuation

Bug #1864507 reported by Mike Rylander
This bug affects 1 person
Affects: Evergreen
Status: Fix Released
Importance: Wishlist
Assigned to: Unassigned

Bug Description

In bug 1308090 Dan Pearl created a function to strip some dangling ISBD punctuation from author-ish fields. We need to improve and expand that to protect some trailing punctuation and to cover some more ISBD punctuation such as dangling "/" and ":" on title-ish fields.

Branch coming soon...

Mike Rylander (mrylander) wrote :
Changed in evergreen:
assignee: Mike Rylander (mrylander) → nobody
tags: added: pullrequest
Jeff Davis (jdavis-sitka) wrote :

I can confirm that the updated trim_trailing_punctuation function works as advertised. I wonder if we should add ";" to the list of trailing characters to be trimmed?

Does anyone have a good test case for replicating the original problem? It seems like we might need a MARC-based config.metabib_field browse field for that; the standard ones are based on the MODS representation of the record, which appears not to contain trailing ISBD punctuation. For example, MARC 245$a = "The fellowship of the ring /" translates into MODS32 titleBrowse = "The fellowship of the ring" and the latter is the basis for the title browse entry.

Mike Rylander (mrylander) wrote :

I need better gmail filters -- I missed your update, Jeff, sorry!

If there's consensus, I have no problem adding ";" to the list of trailing punctuation. "=" looks like it might likewise show up, based on the LoC docs for MARC 245.

Changed in evergreen:
milestone: 3.5-beta → 3.5.0
Changed in evergreen:
milestone: 3.5.0 → 3.5.1
Jane Sandberg (sandbergja) wrote :

+1 to removing ";" and "="

Changed in evergreen:
milestone: 3.5.1 → 3.5.2
Mike Rylander (mrylander) wrote :

Hi folks,

I've force-pushed an update to the above branch that adds the equals sign and semicolon to the list of trailing punctuation we now remove.
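
For anyone following along, the behavior discussed in this thread can be sketched roughly as follows. This is an illustrative Python sketch only, not the Evergreen trim_trailing_punctuation function itself. The name trim_dangling_isbd, the rule of stripping a single trailing "/", ":", ";" or "=", and the guard against emptying the value are all assumptions made for illustration.

```python
# Hypothetical sketch; NOT the Evergreen implementation.
# Assumption: the dangling ISBD separators of interest are / : ; =
DANGLING_ISBD = frozenset("/:;=")


def trim_dangling_isbd(value: str) -> str:
    """Remove one dangling ISBD separator from the end of a title-ish string."""
    trimmed = value.rstrip()
    if trimmed and trimmed[-1] in DANGLING_ISBD:
        # Drop the separator and any whitespace that preceded it.
        candidate = trimmed[:-1].rstrip()
        # Assumption: never reduce a value to the empty string.
        if candidate:
            return candidate
    return trimmed


if __name__ == "__main__":
    samples = [
        "The fellowship of the ring /",
        "Olivia counts =",
        "Action painting :",
        "Proceedings at symposium;",
    ]
    for title in samples:
        print(repr(trim_dangling_isbd(title)))
```

Other trailing punctuation (for example a final period) is left alone in this sketch; the real function's rules for protected punctuation may differ.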

Jennifer Weston (jweston) wrote :

Tested on https://tiffany-master.gapines.org/eg/staff

Loaded the following records as examples of each trailing punctuation to be preserved:
TCN 256
245 10$aOlivia counts =

TCN 257
245 10$aGuidebook to Zen and the art of motorcycle maintenance /

TCN 258
245 00$aProceedings at symposium;

TCN 259
245 00$aAction painting :

Punctuation is successfully preserved upon import.

Changed in evergreen:
milestone: 3.5.2 → 3.6.1
Changed in evergreen:
milestone: 3.6.1 → 3.6.2
Changed in evergreen:
assignee: nobody → Jennifer Weston (jweston)
Jennifer Weston (jweston) wrote :

Tested during Feedback Fest on https://tiffany-master.gapines.org/eg/staff

Noting that this bug adds the following punctuation to the list to be preserved in title-ish fields: / : ; =

Loaded the following records as examples of each trailing punctuation to be preserved:

TCN 254
245 10 ‡aGoodnight moon 123 :

TCN 255
245 10 ‡aGoodnight moon = ‡bPw zoo hli /

TCN 256
245 00 ‡aFiddler on the roof; ‡b South Pacific; The sound of music

Punctuation is successfully preserved upon import.

I have tested this code and consent to signing off on it with my name, Jennifer Weston, and my email address, <email address hidden>

tags: added: signedoff
Changed in evergreen:
assignee: Jennifer Weston (jweston) → nobody
Changed in evergreen:
milestone: 3.6.2 → 3.6.3
Changed in evergreen:
milestone: 3.6.3 → 3.6.4
Changed in evergreen:
milestone: 3.6.4 → 3.7.2
status: New → Confirmed
tags: added: opac-browse
removed: browse
no longer affects: evergreen/3.5
Changed in evergreen:
milestone: 3.7.2 → 3.7.3
no longer affects: evergreen/3.6
Changed in evergreen:
milestone: 3.7.3 → none
Changed in evergreen:
milestone: none → 3.10-beta
Changed in evergreen:
importance: Undecided → Wishlist
Jane Sandberg (sandbergja) wrote :

Merged for inclusion in 3.10. Thanks, Mike and Jennifer!

I added some release notes based on my understanding of the effects of this patch -- would you have a moment to check those, Mike?

Changed in evergreen:
status: Confirmed → Fix Released
status: Fix Released → Fix Committed
Galen Charlton (gmc) wrote :

Pushed a fix to the seed data and schema update scripts; a syntax error would have broken the schema update and left some indexes without the new normalizer in a new database.

Changed in evergreen:
status: Fix Committed → Fix Released