biblio fingerprint should distinguish between elements contributing to the fingerprint
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Evergreen | Fix Released | Medium | Unassigned | |
Bug Description
Consider the movie "Blue steel" and the book "Blue" by Danielle Steel. With typical MARC cataloging of these titles and the default seed data for config.biblio_fingerprint, the same bib fingerprint will be generated: "bluesteel".
This has the effect of putting both titles on the same metarecord, which would lead to confusing results when doing metarecord searches or using advanced hold options in the public catalog.
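To make the collision concrete, here is a minimal sketch of how it can arise. It is a simplified model, not Evergreen's actual fingerprint code: the function names are hypothetical, and it assumes the fingerprint is the normalized title followed directly by the normalized first word of the author heading.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def naive_fingerprint(title: str, author: str = "") -> str:
    """Hypothetical model: normalized title + first word of the author."""
    words = author.split()
    first_author_word = words[0] if words else ""
    # The two parts are joined with nothing in between, so the boundary
    # between title and author is lost.
    return normalize(title) + normalize(first_author_word)


# The movie has no individual author; the book has an author heading.
movie = naive_fingerprint("Blue steel")
book = naive_fingerprint("Blue", "Steel, Danielle")

print(movie, book)  # bluesteel bluesteel
```

Under these assumptions both records reduce to the same string, so they would land on the same metarecord.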
Some ways that this problem could be addressed:
- change the stock author component of the fingerprint so that it includes all words, not just the first word. This will reduce the chance of mismatches, but can also result in cases where minor differences in how an author's given names are catalogued keep records for the same work from being grouped together
- adjust the fingerprint so that a special separator character is used to distinguish between fields contributing to the fingerprint. That special character could be as simple as a space, e.g. "blue steel" would mean title=blue, author=steel, as opposed to "bluesteel" (title normalizes to "bluesteel", no individual contributor is cataloged).
- solve the general problem of assigning work identifiers
Option 3 is... ambitious. I personally have a slight preference for option 2, but option 1 should be considered as well.
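A rough sketch of how options 1 and 2 would behave, using the same simplified model of the fingerprint (normalized title plus author component). All function names and the variant author heading "Steel, D." are hypothetical and only for illustration; this is not the committed fix.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def separated_fingerprint(title: str, author: str = "", sep: str = " ") -> str:
    """Option 2 (sketch): join the non-empty components with a separator."""
    words = author.split()
    first_author_word = words[0] if words else ""
    parts = [normalize(title), normalize(first_author_word)]
    return sep.join(part for part in parts if part)


def full_author_fingerprint(title: str, author: str = "") -> str:
    """Option 1 (sketch): use every word of the author, not just the first."""
    return normalize(title) + normalize(author)


# Option 2: the separator keeps the field boundary visible.
movie = separated_fingerprint("Blue steel")             # "bluesteel"
book = separated_fingerprint("Blue", "Steel, Danielle")  # "blue steel"

# Option 1: the same book cataloged with two hypothetical heading variants
# no longer shares a fingerprint.
variant_a = full_author_fingerprint("Blue", "Steel, Danielle")  # "bluesteeldanielle"
variant_b = full_author_fingerprint("Blue", "Steel, D.")        # "bluesteeld"
```

Because normalization strips spaces from each component, a space can only appear in the output as the separator, which is why it is enough to tell "blue steel" (title plus author) from "bluesteel" (title only). The option 1 variants show the trade-off: the movie and the book are separated, but so are two records for the same book whose given names were entered differently.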
description: | updated |
Changed in evergreen: | |
assignee: | nobody → Kathy Lussier (klussier) |
tags: | added: fingerprint metarecords |
Changed in evergreen: | |
assignee: | Galen Charlton (gmc) → nobody |
milestone: | 2.next → 2.12-beta |
Changed in evergreen: | |
assignee: | nobody → Mike Rylander (mrylander) |
Changed in evergreen: | |
status: | Fix Committed → Fix Released |
I like option 2. It would address the underlying issue the majority (if
not all) of the times I have seen it.
On Wed, Dec 23, 2015 at 3:54 PM, Galen Charlton <email address hidden> wrote:
> ** Description changed:
>
> Consider the movie "Blue steel" and the book "Blue" by Danielle Steel.
> With typical MARC cataloging of these titles and the default seed data
> - for config.biblio_fingerprint, the same bib finger print will be
> + for config.biblio_fingerprint, the same bib fingerprint will be
> generated: "bluesteel".
>
> This has the effect of putting both titles on the same metarecord, which
> would lead to confusing results when doing metarecord searches or using
> advanced hold options in the public catalog.
>
> Some ways that this problem could be addressed:
>
> - - change the stock author component of the fingerprint so that it
> includes all words, not just the first word. This will reduce the chance
> of mismatches, but can also result in cases where minor differences in how
> an author's given names are catalogued
> + - change the stock author component of the fingerprint so that it
> includes all words, not just the first word. This will reduce the chance
> of mismatches, but can also result in cases where minor differences in how
> an author's given names are catalogued
> - adjust the fingerprint so that a special separator character is used
> to distinguish between fields contributing to the fingerprint. That
> special character could be as simple as a space, e.g. "blue steel" would
> mean title=blue, author=steel, as opposed to "bluesteel" (title normalizes
> to "bluesteel", no individual contributor is cataloged).
> - solve the general problem of assigning work identifiers
>
> Option 3 is... ambitious. I personally have a slight preference for
> option 2, but option 1 should be considered as well.
>
> --
> You received this bug notification because you are subscribed to
> Evergreen.
> Matching subscriptions: evergreenbugs
> https://bugs.launchpad.net/bugs/1528901
>
> Title:
> biblio fingerprint should distinguish between elements contributing to
> the fingerprint
>
> Status in Evergreen:
> New