Added Content: Allow fake ISBNs to be sent to providers

Bug #1559281 reported by Josh Stompro
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Evergreen
Fix Released
Medium
Unassigned
2.8
Won't Fix
Medium
Unassigned
2.9
Fix Released
Medium
Unassigned

Bug Description

This is related to bug 1549393.

Currently the added content module uses Business::ISBN to validate ISBN's before they are sent to the content providers. It filters out the unofficial ISBN's that Baker & Taylor use for their DVDs. Those ISBN's are rejected because they have an unknown group/country code since they were not actually issued by Bowker.

But Content Cafe will happily return results for those ISBN's since Baker & Taylor makes sure they are registered for the titles they sell. I don't know if other providers have those ISBN's registered as well.

So it would be nice if the added content module allowed ISBN's with invalid country/group codes through. So lookups would work for items that only have an unofficial ISBN listed.

Thanks
Josh

Revision history for this message
Jeff Godin (jgodin) wrote :

Josh-

There might be other good reasons for relaxing our ISBN validation, such as not needing to worry about keeping our local (or packaged) copy of RangeMessage.xml up-to-date.

Can you provide any additional background or sample data?

Does B&T document their practice anywhere, either in Content Cafe documentation or elsewhere?

Can you provide some sample "fake ISBN" values and their expected pairings with titles?

Thanks,

-jeff

Revision history for this message
Josh Stompro (u-launchpad-stompro-org) wrote :

Hello Jeff, Thanks for your feedback.

I haven't seen any documentation from Baker and Taylor on this, I'm just going off of deduction.

1. ISBN's are not issued for entertainment DVDs. https://www.myidentifiers.com/help/isbn

2. Baker & Talor supplies an ISBN for every entertainment DVD they sell, for ease of integration with local acquisitions systems that may only support ISBN identifiers. See one of their marketing brochures, which has a bunch of the fake ISBN's for DVD and Blue Ray titles. - http://www.btol.com/home_whatshot_details.cfm?sideMenu=Featured%20CDs%20and%20DVDs&home=home_whatshot_details.cfm

3. The ISBN's that B&T supplies have a Group/Country code that isn't registered. 631 is the code that I've noticed. It looks like 621 is the latest one officially registered to the Philippines - https://en.wikipedia.org/wiki/List_of_ISBN_identifier_groups

4. Content Cafe (a Baker & Taylor Service) will return cover art for these ISBN's. I don't know if other added content providers also included these ISBN's. That would be good to know.

Examples:
DVD "Black Mass" - 9786316294241 6316294247 - https://www.worldcat.org/title/black-mass/oclc/933728185
DVD "Bridge of Spies" - 9786316271976 6316271972 - https://www.worldcat.org/title/bridge-of-spies/oclc/933729520
DVD "Alvin and the Chipmunks. The road chip" - 9786316364036 - http://www.btol.com/home_whatshot_details.cfm?sideMenu=Featured%20CDs%20and%20DVDs&home=home_whatshot_details.cfm
DVD "Spectre" - 9786316334886
DVD "Southerner" - 9786316321183
DVD "Spotlight" - 9786316319401
DVD "Steve Jobs" - 9786316291431

Revision history for this message
Thomas Berezansky (tsbere) wrote :

I don't know if I like this solution, but it likely works fine for most B&T ISBNs:

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/tsbere/lp1559281_bt_isbns

Revision history for this message
Galen Charlton (gmc) wrote :

On the assumption that B&T's added content service is likely the only one that will recognize B&T's fake ISBNs, a variant approach would be to move identifier validation to a routine that could optionally be overridden by a specific handler, then incorporate Thomas' approach (+ use of local) into just the B&T handler.

Revision history for this message
Galen Charlton (gmc) wrote :

Yet another approach: fix Business::ISBN so that it can actually be used to just do checksum validation of ISBNs (and omit country/publisher code validation), then add an option to the added content configuration to specify the degree of validation required.

Revision history for this message
Thomas Berezansky (tsbere) wrote :

I suppose a good question before we go any further is "Does anything other than Content Cafe recognize the B&T ISBNs?" but I am unsure about how to best go about testing that right now.

Revision history for this message
Rogan Hamby (rogan-hamby) wrote : Re: [Bug 1559281] Re: Added Content: Allow fake ISBNs to be sent to providers

Addressing Thomas' question, I tried several of the sample ones that Josh
supplied and Syndetics does not recognize them.

On Mon, Mar 21, 2016 at 1:22 PM, Thomas Berezansky <
<email address hidden>> wrote:

> I suppose a good question before we go any further is "Does anything
> other than Content Cafe recognize the B&T ISBNs?" but I am unsure about
> how to best go about testing that right now.
>
> --
> You received this bug notification because you are subscribed to
> Evergreen.
> Matching subscriptions: evergreenbugs
> https://bugs.launchpad.net/bugs/1559281
>
> Title:
> Added Content: Allow fake ISBNs to be sent to providers
>
> Status in Evergreen:
> New
>
> Bug description:
> This is related to bug 1549393.
>
> Currently the added content module uses Business::ISBN to validate
> ISBN's before they are sent to the content providers. It filters out
> the unofficial ISBN's that Baker & Taylor use for their DVDs. Those
> ISBN's are rejected because they have an unknown group/country code
> since they were not actually issued by Bowker.
>
> But Content Cafe will happily return results for those ISBN's since
> Baker & Taylor makes sure they are registered for the titles they
> sell. I don't know if other providers have those ISBN's registered as
> well.
>
> So it would be nice if the added content module allowed ISBN's with
> invalid country/group codes through. So lookups would work for items
> that only have an unofficial ISBN listed.
>
> Thanks
> Josh
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/evergreen/+bug/1559281/+subscriptions
>

Revision history for this message
Josh Stompro (u-launchpad-stompro-org) wrote :

It looks to me like Business:ISBN never loads the current checksum digit at initialization of the object if any of the checks that come before the checksum check fail. So because the group code check fails, the current checksum digit is never loaded. So is_valid_checksum will never return true, since it tries to compare the stored check digit with the computed check digit. I open a ticket with the mod author to see if this is the intended behavior, and to see if the docs can clearly state that.

It looks like fix_checkdigit might work instead, it grabs the current last character and computes the check digits, returns false if the check digit is correct, and true if it needed to fix the check_digit which implies that it was wrong. I believe that is_valid_checksum would work after running fix_checkdigit, but it would always be true.

It feels wrong to just add the B&T group code as a valid group code, but it is the simplest solution. I'm assuming that is the only group code that they are using, but I don't know that for sure. I'll try to contact B&T and find out if they use any other codes. I like the idea of only doing that for Content Cafe, and not subjecting all providers to that work around. +1 to that.

Josh

Revision history for this message
brian d foy (brian-d-foy) wrote :

I'm the maintainer of the Business::ISBN module. I think there are some easy ways to approach this problem (much of which repeats Josh):

1. Insert the unassigned-yet-used group codes into the group ranges data structure. You then verify the ISBNs as normal and still catch bad group codes. You can do this by making a local version of Business::ISBN::Data, or a new namespace that replaces the global variable that Business::ISBN::Data inserts. This way you still catch bad group codes.

2. The private _checksum method returns the good checksum even for bad ISBNs. I think that is stable enough to be safe for everyday use. It's what fix_checksum uses.

3. The fix_checksum method will give you the checksum it thinks it should be, even if the ISBN is bad. You know what the current checksum is because you know what the ISBN string is. If the ISBN fails because it's a bad group code, and the group code is one that you want to accept, you have the checksums and can decide which one to use.

I don't think it would be in the best interests of most users to nerf the module to allow bad isbns. If there's something you need in the module to make this easier without relaxing the safeguards, I'll look at pull requests.

I don't monitor this place, so if you want my attention say something on the GitHub issue for this: https://github.com/briandfoy/business-isbn/issues/6

Revision history for this message
brian d foy (brian-d-foy) wrote :

I played around with this for a bit. It's pretty easy:

$Business::ISBN::country_data{ '631' } = [
 'Baker and Taylor',
 [ '0' => '9' ],
 ];

I don't know what the publisher codes should be.

I've made a complete example script using the test cases from this thread: https://github.com/briandfoy/business-isbn/blob/master/examples/private_group_code.pl

Revision history for this message
Galen Charlton (gmc) wrote :

> I played around with this for a bit. It's pretty easy:
>
> $Business::ISBN::country_data{ '631' } = [
> 'Baker and Taylor',
> [ '0' => '9' ],
> ];

Do you expect the data structure to remain stable for the foreseeable future?

Revision history for this message
Rogan Hamby (rogan-hamby) wrote :

I like that solution a lot better than opening the gate to bad ISBNs.

On Mon, Mar 28, 2016 at 12:02 AM, brian d foy <email address hidden> wrote:

> I played around with this for a bit. It's pretty easy:
>
> $Business::ISBN::country_data{ '631' } = [
> 'Baker and Taylor',
> [ '0' => '9' ],
> ];
>
> I don't know what the publisher codes should be.
>
> I've made a complete example script using the test cases from this
> thread: https://github.com/briandfoy/business-
> isbn/blob/master/examples/private_group_code.pl
>
> --
> You received this bug notification because you are subscribed to
> Evergreen.
> Matching subscriptions: evergreenbugs
> https://bugs.launchpad.net/bugs/1559281
>
> Title:
> Added Content: Allow fake ISBNs to be sent to providers
>
> Status in Evergreen:
> New
>
> Bug description:
> This is related to bug 1549393.
>
> Currently the added content module uses Business::ISBN to validate
> ISBN's before they are sent to the content providers. It filters out
> the unofficial ISBN's that Baker & Taylor use for their DVDs. Those
> ISBN's are rejected because they have an unknown group/country code
> since they were not actually issued by Bowker.
>
> But Content Cafe will happily return results for those ISBN's since
> Baker & Taylor makes sure they are registered for the titles they
> sell. I don't know if other providers have those ISBN's registered as
> well.
>
> So it would be nice if the added content module allowed ISBN's with
> invalid country/group codes through. So lookups would work for items
> that only have an unofficial ISBN listed.
>
> Thanks
> Josh
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/evergreen/+bug/1559281/+subscriptions
>

Revision history for this message
Thomas Berezansky (tsbere) wrote :

Here is my previous solution but thrown into the Content Cafe module instead of the main module:

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/tsbere/lp1559281_bt_isbns_cc

Revision history for this message
Josh Stompro (u-launchpad-stompro-org) wrote :

I checked in with Baker&Taylor support and they reported that they currently use group codes 630 and 631 for DVD, Blu-Ray and CD products that do not have supplier generated ISBNs. So 630 should also be added to the group code exemption.

I can add that and test this out with our live system and provide a sign off branch.

Josh

Changed in evergreen:
assignee: nobody → Josh Stompro (u-launchpad-stompro-org)
Revision history for this message
Josh Stompro (u-launchpad-stompro-org) wrote :

Thanks everyone, this worked without any problems when I tried it out.

Signoff branch at user/stompro/lp1559281_bt_allow_fake_isbn

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/stompro/lp1559281_bt_allow_fake_isbn

Josh

Changed in evergreen:
assignee: Josh Stompro (u-launchpad-stompro-org) → nobody
tags: added: pullrequest
Galen Charlton (gmc)
Changed in evergreen:
milestone: none → 2.10.2
no longer affects: evergreen/2.10
Changed in evergreen:
importance: Undecided → Medium
Revision history for this message
Jason Stephenson (jstephenson) wrote :

Can someone explain why this is considered a bug and not a new feature?

I've thought this was a new feature from the get go.

Revision history for this message
Galen Charlton (gmc) wrote :

I grant that it is borderline, but here's my reasoning: a cover images provider that supplies MARC records with the identifier required for an ILS to retrieve the images... has Evergreen refusing to hand it over, when an implementation that did not attempt to impose an ISBN validation requirement would have just worked. The patch itself is self-contained and present little backport risk.

Changed in evergreen:
milestone: 2.10.2 → 2.10.3
Changed in evergreen:
milestone: 2.10.3 → 2.10.4
Changed in evergreen:
milestone: 2.10.4 → 2.10.5
Changed in evergreen:
milestone: 2.10.5 → 2.10.6
Revision history for this message
Mike Rylander (mrylander) wrote :

Picked into 2.9-master. Thanks, Thomas and Josh!

Changed in evergreen:
status: New → Fix Committed
Changed in evergreen:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.