Bogus data can lead to null values, breaking indexing

Bug #1303940 reported by Mike Rylander
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Evergreen
Fix Released
High
Unassigned

Bug Description

In at least the case of browse indexing, some MARC contents can cause us problems when normalization (such as stripping all punctuation) can leave us with a NULL value for index-intended data. The branch (coming RSN) eventually pointed at by this bug avoids the specific case of browse data being null and causing and ingest failure.

Revision history for this message
Mike Rylander (mrylander) wrote :

Here's the branch. Jason Stephenson is working on a pgtap test case (thanks!) to avoid regression here.

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/miker/lp1303940-browse-hates-null

Revision history for this message
Dan Wells (dbw2) wrote :

I had encountered this in a slightly different context. Here is a relevant commit (from a branch which was abandoned for other reasons, so ignore the seed data stuffs):

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=commitdiff;h=97a3ab44152c2fbe6704f3f5fdfd167621dd3d5f

My fix is very similar to Mike's (I had not accounted for the sort_value emptiness). This make me wonder, though, is there a reason we don't abort earlier in the case of a NULL ind_data.value? I don't think we need to go ahead with the facet/search inserts when 'value' is NULL, even if they might allow it.

In short, I think we should CONTINUE on a NULL 'value' right at the top, then CONTINUE on a NULL 'sort_value' where Mike put his code. Thoughts?

Revision history for this message
Jason Stephenson (jstephenson) wrote :

I added a pg tap test at http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/collab/dyrcona/lp1303940-browse-hates-null

Unfortunately, I didn't sign off on Mike's commit before pushing to the above.

Galen Charlton (gmc)
Changed in evergreen:
assignee: nobody → Galen Charlton (gmc)
status: New → Confirmed
Revision history for this message
Galen Charlton (gmc) wrote :

I've pushed to master Mike's patch (and took the liberty of adding Jason's signoff), Jason's test case, and a further patch implementing Dan's suggestion.

Changed in evergreen:
status: Confirmed → Fix Committed
assignee: Galen Charlton (gmc) → nobody
Changed in evergreen:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.