Duplicate bib records added

Bug #596099 reported by Steven Chan
This bug affects 1 person
Affects   | Status    | Importance | Assigned to | Milestone
Evergreen | Won't Fix | Medium     | Unassigned  |

Bug Description

System details:

Evergreen version 1.6.0
opensrf version 1.2.2
postgresql version 8.3
Linux Ubuntu Server 8.04

Problem description:

We are seeing indications that users are creating duplicate entries in the biblio.record_entry table. The following SQL query, run in a psql session, shows them:

evergreen=# select
    count(*)
    , creator
    , create_date
    , trim(both from fingerprint) as trimmed_fingerprint
from biblio.record_entry
where
    deleted = false
    and creator > 1
    and extract(year from create_date) = 2010
group by fingerprint, create_date, creator
having count(*) > 1
order by create_date desc
;
 count | creator |      create_date       |     trimmed_fingerprint
-------+---------+------------------------+------------------------------
     2 |  338817 | 2010-06-14 16:19:25-07 | totemnational
     2 |  338817 | 2010-06-09 15:27:40-07 | onehundredbulletsazzarello
     2 |  338817 | 2010-06-07 16:05:23-07 | everydayeasydk
     2 |   22794 | 2010-05-06 11:46:03-07 | ihatetorideinwindstormdineen
     2 |   44094 | 2010-04-28 12:06:05-07 | problemchildbuckley
    13 |  372584 | 2010-04-27 19:05:35-07 | vogueadvance
    13 |  372584 | 2010-04-27 19:05:34-07 | vogueadvance
    18 |  372584 | 2010-04-27 19:05:33-07 | vogueadvance
    13 |  372584 | 2010-04-27 19:05:32-07 | vogueadvance
    14 |  372584 | 2010-04-27 19:05:31-07 | vogueadvance
    15 |  372584 | 2010-04-27 19:05:30-07 | vogueadvance
    15 |  372584 | 2010-04-27 19:05:29-07 | vogueadvance
     2 |  196151 | 2010-03-16 14:49:22-07 | creampuffmurderfluke
     2 |  196151 | 2010-02-16 15:26:53-08 | fridaynightstrollope
     2 |   35898 | 2010-01-22 08:57:14-08 | genesisslaughter
(15 rows)

In the results, we find multiple entries that share the same creator ID, create date, and fingerprint. (To reduce the data, we restrict the query to undeleted entries created this year by creator IDs greater than 1.)
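
For readers without access to an Evergreen database, the grouping logic of the query can be tried on a toy table. The following is only a sketch: it uses Python's built-in sqlite3 module instead of PostgreSQL, the `record_entry` table is a simplified stand-in for biblio.record_entry (not the real schema), the year filter is left out, and the sample rows are made up for illustration, loosely modelled on the listing above.

```python
import sqlite3


def find_duplicates(rows):
    """Return (count, creator, create_date, fingerprint) groups with count > 1.

    `rows` holds (creator, create_date, fingerprint, deleted) tuples for a
    simplified, hypothetical stand-in for biblio.record_entry.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record_entry ("
        " id INTEGER PRIMARY KEY, creator INTEGER, create_date TEXT,"
        " fingerprint TEXT, deleted INTEGER)"
    )
    conn.executemany(
        "INSERT INTO record_entry (creator, create_date, fingerprint, deleted)"
        " VALUES (?, ?, ?, ?)",
        rows,
    )
    # Same idea as the psql query: group on fingerprint, create_date and
    # creator, and keep only groups that contain more than one record.
    result = conn.execute(
        "SELECT count(*), creator, create_date, trim(fingerprint)"
        " FROM record_entry"
        " WHERE deleted = 0 AND creator > 1"
        " GROUP BY fingerprint, create_date, creator"
        " HAVING count(*) > 1"
        " ORDER BY create_date DESC"
    ).fetchall()
    conn.close()
    return result


# Illustrative sample data only (not taken from a real database).
sample = [
    (22794, "2010-05-06 11:46:03", "ihatetorideinwindstormdineen", 0),
    (22794, "2010-05-06 11:46:03", "ihatetorideinwindstormdineen", 0),
    (44094, "2010-04-28 12:06:05", "problemchildbuckley", 0),
    (1, "2010-04-28 12:06:05", "adminrecord", 0),
    (1, "2010-04-28 12:06:05", "adminrecord", 0),
]
duplicates = find_duplicates(sample)
print(duplicates)
```

Only the first pair is reported: the single 'problemchildbuckley' row has no twin in this sample, and the rows with creator 1 are excluded by the `creator > 1` filter.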

We can describe two cases in detail:

1. 'vogueadvance'

Here, we have 101 duplicates generated within a 7-second range. The case has been verified by talking to the staff person responsible for creating the entries. She reported using a staff client during a catalogue training session. She does not recall doing anything differently when creating that record and has created other bibs without incident.

2. 'ihatetorideinwindstormdineen'

Here, we have an example of the majority case, a duplicate. The creator is a member of the Sitka project team and was witnessed by other team members in a face-to-face work session. She reports,

'-- AUTOGENERATED-24696, and AUTOGENERATED-24697. I used Sitka book template, created a record called I hate to ride in a windstorm, with idea of testing indexing. found both records on a title search, so indexing worked, but what is up with two records?'
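
The figures quoted for the 'vogueadvance' case (101 duplicates within 7 seconds) can be cross-checked against the query output. This small sketch just copies the counts and timestamps from the seven 'vogueadvance' rows in the listing above (time zone offset omitted) and adds them up; the variable names are illustrative.

```python
from datetime import datetime

# (count, create_date) pairs copied from the 'vogueadvance' rows of the listing.
rows = [
    (13, "2010-04-27 19:05:35"),
    (13, "2010-04-27 19:05:34"),
    (18, "2010-04-27 19:05:33"),
    (13, "2010-04-27 19:05:32"),
    (14, "2010-04-27 19:05:31"),
    (15, "2010-04-27 19:05:30"),
    (15, "2010-04-27 19:05:29"),
]

# Total number of records sharing the fingerprint across all seven groups.
total = sum(count for count, _ in rows)

# Number of distinct one-second slots covered, counting both endpoints.
stamps = [datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S") for _, stamp in rows]
span_seconds = int((max(stamps) - min(stamps)).total_seconds()) + 1

print(total, span_seconds)
```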

James Fournie (jfournie)
Changed in evergreen:
status: New → Triaged
importance: Undecided → Medium
Mike Rylander (mrylander) wrote :

James, have you seen this in any later versions? (I guess, specifically, on 2.0, since you're on that now.)

Steven Chan (schan2) wrote :

I reran the SQL starting from the beginning of 2011 and can see only one duplicate bib since we upgraded to version 2.0 on May 18, 2011. Here are the first two entries in the result list:

 count | creator |      create_date       |          trimmed_fingerprint
-------+---------+------------------------+----------------------------------------
     2 |   48834 | 2011-05-28 10:16:18-07 | littleenginethatcoulduniversal
     2 |   68303 | 2011-04-14 14:25:22-07 | capricauniversal

Duplicate bibs do not seem to be a problem in Evergreen 2.0.

Jason Stephenson (jstephenson) wrote :

Submitter indicates it is not a problem in later versions of the software. The version for which the bug was reported is no longer supported. I am therefore setting to Won't Fix.

Changed in evergreen:
status: Triaged → Won't Fix