Duplicate bib records added
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Evergreen | Won't Fix | Medium | Unassigned | | |
Bug Description
System details:
Evergreen version 1.6.0
opensrf version 1.2.2
postgresql version 8.3
Linux Ubuntu Server 8.04
Problem description:
We are seeing signs that users are creating duplicate entries in the biblio.record_entry table. We ran the following SQL query in a psql session:
evergreen=# select
    count(*)
    , creator
    , create_date
    , trim(both from fingerprint) as trimmed_fingerprint
from biblio.record_entry
where
    deleted = false
    and creator > 1
    and extract(year from create_date) = 2010
group by fingerprint, create_date, creator
having count(*) > 1
order by create_date desc
;
 count | creator |      create_date       | trimmed_fingerprint
-------+---------+------------------------+---------------------
     2 |  338817 | 2010-06-14 16:19:25-07 | totemnational
     2 |  338817 | 2010-06-09 15:27:40-07 | onehundredbulle
     2 |  338817 | 2010-06-07 16:05:23-07 | everydayeasydk
     2 |   22794 | 2010-05-06 11:46:03-07 | ihatetorideinwi
     2 |   44094 | 2010-04-28 12:06:05-07 | problemchildbuckley
    13 |  372584 | 2010-04-27 19:05:35-07 | vogueadvance
    13 |  372584 | 2010-04-27 19:05:34-07 | vogueadvance
    18 |  372584 | 2010-04-27 19:05:33-07 | vogueadvance
    13 |  372584 | 2010-04-27 19:05:32-07 | vogueadvance
    14 |  372584 | 2010-04-27 19:05:31-07 | vogueadvance
    15 |  372584 | 2010-04-27 19:05:30-07 | vogueadvance
    15 |  372584 | 2010-04-27 19:05:29-07 | vogueadvance
     2 |  196151 | 2010-03-16 14:49:22-07 | creampuffmurder
     2 |  196151 | 2010-02-16 15:26:53-08 | fridaynightstro
     2 |   35898 | 2010-01-22 08:57:14-08 | genesisslaughter
(15 rows)
In the results, we find multiple entries that share the same creator ID, create date, and fingerprint. (To reduce the data, we restrict the query to undeleted entries created this year by creator IDs greater than 1.)
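As a quick sanity check on the largest group, the 'vogueadvance' rows from the output above can be tallied outside the database. The sketch below is only illustrative: the `rows` list is copied by hand from the psql output, and the `summarize` helper is a hypothetical name, not part of Evergreen.

```python
from datetime import datetime

# (count, creator, create_date, fingerprint), copied from the query output.
# The UTC offset is dropped because every row here shares the same one (-07).
rows = [
    (13, 372584, "2010-04-27 19:05:35", "vogueadvance"),
    (13, 372584, "2010-04-27 19:05:34", "vogueadvance"),
    (18, 372584, "2010-04-27 19:05:33", "vogueadvance"),
    (13, 372584, "2010-04-27 19:05:32", "vogueadvance"),
    (14, 372584, "2010-04-27 19:05:31", "vogueadvance"),
    (15, 372584, "2010-04-27 19:05:30", "vogueadvance"),
    (15, 372584, "2010-04-27 19:05:29", "vogueadvance"),
]


def summarize(rows):
    """Return (total records, number of distinct seconds spanned, inclusive)."""
    total = sum(count for count, *_ in rows)
    stamps = [datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for _, _, ts, _ in rows]
    # Add 1 so that both the first and the last second are counted.
    span_seconds = int((max(stamps) - min(stamps)).total_seconds()) + 1
    return total, span_seconds


total, span_seconds = summarize(rows)
print(total, span_seconds)
```

Because the query groups by create_date at one-second resolution, each row is a separate group; summing the counts gives the number of records sharing that fingerprint across the whole burst.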
We can describe two cases in detail:
1. 'vogueadvance'
Here, we have 101 duplicates generated within a 7-second range. The case has been verified by talking to the staff person responsible for creating the entries. She reported using a staff client during a catalogue training session. She does not recall doing anything differently when creating that record and has created other bibs without incident.
2. 'ihatetorideinwi'
Here, we have an example of the majority case: a single duplicate. The creator is a member of the Sitka project team, and other team members watched her create the record during a face-to-face work session. She reports,
'-- AUTOGENERATED-
Changed in evergreen:
status: New → Triaged
importance: Undecided → Medium
James, have you seen this in any later versions? (I guess, specifically, on 2.0, since you're on that now.)