Authority merge times out when too many records are involved
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Evergreen | Triaged | Medium | Unassigned |
Bug Description
Tested in 2.3
The server times out when doing an authority merge if a lot of bib records are linked to the authority record.
One change that could reduce the amount of data processed: the code appears to unnecessarily re-ingest the target bib records instead of only the source ones.
In the merge function, right before the comment:
-- 3. Temporarily set reingest on same to TRUE
The original target records should be pulled out before the re-ingest so that only the source records are processed. They can then be added back afterward.
I'm not sure of the best way to do this, though. I don't think moving them to a temp table is a good idea. Maybe a local variable? Or perhaps the code here could be rewritten to accomplish this more cleanly.
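One possible shape for this, sketched below. This is not the actual Evergreen merge function; it assumes the `authority.bib_linking` table from the Evergreen schema and uses example authority IDs. The idea is simply to capture the bibs already linked to the target authority and exclude them from the reingest-triggering update.

```sql
-- Sketch only: illustrative IDs, not Evergreen's real merge code.
DO $$
DECLARE
    source_auth BIGINT := 2;   -- authority being merged away (example)
    target_auth BIGINT := 1;   -- surviving authority (example)
    target_bibs BIGINT[];
BEGIN
    -- 1. Remember the bibs already linked to the target authority;
    --    their headings are unchanged, so they need no reingest.
    SELECT ARRAY_AGG(bib) INTO target_bibs
      FROM authority.bib_linking
     WHERE authority = target_auth;

    -- 2. ... perform the MARC rewrite on the source-linked bibs ...

    -- 3. Temporarily set reingest-on-same to TRUE (as the existing
    --    function does), but restrict the update that fires the
    --    reingest trigger to the source-linked bibs only:
    UPDATE biblio.record_entry
       SET id = id   -- no-op update, just to fire the trigger
     WHERE id IN (SELECT bib FROM authority.bib_linking
                   WHERE authority = source_auth)
       AND id <> ALL (COALESCE(target_bibs, '{}'));
END $$;
```

This avoids a temp table by holding the target-side IDs in a local array variable for the duration of the function.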
In my testing, I tried to merge a record that had 4 entries into one that had 400, and the merge failed with a timeout.
Steve
tags: | added: authority |
Changed in evergreen: status: New → Triaged; importance: Undecided → Medium
tags: | added: performance |
tags: | added: cat-authority removed: authority cataloging |
Another option, which would help in the situation you encountered, would be to split the force-on-same-flag into two, one for each of bib and authority. Then use just the authority force-on-same flag inside the merge function, and because only the bibs that were pointing to the new master authority record were changed, only those would end up being updated via the cascade of trigger-induced reingest.
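As a rough sketch of what that split could look like (the two new flag names below are illustrative, not existing Evergreen flags; `config.internal_flag` is the table Evergreen uses for such switches):

```sql
-- Hypothetical: split the single force-on-same flag into two.
INSERT INTO config.internal_flag (name, enabled) VALUES
    ('ingest.reingest.force_on_same_marc.bib',       FALSE),
    ('ingest.reingest.force_on_same_marc.authority', FALSE);

-- Inside the authority merge function, enable only the
-- authority-side flag for the duration of the merge:
UPDATE config.internal_flag SET enabled = TRUE
 WHERE name = 'ingest.reingest.force_on_same_marc.authority';
```

With only the authority flag set, the bib-side reingest would fire solely via the trigger cascade on the bibs whose linked authority actually changed.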
However, this only helps the "lightly used merged into heavily used" case. A heavily used authority being merged into another authority will suffer the same timeout possibility. To address the general case, I think we may need to consider a reingest queue. This would be useful more generally for upgrade-time and configuration-caused reingests as well.
There are many roads we could take to such a thing, among them:
* An A/T reactor
* A cron-fired script that reads a queue table and generates a script to cause the reingest
* A LISTEN/NOTIFY client for Postgres that reacts by asynchronously firing a reingest as needed, based on a queue table
* Other, fancier things...
All of these will require, I believe, one common ability: they need to be able to inform the main reingest trigger that, instead of doing its work, it should simply insert the record id into a queue and move on. That could take the form of a global flag (ingest.reingest.asynchronous, maybe) that the code would set in situations where queuing for async reingest is known to be the preferred method, and that admins could also enable to force all reingests to be performed asynchronously.
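A minimal sketch of that common piece, assuming the hypothetical ingest.reingest.asynchronous flag and an invented queue table name (neither exists in Evergreen today):

```sql
-- Hypothetical queue of record IDs awaiting reingest.
CREATE TABLE biblio.reingest_queue (
    id        BIGSERIAL PRIMARY KEY,
    record    BIGINT NOT NULL,              -- biblio.record_entry.id
    queued_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- At the top of the main reingest trigger function, something like:
--   IF EXISTS (SELECT 1 FROM config.internal_flag
--               WHERE name = 'ingest.reingest.asynchronous'
--                 AND enabled) THEN
--       INSERT INTO biblio.reingest_queue (record) VALUES (NEW.id);
--       NOTIFY reingest;   -- wake a LISTEN/NOTIFY worker, if any
--       RETURN NEW;        -- skip the synchronous reingest work
--   END IF;
```

Any of the options above (A/T reactor, cron script, LISTEN/NOTIFY client) could then drain the queue at its own pace.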
Thoughts?