Replace ARRAY_ACCUM with ARRAY_AGG (and STRING_AGG)

Bug #874296 reported by Dan Scott
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Evergreen
Fix Released
Medium
Unassigned

Bug Description

* Evergreen master
* PostgreSQL 9.0.5
* Fedora 15

PostgreSQL 8.4 introduced ARRAY_AGG(), a native function that replaces our custom ARRAY_ACCUM() (for the most part a drop-in replacement, with a slight variation in that ARRAY_AGG() returns NULL while ARRAY_ACCUM returns '{}' instead).

PostgreSQL 9.0 introduced STRING_AGG(), a native function that replaces ARRAY_TO_STRING(ARRAY_AGG()) with performance that is 1-2 orders of magnitude better (at least in some basic benchmarks).

In general, native functions should be preferred to custom functions. I've pushed a branch - user/dbs/array_accum_2_array_agg @ working - that drops in ARRAY_AGG() and STRING_AGG() where appropriate. While exercising the affected functions, I found and fixed one function that had never worked. I can't claim to have tested all variations of all affected functions, but I have read the code to look for places where NULL vs. '{}' may cause issues and exercised a large percentage of the functionality with no apparent problems.

If there is interest in merging this, I can add upgrade scripts to the branch. I haven't added them yet as I don't want to have to continually update them whilst the schema evolves - merge doesn't happen to new files! Just give me a heads up. Also, this will be a good chance to try out the supersedes functionality in the upgrade scripts, as we just issued an updated asset.merge_record_assets() function and that can be superseded.

Tags: pullrequest
Revision history for this message
Dan Scott (denials) wrote :

Updated the branch to replace the ARRAY_ACCUM() usage inside the new functions that were created as part of the authority NFI work with ARRAY_AGG().

Revision history for this message
Dan Scott (denials) wrote :

Force-pushed an updated merge of the fixes to user/dbs/array_accum_2_array_agg @ working

This would be really nice to have for 2.2.

Revision history for this message
Dan Scott (denials) wrote :

Refreshed the branch, dealt with the conflicts, pushed to user/dbs/array_accum_2_array_agg @ working

Revision history for this message
Ben Shum (bshum) wrote :

Rebased to master, fixed one conflict, and then there were four commits that were related to XXXX.schema.asset_merge_record_assets.sql which look like they were already merged to master as 0639.schema.asset_merge_record_assets.sql. I skipped applying those in the new branch.

Using ack-grep, I found a few places where array_accum were still used:

Open-ILS/src/sql/Pg/002.functions.config.sql
Open-ILS/src/sql/Pg/002.functions.aggregate.sql
Open-ILS/src/sql/Pg/300.schema.staged_search.sql
Open-ILS/src/perlmods/lib/OpenILS/Application/Booking.pm
Open-ILS/examples/build-eg-replication.sh
Open-ILS/examples/fm_IDL.xml

The rest were in existing upgrade and version-upgrade files, which I assume we'll skip.

Work in progress pushed to working/collab/bshum/array_accum_2_array_agg

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/collab/bshum/array_accum_2_array_agg

Changed in evergreen:
importance: Undecided → Wishlist
Changed in evergreen:
status: New → Incomplete
Changed in evergreen:
status: Incomplete → Triaged
Revision history for this message
Ben Shum (bshum) wrote :

Since it's been awhile, I decided to rebase my branch to latest master and force pushed it to the collab branch for reconsideration.

There are still two files where I'm unsure how to proceed right now:

Open-ILS/src/perlmods/lib/OpenILS/Application/Booking.pm has references to array_accum still
Open-ILS/examples/build-eg-replication.sh has array_to_string(array_agg()) to change into string_agg()?

And after all is said and done, we still have to create the upgrade SQL to apply these changes.

Continuing work in progress at: working/collab/bshum/array_accum_2_array_agg

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/collab/bshum/array_accum_2_array_agg

Changed in evergreen:
status: Triaged → Confirmed
milestone: none → 2.5.0-alpha
Dan Wells (dbw2)
Changed in evergreen:
milestone: 2.5.0-m1 → none
Ben Shum (bshum)
Changed in evergreen:
importance: Wishlist → Medium
assignee: nobody → Ben Shum (bshum)
status: Confirmed → In Progress
Revision history for this message
Dan Scott (denials) wrote :

Whoa. Most of this is looking good but in 002.functions.aggregate.sql we don't want to create a new custom function "array_agg()" as that would presumably prevent us from reaching the native array_agg() :)

Changed in evergreen:
milestone: none → 2.6.0-alpha1
Revision history for this message
Dan Scott (denials) wrote :

I've forced-pushed a new branch to http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/dbs/array_accum_2_array_agg that signs off on bshum's additional commit, as well as includes a commit of my own to revert the "create new custom array_agg()" change :)

Changed in evergreen:
status: In Progress → Confirmed
tags: added: pullrequest
Revision history for this message
Dan Scott (denials) wrote :

Because of the volume of change here, I'm going to set the pullrequest tag but wait until someone is ready to commit it before I create the upgrade scripts; it seems likely that there will be conflict otherwise.

Dan Wells (dbw2)
Changed in evergreen:
milestone: 2.6.0-alpha1 → 2.6.0-beta1
Revision history for this message
Ben Shum (bshum) wrote :

So I found a few last minute functions that also needed changing over to native use. Please double check my work on the string_agg() changes in one of the new commits.

The other commit included in my branch is a first crack at the upgrade script that replaces all the functions, etc. that were altered. At the end of the script, I included some echo comments explaining that people should re-run their example.reporter-extension.sql script to get the latest changed versions there too.

Working branch at: user/bshum/array_accum_2_array_agg

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/bshum/array_accum_2_array_agg

Almost there, Dan!!!

Revision history for this message
Dan Scott (denials) wrote :

Finally in the process of checking and testing!

Revision history for this message
Dan Scott (denials) wrote :

Confirmed that the located URIs are still getting extracted.

Bibs were ingested fine (noticed that author field entries had "creator" tacked on but that happens in master in general; probably should be the subject of another bug).

General class searches worked.

Bib merge worked.

Authorities were ingested, manage authorities and authority merge worked.

ISBN search worked.

Placing holds failed but that seems to be a problem with master in general (per bug# 1277731).

Ben - thank you so much for persevering with this! I have merged it to master.

Changed in evergreen:
status: Confirmed → Fix Committed
Revision history for this message
Ben Shum (bshum) wrote :

Yay, just detaching my name from being the assigned to person. Thanks Dan, excited to see this go in.

Changed in evergreen:
assignee: Ben Shum (bshum) → nobody
Changed in evergreen:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.