Comment 12 for bug 1835953

Revision history for this message
Jason Stephenson (jstephenson) wrote (last edit ): Re: Circulation auto renewal remaining should not be nullable

Jane,

I implemented an alternate upgrade script based on the article that you mention in comment #10. I have attached it here.

I have runs some tests on a test database server with production data on both Pg10 and Pg16. The Pg10 database is at Evergreen version 3.10.3 and the Pg16 database was upgraded to main as of commit 991e1664c3.

The Pg16 database has optimized settings for the hardware and the Pg10 database uses the PostgreSQL default settings. I think this is OK because what I'm testing are the differences in run time between the two scripts and not differences in Pg versions.

I then ran this alternate db upgrade and the original against those databases using the time program to output how long they took to complete. My methodology with as follows:

1. Load a dump of the database to be tested.
2. Run a vaccuum analyze on the full database.
3. Run one of the alternate upgrade scripts on the database with time.
4. Record the results in a Google sheet, see below.
5. Repeat steps 1-4 with the other db upgrade alternative.

No other updates or queries were communicating with the database server during the tests.

My results from 1 run each of the alternatives are in this Google shseet: https://docs.google.com/spreadsheets/d/1l7PsPUhTpT2C4eI2J9CIpGup3Zbn_OSbLddumbTk7Wg/edit?usp=sharing

In my testing, the "not valid" constraint was about 12 minutes faster than the "not null" constraint, but still took almost 3 hours, on the optimized Pg16 database.

On the unoptimized Pg10 database the "not null" version of the upgrade was faster by 26 minutes.

Besides the performance of the upgrade scripts there are other factors to consider in changing from a "not null" to a "not valid" constraint. One of these is that the vast majority of our current constraints are "not null." There are presently only a couple of "not valid" constraints in the code. Another is that using a "not valid" constraint could make it harder for an automated tool to determine if a field should be required or not.

I also understand that running each db upgrade one time is not particularly scientific. They ought to be run thousands of times and the results averaged. However it takes approximately 1 working day to get results given that a pg_restore has to happen in between the tests and the tests themselves run for several hours.

I think we should open this up to some discussion to decide which approach to the constraints is preferred.

Jason