Circulation auto renewal remaining should not be nullable

Bug #1835953 reported by Michele Morgan
38
This bug affects 8 people
Affects Status Importance Assigned to Milestone
Evergreen
Confirmed
Medium
Unassigned
3.10
Won't Fix
Undecided
Unassigned
3.11
Confirmed
Undecided
Unassigned
3.12
Confirmed
Undecided
Unassigned

Bug Description

Evergreen 3.2.4

We have begun using autorenewals and find that transactions that have a NULL auto_renewal_remaining are being treated the same as transactions with 0 auto_renewal_remaining. Transactions of this type are not renewable at all and autorenewals are failing with MAX_RENEWALS_REACHED. Patrons are getting a notification about the failed renewal on a non-renewable transaction which is causing confusion.

When using autorenewals, rows in action.circulation can have the following for the auto_renewal_remaining field:

auto_renewal_remaining > 0
auto_renewal_remaining = 0
auto_renewal_remaining is NULL

For the first case, the autorenewal should be attempted, and the patron notified of success or failure.

For the second case, the autorenewal should be attempted and the patron notified that there are no renewals remaining.

For the third case, I would argue that the item is not renewable and no attempt should be made to autorenew.

Adding

"-not" : [ { "auto_renewal_remaining" : null } ]

to a custom a_t_filter will prevent renewal attempts on circs where auto_renewal_remaining is NULL, but a baked in solution would be preferable.

Revision history for this message
Michele Morgan (mmorgan) wrote :

Adding a link to relevant IRC discussion:

http://irc.evergreen-ils.org/evergreen/2019-07-09#i_411573

Revision history for this message
Bill Erickson (berick) wrote :

Adding the NULL check to the filter is most efficient way to handle this. Maybe this is a documentation / sample data issue?

A secondary check in the code to mark such events invalid probably wouldn't hurt, though.

I'd like to suggest a slight tweak for the second case (above) where auto_renewal_remaining = 0. We should teach the AutoRenew reactor to bypass the actual circulation back-end call, but still generate the notification.

As I understand it, autorenew batches can take some time to process. This would allow the server to avoid a lot of unnecessary work processing items that are known in advance to be non-auto-renewable.

Revision history for this message
Michele Morgan (mmorgan) wrote :

Perhaps we should have a dedicated hook for autorenewals which includes the NULL check?

Regarding the second case, I like the idea of teaching the reactor to bypass the autorenew attempt and just send a notification when auto_renewal_remaining = 0

Changed in evergreen:
status: New → Confirmed
importance: Undecided → Low
Revision history for this message
Jason Stephenson (jstephenson) wrote :

My thoughts as of right now:

1. Modify the validator to look for auto_renewal_remaining is not null.

2. Modify the AutoRenew reactor to check for auto_renewal_remaining = 0, bypass the checkout, and generate the MAX_RENEWALS_REACHED event on its own.

Revision history for this message
Jason Stephenson (jstephenson) wrote :

An alternative to #1 in my previous comment occurred to me. We could also check for auto_renewal_remaining of NULL in the reactor and set the reason to something like autorenewal not allowed. We could even make a new event for this.

Doing this will notify the patron that the item can't be autorenewed. This might be preferable to just setting the event state to invalid and then not notifying the patron.

What's the consensus?

Revision history for this message
Jason Stephenson (jstephenson) wrote :

I am necro-bumping this bug because I stand by my previous comment even more today. We use the autorenew events as both autorenewal and courtesy notifications for those things that can't be renewed.

I think we should either make the renewal_remaining and auto_renewal_remaining columns "NOT NULL DEFAULT 0" or add code to the AutoRenew.pm handler function to set the variables to 0 when the values from the database are null.

Nothing like having a 50% failure in the autorenew process to get one to pay attention to what's going on.

Changed in evergreen:
importance: Low → Medium
Changed in evergreen:
assignee: nobody → Jason Stephenson (jstephenson)
milestone: none → 3.12-beta
summary: - Autorenewals should not be attempted on circs where
- auto_renewal_remaining is NULL
+ Circulation auto renewal remaining and renewal remaining should not be
+ nullable
Revision history for this message
Jason Stephenson (jstephenson) wrote : Re: Circulation auto renewal remaining and renewal remaining should not be nullable

I have assigned this bug to myself and changed the summary/title to reflect what I think the real problem is. I consider this an architecture implementation bug. In my opinion and the opinion of many others, numeric fields should not be nullable, unless null conveys some useful information that zero does not also convey. In the case of the renewal remaining fields in circulation, null is equivalent to zero.

I'll have a patch commit to fix the circulation table and cir rule tables (if needed) later this week.

Changed in evergreen:
milestone: 3.12-beta → 3.12-rc
Changed in evergreen:
milestone: 3.12-rc → 3.next
summary: - Circulation auto renewal remaining and renewal remaining should not be
- nullable
+ Circulation auto renewal remaining should not be nullable
Changed in evergreen:
status: Confirmed → In Progress
Revision history for this message
Jason Stephenson (jstephenson) wrote :

I'm updating this bug to report that I have made some progress. I have completed the necessary database schema changes and updated the affected tests and Concerto data load.

I still need to test the changes while using the staff client to make sure that new circulation parameters and new circulations are created correctly.

Changed in evergreen:
assignee: Jason Stephenson (jstephenson) → nobody
tags: added: pullrequest
Revision history for this message
Jason Stephenson (jstephenson) wrote :

I have pushed a working branch that is ready for more eyes. It does the following:

1. Sets the auto_renewal_remaining fields of the circulation and aged_circulation tables to not allow nulls.

2. Sets the max_auto_renewals column on config.rule_circ_duration to not allow nulls.

3. Updates the IDL to indicate that the above columns are required, along with the max_renewals column on config.rule_circ_duration.

4. Updates the concerto data load so that the required columns get values.

5. Updates the pgtap and Perl tests as appropriate for the schema change.

6. Adds a regression test to make sure that that columns do not allow null values again in the future.

The branch is working/user/dyrcona/lp1835953-auto-renewal-remaining-not-null: https://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/dyrcona/lp1835953-auto-renewal-remaining-not-null

Changed in evergreen:
status: In Progress → Confirmed
Revision history for this message
Jane Sandberg (sandbergja) wrote :

Thanks for this patch, Jason -- particularly the tests and docs! On first glance, it looks good. My only concern is that adding the NOT NULL constraint to such a large table as action.circulation in the upgrade script could cause a long ACCESS EXCLUSIVE lock.

I was trying to find some good documentation on the topic. This page from squawk seems okay: https://squawkhq.com/docs/adding-not-nullable-field

They note that on postgres 11 and higher, if you supply a default value, it doesn't have to do a full table scan, which should speed up the amount of time it hangs on to that lock. They also suggest a workaround of the ADD CONSTRAINT... VALIDATE CONSTRAINT dance.

Changed in evergreen:
status: Confirmed → In Progress
assignee: nobody → Jason Stephenson (jstephenson)
Revision history for this message
Jason Stephenson (jstephenson) wrote :

I will do timed tests of the database upgrade using my original script and one modified to use the "not valid" constraints. This will take a few days to a week or so since I'll have to reload the database in between tests. I will test on PostgreSQL versions 10 and 16 since I have those set up on my development db server with multiple copies of production data.

Revision history for this message
Jason Stephenson (jstephenson) wrote :

Jane,

I implemented an alternate upgrade script based on the article that you mention in comment #10. I have attached it here.

I have runs some tests on a test database server with production data on both Pg10 and Pg16. The Pg10 database is at Evergreen version 3.10.3 and the Pg16 database was upgraded to main as of commit 991e1664c3.

The Pg16 database has optimized settings for the hardware and the Pg10 database uses the PostgreSQL default settings. I think this is OK because what I'm testing are the differences in run time between the two scripts and not differences in Pg versions.

I then ran this alternate db upgrade and the original against those databases using the time program to output how long they took to complete. My methodology with as follows:

1. Load a dump of the database to be tested.
2. Run a vaccuum analyze on the full database.
3. Run one of the alternate upgrade scripts on the database with time.
4. Record the results in a Google sheet, see below.
5. Repeate steps 1-4 with the other db upgrade alternative.

No other updates or queries were communicating with the database server during the tests.

My results from 1 run each of the alternatives are in this Google shseet: https://docs.google.com/spreadsheets/d/1l7PsPUhTpT2C4eI2J9CIpGup3Zbn_OSbLddumbTk7Wg/edit?usp=sharing

In my testing, the "not valid" constraint was about 12 minutes faster than the "not null" constraint, but still took almost 3 hours, on the optimized Pg16 database.

On the unpotimized Pg10 database the "not null" version of the upgrade was faster by 26 minutes.

Besides the performance of the upgrade scripts there are other factors to consider in changing from a "not null" to a "not valid" constraint. One of these is that the vast majority of our current constraints are "not null." There are presently only a couple of "not valid" constraints in the code. Another is that using a "not valid" constraint could make it harder for an automated tool to determine if a fiedl should be required or not.

U aksi ybderstabd that running each db upgrade one time is not particularly scientific. They ought to be run thousands of times and the reulsts averaged. However it takes approximately 1 working day to get results given that a pg_resote has to happen in between the tests and the tests themselves run for several hours.

I think we should open this up to some discussion to decide which approach to the constraints is preferred.

Jason

Changed in evergreen:
status: In Progress → Confirmed
assignee: Jason Stephenson (jstephenson) → nobody
tags: added: needsdiscussion
removed: pullrequest
Revision history for this message
Jason Stephenson (jstephenson) wrote :

I have removed the needsdiscussion tag and added the pullrequest tag.

I have rebased the branch on main and fixed up the release notes correction commit so this is all in 1 commit at the tip of working/user/dyrcona/lp1835953-auto-renewal-remaining-not-null-rebase (https://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/dyrcona/lp1835953-auto-renewal-remaining-not-null-rebase).

I suggest that we go with the "NOT NULL" constraints to be consistent with the majority of the constraints in the schema. If anyone wants to discuss that and argue in favor of the 'NOT VALID' constraints, then let's have that conversation, but I'm afraid that "needsdiscussion" has been the death too many patches, so I'd rather this just get some attention and possibly go in.

tags: added: pullrequest
removed: needsdiscussion
Revision history for this message
Jane Sandberg (sandbergja) wrote :

A belated thank you for collecting all this data, Jason!

I think NOT NULL makes sense. Could we perhaps just include a warning in the release notes that this migration blocks reads/writes on action.circulation for a long time, particularly on old postgres versions?

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.