Evergreen version: master as of 20140904
OpenSRF version: master as of 20140904
PostgreSQL version: Server 9.1.9, client (see below)
O/S: Ubuntu 14.04 Trusty Tahr (see below)
While working on some pcrud scripts to create copies, etc. I ran into an issue with transactions on Ubuntu 14.04 with libdbi 0.9.0. When a database failure would occur, the client script, rather than print the error and go on, would print the error and then hang for about 30 or 60 seconds. (I didn't actually time it.) Upon checking the logs, I found the following:
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1271:osrf_application.c:1059:] CALL: open-ils.pcrud open-ils.pcrud.create.acp "c3a6a0b3335e2c726945a021a28c323c",{"__c":"acp","__p":[null,null,"31234005678911",-1,null,51,null,"t",null,null,null,6043
62,"f",null,null,null,"Some Guy","Some Book",null,604362,2,"t",null,2,1,"f",null,null,0,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,1]}
open-ils.pcrud 2014-09-08 13:10:03 [ERR :1271:oils_sql.c:2454:] open-ils.pcrud ERROR inserting asset::copy object using query [INSERT INTO asset.copy (age_protect,alert_message,barcode,call_number,circ_as_type,circ_lib,circ_modifier,circulate,copy_number,create_date,active_date,creator,deleted,dummy_isbn,deposit,deposit_amount,dummy_author,dummy_title,edit_date,editor,fine_level,holdable,id,loan_duration,location,opac_visible,price,ref,status,status_changed_time,mint_condition,floating,cost) VALUES (DEFAULT,DEFAULT,'31234005678911',-1,DEFAULT,51,DEFAULT,'t',DEFAULT,DEFAULT,DEFAULT,604362,'f',DEFAULT,DEFAULT,DEFAULT,'Some Guy','Some Book',DEFAULT,604362,2,'t',DEFAULT,2,1,'f',DEFAULT,DEFAULT,0,DEFAULT,DEFAULT,DEFAULT,DEFAULT);]: 3505685 3505685: ERROR: duplicate key value violates unique constraint "copy_barcode_key"
DETAIL: Key (barcode)=(31234005678911) already exists.
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1271:osrf_app_session.c:1017:] [open-ils.pcrud] sent 232 bytes of data to <email address hidden>/client_at_jasondev.mvlcstaff.org_1287
open-ils.pcrud 2014-09-08 13:10:03 [ERR :1271:oils_sql.c:275:] Database connection isn't working
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1271:osrf_app_session.c:1017:] [open-ils.pcrud] sent 371 bytes of data to <email address hidden>/client_at_jasondev.mvlcstaff.org_1287
open-ils.pcrud 2014-09-08 13:10:03 [WARN:1271:osrf_application.c:839:] Returning method exception with message: An unknown server error occurred
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1271:osrf_app_session.c:1017:] [open-ils.pcrud] sent 207 bytes of data to <email address hidden>/client_at_jasondev.mvlcstaff.org_1287
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1271:osrf_stack.c:163:] Message processing duration 0.030691
[2014-09-08 13:10:03] pcrud_acp_create_precat.pl [ERR :1287:EX.pm:66:] Exception: OpenSRF::EX::ERROR 2014-09-08T13:10:03 OpenSRF::AppRequest /usr/local/share/perl/5.18.2/OpenSRF/AppSession.pm:1086 System ERROR: Exception: OpenSRF::DomainObject::oilsMethodException 2014-09-08T13:10:03 OpenSRF::AppRequest /usr/local/share/perl/5.18.2/OpenSRF/AppSession.pm:1086 <500> INSERT error -- please see the error log for more details
open-ils.pcrud 2014-09-08 13:10:03 [WARN:1271:osrf_prefork.c:1218:] Prefork child terminating abruptly
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1271:osrf_application.c:269:] Running onExit handler for app open-ils.pcrud
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1288:osrf_system.c:103:] Adding cache server 127.0.0.1:11211
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1288:osrf_system.c:453:] Bootstrapping system with domain private.localhost, port 5222, and unixpath (none)
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1288:oils_sql.c:179:] open-ils.pcrud connecting to database. host=db2.mvlcstaff.org, port=5432, user=opensrf, db=jasondev
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1288:oils_sql.c:205:] open-ils.pcrud successfully connected to the database
open-ils.pcrud 2014-09-08 13:10:03 [INFO:1288:osrf_application.c:254:] open-ils.pcrud child init succeeded
NOTE: After the insert error, the child terminates "abruptly."
The PostgreSQL log looks normal for this type of transaction failure:
2014-09-08 13:10:03 EDT DETAIL: Key (barcode)=(31234005678911) already exists.
2014-09-08 13:10:03 EDT STATEMENT: INSERT INTO asset.copy (age_protect,alert_message,barcode,call_number,circ_as_type,circ_lib,circ_modifier,circulate,copy_number,create_date,active_date,creator,deleted,dummy_isbn,deposit,deposit_amount,dummy_author,dummy_title,edit_date,editor,fine_level,holdable,id,loan_duration,location,opac_visible,price,ref,status,status_changed_time,mint_condition,floating,cost) VALUES (DEFAULT,DEFAULT,'31234005678911',-1,DEFAULT,51,DEFAULT,'t',DEFAULT,DEFAULT,DEFAULT,604362,'f',DEFAULT,DEFAULT,DEFAULT,'Some Guy','Some Book',DEFAULT,604362,2,'t',DEFAULT,2,1,'f',DEFAULT,DEFAULT,0,DEFAULT,DEFAULT,DEFAULT,DEFAULT);
2014-09-08 13:10:03 EDT ERROR: current transaction is aborted, commands ignored until end of transaction block
2014-09-08 13:10:03 EDT STATEMENT: SELECT 1;
I also tried the above on a VM running Ubuntu 12.04 and got different results. Everything actually worked as I expected. The error occurred, my script trapped the error, and then did a pcrud.transaction.rollback with any message about abrupt termination.
Here are the versions of the relevant libraries on Ubuntu Precise Pangolin (12.04) and Trusty Tahr (14.04) for comparison:
On precise
libdbi1 0.8.4-5.1
libdbdpgsql 0.8.3-1+s-2.1ubuntu4
postgresql-client 9.3+154.pgdg12.4+1*
libpq5 9.3.5-1.pgdg12.4+1
*I'm using the PostgreSQL apt source.
On trusty:
libdbi1 0.9.0-1
libdbd-pgsql 0.9.0-2ubuntu2
postgresql-client 9.3.5-0ubuntu0.14.04.1
libpq5 9.3.5-0ubuntu0.14.04.1
I believe the problem is that transaction support was added to libdbi in release 0.9.0 as indicated in the March 25, 2013 news item from http://libdbi.sourceforge.net/index.html. It appears that there is some bad interaction with the Evergreen C code's handling of database transactions and that provided in the newer libdbi. I have not delved into the code or tried debugging to find out exactly what is going on, though.
I cannot recommend that anyone use Ubuntu 14.04 or any other system with libdbi 0.9.0 until this is resolved.
As warned by the code comments, the libdbi error messages did indeed change, wreaking havoc. Fix pushed to:
http:// git.evergreen- ils.org/ ?p=working/ Evergreen. git;a=shortlog; h=refs/ heads/user/ berick/ lp1366964- libdbi- error-handling
I tested this moderately on Ubuntu 14.04 with success. It should be backwards compatible with older versions of libdbi, but I have not tested that. Branch includes a Perl live test for confirming the functionality.