Random create schema failures due to inaccessible _MD_ tables

Bug #1404442 reported by Weishiun Tsai
14
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Trafodion
Fix Released
Critical
Roberta Marton

Bug Description

We are seeing a large number of random create schema failures. The statement would return error 8841 and 4082 complaining that a table in “_MD_” is inaccessible. Generally the table mentioned in the error message is either the OBJECTS table or the OBJECTS_UNIQ_IDX table.

This obviously is related to the recent check-in of the ANSI schema support, with which the create schema statement becomes a ‘real thing’. On the v1218 build (the first daily build that has the ANSI schema support), the QA regression tests saw a large number of such error. This is definitely a test stopper for QA. Once a create schema statement fails, all tests afterwards using that particular schema would naturally fail as well.

Here are some examples of the failures seen in one run:

------------

SQL>drop schema trafodion.arkcase_arkt1199 cascade;
*** ERROR[1003] Schema TRAFODION.ARKCASE_ARKT1199 does not exist. [2014-12-19 00:29:05]

SQL>create schema trafodion.arkcase_arkt1199;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 00:29:06]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-19 00:29:06]

------------

SQL>drop schema trafodion.arkcase_arkt1371 cascade;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 00:31:13]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2014-12-19 00:31:13]
*** ERROR[1003] Schema TRAFODION.ARKCASE_ARKT1371 does not exist. [2014-12-19 00:31:13]

SQL>create schema trafodion.arkcase_arkt1371;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 00:31:13]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2014-12-19 00:31:13]

------------

SQL>drop schema trafodion.arkcase_datarith cascade;
*** ERROR[1003] Schema TRAFODION.ARKCASE_DATARITH does not exist. [2014-12-19 01:09:19]

SQL>create schema trafodion.arkcase_datarith;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 01:09:20]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-19 01:09:20]

------------

SQL>drop schema trafodion.bmo_ssdovf2 cascade;
*** ERROR[1003] Schema TRAFODION.BMO_SSDOVF2 does not exist. [2014-12-19 01:47:03]

SQL>create schema trafodion.bmo_ssdovf2;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 01:47:05]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-19 01:47:05]

------------

SQL>drop schema trafodion.compiler_explain cascade;
*** ERROR[1003] Schema TRAFODION.COMPILER_EXPLAIN does not exist. [2014-12-19 01:55:19]

SQL>create schema trafodion.compiler_explain;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 01:55:20]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-19 01:55:20]

------------

SQL>drop schema trafodion.compiler_smf4 cascade;
*** ERROR[1003] Schema TRAFODION.COMPILER_SMF4 does not exist. [2014-12-19 18:05:19]

SQL>create schema trafodion.compiler_smf4;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 18:05:21]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-19 18:05:21]

------------

SQL>drop schema trafodion.datamining_rotate cascade;
*** ERROR[1003] Schema TRAFODION.DATAMINING_ROTATE does not exist. [2014-12-19 20:54:18]

SQL>create schema trafodion.datamining_rotate;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 20:54:20]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-19 20:54:20]

------------

SQL>drop schema trafodion.ddl_tab007 cascade;
*** ERROR[1003] Schema TRAFODION.DDL_TAB007 does not exist. [2014-12-19 22:17:38]

SQL>create schema trafodion.ddl_tab007;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 22:17:39]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-19 22:17:39]

------------

SQL>drop schema trafodion.features_niz cascade;
*** ERROR[1003] Schema TRAFODION.FEATURES_NIZ does not exist. [2014-12-19 23:06:09]

SQL>control query default limit_max_numeric_precision reset;
--- SQL operation complete.

SQL>create schema trafodion.features_niz;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 23:06:10]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-19 23:06:10]

------------

SQL>drop schema trafodion.fullouterjoins_test_sql3 cascade;
*** ERROR[1003] Schema TRAFODION.FULLOUTERJOINS_TEST_SQL3 does not exist. [2014-12-19 23:27:05]

SQL>create schema trafodion.fullouterjoins_test_sql3;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-19 23:27:06]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-19 23:27:06]

------------

SQL>drop schema trafodion.groupby_grpby006 cascade;
*** ERROR[1003] Schema TRAFODION.GROUPBY_GRPBY006 does not exist. [2014-12-20 00:35:05]

SQL>create schema trafodion.groupby_grpby006;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 00:35:06]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-20 00:35:06]

------------

SQL>drop schema trafodion.groupby_grpby007 cascade;
*** ERROR[1003] Schema TRAFODION.GROUPBY_GRPBY007 does not exist. [2014-12-20 00:35:11]

SQL>create schema trafodion.groupby_grpby007;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 00:35:12]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-20 00:35:12]

------------

SQL>drop schema trafodion.groupby_grpby009 cascade;
*** ERROR[1003] Schema TRAFODION.GROUPBY_GRPBY009 does not exist. [2014-12-20 00:35:30]

SQL>create schema trafodion.groupby_grpby009;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 00:35:31]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-20 00:35:31]

------------

SQL>drop schema trafodion.purgedata_purgedata182 cascade;
*** ERROR[1003] Schema TRAFODION.PURGEDATA_PURGEDATA182 does not exist. [2014-12-20 01:40:12]

SQL>create schema trafodion.purgedata_purgedata182;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 01:40:13]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-20 01:40:13]

------------

SQL>drop schema trafodion.purgedata_purgedata30 cascade;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 02:23:00]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2014-12-20 02:23:00]
*** ERROR[1003] Schema TRAFODION.PURGEDATA_PURGEDATA30 does not exist. [2014-12-20 02:23:00]

SQL>create schema trafodion.purgedata_purgedata30;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 02:23:00]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2014-12-20 02:23:00]

------------

SQL>drop schema trafodion.purgedata_purgedata32 cascade;
*** ERROR[1003] Schema TRAFODION.PURGEDATA_PURGEDATA32 does not exist. [2014-12-20 02:24:35]

SQL>create schema trafodion.purgedata_purgedata32;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 02:24:36]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-20 02:24:36]

------------

SQL>drop schema trafodion.showddl_ddl04 cascade;
*** ERROR[1003] Schema TRAFODION.SHOWDDL_DDL04 does not exist. [2014-12-20 03:27:27]

SQL>create schema trafodion.showddl_ddl04;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 03:27:28]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-20 03:27:28]

------------

SQL>create schema trafodion.spj_spj;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 03:42:31]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2014-12-20 03:42:31]

------------

SQL>drop schema trafodion.volatile_table_volt7 cascade;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 04:15:13]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2014-12-20 04:15:13]
*** ERROR[1003] Schema TRAFODION.VOLATILE_TABLE_VOLT7 does not exist. [2014-12-20 04:15:13]

SQL>create schema trafodion.volatile_table_volt7;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 04:15:13]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2014-12-20 04:15:13]

------------

SQL>drop schema trafodion.volatile_table_volt9 cascade;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 04:20:03]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2014-12-20 04:20:03]
*** ERROR[1003] Schema TRAFODION.VOLATILE_TABLE_VOLT9 does not exist. [2014-12-20 04:20:03]

SQL>create schema trafodion.volatile_table_volt9;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2014-12-20 04:20:03]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2014-12-20 04:20:03]

------------

Tags: sql-security
Changed in trafodion:
assignee: nobody → Cliff Gray (cliff-gray)
Revision history for this message
Cliff Gray (cliff-gray) wrote :

The symptoms for this problem are the same as 1401683. In addition, while working on 1403995, a reproducible test case for the 8441 symptom has emerged. We have been unable to reproduce the problems described in this report, and will focus on 1403995 for now and see if the same symptoms are caused by the same underlying problem.

Revision history for this message
Weishiun Tsai (wei-shiun-tsai) wrote :
Download full text (5.7 KiB)

Just a quick update on the results of the v0105_0830 build. The QA regression tests saw 16 occurrences of this error, which failed 16 test suites:

SQL>create schema trafodion.arkcase_arkt1422;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2015-01-07 02:48:14]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2015-01-07 02:48:14]

SQL>create schema trafodion.compiler_fixes_nvcrv;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2015-01-07 06:38:35]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2015-01-07 06:38:35]

SQL>create schema trafodion.compress_vchr002;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2015-01-07 07:05:09]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2015-01-07 07:05:09]

SQL>create schema trafodion.datamining_rotate;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2015-01-07 07:49:14]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2015-01-07 07:49:14]

SQL>create schema trafodion.datamining_seq;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2015-01-07 07:58:42]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2015-01-07 07:58:42]

SQL>create schema trafodion.ddl_ri001;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2015-01-07 08:00:55]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2015-01-07 08:00:55]

SQL>create schema trafodion.fullouterjoins_test_sql2;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2015-01-07 10:24:34]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS does not exist or is inaccessible. [2015-01-07 10:24:34]

SQL>create schema trafodion.fullouterjoins_test_sql3;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQL COMMIT or ROLLBACK WORK. [2015-01-07 10:25:59]
*** ERROR[4082] Object TRAFODION."_MD_".OBJECTS_UNIQ_IDX does not exist or is inaccessible. [2015-01-07 10:25:59]

SQL>create schema trafodion.fullouterjoins_test_sql5;
*** ERROR[8841] User application committed or aborted a transaction started by SQL. This transaction needs to be committed or aborted by calling SQ...

Read more...

Changed in trafodion:
assignee: Cliff Gray (cliff-gray) → Roberta Marton (roberta-marton)
status: New → In Progress
Revision history for this message
Roberta Marton (roberta-marton) wrote :

Error 4082 is a generic error that is returned during a call to get a table description. In this case, it means that the table is inaccessible for some reason - which, in this case, is a transaction inconsistency problem.

We found an issue that could cause this issue:

--> there is no transaction in progress.
--> a child arkcmp executes a SQL statement that inadvertently starts a transaction
--> the child arkcmp does not end its transaction before returning to the master
--> the master is unaware of the outstanding transaction and starts a new transaction
--> the master calls the child arkcmp to perform a compilation
--> the child arkcmp observes a different transaction from its parent and reports error 8841

The SQL statement that inadvertently starts a transaction is a "get tables in schema" request. This happens during certain checks to see if authorization is enabled.

Revision history for this message
Roberta Marton (roberta-marton) wrote :

This was an interesting problem and we eventually got a reproducible test case to debug.

A QA test suite (arkcase/arkt1112.log) contains some update statistics tests which are starting a transaction but the transaction is never ended.
Since the rogue transaction is started in a secondary arkcmp, the master arkcmp is unaware of this transaction.
What differentiates this from other tests is that QA test turns off autocommit before performing a series of update statistics.

QA runs their tests with trafci, which means the processes stay around and are reused. That means the rogue transaction stays around. Therefore, once test arkt1112 runs, you will start seeing spurious 8841 errors.

Details:

  MXOSRVR – the master executor process
    Tdm_arkcmp1 – started by the master to perform update statistics request
        Tdm_arkcmp2 – started by tdm_arkcmp1 to perform compilations and DDL requests
            Tdm_arkcmp3 – started by tdm_arkcmp2 to perform compilations for DDL requests

In this case, tdm_arkcmp1 (where update statistics is executing) sends a DDL request to tdm_arkcmp2
Before sending this request, the CLI sees if there is a transaction active. If not, it starts a transaction.
Once the DDL completes the CLI before returning:
   ends the transaction if autocommit is on
   does nothing if autocommit is off

In this case, autocommit is off, so the transaction is not ended by the CLI.
Usually, the update statistics code starts transactions before issues requests and then commits them.
However, there are a few places where this is not happening. So this transaction is not being ended.

Update statistics performs a DDL (with autocommit off)
The CLI starts a transaction and performs the DDL
Tdm_arkcmp1 now shows a transaction started.
The CLI does not end the transaction since autocommit is off
The update statistics code does not end the transaction, so it is retained.

Since update statistics is run in tdm_arkcmp1, the master executer (MXOSRVR) is unaware of the transaction.
The QA test case does perform a commit transaction, but it fails with “no transaction” since commit work looks at the master executor transaction status.

As long as the remaining QA tests do not make calls to the same MXOSRVR and perform DDL request that invoke tdm_arkcmp1, tests run fine.
If a DDL request is performed using the same MXORSRVR, there is a possibility that it fails with an 8841. Tdm_arkcmp1 thinks there is a transaction.

dtmci does show the active transaction. It is removed once the process is stopped.

This is a duplicate of 1401683

Changed in trafodion:
status: In Progress → Fix Committed
Revision history for this message
Weishiun Tsai (wei-shiun-tsai) wrote :

It's believed that the following 3 bug reports have the same root cause:

https://bugs.launchpad.net/trafodion/+bug/1396801 'Drop schema cascade returns error 20123 complaining about user-defined transaction'
https://bugs.launchpad.net/trafodion/+bug/1401683 'DDL operations see error 8841 about transaction started by SQL'
https://bugs.launchpad.net/trafodion/+bug/1404442 'Random create schema failures due to inaccessible _MD_ tables'

The fix for 1404442 was put in on 1/16. Ran through both the v0116 (v1.0.0rc1) and the v0117 build. Did not see any of the aforementioned problems. This bug report will now be closed.

Changed in trafodion:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.