Error 73 when attempting a CREATE TABLE

Bug #1378544 reported by Atanu Mishra
This bug affects 1 person
Affects    Status         Importance  Assigned to      Milestone
Trafodion  Fix Committed  High        Oliver Bucaojit

Bug Description

Encountered TMF error 73 with the HBase 0.98 build 20141004:

SQL>CREATE TABLE like_nopart LIKE mysalt

>>> SQL statement failed, [25000] [HP][HP ODBC Driver][HP Neoview Database] SQL ERROR:*** ERROR[8606] Transaction subsystem TMF returned error 73 on a commit transaction. [2014-10-06 13:31:46] (-8606) (SQLExecDirectW)

trafodion.dtm.log contained:

centos-mapr6: 2014-10-06 20:31:45,997 ERROR transactional.TransactionManager: Abort HasException true: true
centos-mapr6: 2014-10-06 20:31:45,997 ERROR transactional.TransactionManager: Abort HasException true: java.io.IOException: UnknownTransactionException
centos-mapr6: 2014-10-06 20:31:45,998 ERROR transactional.TransactionManager: exception in doAbortX (ignoring): java.lang.Exception: java.io.IOException: UnknownTransactionException

-----
The TM will need a balancer workaround in the build until we can implement the code to handle a region balance or split.

Before testing, please try disabling the load balancer through the hbase shell by setting balance_switch to false, as shown below. The problem is that the region is being moved while its old location is still held in the client-side Transaction State.

Example:

>hbase shell

hbase(main):002:0> balance_switch false

true <-- Output is the previous setting of the balance_switch value

0 row(s) in 0.0080 seconds
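
To make the failure mode concrete, here is a minimal toy model in Python. It is not Trafodion or HBase code: the classes, the region name, and the routing rule are all assumptions made for illustration. It assumes per-transaction state lives only in the memory of the server that hosted the region when the transaction enlisted it, and that a later abort reaches whichever server hosts the region now.

```python
class UnknownTransactionException(Exception):
    """A region server was asked about a transaction it has no record of."""


class RegionServer:
    def __init__(self, name):
        self.name = name
        self.active = {}  # region name -> set of in-flight transaction ids

    def begin(self, region, txid):
        self.active.setdefault(region, set()).add(txid)

    def abort(self, region, txid):
        if txid not in self.active.get(region, set()):
            raise UnknownTransactionException(f"txid {txid} unknown on {self.name}")
        self.active[region].discard(txid)


class Cluster:
    def __init__(self, balancer_enabled):
        self.servers = {n: RegionServer(n) for n in ("rs1", "rs2")}
        self.location = {"like_nopart": "rs1"}  # region -> hosting server
        self.balancer_enabled = balancer_enabled

    def balance(self):
        """Move every region to rs2; a no-op when the balancer is off."""
        if self.balancer_enabled:
            for region in self.location:
                self.location[region] = "rs2"

    def host(self, region):
        return self.servers[self.location[region]]


def run_scenario(balancer_enabled):
    cluster = Cluster(balancer_enabled)
    region, txid = "like_nopart", 1
    cluster.host(region).begin(region, txid)  # transaction enlists the region on rs1
    cluster.balance()                         # balancer may move the region mid-transaction
    try:
        cluster.host(region).abort(region, txid)  # request reaches the current host
        return "ok"
    except UnknownTransactionException:
        return "UnknownTransactionException"
```

In this model, run_scenario(True) returns "UnknownTransactionException" because the in-memory transaction state did not follow the region to rs2, while run_scenario(False) returns "ok". That is the reason switching the balancer off works as a stopgap.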

A simple solution is not available yet, because the Master process does the balancing while the TM code lives mostly in the Region. In the 0.94-based version, the TM subclassed the RegionServer, and we had RS calls that communicated whether a split or balance was allowed.

For 0.98, the call needs to be made through a coprocessor.
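
The idea of asking a region whether a split or balance is allowed can be sketched as below. This is a hypothetical Python illustration, not the TM's code or the HBase coprocessor API: the class and method names are invented, and the rule "no split while transactions are active" is an assumption about what such a check might test.

```python
class TransactionalRegion:
    """Hypothetical region wrapper that tracks in-flight transactions."""

    def __init__(self, name):
        self.name = name
        self.active_txids = set()

    def begin(self, txid):
        self.active_txids.add(txid)

    def finish(self, txid):
        # Called on commit or abort; discard() tolerates an unknown txid.
        self.active_txids.discard(txid)

    def split_allowed(self):
        """Return True only when no transaction is active on this region."""
        return not self.active_txids


region = TransactionalRegion("like_nopart")
region.begin(7)
allowed_during_txn = region.split_allowed()  # False: transaction 7 is in flight
region.finish(7)
allowed_after_txn = region.split_allowed()   # True: nothing is active any more
```

A caller that gets False would delay the split or move and ask again later, rather than relocating the region out from under an open transaction.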

Tags: dtm
Changed in trafodion:
status: New → In Progress
Changed in trafodion:
milestone: r0.9 → r1.0
Atanu Mishra (atanu-mishra) wrote :

We have resolved many of the UnknownTransactionException errors with fixes for other issues, so the original problem may already be resolved, but I think this is a good placeholder bug for the split/balance work that still needs to be done.

Changed in trafodion:
milestone: r1.0 → r1.1
Oliver Bucaojit (oliver-bucaojit) wrote :

Checked in the split delay portion of the fix on Jan 8.

Commit Message:
Resubmitting split delay work, additional checks

Split delay portion of LP bug 1378544.
Added a check/put mechanism for putting objects into
shared map. Depends on which coprocessor is started
first according to what was entered in the HBase
configuration under 'hbase.coprocessor.region.classes'
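
A check/put on a shared map, as the commit message describes, might look roughly like the following Python sketch. The function and variable names are hypothetical and the real change is Java coprocessor code that is not shown in this report. The point is only that whichever component starts first creates the shared object and the later one reuses it.

```python
import threading

_shared_map = {}                 # hypothetical map shared by both coprocessors
_shared_lock = threading.Lock()  # guards the check and the put as one step


def check_and_put(key, factory):
    """Return the object stored under key, creating it if it is absent."""
    with _shared_lock:
        if key not in _shared_map:
            _shared_map[key] = factory()
        return _shared_map[key]


# Whichever coprocessor starts first creates the state; the other reuses it.
first = check_and_put("like_nopart", dict)
second = check_and_put("like_nopart", dict)
same_object = first is second  # True regardless of start order
```

Holding the lock across both the check and the put means the result does not depend on the order in which the classes are listed in the configuration.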

Oliver Bucaojit (oliver-bucaojit) wrote :

We haven’t seen any more reports of error 73 like the one in this bug scenario.

This bug was mainly left as a placeholder for the balance/split work.

We are making progress on online handling of region balancing and splitting; it is planned to be enabled in release 1.3+.

Blueprint for the design can be found here:
https://blueprints.launchpad.net/trafodion/+spec/dtm-handling-rebalance-split

Changed in trafodion:
status: In Progress → Fix Committed