Phoenix T2 tests core dumping in CliStatement::execute

Bug #1404430 reported by Aruna Sadashiva
This bug affects 2 people
Affects: Trafodion
Status: Fix Committed
Importance: High
Assigned to: Selvaganesan Govindarajan

Bug Description

The Phoenix T2 tests are failing; they passed with the Dec 16th build.

Tested again today with Selva's T2 check-in and still seeing these cores.

Running test.java.com.hp.phoenix.end2end.VariableLengthPKTest
test.java.com.hp.phoenix.end2end.VariableLengthPKTest
*** EXECUTOR ASSERTION FAILURE
*** Time: Sat Dec 20 00:41:29 2014
*** Process: 2797
*** Thread ID: 2811
*** File: ../common/Ipc.cpp
*** Line: 5729
*** Message: retryCount_ < 10
/bin/sh: line 1: 2797 Aborted (core dumped) mvn test -Dtest=.*

===

#2 0x00007fffad63af2c in assert_botch_abend (
    f=0x7fffb0408e6a "../common/Ipc.cpp", l=5729,
    m=0x7fffb040a1e2 "retryCount_ < 10", c=0x0) at ../export/NAAbort.cpp:282
#3 0x00007fffad63ac9f in NAAssert (
    condition=0x7fffb040a1e2 "retryCount_ < 10",
    file_=0x7fffb0408e6a "../common/Ipc.cpp", line_=5729)
    at ../export/NAAbort.cpp:209
#4 0x00007fffb02b8d7f in IpcAwaitiox::DoAwaitiox (this=0x7fffa062a328,
    ignoreLrec=1) at ../common/Ipc.cpp:5729
#5 0x00007fffb02b0110 in IpcSetOfConnections::waitOnSet (this=0x7fffa062a238,
    timeout=0, calledByESP=0, timedout=0x0) at ../common/Ipc.cpp:1776
#6 0x00007fffaf9deb1a in IpcAllConnections::waitOnAll (this=0x7fffa062a1d0,
    timeout=0, calledByESP=0, timedout=0x0, waitTime=0x0)
    at ../common/Ipc.h:3073
#7 0x00007fffafb3ec68 in ExScheduler::work (this=0x7fffa08d0840,
    prevWaitTime=0) at ../executor/ExScheduler.cpp:443
#8 0x00007fffafa2e706 in ex_root_tcb::execute (this=0x7fffa08d11b8,
    cliGlobals=0x1fe4a50, glob=0x7fffa08ceb18, input_desc=0x7fffa08c1b58,
    diagsArea=@0x7ffff6b38300, reExecute=0) at ../executor/ex_root.cpp:1055
#9 0x00007fffaf2c70c4 in CliStatement::execute (this=0x7fffa08e7668,
    cliGlobals=0x1fe4a50, input_desc=0x7fffa08c1b58, diagsArea=...,
    execute_state=CliStatement::INITIAL_STATE_, fixupOnly=0, cliflags=0)
    at ../cli/Statement.cpp:4814
#10 0x00007fffaf24feaf in SQLCLI_PerformTasks(CliGlobals *, ULng32, SQLSTMT_ID *,
    SQLDESC_ID *, SQLDESC_ID *, Lng32, Lng32, typedef __va_list_tag __va_list_tag *,
    SQLCLI_PTR_PAIRS *, SQLCLI_PTR_PAIRS *) (cliGlobals=0x1fe4a50, tasks=4882,
    statement_id=0x7fff9f15eef8, input_descriptor=0x7fff9f15ef68,
    output_descriptor=0x0, num_input_ptr_pairs=0, num_output_ptr_pairs=0,
    ap=0x7ffff6b388f0, input_ptr_pairs=0x0, output_ptr_pairs=0x0)
    at ../cli/Cli.cpp:3284
#11 0x00007fffaf250820 in SQLCLI_Exec(CliGlobals *, SQLSTMT_ID *, SQLDESC_ID *,
    Lng32, typedef __va_list_tag __va_list_tag *, SQLCLI_PTR_PAIRS *) (
    cliGlobals=0x1fe4a50, statement_id=0x7fff9f15eef8,
    input_descriptor=0x7fff9f15ef68, num_ptr_pairs=0, ap=0x7ffff6b388f0,
    ptr_pairs=0x0) at ../cli/Cli.cpp:3531
#12 0x00007fffaf2dd4d4 in SQL_EXEC_Exec (statement_id=0x7fff9f15eef8,
    input_descriptor=0x7fff9f15ef68, num_ptr_pairs=0)
    at ../cli/CliExtern.cpp:2062
#13 0x00007fffafaa6176 in ExeCliInterface::exec (this=0x7ffff6b38d20,
    inputBuf=0x0, inputBufLen=0) at ../executor/ExExeUtilCli.cpp:599
#14 0x00007fffafaa6d30 in ExeCliInterface::executeImmediateExec (
    this=0x7ffff6b38d20,
    stmtStr=0x7ffff6b38b40 "control query default attempt_esp_parallelism 'OFF';",
    outputBuf=0x0, outputBufLen=0x0, nullTerminate=1, rowsAffected=0x0)
    at ../executor/ExExeUtilCli.cpp:898
#15 0x00007fffafaa702b in ExeCliInterface::executeImmediate (
    this=0x7ffff6b38d20,
    stmtStr=0x7ffff6b38b40 "control query default attempt_esp_parallelism 'OFF';",
    outputBuf=0x0, outputBufLen=0x0, nullTerminate=1, rowsAffected=0x0,
    monitorThis=0, globalDiags=0x0) at ../executor/ExExeUtilCli.cpp:993
#16 0x00007fffafaa978c in ExeCliInterface::holdAndSetCQD (this=0x7ffff6b38d20,
    defaultName=0x7fffab5e4d2e "attempt_esp_parallelism",
    defaultValue=0x7fffab5e4b5b "OFF", globalDiags=0x0)
    at ../executor/ExExeUtilCli.cpp:2069
#17 0x00007fffab4dbe18 in CmpSeabaseDDL::sendAllControlsAndFlags (
    this=0x7ffff6b39f40, prevContext=0x0)
    at ../sqlcomp/CmpSeabaseDDLcommon.cpp:1176
#18 0x00007fffab4f1664 in CmpSeabaseDDL::executeSeabaseDDL (
    this=0x7ffff6b39f40, ddlExpr=0x7fff9f15f0c0, ddlNode=0x7fff9f15dc50,
    currCatName=..., currSchName=...)
    at ../sqlcomp/CmpSeabaseDDLcommon.cpp:6487
#19 0x00007fffb06c6562 in CmpStatement::process (this=0x7fff9f14b000,
    statement=...) at ../arkcmp/CmpStatement.cpp:931
#20 0x00007fffb06b5497 in CmpContext::compileDirect (this=0x7fff9fda0090,
    data=0x7fffa08c35e0 "\200", data_len=520, outHeap=0x7fffa417bd10,
    charset=15, op=CmpMessageObj::PROCESSDDL, gen_code=@0x7ffff6b3a888,
    gen_code_len=@0x7ffff6b3a884, parserFlags=0, diagsArea=0x7fffa08c37f0)
    at ../arkcmp/CmpContext.cpp:721
#21 0x00007fffaf9ba6b2 in ExDDLTcb::work (this=0x7fffa08c5070)
    at ../executor/ex_ddl.cpp:265
#22 0x00007fffaf9d7a27 in ex_tcb::sWork (tcb=0x7fffa08c5070)
    at ../executor/ex_tcb.h:99
#23 0x00007fffafb3f4bb in ExSubtask::work (this=0x7fffa08c55c8)
    at ../executor/ExScheduler.cpp:751
#24 0x00007fffafb3e87e in ExScheduler::work (this=0x7fffa08c4cb0,
    prevWaitTime=0) at ../executor/ExScheduler.cpp:328
#25 0x00007fffafa2e706 in ex_root_tcb::execute (this=0x7fffa08c5648,
    cliGlobals=0x1fe4a50, glob=0x7fffa08c2fa8, input_desc=0x7fffa08c2a38,
    diagsArea=@0x7ffff6b3bec0, reExecute=0) at ../executor/ex_root.cpp:1055
#26 0x00007fffaf2c70c4 in CliStatement::execute (this=0x7fffa08c11c0,
    cliGlobals=0x1fe4a50, input_desc=0x7fffa08c2a38, diagsArea=...,
    execute_state=CliStatement::INITIAL_STATE_, fixupOnly=0, cliflags=0)
    at ../cli/Statement.cpp:4814
#27 0x00007fffaf24feaf in SQLCLI_PerformTasks(CliGlobals *, ULng32, SQLSTMT_ID *,
    SQLDESC_ID *, SQLDESC_ID *, Lng32, Lng32, typedef __va_list_tag __va_list_tag *,
    SQLCLI_PTR_PAIRS *, SQLCLI_PTR_PAIRS *) (cliGlobals=0x1fe4a50, tasks=8063,
    statement_id=0x2b910e0, input_descriptor=0x2b91110, output_descriptor=0x0,
    num_input_ptr_pairs=0, num_output_ptr_pairs=0, ap=0x7ffff6b3c4d0,
    input_ptr_pairs=0x0, output_ptr_pairs=0x0) at ../cli/Cli.cpp:3284
#28 0x00007fffaf25114d in SQLCLI_ClearExecFetchClose(CliGlobals *, SQLSTMT_ID *,
    SQLDESC_ID *, SQLDESC_ID *, Lng32, Lng32, Lng32, typedef __va_list_tag __va_list_tag *,
    SQLCLI_PTR_PAIRS *, SQLCLI_PTR_PAIRS *) (cliGlobals=0x1fe4a50,
    statement_id=0x2b910e0, input_descriptor=0x2b91110, output_descriptor=0x0,
    num_input_ptr_pairs=0, num_output_ptr_pairs=0, num_total_ptr_pairs=0,
    ap=0x7ffff6b3c4d0, input_ptr_pairs=0x0, output_ptr_pairs=0x0)
    at ../cli/Cli.cpp:3775
#29 0x00007fffaf2de465 in SQL_EXEC_ClearExecFetchClose (
    statement_id=0x2b910e0, input_descriptor=0x2b91110, output_descriptor=0x0,
    num_input_ptr_pairs=0, num_output_ptr_pairs=0, num_total_ptr_pairs=0)
    at ../cli/CliExtern.cpp:2618
#30 0x00007fffb1657810 in EXECUTE (pSrvrStmt=0x2b90ad0)
    at native/SqlInterface.cpp:1119
#31 0x00007fffb1652c3b in SRVR_STMT_HDL::Execute (this=0x2b90ad0,
    inCursorName=0x0, totalRowCount=1, inSqlStmtType=0,
    inValueList=0x7ffff6b3c8b0, inSqlAsyncEnable=0, inQueryTimeout=0,
    outValueList=0x7ffff6b3c8a0) at native/CSrvrStmt.cpp:239
#32 0x00007fffb1653268 in SRVR_STMT_HDL::ExecDirect (this=0x2b90ad0,
    inCursorName=0x0, inSqlString=0x7ffff6b3ca30, inStmtType=0,
    inSqlStmtType=0, inHoldability=2, inQueryTimeout=0)
    at native/CSrvrStmt.cpp:396
#33 0x00007fffb1673010 in odbc_SQLSvc_ExecDirect_sme_ (objtag_=0x0,
    call_id_=0x0, exception_=0x7ffff6b3ca60, dialogueId=49343728,
    stmtLabel=0x3111a70 "STMT260", cursorName=0x0,
    stmtExplainLabel=0x7fffb1677a2c "", stmtType=0, sqlStmtType=0,
    sqlString=0x7ffff6b3ca30, holdability=2, queryTimeout=0,
    resultSet=140737332366352, estimatedCost=0x7ffff6b3cad0,
    outputDesc=0x7ffff6b3caa0, rowsAffected=0x7ffff6b3cac8,
    sqlWarning=0x7ffff6b3cab0, stmtId=0x7ffff6b3ca90, currentStmtId=0)
    at native/SrvrOthers.cpp:669
#34 0x00007fffb166d659 in Java_org_trafodion_jdbc_t2_SQLMXStatement_executeDirect (
    jenv=0x6149e8, jobj=0x7ffff6b3cc78, server=0x0, dialogueId=49343728,
    txid=0, autoCommit=1 '\001', txnMode=2, stmtLabel=0x7ffff6b3cc40,
    cursorName=0x0, sql=0x7ffff6b3cc30, isSelect=0 '\000', queryTimeout=0,
    holdability=2, resultSet=0x7ffff6b3cc10, currentStmtId=0)
    at native/SQLMXStatement.cpp:140

Tags: sql-exe
Selvaganesan Govindarajan (selva-ganesan) wrote :

I suspect that this issue might be related to https://review.trafodion.org/#/c/889/. Please try with this fix and confirm.

Changed in trafodion:
status: New → Fix Committed
Arvind Narain (arvind-narain) wrote :

Same issue seen after #/c/889/ with build date 20150102_1050.

http://logs.trafodion.org/daily/phoenix_part1_T2-cdh5.1/4bf80cd/console.html

*** EXECUTOR ASSERTION FAILURE
*** Time: Fri Jan 2 11:29:40 2015
*** Process: 25558
*** Thread ID: 25566
*** File: ../common/Ipc.cpp
*** Line: 5729
*** Message: retryCount_ < 10
/bin/sh: line 1: 25558 Aborted (core dumped)

Changed in trafodion:
status: Fix Committed → In Progress
Changed in trafodion:
assignee: Selvaganesan Govindarajan (selva-ganesan) → Justin Du (justin-du-2)
Justin Du (justin-du-2) wrote :

The IPC completion logic currently cannot properly handle messages from different CLI contexts. Recent code drops for query cancellation exposed this limitation, but it could also occur with ESPs. So far, the problem has been seen only with the JDBC T2 driver.

The suggested workaround is to set the environment variable SQL_NO_REGISTER_CANCEL to 1 (export SQL_NO_REGISTER_CANCEL=1) and then run the Phoenix T2 tests.
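
For anyone scripting the test run, the workaround could be applied along these lines. This is only a sketch: the variable name and value come from this report, while the helper function `t2_workaround_enabled` is a hypothetical name added for illustration, and the right mvn invocation for a given checkout is an assumption.

```shell
# Workaround from this bug report: set the variable before starting the tests.
export SQL_NO_REGISTER_CANCEL=1

# Hypothetical guard (not part of Trafodion): succeeds only when the
# workaround variable is set to 1 in the current environment.
t2_workaround_enabled() {
    if [ "${SQL_NO_REGISTER_CANCEL:-0}" != "1" ]; then
        echo "SQL_NO_REGISTER_CANCEL is not set to 1" >&2
        return 1
    fi
    return 0
}
```

A wrapper script might then call `t2_workaround_enabled && mvn test -Dtest=...`, with the test selector filled in as appropriate for the run (the failing run above used `mvn test -Dtest=.*`).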

Lowering the "Importance" to High while the fix is in progress.

Changed in trafodion:
importance: Critical → High
Changed in trafodion:
assignee: Justin Du (justin-du-2) → Selvaganesan Govindarajan (selva-ganesan)
Aruna Sadashiva (aruna-sadashiva) wrote :

Used the workaround with the 01/08 build and got a SQL core during ReadIsolationTest. Will close this and file separate bugs.

Aruna Sadashiva (aruna-sadashiva) wrote :

Sorry, I will not close this. I will leave it open, since we should be able to run the tests without the workaround.

Trafodion-Gerrit (neo-devtools) wrote : Related fix proposed to infra (master)

Related fix proposed to branch: master
Review: https://review.trafodion.org/950

Trafodion-Gerrit (neo-devtools) wrote : Related fix merged to infra (master)

Reviewed: https://review.trafodion.org/950
Committed: https://github.com/trafodion/infra/commit/3c1b1296309513257362d81ae0b7cf8daed99127
Submitter: Trafodion Jenkins
Branch: master

commit 3c1b1296309513257362d81ae0b7cf8daed99127
Author: Alice Chen <email address hidden>
Date: Tue Jan 13 16:29:40 2015 -0800

    Add environment variable for Phoenix T2 tests

    Add environment variable to temporarily work around bug 1404430
    in the Phoenix T2 tests. This change will be reverted when
    the bug is closed.

    Change-Id: Ib0499af5904150743986589393b660e9c61db51f
    Related-Bug: 1404430

Changed in trafodion:
milestone: r1.0 → r1.1
Selvaganesan Govindarajan (selva-ganesan) wrote :
Changed in trafodion:
status: In Progress → Fix Committed