JDBC getColumns tests failing in jdbc_test (Voting test in Jenkins)

Bug #1404090 reported by Aruna Sadashiva
This bug affects 1 person
Affects: Trafodion
Status: Fix Committed
Importance: High
Assigned to: Justin Du

Bug Description

A varying number of getColumns tests fail in the jdbc_test suite (4 or 5).

Exception in test JDBC Get Columns 3..The message id: ids_dcs_srvr_not_available With parameters:

These tests started failing with the Dec 8 build; it seems to be due to a Security checkin.

https://jenkins02.trafodion.org/job/jdbc_test-cdh5.1/lastCompletedBuild/testReport/(root)/TestBasic/TestBasic12/

Tags: sql-cmp
description: updated
Changed in trafodion:
status: New → In Progress
Arvind Narain (arvind-narain) wrote :

Currently it looks like the "NJ costing" change that went in on Dec 8th introduced this regression. There is no change in the internal query issued by mxosrvr for getColumns. Assigning to Sandhya.

Since the failures are random (some getColumns tests pass, some fail, with different failures in different runs), I tried to simplify the query to narrow down the problem. Relevant portions of the email chain are pasted below:

_____________________________________________
From: Sundaresan, Sandhya
Sent: Monday, January 12, 2015 12:38 AM
To: Narain, Arvind; Du, Justin; Sharma, Anoop
Cc: Neelakanthappa, Ravisha; Varshneya, Renu
Subject: RE: Help needed in figuring out critical case related to getColumns

Great – thanks Arvind for trying this. This is what we had suspected – the plan issue and it appears you have indeed confirmed it. Again the randomness that we see currently may be data dependent and what kind of plan the optimizer chooses. So it shows for some tables and not others.
Ravisha is out until the 22nd, so cc'ing Renu to see who else (Hans or Qifan?) can help.
Sandhya

_____________________________________________
From: Narain, Arvind
Sent: Monday, January 12, 2015 12:08 AM
To: Du, Justin; Sundaresan, Sandhya; Sharma, Anoop
Cc: Neelakanthappa, Ravisha
Subject: RE: Help needed in figuring out critical case related to getColumns

Hi Sandhya,

I tried some other experiments over this weekend. CC’g Ravisha also.

Firstly, you should be able to see the problem with just the ‘_MD_’.VERSIONS table – on a fresh install – without running any jdbc tests. Use scripts /designs/seaquest/narain/coltest.org (original with 4 parameters – 2 schemas, 2 tables) or /designs/seaquest/narain/coltest2 (two params – 1 schema and 1 table). If the table name is wildcarded, then only the columns related to the last object (in this case VIEWS_VIEW) are shown.

I installed on sqws122:86 checkout 1655f1e2969e1c8f3bb28892ff8287f4c1f2d0c2 (Avoiding Thread.sleep() when thread is exiting …Oliver Bucaojit authored on Dec 8, 2014) – this is one commit before NJ change on Dec 8th.

This version works.

I then installed on sqws144:86 checkout fbe571a0159116b10b3b2057cc8edae514239c04 (Merge "NJ costing changes." Trafodion Jenkins authored on Dec 8, 2014 Gerrit Code Review committed on Dec 8, 2014) – this one shows the problem.

Below are the explain options ‘f’ outputs (for coltest.org):

Working one:

--- 5 row(s) selected.
>>explain options 'f' s1;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

9    .    10   root                                                  4.50E+002
8    .    9    sort                                                  4.50E+002
7    1    8    hybrid_hash_join                                      4.50E+002
6    2    7    hybrid_hash_join                                      9.00E+001
3    5    6    nested_join                                           9.00E+001
4    .    5    probe_cache                                           1.00E+000
.    .    4    trafodion_index_scan            OBJECTS               1.00E+000
.    .    3    trafodion_scan ...


tags: added: sql-cmp
removed: client-jdbc-t4
Changed in trafodion:
assignee: Arvind Narain (arvind-narain) → Sandhya Sundaresan (sandhya-sundaresan)
Justin Du (justin-du-2)
Changed in trafodion:
assignee: Sandhya Sundaresan (sandhya-sundaresan) → Justin Du (justin-du-2)
Justin Du (justin-du-2) wrote :

There are problems in the ExHbaseAccessSQRowsetTcb::work() method. First, all parent down queue requests except the last one are consumed before the select operation runs in the PROCESS_SELECT step. The method cycles through SETUP_SELECT -> NEXT_ROW -> DONE -> SETUP_SELECT … The key values (row ids) are taken from each request at the SETUP_SELECT step, but they are not used until the PROCESS_SELECT step, which is entered only when a single request is left in the down queue. As a result, the select task tcb can evaluate select predicates only against the last remaining down queue request, although it fetches all the rows in the row id list. This is why the query did not select any row for some keys. Note that if there are no select predicates, we won't see the error, except that the returned rows would all be returned with the last row id.
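
A minimal Python sketch of the first problem, not the actual C++ code: the `Request` shape, the toy `TABLE`, and both function names are assumptions made up for illustration. It contrasts checking each fetched row against its own request with checking every fetched row against the last request only.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One parent down-queue entry (hypothetical shape)."""
    row_id: str   # key captured at SETUP_SELECT
    wanted: str   # input value the executor predicate compares against


# Toy stand-in for the HBase table: row id -> column value.
TABLE = {"k1": "A", "k2": "B", "k3": "C"}


def per_request_select(requests):
    """Expected behavior: each row is checked against its own request."""
    return [(r.row_id, TABLE[r.row_id])
            for r in requests
            if TABLE[r.row_id] == r.wanted]


def rowset_select_last_request_only(requests):
    """Described behavior: keys are batched, predicate sees only the last request."""
    row_ids = [r.row_id for r in requests]   # SETUP_SELECT -> NEXT_ROW -> DONE loop
    last = requests[-1]                      # only entry left at PROCESS_SELECT
    fetched = [TABLE[rid] for rid in row_ids]
    # Every surviving row is attributed to the last request's row id.
    return [(last.row_id, value) for value in fetched if value == last.wanted]


requests = [Request("k1", "A"), Request("k2", "B"), Request("k3", "C")]
print(per_request_select(requests))
print(rowset_select_last_request_only(requests))
```

With these three requests the per-request version returns all three rows, while the batched version returns only the row for `k3`, which resembles the symptom of seeing only the last object's columns.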
The second problem, which doesn't affect the correctness of the query but makes execution less efficient, is that for SELECT the unique key list (rowIds_) is never cleared once it is filled, so the list keeps growing each time the work method is called (unless we de-allocate). As a result, each call to the select task retrieves the same rows as the previous call plus the few rows whose keys were added to the list since.
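
The cost of the second problem can be illustrated with a small Python model. This is only a sketch under assumptions: the `RowsetSelect` class, its `clear_after_select` flag, and the fetch counter are hypothetical, and `row_ids` merely stands in for rowIds_.

```python
class RowsetSelect:
    """Toy model of a select task whose key list persists across work() calls."""

    def __init__(self, clear_after_select):
        self.clear_after_select = clear_after_select
        self.row_ids = []      # stands in for rowIds_
        self.fetch_count = 0   # total rows requested from storage

    def work(self, new_keys):
        """Add the new keys, fetch every key currently in the list."""
        self.row_ids.extend(new_keys)
        self.fetch_count += len(self.row_ids)
        fetched = list(self.row_ids)
        if self.clear_after_select:
            self.row_ids.clear()
        return fetched


leaky = RowsetSelect(clear_after_select=False)
clean = RowsetSelect(clear_after_select=True)
for batch in (["k1", "k2"], ["k3"], ["k4", "k5"]):
    leaky.work(batch)
    clean.work(batch)

print(leaky.fetch_count, clean.fetch_count)  # prints "10 5"
```

For five distinct keys delivered in three calls, the list that is never cleared causes ten fetches, while the cleared list causes five.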

Justin Du (justin-du-2) wrote :

Plan to put in a temporary fix that limits the rowset size to one while working on a complete fix.
This may have a performance impact on query plans that use the ExHbaseAccessSQRowsetTcb operator.

Justin Du (justin-du-2) wrote :

An alternative is to turn off the hbase rowset operation in the generator if an executor (select) predicate exists.

A workaround is to set CQD HBASE_ROWSET_VSBB_OPT off.

Lower the priority to high.

Changed in trafodion:
importance: Critical → High
Justin Du (justin-du-2) wrote :

The changes in the code generator to avoid using the rowset vsbb method when executor predicates exist will stay until hbase can tell which keys did not match.
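
The generator-side decision described here can be sketched roughly as follows. This is an illustration only: the function and parameter names are hypothetical and do not come from the Trafodion source.

```python
def use_rowset_vsbb(rowset_vsbb_opt_on, has_executor_predicates):
    """Hypothetical check: batch keys only when no executor predicate
    has to be applied per request."""
    return rowset_vsbb_opt_on and not has_executor_predicates


# Rowset access stays available for plans without executor predicates.
print(use_rowset_vsbb(True, False))   # True
# Plans with executor predicates fall back to non-rowset access.
print(use_rowset_vsbb(True, True))    # False
# Turning the option off disables rowset access in all cases.
print(use_rowset_vsbb(False, False))  # False
```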

Marking the bug as fixed.

Changed in trafodion:
status: In Progress → Fix Committed
Trafodion-Gerrit (neo-devtools) wrote : Related fix proposed to infra (master)

Related fix proposed to branch: master
Review: https://review.trafodion.org/1020

Trafodion-Gerrit (neo-devtools) wrote : Related fix merged to infra (master)

Reviewed: https://review.trafodion.org/1020
Committed: https://github.com/trafodion/infra/commit/5f2106e8c3e27b2b1e2bb9925ab4a7c43848cfdb
Submitter: Trafodion Jenkins
Branch: master

commit 5f2106e8c3e27b2b1e2bb9925ab4a7c43848cfdb
Author: Alice Chen <email address hidden>
Date: Thu Jan 22 12:57:24 2015 -0800

    Make jdbc_test voting again

    Bug 1404090 has been fixed. Make jdbc_test voting again

    Change-Id: If6bab1599fa2743af6383dc115136b6680b3f9c2
    Related-Bug: 1404090
