Select count(*) on a table with 32 million rows takes 30 minutes to execute on a cluster

Bug #1243877 reported by Weishiun Tsai
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Trafodion
Fix Released
High
Anoop Sharma

Bug Description

On SEASCAPE2/sqtopl7, which has 6 nodes (n009-n014) and runs an open-source build from 2013/10/23, a select count(*) statement takes 30 minutes to execute, as shown by the shell date commands in the following example. In comparison, the same command on a Seaquest instance from the same machine takes 6 seconds to execute.

On a Trafodion instance:

>>obey count.sql;
>>set schema seabase.g_wisc32;

--- SQL operation complete.
>>sh date;
Wed Oct 23 17:51:49 UTC 2013
>>select count(*) from abase;

(EXPR)
--------------------

            32000000

--- 1 row(s) selected.
>>sh date;
Wed Oct 23 18:20:30 UTC 2013
>>prepare xx from select count(*) from abase;

--- SQL command prepared.
>>explain xx;

------------------------------------------------------------------ PLAN SUMMARY
MODULE_NAME .............. DYNAMICALLY COMPILED
STATEMENT_NAME ........... XX
PLAN_ID .................. 212249310714804657
ROWS_OUT ................. 1
EST_TOTAL_COST ........... 0.01
STATEMENT ................ select count(*) from abase;

------------------------------------------------------------------ NODE LISTING
ROOT ====================================== SEQ_NO 3 ONLY CHILD 2
REQUESTS_IN .............. 1
ROWS_OUT ................. 1
EST_OPER_COST ............ 0
EST_TOTAL_COST ........... 0.01
DESCRIPTION
  max_card_est ........... 1
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  statement_index ........ 0
  affinity_value ......... 0
  max_max_cardinality .... 0
  total_overflow_size .... 0.00 KB
  xn_access_mode ......... read_only
  xn_autoabort_interval 0
  auto_query_retry ....... enabled
  plan_version ....... 2,600
  embedded_arkcmp ........ used
  MXCI_PROCESS ........... ON
  LDAP_USERNAME
  SCHEMA ................. SEABASE.G_WISC32
  select_list ............ count(1 )

SORT_SCALAR_AGGR ========================== SEQ_NO 2 ONLY CHILD 1
REQUESTS_IN .............. 1
ROWS_OUT ................. 1
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ........... 0.01
DESCRIPTION
  max_card_est ........... 1
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  aggregates ............. count(1 )

SEABASE_SCAN ============================== SEQ_NO 1 NO CHILDREN
TABLE_NAME ............... ABASE
REQUESTS_IN .............. 1
ROWS_OUT ............... 100
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ........... 0.01
DESCRIPTION
  max_card_est ......... 100
  fragment_id ............ 0
  parent_frag ............ (none)
  fragment_type .......... master
  scan_type .............. subset scan of table SEABASE.G_WISC32.ABASE
  columns ................ all
  begin_keys(incl)
  end_keys(incl)
  key_columns ............ UNIQUE2

--- SQL operation complete.

On a Seaquest instance:
>>obey count.sql;
>>set schema seabase.g_wisc32;

--- SQL operation complete.
>>sh date;
Wed Oct 23 18:45:55 UTC 2013
>>select count(*) from abase;

(EXPR)
--------------------

            32000000

--- 1 row(s) selected.
>>sh date;
Wed Oct 23 18:46:01 UTC 2013

Tags: sql-general
Changed in trafodion:
assignee: nobody → Anoop Sharma (anoop-sharma)
Revision history for this message
Anoop Sharma (anoop-sharma) wrote :

Change has been added to use hbase co-processor to perform count(*).
Explain should show hbase_coproc_aggr for this opt to be used.
Also, hbase Aggr coproc property must be set in hbase-site.xml. If not, an error will be
returned during execution.

>>explain options 'f' select count(*) from t;

LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------

1 . 2 root 1.00E+000
. . 1 hbase_coproc_aggr 1.00E+000

--- SQL operation complete.
>>

Changed in trafodion:
status: New → Fix Committed
Changed in trafodion:
status: Fix Committed → Fix Released
information type: Proprietary → Public
Alice Chen (alchen)
tags: added: sql-general
removed: sql
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.