Incorrect Small Side and Large Side estimate in HashJoin

Bug #719135 reported by Shining
Affects: InfiniDB Community
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

The Schema is:
CREATE TABLE dim_a (
 level_A1 varchar(8)
) engine=infinidb; -- 5 rows

CREATE TABLE fact (
 Dim_A varchar(8) ,
 num1 int
) engine=infinidb; -- 23000000 rows

The Query is:
SELECT
    dim_a.level_A1 as level_A1 ,
    fact.dim_a as dim_a ,
    SUM(fact.num1) as num1
FROM
    dim_a dim_a,
    fact fact
WHERE
    fact.dim_a = dim_a.level_A1
GROUP BY
    dim_a.level_A1,
    fact.dim_a

The Trace information of ExeMgr is:

Locale is : C
Starting ExeMgr: st = 50, sq = 100, qs = 20, cf = /usr/local/Calpont/etc/Calpont.xml
### UM wants tuples
### For session id 2, got a CSEP

Stack:
doProject Emit pCol/pGetSignature for SimpleColumn
doProject Emit pCol/pGetSignature for SimpleColumn
doProject Emit pCol for SimpleColumn 3005
query steps:
N7joblist12pColScanStepE: 3004
N7joblist12pColScanStepE: 3001
N7joblist17TupleHashJoinStepE: 0
project steps:
N7joblist8pColStepE: 3001
N7joblist15pDictionaryStepE: 3002
N7joblist8pColStepE: 3004
N7joblist15pDictionaryStepE: 3006
N7joblist8pColStepE: 3005
delivery steps:
N7joblist12DeliveryStepE
N7joblist12DeliveryStepE

TupleInfoMap: (tupleKey oid tableName)
0 3000 dim_a
1 3003 fact
6 3000 $exp

TupleKey vector: (tupleKey oid name alias)
0 3000 dim_a dim_a
1 3003 fact fact
2 3004 Dim_A fact
3 3001 level_A1 dim_a
4 3006 Dim_A[d] fact
5 3002 level_A1[d] dim_a
6 3000 $exp $exp
7 0 num1 $exp
8 3005 num1 fact

Table Connection:
0 : 1
1 : 0

1 setting project BPP for TupleBPS ses:2 txn:0 ver:2 st:0 tb/col:3000/3001 alias:dim_a not del (0x1d5423a0[19]) nf:0 in:
  BatchPrimitiveProcessorJL:
   -- scan driven
   1 filter steps:
      ColumnCommandJL: 0 filters colwidth=8 oid=3001 (scan) (tokens)
   0 projection steps:

 with PassThruStep ses:2 txn:0 ver:2 st:0 tb/col:3000/3001 alias:dim_a (0x1d558d60[19]) in:3001 (0x1d558d60[19]), and pDictionaryStep ses:2 txn:0 ver:2 st:0 tb/col:3000/3002 (0x1d558d60[19]) in:3001 (0x1d558d60[19]),
RowGroup for 0 : columncount = 1
oids: 3002
keys: 5
offsets: 2 11
types: 12
scales: 0
precisions: 10
1 setting project BPP for TupleBPS ses:2 txn:0 ver:2 st:0 tb/col:3003/3004 alias:fact not del (0x1d542090[19]) nf:0 in:
  BatchPrimitiveProcessorJL:
   -- scan driven
   1 filter steps:
      ColumnCommandJL: 0 filters colwidth=8 oid=3004 (scan) (tokens)
   0 projection steps:

 with PassThruStep ses:2 txn:0 ver:2 st:0 tb/col:3003/3004 alias:fact (0x1d584840[19]) in:3004 (0x1d584840[19]), and pDictionaryStep ses:2 txn:0 ver:2 st:0 tb/col:3003/3006 (0x1d584840[19]) in:3004 (0x1d584840[19]),
2 setting project BPP for TupleBPS ses:2 txn:0 ver:2 st:0 tb/col:3003/3004 alias:fact not del (0x1d542090[19]) nf:0 in:
  BatchPrimitiveProcessorJL:
   -- scan driven
   1 filter steps:
      ColumnCommandJL: 0 filters colwidth=8 oid=3004 (scan) (tokens)
   1 projection steps:
      RTSCommandJL: oid=3006
      DictStepJL

 with pColStep ses:2 txn:0 ver:2 st:0 tb/col:3003/3005 alias:fact (0x1d55b160[19]) nf:0 in: and NULL
RowGroup for 1 : columncount = 2
oids: 3006 3005
keys: 4 8
offsets: 2 11 15
types: 12 6
scales: 0 0
precisions: 10 10

====== join info ======
fact-(:3006:4) join on dim_a-(:3002:5) joinType: 1 typeless
smallSideIndex-largeSideIndex : 0 -- 0
small side RG
columncount = 2
oids: 3006 3005
keys: 4 8
offsets: 2 11 15
types: 12 6
scales: 0 0
precisions: 10 10
large side RG
columncount = 1
oids: 3002
keys: 5
offsets: 2 11
types: 12
scales: 0
precisions: 10
RowGroup join result:
columncount = 3
oids: 3002 3006 3005
keys: 5 4 8
offsets: 2 11 20 24
types: 12 12 6
scales: 0 0 0
precisions: 10 10 10

Output RowGroup 0: columncount = 3
oids: 3002 3006 3005
keys: 5 4 8
offsets: 2 11 20 24
types: 12 12 6
scales: 0 0 0
precisions: 10 10 10
Output RowGroup 1: columncount = 3
oids: 3002 3006 3005
keys: 5 4 8
offsets: 2 11 20 24
types: 12 12 6
scales: 0 0 0
precisions: 10 10 10

====== Aggregation RowGroups ======
projected RG: columncount = 3
oids: 3002 3006 3005
keys: 5 4 8
offsets: 2 11 20 24
types: 12 12 6
scales: 0 0 0
precisions: 10 10 10
aggregated RG: columncount = 3
oids: 3002 3006 3005
keys: 5 4 8
offsets: 2 11 20 28
types: 12 12 9
scales: 0 0 0
precisions: 10 10 19

====== Aggregation RowGroups ======
projected RG: columncount = 3
oids: 3002 3006 3005
keys: 5 4 8
offsets: 2 11 20 24
types: 12 12 6
scales: 0 0 0
precisions: 10 10 10
aggregated1 RG: columncount = 3
oids: 3002 3006 3005
keys: 5 4 8
offsets: 2 11 20 28
types: 12 12 9
scales: 0 0 0
precisions: 10 10 19
aggregated2 RG: columncount = 3
oids: 3002 3006 3005
keys: 5 4 8
offsets: 2 11 20 28
types: 12 12 9
scales: 0 0 0
precisions: 10 10 19
delivered RG: columncount = 3
oids: 3002 3006 0
keys: 5 4 7
offsets: 2 11 20 28
types: 12 12 9
scales: 0 0 0
precisions: 10 10 19

Delivered RowGroup: columncount = 3
oids: 3002 3006 0
keys: 5 4 7
offsets: 2 11 20 28
types: 12 12 9
scales: 0 0 0
precisions: 10 10 19

job parms:
maxBuckets = 128, maxElems = 131072, flushInterval = 16384, fifoSize = 32, ScanLimit/Threshold = 10000/5000

job filter steps:
TupleBPS ses:2 txn:0 ver:2 st:0 tb/col:3003/3004 alias:fact not del (0x1d58f1f0[19]) nf:0 in:
  BatchPrimitiveProcessorJL:
   -- scan driven
   1 filter steps:
      ColumnCommandJL: 0 filters colwidth=8 oid=3004 (scan) (tokens)
   2 projection steps:
      RTSCommandJL: oid=3006
      DictStepJL
      ColumnCommandJL: 0 filters colwidth=4 oid=3005

TupleBPS ses:2 txn:0 ver:2 st:1 tb/col:3000/3001 alias:dim_a not del (0x1d555600[19]) nf:0 in:
  BatchPrimitiveProcessorJL:
   -- scan driven
   1 filter steps:
      ColumnCommandJL: 0 filters colwidth=8 oid=3001 (scan) (tokens)
   1 projection steps:
      RTSCommandJL: oid=3002
      DictStepJL

TupleHashJoinStep ses:2 st:2 in tb/col:3003/1 (0x1d58f1f0[19]) in *tb/col:3003/0 (0x1d555600[19])
      out tb/col:3003/100 (0x1d5571c0[19])

AggregateStep ses:2 txn:0 st:3 in:(0x1d5571c0[19])

job project steps:

job delivery steps:
AggregateStep ses:2 txn:0 st:3 in:(0x1d5571c0[19])

runtime updates: start at Tue Feb 15 11:25:44 2011
### UM wanted tuples and we'll do our best;-)
Session: 2; delaying start of query step 1; waitStepCount-1
### For session id 2, got a command = 100
UM join (0) join too big
HJ stopping... status is 2001
Join is too big, raise the UM join limit for now
ses:2 st: 0 finished at 2011-02-15 11:26:12.692463; PhyI/O-8193; CacheI/O-7295; MsgsSent-4; MsgsRvcd-2048; BlocksTouched-7279; BlockedFifoIn/Out-0/16; output size-4585472
 PartitionBlocksEliminated-0; MsgBytesIn-67208KB; MsgBytesOut-0KB
 1st read 2011-02-15 11:25:44.808160; EOI 2011-02-15 11:26:12.691700; runtime-27.883540s
 Job completion status 2001

ses:2 st: 1 finished at 2011-02-15 11:26:12.712262; PhyI/O-0; CacheI/O-0; MsgsSent-0; MsgsRvcd-0; BlocksTouched-0; BlockedFifoIn/Out-0/0; output size-0
 PartitionBlocksEliminated-0; MsgBytesIn-0KB; MsgBytesOut-0KB
 1st read 2011-02-15 11:26:12.711731; EOI 2011-02-15 11:26:12.711733; runtime-0.000002s
 Job completion status 2001

ses:2 st: 3 finished at Tue Feb 15 11:26:12 2011; total rows returned-0
 1st read 1970-01-01 08:00:00.000000; EOI 2011-02-15 11:26:12.725778; runtime-1297740372.725778s;
 Job completion status 2001

### For session id 2, got a command = 3
### For session id 2, got a command = 0
ses:2 Query Totals: MaxMemPct-31; NumTempFiles-0; TempFileSpace-0MB; ApproxPhyI/O-8193; CacheI/O-7295; BlocksTouched-7279
      PartitionBlocksEliminated-0; MsgBytesIn-66MB; MsgBytesOut-0MB; Mode-Distributed at Tue Feb 15 11:26:13 2011

For session 2: 67 bytes sent back at Tue Feb 15 11:26:13 2011

-------------------------------------------------------------------------------
In the "Join Info", You can see that ExeMgr treats fact table as the small side in HashJoin.

Revision history for this message
Shining (nshi-nb) wrote :

OS: CentOS 5.5 x86_64
INFINIDB: 2.0.3 CE

Revision history for this message
wweeks (wweeks) wrote :

Confirmed. This is an issue with joins on dictionary columns (char with length > 8 or varchar with length > 7). If you change the varchar(8) to char(8), it will be functionally equivalent within InfiniDB and the small side will be selected correctly. It will also perform better; I'm not sure whether varchar(8) was just used for the example or whether the query you found this bug on really uses varchar(8).
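For reference, a sketch of the char(8) variant of the dimension table described above (assuming the rest of the reported schema is unchanged):

CREATE TABLE dim_a (
 level_A1 char(8) -- char(8) is stored as a non-dictionary column, so
                  -- dim_a (5 rows) should be chosen as the small side
) engine=infinidb;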

If you really do need to join on a dictionary column, a workaround is to swap the table order in the FROM clause, e.g. "from fact, dim_a"; see the sketch below. Joining on non-dictionary columns is highly recommended wherever possible, as the performance will be much better.
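For illustration, the reported query with the FROM-clause order swapped as suggested (a sketch; only the table order changes from the original query):

SELECT
    dim_a.level_A1 as level_A1 ,
    fact.dim_a as dim_a ,
    SUM(fact.num1) as num1
FROM
    fact fact,   -- fact listed first per the suggested workaround
    dim_a dim_a
WHERE
    fact.dim_a = dim_a.level_A1
GROUP BY
    dim_a.level_A1,
    fact.dim_a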

Changed in infinidb:
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Robert Adams (radams-calpont) wrote :

Fix packaged in the 2.2.1 release.

Changed in infinidb:
status: Confirmed → Fix Released