unable to query a hive partitioned table

Bug #1353632 reported by Julie Thai
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Trafodion
Confirmed
Medium
Hans Zeller

Bug Description

Created and populated a hive partitioned table. In hive, rowcount returned is correct.
In traf sqlci, select-count returned 0 rows.

To reproduce, in hive create partitioned :
CREATE TABLE part_lineitem
(
    L_PARTKEY INT
  , L_SUPPKEY INT
  , L_LINENUMBER INT
  , L_QUANTITY DOUBLE
  , L_EXTENDEDPRICE DOUBLE
  , L_DISCOUNT DOUBLE
  , L_TAX DOUBLE
  , L_RETURNFLAG STRING
  , L_LINESTATUS STRING
  , L_SHIPDATE TIMESTAMP
  , L_COMMITDATE TIMESTAMP
  , L_RECEIPTDATE TIMESTAMP
  , L_SHIPINSTRUCT STRING
  , L_SHIPMODE STRING
  , L_COMMENT STRING
)
partitioned by (l_orderkey int)
stored as sequencefile;

insert into table part_lineitem partition (l_orderkey=3000000) select l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment from lineitem where l_orderkey <=3000000;

insert into table part_lineitem partition (l_orderkey=6000000) select l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment from lineitem where l_orderkey > 3000000 and l_orderkey <= 6000000;

insert into table part_lineitem partition (l_orderkey=9000000) select l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment from lineitem where l_orderkey > 6000000 and l_orderkey <= 9000000;

insert into table part_lineitem partition (l_orderkey=12000000) select l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment from lineitem where l_orderkey > 9000000 and l_orderkey <= 12000000;

select count(*) from part_lineitem should return rowcount of 11997996 rows.

In traf sqlci, issue select count(*) from hive.hive.part_lineitem;

My output:
hive> select count(*) from part_lineitem;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201407311609_0021, Tracking URL = http://centos-mapr1.hpl.hp.com:50030/jobdetails.jsp?jobid=job_201407311609_0021
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201407311609_0021
Hadoop job information for Stage-1: number of mappers: 5; number of reducers: 1
2014-08-06 10:41:17,482 Stage-1 map = 0%, reduce = 0%
2014-08-06 10:41:27,520 Stage-1 map = 20%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:28,526 Stage-1 map = 25%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:29,533 Stage-1 map = 44%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:30,539 Stage-1 map = 56%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:31,545 Stage-1 map = 62%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:32,551 Stage-1 map = 69%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:33,557 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:34,562 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:35,568 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:36,575 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:37,581 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:38,587 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:39,593 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:40,599 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:41,605 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:42,611 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 73.37 sec
2014-08-06 10:41:43,617 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 73.37 sec
2014-08-06 10:41:44,623 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 73.37 sec
MapReduce Total cumulative CPU time: 1 minutes 13 seconds 370 msec
Ended Job = job_201407311609_0021
MapReduce Jobs Launched:
Job 0: Map: 5 Reduce: 1 Cumulative CPU: 73.37 sec HDFS Read: 1309424475 HDFS Write: 9 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 13 seconds 370 msec
OK
11997996
Time taken: 31.115 seconds

$ sqlci
Trafodion Conversational Interface 0.8.3
(c) Copyright 2014 Hewlett-Packard Development Company, LP.
>>explain options 'f' select count(*) from hive.hive.part_lineitem;

LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------

2 . 3 root 1.00E+000
1 . 2 sort_scalar_aggr 1.00E+000
. . 1 hive_scan PART_LINEITEM 1.00E+000

--- SQL operation complete.
>>select count(*) from hive.hive.part_lineitem;

(EXPR)
--------------------

                   0

--- 1 row(s) selected.
>>invoke hive.hive.part_lineitem;

-- Definition of hive table PART_LINEITEM
-- Definition current Wed Aug 6 11:22:37 2014

  (
    L_PARTKEY INT
  , L_SUPPKEY INT
  , L_LINENUMBER INT
  , L_QUANTITY FLOAT(54)
  , L_EXTENDEDPRICE FLOAT(54)
  , L_DISCOUNT FLOAT(54)
  , L_TAX FLOAT(54)
  , L_RETURNFLAG VARCHAR(32000 BYTES) CHARACTER SET UTF8
      COLLATE DEFAULT
  , L_LINESTATUS VARCHAR(32000 BYTES) CHARACTER SET UTF8
      COLLATE DEFAULT
  , L_SHIPDATE TIMESTAMP(6)
  , L_COMMITDATE TIMESTAMP(6)
  , L_RECEIPTDATE TIMESTAMP(6)
  , L_SHIPINSTRUCT VARCHAR(32000 BYTES) CHARACTER SET UTF8
      COLLATE DEFAULT
  , L_SHIPMODE VARCHAR(32000 BYTES) CHARACTER SET UTF8
      COLLATE DEFAULT
  , L_COMMENT VARCHAR(32000 BYTES) CHARACTER SET UTF8
      COLLATE DEFAULT
  )

--- SQL operation complete.
>>table hive.hive.part_lineitem;

--- 0 row(s) selected.

Also, should partition info be included in invoke output?

Tags: sql-exe
Changed in trafodion:
milestone: none → r1.0
Revision history for this message
Hans Zeller (hans-zeller) wrote :

A preliminary fix was committed in https://review.trafodion.org/#/c/741/.
We should now get an error for partitioned tables, but I would still like to improve the error message we are getting, also I have not actually tested the fix yet.

Changed in trafodion:
status: New → In Progress
Changed in trafodion:
milestone: r1.0 → r1.1
Changed in trafodion:
milestone: r1.1 → none
Changed in trafodion:
status: In Progress → Confirmed
Revision history for this message
Sandhya Sundaresan (sandhya-sundaresan) wrote :

This is probably more a feature than a bug, support for partitioned tables could take several weeks to implement. Not currently on our roadmap.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.