unable to query a hive partitioned table
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Trafodion |
Confirmed
|
Medium
|
Hans Zeller |
Bug Description
Created and populated a hive partitioned table. In hive, rowcount returned is correct.
In traf sqlci, select-count returned 0 rows.
To reproduce, in hive create partitioned :
CREATE TABLE part_lineitem
(
L_PARTKEY INT
, L_SUPPKEY INT
, L_LINENUMBER INT
, L_QUANTITY DOUBLE
, L_EXTENDEDPRICE DOUBLE
, L_DISCOUNT DOUBLE
, L_TAX DOUBLE
, L_RETURNFLAG STRING
, L_LINESTATUS STRING
, L_SHIPDATE TIMESTAMP
, L_COMMITDATE TIMESTAMP
, L_RECEIPTDATE TIMESTAMP
, L_SHIPINSTRUCT STRING
, L_SHIPMODE STRING
, L_COMMENT STRING
)
partitioned by (l_orderkey int)
stored as sequencefile;
insert into table part_lineitem partition (l_orderkey=
insert into table part_lineitem partition (l_orderkey=
insert into table part_lineitem partition (l_orderkey=
insert into table part_lineitem partition (l_orderkey=
select count(*) from part_lineitem should return rowcount of 11997996 rows.
In traf sqlci, issue select count(*) from hive.hive.
My output:
hive> select count(*) from part_lineitem;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.
In order to limit the maximum number of reducers:
set hive.exec.
In order to set a constant number of reducers:
set mapred.
Starting Job = job_20140731160
Kill Command = /usr/lib/
Hadoop job information for Stage-1: number of mappers: 5; number of reducers: 1
2014-08-06 10:41:17,482 Stage-1 map = 0%, reduce = 0%
2014-08-06 10:41:27,520 Stage-1 map = 20%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:28,526 Stage-1 map = 25%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:29,533 Stage-1 map = 44%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:30,539 Stage-1 map = 56%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:31,545 Stage-1 map = 62%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:32,551 Stage-1 map = 69%, reduce = 0%, Cumulative CPU 9.09 sec
2014-08-06 10:41:33,557 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:34,562 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:35,568 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:36,575 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:37,581 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:38,587 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:39,593 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:40,599 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:41,605 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.28 sec
2014-08-06 10:41:42,611 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 73.37 sec
2014-08-06 10:41:43,617 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 73.37 sec
2014-08-06 10:41:44,623 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 73.37 sec
MapReduce Total cumulative CPU time: 1 minutes 13 seconds 370 msec
Ended Job = job_20140731160
MapReduce Jobs Launched:
Job 0: Map: 5 Reduce: 1 Cumulative CPU: 73.37 sec HDFS Read: 1309424475 HDFS Write: 9 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 13 seconds 370 msec
OK
11997996
Time taken: 31.115 seconds
$ sqlci
Trafodion Conversational Interface 0.8.3
(c) Copyright 2014 Hewlett-Packard Development Company, LP.
>>explain options 'f' select count(*) from hive.hive.
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------
2 . 3 root 1.00E+000
1 . 2 sort_scalar_aggr 1.00E+000
. . 1 hive_scan PART_LINEITEM 1.00E+000
--- SQL operation complete.
>>select count(*) from hive.hive.
(EXPR)
-------
--- 1 row(s) selected.
>>invoke hive.hive.
-- Definition of hive table PART_LINEITEM
-- Definition current Wed Aug 6 11:22:37 2014
(
L_PARTKEY INT
, L_SUPPKEY INT
, L_LINENUMBER INT
, L_QUANTITY FLOAT(54)
, L_EXTENDEDPRICE FLOAT(54)
, L_DISCOUNT FLOAT(54)
, L_TAX FLOAT(54)
, L_RETURNFLAG VARCHAR(32000 BYTES) CHARACTER SET UTF8
COLLATE DEFAULT
, L_LINESTATUS VARCHAR(32000 BYTES) CHARACTER SET UTF8
COLLATE DEFAULT
, L_SHIPDATE TIMESTAMP(6)
, L_COMMITDATE TIMESTAMP(6)
, L_RECEIPTDATE TIMESTAMP(6)
, L_SHIPINSTRUCT VARCHAR(32000 BYTES) CHARACTER SET UTF8
COLLATE DEFAULT
, L_SHIPMODE VARCHAR(32000 BYTES) CHARACTER SET UTF8
COLLATE DEFAULT
, L_COMMENT VARCHAR(32000 BYTES) CHARACTER SET UTF8
COLLATE DEFAULT
)
--- SQL operation complete.
>>table hive.hive.
--- 0 row(s) selected.
Also, should partition info be included in invoke output?
Changed in trafodion: | |
milestone: | none → r1.0 |
Changed in trafodion: | |
milestone: | r1.0 → r1.1 |
Changed in trafodion: | |
milestone: | r1.1 → none |
Changed in trafodion: | |
status: | In Progress → Confirmed |
A preliminary fix was committed in https:/ /review. trafodion. org/#/c/ 741/.
We should now get an error for partitioned tables, but I would still like to improve the error message we are getting, also I have not actually tested the fix yet.