An ORDER BY query returns wrong result

Bug #1442944 reported by Weishiun Tsai on 2015-04-11
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Trafodion
Fix Released
Critical
Hans Zeller

Bug Description

The following order by query returns wrong result in the v1.1.0rc0 (v0410) build. This query has been returning the correct result ever since the Trafodion early days all the way until the v0407 build. It is a regression introduced between the v0407 build and the r1.1.0rc0 (v0410) build.

Bellow you can find the execution results in v0407 (correct) and in r1.1.0rc0 (incorrect). Notice that this is a query ordered by 2, 3, 4, 1:

set schema trafodion.g_tpcds1x;

prepare xx from
select [first 200]
i_item_id,
ca_country,
ca_state,
ca_county,
......
order by 2,3,4,1;

since 2 (ca_country) is the same value ‘United Stats’ for the first 200 rows, the query essentially should see the rows in the order of 3 (ca_state). In the v0407 execution output, the rows are ordered correctly with ‘IN’ followed by ‘CO’. But in the v1.1.0rc0 execution output, ‘CO’ and ‘IN’ started to mixed together after certain rows.

----------------------------------------------------------

Here is the script to reproduce it. (It requires the QA g_tpch1x tables.)

set schema trafodion.g_tpcds1x;

prepare xx from
select [first 200]
i_item_id,
ca_country,
ca_state,
ca_county,
agg1,
agg2,
agg3,
agg4,
agg5,
agg6,
agg7
from
(
select i_item_id,
ca_country,
ca_state,
ca_county,
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.cd_dep_count) agg7
from catalog_sales, customer_demographics cd1,
customer_demographics cd2, customer,
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_education_status = 'Primary' and
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','NE','MI','WV','WV','CO','IN')
group by 1,2,3,4
union all
select i_item_id,
cast(null as char),
ca_state,
ca_county,
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.cd_dep_count) agg7
from catalog_sales, customer_demographics cd1,
customer_demographics cd2, customer,
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_education_status = 'Primary' and
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','NE','MI','WV','WV','CO','IN')
group by 1,2,3,4
union all
select i_item_id,
cast(null as char),
cast(null as char),
ca_county,
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.cd_dep_count) agg7
from catalog_sales, customer_demographics cd1,
customer_demographics cd2, customer,
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_education_status = 'Primary' and
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','NE','MI','WV','WV','CO','IN')
group by 1,2,3,4
union all
select i_item_id,
cast(null as char),
cast(null as char),
cast(null as char),
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.cd_dep_count) agg7
from catalog_sales, customer_demographics cd1,
customer_demographics cd2, customer,
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_education_status = 'Primary' and
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','NE','MI','WV','WV','CO','IN')
group by 1,2,3,4
union all
select cast(null as char),
cast(null as char),
cast(null as char),
cast(null as char),
avg(cs_quantity) agg1,
avg(cs_list_price) agg2,
avg(cs_coupon_amt) agg3,
avg(cs_sales_price) agg4,
avg(cs_net_profit) agg5,
avg(c_birth_year) agg6,
avg(cd1.cd_dep_count) agg7
from catalog_sales, customer_demographics cd1,
customer_demographics cd2, customer,
customer_address, date_dim, item
where cs_sold_date_sk = d_date_sk and
cs_item_sk = i_item_sk and
cs_bill_cdemo_sk = cd1.cd_demo_sk and
cs_bill_customer_sk = c_customer_sk and
cd1.cd_gender = 'F' and
cd1.cd_education_status = 'Primary' and
c_current_cdemo_sk = cd2.cd_demo_sk and
c_current_addr_sk = ca_address_sk and
c_birth_month in (9,4,3,11,1,12) and
d_year = 1999 and
ca_state in ('VA','NE','MI','WV','WV','CO','IN')
group by 1,2,3,4
) v1
order by 2,3,4,1;

explain options 'f' xx;

execute xx;

----------------------------------------------------------

Launchpad prevents adding long text in the description here. See attached file output.txt for the incorrect result from the r1.1.0rc0 (v0410) build and the correct result from the v0407 build.

Weishiun Tsai (wei-shiun-tsai) wrote :
Changed in trafodion:
assignee: nobody → James Capps (james-capps)
Changed in trafodion:
status: New → In Progress
assignee: James Capps (james-capps) → Justin Du (justin-du-2)
Justin Du (justin-du-2) wrote :

This problem could be another symptom of LP bugs 1442932 and 1442966. I tried the fix provided by Hans for those two on my workstation and it seems fixed this problem too.

Justin Du (justin-du-2) wrote :

Assign it to Hans

Changed in trafodion:
assignee: Justin Du (justin-du-2) → Hans Zeller (hans-zeller)
Hans Zeller (hans-zeller) wrote :

Fix for bug 1442932, bug 1442944 and bug 1442966 was merged into the R1.1 branch on 4/15, see https://review.trafodion.org/1504.

Changed in trafodion:
status: In Progress → Fix Committed
Weishiun Tsai (wei-shiun-tsai) wrote :
Download full text (63.6 KiB)

Verified on the v0416 build installed on a 4-node cluster. This problem has been fixed:

>>set schema trafodion.g_tpcds1x;

--- SQL operation complete.
>>
>>prepare xx from
+> select [first 200]
+> i_item_id,
+> ca_country,
+> ca_state,
+> ca_county,
+> agg1,
+> agg2,
+> agg3,
+> agg4,
+> agg5,
+> agg6,
+> agg7
+> from
+> (
+> select i_item_id,
+> ca_country,
+> ca_state,
+> ca_county,
+> avg(cs_quantity) agg1,
+> avg(cs_list_price) agg2,
+> avg(cs_coupon_amt) agg3,
+> avg(cs_sales_price) agg4,
+> avg(cs_net_profit) agg5,
+> avg(c_birth_year) agg6,
+> avg(cd1.cd_dep_count) agg7
+> from catalog_sales, customer_demographics cd1,
+> customer_demographics cd2, customer,
+> customer_address, date_dim, item
+> where cs_sold_date_sk = d_date_sk and
+> cs_item_sk = i_item_sk and
+> cs_bill_cdemo_sk = cd1.cd_demo_sk and
+> cs_bill_customer_sk = c_customer_sk and
+> cd1.cd_gender = 'F' and
+> cd1.cd_education_status = 'Primary' and
+> c_current_cdemo_sk = cd2.cd_demo_sk and
+> c_current_addr_sk = ca_address_sk and
+> c_birth_month in (9,4,3,11,1,12) and
+> d_year = 1999 and
+> ca_state in ('VA','NE','MI','WV','WV','CO','IN')
+> group by 1,2,3,4
+> union all
+> select i_item_id,
+> cast(null as char),
+> ca_state,
+> ca_county,
+> avg(cs_quantity) agg1,
+> avg(cs_list_price) agg2,
+> avg(cs_coupon_amt) agg3,
+> avg(cs_sales_price) agg4,
+> avg(cs_net_profit) agg5,
+> avg(c_birth_year) agg6,
+> avg(cd1.cd_dep_count) agg7
+> from catalog_sales, customer_demographics cd1,
+> customer_demographics cd2, customer,
+> customer_address, date_dim, item
+> where cs_sold_date_sk = d_date_sk and
+> cs_item_sk = i_item_sk and
+> cs_bill_cdemo_sk = cd1.cd_demo_sk and
+> cs_bill_customer_sk = c_customer_sk and
+> cd1.cd_gender = 'F' and
+> cd1.cd_education_status = 'Primary' and
+> c_current_cdemo_sk = cd2.cd_demo_sk and
+> c_current_addr_sk = ca_address_sk and
+> c_birth_month in (9,4,3,11,1,12) and
+> d_year = 1999 and
+> ca_state in ('VA','NE','MI','WV','WV','CO','IN')
+> group by 1,2,3,4
+> union all
+> select i_item_id,
+> cast(null as char),
+> cast(null as char),
+> ca_county,
+> avg(cs_quantity) agg1,
+> avg(cs_list_price) agg2,
+> avg(cs_coupon_amt) agg3,
+> avg(cs_sales_price) agg4,
+> avg(cs_net_profit) agg5,
+> avg(c_birth_year) agg6,
+> avg(cd1.cd_dep_count) agg7
+> from catalog_sales, customer_demographics cd1,
+> customer_demographics cd2, customer,
+> customer_address, date_dim, item
+> where cs_sold_date_sk = d_date_sk and
+> cs_item_sk = i_item_sk and
+> cs_bill_cdemo_sk = cd1.cd_demo_sk and
+> cs_bill_customer_sk = c_customer_sk and
+> cd1.cd_gender = 'F' and
+> cd1.cd_education_status = 'Primary' and
+> c_current_cdemo_sk = cd2.cd_demo_sk and
+> c_current_addr_sk = ca_address_sk and
+> c_birth_month in (9,4,3,11,1,12) and
+> d_year = 1999 and
+> ca_state in ('VA','NE','MI','WV','WV','CO','IN')
+> group by 1,2,3,4
+> union all
+> select i_item_id,
+> cast(null as char),
+> cast(null as char),
+> cast(null as char),
+> avg(cs_quantity) agg1,
+> avg(cs_list_price) agg2,
+> avg(cs_coupon_amt) agg3,
+> avg(cs_sales_price) agg4,
+> avg(cs_net_profit) agg5,
+> avg(c_birth_year) agg6,
+> avg(cd1.cd_dep_count) agg7
+> from catalog_sales, customer_demograph...

Changed in trafodion:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Bug attachments