Update stats gets wrong row counts with the first run after loading data

Bug #1402031 reported by Weishiun Tsai on 2014-12-12
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Trafodion
In Progress
High
Barry Fritchman

Bug Description

This problem has been seen several times in the past few builds, the latest one being the v1209_0830 build. When QA populates the g_tpch2x tables on QA clusters, the scripts load the data, do select count(*) on each table, and then run update stats on each table. The select count(*) prior to the update stats show the correct row counts. But showstats after update stats show that the row count in the stats for each table is way off. Some tables get more row counts than the actual row counts and some less.

Rerunning the same set of update stats statements often correct this situation. But this is causing a huge problem for testing. When the stats are this bad, some of the larger queries using these tables would lapse back to nested join and would then hang for 20 hours without finishing.

This problem can’t be reliably reproduced on all clusters, but it does show up frequently enough to cause problems. This case is created to document this problem. Investigation needs to be done in the implementation of update stats to see why select count(*) gets the correct counts while update stats afterwards does not.

=======================================================

Here is the execution output of the select count(*) statements and the update stats statements, in that order, after the data loading.

SQL>select count(*) from region;
(EXPR)
--------------------
                   5

--- 1 row(s) selected.

SQL>select count(*) from nation;
(EXPR)
--------------------
                  25

--- 1 row(s) selected.

SQL>select count(*) from supplier;
(EXPR)
--------------------
               20000

--- 1 row(s) selected.

SQL>select count(*) from customer;
(EXPR)
--------------------
              300000

--- 1 row(s) selected.

SQL>select count(*) from part;
(EXPR)
--------------------
              400000

--- 1 row(s) selected.

SQL>select count(*) from partsupp;
(EXPR)
--------------------
             1600000

--- 1 row(s) selected.

SQL>select count(*) from orders;
(EXPR)
--------------------
             3000000

--- 1 row(s) selected.

SQL>select count(*) from lineitem;
(EXPR)
--------------------
            11997996

--- 1 row(s) selected.

-------------------------------------------------------------------------------
== TEST: tcase.test003
-------------------------------------------------------------------------------
SQL>update statistics for table region on every column;
--- SQL operation complete.

SQL>update statistics for table nation on every column;
--- SQL operation complete.

SQL>update statistics for table supplier on every column;
--- SQL operation complete.

SQL>update statistics for table customer on every column;
--- SQL operation complete.

SQL>update statistics for table part on every column;
--- SQL operation complete.

SQL>update statistics for table partsupp on every column;
--- SQL operation complete.

SQL>update statistics for table orders on every column sample random 10 percent;
--- SQL operation complete.

SQL>update statistics for table lineitem on every column sample random 10 percent;
--- SQL operation complete.

=======================================================

Here is the showstats output for each table after update stats statements were first run. The row counts are completely off.

>>set schema g_tpch2x;

--- SQL operation complete.
>>showstats for table CUSTOMER on every column;

Histogram data for Table TRAFODION.G_TPCH2X.CUSTOMER
Table ID: 314741800727389741

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1392898339 48 1100767 1100767 C_CUSTKEY
1392898334 62 1100767 1100767 C_NAME
1392898329 62 1100767 1100767 C_ADDRESS
1392898324 25 1100767 0 C_NATIONKEY
1392898319 62 1100767 1100767 C_PHONE
1392898314 36 1100767 6 C_ACCTBAL
1392898309 5 1100767 0 C_MKTSEGMENT
1392898304 62 1100767 0 C_COMMENT
1392898299 8 1100767 0 "_SALT_"
1392898344 1 1100767 1100767 "_SALT_", C_CUSTKEY

--- SQL operation complete.
>>showstats for table LINEITEM on every column;

Histogram data for Table TRAFODION.G_TPCH2X.LINEITEM
Table ID: 314741800727390211

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1537555901 28 12720858 2555961 L_ORDERKEY
1537555896 52 12720858 399694 L_PARTKEY
1537555891 48 12720858 20000 L_SUPPKEY
1537555886 7 12720858 7 L_LINENUMBER
1537555881 50 12720858 50 L_QUANTITY
1537555876 39 12720858 939869 L_EXTENDEDPRICE
1537555871 11 12720858 11 L_DISCOUNT
1537555866 9 12720858 9 L_TAX
1537555861 3 12720858 3 L_RETURNFLAG
1537555856 2 12720858 2 L_LINESTATUS
1537555851 50 12720858 2526 L_SHIPDATE
1537555846 50 12720858 2466 L_COMMITDATE
1537555841 50 12720858 2550 L_RECEIPTDATE
1537555836 4 12720858 4 L_SHIPINSTRUCT
1537555831 7 12720858 7 L_SHIPMODE
1537555826 62 12720858 7447670 L_COMMENT
1537555821 8 12720858 8 "_SALT_"
1537555906 1 12720858 12720858 "_SALT_", L_SHIPDATE, L_ORDERKEY, L_LINENUMBER

--- SQL operation complete.
>>showstats for table NATION on every column;

Histogram data for Table TRAFODION.G_TPCH2X.NATION
Table ID: 314741800727389515

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1374739566 25 25 25 N_NATIONKEY
1374739561 25 25 25 N_NAME
1374739556 5 25 5 N_REGIONKEY
1374739551 25 25 25 N_COMMENT
1374739546 8 25 8 "_SALT_"
1374739571 1 25 25 "_SALT_", N_NATIONKEY

--- SQL operation complete.
>>showstats for table ORDERS on every column;

Histogram data for Table TRAFODION.G_TPCH2X.ORDERS
Table ID: 314741800727390091

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1464778717 42 4585155 4585155 O_ORDERKEY
1464778712 56 4585155 193098 O_CUSTKEY
1464778707 3 4585155 3 O_ORDERSTATUS
1464778702 57 4585155 4045135 O_TOTALPRICE
1464778697 50 4585155 2406 O_ORDERDATE
1464778692 5 4585155 5 O_ORDERPRIORITY
1464778687 62 4585155 2000 O_CLERK
1464778682 1 4585155 1 O_SHIPPRIORITY
1464778677 62 4585155 4259632 O_COMMENT
1464778672 8 4585155 8 "_SALT_"
1464778722 1 4585155 4585155 "_SALT_", O_ORDERKEY

--- SQL operation complete.
>>showstats for table PART on every column;

Histogram data for Table TRAFODION.G_TPCH2X.PART
Table ID: 314741800727389857

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1403671365 48 1725243 1725243 P_PARTKEY
1403671360 62 1725243 1279978 P_NAME
1403671355 5 1725243 0 P_MFGR
1403671350 25 1725243 0 P_BRAND
1403671345 150 1725243 0 P_TYPE
1403671340 50 1725243 0 P_SIZE
1403671335 40 1725243 0 P_CONTAINER
1403671330 55 1725243 0 P_RETAILPRICE
1403671325 76 1725243 0 P_COMMENT
1403671320 8 1725243 0 "_SALT_"
1403671370 1 1725243 1725243 "_SALT_", P_PARTKEY

--- SQL operation complete.
>>showstats for table PARTSUPP on every column;

Histogram data for Table TRAFODION.G_TPCH2X.PARTSUPP
Table ID: 314741800727389978

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1433360836 48 2972264 0 PS_PARTKEY
1433360831 48 2972264 0 PS_SUPPKEY
1433360826 48 2972264 0 PS_AVAILQTY
1433360821 48 2972264 0 PS_SUPPLYCOST
1433360816 62 2972264 2205231 PS_COMMENT
1433360811 8 2972264 0 "_SALT_"
1433360841 1 2972264 2972264 "_SALT_", PS_PARTKEY, PS_SUPPKEY

--- SQL operation complete.
>>showstats for table REGION on every column;

Histogram data for Table TRAFODION.G_TPCH2X.REGION
Table ID: 314741800727389399

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1369726849 5 5 5 R_REGIONKEY
1369726844 5 5 5 R_NAME
1369726839 5 5 5 R_COMMENT
1369726834 3 5 3 "_SALT_"
1369726854 1 5 5 "_SALT_", R_REGIONKEY

--- SQL operation complete.
>>showstats for table SUPPLIER on every column;

Histogram data for Table TRAFODION.G_TPCH2X.SUPPLIER
Table ID: 314741800727389626

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1380401507 48 64035 64035 S_SUPPKEY
1380401502 62 64035 64035 S_NAME
1380401497 62 64035 64035 S_ADDRESS
1380401492 25 64035 0 S_NATIONKEY
1380401487 62 64035 64035 S_PHONE
1380401482 56 64035 0 S_ACCTBAL
1380401477 62 64035 47479 S_COMMENT
1380401472 8 64035 0 "_SALT_"
1380401512 1 64035 64035 "_SALT_", S_SUPPKEY

--- SQL operation complete.

=======================================================

Here is the showstats output after rerunning the same set of update stats statements. The row counts are now correct.

>>set schema g_tpch2x;

--- SQL operation complete.
>>showstats for table CUSTOMER on every column;

Histogram data for Table TRAFODION.G_TPCH2X.CUSTOMER
Table ID: 314741800727389741

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1392898338 48 300000 300000 C_CUSTKEY
1392898335 62 300000 300000 C_NAME
1392898328 62 300000 300000 C_ADDRESS
1392898325 25 300000 25 C_NATIONKEY
1392898318 62 300000 300000 C_PHONE
1392898315 36 300000 262499 C_ACCTBAL
1392898308 5 300000 5 C_MKTSEGMENT
1392898305 62 300000 299263 C_COMMENT
1392898298 8 300000 8 "_SALT_"
1392898345 1 300000 300000 "_SALT_", C_CUSTKEY

--- SQL operation complete.
>>showstats for table LINEITEM on every column;

Histogram data for Table TRAFODION.G_TPCH2X.LINEITEM
Table ID: 314741800727390211

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1537555900 27 11997996 2536382 L_ORDERKEY
1537555897 60 11997996 399669 L_PARTKEY
1537555890 48 11997996 20000 L_SUPPKEY
1537555887 7 11997996 7 L_LINENUMBER
1537555880 50 11997996 50 L_QUANTITY
1537555877 40 11997996 940199 L_EXTENDEDPRICE
1537555870 11 11997996 11 L_DISCOUNT
1537555867 9 11997996 9 L_TAX
1537555860 3 11997996 3 L_RETURNFLAG
1537555857 2 11997996 2 L_LINESTATUS
1537555850 50 11997996 2526 L_SHIPDATE
1537555847 50 11997996 2466 L_COMMITDATE
1537555840 50 11997996 2549 L_RECEIPTDATE
1537555837 4 11997996 4 L_SHIPINSTRUCT
1537555830 7 11997996 7 L_SHIPMODE
1537555827 62 11997996 7101582 L_COMMENT
1537555820 8 11997996 8 "_SALT_"
1537555907 1 11997996 11997996 "_SALT_", L_SHIPDATE, L_ORDERKEY, L_LINENUMBER

--- SQL operation complete.
>>showstats for table NATION on every column;

Histogram data for Table TRAFODION.G_TPCH2X.NATION
Table ID: 314741800727389515

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1374739567 25 25 25 N_NATIONKEY
1374739560 25 25 25 N_NAME
1374739557 5 25 5 N_REGIONKEY
1374739550 25 25 25 N_COMMENT
1374739547 8 25 8 "_SALT_"
1374739570 1 25 25 "_SALT_", N_NATIONKEY

--- SQL operation complete.
>>showstats for table ORDERS on every column;

Histogram data for Table TRAFODION.G_TPCH2X.ORDERS
Table ID: 314741800727390091

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1464778716 30 3000000 3000000 O_ORDERKEY
1464778713 59 3000000 191492 O_CUSTKEY
1464778706 3 3000000 3 O_ORDERSTATUS
1464778703 53 3000000 2759999 O_TOTALPRICE
1464778696 50 3000000 2406 O_ORDERDATE
1464778693 5 3000000 5 O_ORDERPRIORITY
1464778686 62 3000000 2000 O_CLERK
1464778683 1 3000000 1 O_SHIPPRIORITY
1464778676 62 3000000 2898572 O_COMMENT
1464778673 8 3000000 8 "_SALT_"
1464778723 1 3000000 3000000 "_SALT_", O_ORDERKEY

--- SQL operation complete.
>>showstats for table PART on every column;

Histogram data for Table TRAFODION.G_TPCH2X.PART
Table ID: 314741800727389857

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1403671364 48 400000 400000 P_PARTKEY
1403671361 62 400000 399982 P_NAME
1403671354 5 400000 5 P_MFGR
1403671351 25 400000 25 P_BRAND
1403671344 150 400000 150 P_TYPE
1403671341 50 400000 50 P_SIZE
1403671334 40 400000 40 P_CONTAINER
1403671331 55 400000 22097 P_RETAILPRICE
1403671324 76 400000 69074 P_COMMENT
1403671321 8 400000 8 "_SALT_"
1403671371 1 400000 400000 "_SALT_", P_PARTKEY

--- SQL operation complete.
>>showstats for table PARTSUPP on every column;

Histogram data for Table TRAFODION.G_TPCH2X.PARTSUPP
Table ID: 314741800727389978

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1433360837 48 1600000 400000 PS_PARTKEY
1433360830 48 1600000 20000 PS_SUPPKEY
1433360827 48 1600000 9999 PS_AVAILQTY
1433360820 48 1600000 99901 PS_SUPPLYCOST
1433360817 62 1600000 1599985 PS_COMMENT
1433360810 8 1600000 8 "_SALT_"
1433360840 1 1600000 1600000 "_SALT_", PS_PARTKEY, PS_SUPPKEY

--- SQL operation complete.
>>showstats for table REGION on every column;

Histogram data for Table TRAFODION.G_TPCH2X.REGION
Table ID: 314741800727389399

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1369726848 5 5 5 R_REGIONKEY
1369726845 5 5 5 R_NAME
1369726838 5 5 5 R_COMMENT
1369726835 3 5 3 "_SALT_"
1369726855 1 5 5 "_SALT_", R_REGIONKEY

--- SQL operation complete.
>>showstats for table SUPPLIER on every column;

Histogram data for Table TRAFODION.G_TPCH2X.SUPPLIER
Table ID: 314741800727389626

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1380401506 48 20000 20000 S_SUPPKEY
1380401503 62 20000 20000 S_NAME
1380401496 62 20000 20000 S_ADDRESS
1380401493 25 20000 25 S_NATIONKEY
1380401486 62 20000 20000 S_PHONE
1380401483 56 20000 19803 S_ACCTBAL
1380401476 62 20000 19972 S_COMMENT
1380401473 8 20000 8 "_SALT_"
1380401513 1 20000 20000 "_SALT_", S_SUPPKEY

--- SQL operation complete.

Anoop Sharma (anoop-sharma) wrote :

Barry, Doesn't update stats gets the count through an internal interface to hbase/hdfs?
Are there times when that may not be correct if some data has not been flushed or some
other reason?

Assigning to Barry.

Changed in trafodion:
assignee: nobody → Barry Fritchman (barry-fritchman)

A likely workaround for this problem is to turn off the cqd USTAT_ESTIMATE_HBASE_ROW_COUNT before running Update Stats.
This will cause the slower but precisely accurate count(*) method to be used instead of estimating based on HBase meta-information for the HFiles and memstore of the table.

The sporadic nature of the problem suggests some possible conflict with a concurrently executing hbase operation like a flush, compaction, region split, etc. Also, the estimate will not be as accurate if a large percentage of the table's data is still in the memstore, since the row count in the memstore can't be measured as accurately. However, this in itself should not produce a grossly inaccurate estimate as reported in this case.

Changed in trafodion:
status: New → In Progress
Weishiun Tsai (wei-shiun-tsai) wrote :
Download full text (6.7 KiB)

This is seen on the v1217_0830 build again. I am changing this case to critical so the implementation of update stats can be looked at to ensure 100% row count accuracy regardless when it is run, and/or if we need to turn off USTAT_ESTIMATE_HBASE_ROW_COUNT in the upcoming release as the default value if we can't fix this problem in time.

>>set schema g_tpch2x;

--- SQL operation complete.
>>showstats for table CUSTOMER on every column;

Histogram data for Table TRAFODION.G_TPCH2X.CUSTOMER
Table ID: 206656109855119777

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1139902532 48 1100767 1100767 C_CUSTKEY
1139902527 62 1100767 1100767 C_NAME
1139902522 62 1100767 1100767 C_ADDRESS
1139902517 25 1100767 0 C_NATIONKEY
1139902512 62 1100767 1100767 C_PHONE
1139902507 36 1100767 6 C_ACCTBAL
1139902502 5 1100767 0 C_MKTSEGMENT
1139902497 62 1100767 0 C_COMMENT
1139902492 8 1100767 0 "_SALT_"
1139902537 1 1100767 1100767 "_SALT_", C_CUSTKEY

--- SQL operation complete.
>>showstats for table LINEITEM on every column;

Histogram data for Table TRAFODION.G_TPCH2X.LINEITEM
Table ID: 206656109855120177

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1309294189 35 12696621 2555377 L_ORDERKEY
1309294184 54 12696621 399594 L_PARTKEY
1309294179 48 12696621 20000 L_SUPPKEY
1309294174 7 12696621 7 L_LINENUMBER
1309294169 50 12696621 50 L_QUANTITY
1309294164 37 12696621 940465 L_EXTENDEDPRICE
1309294159 11 12696621 11 L_DISCOUNT
1309294154 9 12696621 9 L_TAX
1309294149 3 12696621 3 L_RETURNFLAG
1309294144 2 12696621 2 L_LINESTATUS
1309294139 50 12696621 2526 L_SHIPDATE
1309294134 50 12696621 2466 L_COMMITDATE
1309294129 50 12696621 2549 L_RECEIPTDATE
1309294124 4 12696621 4 L_SHIPINSTRUCT
1309294119 7 12696621 7 L_SHIPMODE
1309294114 62 12696621 7448679 L_COMMENT
1309294109 8 12696621 8 "_SALT_"
1309294194 1 12696621 12696621 "_SALT_", L_SHIPDATE, L_ORDERKEY, L_LINENUMBER

--- SQL operation complete.
>>showstats for table NATION on every column;

Histogram data for Table TRAFODION.G_TPCH2X.NATION
Table ID: 206656109855119583

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
1115960438 25 25 25 N_NATIONKEY
1115960433 25 25 25 N_NAME
1115960428 5 25 5 N_REGIONKEY
1115960423 25 25 25 N_COMMENT
1115960418 8 25 8 "_SALT_"
1115960443 1 25 25 "_SALT_", N_NATIONKEY

--- SQL operation complete.
>>showstats for table ORDERS on every column;

Histogram data for Table TRAFODION.G_TPCH2X.ORDERS
Tabl...

Read more...

Changed in trafodion:
importance: High → Critical

Change delivered to turn off row count estimation by default, so I reduced from critical to high.

Changed in trafodion:
importance: Critical → High

Reviewed: https://review.trafodion.org/869
Committed: https://github.com/trafodion/core/commit/ea071d251c52f0f613c739eedc64bd1be534707c
Submitter: Trafodion Jenkins
Branch: master

commit ea071d251c52f0f613c739eedc64bd1be534707c
Author: Barry Fritchman <email address hidden>
Date: Thu Dec 18 21:05:23 2014 +0000

    Turn off ustat row count estimation by default

    The initial execution of Update Statistics after loading a
    table sometimes gets inaccurate HBase row count estimates.
    Until the cause of this variance is determined and fixed,
    the cqd controlling use of estimation has been set to OFF
    by default.

    Change-Id: I698f5ac8407495f890a242c82a1946a5506ceebf
    Partial-Bug: #1402031

Weishiun Tsai (wei-shiun-tsai) wrote :

A quick note on what is seen on the v0106_0830 build, in which the ustat row count estimation is turned off. The stat row count for the lineitem table (the largest table, which has 11997996 rows) was still off by 720656 rows (higher than the actual number) with update stats run right after table loading. It took the 2nd run of update stats to correct this:

Here are the stats when update stats were run right after the table loading:

>>showstats for table LINEITEM on every column;

Histogram data for Table TRAFODION.G_TPCH2X.LINEITEM
Table ID: 55203448999370

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
 688077107 27 12718652 2554238 L_ORDERKEY
 688077102 53 12718652 399775 L_PARTKEY
 688077097 48 12718652 20000 L_SUPPKEY
 688077092 7 12718652 7 L_LINENUMBER
 688077087 50 12718652 50 L_QUANTITY
 688077082 37 12718652 939218 L_EXTENDEDPRICE
 688077077 11 12718652 11 L_DISCOUNT
 688077072 9 12718652 9 L_TAX
 688077067 3 12718652 3 L_RETURNFLAG
 688077062 2 12718652 2 L_LINESTATUS
 688077057 50 12718652 2526 L_SHIPDATE
 688077052 50 12718652 2466 L_COMMITDATE
 688077047 50 12718652 2556 L_RECEIPTDATE
 688077042 4 12718652 4 L_SHIPINSTRUCT
 688077037 7 12718652 7 L_SHIPMODE
 688077032 62 12718652 7388302 L_COMMENT
 688077027 8 12718652 8 "_SALT_"
 688077112 1 12718652 12718652 "_SALT_", L_SHIPDATE, L_ORDERKEY, L_LINENUMBER

--- SQL operation complete.

Here are the stats after rerunning update stats:

>>showstats for table LINEITEM on every column;

Histogram data for Table TRAFODION.G_TPCH2X.LINEITEM
Table ID: 55203448999370

   Hist ID # Ints Rowcount UEC Colname(s)
========== ====== =========== =========== ===========================
 688077106 35 11997996 2532066 L_ORDERKEY
 688077103 53 11997996 399552 L_PARTKEY
 688077096 48 11997996 20000 L_SUPPKEY
 688077093 7 11997996 7 L_LINENUMBER
 688077086 50 11997996 50 L_QUANTITY
 688077083 37 11997996 939096 L_EXTENDEDPRICE
 688077076 11 11997996 11 L_DISCOUNT
 688077073 9 11997996 9 L_TAX
 688077066 3 11997996 3 L_RETURNFLAG
 688077063 2 11997996 2 L_LINESTATUS
 688077056 50 11997996 2526 L_SHIPDATE
 688077053 50 11997996 2466 L_COMMITDATE
 688077046 50 11997996 2553 L_RECEIPTDATE
 688077043 4 11997996 4 L_SHIPINSTRUCT
 688077036 7 11997996 7 L_SHIPMODE
 688077033 62 11997996 7098922 L_COMMENT
 688077026 8 11997996 8 "_SALT_"
 688077113 1 11997996 11997996 "_SALT_", L_SHIPDATE, L_ORDERKEY, L_LINENUMBER

--- SQL operation complete.

Changed in trafodion:
milestone: none → r1.0
Changed in trafodion:
milestone: r1.0 → r1.1
Changed in trafodion:
milestone: r1.1 → r2.0
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers