2.7x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21

Bug #686353 reported by Philip Stoev
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
New
Undecided
Igor Babaev

Bug Description

The following query against the DBT-3 dataset

SELECT count(*) FROM lineitem WHERE ( l_receiptDATE BETWEEN '1992-11-01' AND '1992-11-29' ) AND l_shipdate < '1992-11-15';

is 10x times slower when executed with sort_intersection.

Igor says that this is " not interesting because lineitem is naturally ordered by l_receiptDATE and l_receiptDATE is strongly correlated with l_shipdate" , " in the case when indexes are correlated intersection causes degradation: this is a well know fact.".

At the same time, the data set is supposedly standard and in real life, almost all date-based indexes will be naturally ordered and correlated to the PK.

Related branches

Changed in maria:
milestone: none → 5.1
assignee: nobody → Igor Babaev (igorb-seattle)
summary: - 10x performance regression with correlated indexes
+ 10x performance regression with correlated indexes and sort_intersection
+ in maria-5.1-wl21
Revision history for this message
Philip Stoev (philip-stoev) wrote : Re: 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21

1. Data volume (DBT-3 factor)
2. Database engine MyISAM
3. Indexes used

           id: 1
  select_type: SIMPLE
        table: lineitem
         type: range
possible_keys: i_l_shipdate,i_l_receiptdate
          key: i_l_receiptdate,i_l_shipdate
      key_len: 4,4
          ref: NULL
         rows: 414
        Extra: Using sort_intersect(i_l_receiptdate,i_l_shipdate); Using where

4. Platform.
Linux eve 2.6.33.3-85.fc13.i686.PAE #1 SMP Thu May 6 18:27:11 UTC 2010 i686 i686 i386 GNU/Linux
Fedora release 13 (Goddard)

5. Execution time

in maria-5.1-wl21 - 0.27 sec , Handler_read_next 72166 , Handler_read_rnd 6296
in maria-5.1 - 0.10 sec, Handler_read_next 7133

Revision history for this message
Philip Stoev (philip-stoev) wrote :

DBT3 factor 0.1

Revision history for this message
Arjen Lentz (arjen-lentz) wrote : Re: [Bug 686353] [NEW] 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21

Hi Philip, Igor, all

On 07/12/2010, at 5:45 PM, Philip Stoev wrote:
> The following query against the DBT-3 dataset
>
> SELECT count(*) FROM lineitem WHERE ( l_receiptDATE BETWEEN
> '1992-11-01'
> AND '1992-11-29' ) AND l_shipdate < '1992-11-15';
>
> is 10x times slower when executed with sort_intersection.
>
> Igor says that this is " not interesting because lineitem is naturally
> ordered by l_receiptDATE and l_receiptDATE is strongly correlated
> with
> l_shipdate" , " in the case when indexes are correlated intersection
> causes degradation: this is a well know fact.".

I don't care very much for the specific scenario, I think the main
issue is performance regression.
New features or changes should not make existing/old features slower -
that has been one of the fundamental rules that Monty has worked with
since the start as a guideline for development.

Regards,
Arjen.
--
Arjen Lentz, Exec.Director @ Open Query (http://openquery.com)
Remote expertise & maintenance for MySQL/MariaDB server environments.

Follow us at http://openquery.com/blog/ & http://twitter.com/openquery

Revision history for this message
Philip Stoev (philip-stoev) wrote : Re: 10x performance regression with correlated indexes and sort_intersection in maria-5.1-wl21
summary: - 10x performance regression with correlated indexes and sort_intersection
- in maria-5.1-wl21
+ 2.7x performance regression with correlated indexes and
+ sort_intersection in maria-5.1-wl21
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.