Optimizer uses index access instead of FTS

Bug #1733520 reported by Nickolay Malkov
This bug report is a duplicate of:  Bug #1484311: Wrong cardinality for InnoDB table. Edit Remove
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
New
Undecided
Unassigned

Bug Description

Observed on percona-server 5.7.18, 5.7.19

Issue: if some indexed column contains only a few values and some value takes almost whole table optimizer uses index access despite the fact that FTS will be more preferable.

Example:

CREATE TABLE `t1` (
  `id` SERIAL,
  `environment` tinyint(3) unsigned DEFAULT NULL,
  `status` enum('Y','N') DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `env_idx` (`environment`)
) ENGINE=InnoDB;

mysql> select environment, count(*) from t1 group by environment;
+-------------+----------+
| environment | count(*) |
+-------------+----------+
| 1 | 1000000 |
| 2 | 100 |
+-------------+----------+

mysql> analyze table t1;

mysql> select count(*) from t1 where environment = 1 and status = 'Y';
+----------+
| count(*) |
+----------+
| 900000 |
+----------+
1 row in set (12.70 sec)

If I disable index with hint it uses FTS and selects data much faster:

mysql> select count(*) from t1 ignore index (env_idx) where environment = 1 and status = 'Y';
+----------+
| count(*) |
+----------+
| 900000 |
+----------+
1 row in set (0.61 sec)

Explain shows that optimizer thinks that there are only 499292 rows with environment = 1:

mysql> explain format=json select count(*) from t1 where environment = 1 and status = 'Y';
"query_cost": "105726.40"
"access_type": "ref",
"key": "env_idx",
"rows_examined_per_scan": 499292,
"read_cost": "5868.00",
"eval_cost": "49929.20"

So the total cost is cheaper then FTS:

mysql> explain format=json select count(*) from t1 ignore index (env_idx) where environment = 1 and status = 'Y';
"query_cost": "201673.00"
"access_type": "ALL",
"rows_examined_per_scan": 998585
"read_cost": "101814.50"
"eval_cost": "99858.50"

I've found that such incorrect number is set in the code:
https://github.com/percona/percona-server/blob/5.7/storage/innobase/btr/btr0cur.cc#L5934
/* Do not estimate the number of rows in the range
to over 1 / 2 of the estimated rows in the whole
table */

I tried to find the reason of it but with no luck - it was added somewhere in 4.1 so the commit history is unavailable.

When I removed these lines with the attached patch it fixed observed issue:
mysql> explain format=json select count(*) from t1 where environment = 1 and status = 'Y';
"query_cost": "201673.00"
"access_type": "ALL",
"rows_examined_per_scan": 998585,
"read_cost": "101814.50",
"eval_cost": "99858.50",

Costs of index access in this case:
"query_cost": "413256.40"
"access_type": "ref",
"key": "env_idx",
"rows_examined_per_scan": 2036942,
"read_cost": "5868.00",
"eval_cost": "203694.20",

Tags: contribution
Revision history for this message
Nickolay Malkov (darkmind) wrote :
tags: added: contribution
Revision history for this message
Sveta Smirnova (svetasmirnova) wrote :

Thank you for the report.

This is duplicate of bug #1484311 which was fixed in version 5.7

Revision history for this message
Nickolay Malkov (darkmind) wrote :

Hi Sveta,

> This is duplicate of bug #1484311 which was fixed in version 5.7

It is strange because I've checked it on 5.7.18 and 5.7.19. Also pls pay attention that mentioned code still exists in 5.7 branch file innobase/btr/btr0cur.cc#L5934 (not ha_innodb.cc).

mysql> select version() from dual;
+-----------+
| version() |
+-----------+
| 5.7.18-14 |
+-----------+

I've checked that mentioned issue from bug #1484311 doesn't exist (as expected) but exists the one from this report.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.