Optimizer uses index access instead of FTS
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Percona Server moved to https://jira.percona.com/projects/PS |
New
|
Undecided
|
Unassigned |
Bug Description
Observed on percona-server 5.7.18, 5.7.19
Issue: if some indexed column contains only a few values and some value takes almost whole table optimizer uses index access despite the fact that FTS will be more preferable.
Example:
CREATE TABLE `t1` (
`id` SERIAL,
`environment` tinyint(3) unsigned DEFAULT NULL,
`status` enum('Y','N') DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `env_idx` (`environment`)
) ENGINE=InnoDB;
mysql> select environment, count(*) from t1 group by environment;
+------
| environment | count(*) |
+------
| 1 | 1000000 |
| 2 | 100 |
+------
mysql> analyze table t1;
mysql> select count(*) from t1 where environment = 1 and status = 'Y';
+----------+
| count(*) |
+----------+
| 900000 |
+----------+
1 row in set (12.70 sec)
If I disable index with hint it uses FTS and selects data much faster:
mysql> select count(*) from t1 ignore index (env_idx) where environment = 1 and status = 'Y';
+----------+
| count(*) |
+----------+
| 900000 |
+----------+
1 row in set (0.61 sec)
Explain shows that optimizer thinks that there are only 499292 rows with environment = 1:
mysql> explain format=json select count(*) from t1 where environment = 1 and status = 'Y';
"query_cost": "105726.40"
"access_type": "ref",
"key": "env_idx",
"rows_examined_
"read_cost": "5868.00",
"eval_cost": "49929.20"
So the total cost is cheaper then FTS:
mysql> explain format=json select count(*) from t1 ignore index (env_idx) where environment = 1 and status = 'Y';
"query_cost": "201673.00"
"access_type": "ALL",
"rows_examined_
"read_cost": "101814.50"
"eval_cost": "99858.50"
I've found that such incorrect number is set in the code:
https:/
/* Do not estimate the number of rows in the range
to over 1 / 2 of the estimated rows in the whole
table */
I tried to find the reason of it but with no luck - it was added somewhere in 4.1 so the commit history is unavailable.
When I removed these lines with the attached patch it fixed observed issue:
mysql> explain format=json select count(*) from t1 where environment = 1 and status = 'Y';
"query_cost": "201673.00"
"access_type": "ALL",
"rows_examined_
"read_cost": "101814.50",
"eval_cost": "99858.50",
Costs of index access in this case:
"query_cost": "413256.40"
"access_type": "ref",
"key": "env_idx",
"rows_examined_
"read_cost": "5868.00",
"eval_cost": "203694.20",
tags: | added: contribution |
Thank you for the report.
This is duplicate of bug #1484311 which was fixed in version 5.7