However, I must confess that I may not know how to get table_statistics to recognize v, v2, v3 and v4 as different tables in the output. Mine always show only v on table_statistics, but the explain output does recognize them as different tables. I am using:
SELECT count(*)
FROM v
LEFT JOIN c
ON v.cid = c.cid
WHERE v.t >= '2012-01-31 05:00:00'
AND v.t <= '2012-02-08 07:59:59'
AND v.did = '208'
AND c.pid = '3124'
AND v.cid IN
( SELECT c.cid
FROM c
WHERE c.pid = '3124'
AND c.s = 0
AND
( ( c.cid IN ( /*Inner query 1*/ SELECT v2.cid FROM v as v2 WHERE v2.did = 208 AND v2.t >= '2012-01-31 05:00:00' AND v2.t <= '2012-02-08 07:59:59' ) ) AND ( c.cid IN ( /*Inner query 2*/ SELECT v3.cid FROM v as v3 WHERE v3.did = 208 AND v3.t >= '2012-01-31 05:00:00' AND v3.t <= '2012-02-08 07:59:59' ) ) AND ( c.cid IN ( /*Inner query 3*/ SELECT v4.cid FROM v as v4 WHERE v4.did = 208 AND v4.t >= '2012-01-31 05:00:00' AND v4.t <= '2012-02-08 07:59:59' ) )
)
);
I had noticed that each v (v, v2, v3, v4) in your statistics reads roughly 14x as many rows as the previous. I was trying to see if a v5 would read ~14*837280.
That is an interesting pattern and may lead us to the culprit. I have a debug build compiled, if you can paste in the commands I need to start mysql with to get the relevant debugging for the optimizer, I can have a look at the source and better understand the problem.
Hi Sergey,
That is a very interesting last result you posted. I just compiled 5.5.25 and attempted to reproduce the SHOW TABLE_STATISTICS like this:
+------ ------- -+----- ------- +------ -----+- ------- ------+ ------- ------- ------- ----+ x_#indexes | ------- -+----- ------- +------ -----+- ------- ------+ ------- ------- ------- ----+ ------- -+----- ------- +------ -----+- ------- ------+ ------- ------- ------- ----+
| Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_
+------
| bug929732 | v | 276 | 0 | 0 |
| bug929732 | v2 | 3994 | 0 | 0 |
| bug929732 | v3 | 53506 | 0 | 0 |
| bug929732 | v4 | 837280 | 0 | 0 |
| bug929732 | c | 498 | 0 | 0 |
+------
However, I must confess that I may not know how to get table_statistics to recognize v, v2, v3 and v4 as different tables in the output. Mine always show only v on table_statistics, but the explain output does recognize them as different tables. I am using:
SELECT count(*)
(
c. cid IN
( /*Inner query 1*/ SELECT v2.cid
FROM v as v2
WHERE v2.did = 208
AND v2.t >= '2012-01-31 05:00:00'
AND v2.t <= '2012-02-08 07:59:59'
)
)
AND
(
c. cid IN
( /*Inner query 2*/ SELECT v3.cid
FROM v as v3
WHERE v3.did = 208
AND v3.t >= '2012-01-31 05:00:00'
AND v3.t <= '2012-02-08 07:59:59'
)
)
AND
(
c. cid IN
( /*Inner query 3*/ SELECT v4.cid
FROM v as v4
WHERE v4.did = 208
AND v4.t >= '2012-01-31 05:00:00'
AND v4.t <= '2012-02-08 07:59:59'
)
)
FROM v
LEFT JOIN c
ON v.cid = c.cid
WHERE v.t >= '2012-01-31 05:00:00'
AND v.t <= '2012-02-08 07:59:59'
AND v.did = '208'
AND c.pid = '3124'
AND v.cid IN
( SELECT c.cid
FROM c
WHERE c.pid = '3124'
AND c.s = 0
AND
(
)
);
I had noticed that each v (v, v2, v3, v4) in your statistics reads roughly 14x as many rows as the previous. I was trying to see if a v5 would read ~14*837280.
That is an interesting pattern and may lead us to the culprit. I have a debug build compiled, if you can paste in the commands I need to start mysql with to get the relevant debugging for the optimizer, I can have a look at the source and better understand the problem.