Comment 4 for bug 1671152

George Ormond Lorch III (gl-az) wrote :

So we had an extended offline discussion and some debugging and it comes down to a few things.

1) Logical row counting in PerconaFT is a moving target: some operations can be counted at the time they occur, while others cannot be known until the message is persistently delivered to the leaf nodes. The count 'floats' toward accuracy but will never be 100% accurate except at the moment after either an OPTIMIZE TABLE or an ANALYZE TABLE with RECOUNT_ROWS.
2) Different indices (PK and secondaries) on the same table will report different logical row count estimates simply due to the dynamics of the PerconaFT messaging scheme: different record sizes, data concentration within nodes, background threads randomly picking nodes to perform work on, and so on all affect the 'in-flight' messaging, which means no two indices will ever report identical counts.
3) When an OPTIMIZE TABLE is run, it flushes all messages to the leaf nodes and removes any garbage from all of the index trees, which as a side effect normalizes the row counts across the indices.
4) There is a bug (I call it a bug because the behavior differs from InnoDB's) in TokuDB's ha_tokudb::records_in_range method when the optimizer asks for the number of records in the range -infinity to +infinity. TokuDB returns the logical row count for the queried index rather than for the table/PK, which tends to be more accurate. Because those per-index counts can differ drastically, the optimizer ends up calculating different costs for different indices. InnoDB, by contrast, returns the same 'table count' for all indices. The fix should be to have TokuDB return the logical row count value from the PK for all indices, so that the optimizer can compare apples to apples w.r.t. cardinality statistics rather than apples to Volkswagens.