Cardinality=0 for all rows in a TokuDB table

Bug #1607300 reported by jocelyn fournier
Affects: Percona Server (moved to https://jira.percona.com/projects/PS), status tracked in 5.7
  Series 5.6: Fix Released, Importance: Critical, Assigned to: George Ormond Lorch III
  Series 5.7: Fix Released, Importance: Critical, Assigned to: George Ormond Lorch III

Bug Description

Hi,

On a table converted from InnoDB to TokuDB, which had been working properly for several days, I suddenly hit a strange problem where the cardinality of all the indexes (even the PK) changed to 0, with a dramatic impact on all the query plans (the server went down):

CREATE TABLE `sc_user` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `date_creation` timestamp NULL DEFAULT NULL,
  `date_validation` timestamp NULL DEFAULT NULL,
  `date_last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT ' ',
  `date_last_session` timestamp NULL DEFAULT NULL,
  `date_login_change` timestamp NULL DEFAULT NULL,
  `email` varchar(128) NOT NULL DEFAULT '',
  `username` varchar(20) DEFAULT '',
  `pass` varchar(60) DEFAULT '' COMMENT ' ',
  `first_name` varchar(30) DEFAULT NULL,
  `last_name` varchar(30) DEFAULT NULL,
  `displayed_name` varchar(30) DEFAULT NULL,
  `valid_email` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `email_validation_code` varchar(32) DEFAULT NULL,
  `accept_tos` tinyint(3) unsigned DEFAULT NULL,
  `accept_mailing` tinyint(3) unsigned DEFAULT NULL,
  `accept_mailing_partner` tinyint(3) unsigned DEFAULT NULL,
  `gender_id` tinyint(3) unsigned DEFAULT NULL,
  `date_of_birth` date DEFAULT NULL,
  `country_id` int(10) unsigned DEFAULT NULL,
  `zip_code_id` int(10) unsigned DEFAULT NULL,
  `zip` varchar(8) DEFAULT NULL,
  `state` tinyint(4) NOT NULL DEFAULT '0',
  `gen_unread_count` int(10) unsigned NOT NULL DEFAULT '0',
  `json` text,
  `json_avatar` varchar(500) DEFAULT NULL,
  `about` text,
  `alert_feed` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `alert_weekly_feed` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `alert_info` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `alert_follower` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `alert_badge` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `alert_newsletter` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `alert_advice` tinyint(3) NOT NULL DEFAULT '1',
  `alert_message` tinyint(3) NOT NULL DEFAULT '1',
  `alert_like` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `alert_relance` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `alert_forum_comment` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `url_www` varchar(200) DEFAULT NULL,
  `privacy_profile` tinyint(4) DEFAULT '0',
  `privacy_name` tinyint(4) DEFAULT '1',
  `privacy_gender` tinyint(4) DEFAULT '1',
  `privacy_facebook` tinyint(4) DEFAULT '0',
  `privacy_twitter` tinyint(4) DEFAULT '0',
  `privacy_gplus` tinyint(4) DEFAULT '1',
  `privacy_age` tinyint(4) DEFAULT '1',
  `privacy_avatar` tinyint(4) DEFAULT '1',
  `privacy_geo` tinyint(4) DEFAULT '1',
  `privacy_about` tinyint(4) DEFAULT '1',
  `show_welcome` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `show_tv_welcome` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `show_profile_type` tinyint(4) DEFAULT NULL,
  `show_homepage_type` tinyint(4) DEFAULT NULL,
  `show_tv_agenda` tinyint(4) unsigned NOT NULL DEFAULT '1',
  `cookie_ref` varchar(32) DEFAULT NULL,
  `affinity_ref` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `user_id_invitation` int(10) unsigned DEFAULT NULL,
  `password_reset_code` varchar(32) DEFAULT NULL,
  `date_password_reset` timestamp NULL DEFAULT NULL,
  `favorite_display` tinyint(3) unsigned NOT NULL DEFAULT '1',
  `api_id_referrer` int(3) unsigned DEFAULT NULL,
  `state_validation_collection_review` tinyint(4) NOT NULL DEFAULT '0',
  `state_validation_list` tinyint(4) NOT NULL DEFAULT '0',
  `state_validation_message` tinyint(4) NOT NULL DEFAULT '0',
  `state_validation_shout` tinyint(4) NOT NULL DEFAULT '0',
  `date_delete_init` timestamp NULL DEFAULT NULL,
  `date_delete` timestamp NULL DEFAULT NULL,
  `state_admin` int(11) DEFAULT NULL,
  `gen_scout_count` int(10) unsigned NOT NULL DEFAULT '0',
  `gen_follower_count` int(10) unsigned NOT NULL DEFAULT '0',
  `gen_collection_count` int(10) unsigned NOT NULL DEFAULT '0',
  `gen_review_count` int(10) unsigned NOT NULL DEFAULT '0',
  `gen_list_count` int(10) NOT NULL DEFAULT '0',
  `flag_gen` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `subtype_id` int(10) unsigned DEFAULT '15',
  `is_merge_notifications` tinyint(4) NOT NULL DEFAULT '0',
  `alert_privilege` tinyint(4) NOT NULL DEFAULT '0',
  `alert_agenda` tinyint(4) NOT NULL DEFAULT '1',
  `alert_agenda_email` tinyint(4) NOT NULL DEFAULT '1',
  `alert_contest` tinyint(4) unsigned NOT NULL DEFAULT '1',
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_user__email` (`email`) USING BTREE,
  KEY `idx_username` (`username`),
  KEY `idx_flag_gen` (`flag_gen`),
  KEY `state` (`state`,`gen_collection_count`),
  KEY `password_reset_code` (`password_reset_code`,`state`,`date_password_reset`),
  KEY `date_last_session` (`date_last_session`),
  KEY `idx_email_validation_code` (`email_validation_code`)
) ENGINE=TokuDB AUTO_INCREMENT=650327 DEFAULT CHARSET=utf8 `compression`='tokudb_small'

SHOW INDEX FROM sc_user;
+---------+------------+---------------------------+--------------+-----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+---------------------------+--------------+-----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| sc_user | 0 | PRIMARY | 1 | id | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 0 | idx_user__email | 1 | email | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | idx_username | 1 | username | A | 0 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | idx_flag_gen | 1 | flag_gen | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | state | 1 | state | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | state | 2 | gen_collection_count | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | password_reset_code | 1 | password_reset_code | A | 0 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | password_reset_code | 2 | state | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | password_reset_code | 3 | date_password_reset | A | 0 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | date_last_session | 1 | date_last_session | A | 0 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | idx_email_validation_code | 1 | email_validation_code | A | 0 | NULL | NULL | YES | BTREE | | |
+---------+------------+---------------------------+--------------+-----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
11 rows in set (0.00 sec)

MariaDB [sc_2]> ANALYZE TABLE sc_user;
+--------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+--------------+---------+----------+----------+
| sc_2.sc_user | analyze | status | OK |
+--------------+---------+----------+----------+
1 row in set (9.64 sec)

MariaDB [sc_2]> SHOW INDEX FROM sc_user;
+---------+------------+---------------------------+--------------+-----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+---------------------------+--------------+-----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| sc_user | 0 | PRIMARY | 1 | id | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 0 | idx_user__email | 1 | email | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | idx_username | 1 | username | A | 0 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | idx_flag_gen | 1 | flag_gen | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | state | 1 | state | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | state | 2 | gen_collection_count | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | password_reset_code | 1 | password_reset_code | A | 0 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | password_reset_code | 2 | state | A | 0 | NULL | NULL | | BTREE | | |
| sc_user | 1 | password_reset_code | 3 | date_password_reset | A | 0 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | date_last_session | 1 | date_last_session | A | 0 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | idx_email_validation_code | 1 | email_validation_code | A | 0 | NULL | NULL | YES | BTREE | | |
+---------+------------+---------------------------+--------------+-----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
11 rows in set (0.00 sec)

After switching it back to InnoDB, the issue is fixed:

ALTER TABLE sc_user ENGINE=InnoDB;
Query OK, 622834 rows affected (1 min 4.93 sec)
Records: 622834 Duplicates: 0 Warnings: 0

MariaDB [sc_2]> show index FROM sc_user;
+---------+------------+---------------------------+--------------+-----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+---------------------------+--------------+-----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| sc_user | 0 | PRIMARY | 1 | id | A | 542895 | NULL | NULL | | BTREE | | |
| sc_user | 0 | idx_user__email | 1 | email | A | 542895 | NULL | NULL | | BTREE | | |
| sc_user | 1 | idx_username | 1 | username | A | 542895 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | idx_flag_gen | 1 | flag_gen | A | 2 | NULL | NULL | | BTREE | | |
| sc_user | 1 | state | 1 | state | A | 4 | NULL | NULL | | BTREE | | |
| sc_user | 1 | state | 2 | gen_collection_count | A | 9201 | NULL | NULL | | BTREE | | |
| sc_user | 1 | password_reset_code | 1 | password_reset_code | A | 8899 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | password_reset_code | 2 | state | A | 8899 | NULL | NULL | | BTREE | | |
| sc_user | 1 | password_reset_code | 3 | date_password_reset | A | 9048 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | date_last_session | 1 | date_last_session | A | 542895 | NULL | NULL | YES | BTREE | | |
| sc_user | 1 | idx_email_validation_code | 1 | email_validation_code | A | 542895 | NULL | NULL | YES | BTREE | | |
+---------+------------+---------------------------+--------------+-----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
11 rows in set (0.00 sec)

Any idea what could have happened here? (tokudb_auto_analyze is off, and there is nothing suspicious in the error log.)
I'm using MariaDB 10.1.16 with TokuDB 5.6.30-76.3.
Unfortunately I didn't keep the original corrupted TokuDB table.
The server is a master, so no replication or RFR (read-free replication) is involved here.

Thanks and regards,
  Jocelyn Fournier

Tags: tokudb
Revision history for this message
jocelyn fournier (joce) wrote :

TokuDB-specific variables:

tokudb_pk_insert_mode=1
tokudb_cache_size=64G
tokudb_lock_timeout=120000
tokudb_row_format="tokudb_small"

Revision history for this message
jocelyn fournier (joce) wrote :

Hi,

I've just reproduced the same issue with another table. The bug showed up after running a manual ANALYZE TABLE on a TokuDB table:

ANALYZE TABLE sc_product;
+-----------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+-----------------+---------+----------+----------+
| sc_2.sc_product | analyze | status | OK |
+-----------------+---------+----------+----------+
1 row in set (20.01 sec)

MariaDB [sc_2]> SHOW INDEX FROM sc_product;
+------------+------------+---------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+---------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| sc_product | 0 | PRIMARY | 1 | id | A | 104 | NULL | NULL | | BTREE | | |
| sc_product | 0 | idx_id_big | 1 | id_big | A | 104 | NULL | NULL | YES | BTREE | | |
| sc_product | 1 | fk_subtype_id | 1 | subtype_id | A | 0 | NULL | NULL | | BTREE | | |
| sc_product | 1 | fk_parent_id | 1 | parent_id | A | 0 | NULL | NULL | YES | BTREE | | |
| sc_product | 1 | idx_state | 1 | state | A | 0 | NULL | NULL | | BTREE | | |
| sc_product | 1 | idx_flag | 1 | flag_gen | A | 0 | NULL | NULL | | BTREE | | |
| sc_product | 1 | idx_flag | 2 | gen_collection_count | A | 0 | NULL | NULL | | BTREE | | |
| sc_product | 1 | idx_flag | 3 | gen_type_id | A | 0 | NULL | NULL | YES | BTREE | | |
+------------+------------+---------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
8 rows in set (0.00 sec)

MariaDB [sc_2]> ANALYZE TABLE sc_product;
+-----------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+-----------------+---------+----------+----------+
| sc_2.sc_product | analyze | status | OK |
+-----------------+---------+----------+----------+
1 row in set (20.05 sec)

MariaDB [sc_2]> SHOW INDEX FROM sc_product;
+------------+------------+---------------+--------------+----------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq...


Revision history for this message
George Ormond Lorch III (gl-az) wrote :

Was this data stored within TokuDB prior to 5.6.27-76.0? If so, I suspect it is a row count error as described here: https://www.percona.com/doc/percona-server/5.6/tokudb/tokudb_background_analyze_table.html. TokuDB used to report the PerconaFT row counts, which were physical row counts rather than logical row counts; the two could differ by more than 100%, and the physical count could eventually reach 0, resulting in a cardinality that does not reflect the makeup of the index at all, possibly 0.

TokuDB follows the InnoDB model of reporting cardinality as a ratio of actual rows sampled to unique rows sampled, times a scaling percentage (50% by default). The server then takes this information, along with the reported row counts, and produces the index cardinality that you see in the "SHOW INDEX ..." results.
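
As a rough, illustrative sketch (not the exact server internals), the arithmetic looks roughly like this; it also shows why an engine-reported row count of 0 forces every non-unique index to a cardinality of 0. The 622834 figure is the row count from the ALTER TABLE output above; the sampling numbers are hypothetical:

-- Hypothetical sample: 1000 rows scanned on an index, 250 distinct key prefixes found,
-- so records-per-key is about 4; the server derives cardinality from the table row count.
SELECT
  622834 / (1000 / 250) AS cardinality_with_real_row_count,        -- about 155708
  0      / (1000 / 250) AS cardinality_when_engine_reports_0_rows; -- always 0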

The analysis starts from the left side of the index; if, halfway through the specified time limit, it is not yet halfway through the reported number of rows, it reverses direction and begins scanning from the right side of the index. This is to try to avoid skew from one-sided activity (deletions and/or insertions).

Can you also post the result of SHOW TABLE STATUS, as that might have some useful info.

Worst case, the brute-force approach would be to run a RECOUNT_ROWS ANALYZE on the table (which might take some time depending on the size of the table and how much garbage, i.e. unapplied messages, is in it), followed by a STANDARD ANALYZE with a reasonably long timeout, or even 0 for a full scan. But I would like to try to understand how it got here before blowing the evidence away.
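
For reference, a minimal sketch of that recount-plus-standard sequence, using the session variables that appear later in this thread (the table name is just the example from this report):

-- 1) Recount the logical rows (full logical scan, can be slow on big tables).
SET SESSION tokudb_analyze_mode = tokudb_analyze_recount_rows;
SET SESSION tokudb_analyze_throttle = 0;   -- 0 = no throttling
ANALYZE TABLE sc_2.sc_user;

-- 2) Recalculate cardinality with a long (or unlimited) time budget.
SET SESSION tokudb_analyze_mode = tokudb_analyze_standard;
SET SESSION tokudb_analyze_time = 0;       -- 0 = no time limit, full scan
ANALYZE TABLE sc_2.sc_user;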

Revision history for this message
jocelyn fournier (joce) wrote :

Hi George,

The data was imported with TokuDB 5.6.26-74.0 (MariaDB 10.1.14), and the server is now running 5.6.30-76.3 (MariaDB 10.1.16). However, the sc_user table was converted from InnoDB to TokuDB *after* the switch to 10.1.16.

The output of SHOW TABLE STATUS:

MariaDB [sc_2]> SHOW TABLE STATUS LIKE 'sc_product';
+------------+--------+---------+------------+----------+----------------+-------------+---------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+------------------------------+---------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+------------+--------+---------+------------+----------+----------------+-------------+---------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+------------------------------+---------+
| sc_product | TokuDB | 10 | Dynamic | 14826321 | 64 | 958031469 | 9223372036854775807 | 585093367 | 88915968 | 22152072 | 2016-07-28 16:16:53 | 2016-07-28 17:32:17 | NULL | utf8_general_ci | NULL | `compression`='tokudb_small' | |
+------------+--------+---------+------------+----------+----------------+-------------+---------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+------------------------------+---------+
1 row in set (0.00 sec)

MariaDB [sc_2]> SHOW TABLE STATUS LIKE 'sc_product_corrupted';
+----------------------+--------+---------+------------+------+----------------+-------------+---------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+------------------------------+---------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+----------------------+--------+---------+------------+------+----------------+-------------+---------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+------------------------------+---------+
| sc_product_corrupted | TokuDB | 10 | Dynamic | 0 | 0 | 972855326 | 9223372036854775807 | 586033005 | 424992768 | 22150655 | 2016-07-19 15:21:43 | 2016-07-28 16:30:09 | NULL | utf8_general_ci | NULL | `compression`='tokudb_small' | |
+----------------------+--------+---------+------------+------+----------------+-------------+---------------------+--------------+-----------+----------------+---------------------+---------------------+------------+---------...


Revision history for this message
George Ormond Lorch III (gl-az) wrote :

Hmm, OK, that is interesting. I wonder if the conversion path ("ALTER TABLE ... ENGINE=TOKUDB") misses something compared to a plain "CREATE TABLE ... ENGINE=TOKUDB; INSERT ...".

We can see clearly that the bad table thinks it has 0 rows, so any non-unique index is automatically going to have a cardinality of 0. Now, how did we get here :-\

Revision history for this message
jocelyn fournier (joce) wrote :

For sc_product_corrupted, I reproduced the issue with a table imported directly into TokuDB, without any TokuDB <-> InnoDB conversion.
Before the ANALYZE TABLE, the cardinality was correct.
After the first ANALYZE, only the PK and the UNIQUE key had a correct cardinality (140); all the other indexes were set to 0.
After the second ANALYZE, every index has a cardinality of 0 :(

Revision history for this message
jocelyn fournier (joce) wrote :

Note that for sc_product I had already run an ANALYZE a week earlier without any issue.

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

The key here is that the table row count is 0. This means that all non-unique keys will always report a cardinality of 0.

Since we know that this data was all created/originated within the table post 5.6.27-76.0, we know it is using the logical FT row count instead of the physical FT row count.

So the real question here is why does the FT report 0 logical rows on the PK?

My theory is that some INSERT/DELETE operation(s) are being either over or under counted.

It is possible, though, that there are many deferred operations sitting up in the tree that only 'correct' the logical count once the messages are physically applied to the leaf entries (rows), leaving the logical count inaccurate in the meantime. But this count is a very long way off, and it is hard to believe that there would be that many deferred corrections sitting in the tree.

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

https://tokutek.atlassian.net/browse/DB-1006 added, continuing analysis over there.

Revision history for this message
jocelyn fournier (joce) wrote :

But why did the first ANALYZE TABLE half-corrupt the table, and the second one finish corrupting it (the second ANALYZE was executed a few seconds after the first one)? Moreover, sc_user is the kind of table that changes very often.

FYI, here is the behaviour of the server while the ANALYZE was running:

| 88 | root | localhost | sc_2 | Query | 9 | | ANALYZE TABLE sc_product | 0.000 |
| 145 | repl | 172.16.4.3:49541 | NULL | Binlog Dump | 475 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL | 0.000 |
| 218 | www-prod | sql-master:35879 | NULL | Sleep | 48 | | NULL | 0.000 |
| 335 | root | localhost | NULL | Query | 0 | init | show processlist | 0.000 |
| 899 | www-prod | cron-02.sc.lan:29654 | sc_2 | Sleep | 0 | | NULL | 0.000 |
| 903 | www-prod | www-14.sc.lan:46114 | sc_2 | Query | 0 | Waiting for table level lock | UPDATE `sc_product` SET `id` = 22076396, `id_big` = NULL, `parent_id` = NULL, `gen_year` = NULL, `pr | 0.000 |
| 905 | www-prod | www-13.sc.lan:16006 | sc_2 | Sleep | 0 | | NULL | 0.000 |
| 910 | www-prod | www-11.sc.lan:21556 | sc_2 | Query | 1 | Waiting for table level lock | UPDATE `sc_product` SET `id` = 22101633, `id_big` = NULL, `parent_id` = NULL, `gen_year` = NULL, `pr | 0.000 |
| 911 | www-prod | www-12.sc.lan:28156 | sc_2 | Query | 0 | Waiting for table level lock | UPDATE `sc_product` SET `id` = 22074921, `id_big` = NULL, `parent_id` = NULL, `gen_year` = NULL, `pr | 0.000 |
| 916 | www-prod | www-10.sc.lan:42513 | sc_2 | Sleep | 1 | | NULL | 0.000 |
| 1652 | www-prod | cron-01.sc.lan:23003 | sc_2 | Query | 7 | Waiting for table level lock | INSERT INTO `sc_product` (`subtype_id`, `label`, `state`, `flag_gen`, `date_last_update`, `gen_colle | 0.000 |
| 2976 | www-prod | www-14.sc.lan:61886...

Revision history for this message
fimbulvetr (fimbulvetr) wrote :

I have just run into this problem as well, although my circumstances are different. I never came from or used InnoDB in the life of this table; it was populated by LOAD DATA INFILE. There are 400M records on this slave, which originally started as PS 5.7.12. I upgraded it rather recently to 5.7.13 due to the wrong-result-in-a-LEFT-JOIN bug.

Cardinality is showing 0 for all non-unique indexes, and ANALYZE TABLE does not recover the correct stats. SHOW TABLE STATUS says 0 rows; however, there are 400M rows in the table.
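
One simple way to see the mismatch described here is to compare the engine's row estimate with an actual count; this is just a sketch, with placeholder schema and table names:

-- Engine-reported estimate (this is what feeds cardinality; shows 0 in the bad state).
SELECT TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mydb' AND TABLE_NAME = 'mytable';

-- Actual logical row count (slow on a 400M-row table, but authoritative).
SELECT COUNT(*) FROM mydb.mytable;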

Revision history for this message
fimbulvetr (fimbulvetr) wrote :

Also, I should note that I never run INSERT IGNORE. Another thing of interest is that this is a slave, and on the master I am currently running 'DELETE FROM table WHERE primary_key_id IN (...)'.

There is one query at a time, deleting 50 records per query. The amount deleted so far is about 0.1% of the rows in the table, and the job is about 1/10th done (400,000 records have been deleted).

The master is running PS 5.6.31-77.0-log and does not exhibit these symptoms.

Revision history for this message
fimbulvetr (fimbulvetr) wrote :

FYI I followed the steps from https://bugs.launchpad.net/percona-server/+bug/1596055/comments/1 and it appears to have the correct row count now.

The slave will continue to receive deletes from the master. I will report here if this happens again, unless the fix is to have auto-analyze on.

If that's the case, I don't believe auto-analyze is a very good feature, albeit it does at least work. Many of my tables change at drastically different rates relative to their current row count than this one does. I don't think any threshold I could work out for table_delta >= (table_rows * tokudb_auto_analyze) would apply well to the tables subject to churn. Still, I will continue to experiment; perhaps I am incorrect about the utility of auto-analyze.
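
To make that concern concrete, here is purely illustrative arithmetic, assuming tokudb_auto_analyze is a percentage of the current row count (example value 30), so the re-analysis trigger scales with table size:

-- With tokudb_auto_analyze = 30 (percent), the approximate change threshold is:
SELECT
  1000000   * 30 / 100 AS delta_for_1m_row_table,    -- 300,000 changed rows
  400000000 * 30 / 100 AS delta_for_400m_row_table;  -- 120,000,000 changed rows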

Revision history for this message
fimbulvetr (fimbulvetr) wrote :

So after further thought, I do have to call into question the provenance of this table. It's possible that it was created around 5.6.27-76.0. Honestly I don't quite remember, as I have dozens of servers running TokuDB and I'm in the middle of upgrading from the legacy Tokutek MariaDB builds (5.5.40-MariaDB-36.1), and this table could have been dumped and reloaded into PS 5.6.x.

Is there any way you can tell me how to verify this? Is it possible via tokudb_debug or by inspecting an offset in the .frm file? Is a table version number stored anywhere? Does an ALTER TABLE ... ENGINE=TokuDB correct it?

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

fimbulvetr, thanks for the input. I am not clear on how your use case trips this (INSERT IGNORE, REPLACE INTO, or read-free replication), but the core of the issue is that the logical row count goes wildly inaccurate, down to 0. The server will consider a non-unique index on a table that reports 0 rows to always have a cardinality of 0, regardless of what the storage engine reports to the server.

If/when that is fixed, auto-analyze should then work just fine to maintain correct cardinality counts for your data. The auto-analyze feature is only really needed for workloads that tend to change the balance of key uniqueness over the lifetime of a table. If the ratio of unique to duplicate keys (a.k.a. records per key) stays relatively the same for a given index, then analyzing it once to set the cardinality should be enough. The feature was modeled on a similar concept in InnoDB: https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_stats_auto_recalc
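
As a hedged sketch of what enabling the feature might look like (variable names are from the Percona documentation linked earlier; the values are just examples, not recommendations):

-- Re-analyze a table automatically once changed rows exceed ~30% of its current row count,
-- and run the analysis in a background thread rather than blocking the triggering statement.
SET GLOBAL tokudb_auto_analyze = 30;
SET GLOBAL tokudb_analyze_in_background = 1;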

Revision history for this message
fimbulvetr (fimbulvetr) wrote :

Thanks George. This slave is indeed using RFR, but not INSERT IGNORE and not REPLACE INTO (of this I am confident) for this particular table.

So we'll work on the issue of how the logical count got to 0, and then I will explore the usage of auto-analyze.

This table is typically insert/update-only, but from time to time there are deletes of less than 5% of the data. This is one of those times, and it is the first time the delete job has been run using PS 5.6 or PS 5.7; all previous runs were on the old Tokutek builds of patched MariaDB (Toku 7.5.x).

There are two slaves this happened on; one is an identical AWS snapshot clone of the other, and they differ only in hostname and server-id. They had been running fine with no logical row count issues until this manual cleanup process was started, and then within a few days they reflected a row count of 0. After analyze_row_count, they are back to the correct numbers. I am continuing the cleanup job and will report back if this appears again.

Revision history for this message
fimbulvetr (fimbulvetr) wrote :

Ok, here's my follow up.

Around my last response, I had 210,000,000 records in my table. I continued with my 'slow delete' process, roughly 30-50 rows per second for the last 6 days.

Please note this is entirely unscientific.

I've gone from 210,000,000 to 188,396,007 actual rows.

Here are my SHOW TABLE STATUS row counts:

PS 5.7.13-6-log - 150,815,123
PS 5.6.31-77.0-log - 185,383,998
5.5.41-tokudb-7.5.5-MariaDB-log - 193,416,244

The most accurate is 5.6. I suspect 5.5.41 could have been more accurate, but the problem is that I did not run an ANALYZE on it 6 days ago, so its drift has accrued over a much larger time frame. My suspicion is that it would be much more accurate had I done so, and that for *my* data churn model the old-style row estimations (Toku 7.5.5) were much more accurate.

During this time, no tables had an analyze performed on them, and auto_analyze was off.

At this rate, I expect my PS 5.7.13-6-log to hit an estimated row count of 0 well before the table actually has 0 rows.

I personally would consider this a bug, with auto_analyze being a just-barely-acceptable workaround, as this particular workload is infrequent.

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

fimbulvetr, there _is_ a bug in the logical row counting that, depending specifically on your workload, can cause the row counts to either increase forever or decrease forever away from the actual target count. Basic table analysis does not correct this; it only scans and calculates cardinality. That scanning does, however, cause node reads and message application, which in turn can affect the logical row count. A fix for the issues in the logical row counting is just about ready to merge.

Setting tokudb_analyze_mode=tokudb_analyze_recount_rows and running ANALYZE TABLE will logically scan all rows in the table, count them up, and set the internally persisted logical row counter to the correct value. In this mode it will not do a cardinality calculation (that is tokudb_analyze_mode=tokudb_analyze_standard). If the workload that instigates this issue continues, the counter will go inaccurate again.

The older physical row counting would cause many other issues for other use cases, again due to the deferred nature of work in the Fractal Tree messaging system.

Revision history for this message
Ross E Carver (recarv) wrote :

Hi, I've been following this bug and have implemented several of the temporary fixes, including upgrading the server (currently PS 5.7.16-10-log), which should include the fix.

I have a table with around 30M rows that are updated often: a high turnover of 50+% of table rows from updates (no deletes) on a weekly or bi-weekly basis. We have other tables with append-only behavior totaling over 1B rows without problems.

This is a long bug thread, so I'm hoping to get some requests for new diagnostic info before I try to alleviate the problem and destroy any possible evidence in doing so. I'd post this as a new bug, but it relates to this one because the temporary fixes and the behavior are the same.

Please let me know what further evidence you'd like for the investigation; I can sustain the degraded performance until 12/31 (thankfully, due to the holidays).

Revision history for this message
Ross E Carver (recarv) wrote :

Here is a screenshot of (an outdated) MySQL Administrator showing the zero-row behavior. I'm happy to post whatever else is necessary.

Revision history for this message
Ross E Carver (recarv) wrote :

And following the commands mentioned by fimbulvetr in July of 2016 above, I've pasted the results below:

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SET SESSION TOKUDB_ANALYZE_IN_BACKGROUND=1;
Query OK, 0 rows affected (0.02 sec)

mysql> SET SESSION TOKUDB_ANALYZE_MODE=TOKUDB_ANALYZE_RECOUNT_ROWS;
Query OK, 0 rows affected (0.00 sec)

mysql> SET SESSION TOKUDB_ANALYZE_THROTTLE=0;
Query OK, 0 rows affected (0.00 sec)

mysql> ANALYZE TABLE retail.lw_item_discovery;
+--------------------------+---------+----------+------------------+
| Table | Op | Msg_type | Msg_text |
+--------------------------+---------+----------+------------------+
| retail.lw_item_discovery | analyze | status | Operation failed |
+--------------------------+---------+----------+------------------+
1 row in set (0.01 sec)

mysql> SET SESSION TOKUDB_ANALYZE_MODE=TOKUDB_ANALYZE_STANDARD;
Query OK, 0 rows affected (0.00 sec)

mysql> SET SESSION TOKUDB_ANALYZE_THROTTLE=0;
Query OK, 0 rows affected (0.00 sec)

mysql> SET SESSION TOKUDB_ANALYZE_TIME=0;
Query OK, 0 rows affected (0.00 sec)

mysql> ANALYZE TABLE retail.lw_item_discovery;

-- I'll post the result when finished.

Revision history for this message
Sly Ripper (sly-ripper) wrote :
Revision history for this message
Emil Makariev (emakariev) wrote :

Hello,
I see there is a fix released, but with version 5.7.18-16 I'm experiencing the same problem.

Revision history for this message
Phil Sweeney (3-launchpa9-9) wrote :

Anyone still having row count issues after 5.7.18-16 (or 5.6.37-82.2), please file a new bug - this one is closed.

Revision history for this message
Phil Sweeney (3-launchpa9-9) wrote :

Also note there's a proposed fix for a bug around row counts being lost on hot index creation here:

https://github.com/percona/PerconaFT/pull/379
(bug tracked in https://jira.percona.com/browse/TDB-35)

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-391
