KEY partitioning results in uneven data distribution - bad hash algorithm

Bug #1348295 reported by Swany
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server moved to https://jira.percona.com/projects/PS
Won't Fix
Medium
Unassigned
5.5
Won't Fix
Medium
Unassigned
5.6
Won't Fix
Medium
Unassigned
5.7
Won't Fix
Medium
Unassigned

Bug Description

UPSTREAM BUG #72428, #65112

When using a DATE with KEY partitioning, the partition distribution is skewed and only certain partitions are filled.

Following is a simple test case from #72428:

create table t1(c1 int, c2 date) partition by key(c2) partitions 8;
insert into t1 values (1, '2014-04-22');
insert into t1 values (1, '2014-04-23');
insert into t1 values (1, '2014-04-24');
insert into t1 values (1, '2014-04-25');
insert into t1 values (1, '2014-04-26');
insert into t1 values (1, '2014-04-27');
insert into t1 values (1, '2014-04-28');
insert into t1 values (1, '2014-04-29');
insert into t1 values (1, '2014-04-30');

Notice how most of the partitions have zero rows!

select partition_name, table_rows from information_schema.partitions where table_name='t1'\G
*************************** 1. row ***************************
partition_name: p0
    table_rows: 3
*************************** 2. row ***************************
partition_name: p1
    table_rows: 0
*************************** 3. row ***************************
partition_name: p2
    table_rows: 0
*************************** 4. row ***************************
partition_name: p3
    table_rows: 0
*************************** 5. row ***************************
partition_name: p4
    table_rows: 2
*************************** 6. row ***************************
partition_name: p5
    table_rows: 0
*************************** 7. row ***************************
partition_name: p6
    table_rows: 4
*************************** 8. row ***************************
partition_name: p7
    table_rows: 0
8 rows in set (0.00 sec)

There is a note in the upstream bug that this requires changing the HASH algorithm that KEY partitioning uses to something like MD5 like MySQL cluster uses. Of course this will break all KEY partitioned tables in the database so perhaps a new KEY partitioning method ( PARTITION BY MD5KEY or somethjing like that) may be needed.

Swany (greenlion)
description: updated
tags: added: upstream
Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

This is easy to confirm:

openxs@ao756:~$ mysql -uroot -proot test
Warning: Using a password on the command line interface can be insecure.
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 55
Server version: 5.6.19-67.0 Percona Server (GPL), Release 67.0, Revision 618

Copyright (c) 2009-2014 Percona LLC and/or its affiliates
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table t1(c1 int, c2 date) partition by key(c2) partitions 8;
Query OK, 0 rows affected (2,33 sec)

mysql> insert into t1 values (1, '2014-04-22');
Query OK, 1 row affected (0,05 sec)

mysql> insert into t1 values (1, '2014-04-23');
Query OK, 1 row affected (0,06 sec)

mysql> insert into t1 values (1, '2014-04-24');
Query OK, 1 row affected (0,05 sec)

mysql> insert into t1 values (1, '2014-04-25');
Query OK, 1 row affected (0,06 sec)

mysql> insert into t1 values (1, '2014-04-26');
Query OK, 1 row affected (0,16 sec)

mysql> insert into t1 values (1, '2014-04-27');
Query OK, 1 row affected (0,14 sec)

mysql> insert into t1 values (1, '2014-04-28');
Query OK, 1 row affected (0,04 sec)

mysql> insert into t1 values (1, '2014-04-29');
Query OK, 1 row affected (0,14 sec)

mysql> insert into t1 values (1, '2014-04-30');
Query OK, 1 row affected (0,04 sec)

mysql> select partition_name, table_rows from information_schema.partitions where table_name='t1'\G
*************************** 1. row ***************************
partition_name: p0
    table_rows: 3
*************************** 2. row ***************************
partition_name: p1
    table_rows: 0
*************************** 3. row ***************************
partition_name: p2
    table_rows: 0
*************************** 4. row ***************************
partition_name: p3
    table_rows: 0
*************************** 5. row ***************************
partition_name: p4
    table_rows: 2
*************************** 6. row ***************************
partition_name: p5
    table_rows: 0
*************************** 7. row ***************************
partition_name: p6
    table_rows: 4
*************************** 8. row ***************************
partition_name: p7
    table_rows: 0
8 rows in set (0,00 sec)

tags: added: i41540
Changed in percona-server:
status: Triaged → Won't Fix
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1529

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.