Tables containing Cyrillic letters are destroyed

Bug #1666490 reported by Dmitriy Razumovskiy
This bug affects 1 person
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: none

Bug Description

Hi,

I tried to sync differences between tables on master and slave with the command:
pt-table-sync --execute --replicate percona.checksums master1

As a result, the data on both master and slave was destroyed, because all Cyrillic characters were replaced with question marks.

The table collation is utf8_general_ci. In my.cnf under [mysqld] I have character-set-server=utf8
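
One plausible explanation for the question marks, assuming the tool's connection used a single-byte character set such as latin1 rather than utf8 (the report does not confirm which charset was in effect): characters that have no equivalent in the target charset get substituted with "?" during conversion, and that substitution cannot be undone. A minimal Python sketch of the same effect, with made-up sample text:

```python
# Sketch: why Cyrillic text can turn into question marks.
# Assumption: somewhere on the path the text is converted to a single-byte
# charset such as latin1, which has no Cyrillic code points.
original = "Привет, мир"  # made-up sample value

# Characters without a latin1 equivalent are substituted with "?".
lossy = original.encode("latin-1", errors="replace").decode("latin-1")

# The same text survives a UTF-8 round trip unchanged.
safe = original.encode("utf-8").decode("utf-8")

print(lossy)
print(safe == original)
```

Once the "?" values are written back to a table, the original characters are gone, which would match the damage described above.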

Versions:
Percona Server 5.6.35-80.0.1.jessie
Percona Toolkit 2.2.20-1

Tags: cyrillic
Jaime Sicam (jssicam) wrote :

Hi,
  Can you try using the --charset option to specify the default character set of pt-table-sync?

E.g., compare the differences between these two runs:
pt-table-sync --print --charset=utf8 --replicate percona.checksums master1
pt-table-sync --print --charset=utf8 --replicate percona.checksums master1

Thanks.

Jaime Sicam (jssicam) wrote :

Oops. I meant:

pt-table-sync --print --replicate percona.checksums master1
pt-table-sync --print --charset=utf8 --replicate percona.checksums master1

Thanks!

Jaime Sicam (jssicam) wrote :

I'll set the status of this case to Invalid because I think this can be resolved by setting --charset.

Changed in percona-toolkit:
status: New → Invalid
Dmitriy Razumovskiy (darland) wrote :

Hi Jaime,

I appreciate your workaround; however, I still believe that this behaviour is a very serious issue that has to be properly resolved, for two reasons:

1. UTF-8 is the default charset for the vast majority of applications and databases. Keeping data in UTF-8 is supposed to protect you from data corruption caused by incompatible charsets.
2. The pt-table-sync manual does not mention anywhere that running the tool with default settings will most probably destroy your data. People like me carefully read the manual and, with a false sense of safety, run pt-table-sync against a master in a production environment. As a result, the master and both slaves are destroyed.

If pt-table-sync is really meant for production use, I would suggest one of the following:
- set the default charset to UTF-8
- make charset a required parameter
- (less safe) update the manual and state in bold that the tool will most probably destroy your data if you do not specify the charset of the DB
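
Until the tool itself changes, the "required parameter" idea can be approximated on the caller's side. The following is only a sketch of a hypothetical wrapper, not part of Percona Toolkit: the function name `build_sync_command` and its parameters are invented for illustration, and it uses only the options already shown in this thread (--print, --execute, --charset, --replicate).

```python
import shlex


def build_sync_command(dsn, charset, execute=False,
                       replicate_table="percona.checksums"):
    """Build a pt-table-sync argument list that always carries --charset.

    Hypothetical helper: it refuses to build a command when no charset
    is given, instead of silently relying on the tool's default.
    """
    if not charset:
        raise ValueError("refusing to build a pt-table-sync command without a charset")
    mode = "--execute" if execute else "--print"
    return ["pt-table-sync", mode, f"--charset={charset}",
            "--replicate", replicate_table, dsn]


# Dry run by default: print the statements instead of executing them.
cmd = build_sync_command("master1", "utf8")
print(shlex.join(cmd))
```

Defaulting to --print keeps the wrapper on the safe side: the generated statements can be inspected for question marks before anything is run with --execute.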

Leaving this bug without action in Invalid status means that more and more people will continue destroying their databases and, as a consequence, their trust in the Percona community.

Changed in percona-toolkit:
status: Invalid → New
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-1412
