pt-archiver --bulk-insert may corrupt data
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Percona Toolkit moved to https://jira.percona.com/projects/PT | Fix Released | High | Brian Fraser | | |
Bug Description
This bug appears whenever all of the following conditions hold for a table-to-table archive copy:
1. The tables have a TEXT column that contains UTF-8 characters, such as non-English text
2. --charset utf8 is used
3. --bulk-insert is used
When all three conditions are true, pt-archiver immediately exits with a "Wide character" error on line 3950. This looks similar to bug #940253 and appears to be linked to the UTF-8 encoding of the temporary bulk-load file that pt-archiver creates, because the problem goes away as soon as I turn off --bulk-insert or set --no-check-charset. For now I am working around it by not using --bulk-insert, which makes the archive about 3x slower. Otherwise I need to run pt-table-sync once the archive has finished to repair all rows with encoding mismatches.
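The suspected mechanism, a bulk-load file whose bytes do not match the charset the loader is told to expect, can be illustrated in isolation. The following is a minimal Python sketch, not pt-archiver's actual Perl code. The helper names (`write_bulk_file`, `load_bulk_file`, `round_trip`) and the sample rows are hypothetical, and Latin-1 is assumed as the mismatched charset purely for illustration.

```python
import os
import tempfile


def write_bulk_file(rows, encoding):
    """Write tab-separated rows to a temporary file in the given encoding."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding=encoding, newline="\n") as fh:
        for row in rows:
            fh.write("\t".join(row) + "\n")
    return path


def load_bulk_file(path, charset):
    """Read the file back the way a loader would if told it is in `charset`."""
    with open(path, "r", encoding=charset, newline="\n") as fh:
        return [line.rstrip("\n").split("\t") for line in fh]


def round_trip(rows, write_encoding, read_charset):
    """Write rows in one encoding, read them back in another, then clean up."""
    path = write_bulk_file(rows, write_encoding)
    try:
        return load_bulk_file(path, read_charset)
    finally:
        os.remove(path)


# Hypothetical sample rows with non-ASCII text in the second column.
rows = [["1", "café"], ["2", "日本語"]]

matched = round_trip(rows, "utf-8", "utf-8")       # file and loader agree
mismatched = round_trip(rows, "utf-8", "latin-1")  # loader assumes the wrong charset

print(matched == rows)     # whether the rows survived when encodings agree
print(mismatched == rows)  # whether the rows survived when they disagree
print(mismatched[0][1])    # what the loader sees for the first text value
```

When the two encodings agree, the rows come back intact. When they differ, the file still loads without any error, but the text comes back garbled, which is the kind of silent damage that would need a repair pass afterwards.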
Related branches
- Daniel Nichter: Approve
Diff: 161 lines (+77/-19)
3 files modified
bin/pt-archiver (+29/-19)
t/pt-archiver/bulk_insert.t (+36/-0)
t/pt-archiver/samples/bug_1127450.sql (+12/-0)
Changed in percona-toolkit: | |
status: | New → Confirmed |
Changed in percona-toolkit: | |
milestone: | none → 2.2.2 |
tags: | added: charset pt-archiver |
Changed in percona-toolkit: | |
importance: | Undecided → High |
Changed in percona-toolkit: | |
status: | Confirmed → Fix Committed |
Changed in percona-toolkit: | |
status: | Fix Committed → In Progress |
summary: |
- pt-archiver wide character |
+ pt-archiver --charset and --bulk-insert fail, may corrupt data |
summary: |
- pt-archiver --charset and --bulk-insert fail, may corrupt data |
+ pt-archiver --bulk-insert may corrupt data |
tags: | added: dbd-mysql risk |
Changed in percona-toolkit: | |
status: | Fix Committed → Fix Released |
In pt-archiver version 2.1.8, the error has changed to:
Wide character in print at /usr/local/bin/pt-archiver line 5840.