pt-online-schema-change fails with Copying rows caused a MySQL error 1317

Bug #1658097 reported by Jericho Rivera
This bug affects 1 person
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: Won't Fix
Importance: Medium
Assigned to: Carlos Salguero

Bug Description

pt-osc fails when run from the passive node while the active node is receiving writes on the same table that is being altered.

pt-online-schema-change --alter="ENGINE=INNODB" D=sb,t=sbtest1,u=root --execute --max-flow-ctl 0
No slaves found. See --recursion-method if host pxc2 has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.

# A software update is available:
# * The current version for Percona::Toolkit is 2.2.20.

Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `sb`.`sbtest1`...
Creating new table...
Created new table sb._sbtest1_new OK.
Altering new table...
Altered `sb`.`_sbtest1_new` OK.
2017-01-20T08:36:13 Creating triggers...
2017-01-20T08:36:13 Created triggers OK.
2017-01-20T08:36:13 Copying approximately 11133 rows...
2017-01-20T08:36:18 Dropping triggers...
2017-01-20T08:36:18 Dropped triggers OK.
2017-01-20T08:36:18 Dropping new table...
2017-01-20T08:36:18 Dropped new table OK.
`sb`.`sbtest1` was not altered.
2017-01-20T08:36:18 Error copying rows from `sb`.`sbtest1` to `sb`.`_sbtest1_new`: 2017-01-20T08:36:18 Copying rows caused a MySQL error 1317:
    Level: Error
     Code: 1317
  Message: Query execution was interrupted
    Query: INSERT LOW_PRIORITY IGNORE INTO `sb`.`_sbtest1_new` (`id`, `k`, `c`, `pad`) SELECT `id`, `k`, `c`, `pad` FROM `sb`.`sbtest1` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= ?)) AND ((`id` <= ?)) LOCK IN SHARE MODE /*pt-online-schema-change 1708 copy nibble*/
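
For anyone wrapping pt-osc in a script, the error block above can be picked apart to tell an interrupted copy (error 1317) from other failures. The following is a minimal sketch, not part of Percona Toolkit: the function names `parse_copy_error` and `is_interrupted` are hypothetical, and it assumes the output keeps the `Level` / `Code` / `Message` / `Query` layout shown in this report.

```python
import re
from typing import Dict, Optional

# Field labels as they appear in the error block quoted in this report.
_FIELD_RE = re.compile(r"^\s*(Level|Code|Message|Query):\s*(.*)$")


def parse_copy_error(output: str) -> Optional[Dict[str, str]]:
    """Return the fields of the first 'caused a MySQL error' block, or None."""
    fields: Dict[str, str] = {}
    in_block = False
    for line in output.splitlines():
        if "caused a MySQL error" in line:
            in_block = True
            continue
        if in_block:
            match = _FIELD_RE.match(line)
            if match:
                fields[match.group(1).lower()] = match.group(2).strip()
            elif fields:
                # First non-field line after the block ends it.
                break
    return fields or None


def is_interrupted(output: str) -> bool:
    """True if the tool output reports MySQL error 1317 (query interrupted)."""
    err = parse_copy_error(output)
    return err is not None and err.get("code") == "1317"
```

A wrapper could call `is_interrupted()` on the captured output and decide whether to rerun the tool from a different node; that policy is left to the caller.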

In the node's error log I see the following:
2017-01-20 08:45:20 1634 [Note] WSREP: MDL BF-BF conflict
schema: sb
request: (2 seqno 5656596 wsrep (1, 1, 0) cmd 0 146 (null))
granted: (9 seqno 5656595 wsrep (2, 1, 0) cmd 3 105 DROP TRIGGER IF EXISTS `sb`.`pt_osc_sb_sbtest1_del`)
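
To compare the two sides of the conflict, the `request:` and `granted:` lines can be reduced to their sequence numbers and trailing statements. This is a rough sketch that assumes the line layout matches the excerpt above exactly; `parse_mdl_conflict` is a hypothetical helper, and the other numeric fields in each line are deliberately left uninterpreted.

```python
import re
from typing import Dict, List

# Matches lines shaped like the 'request:' / 'granted:' lines quoted above.
# Only the seqno and the trailing statement are captured; the remaining
# numbers are skipped because their meaning is not stated in this report.
_LOCK_RE = re.compile(
    r"^(request|granted):\s*\(\d+\s+seqno\s+(\d+)\s+wsrep\s+\([^)]*\)"
    r"\s+cmd\s+\d+\s+\d+\s+(.*)\)\s*$"
)


def parse_mdl_conflict(log_text: str) -> List[Dict[str, object]]:
    """Extract role, seqno and statement from MDL conflict log lines."""
    entries: List[Dict[str, object]] = []
    for line in log_text.splitlines():
        match = _LOCK_RE.match(line.strip())
        if match:
            entries.append(
                {
                    "role": match.group(1),
                    "seqno": int(match.group(2)),
                    "statement": match.group(3),
                }
            )
    return entries
```

Applied to the excerpt above, the granted lock belongs to the `DROP TRIGGER` statement at seqno 5656595, one before the request at seqno 5656596.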

How to repeat:
1. Use a 3-node PXC cluster.
2. Run sysbench on node1, inserting into sbtest1.
3. Run pt-osc on node2 against the same table, sbtest1.

If pt-osc is run on node1 while the table is being updated, it does not fail and completes successfully.

Jericho Rivera (jericho-rivera) wrote :

Marking as confirmed, since this is repeatable and also matches a customer's account. I would be interested in developer feedback.

Changed in percona-toolkit:
status: New → Confirmed
Changed in percona-toolkit:
importance: Undecided → Medium
Changed in percona-toolkit:
status: Confirmed → In Progress
assignee: nobody → Carlos Salguero (carlos-salguero)
milestone: none → 3.0.2
tags: added: pt106
Changed in percona-toolkit:
milestone: 3.0.2 → 3.0.3
Carlos Salguero (carlos-salguero) wrote :

It cannot be reproduced.
See https://jira.percona.com/browse/PT-106

Changed in percona-toolkit:
status: In Progress → Won't Fix
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-725
