add new option and behaviour to pt-online-schema-change to detect long running transactions

Bug #1371237 reported by Fernando Ipar
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Confirmed
Undecided
Unassigned

Bug Description

Long running transactions can make pt-online-schema-change abort in the final stages, as an X metadata lock can prevent drop of triggers in the table being altered. Additionally, while the DROP TRIGGER statement is blocked waiting for the ML, other threads will queue behind it, increasing load (if this happens, in fact, pt-online-schema-change aborts due to load exceeded).

I think it would be a good addition if the script detected long running transactions and aborts unless the user chooses to ignore this.

To implement this, two new options could be added:
  --long-running-transaction-time <TIME>

  to specify how many seconds a trx needs to be ACTIVE to be considered long running. I'd default it to 60.

  --ignore-long-running-transactions

  to tell the script to go ahead and start its work even if there are long running transactions present.

Revision history for this message
Nilnandan Joshi (nilnandan-joshi) wrote :

Able to reproduce this issue that long running transactions can make pt-online-schema-change abort in the final stages of dropping triggers. I have started pt-osc on one session and then run update from another without commit the transaction. Got below output.

nilnandan@Dell-XPS:~$ pt-online-schema-change --user=root --password=root --alter "ADD COLUMN phone INT" D=test,t=members_without_part --execute
No slaves found. See --recursion-method if host Dell-XPS has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `test`.`members_without_part`...
Creating new table...
Created new table test._members_without_part_new OK.
Altering new table...
Altered `test`.`_members_without_part_new` OK.
2014-10-14T13:29:15 Creating triggers...
2014-10-14T13:29:15 Created triggers OK.
2014-10-14T13:29:15 Copying approximately 3281355 rows...
2014-10-14T13:30:03 Dropping triggers...
2014-10-14T13:40:12 Error dropping trigger: 2014-10-14T13:40:12 DBD::mysql::db do failed: Lock wait timeout exceeded; try restarting transaction [for Statement "DROP TRIGGER IF EXISTS `test`.`pt_osc_test_members_without_part_del`;"] at /usr/bin/pt-online-schema-change line 10187.

^C
^C^C# Exiting on SIGINT.
Not dropping triggers because the tool was interrupted. To drop the triggers, execute:
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_members_without_part_del`;
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_members_without_part_upd`;
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_members_without_part_ins`;
`test`.`members_without_part` was not altered.
nilnandan@Dell-XPS:~$

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> update members_without_part set fname='nandan' where fname = 'nil';
Query OK, 1097152 rows affected (1 min 4.01 sec)
Rows matched: 1097152 Changed: 1097152 Warnings: 0

mysql>

Changed in percona-toolkit:
status: New → Confirmed
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-1242

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.