Comment 1 for bug 1402051

Muhammad Irfan (muhammad-irfan) wrote : Re: [Feature] pt-osc fault tolerance if slave disconnects

I altered a table on the master server with pt-online-schema-change and killed mysqld on slave2 while the tool was running, to simulate a slave that loses network connectivity or whose mysqld disappears. Killing the mysqld process on the slave aborts pt-osc, and as a result the table is not altered anywhere, neither on the master nor on slave2.

root@master:~# ./pt-online-schema-change --execute --nodrop-old-table --alter "ADD COLUMN line_number VARCHAR(10) DEFAULT NULL" u=root,p=p3rc0na123,D=world,t=test &>> ptosc9.log

Found 2 slaves:
slave2
slave1
Will check slave lag on:
slave2
slave1
Operation, tries, wait:
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `world`.`test`...
Creating new table...
Created new table world._test_new OK.
Altering new table...
Altered `world`.`_test_new` OK.
2014-12-11T15:59:35 Creating triggers...
2014-12-11T15:59:35 Created triggers OK.
2014-12-11T15:59:35 Copying approximately 58402 rows...
Not dropping triggers because the tool was interrupted. To drop the triggers, execute:
DROP TRIGGER IF EXISTS `world`.`pt_osc_world_test_del`;
DROP TRIGGER IF EXISTS `world`.`pt_osc_world_test_upd`;
DROP TRIGGER IF EXISTS `world`.`pt_osc_world_test_ins`;
Not dropping the new table `world`.`_test_new` because the tool was interrupted. To drop the new table, execute:
DROP TABLE IF EXISTS `world`.`_test_new`;
`world`.`test` was not altered.
(in cleanup) 2014-12-11T15:59:44 Error copying rows from `world`.`test` to `world`.`_test_new`: Lost connection to replica slave2 while attempting to get its lag (DBI connect('world;host=slave2;mysql_read_default_group=client','root',...) failed: Can't connect to MySQL server on 'slave2' (111) at ./pt-online-schema-change line 2261)

Not dropping triggers because the tool was interrupted. To drop the triggers, execute:
DROP TRIGGER IF EXISTS `world`.`pt_osc_world_test_del`;
DROP TRIGGER IF EXISTS `world`.`pt_osc_world_test_upd`;
DROP TRIGGER IF EXISTS `world`.`pt_osc_world_test_ins`;
Not dropping the new table `world`.`_test_new` because the tool was interrupted. To drop the new table, execute:
DROP TABLE IF EXISTS `world`.`_test_new`;
`world`.`test` was not altered.

As the output shows, world.test was not altered. This behavior is not user friendly: the tool aborted and the whole operation failed because mysqld on one slave was temporarily unavailable.
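
To make the request concrete, here is a rough sketch of the kind of tolerance I have in mind for the slave lag check: retry a bounded number of times before giving up, similar in spirit to the tries/wait pairs the tool already prints for copy_rows and the other operations. This is not pt-osc code. The names `get_lag_with_retry`, `ReplicaUnavailable`, and the `get_lag` callable are hypothetical and only illustrate the idea.

```python
import time


class ReplicaUnavailable(Exception):
    """Hypothetical error raised when a slave cannot be reached."""


def get_lag_with_retry(get_lag, tries=10, wait=1.0, sleep=time.sleep):
    """Call get_lag() up to `tries` times, sleeping `wait` seconds between
    failed attempts. Returns the lag on success; raises ReplicaUnavailable
    only after every attempt has failed."""
    last_error = None
    for attempt in range(1, tries + 1):
        try:
            return get_lag()
        except ReplicaUnavailable as err:
            last_error = err
            if attempt < tries:
                sleep(wait)  # give the slave time to come back
    raise ReplicaUnavailable(
        f"slave still unreachable after {tries} tries"
    ) from last_error


if __name__ == "__main__":
    # Simulated slave that is down for the first two checks.
    state = {"calls": 0}

    def flaky_get_lag():
        state["calls"] += 1
        if state["calls"] < 3:
            raise ReplicaUnavailable("Can't connect to MySQL server")
        return 0  # seconds of lag once the slave is back

    lag = get_lag_with_retry(flaky_get_lag, tries=5, wait=0.1)
    print(f"lag={lag} after {state['calls']} checks")
```

With something along these lines, a short mysqld restart on a slave would pause the row copy instead of aborting it, and the tool would still fail cleanly if the slave never comes back.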