pt-online-schema-change Threads_running check causes premature bailout

Reported by Roger Booth on 2012-05-09
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Percona Toolkit
Undecided
Unassigned

Bug Description

Command:
pt-online-schema-change --execute --alter "drop key owner","add key owner (owner,path)" D=v4net,t=files,u=*****,p=*****,h=master

Tool version:
Percona Toolkit 2.1.1

MySQL version, all servers:
5.5.12-log

OS:
SunOS 5.10 Generic_141415-07 i86pc i386 i86pc
Solaris 10

Typical query during execution:
Sending data INSERT LOW_PRIORITY IGNORE INTO `v4net`.`_files_new` (`id`, `name`, `owner`, `path`, `content`, `properties`) SELECT `id`, `name`, `owner`, `path`, `content`, `properties` FROM `v4net`.`files` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= '72751')) AND ((`id` <= '72774')) /*pt-online-schema-change 16248 copy nibble*/

Tool output:
...
Copying `v4net`.`files`: 94% 00:44 remain
Copying `v4net`.`files`: 96% 00:27 remain
Replica lag is 9 seconds on ned. Waiting.
Copying `v4net`.`files`: 97% 00:18 remain
Copying `v4net`.`files`: 98% 00:10 remain
Copying `v4net`.`files`: 99% 00:05 remain
Replica lag is 15 seconds on ned. Waiting.
Error copying rows from `v4net`.`files` to `v4net`.`_files_new`: Threads_running=52 exceeds its critical threshold 50

Dropping triggers...
Dropped triggers OK.
Dropping new table...
Dropped new table OK.
`v4net`.`files` was not altered.

Note also, the tool incorrectly estimates the number of rows that will be copied (Copying approximately 52556 when the actual number is 189799.

The tool worked correctly on a copy of the same table on one of our replicants, but failed twice on the attempts to run this on the master.
We have a master and two replicants. We use 'mixed' replication (mostly statement-based).

According to cacti graphs, running threads stayed below 10 on the master while the schema change operation was in progress.
The sample interval for Cacti is 5 minutes, so it is possible that running threads could spike without us knowing... however load average was below 1 for the duration of the operation.

Baron Schwartz (baron-xaprb) wrote :

This is unlikely to be a bug; Threads_running=52 is unlikely to be incorrect. It's much more likely that you simply didn't see the spike. Please try running the tool again while also running something like "mysqladmin extended-status -i1" or watching the server with innotop set to a 1-second refresh interval and write back if you can confirm that Threads_running doesn't increase at the moment the tool claims it does.

The estimate comes from EXPLAIN, so that's just an artifact of MySQL's inexact statistics.

Changed in percona-toolkit:
status: New → Triaged
Roger Booth (roger-booth) wrote :

Thanks Baron.
We will run the operation again and monitor Threads_running at one second intervals.
Our next opportunity to make the attempt will be tomorrow morning.

-Roger

Roger Booth (roger-booth) wrote :

I am rerunning the process and have verified that I do in fact see spikes in Threads running.

Here's a PMP-like snippet showing the state of most of the running threads during a spike:

  58 lwp_park,cond_wait_queue,_cond_wait,cond_wait,pthread_cond_wait,os_event_wait_low,log_write_up_to,trx_prepare_off_kernel,trx_prepare_for_mysql,__1cTinnobase_xa_prepare6FpnKhandlerton_pnDTHD_b_i_,__1cPha_commit_trans6FpnDTHD_b_i_,__1cRtrans_commit_stmt6FpnDTHD__b_,__1cVmysql_execute_command6FpnDTHD__i_,__1cLmysql_parse6FpnDTHD_pcIpnMParser_state__v_,__1cQdispatch_command6FnTenum_server_command_pnDTHD_pcI_b_,__1cKdo_command6FpnDTHD__b_,__1cYdo_handle_one_connection6FpnDTHD__v_,handle_one_connection,_thr_setup,_lwp_start

The OS is solaris 10, and filesystem is ZFS.

During the current run, the spikes have not lasted long enough to cause pt-online-schema-change to bail out.

Roger Booth (roger-booth) wrote :

Well, it just bailed out. Threads running spikes have stopped.

Roger Booth (roger-booth) wrote :

This might be relevant:
The table is huge...
66G May 10 08:42 files.ibd

CREATE TABLE `files` (
  `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(100) NOT NULL DEFAULT '',
  `owner` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `path` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `content` mediumblob NOT NULL,
  `properties` text,
  PRIMARY KEY (`id`),
  UNIQUE KEY `name` (`name`,`owner`,`path`),
  KEY `owner` (`owner`)
) ENGINE=InnoDB

Roger Booth (roger-booth) wrote :

Average row length for the table is 484253.
There are 189933 rows.

Before the process stopped, it reduced the chunk-size...
Rows are copying very slowly. --chunk-size has been automatically reduced to 1. Check that the server is not being overloaded, or increase --chunk-time. The last chunk selected 2 rows and took 2.007 seconds to execute.

That helped for a short time.

Roger Booth (roger-booth) wrote :

If we processed 20 rows per second, it should complete in less than 3 hours.
Can I configure the tool to process "20 rows per second" and see if that works for our IO capacity?

I am worried that you might be experiencing a bug that Daniel is currently working on: it's choosing the wrong index. 20 rows per second is SLOW. Please PTDEBUG it, then ctrl-C it right after it starts "nibbling" the table and let's see what index it's chosen.

Roger Booth (roger-booth) wrote :

Hi Baron,
I've attached the debug output.
It may be picking the right index.
I thought maybe the size of individual rows could be an issue.

Roger Booth (roger-booth) wrote :

Hi Baron,
Setting chunk-size to 20 is working for us. 80% of the rows have copied so far (actual, not an estimate).
Throughput is about 20 rows per second.
Threads_running has been much lower. I still get some spikes, but only one of them approached 50.
The table is 66GB, but there are only 190k rows.

This is our biggest table, so pt-online-schema-change will likely work for any table we've got.

Minor additional note: I performed some file maintenance on the server, and may have improved zfs performance a little.

Thanks,
Roger

tags: added: pt-online-schema-change

It's been a long time since this bug was update, but I'm working on pt-osc 2.2.1 and so I'm reading through its bugs, like this one. Nibbling won't be a problem (re Baron's comment #8) because there's a simple AUTO_INC PK. I think the rows are just large (probably about 400k/each/avg), and that's a lot to move around, causing the tool (or something on the system) to slow down. I don't see a bug at this point, so I'll close this issue. If something has changed, let me know and we'll re-open it.

Changed in percona-toolkit:
status: Triaged → Invalid
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Bug attachments