I have some DBs with tens of thousands of tables, perhaps 5% of which are MyISAM. When I start Mydumper, it can take 30+ minutes before it actually begins doing backups. Between verbose output, the MySQL processlist, and an strace, it looks like the following:
- Mydumper requests a global FLUSH TABLES WITH READ LOCK
- Once granted, it performs a 'SHOW TABLE STATUS' on each table
- Once the TABLE STATUS loops finish, it backs up non-InnoDB tables
- Once non-InnoDB tables are backed up, it releases the READ LOCK
- Once the READ LOCK is released, it backs up InnoDB tables
The thing is, with tens of thousands of tables - at least on my abused servers - it can take more than 30 minutes simply to finish the SHOW TABLE STATUS stage. Is it necessary for data integrity to have a READ LOCK as Mydumper is collecting this information? If not, I'd *love* to see the READ LOCK acquired afterward (status -> lock -> non-InnoDB -> release rather than lock -> status -> non-InnoDB -> release).
Whether or not it's deemed necessary to lock first, would it perhaps be faster to query for table data a la 'mysql -Be 'select TABLE_SCHEMA, TABLE_NAME, ENGINE from information_schema.tables where ENGINE not like "InnoDB"'', IF (??) information_schema can provide all the data you need?