provider crawls all tables and excludes rather than just including specified tables
Bug #677718 reported by
MikeG
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
holland-backup | Won't Fix | Low | Unassigned |
Bug Description
If you have:
[mysqldump]
tables = "foo.importantT
You will see holland do something along the lines of:
Excluding mysql.columns_priv
Excluding mysql.db
Excluding mysql.func
Writing ignore-table exlusions to /var/spool/
This could be really inefficient on a server with thousands of tables when you want just the one.
Changed in holland-backup:
status: New → Won't Fix
Basically, what holland-mysqldump does is, given a set of inclusion/exclusion filters, generate a mysqldump command line to back up the requested tables. However, mysqldump itself has *very* primitive support for inclusions/exclusions. This support basically falls into three categories:
1) Dump specific tables from *one* database only => mysqldump [options] database table [table..]
2) Dump multiple databases and exclude specific tables => mysqldump [options] --ignore-table=db.tbl --databases db [db...]
3) Dump all databases and exclude specific tables => mysqldump [options] [--ignore-table=db.tbl...] --all-databases
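As a rough illustration of those three forms, here is a minimal Python sketch that assembles the argument list for each one. The function name `build_mysqldump_argv` and its parameters are hypothetical and are not taken from holland's code; only the mysqldump flags themselves (`--ignore-table`, `--databases`, `--all-databases`) are real.

```python
from typing import List, Optional, Sequence


def build_mysqldump_argv(
    databases: Optional[Sequence[str]] = None,
    tables: Optional[Sequence[str]] = None,
    ignore_tables: Sequence[str] = (),
    options: Sequence[str] = (),
) -> List[str]:
    """Hypothetical helper: pick one of the three mysqldump invocation forms."""
    argv = ["mysqldump", *options]

    if tables:
        # Form (1): explicit tables, which only works for a single database.
        if not databases or len(databases) != 1:
            raise ValueError("explicit tables require exactly one database")
        if ignore_tables:
            raise ValueError("this sketch does not mix form (1) with exclusions")
        return argv + [databases[0], *tables]

    # Forms (2) and (3) both express filtering as db.tbl exclusions.
    argv += [f"--ignore-table={name}" for name in ignore_tables]

    if databases:
        # Form (2): a named set of databases.
        return argv + ["--databases", *databases]

    # Form (3): everything on the server.
    return argv + ["--all-databases"]
```

For example, `build_mysqldump_argv(databases=["foo"], tables=["importantTable"])` (a made-up table name) yields form (1), while passing only `ignore_tables` yields form (3) with one `--ignore-table` option per excluded table.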
If we have a single 'tables' request, we may be able to hard-code a fast path for that to always choose (1). If it includes a pattern, we still have to scan a potentially large number of tables to find matches. If "tables" includes tables from multiple databases but file-per-database = off (meaning the user wants a consistent snapshot of all those tables via a single mysqldump command), there's not a good way to support that without generating a large list of --ignore-table options.
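The eligibility check for such a fast path might look like the sketch below. It is not holland's implementation: the function name `single_database_fast_path` is hypothetical, and it assumes that 'tables' entries are written as `db.tbl` and that patterns use glob-style wildcards (`*`, `?`, `[`).

```python
from typing import List, Optional, Sequence

# Assumption: filters use glob-style wildcards.
WILDCARD_CHARS = set("*?[")


def single_database_fast_path(tables: Sequence[str]) -> Optional[List[str]]:
    """Return [db, tbl, ...] for form (1), or None if the fast path does not apply."""
    if not tables:
        return None

    databases = set()
    names = []
    for spec in tables:
        # A pattern would require scanning the server's tables for matches.
        if any(ch in WILDCARD_CHARS for ch in spec):
            return None
        db, sep, tbl = spec.partition(".")
        if not sep or not db or not tbl:
            return None
        databases.add(db)
        names.append(tbl)

    # Tables spread over several databases cannot use form (1).
    if len(databases) != 1:
        return None
    return [databases.pop(), *names]
```

When the function returns `None`, the caller would fall back to the general approach of scanning tables and emitting `--ignore-table` options.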
I think the common use case is to simply filter a few tables and otherwise back everything else up. You can always manipulate the mysqldump command line through holland via additional-options. For efficient inclusion/exclusion we would need support from mysqldump (or a similar backend). I'm skeptical that the additional complexity in holland-mysqldump to more efficiently handle edge cases here is warranted currently. Although perhaps I'm just not aware of interesting use cases.
Essentially, I view this as a limitation in mysqldump itself, and holland has a reasonable general-purpose filtering algorithm. I'm completely open to contributions to provide better alternatives, either as new plugins or by reworking the existing "Given a set of filters, run these mysqldump commands" algorithm.