provider crawls all tables and excludes rather than just including specified tables

Bug #677718 reported by MikeG
This bug affects 1 person
Affects: holland-backup
Status: Won't Fix
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

If you have:

[mysqldump]
tables = "foo.importantTable"

You will see holland do something along the lines of:

Excluding mysql.columns_priv
Excluding mysql.db
Excluding mysql.func
Writing ignore-table exlusions to /var/spool/holland/bkset/20101120_022118/my.cnf

This could be really inefficient on a server with thousands of tables when you only want the one.

Andrew Garner (abg) wrote :

Basically, given a set of inclusion/exclusion filters, holland-mysqldump generates a mysqldump command line to back up the requested tables. However, mysqldump itself has *very* primitive support for inclusions/exclusions. This support basically falls into three categories:

1) Dump specific tables from *one* database only => mysqldump [options] database table [table..]

2) Dump multiple databases and exclude specific tables => mysqldump [options] --ignore-table=db.tbl --databases db [db...]

3) Dump all databases and exclude specific tables => mysqldump [options] [--ignore-table=db.tbl...] --all-databases
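To make the three forms concrete, here is a rough sketch of how each one could be assembled as an argument list. This is only an illustration, not the code holland-mysqldump actually uses; the function names (dump_tables, dump_databases, dump_all) are made up for this example, and only the mysqldump options already shown above are used.

```python
# Illustrative only: these helper names are hypothetical and are not
# part of holland. They just mirror the three command-line forms above.

def dump_tables(database, tables, options=()):
    """Form 1: specific tables from exactly one database."""
    return ["mysqldump", *options, database, *tables]


def dump_databases(databases, ignore_tables=(), options=()):
    """Form 2: several databases, minus any excluded db.tbl names."""
    ignores = [f"--ignore-table={name}" for name in ignore_tables]
    return ["mysqldump", *options, *ignores, "--databases", *databases]


def dump_all(ignore_tables=(), options=()):
    """Form 3: every database, minus any excluded db.tbl names."""
    ignores = [f"--ignore-table={name}" for name in ignore_tables]
    return ["mysqldump", *options, *ignores, "--all-databases"]
```

Note that only form 1 names the wanted tables directly. Forms 2 and 3 can only say what to leave out, which is why an include filter spanning several databases ends up as a long list of --ignore-table options.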

If we have a single 'tables' request, we may be able to hard-code a fast path for that to always choose (1). If the request includes a pattern, we still have to scan a potentially large number of tables to find matches. If "tables" includes tables from multiple databases but file-per-database=off (meaning the user wants a consistent snapshot of all those tables via a single mysqldump command), there's no good way to support that without generating a large list of --ignore-table options.
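A fast-path check along those lines might look something like the sketch below. It is a hypothetical helper, not existing holland code, and it assumes that "tables" entries are written as db.table and that patterns use glob-style wildcards (*, ?, [). It returns a database and table list only when every entry is a literal name in the same database, so that form (1) applies without scanning the server.

```python
# Hypothetical helper, not part of holland. Assumes entries look like
# "db.table" and that patterns use glob-style wildcard characters.

GLOB_CHARS = set("*?[")


def fast_path(tables):
    """Return (database, [table, ...]) if form (1) applies, else None."""
    database = None
    names = []
    for entry in tables:
        if GLOB_CHARS & set(entry):
            return None  # a pattern: matching tables must be looked up
        db, sep, tbl = entry.partition(".")
        if not sep or not db or not tbl:
            return None  # not a fully qualified db.table name
        if database is None:
            database = db
        elif db != database:
            return None  # more than one database: form (1) cannot express it
        names.append(tbl)
    if database is None:
        return None  # empty request
    return database, names
```

Anything that returns None here would still have to go through the general filtering path, including the multi-database case described above.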

I think the common use case is to simply filter a few tables and otherwise back everything else up. You can always manipulate the mysqldump command line through holland via additional-options. For efficient inclusion/exclusion we would need support from mysqldump (or a similar backend). I'm skeptical that the additional complexity in holland-mysqldump to more efficiently handle edge cases here is warranted currently. Although perhaps I'm just not aware of interesting use cases.

Essentially, I view this as a limitation in mysqldump itself, and I think holland has a reasonable general-purpose filtering algorithm. I'm completely open to contributions that provide better alternatives, either as new plugins or by reworking the existing "Given a set of filters, run these mysqldump commands" algorithm.

Changed in holland-backup:
importance: Undecided → Low
Andrew Garner (muzazzi)
Changed in holland-backup:
status: New → Won't Fix