Tables are loaded in a random order; loading large ones first is preferable

Bug #1579479 reported by Scott Wheeler
Affects: MySQL Data Dumper
Status: Opinion
Importance: Wishlist
Assigned to: Unassigned

Bug Description

I noticed that, by chance, the largest table in a database I'm testing with was being loaded last by myloader. As a result, it spent most of its loading time running alone, without benefiting from parallelism.
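To illustrate why the order matters, here is a toy model, not myloader's actual scheduler. It assumes each table is loaded by a single worker thread and that the next table in the list always goes to whichever worker becomes free first. The function name `makespan` and the load times are made up for the illustration.

```python
import heapq

def makespan(load_times, workers):
    """Wall-clock time to finish all jobs when each job, taken in list
    order, is handed to whichever worker becomes free first."""
    free_at = [0.0] * workers          # time at which each worker is idle
    heapq.heapify(free_at)
    for t in load_times:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + t)
    return max(free_at)

# Hypothetical load times: four small tables and one large one.
times = [1, 1, 1, 1, 4]

largest_last = makespan(times, workers=2)
largest_first = makespan(sorted(times, reverse=True), workers=2)
print(largest_last, largest_first)  # prints: 6.0 4.0
```

In this model, loading the large table last leaves one worker idle while the other finishes it alone; loading it first lets the small tables fill the second worker in the meantime.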

I added a bit of code to sort the table list by decreasing file size, which ensures that the larger tables are loaded first. (File size isn't a perfect proxy for load time, but it's much better than random.)
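The attached patch is not reproduced here, so the following is only a sketch of the idea in Python rather than the patch itself. It assumes the dump is a flat directory of per-table files; the function name `files_largest_first` is hypothetical.

```python
import os

def files_largest_first(directory):
    """Return the paths of regular files in `directory`,
    sorted by decreasing file size."""
    paths = [entry.path for entry in os.scandir(directory) if entry.is_file()]
    # File size is used as a rough proxy for load time.
    return sorted(paths, key=os.path.getsize, reverse=True)
```

A loader could then iterate over `files_largest_first(dump_dir)` instead of the unsorted directory listing, so that the biggest files are queued before the small ones.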

Patch attached.

Scott Wheeler (wheeler) wrote :

Ah, I meant to point out that while the results weren't dramatic for my test case, the change did shave about 6% off the total load time for the database, which is worthwhile.

Max Bubenick (max-bubenick) wrote :

Hi Scott, to improve parallelism with large tables, you can use the chunking options:

  -r, --rows Try to split tables into chunks of this many rows. This option turns off --chunk-filesize
  -F, --chunk-filesize Split tables into chunks of this output file size. This value is in MB

Changed in mydumper:
status: New → Opinion
importance: Undecided → Wishlist