Copy to library takes 30 times longer than import
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
calibre | Fix Released | Undecided | Unassigned |
Bug Description
Calibre 2.58 on Ubuntu 14.04.
I have a large library (>450k books). When I import books, I import into an empty library first so I can edit metadata quickly. This typically includes switching Author and Title fields, correcting Authors with commas, setting language to English, eliminating dates from Author fields, etc. When finished, I can either import the books from the small library's directory or copy the books to the large library. The import loses some edits because it re-reads data from the files themselves and picks up the bad data again. The library copy works better but takes MUCH longer: multiple days to copy 4k files into the large library. I think the library copy should be faster, not much slower, as it should just be copying the metadata from one database to the other. Making the corrections after import instead takes much longer, as each individual edit in the large library takes a minute or so.
Copying to library has to first serialize metadata, then import it into
the destination library. Importing directly makes use of the already
serialized metadata in the OPF file, so copy to library will always be
a little slower than direct import. However, most of the performance
difference comes from
the adding books process having an optimized implementation for finding
duplicates -- I'll port that over to copy to library someday, but in the
meantime you could just turn off checking for dupes when copying to
library in Preferences->Adding Books.
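
To illustrate why the way duplicates are looked up can dominate the run time with a library of this size, here is a minimal sketch in Python. It is not calibre's actual code: the function names, the dict-based book records, and the title/author matching rule are all assumptions made for the example. It only contrasts a per-book scan of the whole library with a lookup table that is built once.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(text.lower().split())


def book_key(book):
    """Hypothetical duplicate key: normalized (title, author) pair."""
    return (normalize(book["title"]), normalize(book["author"]))


def find_duplicates_naive(incoming, library):
    """Scan the whole library for every incoming book: O(len(incoming) * len(library))."""
    dupes = []
    for book in incoming:
        key = book_key(book)
        for existing in library:
            if book_key(existing) == key:
                dupes.append(book)
                break
    return dupes


def find_duplicates_indexed(incoming, library):
    """Build a set of keys once, then do constant-time lookups: O(len(incoming) + len(library))."""
    seen = {book_key(existing) for existing in library}
    return [book for book in incoming if book_key(book) in seen]
```

With 4k incoming books and 450k library books, the first version performs up to about 1.8 billion key comparisons, while the second builds one set of 450k keys and does 4k lookups. Both return the same list.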
And note that importing directly from the source library folders will
not lose any metadata provided that you run Library maintenance->Library
metadata backup stats and wait for the backups to be completed (this
causes the aforementioned OPF files to be written out).
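
As a rough way to double-check that the backups have been written before importing, the sketch below lists book folders that have no metadata.opf yet. It assumes the usual library layout of one folder per author containing one folder per book, with the backup stored as metadata.opf inside the book folder; the function name is made up for this example and is not part of calibre.

```python
from pathlib import Path


def folders_missing_opf(library_root):
    """Return book folders under library_root that do not contain metadata.opf."""
    missing = []
    for author_dir in Path(library_root).iterdir():
        # Skip files such as metadata.db and hidden folders at the top level.
        if not author_dir.is_dir() or author_dir.name.startswith("."):
            continue
        for book_dir in author_dir.iterdir():
            if book_dir.is_dir() and not (book_dir / "metadata.opf").is_file():
                missing.append(book_dir)
    return sorted(missing)
```

An empty result suggests every book folder has a backup file, though it does not show that each file reflects the latest edits; the backup status dialog remains the authoritative check.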