bzr check is slow
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Bazaar | Confirmed | Medium | Unassigned |
Bug Description
I ran `bzr check` on the bzr.dev repository using bzr 2.4b5:
C:\work\
Checking working tree at 'C:/work/
Checking branch at 'file:/
Checking branch at 'file:/
Checking branch at 'file:/
Checking branch at 'file:/
Checking repository at 'file:/
checked repository file://
37785 revisions
1950 file-ids
2 ghost revisions
1 inconsistent parents
checked branch file://
checked branch file://
checked branch file://
checked branch file://
time: 12850.577
It took 3 hours and 35 minutes to finish.
It spent almost 3 hours in the phase:
checking file graphs:
and during that phase bzr read about 6.5 GB of data from disk, roughly 90 times the size of the repository itself, which has 58 MB in packs and 13 MB in indices.
Then bzr spent about 35 minutes in the phase:
checking file graphs:
and during that phase bzr read about 80 MB of data from disk.
I've filed this bug report because of the harsh feedback from http://
I agree with the point that if `bzr check` is very slow, people won't run it on their own repositories.
Maybe by default `bzr check` should work much faster and maybe only check the MD5 sum of pack files (Bug #676014), checksum of index files and pack-names database. I hope btree-index files (index files and pack-names) have some checksum inside(?). If the first reason to run check regularly is to have faster alert about filesystem inconsistency, then maybe such fast check should be enough for the pitfalls explained in Adrian's mail? It won't help against malicious attack, of course, but at least common problems with filesystems, including network shares could be checked much faster and easier?
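The fast pack check suggested above could look roughly like the sketch below. It is only an illustration, not bzr code: it assumes each pack file is named after the MD5 hex digest of its contents, and the function names `md5_of_file` and `find_mismatched_packs` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_packs(packs_dir):
    """Return the names of *.pack files whose content MD5 differs from their name.

    Assumption: a pack file is named '<md5 hex digest of its content>.pack'.
    """
    mismatched = []
    for pack in sorted(Path(packs_dir).glob("*.pack")):
        if md5_of_file(pack) != pack.stem:
            mismatched.append(pack.name)
    return mismatched
```

Pointed at a repository's pack directory (assumed here to be `.bzr/repository/packs`), this reads each pack exactly once, so the I/O is bounded by the total pack size rather than by repeated reads of the same data. It does not cover the index files or the pack-names database, which would need their own checks.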
tags: added: check-for-breezy
On 26/08/11 16:11, Alexander Belchenko wrote:
> Maybe by default `bzr check` should work much faster and maybe only
> check the MD5 sum of pack files (Bug #676014), checksum of index files
> and pack-names database. I hope btree-index files (index files and pack-
> names) have some checksum inside(?). If the first reason to run check
> regularly is to have faster alert about filesystem inconsistency, then
> maybe such fast check should be enough for the pitfalls explained in
> Adrian's mail? It won't help against malicious attack, of course, but at
> least common problems with filesystems, including network shares could
> be checked much faster and easier?
I think limiting what is checked by default would be a reasonable thing
to do. We could then have a --full option, or something similar, that
checks more.
Cheers,
Jelmer