inconsistent delta with deleted directories
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Bazaar |
Fix Released
|
High
|
Unassigned |
Bug Description
Sorry, I wasn't thinking too clearly this morning and just deleted the repo and created a new one after I got this bug since I had just started working on a new project. I can try to recreate if you can't get enough info from the traceback and log, but I'm not sure of the exact sequence of events.
Basically, I deleted a directory test/ where there was evidently some confusion over the presence or absence of a subdirectory test/functional. When I committed this delete, I got an inconsistent delta error and subsequently the repo was corrupted.
The repo contained a Rails application and Rails generators automatically create and delete files as needed in the test/ directory.
Ana Nelson (ananelson) wrote : | #1 |
Ana Nelson (ananelson) wrote : | #2 |
Robert Collins (lifeless) wrote : Re: [Bug 256409] Re: inconsistent delta with deleted directories | #3 |
Changed in bzr: | |
status: | New → Triaged |
Robert Collins (lifeless) wrote : | #4 |
This is critical, having looked at it it is a data loss bug:
here is what I think is happening:
- we get a bad delta for some reason coming *from commit*
- a bad commit is recorded - we have dangling references
- the tree update catches the fail
What we need to do urgently:
- ensure commit rolls back the branch pointer if this happens so that the problem commit doesn't get referenced
- help the users affected to date recover
- get the workaround into distros using 1.3.1 (which seems to be common version for this, could be 1.3.1 specific or more likely just that its the version in hardy
Less urgently we should get a root cause diagnosis and fix the deltas being generated to be correct; also we need to ensure a bad delta passed to commit won't cause nonsense data to be inserted.
Changed in bzr: | |
importance: | Undecided → Critical |
robsta (robsta) wrote : | #5 |
Changed in bzr: | |
milestone: | none → 1.6 |
Robert Collins (lifeless) wrote : | #6 |
I find a long list of missing text refs in the tree.
The following file ids are affected:
set(['cbdbackgr
And the following revisions:
set(['<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', 'rstaudinger@
Robert Collins (lifeless) wrote : | #7 |
I tried this, which is close to what revision 11 does, but it does not fail. I'm going to try more permutations.
Rob - if you can remember at all how you did the move - did you 'mkdir', 'mv', 'bzr mv', or did you have bzr do the move, etc.. it would be very helpful.
def test_add_
# https:/
# corrupt deltas generated at commit...
tree = self.make_
# create a basis tree
# now a new dir and a rename
tree = tree.bzrdir.
Robert Collins (lifeless) wrote : | #8 |
Looking at rev 11, the first place an issue occured, the inventory delta looks ok and does correctly include cbd:
[(None,
'cbd',
'cbd-
InventoryDire
('src/
'cbd/
'cssborder.
InventoryFile
('src/
'src/
'makefile.
InventoryFile
('theme/
'theme/
'makefile.
InventoryFile
('src/css-gtk.c',
'cbd/css-gtk.c',
'cssgtk.
InventoryFile
('theme/
'theme/
'makefile.
InventoryFile
('src/css-gtk.h',
'cbd/css-gtk.h',
'cssgtk.
InventoryFile
('configure.in',
'configure.in',
'configure.
InventoryFile
('tests/
'tests/
'drawing.
InventoryFile
('src/
'cbd/
'csscolor.
InventoryFile
('theme/
'theme/
'makefile.
InventoryFile
('src/
Robert Collins (lifeless) wrote : | #9 |
Still no joy reproducing the first error. I am suspecting delete/rename bugs with commit suffering the consequences of an inconsistent dirstate.
=== modified file 'bzrlib/
--- bzrlib/
+++ bzrlib/
@@ -567,3 +567,143 @@
+
+ def test_add_
+ # https:/
+ # corrupt deltas generated at commit...
+ tree = self.make_
+ rev1 = "rev1"
+ rev2 = "rev2"
+ basis_string = ('<inventory format="5" revision_
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file executable="yes" file_id=
+ '<file file_id=
+ '<directory file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
Robert Collins (lifeless) wrote : | #10 |
I think we can assume a mismatch between the commit builders new_inventory, and the commit objects _basis_delta object.
However both calls to record_
And the _report_
_populate_
Martin Pool (mbp) wrote : Re: dirstate internals question | #11 |
(Again kind of stating the obvious in the hope of working up to the
actual bug.) So looking at the logfile, the user has removed the
directory but not run bzr rm on it, so presumably on entering commit
they are all still marked as present in the dirstate's current tree.
One thing I notice is that after 1.3 Robert added a check to
_apply_removals() to do with checking children are not left behind if
a parent is removed. This doesn't change the success condition but
might have caught a problem at a different place.
There are some 'noise' changes in that file with changes of assertions
to if/raise but I've checked them again and they all seem safe, and
anyhow are not closely related to this code path.
I think the files removed from the tree should be removed from
commit's in-progress inventory, and accumulated into a delta to be
later removed, by _report_
trunk this does a set difference to find the removed files. Doing set
difference is generally something we want to avoid for performance but
not necessarily wrong. This code makes me suspicious of bugs in cases
like removing a file and then re-adding it with a different id. And
as the XXX in here says, these files are not properly sorted by
directory order, and we have had bugs related to that with dirstate in
the past.
I'm not sure if the code that later uses this list relies in it being
in directory order but we should certainly check. If this is the bug
it would account for it not being trivially reproducible unless you
have carefully constructed filenames.
def _report_
# XXX: Could the list of deleted paths and ids be instead taken from
# _populate_
deleted_ids = set(self.
if deleted_ids:
deleted = [(self.
for file_id in deleted_ids]
# XXX: this is not quite directory-order sorting
for path, file_id in deleted:
--
Martin
robsta (robsta) wrote : | #12 |
Robert Collins (lifeless) wrote : | #13 |
We've checked that the same commands do not generate a corrupt branch at revision 10 in robsta's tree; its looking more and more like this is not a dup, but all the history for the bug is in this report.
The only missing key in rev 11 is
0.658 missing keys during fetch: set([('
, which is also the only new id during that commit.
Robert Collins (lifeless) wrote : | #14 |
latest test script
=== modified file 'bzrlib/
--- bzrlib/
+++ bzrlib/
@@ -567,3 +567,129 @@
+
+ def test_add_
+ # https:/
+ # corrupt deltas generated at commit...
+ tree = self.make_
+ rev1 = "rev1"
+ rev2 = "rev2"
+ basis_string = ('<inventory format="5" revision_
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file executable="yes" file_id=
+ '<file file_id=
+ '<directory file_id=
+ '<file file_id=
+ '<file file_id=
+ '<file file_id=
+ ...
Martin Pool (mbp) wrote : | #15 |
John says:
latest status is, we think we've fixed this, we can't reproduce this.
Maybe we should close this for now?
Robert Collins (lifeless) wrote : Re: [Bug 256409] Re: inconsistent delta with deleted directories | #16 |
On Mon, 2008-09-29 at 23:14 +0000, Martin Pool wrote:
> John says:
>
> latest status is, we think we've fixed this, we can't reproduce this.
>
> Maybe we should close this for now?
From my aug 11 note on this:
What we need to do urgently:
- ensure commit rolls back the branch pointer if this happens so that
the problem commit doesn't get referenced
*** I don't recall getting this done.
- help the users affected to date recover
*** We did this
- get the workaround into distros using 1.3.1 (which seems to be common
version for this, could be 1.3.1 specific or more likely just that its
the version in hardy
*** I'm not sure we've made a push for this.
-Rob
--
GPG key available at: <http://
Martin Pool (mbp) wrote : | #17 |
Bumping to high because it's not being treated as critical
Changed in bzr: | |
importance: | Critical → High |
Changed in bzr: | |
status: | Triaged → Confirmed |
Changed in bzr: | |
status: | Confirmed → Fix Released |
Hi, your repository should be fine, the dirstate is only the working
tree.
If you have any edits in your working directory, use bzr diff or
something similar to capture them to a file, then do:
bzr remove-tree
bzr checkout .
and you will get a clean unedited tree of your last commit; from there
you can restore your edits and keep going.
Now, I did find a fix a bug that is similar to this in dirstate during
the 1.6 cycle; its possible that the issue remains, and I think I have
enough data to test from your log files, so I'm going to leave this bug
open until it gets a closer look.
Thanks,
Rob
status triaged