inconsistent delta with deleted directories

Bug #256409 reported by Ana Nelson
18
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
High
Unassigned

Bug Description

Sorry, I wasn't thinking too clearly this morning and just deleted the repo and created a new one after I got this bug since I had just started working on a new project. I can try to recreate if you can't get enough info from the traceback and log, but I'm not sure of the exact sequence of events.

Basically, I deleted a directory test/ where there was evidently some confusion over the presence or absence of a subdirectory test/functional. When I committed this delete, I got an inconsistent delta error and subsequently the repo was corrupted.

The repo contained a Rails application and Rails generators automatically create and delete files as needed in the test/ directory.

Tags: commit
Revision history for this message
Ana Nelson (ananelson) wrote :
Revision history for this message
Ana Nelson (ananelson) wrote :
Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 256409] Re: inconsistent delta with deleted directories

Hi, your repository should be fine, the dirstate is only the working
tree.

If you have any edits in your working directory, use bzr diff or
something similar to capture them to a file, then do:

bzr remove-tree
bzr checkout .

and you will get a clean unedited tree of your last commit; from there
you can restore your edits and keep going.

Now, I did find a fix a bug that is similar to this in dirstate during
the 1.6 cycle; its possible that the issue remains, and I think I have
enough data to test from your log files, so I'm going to leave this bug
open until it gets a closer look.

Thanks,
Rob

 status triaged

Changed in bzr:
status: New → Triaged
Revision history for this message
Robert Collins (lifeless) wrote :

This is critical, having looked at it it is a data loss bug:

here is what I think is happening:
 - we get a bad delta for some reason coming *from commit*
 - a bad commit is recorded - we have dangling references
 - the tree update catches the fail

What we need to do urgently:
 - ensure commit rolls back the branch pointer if this happens so that the problem commit doesn't get referenced
 - help the users affected to date recover
 - get the workaround into distros using 1.3.1 (which seems to be common version for this, could be 1.3.1 specific or more likely just that its the version in hardy

Less urgently we should get a root cause diagnosis and fix the deltas being generated to be correct; also we need to ensure a bad delta passed to commit won't cause nonsense data to be inserted.

Changed in bzr:
importance: Undecided → Critical
Revision history for this message
robsta (robsta) wrote :
Martin Pool (mbp)
Changed in bzr:
milestone: none → 1.6
Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (4.2 KiB)

I find a long list of missing text refs in the tree.

The following file ids are affected:
set(['cbdbackground.c-20080709141546-0hwoe1hny7t66qbw-1', 'libcbd-20080721081802-p4lpq2aku0ea1ank-1', 'cbdnode.c-20080728145019-3sq30ejmfun0eu9o-1', 'cbdruleset.c-20080707085659-wmf77z0bqqnn00vy-1', 'cbdprimitives.h-20080707091602-bgllbsokrc0ljkxk-1', 'cbd.sgml-20080723094649-6cdasqyczacflh6z-4', 'cbdstyle.c-20080708145010-gh3qsjzf599eg09x-1', 'cbdbackground.h-20080707153730-w999lwz113rj8xme-1', 'cbdoverrides.txt-20080722080842-sduahgvr15cbhvjd-1', 'cbdfeatures.h-20080721084723-0mddxlkv0sus4qxk-1', 'cbd-20080704110353-r3riigod3h5b596c-1', 'cbdsections.txt-20080722080915-dbjyl0dva5t3ka5h-1', 'cbdruleset.h-20080707085659-wmf77z0bqqnn00vy-2', 'cbdunused.sgml-20080723094649-6cdasqyczacflh6z-2', 'cbdblock.c-20080724092807-bzwd9c0qibr70jlq-1', 'cbdruleset.h-20080721153356-rsu1jmsbw6rzb3c5-2', 'cbdgtkstyle.c-20080721090633-tjcvf13ma0um53ge-1', 'cbdutils.h-20080721153356-rsu1jmsbw6rzb3c5-3', 'cbdselectorgroup.c-20080724124821-6cv9au4venf5fad0-1', 'libcbddoc-20080722074216-2arwt4kj1wfjszvc-1', 'cbdgtkstyle.h-20080721090633-tjcvf13ma0um53ge-2', 'cbd_gtk_style_functi-20080723094649-6cdasqyczacflh6z-6', 'cbdparser.h-20080708124649-ph9wkyq3xtwhy374-1', 'cbd.types-20080722082235-ysmj3ya2brlq1q76-1', 'cbddocs.sgml-20080724151058-oy5zvis094lesz29-1', 'cbdselectorgrouppriv-20080807102815-xwrnrk10gusgzop3-1', 'cbdruleset.c-20080721153356-rsu1jmsbw6rzb3c5-1', 'cbdblock.h-20080724092807-bzwd9c0qibr70jlq-2', 'cbdparser.c-20080704132546-ksu1ladgwczvc1p8-1', 'cbdcolor.c-20080704132609-4cyg0p8ht0gu8pun-1', 'cbdselectorgroup.h-20080724124323-w7n1y0l77tg0snja-1', 'cbdnode.h-20080718121752-hznnzep4l13ypohw-1', 'cbd.c-20080718121207-rjjv8f1j8jdt4nzq-1', 'libcbdexamples-20080807125348-15izf8csat9pqo3n-1'])

And the following revisions:
set(['<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', 'rstaudinger@meqbuq-20080707161316-ovzr4ro5nnepw1vk', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', 'rstaudinger@meqbuq-20080704140118-zfu9ft5a61y34252', '<email address hidden>', '<email address hidden>', '<email address hidden>', '<email address hidden>', 'robsta@gno...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :

I tried this, which is close to what revision 11 does, but it does not fail. I'm going to try more permutations.

Rob - if you can remember at all how you did the move - did you 'mkdir', 'mv', 'bzr mv', or did you have bzr do the move, etc.. it would be very helpful.

    def test_add_directory_and_rename_into_bug_256409(self):
        # https://bugs.launchpad.net/bzr/+bug/256409
        # corrupt deltas generated at commit...
        tree = self.make_branch_and_tree('.')
        # create a basis tree
        self.build_tree(['src/', 'src/css.h'])
        tree.smart_add(['.'])
        tree.commit('first post')
        # now a new dir and a rename
        self.build_tree(['cbd/'])
        tree.smart_add(['.'])
        tree.move(['src/css.h'], 'cbd')
        tree = tree.bzrdir.open_workingtree()
        tree.commit('make and move')
        self.run_bzr('check')

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (4.0 KiB)

Looking at rev 11, the first place an issue occured, the inventory delta looks ok and does correctly include cbd:

[(None,
  'cbd',
  'cbd-20080704110353-r3riigod3h5b596c-1',
  InventoryDirectory('cbd-20080704110353-r3riigod3h5b596c-1', 'cbd', parent_id='TREE_ROOT', revision='rstaudinger@meqbuq-20080704112853-mzq042mhe9mtnjma')),
 ('src/css-border.c',
  'cbd/css-border.c',
  'cssborder.c-20080702140619-zp6u6cvl3fug2tnh-1',
  InventoryFile('cssborder.c-20080702140619-zp6u6cvl3fug2tnh-1', 'css-border.c', parent_id='cbd-20080704110353-r3riigod3h5b596c-1', sha1='504502967ef9402a291b0c486c331c2bc466b5b8', len=5196)),
 ('src/Makefile.am',
  'src/Makefile.am',
  'makefile.am-20080630152523-k74ehpkuf0lttzfl-12',
  InventoryFile('makefile.am-20080630152523-k74ehpkuf0lttzfl-12', 'Makefile.am', parent_id='src-20080630152523-k74ehpkuf0lttzfl-10', sha1='1c92262b27d826b8ed3b13bc867dc58fc074d281', len=341)),
 ('theme/null/gtk-2.0/Makefile.am',
  'theme/null/gtk-2.0/Makefile.am',
  'makefile.am-20080630152523-k74ehpkuf0lttzfl-22',
  InventoryFile('makefile.am-20080630152523-k74ehpkuf0lttzfl-22', 'Makefile.am', parent_id='gtk2.0-20080630152523-k74ehpkuf0lttzfl-21', sha1='1afa74745a742644821c461d21f5bb18d60ac795', len=90)),
 ('src/css-gtk.c',
  'cbd/css-gtk.c',
  'cssgtk.c-20080702140619-zp6u6cvl3fug2tnh-2',
  InventoryFile('cssgtk.c-20080702140619-zp6u6cvl3fug2tnh-2', 'css-gtk.c', parent_id='cbd-20080704110353-r3riigod3h5b596c-1', sha1='ef88dca3b2e001a41bc69ef38ba1592a6c1cebc0', len=9313)),
 ('theme/null/Makefile.am',
  'theme/null/Makefile.am',
  'makefile.am-20080630152523-k74ehpkuf0lttzfl-20',
  InventoryFile('makefile.am-20080630152523-k74ehpkuf0lttzfl-20', 'Makefile.am', parent_id='null-20080630152523-k74ehpkuf0lttzfl-19', sha1='7cea4ff64cb044fe624e8be862a5343a8e2e625f', len=19)),
 ('src/css-gtk.h',
  'cbd/css-gtk.h',
  'cssgtk.h-20080702140622-1a1l73xns7a30ji6-3',
  InventoryFile('cssgtk.h-20080702140622-1a1l73xns7a30ji6-3', 'css-gtk.h', parent_id='cbd-20080704110353-r3riigod3h5b596c-1', sha1='c75b107cf8dbd905163cd8aa9f52c31e170f5c4b', len=1308)),
 ('configure.in',
  'configure.in',
  'configure.in-20080630152523-k74ehpkuf0lttzfl-9',
  InventoryFile('configure.in-20080630152523-k74ehpkuf0lttzfl-9', 'configure.in', parent_id='TREE_ROOT', sha1='7d3d9221781fd5f29bdd75e43466b8a864bde50a', len=1145)),
 ('tests/drawing.c',
  'tests/drawing.c',
  'drawing.c-20080703141534-hfdtbme1gtdl6g10-1',
  InventoryFile('drawing.c-20080703141534-hfdtbme1gtdl6g10-1', 'drawing.c', parent_id='tests-20080703135019-lso2pyy41scmv466-1', sha1='f8448842b21b14d95305eb82c46afea12a19323f', len=2845)),
 ('src/css-color.h',
  'cbd/css-color.h',
  'csscolor.h-20080702140622-1a1l73xns7a30ji6-2',
  InventoryFile('csscolor.h-20080702140622-1a1l73xns7a30ji6-2', 'css-color.h', parent_id='cbd-20080704110353-r3riigod3h5b596c-1', sha1='dba80fd0561a445bdde9cd88f384d16b4d01e458', len=943)),
 ('theme/Makefile.am',
  'theme/Makefile.am',
  'makefile.am-20080630152523-k74ehpkuf0lttzfl-18',
  InventoryFile('makefile.am-20080630152523-k74ehpkuf0lttzfl-18', 'Makefile.am', parent_id='theme-20080630152523-k74ehpkuf0lttzfl-11', sha1='0a5f00f318430def61926d3161251ae6ad03c173', len=16)),
 ('src/css-border....

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (17.3 KiB)

Still no joy reproducing the first error. I am suspecting delete/rename bugs with commit suffering the consequences of an inconsistent dirstate.

=== modified file 'bzrlib/tests/blackbox/test_commit.py'
--- bzrlib/tests/blackbox/test_commit.py 2007-12-10 16:39:00 +0000
+++ bzrlib/tests/blackbox/test_commit.py 2008-08-12 01:44:45 +0000
@@ -567,3 +567,143 @@
             retcode=3)
         self.assertContainsRe(err,
             r'^bzr: ERROR: Cannot lock.*readonly transport')
+
+ def test_add_directory_and_rename_into_bug_256409(self):
+ # https://bugs.launchpad.net/bzr/+bug/256409
+ # corrupt deltas generated at commit...
+ tree = self.make_branch_and_tree('.')
+ rev1 = "rev1"
+ rev2 = "rev2"
+ basis_string = ('<inventory format="5" revision_id="rstaudinger@meqbuq-20080703224820-8k0vqs92cnxmhhbg">\n'
+ '<file file_id="authors-20080630152523-k74ehpkuf0lttzfl-1" name="AUTHORS" revision="rev1" text_sha1="3cac852421abbeb59f7e29bf69b6604a61b6ccfd" text_size="37" />\n'
+ '<file file_id="copying-20080630152523-k74ehpkuf0lttzfl-2" name="COPYING" revision="rev1" text_sha1="dd4547c72e04fd16fd675556993abb945491a69c" text_size="26428" />\n'
+ '<file file_id="changelog-20080630152523-k74ehpkuf0lttzfl-3" name="ChangeLog" revision="rev1" text_sha1="da39a3ee5e6b4b0d3255bfef95601890afd80709" text_size="0" />\n'
+ '<file file_id="install-20080630152523-k74ehpkuf0lttzfl-4" name="INSTALL" revision="rev1" text_sha1="e6ea7e38b35a5876da160c67dbdf349649755142" text_size="9512" />\n'
+ '<file file_id="makefile.am-20080630152523-k74ehpkuf0lttzfl-5" name="Makefile.am" revision="rev1" text_sha1="9a8307720305f34ebcb5b10d88c3bcb1b55b47cc" text_size="177" />\n'
+ '<file file_id="news-20080630152523-k74ehpkuf0lttzfl-6" name="NEWS" revision="rev1" text_sha1="da39a3ee5e6b4b0d3255bfef95601890afd80709" text_size="0" />\n'
+ '<file file_id="readme-20080630152523-k74ehpkuf0lttzfl-7" name="README" revision="rev1" text_sha1="c6922bd6b9d2be14fc37574a3f319db47fb3d3e4" text_size="5792" />\n'
+ '<file executable="yes" file_id="autogen.sh-20080630152523-k74ehpkuf0lttzfl-8" name="autogen.sh" revision="rev1" text_sha1="4bd6a59b990f9b8602c6811d09f7011381dc1f14" text_size="4348" />\n'
+ '<file file_id="configure.in-20080630152523-k74ehpkuf0lttzfl-9" name="configure.in" revision="rev1" text_sha1="c0b4ecfa8bbeca08e54c5c92cca194cc5354ee89" text_size="1135" />\n'
+ '<directory file_id="src-20080630152523-k74ehpkuf0lttzfl-10" name="src" revision="rev1" />\n'
+ '<file file_id="makefile.am-20080630152523-k74ehpkuf0lttzfl-12" name="Makefile.am" parent_id="src-20080630152523-k74ehpkuf0lttzfl-10" revision="rev1" text_sha1="c79b9f6ee3d5e190159bded594a3c57bac99cb2d" text_size="389" />\n'
+ '<file file_id="cssborder.c-20080702140619-zp6u6cvl3fug2tnh-1" name="css-border.c" parent_id="src-20080630152523-k74ehpkuf0lttzfl-10" revision="rev1" text_sha1="931d11b533d9dcf8aa845b266399aa0103456101" text_size="5210" />\n'
+ '<file file_id="cssborder.h-20080702140622-1a1l73xns7a30ji6-1" name="css-border.h" parent_id="src-2008063...

Revision history for this message
Robert Collins (lifeless) wrote :

I think we can assume a mismatch between the commit builders new_inventory, and the commit objects _basis_delta object.

However both calls to record_entry_contents in commit.py have ann appropriate self._basis_delta.append(delta) call afterwards, which is good.

And the _report_and_accumulate_deletes method which looks for the file ids removed from from the basis inventory vs the new inventory also seems correct.

_populate_from_inventory tells the working tree appropriately about unversioning steps it takes, but the working tree inventory is not considered in delta creation (this is appropriate).

Revision history for this message
Martin Pool (mbp) wrote : Re: dirstate internals question

(Again kind of stating the obvious in the hope of working up to the
actual bug.) So looking at the logfile, the user has removed the
directory but not run bzr rm on it, so presumably on entering commit
they are all still marked as present in the dirstate's current tree.

One thing I notice is that after 1.3 Robert added a check to
_apply_removals() to do with checking children are not left behind if
a parent is removed. This doesn't change the success condition but
might have caught a problem at a different place.

There are some 'noise' changes in that file with changes of assertions
to if/raise but I've checked them again and they all seem safe, and
anyhow are not closely related to this code path.

I think the files removed from the tree should be removed from
commit's in-progress inventory, and accumulated into a delta to be
later removed, by _report_and_accumulate_deletes from commit.py. In
trunk this does a set difference to find the removed files. Doing set
difference is generally something we want to avoid for performance but
not necessarily wrong. This code makes me suspicious of bugs in cases
like removing a file and then re-adding it with a different id. And
as the XXX in here says, these files are not properly sorted by
directory order, and we have had bugs related to that with dirstate in
the past.

I'm not sure if the code that later uses this list relies in it being
in directory order but we should certainly check. If this is the bug
it would account for it not being trivially reproducible unless you
have carefully constructed filenames.

    def _report_and_accumulate_deletes(self):
        # XXX: Could the list of deleted paths and ids be instead taken from
        # _populate_from_inventory?
        deleted_ids = set(self.basis_inv._byid.keys()) - \
            set(self.builder.new_inventory._byid.keys())
        if deleted_ids:
            self.any_entries_deleted = True
            deleted = [(self.basis_tree.id2path(file_id), file_id)
                for file_id in deleted_ids]
            deleted.sort()
            # XXX: this is not quite directory-order sorting
            for path, file_id in deleted:
                self._basis_delta.append((path, None, file_id, None))
                self.reporter.deleted(path)

--
Martin

Revision history for this message
robsta (robsta) wrote :
Revision history for this message
Robert Collins (lifeless) wrote :

We've checked that the same commands do not generate a corrupt branch at revision 10 in robsta's tree; its looking more and more like this is not a dup, but all the history for the bug is in this report.

The only missing key in rev 11 is
0.658 missing keys during fetch: set([('cbd-20080704110353-r3riigod3h5b596c-1', 'rstaudinger@meqbuq-20080704112853-mzq042mhe9mtnjma')])
, which is also the only new id during that commit.

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (10.6 KiB)

latest test script
=== modified file 'bzrlib/tests/blackbox/test_commit.py'
--- bzrlib/tests/blackbox/test_commit.py 2007-12-10 16:39:00 +0000
+++ bzrlib/tests/blackbox/test_commit.py 2008-08-12 08:37:32 +0000
@@ -567,3 +567,129 @@
             retcode=3)
         self.assertContainsRe(err,
             r'^bzr: ERROR: Cannot lock.*readonly transport')
+
+ def test_add_directory_and_rename_into_bug_256409_2(self):
+ # https://bugs.launchpad.net/bzr/+bug/256409
+ # corrupt deltas generated at commit...
+ tree = self.make_branch_and_tree('.')
+ rev1 = "rev1"
+ rev2 = "rev2"
+ basis_string = ('<inventory format="5" revision_id="rstaudinger@meqbuq-20080703224820-8k0vqs92cnxmhhbg">\n'
+ '<file file_id="authors-20080630152523-k74ehpkuf0lttzfl-1" name="AUTHORS" revision="rev1" text_sha1="3cac852421abbeb59f7e29bf69b6604a61b6ccfd" text_size="37" />\n'
+ '<file file_id="copying-20080630152523-k74ehpkuf0lttzfl-2" name="COPYING" revision="rev1" text_sha1="dd4547c72e04fd16fd675556993abb945491a69c" text_size="26428" />\n'
+ '<file file_id="changelog-20080630152523-k74ehpkuf0lttzfl-3" name="ChangeLog" revision="rev1" text_sha1="da39a3ee5e6b4b0d3255bfef95601890afd80709" text_size="0" />\n'
+ '<file file_id="install-20080630152523-k74ehpkuf0lttzfl-4" name="INSTALL" revision="rev1" text_sha1="e6ea7e38b35a5876da160c67dbdf349649755142" text_size="9512" />\n'
+ '<file file_id="makefile.am-20080630152523-k74ehpkuf0lttzfl-5" name="Makefile.am" revision="rev1" text_sha1="9a8307720305f34ebcb5b10d88c3bcb1b55b47cc" text_size="177" />\n'
+ '<file file_id="news-20080630152523-k74ehpkuf0lttzfl-6" name="NEWS" revision="rev1" text_sha1="da39a3ee5e6b4b0d3255bfef95601890afd80709" text_size="0" />\n'
+ '<file file_id="readme-20080630152523-k74ehpkuf0lttzfl-7" name="README" revision="rev1" text_sha1="c6922bd6b9d2be14fc37574a3f319db47fb3d3e4" text_size="5792" />\n'
+ '<file executable="yes" file_id="autogen.sh-20080630152523-k74ehpkuf0lttzfl-8" name="autogen.sh" revision="rev1" text_sha1="4bd6a59b990f9b8602c6811d09f7011381dc1f14" text_size="4348" />\n'
+ '<file file_id="configure.in-20080630152523-k74ehpkuf0lttzfl-9" name="configure.in" revision="rev1" text_sha1="c0b4ecfa8bbeca08e54c5c92cca194cc5354ee89" text_size="1135" />\n'
+ '<directory file_id="src-20080630152523-k74ehpkuf0lttzfl-10" name="src" revision="rev1" />\n'
+ '<file file_id="makefile.am-20080630152523-k74ehpkuf0lttzfl-12" name="Makefile.am" parent_id="src-20080630152523-k74ehpkuf0lttzfl-10" revision="rev1" text_sha1="c79b9f6ee3d5e190159bded594a3c57bac99cb2d" text_size="389" />\n'
+ '<file file_id="cssborder.c-20080702140619-zp6u6cvl3fug2tnh-1" name="css-border.c" parent_id="src-20080630152523-k74ehpkuf0lttzfl-10" revision="rev1" text_sha1="931d11b533d9dcf8aa845b266399aa0103456101" text_size="5210" />\n'
+ '<file file_id="cssborder.h-20080702140622-1a1l73xns7a30ji6-1" name="css-border.h" parent_id="src-20080630152523-k74ehpkuf0lttzfl-10" revision="rev1" text_sha1="c2013976f4382623866b631d52f94476384d6253" text_size="1879" />\n'
+ ...

Revision history for this message
Martin Pool (mbp) wrote :

John says:

latest status is, we think we've fixed this, we can't reproduce this.

Maybe we should close this for now?

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 256409] Re: inconsistent delta with deleted directories

On Mon, 2008-09-29 at 23:14 +0000, Martin Pool wrote:
> John says:
>
> latest status is, we think we've fixed this, we can't reproduce this.
>
> Maybe we should close this for now?

From my aug 11 note on this:

What we need to do urgently:
 - ensure commit rolls back the branch pointer if this happens so that
the problem commit doesn't get referenced
 *** I don't recall getting this done.

 - help the users affected to date recover
 *** We did this

 - get the workaround into distros using 1.3.1 (which seems to be common
version for this, could be 1.3.1 specific or more likely just that its
the version in hardy
  *** I'm not sure we've made a push for this.

-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Revision history for this message
Martin Pool (mbp) wrote :

Bumping to high because it's not being treated as critical

Changed in bzr:
importance: Critical → High
Martin Pool (mbp)
Changed in bzr:
status: Triaged → Confirmed
Vincent Ladeuil (vila)
Changed in bzr:
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.