status shows incorrect info on mv over rm'd file

Bug #109993 reported by Matthew Fuller
2
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
High
John A Meinel

Bug Description

If I rm a file, then mv another file over the same name, 'status' shows the wrong information. If I 'commit' the changes, they come out right in the log however.

Some notes:

- This affects dirstate trees, on 0.15 and current bzr.dev. It does not affect knit trees.

- This IS an ordering-dependent problem. If I call the files 'foo' and 'bzr' instead of 'M' and 'x', the problem does NOT occur (originally seen in a tree with Makefile and xMakefile)

When the following script runs, the 'stat' shows me:
removed:
  M
  x

It should instead show:
removed:
  M
renamed:
  x => M

('dbzr' is my bzr.dev symlinks. Salt to taste)

#!/bin/sh
bzr="dbzr"

${bzr} init
touch M x
${bzr} add
${bzr} ci -m '1'
rm M
${bzr} rm M
${bzr} mv x M
${bzr} stat

Tags: dirstate
Revision history for this message
John A Meinel (jameinel) wrote :

Confirmed.
The dirstate file itself seems to have recorded this correctly (It shows that x was renamed to M, and that M with the old file id was deleted.

If you switch the removed file, it "works".

bzr init foo
cd foo
touch M x
bzr add
bzr commit -m 1
rm x
bzr rm x
bzr mv M x
bzr status

Will properly show M => x and x removed.

I think the problem is that _iter_changes is finding that x is renamed, and then looking up the record for M, and finding the deleted one, rather than the proper target.

And I *think* that is because the file-id "x-foobabrabou" is sorting after "M-aboeunthoeu", so whether we find the existing file, or the deleted one switches based on original filename (which effects file id).

Changed in bzr:
importance: Undecided → High
status: Unconfirmed → Confirmed
Revision history for this message
John A Meinel (jameinel) wrote :

starting on it in associated branch

Changed in bzr:
assignee: nobody → jameinel
status: Confirmed → In Progress
Revision history for this message
John A Meinel (jameinel) wrote :

I don't have a fix yet, but the branch has a failing test.

The problem seems to be with how we are iterating over the filesystem and dirstate at the same time.

Imagine:

touch a b
bzr add a b
bzr commit -m 1
bzr rm a
bzr mv b a

bzr status

At this point there are 2 ('', 'a') records ('', 'a', 'a-id') => deleted, and ('', 'a', 'b-id') => renamed from 'b'

So what happens is we iterate over the filesystem and dirstate at the same time, and we get to
('', 'a') on disk and ('', 'a', 'a-id') in the dirstate. Which tells us (correctly) that 'a' was deleted.
Then we step the dirstate and directory pointer because we have determined that we handled that file.
Then we get to None, and ('', 'a', 'b-id'). Because we already stepped past 'a'.

So then we assume that because ('', 'a', 'b-id') has no matching file on disk, then it is 'missing/deleted'.

The trick is that when we get to a filesystem path, we have to make sure to match it against all matching dirstate records.

This either means we need to not move the filesystem pointer forward until we stop matching, or that when we evaluate for a match, we look ahead to see if there are multiple matching dirstate records.

Both end up causing a small performance hit, because you have extra checks on every record to see if there are more records you need to handle. (You can't tell just from 'a' being deleted that 'b' has been moved on top).

Note, things work correctly if you switch the direction (a => b), because the first record we encounter is the renamed ('', 'b', 'a-id') and then we encounter the deleted ('', 'b', 'b-id').

one other possibility would be to update _process_entry to make it _process_entries(), and have a DirState._get_entries() function which would return all entries that match a given path.

Or have a DirState.iter_entries_by_path() which might return multiple DirState entries (rows?).

Revision history for this message
John A Meinel (jameinel) wrote :

fix available in branch.

Changed in bzr:
status: In Progress → Fix Committed
Revision history for this message
John A Meinel (jameinel) wrote :

This is now in bzr.dev, it missed 0.16rc1, but I imagine it will be in 0.16-final

Changed in bzr:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.