status gives false positives for text files with CRLF (@win32)

Bug #153493 reported by Alexander Belchenko
4
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Critical
John A Meinel

Bug Description

It seems that Python 2.4 support on windows platform is broken. Here is results for bzr.exe (python 2.5-based) and bzr.dev.revno.2904 running on python 2.4.

bzr.exe 0.91:

E:\Bazaar\mydev\bzr.dev>bzr --no-plugins st

^-- nothing changed. as expected.

Then I run from sources without compiled extensions:

E:\Bazaar\mydev\bzr.dev>python bzr --no-plugins st
modified:
  bzr.ico
  bzrlib/util/effbot/__init__.py
  bzrlib/util/effbot/org/__init__.py
  bzrlib/util/effbot/org/gzip_consumer.py
  bzrlib/util/effbot/org/http_client.py
  bzrlib/util/effbot/org/http_manager.py
  tools/win32/start_bzr.bat

Please note, that actually I don't touch any file.

After this point dirstate file is modified and bzr.exe also starts to show the same result.
When I try to look at diff I get traceback:

E:\Bazaar\mydev\bzr.dev>python bzr di
=== modified file 'bzr.ico'
Binary files bzr.ico 2006-07-31 16:12:57 +0000 and bzr.ico 2007-10-16 10:21:28 +0000 differ
=== modified file 'bzrlib/util/effbot/__init__.py'
bzr: ERROR: exceptions.IndexError: list index out of range

Traceback (most recent call last):
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\commands.py", line 802, in run_bzr_catch_errors
    return run_bzr(argv)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\commands.py", line 758, in run_bzr
    ret = run(*run_argv)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\commands.py", line 492, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\commands.py", line 768, in ignore_pipe
    result = func(*args, **kwargs)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\builtins.py", line 1502, in run
    old_label=old_label, new_label=new_label)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\diff.py", line 344, in diff_cmd_helper
    extra_trees=extra_trees)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\diff.py", line 378, in show_diff_trees
    path_encoding=path_encoding)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\diff.py", line 464, in _show_diff_trees
    True, kind, to_file, diff_file)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\diff.py", line 512, in _maybe_diff_file_or_symlink
    new_tree, to_file)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\inventory.py", line 160, in diff
    output_to, reverse)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\inventory.py", line 630, in _diff
    to_label, to_text, output_to)
  File "E:\Bazaar\mydev\bzr.dev\bzrlib\diff.py", line 96, in internal_diff
    ud[0] = ud[0][:-2] + '\n'
IndexError: list index out of range

bzr 0.92.0.dev.0 on python 2.4.3.final.0 (win32)
arguments: ['bzr', 'di']
encoding: 'cp1251', fsenc: 'mbcs', lang: None
plugins:
  launchpad E:\Bazaar\mydev\bzr.dev\bzrlib\plugins\launchpad [unknown]
  multiparent E:\Bazaar\mydev\bzr.dev\bzrlib\plugins\multiparent.pyc [unknown]

** Please send this report to <email address hidden>
   with a description of what you were doing when the
   error occurred.

All files above either binary or have CRLF line-endings.

Tags: win32

Related branches

Revision history for this message
Alexander Belchenko (bialix) wrote :

This bug reproducible with python 2.5 too on Windows 2000, but on my laptop with Python 2.5 @ Windows XP Home is not.
Something really weird going on here.

Revision history for this message
Gary van der Merwe (garyvdm) wrote :

I experienced this bug on WIn XP with python 2.5.1.final.0

Revision history for this message
Alexander Belchenko (bialix) wrote :

I'm not sure is this bug should be marked as High or Critical. For me it's critical enough and for 0.92 it should be fixed.

Changed in bzr:
importance: Undecided → High
status: New → Confirmed
Revision history for this message
Alexander Belchenko (bialix) wrote :

Promoted to critical.

I try to dogfooding recent bzr.dev and I can't working with it on win32. Many of my branches affected by this problem, when status lies about changes, then diff blowed up with ugly traceback, and only commit works correctly say me:

bzr: ERROR: no changes to commit. use --unchanged to commit anyhow

Changed in bzr:
importance: High → Critical
Revision history for this message
Alexander Belchenko (bialix) wrote : Re: [Bug 153493] Re: status gives false positives for text files with CRLF (@win32)

This regression was introduced by revno.2894 (Martin Pool) with log
message:

add -Dhashcache, sha_file_by_name using raw os files rather than file
objects (mbp)

I think bug actually in following line:

=== modified file 'bzrlib/osutils.py'
--- bzrlib/osutils.py 04.10.2007 8:09:58
+++ bzrlib/osutils.py 08.10.2007 8:09:59
@@ -590,6 +590,20 @@
      return s.hexdigest()

+def sha_file_by_name(fname):
+ """Calculate the SHA1 of a file by reading the full text"""
+ s = sha.new()
+ f = os.open(fname, os.O_RDONLY)

^-- I believe on Windows it should be:

+ f = os.open(fname, os.O_RDONLY|os.O_BINARY)

And this change indeed fixes regression.
The problem here is O_BINARY flag itself. According to Python
documentation this flag exists only on Windows.
So probably fix should be something like this:

=== modified file 'bzrlib/osutils.py'
--- bzrlib/osutils.py 2007-10-08 05:09:59 +0000
+++ bzrlib/osutils.py 2007-10-22 10:57:42 +0000
@@ -590,10 +590,15 @@
      return s.hexdigest()

+if sys.platform == 'win32':
+ OS_OPEN_FLAGS = os.O_RDONLY | os.O_BINARY
+else:
+ OS_OPEN_FLAGS = os.O_RDONLY
+
  def sha_file_by_name(fname):
      """Calculate the SHA1 of a file by reading the full text"""
      s = sha.new()
- f = os.open(fname, os.O_RDONLY)
+ f = os.open(fname, OS_OPEN_FLAGS)
      try:
          while True:
              b = os.read(f, 1<<16)

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander Belchenko wrote:
> This regression was introduced by revno.2894 (Martin Pool) with log
> message:
>
> add -Dhashcache, sha_file_by_name using raw os files rather than file
> objects (mbp)
>
>
> I think bug actually in following line:
>
> === modified file 'bzrlib/osutils.py'
> --- bzrlib/osutils.py 04.10.2007 8:09:58
> +++ bzrlib/osutils.py 08.10.2007 8:09:59
> @@ -590,6 +590,20 @@
> return s.hexdigest()
>
>
> +def sha_file_by_name(fname):
> + """Calculate the SHA1 of a file by reading the full text"""
> + s = sha.new()
> + f = os.open(fname, os.O_RDONLY)
>
> ^-- I believe on Windows it should be:
>
> + f = os.open(fname, os.O_RDONLY|os.O_BINARY)
>
> And this change indeed fixes regression.
> The problem here is O_BINARY flag itself. According to Python
> documentation this flag exists only on Windows.
> So probably fix should be something like this:
>
> === modified file 'bzrlib/osutils.py'
> --- bzrlib/osutils.py 2007-10-08 05:09:59 +0000
> +++ bzrlib/osutils.py 2007-10-22 10:57:42 +0000
> @@ -590,10 +590,15 @@
> return s.hexdigest()
>
>
> +if sys.platform == 'win32':
> + OS_OPEN_FLAGS = os.O_RDONLY | os.O_BINARY
> +else:
> + OS_OPEN_FLAGS = os.O_RDONLY
> +
> def sha_file_by_name(fname):
> """Calculate the SHA1 of a file by reading the full text"""
> s = sha.new()
> - f = os.open(fname, os.O_RDONLY)
> + f = os.open(fname, OS_OPEN_FLAGS)
> try:
> while True:
> b = os.read(f, 1<<16)
>
>
>
>

+1

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHHKxTJdeBCYSNAAMRAuwzAJ9whDCYz0LLd8TVBnEpFk85c4lxxACfRZQp
MNzao25GqHQX8ISzJpb6Vmk=
=sYck
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) wrote :

Fix will be in associated branch.

Changed in bzr:
status: Confirmed → In Progress
Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander Belchenko wrote:
> I'm not sure is this bug should be marked as High or Critical. For me
> it's critical enough and for 0.92 it should be fixed.
>

I think it is a Critical regression.

The reason 'diff' is blowing up is because 'internal_diff' dies when you feed
it 2 texts which have no differences. Which should also be considered a bug.

It doesn't effect any actual committed changes, etc.

Oh, and we've already handled the O_BINARY issues in the Transport code. You
can use 'osutils.O_BINARY' which is set up to handle when os.O_BINARY is not
available.

I'll write up a patch.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD4DBQFHHMOSJdeBCYSNAAMRAjXXAJdZPngKfOZwuDlrDVdmkiahW/++AJ4xC4n7
rO670Fi5LL3CvEqV0YK95Q==
=16ci
-----END PGP SIGNATURE-----

John A Meinel (jameinel)
Changed in bzr:
status: In Progress → Fix Committed
assignee: nobody → jameinel
Revision history for this message
John A Meinel (jameinel) wrote :

submitted as bzr.dev 2923

Changed in bzr:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.