`bzr diff --using C:\foo\bar` does not work

Bug #392428 reported by Alexander Belchenko
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Medium
Gordon Tyler

Bug Description

`diff --using` uses shlex.split to handle filename of diff tool, but shlex.split works bad with backslashes. See

In [1]: import shlex

In [2]: shlex.split(r'foo C:/bar/spam')
Out[2]: ['foo', 'C:/bar/spam']

In [3]: shlex.split(r'foo C:\bar\spam')
Out[3]: ['foo', 'C:barspam']

In [5]: shlex.split(r'foo C:\\bar\\spam')
Out[5]: ['foo', 'C:\\bar\\spam']

In [6]: shlex.split(r'foo "C:\bar\spam"')
Out[6]: ['foo', 'C:\\bar\\spam']

`diff --using` uses helper function in commands.py shlex_split_unicode.
Because even in the case of

bzr diff --using "C:\foo\bar"

this function will be invoked with unquoted argument, then there is always will be problem for windows users.

To solve this problem properly bzr should use Win32 API CommandLineToArgvW() function (http://msdn.microsoft.com/en-us/library/bb776391(VS.85).aspx) instead of shlex.split.
This function is already using in win32utils:get_unicode_argv() function.

Related problems: https://bugs.launchpad.net/bzr/+bug/336228, and http://stackoverflow.com/questions/595885/using-winmerge-with-bazaar

Tags: win32

Related branches

Changed in bzr:
assignee: nobody → Alexander Belchenko (bialix)
status: New → Confirmed
Changed in bzr:
assignee: Alexander Belchenko (bialix) → nobody
Martin Pool (mbp)
Changed in bzr:
importance: Undecided → Medium
tags: added: win32
Revision history for this message
Gordon Tyler (doxxx) wrote :

win32utils.get_unicode_argv has been changed to explicitly return the current process' command line arguments, however win32utils._command_line_to_argv does what you want to do using the UnicodeShlex class which duplicates shlex functionality in a win32-compatible manner. Is there any reason it must use the Win32 API CommandLineToArgvW rather than this existing function in win32utils?

Revision history for this message
Gordon Tyler (doxxx) wrote :

In fact, we can't use the Win32 API CommandLineToArgvW because quotes are taken into account but stripped, making it impossible to decide whether a particular arg should be glob expanded.

For example:

'foo "bar baz" stuff'

becomes :

['foo', 'bar baz', 'stuff']

Revision history for this message
John A Meinel (jameinel) wrote :

I think Alexander's point is that when processing "--using" we should use the new UnicodeShlex functionality. He isn't talking about parsing the command line params, but instead parsing the "--using" parameter.

Revision history for this message
Alexander Belchenko (bialix) wrote : Re: [Bug 392428] Re: `bzr diff --using C:\foo\bar` does not work

John A Meinel пишет:
> I think Alexander's point is that when processing "--using" we should
> use the new UnicodeShlex functionality. He isn't talking about parsing
> the command line params, but instead parsing the "--using" parameter.

I suspect new command-line parser from John actually fixed this problem. Just need to test it.

Gordon Tyler (doxxx)
Changed in bzr:
assignee: nobody → Gordon Tyler (doxxx)
milestone: none → 2.1.0rc1
Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander Belchenko wrote:
> John A Meinel пишет:
>> I think Alexander's point is that when processing "--using" we should
>> use the new UnicodeShlex functionality. He isn't talking about parsing
>> the command line params, but instead parsing the "--using" parameter.
>
> I suspect new command-line parser from John actually fixed this problem.
> Just need to test it.
>

I doubt it. I think we double-parse. One time to get the arguments from
the command line, which gives us a big string for '--using'. And then we
parse that string a second time. To give us a shell executable +
arguments to pass. For example

bzr diff --using "diff -bu"

or

bzr diff --using "C:\Program Files\gnuwin32\diff -bu"

Of course, the latter actually needs to be written as:

bzr diff --using "\"C:\\Program Files\\gnuwin32\\diff\" -bu"

(though if you had let me support ' as a quote character, then you could
have done:

bzr diff --using '"C:\Program Files\gnuwin32\diff" -bu'

But if you want to use " inside of an already quoted string, then you
need to escape it, and you need to escape your escape chars. And the
single quote on the outside means that it ignores '\' in the inside. (Or
at least, could have meant that.)

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksvz1YACgkQJdeBCYSNAANkMgCeMF8sN/J0em82ZsdjZ06Xz60W
EDIAn1hbxrDmE2IL6XDOoCDORoHQIl/I
=ow+J
-----END PGP SIGNATURE-----

Revision history for this message
Gordon Tyler (doxxx) wrote :

Oddly enough, the following works:

bzr diff --using "C:\Program Files (x86)\WinMerge\WinMergeU.exe"

as well as:

bzr diff --using "C:\Program Files (x86)\WinMerge\WinMergeU.exe @old_path @new_path"

There's some interesting guessing going on somewhere along the codepath that makes it Do The Right Thing(tm) since when I added some debug code to diff.py to dump the command line template it was using, the list had split the path th WinMergeU.exe at the spaces. But it still worked...

Revision history for this message
Gordon Tyler (doxxx) wrote :

For example: [u'C:\\Program', u'Files', u'(x86)\\WinMerge\\WinMergeU.exe', u'@old_path', u'@new_path']

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gordon Tyler wrote:
> For example: [u'C:\\Program', u'Files',
> u'(x86)\\WinMerge\\WinMergeU.exe', u'@old_path', u'@new_path']
>

I think that if you look at the code using diff, it actually tries to
exec each portion of the path with the rest as args until it works. So
it tries:

C:\Program
then
C:\Program Files
then
C:\Program Files (x86)
then
...

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksv4EgACgkQJdeBCYSNAAMZRgCeOQon9ubngCbCUJUPJ2xscZlN
zmwAoJ+cBEVZ5IipoFLedXW3VfcWoEwp
=0stz
-----END PGP SIGNATURE-----

John A Meinel (jameinel)
Changed in bzr:
milestone: 2.1.0rc1 → none
Martin Pool (mbp)
Changed in bzr:
status: Confirmed → In Progress
Gordon Tyler (doxxx)
Changed in bzr:
status: In Progress → Fix Committed
Revision history for this message
Martin Pool (mbp) wrote :

will be in 2.2b1

Changed in bzr:
milestone: none → 2.2.0b1
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.