can't debug bzr on windows with pdb

Bug #587868 reported by Jason Spashett
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Medium
Unassigned

Bug Description

Bzr 'does' something with arguments on windows, which means it cannot be debugged with pdb as far as I can tell. The same command works on ubuntu.

C:\bzr\bazaar\578005_no_revision_none>python -m pdb bzr.py help
> c:\bzr\bazaar\578005_no_revision_none\bzr.py(19)<module>()
-> """Bazaar -- a free distributed version-control tool"""
(Pdb) c
bzr: ERROR: unknown command "bzr.py"
bzr: warning: some compiled extensions could not be loaded; see <https://answers.launchpad.net/bzr/+faq/703>

C:\bzr\bazaar\578005_no_revision_none>

Tags: win32

Related branches

Revision history for this message
Jason Spashett (jspashett) wrote :

Looking into it I find in get_unicode_argv

 # Skip the first argument, since we only care about parameters
        argv = _command_line_to_argv(command_line)[1:]

but the command line retunred from GetCommandLineW on windows, is:

unicode: C:\\Python26\\python.exe -u C:\\eclipse\\plugins\\org.python.pydev.debug_1.5.7.2010050621\\pysrc\\pydevd.py --vm_type python --client localhost --port 2134 --file C:\\bzr\\bazaar\\578005_no_revision_none\\bzr.py

under the debugger.

I'll check later how it works on other platforms. Perhaps pdb "fixes up the argv", but GetCommandLineW sidesteps the fix.

It makes no difference if I use eclipse or pdb straight.

Revision history for this message
Alexander Belchenko (bialix) wrote :

python interpreter itself processing its own command-line options and removed them from sys.argv. But bzr on windows trying to parse original command-line manually.

The best you can do is to write your own bzr.py to bypass that command-line parsing.

Changed in bzr:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Jason Spashett (jspashett) wrote :

Yes perhaps it cannot be fixed. How come we need to call the windows api to get unicode arguments? Python does not do this on windows?

Revision history for this message
Jason Spashett (jspashett) wrote :

I am thinking along the lines of the last answer here on stackoverflow: http://stackoverflow.com/questions/846850/how-to-read-unicode-characters-from-command-line-arguments-in-python-on-windows

that is count len sys.argv, and len CommandLineW and remove the same number of characters. At least, that's what I think it does. -- but it won't work. becuase of multibyte encoding. However, prehaps we can count the number of arguments presumably, and remove the appropriate number from the start of the command line, which might work, or it might not with stateful encodings, it dends what happens withe the space character I assume.

Revision history for this message
Alexander Belchenko (bialix) wrote : Re: [Bug 587868] Re: can't debug bzr on windows with pdb

Jason Spashett пишет:
> Yes perhaps it cannot be fixed.

No, it can be fixed, just need to add more code to bzr command-line
parsing functions. If you willing to fix this I can guide you.

> How come we need to call the windows api
> to get unicode arguments?

1) We need to get actual unicode command-line because python tends to
encode arguments in user encoding, effectively hiding from us real
unicode. We had bug report about this.

2) Starting from 2.1 bzr parsing raw unsicode command-line string to be
able imitate glog expansion as on Linux, e.g.

bzr status *.txt

will show you status only for text files in the current directory.

> Python does not do this on windows?

Yep.

Revision history for this message
Jason Spashett (jspashett) wrote :

Hello Alexander,

Alexander Belchenko wrote:
> Jason Spashett пишет:
>
>> Yes perhaps it cannot be fixed.
>>
>
> No, it can be fixed, just need to add more code to bzr command-line
> parsing functions. If you willing to fix this I can guide you.
>
>
Yes, I can have a look at this, could you describe briefly what it may
require? What I wrote in the bug report wasn't a good way?

>> How come we need to call the windows api
>> to get unicode arguments?
>>
>
> 1) We need to get actual unicode command-line because python tends to
> encode arguments in user encoding, effectively hiding from us real
> unicode. We had bug report about this.
>
>
Understood.

> 2) Starting from 2.1 bzr parsing raw unsicode command-line string to be
> able imitate glog expansion as on Linux, e.g.
>
> bzr status *.txt
>
> will show you status only for text files in the current directory.
>
Not sure why unicode is needed for (2), but it doesn't matter. Obviously
there is a reason.

Regards,

Jason

Revision history for this message
Alexander Belchenko (bialix) wrote :

Jason Spashett пишет:
> Alexander Belchenko wrote:
>> 2) Starting from 2.1 bzr parsing raw unsicode command-line string to be
>> able imitate glog expansion as on Linux, e.g.
>>
>> bzr status *.txt
>>
>> will show you status only for text files in the current directory.
>>
> Not sure why unicode is needed for (2), but it doesn't matter. Obviously
> there is a reason.

For emulating glob properly bzr needs to know about each argument: is
it was quoted or not. So

bzr ignore "*.obj"

won't expand wildcards and keep them as is. For this reason bzr needs
access to raw command-line string. Unicode is not required for this of
course.

Revision history for this message
Alexander Belchenko (bialix) wrote :

What version of bzr you're using: lp:bzr/2.1 or lp:bzr (aka bzr.dev)?

tags: added: win32
Revision history for this message
Alexander Belchenko (bialix) wrote :

Code hint: look at the bzrlib/win32utils.py, end of the file has the function get_unicode_argv. In that function you need to improve the following code block:

        if getattr(sys, 'frozen', None) is None:
            # Invoked via 'python.exe' which takes the form:
            # python.exe [PYTHON_OPTIONS] C:\Path\bzr [BZR_OPTIONS]
            # we need to get only BZR_OPTIONS part,
            # We already removed 'python.exe' so we remove everything up to and
            # including the first non-option ('-') argument.
            for idx in xrange(len(argv)):
                if argv[idx][:1] != '-':
                    break
            argv = argv[idx+1:]

Currently it handles `python -c "foo" ...` use case only.

If you comparing sys.argv (sys.argv[1:]) and output of that function you can see extra arguments (from -m pdb) which should be removed as well.

Revision history for this message
Jason Spashett (jspashett) wrote :

Alexander Belchenko wrote:
> What version of bzr you're using: lp:bzr/2.1 or lp:bzr (aka bzr.dev)?
>
> ** Tags added: win32
>
>
bzr.dev

Revision history for this message
Alexander Belchenko (bialix) wrote :

Jason Spashett пишет:
> Hello Alexander,
>
> You said that bug #587868 could be fixed. I can look at this but perhaps
> you can tell me what you think needs to be done to fix it.

Read my comment https://bugs.launchpad.net/bzr/+bug/587868/comments/9

If you run qannotate on this function and go back to its origin, you can
see that I've used len(sys.argv[1:]) to get the size of the tail. But
now (with glob expansion) we can't use the same. So we need to use
different approach.

The first good step would be to run bzr under debugger (as you did) and
compare sys.argv with output of get_unicode_argv(), please post those
data as comment to bug report. Based on this data we need to find the
solution to make bzr compatible with pdb.

Another option would be to implement support for some environment
variable, e.g. BZR_NO_GLOB to avoid using this code and forcing use
sys.argv instead.

Maybe the latter would be much simpler to implement.

Revision history for this message
Alexander Belchenko (bialix) wrote :

I've implemented BZR_NO_GLOB support in the attached branch (https://code.launchpad.net/~bialix/bzr/no-glob). If you're OK with it then it could be merged into bzr.dev.

Revision history for this message
Jason Spashett (jspashett) wrote :

Thanks Alexander. I have also put a branch up: https://code.launchpad.net/~jspashett/bzr/587868

which should correctly strip arguments in line with sys.argv. Please have a look and tell me what you think. I am just trying to think of cases where it may not work or cause problems. Tested, and appears to work standalone, with pdb and with eclipse.

Revision history for this message
Jason Spashett (jspashett) wrote :

I have updated https://code.launchpad.net/~jspashett/bzr/587868_args_handling_cant_debug and added rational in comments and assert for the case that:

len(arguments) < len (sys.argv) cannot occur.

Can anyone see a problem with the way this fix operates?

Revision history for this message
John A Meinel (jameinel) wrote :

Branch landed in 2.2, marking it as fixed.

Changed in bzr:
milestone: none → 2.2.0
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.