URL parsing is different from git and prevents some use cases

Bug #1169368 reported by Brian Ealdwine
Affects: Dulwich
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: 0.9.0

Bug Description

A stack trace occurs, ending in the error written in the original title: Invalid literal for int() with base 16: 'Inva'

This occurs with the following command (private names obscured):
hg clone ssh://<email address hidden>/~/git/my_project.git

SSHing into the host works fine.
Doing a git clone works fine.

To recreate:
go to www.openshift.redhat.com
..acquire a free account
..create an app
..post your ssh pubkey
..ssh into the host to check that ssh is working properly
..attempt to clone with hg-git.

I am using:
dulwich 0.8.7
hg-git (most recent from <email address hidden>/durin42/hg-git)
hg 2.2.2

Revision history for this message
Jelmer Vernooij (jelmer) wrote : Re: [Bug 1169368] [NEW] Invalid literal for int() with base 16: 'Inva'

On Tue, Apr 16, 2013 at 12:18:24AM -0000, Brian Visel wrote:
> A stack trace occurs, ending in the error written in the title.
>
> This occurs with the following command (private names obscured):
> hg clone ssh://<email address hidden>/~/git/my_project.git
>
> SSHing into the host works fine.
> Doing a git clone works fine.
>
> To recreate:
> go to www.openshift.redhat.com
> ..acquire a free account
> ..create an app
> ..post your ssh pubkey
> ..ssh into the host to check that ssh is working properly
> ..attempt to clone with hg-git.
>
> I am using:
> dulwich 0.8.7
> hg-git (most recent from <email address hidden>/durin42/hg-git)
> hg 2.2.2
Can you reproduce this with "dulwich clone ssh://" and perhaps with a public URL (i.e. non-ssh)?

Cheers,

Jelmer

Revision history for this message
Brian Ealdwine (eode) wrote :

Ok, I did a little research for you..

1) yes, the problem still exists if I use dulwich directly.

2) It looks like the problem is in the URL structure (or how a URL is
interpreted)

Apparently, when git goes to ssh://foo.bar.com/baz, it goes to:
foo.bar.com
baz

..but when dulwich goes to ssh://foo.bar.com/baz, it goes to:
foo.bar.com
/baz

I can actually get a nearly identical message from git by altering the URL:
$ git clone ssh://foo.bar.com//baz
Cloning into 'baz'...
fatal: protocol error: bad line length character: Inva

..hope this helps.
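
For context on where 'Inva' comes from: git's wire protocol frames each line (a "pkt-line") with a four-character hexadecimal length prefix. If the remote side answers with a plain-text error instead (presumably something beginning with "Invalid ..."), a client that reads the first four bytes as a length fails in exactly this way. The following is a minimal sketch of that mechanism, not dulwich's actual code; `read_pkt_line` and the sample server message are hypothetical names chosen for illustration.

```python
import io


def read_pkt_line(stream):
    """Read one pkt-line: 4 hex digits of length (prefix included), then payload.

    Hypothetical helper for illustration only.
    """
    header = stream.read(4)
    size = int(header.decode("ascii"), 16)  # raises ValueError on non-hex input
    if size == 0:
        return None  # "0000" is a flush packet and carries no payload
    return stream.read(size - 4)


# Well-formed packet: 4 prefix bytes + 6 payload bytes = 10 = 0x000a
good = io.BytesIO(b"000ahello\n")
print(read_pkt_line(good))

# Made-up plain-text error arriving where a pkt-line was expected
bad = io.BytesIO(b"Invalid repository path\n")
try:
    read_pkt_line(bad)
except ValueError as exc:
    print(exc)
```

The second call fails while parsing the length prefix, which is why the message quotes only the first four characters of whatever the server sent.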

Revision history for this message
Brian Ealdwine (eode) wrote :

One last note -- given that's the issue, if I go into a python shell, import
the dulwich client, set up a client and do:
client.fetch(path="baz", target=some_repo)
..it works, whereas:
client.fetch(path='/baz', target=some_repo)
..gives the original error.
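
The difference between the two paths can be reproduced with the standard URL parser alone. This is a sketch under assumptions, not the patch attached to this bug: `ssh_host_and_path` and its `strip_leading_slash` flag are hypothetical, and the example uses Python 3's `urllib.parse` (the same function lived in the `urlparse` module on Python 2).

```python
from urllib.parse import urlparse  # the `urlparse` module on Python 2


def ssh_host_and_path(url, strip_leading_slash=True):
    """Split an ssh:// URL into (host, path). Hypothetical helper."""
    parsed = urlparse(url)
    path = parsed.path  # always starts with "/" when the URL has a path
    if strip_leading_slash and path.startswith("/"):
        path = path[1:]  # "/baz" -> "baz", "/~/git/x.git" -> "~/git/x.git"
    return parsed.hostname, path


# Relative path, as the reporter describes git's behaviour
print(ssh_host_and_path("ssh://foo.bar.com/baz"))

# Raw parser output, which keeps the leading slash
print(ssh_host_and_path("ssh://foo.bar.com/baz", strip_leading_slash=False))

# A doubled slash still yields an absolute path after stripping one slash
print(ssh_host_and_path("ssh://foo.bar.com//baz"))
```

Stripping only one slash keeps a way to request an absolute path (by doubling the slash), which matches the `git clone ssh://foo.bar.com//baz` reproduction earlier in this thread.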

Revision history for this message
Brian Ealdwine (eode) wrote : Re: Invalid literal for int() with base 16: 'Inva'

..there is also a difference in that when cloning a path that ends in "/", dulwich fails with an OSError (presumably trying to create a folder called '').

..anyways, here's a patch for client.py. I think it's necessary, as there's no other way to get the desired behavior (other than opening up a python console).
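
One plausible way the trailing-slash failure could arise, assuming the target directory name is taken from the last path component (this is a guess at the mechanism, in line with the "presumably" above, and both function names are hypothetical):

```python
import posixpath


def target_dir_from_path(path):
    """Last path component, used as the local directory name (hypothetical)."""
    return posixpath.basename(path)


def target_dir_from_path_safe(path):
    """Same, but ignoring any trailing slashes first (hypothetical)."""
    return posixpath.basename(path.rstrip("/"))


# With a trailing slash the last component is empty, so a later
# mkdir on that name would fail.
print(repr(target_dir_from_path("git/my_project.git/")))
print(repr(target_dir_from_path_safe("git/my_project.git/")))
```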

Revision history for this message
Brian Ealdwine (eode) wrote :

..to clarify, the issue where paths ending in '/' are inaccessible is not a difference introduced by the patch, it's a separate, pre-existing issue.

I don't think the URL-ending issue really prevents any necessary behaviour; it's just a bit confusing -- whereas the behaviour addressed by the patch inhibits some use cases.

Changed in dulwich:
status: New → Confirmed
summary: - Invalid literal for int() with base 16: 'Inva'
+ URL parsing is different from git and prevents some use cases
Revision history for this message
Jelmer Vernooij (jelmer) wrote : Re: [Bug 1169368] Re: Invalid literal for int() with base 16: 'Inva'

On Thu, 18 Apr, 2013 at 10:56 AM, Brian Visel <email address hidden> wrote:
> ..to clarify, the issue where paths ending in '/' are inaccessible is
> not a difference introduced by the patch, it's a separate, pre-existing
> issue.
>
> I don't think url-ending issue really prevents any necessary behaviour,
> it's just a bit confusing -- whereas the behaviour addressed by the
> patch inhibits some use cases.
>
Thanks for the debugging, appreciate it.

The patch seems to be Python 2.5-specific (Dulwich supports 2.4).
Ideally we'd also want to add a unit test for it so it doesn't regress.

Unfortunately my time is fairly limited at the moment, so it might be a
while before I can look at these. Any help appreciated. :-)

Cheers,

Jelmer

Changed in dulwich:
status: Confirmed → Triaged
importance: Undecided → Medium
Revision history for this message
Brian Ealdwine (eode) wrote :

Two bugs fixed in pull request 91:
* The latest commit fixes a urlparse issue that was introduced in December 2010 and makes the command line unusable for any Python < 2.6. This fix is not exactly elegant, though -- it adds a 2.4-compatible urlparse.py file, and imports it instead if an incompatible urlparse is imported.
  * https://github.com/eode/dulwich/commit/0baed03424ca1bc774e926bccbdc48b1a88fc4f4
* The commit previous to that fixes the 2.4 compatibility issues specific to my earlier patch.
  * https://github.com/eode/dulwich/commit/7e4f29aaa1bdcdd97200130cd088895b74f0fee2

Both of the above patches are included in my pull request:
https://github.com/jelmer/dulwich/pull/91

urlparse bug introduced Dec 2010: https://github.com/jelmer/dulwich/commit/6eb98037b4970642bb307f4c1c8eae702f46c746
Note usage of parsed.scheme, parsed.port, parsed.username, etc.
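
To illustrate why attribute access matters here: on older Python 2 releases where the parse result is a plain tuple, `parsed.scheme` and friends are unavailable, but positional indexing still works. The sketch below is not the compatibility module from the pull request; `split_netloc` is a hypothetical helper, and it is written against Python 3's `urllib.parse` so that it runs today.

```python
from urllib.parse import urlparse  # the `urlparse` module on Python 2


def split_netloc(netloc):
    """Return (username, hostname, port) from 'user@host:port'.

    Hypothetical helper; passwords and IPv6 literals are not handled.
    """
    username = None
    if "@" in netloc:
        username, netloc = netloc.rsplit("@", 1)
    port = None
    if ":" in netloc:
        netloc, port_text = netloc.rsplit(":", 1)
        port = int(port_text)
    return username, netloc, port


parsed = urlparse("ssh://user@foo.bar.com:2222/baz")

# Positional access works whether or not the result has named attributes:
# index 0 is the scheme, 1 the network location, 2 the path.
scheme, netloc, path = parsed[0], parsed[1], parsed[2]
print(scheme, split_netloc(netloc), path)
```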

Revision history for this message
Brian Ealdwine (eode) wrote :

..unit test was also updated to reflect the correct url structure.

Jelmer Vernooij (jelmer)
Changed in dulwich:
status: Triaged → Fix Committed
Revision history for this message
Brian Ealdwine (eode) wrote :

Thanks! :-)

Jelmer Vernooij (jelmer)
Changed in dulwich:
milestone: none → 0.9.0
status: Fix Committed → Fix Released