MediaUri changes order of parameters

Bug #451512 reported by Guillaume Desmottes
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Moovida
Fix Released
High
Ugo Riboni

Bug Description

Consider the following simple code:

result_model.playable_model.uri = MediaUri('mms://artestras.wmod.llnwd.net/a3903/o35/geo/arte7/EUR_DE_FR/arteprod/A7_SGT_ENC_08_040351-000-A_PG_HQ_FR.wmv?e=1255542944&h=a87a5ba3a44fdb5a2f43fff8dc7685be')
                print "result:", result_model.playable_model.uri

When executing it I get:

result: mms://artestras.wmod.llnwd.net/a3903/o35/geo/arte7/EUR_DE_FR/arteprod/A7_SGT_ENC_08_040351-000-A_PG_HQ_FR.wmv?amp;h=a87a5ba3a44fdb5a2f43fff8dc7685be&e=1255542944

As you can see the resulting URL is slightly different, causing the server rejecting my stream request.

Related branches

Revision history for this message
Olivier Tilloy (osomon) wrote :

The string you are passing to build your URI contains an HTML escaped ampersand (&) which causes the parsing of the parameters to fail. If I replace the escaped entity by its simple representation ("&") it seems to work fine.

I'm not sure whether this is a bug in the MediaUri parser that should do the conversion itself, or if the URI you're getting is malformed. HTML escaping in URLs works in firefox, but that doesn't mean it's correct according to the specification. A quick read through the RFC (http://www.ietf.org/rfc/rfc2396.txt) didn't convince me that HTML escaping of the query delimiters is valid.
If it happened to be the case, could you please provide a failing unit test?

Revision history for this message
Guillaume Desmottes (cassidy) wrote :

Ok, I'll try to write a test.

FYI, I get this kind of URI by parsing an ASF file downloaded from the ARTE server. Obvioulsy I can't fix the server so it would be good if Moovida was handling the URI correctly.

Revision history for this message
Michał Sawicz (saviq) wrote :

AFAIK ASF is xml, that's why you're getting escaped ampersands. Just parse the file with dom.minidom and you should get a correct URI with & instead of &.

BTW, can you please paste such an ASF playlist?

Now I can see what's going on - MediaUri splits the params at '&' and what it gets is a list containing:
- e=1255542944
- amp;h=a87a5ba3a44fdb5a2f43fff8dc7685be

And that's what it then joins in the resulting uri.

Unescaping the ampersand (e.g. using dom) will solve your issues.

Changed in elisa:
status: New → Incomplete
Revision history for this message
Guillaume Desmottes (cassidy) wrote :
Revision history for this message
Guillaume Desmottes (cassidy) wrote :

I use from elisa.extern.coherence import et for the parsing BTW

Revision history for this message
Michał Sawicz (saviq) wrote :

In [1]: from elisa.extern.coherence import et
In [2]: from elisa.core.media_uri import MediaUri
In [3]: tree = et.parse_xml('<ASX VERSION="3.0"><ENTRY><REF HREF="mms://artestras.wmod.llnwd.net/a3903/o35/geo/arte7/ALL/arteprod/A7_SGT_ENC_08_040886-286-A_PG_HQ_FR.wmv?e=1255601312&amp;h=b08e4b7f727bae8554ba4cda60bbd87c"/></ENTRY></ASX>')
In [4]: uri = MediaUri(tree.find('ENTRY/REF').items()[0][1])
In [5]: str(uri)
Out[5]: 'mms://artestras.wmod.llnwd.net/a3903/o35/geo/arte7/ALL/arteprod/A7_SGT_ENC_08_040886-286-A_PG_HQ_FR.wmv?h=b08e4b7f727bae8554ba4cda60bbd87c&e=1255601312'

What gives?

Revision history for this message
Guillaume Desmottes (cassidy) wrote :

Oh thanks, I guess I was missusing the API.

I still have a problem though. Apparantly the order of the parameters does matter. For example this URI will work (video is played)
mms://artestras.wmod.llnwd.net/a3903/o35/geo/arte7/EUR_DE_FR/arteprod/A7_SGT_ENC_08_039010-000-A_PG_HQ_FR.wmv?e=1255643828&h=f1b014bb437782e89bec30f35bacca96

while this one doesn't (an audio error message is played)
mms://artestras.wmod.llnwd.net/a3903/o35/geo/arte7/EUR_DE_FR/arteprod/A7_SGT_ENC_08_039010-000-A_PG_HQ_FR.wmv?h=f1b014bb437782e89bec30f35bacca96&e=1255643828

I know that in theory the order shouldn't matter but I'm afraid that the server is crap and there is no chance to get it fixed :(

So I'd need a way to keep the order of the params.

Revision history for this message
Guillaume Desmottes (cassidy) wrote :

I tried to change MediaUri's implementation to use a OrderedDict [1] and that solved my problem!
Any chance to see such change merged?

[1] http://www.voidspace.org.uk/python/odict.html

Revision history for this message
Guillaume Desmottes (cassidy) wrote :

As you can see this change is trivial.

Olivier Tilloy (osomon)
tags: added: patch-available
Revision history for this message
Olivier Tilloy (osomon) wrote :

Interestingly, I couldn't find in the RFC any details on the expected syntax of the query part of a URI (http://tools.ietf.org/html/rfc3986#section-3.4), and wikipedia seems to confirm that "the query string syntax is not generically defined" (http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax).

Meaning that parsing the query string into a dictionary makes an unjustified assumption on the form of the query (which by the way can take other forms, wikipedia mentions the semicolon as a valid separator for parameters as well as the ampersand).

The correct fix would consist in storing the original query string, also parse it into a dictionary if needed/required, and when returning a string representation of the URI, use the original query string.

summary: - MediaUri corrupts my URI
+ MediaUri corrupts the query string
Changed in elisa:
importance: Undecided → High
status: Incomplete → Confirmed
Revision history for this message
Olivier Tilloy (osomon) wrote : Re: MediaUri corrupts the query string

Note that a complete patch must also contain additional unit tests.

Ugo Riboni (uriboni)
Changed in elisa:
assignee: nobody → Ugo Riboni (uriboni)
Changed in elisa:
milestone: none → bug-fixing-day
Ugo Riboni (uriboni)
Changed in elisa:
status: Confirmed → In Progress
Revision history for this message
Florian Boucault (fboucault) wrote :

@Guillaume: I am currently reviewing Ugo's merge request that should fix it for you. Stay tuned!

Revision history for this message
Ugo Riboni (uriboni) wrote :

The real proper fix is what Olivier said in his comments above (preserving the original QS and returning it when a string representation is needed).

However, this is not currently possible since there are methods in the MediaURI class that allow you to manipulate pieces of the QS (such as set_param and similar).
There is a solution that would still allow to implement Olivier's idea, which is keeping track if the QS was modified through the methods, and return the original if it was never modified. However it makes the behaviour of the class less clear, since it introduce side-effects on the methods that manipulate the QS.

Therefore, for now we will only fix the issue like originally suggested by Guillaume D. by using an ordered dictionaly to store the query string parts internally.

Ugo Riboni (uriboni)
Changed in elisa:
status: In Progress → Fix Committed
Revision history for this message
Michał Sawicz (saviq) wrote :

I just found one more failing example:

If a parameter has no value (e.g. schema://host/path?param), the resulting url gets an added equal sign (schema://host/path?param=)

Adding the above example to samples in TestMediaURI makes a failing test.

Changed in moovida:
assignee: Ugo Riboni (uriboni) → nobody
status: Fix Committed → Confirmed
Revision history for this message
Florian Boucault (fboucault) wrote :

Maybe that would be better having a separate bug report. :)

Michał Sawicz (saviq)
Changed in moovida:
assignee: nobody → Ugo Riboni (uriboni)
status: Confirmed → Fix Committed
summary: - MediaUri corrupts the query string
+ MediaUri changes order of parameters
Olivier Tilloy (osomon)
Changed in moovida:
milestone: bug-fixing-day → 1.0.9
Olivier Tilloy (osomon)
Changed in moovida:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.