[patch] Python-feedparser does not parse http://www.democracynow.org/podcast.xml correctly

Bug #179208 reported by Thomas Perl
Affects: feedparser (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Emmet Hikory

Bug Description

This patch fixes two issues in upstream's bug tracker:
  http://code.google.com/p/feedparser/issues/detail?id=28 and
  http://code.google.com/p/feedparser/issues/detail?id=80

This feed doesn't get parsed correctly: http://www.democracynow.org/podcast.xml
What doesn't work: The titles for all Thursday episodes are wrong

You can try to parse it right away: feedparser will not display the title of entries that contain the word "Thursday". Looking into feedparser's code and the RSS file, I see that the feed has type="plain" and mapContentType() doesn't map this one correctly, so I've added a mapping for "plain" to "text/plain".
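The idea behind the added mapping can be sketched as follows. This is a standalone illustration, not the patch itself: map_content_type and treated_as_base64 are hypothetical names, and the rule that any type other than text or XML is handled as base64 is an assumption about the parser's behaviour inferred from this report.

```python
def map_content_type(content_type):
    """Normalize shorthand type attributes to full MIME types."""
    shorthand = {
        "text": "text/plain",
        "plain": "text/plain",  # the mapping this report asks for
        "html": "text/html",
        "xhtml": "application/xhtml+xml",
    }
    content_type = content_type.lower()
    # Unknown values pass through unchanged.
    return shorthand.get(content_type, content_type)


def treated_as_base64(content_type):
    """Assumed rule: content that is neither text nor XML is base64."""
    mime = map_content_type(content_type)
    is_text_or_xml = (
        mime.startswith("text/")
        or mime.endswith("+xml")
        or mime.endswith("/xml")
    )
    return not is_text_or_xml


print(map_content_type("plain"))
print(treated_as_base64("plain"))
```

Under that assumption, an unmapped "plain" would fall through to the base64 branch, which is why the title ends up being passed to the decoder at all.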

This doesn't happen for other feeds because the base64 decoder normally fails on an ordinary string that isn't base64-encoded (it raises binascii.Error), so the text is left alone. It decodes without error, however, for the string "Thursday":

Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import base64
>>> base64.decodestring('Feedparser does not work')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/base64.py", line 321, in decodestring
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
>>> base64.decodestring('Thursday')
'N\x1b\xab\xb1\xd6\xb2'
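The difference between the two strings can be reproduced on a current Python 3 with the sketch below. It relies on the fact that the default, non-strict decoder discards characters outside the base64 alphabet and then needs the remaining characters to form complete groups of four. count_alphabet_chars and try_decode are hypothetical helpers written for this illustration.

```python
import base64
import binascii
import string

B64_ALPHABET = set(string.ascii_letters + string.digits + "+/")


def count_alphabet_chars(text):
    """Count the characters the non-strict decoder will actually keep."""
    return sum(1 for ch in text if ch in B64_ALPHABET)


def try_decode(text):
    """Return the decoded bytes, or None if the decoder rejects the input."""
    try:
        return base64.b64decode(text)
    except binascii.Error:
        return None


for sample in ("Feedparser does not work", "Thursday"):
    print(sample, count_alphabet_chars(sample), try_decode(sample))
```

"Thursday" is eight alphabet characters, exactly two groups of four, so it decodes to six bytes. The other sample keeps 21 characters once the spaces are discarded, which cannot be padded into valid base64.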

Please include the patch or (as an alternative) validate the output of the base64 decoded string, to see if the string is really base64-encoded. Or is this a bug in the base64 module?
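The alternative of validating the decoded output could look roughly like this. It is only a heuristic sketch under the assumption that the decoded content is expected to be UTF-8 text; decode_if_plausible_base64 is a hypothetical helper, not something feedparser provides. Strict alphabet checking alone would not help here, since "Thursday" consists entirely of valid base64 characters.

```python
import base64


def decode_if_plausible_base64(text):
    """Return the decoded text if it looks like base64-encoded UTF-8.

    Otherwise return the input unchanged.
    """
    try:
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(text, validate=True)
        # Assumption: a real payload is UTF-8 text, so garbage bytes fail here.
        return raw.decode("utf-8")
    except ValueError:
        # binascii.Error and UnicodeDecodeError are both ValueError subclasses.
        return text


print(decode_if_plausible_base64("Thursday"))
print(decode_if_plausible_base64("aGVsbG8gd29ybGQ="))
print(decode_if_plausible_base64("Feedparser does not work"))
```

A check like this would still misfire on a short word whose decoded bytes happen to be valid UTF-8, and it would reject genuine binary payloads, so fixing the content type mapping is the more reliable route.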

Thomas Perl (thp) wrote :
Changed in feedparser:
importance: Undecided → Medium
status: New → Confirmed
Mb (mb-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

Providing a debdiff for the patch.

Mb (mb-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote :

Subscribing u-u-s.

Emmet Hikory (persia)
Changed in feedparser:
assignee: nobody → persia
status: Confirmed → In Progress
Emmet Hikory (persia) wrote :

I've reverted the requested Standards-Version update, as the updated package does not appear to comply with the updated standards version. I've also removed the XS-Vcs-* fields from debian/control, as the packaging no longer matches the VCS.

Changed in feedparser:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package feedparser - 4.1-9ubuntu1

---------------
feedparser (4.1-9ubuntu1) hardy; urgency=low

  [ Mario Bonino ]
  * debian/patches/correct_content_type_mapping.patch:
    - patch to feedparser.py to do the correct content
      type mapping (LP: #179208)
      (patch from Thomas Perl)
  * debian/control:
    - updated Maintainer field

  [ Emmet Hikory ]
  * Drop XS-Vcs-* from debian/control, as the packaging differs from the VCS

 -- Emmet Hikory <email address hidden> Wed, 02 Jan 2008 20:24:23 +0900

Changed in feedparser:
status: Fix Committed → Fix Released