Tweet parsing problem

Bug #1185031 reported by Contribucious
This bug affects 1 person
Affects: Birdie
Status: Fix Committed
Importance: High
Assigned to: vasco
Milestone: (none)

Bug Description

Tweet parsing problem:
  => See this screenshot: http://i.imgur.com/NfF8d4y.png
  => Tweet concerned: https://twitter.com/MKERone/status/339364094786285568

N.B.: Also note that, apparently, when only one tweet is returned in search results, it appears at the bottom of the window rather than at the top. I'll report that in a separate bug report.

OS: Ubuntu 12.04.2 LTS (with XFCE)
Birdie version used: the latest one available in the DAILY repository (I ran "apt-get install birdie" to update it a few minutes ago).

*EDIT*
In fact, I suppose the problem in this case is the "greater than" character (>)
from the text of the tweet, not the & from the URL.
So all special characters need to be parsed.

*EDIT 2*
The problem is apparently already resolved in the next revision; cf. my comment below.

--
Related log lines (I think):
--
[_LOG_LEVEL_WARN 15:40:37.564540] [Gtk] Failed to set text from markup due to error parsing markup: Error on line 1: Entity did not end with a semicolon; most likely you used an ampersand character without intending to start an entity - escape ampersand as &amp;
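
For illustration, the failure in the log line above can be reproduced in a few lines. This is a minimal Python sketch, not Birdie's code (Birdie is written in Vala, where GLib's g_markup_escape_text would probably be the natural tool). It uses the standard library's XML parser as a stand-in for GTK's markup parser, on the assumption that both reject a bare ampersand. The helper is_valid_markup and the sample text are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_valid_markup(fragment: str) -> bool:
    """Return True if the fragment parses as XML-style markup.

    The fragment is wrapped in a dummy root element, roughly the way a
    label's markup string is treated as one document.
    """
    try:
        ET.fromstring(f"<markup>{fragment}</markup>")
        return True
    except ET.ParseError:
        return False


# Hypothetical tweet text containing special characters.
raw = "rock & roll > pop"

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
safe = escape(raw)

print(is_valid_markup(raw))   # the bare "&" makes the parser fail
print(is_valid_markup(safe))  # the escaped text parses
print(safe)
```

The point of the sketch is that escaping has to happen once, on the plain text, right before the string is handed to the markup parser.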

description: updated
Contribucious (contribucious) wrote :

Ahah, while browsing the recent changes, I just saw this commit:
http://bazaar.launchpad.net/~birdie-team/birdie/trunk/revision/238?start_revid=239

I thought I already had this revision, since I had updated very recently (a few minutes before I saw this bug and reported it).
So I thought there was still a parsing problem even with this commit, but no!

I checked and... I have r237(!):
=> "Preparing to replace birdie 0.2+r231-0+pkg7~precise1 (using .../birdie_0.2+r237-0+pkg7~precise1_amd64.deb) ..."

And the commit that seems to fix this is r238! :^)
So, apparently, the (auto-)build is not done yet for this revision. I just need to wait, then. :-)

description: updated
summary: - URL parsing problem
+ Tweet parsing problem
description: updated
vasco (vasco-m-nunes)
Changed in birdie:
status: New → Fix Committed
Contribucious (contribucious) wrote :

I would say you'd be better off using a quality parsing library for this, rather than manual .replace calls...
Because doing it manually is the best way to forget many entities in the list. :^)

By the way, I found two new tweets with the problem (even with the latest revision, r244, which includes the fix):
1/ http://i.imgur.com/OBjSixI.png -> https://twitter.com/iVerger/status/339683424216428544 (seems to be the " at the end, which becomes part of the URL)
2/ http://i.imgur.com/xnBVznt.png -> https://twitter.com/MagaliPernin/status/339690878882959360 (same remark)
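
One way a trailing quote can end up inside a link is if URLs are detected while the text is still entity-encoded, or with a pattern that accepts quote characters. The following is a minimal Python sketch of an ordering that avoids this, under the assumption that the API text arrives with HTML entities and that the label accepts <a href="..."> markup. It is not Birdie's actual code; linkify, URL_RE and the sample inputs are hypothetical.

```python
import html
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical URL pattern: stop at whitespace, angle brackets and quotes.
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def linkify(api_text: str) -> str:
    """Decode entities, find URLs in the plain text, then escape each piece."""
    plain = html.unescape(api_text)  # 1. entity-encoded text -> plain text
    out = []
    pos = 0
    for match in URL_RE.finditer(plain):
        # 2. drop trailing punctuation that is unlikely to belong to the URL
        url = match.group(0).rstrip(".,;:!?)'")
        end = match.start() + len(url)
        # 3. escape the text before the URL, then emit the link itself
        out.append(escape(plain[pos:match.start()]))
        out.append(f"<a href={quoteattr(url)}>{escape(url)}</a>")
        pos = end
    out.append(escape(plain[pos:]))  # escape whatever follows the last URL
    return "".join(out)


print(linkify("&quot;see http://t.co/abc&quot;"))
print(linkify("http://a.b/?x=1&amp;y=2 &gt; ok"))
```

In both sample inputs the quote and the "greater than" sign stay outside the link, and the ampersand inside the second URL is escaped only when the markup is built.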

vasco (vasco-m-nunes)
Changed in birdie:
status: Fix Committed → Triaged
importance: Undecided → High
assignee: nobody → vasco (vasco-m-nunes)
vasco (vasco-m-nunes) wrote :

Thank you for all your reports. This should now be fixed in revision #245.

Changed in birdie:
status: Triaged → Fix Committed
Contribucious (contribucious) wrote :

Nice! :)

You're welcome, but I'm just a bug reporter and suggestion maker, you know.
Even if that's of course very useful, it's YOU (among others) who gave life to this great Twitter client and who work hard on the code! So, definitely, thank YOU!

Contributing to the effort is therefore the least I can do, for my part. ;)

-

About this report, however: I see that this time you added many manual .replace calls.
Great, but for example, you added .replace ("&frac12;", "½") but not .replace ("&frac14;", "¼") nor .replace ("&frac34;", "¾").
This is just an example to show that, by doing this manually, you're SURE to forget many things, because the list is so long!
See for example this list (which may not even be exhaustive!): http://www.w3schools.com/tags/ref_entities.asp

=> Moral: Find a (light but still) decent, quality parsing library (one that many people have contributed to over many years) to do the parsing. ;)
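
To make the comparison concrete, here is a small Python sketch contrasting a hand-maintained replacement list with a library decoder. It is only an illustration of the argument, not the Vala code from the Birdie commit: the MANUAL table and decode_manually are hypothetical, and html.unescape from Python's standard library stands in for "a quality parsing library".

```python
import html

# Hypothetical hand-maintained list, deliberately incomplete,
# in the spirit of chained .replace calls.
MANUAL = {
    "&amp;": "&",
    "&gt;": ">",
    "&lt;": "<",
    "&frac12;": "½",
}


def decode_manually(text: str) -> str:
    """Apply each replacement in turn, as a chain of .replace calls would."""
    for entity, char in MANUAL.items():
        text = text.replace(entity, char)
    return text


sample = "&frac14; + &frac12; = &frac34;"

print(decode_manually(sample))  # &frac14; and &frac34; are left untouched
print(html.unescape(sample))    # all three entities are decoded

# Chained replacements can also decode twice: "&amp;gt;" means the
# literal text "&gt;", but the manual chain turns it into ">".
print(decode_manually("&amp;gt;"))
print(html.unescape("&amp;gt;"))
```

The second case is easy to miss: a library decoder makes a single pass over the text, whereas a chain of replacements re-scans output that has already been decoded.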

Contribucious (contribucious) wrote :

…as tweet parsing is too important in a Twitter client (it's the content itself we're talking about here, and therefore the foundation!) to be done with just some manual replacing, in any case. IMHO. ;)

Contribucious (contribucious) wrote :

Mmm… I see a nice r246 (apparently about using a Purple library), eheh!
Will try that soon! ;-)

By the way, note that I often update my reports (adding new information, making already posted information more precise, etc.). So don't hesitate to check reports again, in case you're not notified when they have just been updated (which is probably the case, as notifying could send too many mails).
