UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 0: unexpected end of data

Bug #605543 reported by Captain Chaos
This bug affects 7 people
Affects            Status       Importance   Assigned to   Milestone
Gwibber            Incomplete   Undecided    Unassigned
gwibber (Ubuntu)   Expired      Medium       Unassigned

Bug Description

Binary package hint: gwibber

According to bug 530195 the problems in Gwibber should have been fixed in 2.30.1, but no such luck. Both the Facebook and Twitter support are broken, rendering Gwibber useless for me.

I just upgraded from Karmic to Lucid. I now have Gwibber 2.30.1 running on my system, and things have gone from bad to worse. Whereas previously it was just Facebook that was working only sporadically, Twitter has now stopped working for me as well.

I had a Flickr account, a Facebook account and a Twitter account configured in Gwibber, yet the Message screen remains completely empty. No Twitter or Facebook messages are displayed, even after manually selecting Refresh. On the Home screen the only things that are displayed are Flickr photos, and Twitter messages that are @directed at me.

I've attached a log file of a gwibber -d run. As you can see, it complains that the com.Gwibber.Accounts and com.Gwibber.Streams services are not provided by any service files. That sounds bad.

I have already tried deleting every Gwibber-related settings directory and cache directory I could find, but none of it made any difference. I also tried removing and re-adding my accounts (during which I discovered that it is actually impossible to add a Facebook account anymore! I'll file a separate bug for that), but that made no difference either.

Captain Chaos (launchpad-chaos) wrote :
Omer Akram (om26er) wrote :

start gwibber-service from terminal and paste the logs
$ sudo pkill gwibber
$ gwibber-service -do

Changed in gwibber (Ubuntu):
status: New → Incomplete
Captain Chaos (launchpad-chaos) wrote :

Wow, thanks for the quick response! Here is the output of that command:

Xlib: extension "RANDR" missing on display ":0.0".
Updating...
Gwibber Dispatcher: DEBUG Setting up monitors
Gwibber Dispatcher: DEBUG Monitors are up
Gwibber Dispatcher: INFO Gwibber Service is reloading account credentials
Gwibber Dispatcher: DEBUG Refresh interval is set to 5
Gwibber Dispatcher: DEBUG ** Starting Refresh - Wed Jul 14 21:18:58 2010 **
Gwibber Dispatcher: DEBUG <twitter:private> Performing operation
Gwibber Dispatcher: DEBUG <twitter:responses> Performing operation
Gwibber Dispatcher: DEBUG <flickr:images> Performing operation
Gwibber Dispatcher: DEBUG <twitter:receive> Performing operation
Gwibber Dispatcher: DEBUG <flickr:images> Finished operation
Gwibber Dispatcher: DEBUG <twitter:responses> Finished operation
Gwibber Dispatcher: DEBUG <twitter:private> Finished operation
Gwibber Dispatcher: ERROR <twitter:receive> Operation failed
Gwibber Dispatcher: DEBUG Traceback:
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/dispatcher.py", line 75, in perform_operation
    message_data = PROTOCOLS[account["protocol"]].Client(account)(opname, **args)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 144, in __call__
    return getattr(self, opname)(**args)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 147, in receive
    return self._get("statuses/home_timeline.json", count=count, since_id=since)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 135, in _get
    if parse: return [getattr(self, "_%s" % parse)(m) for m in data]
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 78, in _message
    m = self._common(data)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 64, in _common
    m["text"] = unescape(data["text"])
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 51, in unescape
    p.feed(s)
  File "/usr/lib/python2.6/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/usr/lib/python2.6/sgmllib.py", line 193, in goahead
    self.handle_entityref(name)
  File "/usr/lib/python2.6/sgmllib.py", line 436, in handle_entityref
    self.handle_data(replacement)
  File "/usr/lib/python2.6/htmllib.py", line 65, in handle_data
    self.savedata = self.savedata + data
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 0: unexpected end of data

Gwibber Dispatcher: INFO Loading complete: 1 - ['Success', 'Failure', 'Success', 'Success']

Note that I don't have a Facebook account configured any more, due to the problem with the Add button not being displayed after authenticating with Facebook.

Changed in gwibber (Ubuntu):
status: Incomplete → New
Omer Akram (om26er) wrote :

definitely not a different problem from bug 530195

Omer Akram (om26er) wrote :

sorry I meant not a duplicate of 530195. the logs are quite different

Omer Akram (om26er)
summary: - Twitter and Facebook still not working in 2.30.1
+ Twitter still not working in 2.30.1
summary: - Twitter still not working in 2.30.1
+ UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 0:
+ unexpected end of data
Omer Akram (om26er)
Changed in gwibber (Ubuntu):
importance: Undecided → Medium
status: New → Confirmed
Vitali Kulikou (sabotatore) wrote :

This bug affects me:

$ gwibber-service -d -o
Updating...
Gwibber Dispatcher: DEBUG Setting up monitors
Gwibber Dispatcher: DEBUG Monitors are up
Gwibber Dispatcher: INFO Gwibber Service is reloading account credentials
Gwibber Dispatcher: DEBUG Refresh interval is set to 5
Gwibber Dispatcher: DEBUG Raising gwibber client
Gwibber Dispatcher: INFO Gwibber Service is reloading account credentials
Gwibber Dispatcher: DEBUG Account changed: twitter-sabotatore
Gwibber Dispatcher: DEBUG ** Starting Single Operation **
Gwibber Dispatcher: DEBUG <twitter:receive> Performing operation
Gwibber Dispatcher: DEBUG <twitter:responses> Performing operation
Gwibber Dispatcher: INFO Gwibber Service is reloading account credentials
Gwibber Dispatcher: DEBUG Refresh interval is set to 5
Gwibber Dispatcher: DEBUG ** Starting Refresh - Thu Jul 15 09:19:24 2010 **
Gwibber Dispatcher: DEBUG <twitter:receive> Performing operation
Gwibber Dispatcher: DEBUG <twitter:responses> Performing operation
Gwibber Dispatcher: DEBUG <twitter:responses> Adding record
Gwibber Dispatcher: DEBUG <twitter:responses> Adding record
Gwibber Dispatcher: DEBUG Checking message 18019169712 timestamp (2010-07-08 11:02:02.00) to see if it is newer than 2010-07-15 09:09:25.24
Gwibber Dispatcher: DEBUG <twitter:responses> Adding record
Gwibber Dispatcher: DEBUG Checking message 17941268076 timestamp (2010-07-07 14:18:40.00) to see if it is newer than 2010-07-15 09:09:25.28
Gwibber Dispatcher: DEBUG <twitter:responses> Adding record
Gwibber Dispatcher: DEBUG Checking message 17940849118 timestamp (2010-07-07 14:09:26.00) to see if it is newer than 2010-07-15 09:09:25.33
Gwibber Dispatcher: DEBUG <twitter:responses> Adding record
Gwibber Dispatcher: DEBUG Checking message 16676099166 timestamp (2010-06-21 10:07:42.00) to see if it is newer than 2010-07-15 09:09:25.37
Gwibber Dispatcher: DEBUG <twitter:responses> Adding record
Gwibber Dispatcher: DEBUG Checking message 14856168773 timestamp (2010-05-27 23:02:26.00) to see if it is newer than 2010-07-15 09:09:25.42
Gwibber Dispatcher: DEBUG <twitter:responses> Adding record
Gwibber Dispatcher: DEBUG Checking message 10117280443 timestamp (2010-03-07 13:53:54.00) to see if it is newer than 2010-07-15 09:09:25.47
Gwibber Dispatcher: DEBUG <twitter:responses> Adding record
Gwibber Dispatcher: DEBUG Checking message 9304922662 timestamp (2010-02-19 00:40:08.00) to see if it is newer than 2010-07-15 09:09:25.51
Gwibber Dispatcher: DEBUG <twitter:responses> Adding record
Gwibber Dispatcher: DEBUG Checking message 2689903104 timestamp (2009-07-17 18:28:26.00) to see if it is newer than 2010-07-15 09:09:25.56
Gwibber Dispatcher: DEBUG <twitter:responses> Finished operation
Gwibber Dispatcher: DEBUG Checking message 2131006635 timestamp (2009-06-12 16:20:29.00) to see if it is newer than 2010-07-15 09:09:25.61
Gwibber Dispatcher: DEBUG <twitter:private> Performing operation
Gwibber Dispatcher: DEBUG Gwibber Client raised
Gwibber Dispatcher: DEBUG <twitter:private> Finished operation
Gwibber Dispatcher: DEBUG <twitter:responses> Finished operation
Gwibber Dispatcher: DEBUG <twitter:private> Performing operation
Gwibber Dispatcher: ERROR <twitter:receive> O...


Captain Chaos (launchpad-chaos) wrote :

Anything happening on this? It's very frustrating. I went from Karmic, where Twitter worked but Facebook didn't, to Lucid, and now Facebook works but Twitter doesn't! It's amazing that a program which is a centrepiece of the Ubuntu GUI can be so buggy for so long...

Please let me know if I can help in any way! Would a network trace be helpful? Just tell me which server and port to capture traffic to and I will do so.

Captain Chaos (launchpad-chaos) wrote :

Twitter started working in Gwibber a few days ago, but now it has stopped again.

I noticed that one of the tweets currently on my twitter.com page has the word Catalonië in it. Could it be that Gwibber can't handle accented characters? And that it only works whenever your Twitter feed happens to have no accented characters anywhere?

Omer Akram (om26er) wrote : Re: [Bug 605543] Re: UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 0: unexpected end of data

no 'Catalonië' just went fine for me

Foppe Hemminga (foppe) wrote :

# NOS Teletekst Teletekst

Israëliër in VN-panel aanval Gaza http://nos.nl/l/175947/t 4:36 PM Aug 2nd via API

This is the first post _not_ being shown in GWibber. The 'ë' might be a coincidence though.

UnicodeDecodeError: 'utf8' codec can't decode byte 0xeb in position 0: unexpected end of data

Captain Chaos (launchpad-chaos) wrote :

Aha, a clue!

The error message is "'utf8' codec can't decode byte 0xeb", and the message contained a ë. Byte 0xeb in the *ISO-8859-1* character encoding is ë!

This means that it is very likely that the message is not encoded with UTF-8 at all, but with ISO-8859-1, and Gwibber for some reason tries to decode the message using the wrong character encoding!

In my own error message posted above the byte it can't decode using UTF-8 is 0xe1, which is á in ISO 8859-1. I think it's likely we have found the bug.
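
The observation above can be checked in a few lines. This is a minimal sketch in present-day Python 3 rather than the Python 2.6 discussed in this thread; the helper name `try_decode` is made up for illustration.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or the decoder's complaint if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason

# The two bytes named in the error messages above.
checks = {}
for byte in (b"\xe1", b"\xeb"):
    checks[byte] = {
        "utf-8": try_decode(byte, "utf-8"),      # a lone byte is not valid UTF-8
        "latin-1": try_decode(byte, "latin-1"),  # every single byte is valid latin-1
    }

for byte, result in checks.items():
    print(byte, result)
```

In UTF-8, bytes in the range 0xE0 to 0xEF announce a three-byte sequence, which is why a lone 0xe1 or 0xeb is rejected as truncated input rather than as an unknown character.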

Captain Chaos (launchpad-chaos) wrote :

The plot thickens. I've been looking at the code, and this happens while decoding an HTML entity tag (such as &aacute;). Apparently the problem is not that the contents of the tweet are being decoded with the wrong character encoding, but that the tweet contains entity tags, and Python's htmllib is failing at converting those.

The line that fails is:

self.savedata = self.savedata + data

Where savedata is the content of the tweet so far, and data is the character that corresponds to the entity tag, for instance an ë. I wonder why Python feels the need to perform a UTF-8 conversion to perform that concatenation. Any Python experts care to comment?

Benjamin-Timm Broich (b-broich) wrote :

Maybe this helps. I get a similar error:

Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/dispatcher.py", line 71, in perform_operation
    message_data = PROTOCOLS[account["service"]].Client(account)(opname, **args)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 177, in __call__
    return getattr(self, opname)(**args)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 180, in receive
    return self._get("statuses/home_timeline.json", count=count, since_id=since)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 168, in _get
    if parse: return [getattr(self, "_%s" % parse)(m) for m in data]
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 93, in _message
    m = self._common(data)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 67, in _common
    m["text"] = unescape(data["text"])
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 54, in unescape
    p.feed(s)
  File "/usr/lib/python2.6/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/usr/lib/python2.6/sgmllib.py", line 193, in goahead
    self.handle_entityref(name)
  File "/usr/lib/python2.6/sgmllib.py", line 436, in handle_entityref
    self.handle_data(replacement)
  File "/usr/lib/python2.6/htmllib.py", line 65, in handle_data
    self.savedata = self.savedata + data
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe2 in position 0: unexpected end of data

Omer Akram (om26er) wrote :

would any gwibber developer have a look at this please.

Changed in gwibber:
assignee: nobody → Gwibber Team (gwibber-team)
status: New → Confirmed
Captain Chaos (launchpad-chaos) wrote :

Victory!

I managed to get Twitter working in Gwibber by changing the line 436 in /usr/lib/python2.6/sgmllib.py from:

            self.handle_data(replacement)

to:

            self.handle_data(unicode(replacement,'latin-1'))

I also deleted /usr/lib/python2.6/sgmllib.pyc. This is my first time ever doing something in Python, so I have no idea what the proper way of doing this would be, or whether this correctly fixes the bug. Hopefully a Python expert can pick it up from here.
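
For readers unfamiliar with the idiom, here is a rough sketch of what that one-line change does, written in Python 3 where `str(data, encoding)` plays the role of Python 2's `unicode(data, encoding)`. The helper `to_text` is hypothetical and is not part of sgmllib or Gwibber.

```python
def to_text(replacement, encoding="latin-1"):
    """Return replacement as text, decoding it first if it is an 8-bit string."""
    if isinstance(replacement, bytes):
        return str(replacement, encoding)
    return replacement

# Text collected so far, plus two latin-1 encoded entity replacements.
saved = "Isra"
saved = saved + to_text(b"\xeb") + "li" + to_text(b"\xeb") + "r"
print(saved)
```

Because the replacement is decoded explicitly, the concatenation no longer depends on whatever default encoding the interpreter happens to use.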

Foppe Hemminga (foppe) wrote :

Hi Capt Chaos,

Your idea is fine but you are editing python files where you should edit gwibber files ;)

in /usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py change line 64

   m["text"] = unescape(data["text"])

to

   m["text"] = unescape(data["text"].encode("latin1"))

Captain Chaos (launchpad-chaos) wrote :

I tried it, but as I already feared it led to:

Gwibber Dispatcher: DEBUG Traceback:
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/dispatcher.py", line 75, in perform_operation
    message_data = PROTOCOLS[account["protocol"]].Client(account)(opname, **args)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 144, in __call__
    return getattr(self, opname)(**args)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 147, in receive
    return self._get("statuses/home_timeline.json", count=count, since_id=since)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 135, in _get
    if parse: return [getattr(self, "_%s" % parse)(m) for m in data]
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 78, in _message
    m = self._common(data)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 64, in _common
    m["text"] = unescape(data["text"].encode("latin1"))
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-21: ordinal not in range(256)

That modification would make Gwibber unable to support any characters not in latin-1 (such as Chinese, Cyrillic, Arabic, etc...). I really think the problem is with htmllib.py and sgmllib.py, which are unable to handle a mix of unicode and 8-bit strings. Or possibly the problem is that the default encoding on Lucid seems to be utf-8 (although I can't find why), which sgmllib.py doesn't seem to be able to handle.
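
The limitation is easy to reproduce. A small sketch in Python 3, with sample strings chosen purely for illustration and a made-up helper name `fits_latin1`:

```python
def fits_latin1(text):
    """True if every character of text can be encoded as latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

samples = ["Catalonië", "Israëliër", "日本語", "Привет"]
report = {sample: fits_latin1(sample) for sample in samples}

for sample, ok in report.items():
    print(sample, "->", "encodable" if ok else "not encodable as latin-1")
```

Only code points below 256 exist in latin-1, so Japanese or Cyrillic text raises the same "ordinal not in range(256)" error shown in the traceback above.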

Foppe Hemminga (foppe) wrote :

The 'latin1' part in my proposed code (/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py line 64) should be 'utf8'

   m["text"] = unescape(data["text"].encode("utf8"))

The rationale is as follows: htmllib.HTMLParser from function unescape in /usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py line 48 assumes unicode strings and won't guess character encoding if they're not.
The Twitter API supports UTF-8 [1]. So if the text strings aren't manipulated along the way they still are in UTF-8.

[1] http://apiwiki.twitter.com/Things-Every-Developer-Should-Know#7Encodingaffectsstatuscharactercount

Omer Akram (om26er) wrote :

Captain Chaos, can you try Foppe's suggestion to see if that works. it would be better to get it fixed on gwibber's side ;)

Ryan Paul (segphault)
Changed in gwibber:
milestone: none → 3.0
Benjamin-Timm Broich (b-broich) wrote :

The solution from Foppe Hemminga worked fine for me!

Captain Chaos (launchpad-chaos) wrote :

I tried Foppe's suggestion, but now I get this error:

Gwibber Dispatcher: ERROR <twitter:receive> Operation failed
Gwibber Dispatcher: DEBUG Checking message 20614540018 timestamp (2010-08-08 10:16:04.00) to see if it is newer than 2010-08-08 13:11:59.38
Gwibber Dispatcher: DEBUG Traceback:
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/dispatcher.py", line 88, in perform_operation
    m["rtl"] = util.isRTL(re.sub(text_cleaner, "", m["text"].decode('utf-8')))
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 32-34: invalid data

I think what's happening is that unescape() is now parsing a UTF-8 encoded 8-bit string, but the entity replacements it's appending to the result are actually latin-1 encoded. So the result is a mix of UTF-8 and latin-1 encoded characters.
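
A sketch of that suspected mix, in Python 3 with made-up sample text: the surrounding tweet text is UTF-8 encoded, while the entity replacement is a single latin-1 byte.

```python
text_part = "Israël 日本".encode("utf-8")   # tweet text as UTF-8 bytes
entity_part = "ë".encode("latin-1")          # b"\xeb", as an entity table might supply it
mixed = text_part + entity_part + b"!"

# Decoding the whole buffer as UTF-8 fails because of the stray latin-1 byte.
try:
    mixed.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# Decoding each piece with its own encoding recovers the intended text.
repaired = text_part.decode("utf-8") + entity_part.decode("latin-1") + "!"
print(decoded_ok, repaired)
```

Once bytes in two encodings have been joined, no single codec can decode the result, which matches the "invalid data" failure in dispatcher.py above.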

Foppe Hemminga (foppe) wrote :

Thanks Captain Chaos,

I don't have time to look into this right now. So a couple of questions and remarks.
What languages are the tweets you receive?
You can print the raw data with the print statement:
  print data["text"]
or
  print s
in function unescape (s)
and view the twitter stream in console if you run `gwibber-service -od` (-id?)
Before the data is parsed by htmllib.HTMLParser it must be unicode. Search Google for that, see for example [1]

[1] http://evanjones.ca/python-utf8.html

Captain Chaos (launchpad-chaos) wrote :

The tweets are mainly in Dutch, English and Japanese.

You say that "before the data is parsed by htmllib.HTMLParser it must be unicode", but your modification actually turns the string into a UTF-8-encoded 8-bit string, not unicode. What's more, a "print type(s)" in unescape() reveals that *without* the modification the type of the string passed in *is* unicode.

I'm pretty sure that what happens is this:

* unescape() invokes HTMLParser.save_bgn(), which initialises HTMLParser.savedata to an empty 8-bit string
* unescape() invokes HTMLParser.feed (inherited from SGMLParser) with a unicode string (m["text"], verified with a "print type(s)")
* the string is concatenated to SGMLParser.rawdata, which started out as an empty 8-bit string but now becomes unicode
* feed() invokes goahead()
* goahead() searches rawdata for HTML tags and invokes handle_data() (implemented by HTMLParser) for the text parts in between
* handle_data() concatenates the unicode string to savedata, which started out as an empty 8-bit string but now becomes unicode
* when goahead() encounters an entity tag, it invokes handle_entityref()
* handle_entityref() invokes convert_entityref() to convert the tag name to the corresponding character. it uses the entitydefs table for this. HTMLParser has imported entitydefs from htmlentitydefs.py. It contains each entity tag's corresponding character as an 8-bit string in the latin-1 encoding, or a character reference if the character is not contained in the latin-1 encoding
* handle_entityref() then invokes handle_data() to append the character referenced by the entity tag. it passes in the latin-1 encoded 8-bit string it got from convert_entityref()
* handle_data() does this:

self.savedata = self.savedata + data

At this point savedata is unicode, but data is an 8-bit string. Python therefore has to convert the 8-bit string to unicode in order to be able to append it. It uses the "default encoding" for this. On my system the default encoding at this point appears to be utf8 (this is borne out by the error message). The utf8 codec tries to interpret the latin-1 encoded character as utf8 and (correctly) fails.
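
The last step of that chain can be imitated in Python 3, which no longer performs the implicit conversion itself. This is only an emulation under stated assumptions: the function `py2_style_concat` is hypothetical, and its `default_encoding` argument stands in for the interpreter-wide default encoding of Python 2.

```python
def py2_style_concat(saved, data, default_encoding):
    """Emulate Python 2's `unicode + str`: the 8-bit operand is decoded
    with the default encoding before the two values are joined."""
    if isinstance(data, bytes):
        data = data.decode(default_encoding)
    return saved + data

saved = "Isra"      # text that was fed to the parser as unicode
entity = b"\xeb"    # latin-1 encoded replacement for &euml;

outcomes = {}
for enc in ("ascii", "utf-8", "latin-1"):
    try:
        outcomes[enc] = py2_style_concat(saved, entity, enc)
    except UnicodeDecodeError:
        outcomes[enc] = "UnicodeDecodeError"

print(outcomes)
```

Under this emulation the concatenation only succeeds when the default encoding is latin-1; both ascii and utf-8 reject the byte.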

The questions that need answering at this point are:

* Why is the default encoding utf8? Could it have to do with my locale setting (which is en_US.utf8)?
* Interestingly, according to the Python documentation the regular default encoding is ascii, which would also fail, so why doesn't everyone have this problem?
* HTMLParser doesn't work correctly when: 1) the default encoding is not latin-1, 2) you offer it unicode strings and 3) the strings contain entity tags. My fix remedies this. Is this not a bug which needs fixing?
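
One way to start on the first question is to print both settings side by side. The sketch below uses only standard-library calls that exist in both Python 2 and Python 3; the variable names are arbitrary. In Python 3 the default encoding is always utf-8, so the comparison is only informative when run under the Python 2 interpreter that Gwibber uses.

```python
import locale
import sys

# Encoding used for implicit str/unicode conversions (Python 2 semantics).
default_encoding = sys.getdefaultencoding()

# Encoding suggested by the user's locale; False avoids changing the locale.
locale_encoding = locale.getpreferredencoding(False)

print("default encoding: " + default_encoding)
print("locale encoding:  " + locale_encoding)
```

These are two separate settings: the locale influences the second value, while the first stays at Python 2's documented default of ascii unless something in the running process changes it.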

I'm reverting back to my original fix. It's the only one so far which results in no error messages at all (at least for twitter). As much as I would like to, I don't have the time to learn Python and become a Gwibber developer and unicode expert to get this bug fixed, especially since I don't think the problem is actually in Gwibber itself.

It would be great if an Ubuntu & Python expert could look at my reasoning above and see if it holds water and if sgmllib.py and htmllib.py need to be fixed.

Maykel Moya (mmoyar) wrote :

@Captain Chaos

After applying Foppe's patch I got the same error you reported in #21. In #23 you said 'My fix remedies this', what fix are you referring to?

Captain Chaos (launchpad-chaos) wrote :

@Maykel

The one from comment #15.

Foppe Hemminga (foppe) wrote :

@Captain Chaos:
Thanks for your research. I think you are correct on most issues.
The most intriguing question remains:
  why doesn't everyone have this problem?

I came up with a new solution that works with my English / Dutch tweets.

1) Change line 1 to
  import network, util, HTMLParser
2) Change line 49 to
    p = HTMLParser.HTMLParser(None)

This uses another parser (the one that's used in Python3) and may be a solution to our issue.

Captain Chaos (launchpad-chaos) wrote :

@Foppe

That solution doesn't work for me. First it failed because the HTMLParser constructor doesn't take arguments, so it should be p = HTMLParser.HTMLParser(). But when I changed it to that I got the following error message:

Gwibber Dispatcher: ERROR <twitter:receive> Operation failed
Gwibber Dispatcher: DEBUG Traceback:
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/dispatcher.py", line 75, in perform_operation
    message_data = PROTOCOLS[account["protocol"]].Client(account)(opname, **args)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 146, in __call__
    return getattr(self, opname)(**args)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 149, in receive
    return self._get("statuses/home_timeline.json", count=count, since_id=since)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 137, in _get
    if parse: return [getattr(self, "_%s" % parse)(m) for m in data]
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 80, in _message
    m = self._common(data)
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 66, in _common
    m["text"] = unescape(data["text"])
  File "/usr/lib/python2.6/dist-packages/gwibber/microblog/twitter.py", line 52, in unescape
    p.save_bgn()
AttributeError: HTMLParser instance has no attribute 'save_bgn'

It appears that HTMLParser.HTMLParser is not compatible with htmllib.HTMLParser. You say it works on your system, how is that possible? Did you make more changes to twitter.py?
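
For comparison, here is a sketch of an `unescape` built on that other parser without `save_bgn()`. It is written for Python 3, where the `HTMLParser` module is named `html.parser`; the class name `TextCollector` is invented for this example and this is not Gwibber's actual code.

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collect text content; entity and character references are converted."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called with already-unescaped text between tags.
        self.parts.append(data)

def unescape(s):
    p = TextCollector()
    p.feed(s)
    p.close()   # flush any text the parser is still buffering
    return "".join(p.parts)

print(unescape("Isra&euml;li&euml;r &amp; co"))
```

The collecting has to be done by overriding `handle_data()`, because this parser offers no equivalent of `save_bgn()` and `save_end()`.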

Victor Vargas (kamus) wrote :

Captain, any news about this issue? are you still facing this behaviour?

Changed in gwibber (Ubuntu):
status: Confirmed → Incomplete
Foppe Hemminga (foppe) wrote :

I looked at this topic a couple of weeks ago. I haven't noticed the problem for some time, but there was a longer gap in which I did not use Gwibber (a temporary switch to KDE).
The issue looks related to some older version of Gwibber. Also, at the time of this bug I was running development versions, so that may be part of the issue.
This bug was real and reproducible. We were close to pinpointing its location. There were never many people affected by the bug. I don't see any need for keeping this bug open.

Captain Chaos (launchpad-chaos) wrote :

I haven't been seeing the behaviour since August last year, but that is because I fixed it by making the change in comment #15. I'll remove my fix to see whether the bug still occurs.

One reason I might have been seeing this problem and others didn't is my locale setting. It's a bit different than most people's:

LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC=nl_NL.UTF-8
LC_TIME=nl_NL.UTF-8
LC_COLLATE="en_US.utf8"
LC_MONETARY=nl_NL.UTF-8
LC_MESSAGES="en_US.utf8"
LC_PAPER=nl_NL.UTF-8
LC_NAME=nl_NL.UTF-8
LC_ADDRESS=nl_NL.UTF-8
LC_TELEPHONE=nl_NL.UTF-8
LC_MEASUREMENT=nl_NL.UTF-8
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=

That is because I don't like Dutch language translations. I find them to be clunky and incomplete, so I'd rather see things in English, but I still want to see Dutch number formats, date formats, etc... Perhaps this unusual locale setting is triggering the bug?

Like Foppe says, this was a real and reproducible bug. I think closing it now would be a bit premature. I still think it's likely that it points to a bug in the Python libraries.

Ken VanDine (ken-vandine) wrote :

I am reasonably sure this was fixed. Please re-open if you experience it again.

Changed in gwibber:
status: Confirmed → Fix Released
assignee: Gwibber Team (gwibber-team) → nobody
status: Fix Released → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for gwibber (Ubuntu) because there has been no activity for 60 days.]

Changed in gwibber (Ubuntu):
status: Incomplete → Expired