Won't start - win xp, cp1250, exception thrown

Bug #320207 reported by Karel Kolman
2
Affects Status Importance Assigned to Milestone
wxBanker
Fix Released
Medium
Michael Rooney

Bug Description

Deciding to see wxBanker work in windows, i installed python 2.6, wxPython 2.8 and downloaded wxBanker 0.4. It wouldn't start.

The exception being thrown:
  File "E:\Program Files\Python2.6\lib\encodings\cp1250.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x83 in position 13: character maps to <undefined>

I'm new to python and the string / unicode string madness is still not very clear to me, but changing the russian currency symbol assignment on line 181
from
self.LOCALECONV['currency_symbol'] = '\xd1\x80\xd1\x83\xd0\xb1'
to
self.LOCALECONV['currency_symbol'] = u'руб'
in the utf-8 encoded currencies.py fixes the problem and the app will start.

The issue can be tested in Linux like this:
$ LANG=POSIX ./wxBanker.py
(this will complain about other bytes other than 0x83 specific to cp1250 encoding).

Related branches

Revision history for this message
Karel Kolman (kolmis) wrote :

i guess it's simpler to prepend the 'u' marking unicode string in the currency symbol assignment like

self.LOCALECONV['currency_symbol'] = '\xd1\x80\xd1\x83\xd0\xb1'
into
self.LOCALECONV['currency_symbol'] = u'\xd1\x80\xd1\x83\xd0\xb1'

still, wouldn't it be better to have utf-8 chars like 'руб' in the utf-8 encoded file ? or is there a problem with this ?

Revision history for this message
Karel Kolman (kolmis) wrote :

oops, the '\xd1\x80\xd1\x83\xd0\xb1' => u'\xd1\x80\xd1\x83\xd0\xb1' substituon wouldn't throw an exception but the characters shown are completely different than meant of course, so just ignore that part :)

Revision history for this message
Michael Rooney (mrooney) wrote :

Thanks again Karel! Originally I was using the encoded strings because of problems. However I think the problems were only because I wasn't aware of how to have utf-8 python files. Now that I've discovered the "# -*- coding: utf-8 -*-" header, I think it should be fine (I am already using them in the tests in that same file).

In my initial tests I get past the original exception but end up with:
michael@NinaMyers:~/Code/wxbanker-0.4$ LANG=POSIX ./wxbanker.py
Traceback (most recent call last):
  File "./wxbanker.py", line 211, in <module>
    main()
  File "./wxbanker.py", line 194, in main
    frame = BankerFrame()
  File "./wxbanker.py", line 56, in __init__
    self.managePanel = managetab.ManagePanel(notebook)
  File "/home/michael/Code/wxbanker-0.4/managetab.py", line 40, in __init__
    self.accountCtrl = accountCtrl = AccountListCtrl(leftPanel)
  File "/home/michael/Code/wxbanker-0.4/bankcontrols.py", line 222, in __init__
    self._PutAccount(accountName)
  File "/home/michael/Code/wxbanker-0.4/bankcontrols.py", line 345, in _PutAccount
    self._InsertItem(index, accountName)
  File "/home/michael/Code/wxbanker-0.4/bankcontrols.py", line 360, in _InsertItem
    totalText = wx.StaticText(self, label=Bank().float2str(balance))
  File "/usr/lib/python2.5/site-packages/wx-2.8-gtk2-unicode/wx/_controls.py", line 1136, in __init__
    _controls_.StaticText_swiginit(self,_controls_.new_StaticText(*args, **kwargs))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1 in position 7: ordinal not in range(128)

I will have to do some more investigation.

Changed in wxbanker:
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Karel Kolman (kolmis) wrote :

i created an account and some transactions and the app failed with an exception similar to that of yours, with the float2str causing problems, here's what i found out:

currencies.py, line 106

  s = locale.format('%%.%if' % digits, abs(val), grouping, monetary=True)

on my system cs_CZ.UTF-8 it returns "1 500", but the problem is that the whitespace there isn't the single \x20 byte but a two-byte sequence \xc2\xa0 !

now at line
 s = s + (separated and ' ' or '') + smb
the code tries to concatenate s (which is an ill formed ascii string) with the unicode smb part and the ascii string is to be converted into unicode as far as i understand - it obviously fails,

seems like a python bug to me, the locale.format should return a unicode string where appropriate (if not all the time).

so another workaround is to be found if i am not mistaken :)
on way to go would be

localeenc = 'utf-8'
s = unicode(locale.format('%%.%if' % digits, abs(val), grouping, monetary=True), localeenc)

but what happens when locale.format gets fixed in python (if my understanding of it being broken is correct) ?

Revision history for this message
Karel Kolman (kolmis) wrote :

python issue 1995
 http://bugs.python.org/issue1995
fixed in 3.0 branch

Revision history for this message
Karel Kolman (kolmis) wrote :

Still couldn't start the app in a czech win xp installation, the russian currency characters prevented the app from starting.

I'm attaching a patch that solves this, what it does is:
1. assigns all the currency symbols as unicode strings (this is needed, the russian currency prevents the app from starting in codepage 1250 environemnt, the other currencies can be assigned without the u string prefix ('£' instead of u'£') but the characters appearing - e.g. in the currencies menu - are all wrong)
2. at startup float2str method is tested in all the assigned currencies, if an exception is thrown, a workaround step is turned on
3. locale.format() output in float2str is converted into a unicode string taking notice of the current locale's encoding

The patch probably breaks some tests.

One more thing, the currency doctests are locale dependent, running them on my linux box (unaltered 0.5 branch) many fail when locale is not English.

Revision history for this message
Michael Rooney (mrooney) wrote :

Thanks for the patch, I'll take a look at this. I definitely need more robust unit testing here. I'll work on something that sets the locale a few different times and tests multiple locales.

Changed in wxbanker:
assignee: nobody → mrooney
status: Confirmed → In Progress
Revision history for this message
Karel Kolman (kolmis) wrote :

Michael, i see that you altered the patch making the _locale_encoding an attribute of BaseCurrency. The problem is you probably forgot to change the _locale_encoding module attribute assignment when the UnicodeDecodeError is encountered, which is why instances of BaseCurrency have _locale_encoding always set to None.

So you either want to move the try/catch UnicodeDecodeError block with _locale_encoding initialization into the BaseCurrency constructor or keep it as a module attribute as in the submitted patch.

I am in favor of the latter because the error cause is connected to the locale.format() calls using the locale specific thousand separator chars ('thousands_sep', 'mon_thousands_sep') and the workaround should be turn on either for all the currencies or for none of them.

One more thing, the self.LOCALECONV['mon_thousands_sep'] and self.LOCALECONV['thousands_sep'] are not being used and if you don't intend to alter other functionality from Lib/locale.py to use these, i'd say they ought to be removed.

Revision history for this message
Karel Kolman (kolmis) wrote :

i'm attaching a patch to the unittests.py file.

1] set the locale to en_US.UTF8 for testCurrencyDisplay()
2] test the _locale_encoding workaround for ru_RU.UTF8

Revision history for this message
Karel Kolman (kolmis) wrote :

i'm linking to my 0.5 branch, i'll try to commit patches there if i have any in the future, since it's probably more comfortable to work with branches than with patches.

Revision history for this message
Michael Rooney (mrooney) wrote :

Thanks Karel, branches are easier indeed! I reverted to using _locale_encoding as a module attribute and merged in your branch, but the test still fails:

ERROR: test locale.format() thousand separator workaround
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/michael/Code/wxbanker-0.5/unittests.py", line 44, in testLocaleFormatWorkaround
    curr().float2str(1000)
  File "/home/michael/Code/wxbanker-0.5/currencies.py", line 117, in float2str
    s = s + (separated and ' ' or '') + smb
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 2: ordinal not in range(128)

I've pushed up the changes, does it work for you?

Revision history for this message
Karel Kolman (kolmis) wrote :

i cannot judge. tha latest 0.5 update fails me with "The wx.App object must be created first!" exception when starting wxBannker or trying to runn unit tests.

Revision history for this message
Karel Kolman (kolmis) wrote :

when i run the unittests for the currency module only (using your 0.5 branch), no tests fail

(using cs_CZ.UTF8)

Revision history for this message
Michael Rooney (mrooney) wrote :

Strange, I can't duplicate that wx.App error. I tried a fresh checkout of my 0.5 branch, and also one of yours with mine merged in, and in either case it runs fine, although the test still fails with the traceback I mentioned. I also tried with LC_ALL=cs_CZ.utf8 just in case but can still run it just fine.

However you are right, with cs_CZ the tests pass, but not with en_US. Interesting.

Revision history for this message
Karel Kolman (kolmis) wrote :

i forgot to write that the wx.App error occurs when trying to start wxBanker, not when running unittests. the traceback is (this one from windows, happens in linux too, wxPython 2.8.9.1 in both systems)

Traceback (most recent call last):
  File "c:\Projects\wxbanker\0.5\wxbanker.py", line 157, in <module>
    main()
  File "c:\Projects\wxbanker\0.5\wxbanker.py", line 145, in main
    app = init()
  File "c:\Projects\wxbanker\0.5\wxbanker.py", line 118, in init
    bankController = Controller(path)
  File "c:\Projects\wxbanker\0.5\controller.py", line 240, in __init__
    self.InitConfig()
  File "c:\Projects\wxbanker\0.5\controller.py", line 250, in InitConfig
    config = wx.Config.Get()
  File "E:\Program Files\Python2.6\lib\site-packages\wx-2.8-msw-unicode\wx\_misc
.py", line 2918, in Get
    return _misc_.ConfigBase_Get(*args, **kwargs)
wx._core.PyNoAppError: The wx.App object must be created first!

i'll look into the tests failing with en_US.

Revision history for this message
Karel Kolman (kolmis) wrote :

the problem was that the workaround is activated upon module import. thus after changing the locale in the unittest, the workaround flag stays unaltered. a module reload statement fixes it, pushed in my 0.5 branch.

Revision history for this message
Michael Rooney (mrooney) wrote : Re: [Bug 320207] Re: Won't start - win xp, cp1250, exception thrown

Excellent catch on the workaround issue, I've merged and pushed,
adding a comment in the workaround documenting the reload issue.

Regarding the traceback, I tried with both the tests and running
wxbanker normally, also in cs_CZ, and can't reproduce it. Have you
tried on a fresh checkout?

Revision history for this message
Karel Kolman (kolmis) wrote :

fresh checkout of the 0.5 branch:

intrepid (wx 2.8.8.0) - runs fine
jaunty, windows (wx 2.8.9.1) - fails with the above stacktrace

Revision history for this message
Michael Rooney (mrooney) wrote :

Ahh yes I see. I am not sure why that worked in wxpython 2.8.8.x, now
that I look at it! Anyway, fixed and pushed.

Revision history for this message
Michael Rooney (mrooney) wrote :

Is it safe to mark this bug as fixed now? I think this is addressed.

Changed in wxbanker:
milestone: none → 0.5
Revision history for this message
Karel Kolman (kolmis) wrote :

yes i think so, works fine running czech win xp now.

Michael Rooney (mrooney)
Changed in wxbanker:
status: In Progress → Fix Committed
Revision history for this message
Michael Rooney (mrooney) wrote :

Released in 0.5!

Changed in wxbanker:
status: Fix Committed → Fix Released
Revision history for this message
Michael Rooney (mrooney) wrote :

Hi Karel, I'm revisiting this as I am trying to improve and clean up currency support in 0.6. Would a patch like:

- if _locale_encoding is not None:
- s = unicode(s, _locale_encoding)
+ if not isinstance(s, unicode):
+ s = unicode(s, locale.getlocale()[1])

be sufficient? It seems to work and pass all tests but seems too easy. Do you think this would work instead of iterating over everything at module load time and putting in the workaround only in that case?

Changed in wxbanker:
status: Fix Released → Incomplete
Revision history for this message
Karel Kolman (kolmis) wrote :

There are two
"if _locale_encoding is not None:"
conditions in the currencies.py, one in the "float2str" method, the other in "LocalizedCurrency"'s constructor. Do you intend to keep the second occurance untouched ?

Revision history for this message
Karel Kolman (kolmis) wrote :

I didn't check the trunk branch, all seems working there, the ugly hacks are gone, thumbs up :)

Revision history for this message
Michael Rooney (mrooney) wrote :

Great, thanks for the review! The only problem that exists (which I don't think is a regression from the previous method) is that when it sets the locale encoding it assumes that locale has the same encoding as the current one. As they are all utf-8 it seems, this isn't a problem, but if it becomes one, I'll have to store the encoding on the Currency object as well and use that. I don't know of any non-utf8 locales off-hand, though.

Changed in wxbanker:
status: Incomplete → Fix Released
Revision history for this message
Karel Kolman (kolmis) wrote :

Not sure I understand what the problem would be, but using a win1250 encoding (not a utf8) and switching currencies between the locale defined and wxbanker define doesn't seem to cause a problem, if that observation is of any help....

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.