update-software-center does not work properly in Turkish locales

Bug #581207 reported by M. Vefa Bicakci on 2010-05-16
This bug affects 5 people
Affects                         Importance  Assigned to
software-center (Ubuntu)        Medium      Unassigned
software-center (Ubuntu Lucid)  Medium      Unassigned

Bug Description

Binary package hint: software-center

Hello,

After trying out Ubuntu 10.04 with a Turkish locale (tr_TR.UTF-8), I was
sad to notice that when update-software-center is run by a dpkg trigger or
a manual command line interaction, it produces errors similar to the following:
(I was able to reproduce this with software-center 2.0.3.)

=== 8< ===
WARNING:root:error processing: /usr/share/app-install/desktop/cecilia.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/anjal.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/projectl.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/kde4_kate.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/kde4_kteatime.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/kadu.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/avidemux-qt.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/pydance.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/etw.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/qtpfsgui.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
WARNING:root:error processing: /usr/share/app-install/desktop/kde_kalcul.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
=== >8 ===

The end result of this is that a lot of the installed software cannot be
seen in the Ubuntu Software Center, which is a very big usability problem
for people using a Turkish locale. (After running update-software-center
in the English locale [en_US.UTF-8], one can see 67 programs in the
"Installed Software" menu. With the Turkish locale [tr_TR.UTF-8], this
number drops to 40.)

Notice in the output above that the "i" is not capitalized in the keyword
"WEIGHT_DESKTOP_GENERiCNAME". This is a very common problem
with programs which run in the Turkish locale, but that expect the
capitalization of "i" to work as in the English alphabet (i.e. in ASCII).
Unfortunately, in the Turkish alphabet the capitalization rules of the "i"s
are different:

=== 8< ===
English:
 lowercase: i
 uppercase: I

Turkish:
 Dotless "i":
  lowercase: ı (idotless)
  uppercase: I

 "i" with dot:
  lowercase: i
  uppercase: İ (Idotabove)
=== >8 ===
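
The two letter pairs above can be sketched in Python 3, whose str.upper() and str.lower() follow the language-neutral Unicode mappings and ignore the locale. The helper names turkish_upper and turkish_lower are hypothetical and exist only to illustrate the rules; they are not part of software-center.

```python
DOTLESS_I = "\u0131"         # ı
DOTTED_CAPITAL_I = "\u0130"  # İ


def turkish_upper(text):
    """Upper-case text with the Turkish pairing i -> İ and ı -> I."""
    text = text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_I, "I")
    return text.upper()


def turkish_lower(text):
    """Lower-case text with the Turkish pairing I -> ı and İ -> i."""
    text = text.replace("I", DOTLESS_I).replace(DOTTED_CAPITAL_I, "i")
    return text.lower()


print(turkish_upper("GenericName"))  # Turkish rules: the "i" becomes "İ"
print("GenericName".upper())         # English/ASCII rules: GENERICNAME
```

Note that the warnings quoted earlier show a plain lowercase "i" rather than "İ". That is consistent with a byte string being upper-cased one byte at a time: "İ" does not fit in a single byte, so the "i" is simply left alone.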

The source of this problem can be seen in the following file:

=== /usr/share/software-center/softwarecenter/db/update.py ===
191     # now add search data from the desktop file
192     for key in ["GenericName","Comment"]:
193         if not parser.has_option_desktop(key):
194             continue
195         s = parser.get_desktop(key)
196         w = globals()["WEIGHT_DESKTOP_"+key.replace(" ","").upper()]
197         term_generator.index_text_without_positions(s, w)
=== /usr/share/software-center/softwarecenter/db/update.py ===

As you can see, on line 196, the string "key", which contains ASCII data such
as "GenericName", is capitalized using the "upper" function.

Because the data to be operated on is ASCII, but the locale is a Turkish locale,
this causes problems when one tries to capitalize an "i".
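
To make the failure concrete, here is a minimal sketch that imitates the lookup on line 196. It is a simulation, not a reproduction: Python 3's str.upper() ignores the locale, so the helper simulated_tr_upper (a made-up name) mimics what the warnings suggest happens to byte strings under tr_TR.UTF-8, namely that every ASCII letter except "i" is upper-cased. The weight values are also invented; the real constants live in update.py.

```python
import string


def simulated_tr_upper(text):
    """Imitate upper() under tr_TR.UTF-8: every ASCII letter except 'i'."""
    lower = string.ascii_lowercase.replace("i", "")
    upper = string.ascii_uppercase.replace("I", "")
    return text.translate(str.maketrans(lower, upper))


# Hypothetical weights, standing in for the module-level constants.
WEIGHTS = {"WEIGHT_DESKTOP_GENERICNAME": 10, "WEIGHT_DESKTOP_COMMENT": 1}


def lookup_weight(key, upper=str.upper):
    """Build the constant name the way line 196 does and look it up."""
    return WEIGHTS["WEIGHT_DESKTOP_" + upper(key.replace(" ", ""))]


print(lookup_weight("GenericName"))  # normal case rules: lookup succeeds
try:
    lookup_weight("GenericName", upper=simulated_tr_upper)
except KeyError as err:
    print("KeyError:", err)  # the name contains a lowercase "i"
```

"Comment" contains no "i", which would explain why only the GenericName key shows up in the warnings.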

The attached patch fixes this issue by creating a function called ascii_upper
which uses the "string.maketrans" and "string.translate" functions and the
"string.ascii_lowercase" and "string.ascii_uppercase" constants to convert mixed
case ASCII strings to uppercase ASCII strings using English capitalization rules.
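
The attached patch itself is not reproduced in this report. A minimal sketch of the approach it describes, ported to Python 3 (where the translation table is built with str.maketrans instead of string.maketrans), might look like this:

```python
import string

# Translation table mapping a-z to A-Z and nothing else.
_ASCII_UPPER_TABLE = str.maketrans(string.ascii_lowercase,
                                   string.ascii_uppercase)


def ascii_upper(key):
    """Upper-case ASCII letters only, regardless of the current locale."""
    return key.translate(_ASCII_UPPER_TABLE)


print(ascii_upper("GenericName"))  # GENERICNAME
```

Because the table only covers the 26 ASCII letters, any other character passes through unchanged, which is the desired behaviour for keys that are defined to be ASCII.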

Dear Ubuntu Software Center maintainer: Because this is a high profile bug that
affects everyone running in a Turkish locale, and because its fix is relatively
simple, is there any way we can get this bug fixed in Ubuntu 10.04 LTS? Because
this is an LTS release, I really don't want Turkish users to have a suboptimal
experience during the 3 years this release is supported.

I realize that I should have tested an alpha or beta version of Ubuntu 10.04 LTS
and reported this bug earlier. I apologize for not doing this.

Regards,

M. Vefa Bicakci

TEST CASE:
1. install the Turkish locale:
  $ sudo apt-get install language-pack-gnome-tr-base
2. from a terminal run the following command and note any error message:
  $ sudo LC_ALL=tr_TR.UTF8 update-software-center
3. run software-center
4. Select "installed software" and note the number of items on the status bar
5. from a terminal run the following command:
  $ sudo LC_ALL=C update-software-center
6. run software-center
7. Select "installed software" and note the number of items on the status bar
8. Compare the number of items in steps 4 and 7

VERIFICATION FAILED:
1. install software-center from -updates
2. The following messages are printed on the terminal output:
WARNING:root:error processing: /usr/share/app-install/desktop/qtpfsgui.desktop 'WEIGHT_DESKTOP_GENERiCNAME'
3. The numbers of items differ between the tr_TR.utf8 and C locales

VERIFICATION SUCCEEDED:
1. install software-center from -proposed (2.0.5)
2. No error message is printed on the terminal
3. The numbers of items are the same.

M. Vefa Bicakci (mvb) wrote :
Omer Akram (om26er) on 2010-05-16
tags: added: patch
removed: center locale software turkish ubuntu
Olivier Tilloy (osomon) wrote :

I can reproduce the bug; it is indeed pretty bad for users with a Turkish locale.
I can also confirm that the attached patch fixes the issue.

Note that this is the only place in the code where str.upper() is called, but str.lower() is called in several other places, so this is potentially an issue for non-Unicode strings. This will need more investigation.

In the meantime I recommend the importance of this bug be set to high and the fix be merged as soon as possible.

Changed in software-center (Ubuntu):
status: New → Confirmed
Olivier Tilloy (osomon) wrote :

For convenience I linked a branch that contains the patch (slightly reformatted).

Olivier Tilloy (osomon) wrote :

Note: to reproduce the bug, execute the following commands in a terminal:

    locale-gen tr_TR.UTF-8
    export LANG=tr_TR.UTF-8
    sudo update-software-center

Expected result: no output.

Current result: a large number of warnings like the following:
WARNING:root:error processing: /usr/share/app-install/desktop/tryton.desktop 'WEIGHT_DESKTOP_GENERiCNAME'

Michael Vogt (mvo) wrote :

Hello! Thanks for the bugreport and the fix (and thanks to Olivier for importing it into the branch).

I merged it into the lucid branch and we can upload it once the previous update moved from -proposed to -updates (should happen relatively soon).

Cheers,
 Michael

Changed in software-center (Ubuntu):
status: Confirmed → Fix Committed
Changed in software-center (Ubuntu Lucid):
status: New → Confirmed
Changed in software-center (Ubuntu):
importance: Undecided → Medium
Changed in software-center (Ubuntu Lucid):
importance: Undecided → Medium
status: Confirmed → In Progress
M. Vefa Bicakci (mvb) wrote :

Hello again,

Dear Olivier Tilloy, thank you for spotting that I have missed the lower
function. I was just after the bug I observed, and for some reason
didn't think about the lower function. You are right, the same problem
exists with the lower function as well.

I am attaching a patch which includes an ascii_lower function in
addition to the ascii_upper function, and replaces all instances of
the lower and upper functions with their ascii_ prefixed counterparts.
(I also tried to use the formatting Olivier Tilloy preferred.)
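
The improved patch is not shown in this report either. As a rough Python 3 sketch, the ascii_lower counterpart would simply swap the two arguments of the translation table:

```python
import string

# Translation table mapping A-Z to a-z and nothing else.
_ASCII_LOWER_TABLE = str.maketrans(string.ascii_uppercase,
                                   string.ascii_lowercase)


def ascii_lower(key):
    """Lower-case ASCII letters only, regardless of the current locale."""
    return key.translate(_ASCII_LOWER_TABLE)


print(ascii_lower("X-AppInstall-Ignore"))  # x-appinstall-ignore
```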

Dear Olivier Tilloy and Michael Vogt, I would really appreciate it if you
could import the changes introduced by this improved patch into the
source code repository as well.

Regards,

M. Vefa Bicakci

Olivier Tilloy (osomon) wrote :

I looked at the code more in depth and it appears that all the calls to str.lower() in softwarecenter/db/update.py are applied on ASCII-only strings, therefore the overhead of using a locale-independent translation table is not needed.

For details, see the freedesktop specification for desktop entries: http://standards.freedesktop.org/desktop-entry-spec/latest/ar01s05.html.
The relevant keys here are "Type", "Categories" and "X-AppInstall-Ignore", all of which are non-localized strings.

M. Vefa Bicakci (mvb) wrote :

Please do not get offended, but I think you misunderstood the
problem. As I explained above in my first post, because the Turkish
language (and hence the Turkish locale) has different capitalization
rules regarding "i", even if the desktop entries are composed of
ASCII characters, we will get problems when we try to process
"i" or "I" characters. (In Turkish: "ı" <-> "I" and "i" <-> "İ".)

How?

Let's say we are processing "X-AppInstall-Ignore". If we try to use
lower() on this string in a non-Turkish locale, we would get
"x-appinstall-ignore". However, if we do this in a Turkish locale,
then we would get "x-appInstall-Ignore". Notice that the "I"
characters stay the same. This is because we can't represent
the small dot-less i in ASCII.

So, the problem isn't whether a string contains Unicode data.
The problem is that we are trying to apply Turkish capitalization
rules to a string that contains English/ASCII data. As I noted
above, the Turkish capitalization rules for "i" differ from those
of English, and because of this we are not going to get the
string we expect.

I am going to attach a small Python script to illustrate this problem.
Please run it on your system to see the effects of the Turkish locale.
As you will see, the "i" and "I" characters are not capitalized or made
lowercase properly.
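
The attached script is not included in this report. As a stand-in, the following sketch simulates the effect described above instead of switching the locale, because Python 3's str.lower() ignores the locale and cannot reproduce it directly. The helper name simulated_tr_lower is made up, and the "IDE" category is only an assumed example of a value containing a capital "I".

```python
import string


def simulated_tr_lower(text):
    """Imitate lower() under tr_TR.UTF-8: every ASCII letter except 'I'."""
    upper = string.ascii_uppercase.replace("I", "")
    lower = string.ascii_lowercase.replace("i", "")
    return text.translate(str.maketrans(upper, lower))


for sample in ("X-AppInstall-Ignore", "IDE"):
    # Left: simulated Turkish-locale result. Right: ASCII/English result.
    print(simulated_tr_lower(sample), "vs", sample.lower())
```

Any term or key built from the left-hand result would no longer match one built under an English locale.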

To my knowledge, the only work-around to this problem is to define
ASCII-only upper and lower functions and use them whenever the
data we are operating on is English.

I am sorry; I think I repeated myself a bit, but I really want to get
the message across. Please let me know if you have any questions.

Regards,

M. Vefa Bicakci

Olivier Tilloy (osomon) wrote :

Right, I had indeed overlooked the issue. Thanks for your patience and the detailed explanation.
In this context, the complete patch makes sense indeed, and there are potentially other places in the code (i.e. other source files) that need patching as well. A quick grep on the trunk reveals the following places:

$ grep -rn --exclude="*.pyc" "lower()" *
softwarecenter/view/appview.py:272: k = os.environ["SOFTWARE_CENTER_SEARCHES_SORT_MODE"].strip().lower()
softwarecenter/view/historypane.py:225: search_matches = self.searchentry.get_text().lower() in pkg.lower()
softwarecenter/view/catview.py:231: q = xapian.Query("AC"+and_elem.text.lower())
softwarecenter/view/catview.py:240: xapian.Query("XS"+and_elem.text.lower()),
softwarecenter/view/catview.py:241: xapian.Query("AE"+and_elem.text.lower()))
softwarecenter/view/catview.py:245: q = xapian.Query("AT"+and_elem.text.lower())
softwarecenter/view/catview.py:249: q = xapian.Query("AH"+and_elem.text.lower())
softwarecenter/view/catview.py:254: q1 = xapian.Query("AP"+and_elem.text.lower())
softwarecenter/view/catview.py:256: xapian.Query("XP"+and_elem.text.lower()))
softwarecenter/view/catview.py:261: s = "pkg_wildcard:%s" % and_elem.text.lower()
softwarecenter/view/catview.py:278: return xapian.Query("AC"+include.text.lower())
softwarecenter/apt/apthistory.py:36: setattr(self, k.lower(), map(string.strip, sec[k].split(",")))
softwarecenter/apt/apthistory.py:38: setattr(self, k.lower(), [])
softwarecenter/apt/apthistory.py:42: count += len(getattr(self, k.lower()))
softwarecenter/db/update.py:130: if ignore.strip().lower() == "true":
softwarecenter/db/update.py:159: doc.add_term("AC"+cat.lower())
softwarecenter/db/update.py:163: doc.add_term("AT"+type.lower())
utils/installedapps.py:29: cmp=lambda x, y: cmp(x.split(":")[0].lower(),
utils/installedapps.py:30: y.split(":")[0].lower())))
utils/query.py:22: s = search_term.lower()
utils/query.py:33: query = xapian.Query(str_to_prefix[search_prefix]+search_term.lower())

How about putting the ascii_lower and ascii_upper functions in e.g. softwarecenter/utils.py and patching the whole codebase where relevant?
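
As a complement to grep when auditing the codebase, call sites can also be located from the syntax tree, which skips matches inside comments and strings. This is only a sketch: find_case_calls is a hypothetical helper, the sample source is invented, and the 2010 codebase is Python 2, so files using Python 2-only syntax would need a matching interpreter to parse.

```python
import ast


def find_case_calls(source):
    """Return sorted (line, method) pairs for each .lower()/.upper() call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in ("lower", "upper")):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)


# Invented sample modelled on the kinds of lines listed above.
SAMPLE = '''
if ignore.strip().lower() == "true":
    pass
term = "AC" + cat.lower()
name = "WEIGHT_DESKTOP_" + key.upper()
'''

print(find_case_calls(SAMPLE))  # [(2, 'lower'), (4, 'lower'), (5, 'upper')]
```

Each hit still needs a human decision: calls on ASCII keys should move to the ascii_ helpers, while calls on user-visible text (such as a search entry) may legitimately want locale-aware behaviour.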

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package software-center - 2.1.0

---------------
software-center (2.1.0) maverick; urgency=low

  [ Matthew McGowan ]
  * merged lp:~mmcg069/software-center/backforward-redraw-fix
  * make the overlaywithpixbuf cellrenderer inherit from a text
    cellrenderer, does away with the need to have 1px column in the
    appview for accessibility reasons.
    (lp:~mmcg069/software-center/overlay-w-pixbuf-tweak)
  * add nice animation to pathbar elements
    (lp:~mmcg069/software-center/pathbar-scroll-inn)

  [ Olivier Tilloy ]
  * fix LP: #564785:
    "each row has a progress bar (which itself never contains any text)"
  * show download completion status (LP: #460888)
  * add "bottom border" effect (LP: #439621)
  * add "history" GUI that reads /var/log/apt/history.log
  * Re-claim used memory after updating an existing AppStore with a
    new one (LP: #577540)
  * Fix the database update when run with a Turkish locale
    (patch by M. Vefa Bicakci). LP: #581207
  * Make buttons activate on mouse up, fix other inconsistencies
    in list view button operation (LP: #514835)

  [ Jacob Johan Edwards ]
  * merged lp:~j-johan-edwards/software-center/smooth_search, this
    massively improves the search and stops it from flickering
    (LP: #570682)
  * merged lp:~j-johan-edwards/software-center/action_bar that provides
    the foundation for the "custom packages list" branch
  * merged lp:~j-johan-edwards/software-center/unbranded_icons to
    provide a set of unbranded icons for e.g. Debian
  * merged lp:~j-johan-edwards/software-center/custom_lists to
    implement https://wiki.ubuntu.com/SoftwareCenter#Custom%20package%20lists

  [ Ken van Dine ]
  * allow sharing apps via gwibber and apturl
    (lp:~ken-vandine/software-center/sharing)

  [ Julian Andres Klode ]
  * merged lp:~juliank/software-center/debian that include fixes and
    updates for the new python-apt 0.8 API

  [ Kiwinote ]
  * data/featured.menu.in:
    - Update featured applications list per Desktop team (LP: #548534)
    - Feature 'fretsonfire-game' rather than 'fretsonfire' (LP: #538646)
  * softwarecenter/view/app.py:
    - Set correct sensitivity of 'edit > undo,redo,cut,copy,delete,select_all'
      (LP: #439613, LP: #530194)

  [ Michael Vogt ]
  * softwarecenter/view/appview.py:
    - simplify application list buildup and improve responsiveness
  * softwarecenter/view/*pane.py:
    - fix crash when ngettext is translated without %s format
      (LP: #449053)
  * add test/Makefile and ensure all tests are run in the bzr-buildpackage
    pre-build hook
  * softwarecenter/db/database.py, softwarecenter/view/appdetailsview.py:
    - add "StoreDatabase.get_iconname()" and use it
  * softwarecenter/view/appview.py:
    - small cleanups
  * softwarecenter/view/availablepane.py:
    - add iconnames when installing custom lists
  * softwarecenter/view/pendingview.py:
    - look for "appname" and "pkgname" (in this order) when showing
      the progress information
  * update about (LP: #566571)
  * merged lp:~apulido/software-center/mago_fix (many thanks to Ara Pulido)
  * data/unbranded-software-center.desktop.in:
    - add unbranded desktop file
  * softwar...


Changed in software-center (Ubuntu):
status: Fix Committed → Fix Released
M. Vefa Bicakci (mvb) wrote :

Dear Olivier Tilloy,

I apologize for the late reply.

Yes, your suggestion is the optimal fix for this problem. We should have the
ascii_upper and ascii_lower functions in a utility file, and we should use these
functions instead of the upper and lower functions whenever we are dealing
with English data.

Regards,

M. Vefa Bicakci

Accepted software-center into lucid-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in software-center (Ubuntu Lucid):
status: In Progress → Fix Committed
tags: added: verification-needed

I have reproduced the problem with software-center 2.0.4 and have verified that the version of software-center 2.0.5 in proposed fixes the issue.

Marking as verification-done

description: updated
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package software-center - 2.0.5

---------------
software-center (2.0.5) lucid-proposed; urgency=low

  [ Olivier Tilloy ]
  * Fix the database update when run with a Turkish locale
    (patch by M. Vefa Bicakci). LP: #581207

  [ Michael Vogt ]
  * softwarecenter/view/installedpane.py:
    - do not crash if model is None (LP: #586306)

  [ Matthew McGowan ]
  * Fix draw artifacts in the back/forward buttons when the widgets
    are resized (lp:~mmcg069/software-center/backforward-redraw-fix)
    Fixes LP: #582143
 -- Michael Vogt <email address hidden> Mon, 17 May 2010 09:44:05 +0200

Changed in software-center (Ubuntu Lucid):
status: Fix Committed → Fix Released
heartsmagic (heartsmagic) wrote :

I am using the Ubuntu 10.10 Maverick Meerkat development branch and my Software Center version is 2.1.4.
I can reproduce the very same bug also here.

File "/usr/lib/python2.6/decimal.py", line 3646, in <module>
    val = globals()[globalname]
KeyError: 'ROUND_CEiLiNG'

This is the same problem as described in this bug record.

Olivier Tilloy (osomon) wrote :

This occurrence is a known issue in python itself. A cheap workaround is to explicitly `import decimal` before setting the locale, at startup, even though we don't need it directly.
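
A minimal sketch of that workaround, assuming a startup helper named init_locale (a hypothetical name, not taken from the software-center source). The traceback above suggests that decimal.py in Python 2.6 builds the names of its rounding constants when the module is first imported, so importing it while the process is still in the default "C" locale sidesteps the problem. Current Python 3 releases are not expected to need this.

```python
import decimal  # noqa: F401 - imported early on purpose, before setlocale()
import locale


def init_locale():
    """Adopt the user's locale; return its name, or None if unavailable."""
    try:
        return locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        return None


init_locale()
# decimal was fully initialised before the locale changed.
print(decimal.ROUND_CEILING)
```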

tags: added: testcase