UTF-8 / system / filesystem encoding bugs

Bug #698498 reported by Peter Clifton
This bug affects 3 people
Affects: gEDA
Status: Confirmed
Importance: High
Assigned to: Unassigned

Bug Description

There are cases where non-UTF-8 strings are passed to GTK widgets within gschem. GTK widgets accept UTF-8 only, so this makes them unhappy. Such strings may come from filenames in a system encoding which is not UTF-8.
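A minimal sketch of the conversion this implies, written in Python for brevity rather than in gschem's C. It takes raw filename bytes, accepts them if they are already valid UTF-8, otherwise decodes them from the filesystem encoding, and finally falls back to replacement characters. The function name `filename_for_display` and the `fs_encoding` parameter are hypothetical and not part of gEDA; in C, GLib's `g_filename_display_name()` performs a comparable conversion.

```python
def filename_for_display(raw: bytes, fs_encoding: str = "utf-8") -> str:
    """Return text that is always safe to hand to a UTF-8-only widget.

    Hypothetical helper for illustration; not part of gEDA.
    """
    # 1. Bytes that are already valid UTF-8 need no conversion.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # 2. Otherwise assume the name is in the filesystem encoding.
    try:
        return raw.decode(fs_encoding)
    except (UnicodeDecodeError, LookupError):
        pass
    # 3. Last resort: substitute U+FFFD for every undecodable byte,
    #    so the caller still gets a displayable string.
    return raw.decode("utf-8", errors="replace")


koi8_name = "фыва.sym".encode("koi8-r")

# With the right filesystem encoding the name round-trips.
shown = filename_for_display(koi8_name, "koi8-r")

# With the wrong one (here ASCII, as under LANG=C) the name degrades
# to replacement characters instead of reaching the widget as
# invalid UTF-8.
degraded = filename_for_display(koi8_name, "ascii")
```

Trying UTF-8 first is a heuristic: a short name in an 8-bit encoding can, by coincidence, also be valid UTF-8 and would then be shown with the wrong characters. The result is for display only and should not be used to reopen the file.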

Additionally, _() translated messages are being printed to stderr / stdout without any regard for converting them to the system encoding.

This results in garbled console output where the console / system locale isn't UTF-8.
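A rough Python illustration of the missing step, assuming the translated message is held as Unicode text and must be encoded for the console before it is written. The helper name `encode_for_console` is hypothetical; in gschem's C code, GLib's `g_locale_from_utf8()` is one call that does a comparable conversion.

```python
import locale
from typing import Optional


def encode_for_console(message: str, encoding: Optional[str] = None) -> bytes:
    """Encode a (translated) message for a console.

    Hypothetical helper for illustration; not part of gEDA.
    """
    if encoding is None:
        # Ask the C library which encoding the current locale uses.
        encoding = locale.getpreferredencoding(False)
    # errors="replace" writes "?" for characters the console encoding
    # cannot represent, instead of emitting bytes it cannot display.
    return message.encode(encoding, errors="replace")


# The same message produces different bytes for different consoles.
for_koi8_console = encode_for_console("фыва", "koi8-r")
for_ascii_console = encode_for_console("фыва", "ascii")
```

Writing the UTF-8 bytes unchanged to a KOI8-R console is what produces the garbled output; converting at the output boundary keeps the internal strings UTF-8 throughout.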

Tags: i18n sf-bugs
Peter TB Brett (peter-b) wrote :

Bumping priority of this bug, because it's a genuine issue that needs addressing with increased internationalisation in the suite.

Peter TB Brett (peter-b)
Changed in geda:
status: New → Confirmed
Gareth Edwards (gareth-uk) wrote :

Are there specific known cases, or do we need to audit the code to find them? I've found a few string operations in the code (i18n and otherwise) that I'm unhappy with; I just wonder where the best place to capture them is.

tags: added: i18n
Gareth Edwards (gareth-uk) wrote :

Split the second part into bug 1083076.

gpleda.org commit robot (gpleda-launchpad-robot) wrote :

A commit was made which affects this bug
git master commit f9140160a1ba1a2d76f97aa0371dd02093130bb2
http://git.geda-project.org/geda-gaf/commit/?id=f9140160a1ba1a2d76f97aa0371dd02093130bb2

commit f9140160a1ba1a2d76f97aa0371dd02093130bb2
Author: Peter TB Brett <email address hidden>
Commit: Peter TB Brett <email address hidden>

    libgeda: Require gEDA files to be valid UTF-8.

    Affects-bug: lp-698498

Peter TB Brett (peter-b) wrote :

One way to tackle this in general might be to use GFile objects (https://developer.gnome.org/gio/2.36/GFile.html) whenever we need to work with filenames, rather than just using raw character array filenames.

Vladimir Zhbanov (vzhbanov) wrote :

I've tested many dialogs using the 8-bit Russian encoding KOI8-R.

I installed the KOI8-R locale as the default one and rebooted.
I added a symbol with a name in the KOI8-R encoding:
  echo ~/gaf/symbols/passive/фыва.sym |recode ..koi8-r| xargs touch
Then I opened gschem (built from current git) in a KOI8-R
environment (using the special KOI8-R terminal, though I could
have done it just by setting LANG=KOI8-R):
  koi8rxterm -e gschem
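For anyone without recode installed, the test file can be approximated with the Python sketch below. It is an alternative to the commands above, not what was actually run: it creates the file in a temporary directory instead of ~/gaf/symbols/passive, and the helper name `touch_koi8r_symbol` is made up for illustration. It assumes a filesystem that accepts arbitrary bytes in names, as typical Linux filesystems do.

```python
import os
import tempfile


def touch_koi8r_symbol(directory: str, name: str = "фыва.sym") -> bytes:
    """Create an empty file whose name is stored as KOI8-R bytes."""
    raw_name = name.encode("koi8-r")
    # A bytes path stops Python from re-encoding the name itself.
    path = os.path.join(os.fsencode(directory), raw_name)
    with open(path, "wb"):
        pass
    return raw_name


with tempfile.TemporaryDirectory() as tmp:
    raw = touch_koi8r_symbol(tmp)
    # Listing with a bytes path returns the names as raw bytes.
    listed = os.listdir(os.fsencode(tmp))
    try:
        listed[0].decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        # The KOI8-R bytes are not valid UTF-8; passing such a name
        # to a UTF-8-only API unconverted is the failure under test.
        valid_utf8 = False
```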

Then I tested as many gschem dialogs as I could:

COMPSELECT
  I found the symbol in my library, but its name is shown as
  diagonally crossed rectangles. I cannot select it to add it to
  the schematic.
  When I open the dialog, gschem says:
    (gschem:13010): Pango-WARNING **: Invalid UTF-8 string passed
    to pango_layout_set_text()

PAGE-MANAGER
  The same "Pango-WARNING ..." as above, and the file name looks
  the same.

FILE OPEN, FILE SAVE AS
  They work well and display filenames correctly.

  If I launch 'LANG=C gschem' the 'File open' dialog shows the
  file name as '<four diamonds here>.sym (invalid encoding)'

OVERWRITE FILE
  When I try to overwrite the file using the "file save as"
  dialog, the 'Overwrite file?' dialog
  produces the same "Pango-WARNING ..."

CLOSE WITHOUT SAVING
  In the "close without saving" window, the header "Save the
  changes to schematic <filename> before closing?" is lost.

MULTIATTRIB
  Even if I add an attribute in KOI8-R (using a text editor), the
  multiattrib dialog adds text in UTF-8 and works well.

TEXT-ENTRY
AUTONUMBER
SINGLEATTRIB
TEXT-EDIT
SLOT-EDIT
SYMBOL TRANSLATE
  I tried to insert text, sometimes using those two encodings
  (KOI8-R and UTF-8); the dialogs work well and there are no
  errors anywhere.

LOG
  If I type ":" to enter a command, type any Cyrillic characters
  (e.g. '(фыва)') and press Enter, the log window shows "Invalid
  UTF-8 in log message. See stderr or gschem.log." and the
  backtrace goes to the terminal, whereas when I type in English
  the backtrace appears in the log window (this behaviour is the
  same in any locale, including UTF-8, so I believe this is
  another bug).

HOTKEYS
COLOR-EDIT
SNAP-SIZE
TEXT-SIZE
LINE-TYPE
PIN-TYPE-EDIT
  OK (cannot add any text here)

If I launch 'LANG=C gschem' with the above file name gschem
outputs in the log window
  ** Invalid UTF-8 in log message. See stderr or gschem.log.
and barfs to stderr (I suspect this is because of the "Loading
schematic ..." string where the name is in KOI8-R).
