Extension parameter null with accented character

Bug #1732929 reported by bka
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Inkscape | New | Undecided | Unassigned |

Bug Description

In an extension containing a string parameter, if the user enters a string that contains at least one accented character (e.g. à, é, è...), the value of the parameter received by the Python script is null.

This behaviour can also be reproduced with the default extensions shipped with Inkscape (e.g. Web>Slicer>Create a slicer rectangle..., see the attached animated GIF).

This behaviour is reproduced in 0.92.2 and 0.92.1 using either the French or the English locale (switched in the preferences panel).

bka (bkasol) wrote :
Patrick Storz (ede123) wrote :

Related to (if not a duplicate of) bug #1256694.

Note that the extension receives the string just fine (at least for all characters supported in the OS's console encoding), as confirmed with the small sample extension attached.

The problem probably is some issue with encodings in the Python script itself (I assume the ANSI-encoded parameters are interpreted as UTF-8 in the script which leads to invalid UTF-8 characters) but this needs proper investigation.
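To illustrate the suspected mismatch, here is a minimal sketch (Python 3 syntax). The helper name `reinterpret` and the choice of cp1252 as the "ANSI" code page are assumptions made for the example; nothing here is taken from Inkscape's code.

```python
def reinterpret(text, written_as="cp1252", read_as="utf-8"):
    """Encode text in one encoding, then try to read the bytes in another.

    Returns the decoded text, or None if the bytes are not valid in the
    encoding used for reading.
    """
    raw = text.encode(written_as)
    try:
        return raw.decode(read_as)
    except UnicodeDecodeError:
        return None


# "é" is the single byte 0xE9 in cp1252, which is not valid UTF-8 on its own.
print(reinterpret("caf\u00e9"))
# Pure ASCII text is identical in both encodings, so it passes through.
print(reinterpret("cafe"))
```

Plain ASCII input survives the round trip, which would fit the observation that only strings with accented characters are affected.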

Patrick Storz (ede123) wrote :

Seems not trivial to fix...

The problem is that the encoding of the command line arguments of the spawned process is somewhat hard to determine: we use Glib::spawn_async_with_pipes for that (see src/extension/implementation/script.cpp) and glib "works its magic" behind the scenes.

On Windows, for example, the command line arguments (argv) are converted to wide character strings (wchar), and this is what Python will receive.

Now Python itself works its own magic... I think Python 2 uses the received arguments without decoding, while Python 3 is more clever and I think it should convert them; see also the relevant upstream bug [1].

This means in Python 2 we still need to decode, i.e. something like the following seems to work:
    import sys
    # Python 2 only: argv entries are byte strings, so decode them explicitly.
    for i, a in enumerate(sys.argv):
        sys.argv[i] = a.decode(sys.getfilesystemencoding())

sys.getfilesystemencoding() returns 'mbcs' on Windows, which should be the "multibyte code set" which I think is Python-talk for wchar (but I'm not completely sure yet).
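A slightly more defensive variant of the same idea could look like the sketch below. The function name `decode_argv` is hypothetical; the point is only that byte strings get decoded while arguments that are already text (as under Python 3) are left alone, so the same code would not break on either interpreter.

```python
import sys


def decode_argv(argv, encoding=None):
    """Return a copy of argv in which byte-string arguments are decoded.

    Under Python 2 every argument is a byte string and gets decoded;
    under Python 3 the arguments are already text and pass through unchanged.
    """
    encoding = encoding or sys.getfilesystemencoding()
    decoded = []
    for arg in argv:
        if isinstance(arg, bytes):
            arg = arg.decode(encoding)
        decoded.append(arg)
    return decoded


# Example with an explicit code page instead of the platform default:
print(decode_argv([b"caf\xe9", "plain"], encoding="cp1252"))
```

Returning a new list instead of rewriting sys.argv in place keeps the original arguments available for debugging.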

While the above seems to fix the encoding error, I'm not yet sure if it fixes all issues (e.g. the extension mentioned in the other report succeeds, but does not render the measure text to the proper ID), so there still seem to be issues remaining.

And this was Windows... not sure about any other OS yet...

[1] https://bugs.python.org/issue2128

Patrick Storz (ede123) wrote :

As a matter of fact, the above still does not work fully... (it is still limited to the characters of the system's codepage).
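The codepage limitation can be shown with a small sketch. It assumes cp1252 as the system codepage and uses a hypothetical helper `survives_codepage`; on a real system the conversion may silently substitute characters rather than raise an error, so this only approximates what happens to the arguments.

```python
def survives_codepage(text, codepage="cp1252"):
    """Check whether text can pass through a single-byte codepage unchanged."""
    try:
        return text.encode(codepage).decode(codepage) == text
    except UnicodeEncodeError:
        # The codepage has no byte for at least one character.
        return False


print(survives_codepage("\u00e9"))  # é exists in cp1252
print(survives_codepage("\u0436"))  # Cyrillic ж does not exist in cp1252
```

Any character outside the codepage is lost before Python ever gets to decode it, so decoding with the filesystem encoding cannot recover it.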

A proper fix (for Windows) seems possible with the code from the linked upstream report
https://bugs.python.org/issue2128#msg125827

But yeah, it's ugly as hell and still does not solve the issue on other OSs... I'm a bit skeptical if this is the way to solve this issue. :-/

Nathan Lee (nathan.lee) wrote :

Replicated in Inkscape 0.92.5 (0ad1ac969f, 2020-08-06), but not in Inkscape 1.2-dev (da2df6f5a3, 2021-07-26) or Inkscape 1.0 (4035a4fb49, 2020-05-01) on Linux Mint 20 (so we can consider it fixed on Linux systems)

Also replicated in 0.92.5 on Windows 10, but fixed there in Inkscape 1.0 and Inkscape 1.1.

Tested with �, �, �, the English locale, and the Create a slicer rectangle extension.

I understand that the linked issue relates to Python 2 (while 1.1+ uses Python 3) and is marked as resolved, so I think this issue can be considered fixed.
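For anyone who wants to re-check the Python 3 behaviour outside Inkscape, a rough sketch follows. It spawns a child interpreter with Python's own subprocess module, not with Glib::spawn_async_with_pipes, so it only tests the Python side; the helper name `roundtrip_arg` is made up for this example.

```python
import ast
import subprocess
import sys

# The child prints an ASCII-only repr of its first argument, so the result
# does not depend on the encoding of the pipe between the two processes.
CHILD = "import sys; sys.stdout.write(ascii(sys.argv[1]))"


def roundtrip_arg(value):
    """Pass value as a command line argument to a child Python 3 process
    and return the string the child actually received."""
    out = subprocess.check_output([sys.executable, "-c", CHILD, value])
    return ast.literal_eval(out.decode("ascii"))


received = roundtrip_arg("\u00e0\u00e9\u00e8")  # à, é, è
print(received == "\u00e0\u00e9\u00e8")
```

If the comparison is false on some platform, the argument was altered somewhere between the parent and the child, which would be worth reporting separately.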

I'm not too sure, but thought I'd leave the results. I've closed the linked Inkscape issue https://bugs.launchpad.net/inkscape/+bug/1256694

Nathan Lee (nathan.lee) wrote :

Funny that I can't comment those characters. To clarify, I tested with the characters mentioned in the original post.
