UTF-8 is not very well supported inside snaps

Bug #1576411 reported by Bruno Nova
This bug affects 21 people
Affects         Status   Importance  Assigned to  Milestone
snapd           Triaged  Medium      Unassigned
snapd (Ubuntu)  Triaged  Medium      Unassigned

Bug Description

It seems there can be encoding problems inside snaps.

Example 1:
1. Install the "hello-world" snap.
2. Run "hello-world.sh" (a shell is started).
3. Try to write something with an accented or special character, like "Olá Mundo" or "€".
   It doesn't work.

Example 2:
1. Create a simple python script that prints an accented character, like:

#!/usr/bin/env python3
print("Olá Mundo!")

2. Package it in a snap that provides that script as an app.
3. Run that snappy app.
   It raises an exception:

Traceback (most recent call last):
  File "/snap/ola/100001/ola", line 2, in <module>
    print("Ol\xe1 Mundo!")
UnicodeEncodeError: 'ascii' codec can't encode character '\xe1' in position 2: ordinal not in range(128)
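
The same failure can be reproduced outside a snap. This is only a minimal sketch, assuming (as the traceback suggests) that the app's stdout ends up with an ASCII encoding. The `ascii_stdout` stream is a stand-in for that stdout, not something snapd provides:

```python
import io

# Stand-in for a stdout whose encoding is ASCII (assumption: this is what
# the script sees inside the snap when no UTF-8 locale is usable).
ascii_stdout = io.TextIOWrapper(io.BytesIO(), encoding="ascii")

try:
    print("Olá Mundo!", file=ascii_stdout)
    error = None
except UnicodeEncodeError as exc:
    error = exc

# The same text goes through fine when the stream is UTF-8.
utf8_buffer = io.BytesIO()
utf8_stdout = io.TextIOWrapper(utf8_buffer, encoding="utf-8")
print("Olá Mundo!", file=utf8_stdout)
utf8_stdout.flush()

print(type(error).__name__, error.encoding, error.start)
```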

Tags: xenial
Michael Vogt (mvo)
Changed in snappy:
status: New → Triaged
importance: Undecided → High
milestone: none → sru-2
Changed in snapd (Ubuntu):
status: New → Triaged
importance: Undecided → High
Zygmunt Krynicki (zyga) wrote :

Can you please attach a snap that shows this to happen?

Zygmunt Krynicki (zyga) wrote :

I can confirm the hello-world case. It is quite curious, not sure what to make of it yet.

Bruno Nova (brunonova) wrote :

Here's the snap for the 2nd example.

Oliver Grawert (ogra) wrote :

this is due to the fact that snappy does not ship any locale data except C and C.UTF-8 ... nor any fonts or keymaps ... given that we fully focus on embedded and IoT with the rootfs

perhaps the ubuntu-core-launcher should enforce C.UTF-8 for all snaps ...

Kyle Fazzari (kyrofa) wrote :

> perhaps the ubuntu-core-launcher should enforce C.UTF-8 for all snaps ...

That would enforce it on the desktop as well, where we do have locales. Would that be an issue?

Oliver Grawert (ogra) wrote :

how would a snap access the desktop locales ?

Mark Shuttleworth (sabdfl) wrote : Re: [Bug 1576411] Re: UTF-8 is not very well supported inside snaps

This is an interesting one. We definitely want to be GREAT for IoT, we
also want to become the best way to deliver a desktop app. The latter
means that locales need to be workable.

Mark

Kyle Fazzari (kyrofa) wrote :

> how would a snap access the desktop locales ?

I talked to mvo about this today. It sounds like there are two ways to deal with this: either bind-mount the desktop locales into the ubuntu-core snap, or bind-mount them into /classic in the ubuntu-core snap and redirect things accordingly. I guess both of these solutions would be done by the launcher.

Note that the same type of thing would need to be done for fonts etc.

Sebastien Bacher (seb128) wrote :

On a similar note, we (desktop) opened bug #1576282 "Snaps built from deb can't be gettext translated" with this note:

"- traditional desktop applications are built with calls to 'bindtextdomain ("domain", LOCALEDIR)', where LOCALEDIR is defined at build time and so points to /usr

there seems to be no way to redirect to another directory at runtime"

Which means the mount to /classic solution is not going to work if we want to properly support making snaps from debs, or we would need some hack to "divert" the glibc bindtextdomain() calls and mangle the arguments at runtime...
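
To illustrate the difference between a directory baked in at build time and one resolved at run time, here is a minimal Python sketch. It assumes the application code can be changed, which is exactly what a snap built from an unmodified deb cannot rely on. `runtime_localedir` and the `demo` domain are hypothetical names, and the environment dict stands in for a snap's `$SNAP` variable:

```python
import gettext

# What a deb build typically bakes in as LOCALEDIR.
BUILD_TIME_LOCALEDIR = "/usr/share/locale"


def runtime_localedir(env):
    """Prefix the build-time directory with $SNAP when running inside a snap."""
    return env.get("SNAP", "") + BUILD_TIME_LOCALEDIR


# Hypothetical environment; inside a real snap, pass os.environ instead.
snap_env = {"SNAP": "/snap/ola/100001"}

# bindtextdomain() returns the directory now bound to the domain.
bound = gettext.bindtextdomain("demo", runtime_localedir(snap_env))
print(bound)
```

Outside a snap (no `SNAP` in the environment) the helper falls back to the plain build-time path.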

Sebastien Bacher (seb128) wrote :

bug #1576303 is about font since that was mentioned as well

erio (eri0) wrote :

I asked a question on Ask Ubuntu, and a comment linked me here: http://askubuntu.com/questions/783758/self-built-snap-run-fail-on-locale-error

Oliver Grawert (ogra) wrote :

for the non bindtextdomain snaps (i.e. any binaries that just use libc's locale functions)
lines 7-21 in
http://bazaar.launchpad.net/~ogra/+junk/nethack/view/head:/nethack.sh

plus the libc-bin and locales packages in "stage-packages" are a solution if you want to ship your own locales inside the snap (this should work fine for most IoT use cases)

this is currently also used by the freeCAD snap in the store.

i assume for desktop use we actually need an interface, though. since the locale data itself will be inside the app snap while the locales command will live in the core snap or in the desktop install itself, this will likely require some extra work and some bind mounting magic to make the two come together.

Roberto Alsina (ralsina) wrote :

The workaround by Oliver doesn't work for me, apparently because the environment variable names have changed.

Here's a version that is working for me with snapcraft 2.11, snapd 2.0.8

https://github.com/getnikola/nikola/blob/provide-snap/snapcraft/nikola.sh

Kyle Fazzari (kyrofa) wrote :

Any more progress on actually fixing this issue?

Oliver Grawert (ogra) wrote :

@ralsina, yeah, that was more a "how it once worked" example :)
i have updated (and uploaded) nethack now at https://github.com/ogra1/nethack/blob/master/nethack.sh

Nish Aravamudan (nacc) wrote :

@ogra, thank you for your example from nethack! I hit the same issue with a simple python3 script that was failing due to sys.getfilesystemencoding() returning 'ascii'. Modifying my wrapper to do the following:

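# Keep generated locale data in the snap's writable per-user directory.
export LOCPATH="$SNAP_USER_DATA"

LANG=en_US
ENC=UTF-8
LOC="$LANG.$ENC"

# Compile the locale once; later runs reuse it.
if [ ! -e "$SNAP_USER_DATA/$LOC" ]; then
    localedef --prefix="$SNAP_USER_DATA" -f "$ENC" -i "$LANG" "$SNAP_USER_DATA/$LOC"
fi

export LC_ALL="$LOC"
export LANG="$LOC"
# Strip everything from the last "_" onward: en_US.UTF-8 -> en
export LANGUAGE="${LANG%_*}"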

seems to be a necessary step for my script. It seems like this is something we would want to eventually internalize to snapcraft/snapd?
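
For anyone whose launcher is written in Python rather than shell, the same logic could be sketched roughly as follows. This is an untested-in-a-snap illustration, not something snapcraft or snapd provides. `locale_wrapper_env` is a hypothetical helper name, and the directory passed in stands in for `$SNAP_USER_DATA`:

```python
import os


def locale_wrapper_env(snap_user_data, lang="en_US", enc="UTF-8"):
    """Return (env, cmd): the variables to export, plus the localedef
    command to run first if the locale has not been compiled yet."""
    loc = f"{lang}.{enc}"
    env = {
        "LOCPATH": snap_user_data,
        "LC_ALL": loc,
        "LANG": loc,
        # Mirrors ${LANG%_*}: drop everything from the last "_" onward.
        "LANGUAGE": loc.rsplit("_", 1)[0],
    }
    target = os.path.join(snap_user_data, loc)
    cmd = None
    if not os.path.exists(target):
        cmd = ["localedef", f"--prefix={snap_user_data}",
               "-f", enc, "-i", lang, target]
    return env, cmd


# Hypothetical path standing in for $SNAP_USER_DATA.
env, cmd = locale_wrapper_env("/nonexistent/snap-user-data")
print(env["LC_ALL"], env["LANGUAGE"])
```

A caller would run `cmd` (when it is not `None`) with `subprocess.run` and then start the real program with `env` merged into `os.environ`.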

Adam Stokes (adam-stokes) wrote :

I had to work around this issue as well for conjure-up by exporting LC_ALL=C.UTF-8 (https://github.com/conjure-up/conjure-up/tree/master/snapcraft).

Leo Arias (elopio) wrote :

Same problem in errbot, and same workaround to force LC_ALL=C.UTF-8:
https://github.com/elopio/errbot/blob/snapcraft/snapcraft.yaml#L14

Zygmunt Krynicki (zyga) wrote :

From a desktop POV, can we reuse the locale data available on the host somehow? If snapd on classic were to bind mount /usr/share/locale from the host into the snap space (and somehow, which is tricky, merge it with locale data from core and the app snap) would that be the desired outcome?

On core devices we can be as small as anyone wants, on classic devices we can just use what is there.

Changed in snapd:
status: New → Confirmed
Changed in snappy:
milestone: sru-2 → none
no longer affects: snappy
Oliver Grawert (ogra) wrote :

@zyga

i think you would have to call locale-gen on first start of each app (depending on the CPU power of your system that can take quite a while), not sure you can hack around this easily.

Oliver Grawert (ogra) wrote :

(alternatively snapcraft could call it and generate *all* possible locales (well, the UTF-8 variant) for the app and have it shipped in the snap)

Jamie Strandboge (jdstrand) wrote :

If one is calling locale-gen on the first app invocation, it seems plausible on install snapd could:
1. do the equivalent of 'snap run --shell <snap>.<command>'
2. run locale-gen
3. exit the shell and use 'snap-discard-ns <snap>'

John Lenton (chipaca) wrote :

If we're doing this, we should have a core config as to what locales to generate. That way on small core devices we just generate what's needed (often: nothing). On non-core we can make that unsettable and proxy to whatever the host uses for the 'get' step.

erio (eri0) wrote :

Was this ever solved?

John Lenton (chipaca) wrote :

@eri0 that depends on exactly which "this" you mean :-)

Merlijn Sebrechts (merlijn-sebrechts) wrote :

@chipaca

The issue that remains is that applications that need UTF-8 require workarounds in the snap.

This issue explains it in more detail with an example in python: https://bugs.launchpad.net/snapcraft/+bug/1804845

Basically, the issue is that stuff like $LANG gets passed into the snap, but the snap only has the locales C, C.UTF-8 and POSIX. As a result, python applications fall back to ASCII, breaking many libraries such as click.

snapcraft itself has that issue and works around it by forcing "C.UTF-8": https://github.com/snapcore/snapcraft/blob/4042556714400d2156cb89efb86bf294500d1f41/snapcraft/cli/__main__.py#L40
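
As a rough illustration of that kind of workaround (this is a sketch, not a copy of the linked snapcraft code), a launcher could keep the requested locale only when the snap actually ships it and it is a UTF-8 locale, and otherwise force "C.UTF-8". `pick_locale` and `SNAP_LOCALES` are hypothetical names; the default list reflects the three locales mentioned above:

```python
# Locales assumed to be available inside the snap (from the comment above).
SNAP_LOCALES = ("C", "C.UTF-8", "POSIX")


def pick_locale(requested, available=SNAP_LOCALES):
    """Keep `requested` only if it is available and UTF-8; else use C.UTF-8."""
    is_utf8 = requested.upper().endswith(("UTF-8", "UTF8"))
    if requested in available and is_utf8:
        return requested
    return "C.UTF-8"


def fixed_env(env):
    """Return a copy of `env` with LANG and LC_ALL set to a usable locale."""
    chosen = pick_locale(env.get("LANG", "C"))
    fixed = dict(env)
    fixed["LANG"] = chosen
    fixed["LC_ALL"] = chosen
    return fixed


# Hypothetical host environment leaking into the snap.
result = fixed_env({"LANG": "pt_PT.UTF-8", "HOME": "/home/user"})
print(result["LC_ALL"])
```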

I don't know if this is a snapd, core, documentation or snapcraft issue, but many, many people trip over it. Just look at how many forum posts contain locale errors concerning C.UTF-8: https://forum.snapcraft.io/search?q=locale%20c.utf-8%20error

Zygmunt Krynicki (zyga) wrote :

The issue is in the core snap which does not provide the required locale data, forcing applications to attempt to do it themselves, which is not easy to do.

林博仁(Buo-ren, Lin) (buo-ren-lin) wrote :

I wonder if my `locales-launch` remote part can work around this issue at the snap end:
https://forum.snapcraft.io/t/the-locales-launch-remote-part/8729

Also according to the discussion at https://forum.snapcraft.io/t/lack-of-compiled-locales-breaks-gettext-based-localisation/3758 , can't we just simply put a copy of compiled data of all reasonable locales into the core snap?

Changed in snapd:
status: Confirmed → Triaged
no longer affects: snapcraft
Changed in snapd:
importance: Undecided → Medium
Changed in snapd (Ubuntu):
importance: High → Medium