UTF-8 is not very well supported inside snaps

Bug #1576411 reported by Bruno Nova on 2016-04-28
This bug affects 9 people
Affects: snapd; Importance: Undecided; Assigned to: Unassigned
Affects: snapd (Ubuntu); Importance: High; Assigned to: Unassigned

Bug Description

It seems there can be encoding problems inside snaps.

Example 1:
1. Install the "hello-world" snap.
2. Run "hello-world.sh" (a shell is started).
3. Try to write something with an accented or special character, like "Olá Mundo" or "€".
   It doesn't work.

Example 2:
1. Create a simple python script that prints an accented character, like:

#!/usr/bin/env python3
print("Olá Mundo!")

2. Package it in a snap that provides that script as an app.
3. Run that snappy app.
   It raises an exception:

Traceback (most recent call last):
  File "/snap/ola/100001/ola", line 2, in <module>
    print("Ol\xe1 Mundo!")
UnicodeEncodeError: 'ascii' codec can't encode character '\xe1' in position 2: ordinal not in range(128)
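
The failure can probably be reproduced outside a snap. The following is a minimal sketch, assuming that Python falls back to an ASCII output encoding when no UTF-8 locale is available inside the snap; the helper name `try_encode` is hypothetical and only serves the illustration.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)


message = "Olá Mundo!"

# Assumption: this mirrors what print() does with an ASCII stdout encoding.
ascii_result = try_encode(message, "ascii")
# With a UTF-8 encoding the same string encodes without error.
utf8_result = try_encode(message, "utf-8")

print(ascii_result)
print(utf8_result)
```

The ASCII attempt yields the same error text as the traceback, while the UTF-8 attempt succeeds, which points at the locale rather than at the script.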

Michael Vogt (mvo) on 2016-05-03
Changed in snappy:
status: New → Triaged
importance: Undecided → High
milestone: none → sru-2
Changed in snapd (Ubuntu):
status: New → Triaged
importance: Undecided → High
Zygmunt Krynicki (zyga) wrote :

Can you please attach a snap that shows this to happen?

Zygmunt Krynicki (zyga) wrote :

I can confirm the hello-world case. It is quite curious, not sure what to make of it yet.

Bruno Nova (brunonova) wrote :

Here's the snap for the 2nd example.

Oliver Grawert (ogra) wrote :

this is due to the fact that snappy does not ship any locale data except C and C.UTF-8 ... nor any fonts or keymaps ... given that we fully focus on embedded and IoT with the rootfs

perhaps the ubuntu-core-launcher should enforce C.UTF-8 for all snaps ...

Kyle Fazzari (kyrofa) wrote :

> perhaps the ubuntu-core-launcher should enforce C.UTF-8 for all snaps ...

That would enforce it on the desktop as well, where we do have locales. Would that be an issue?

Oliver Grawert (ogra) wrote :

how would a snap access the desktop locales ?

This is an interesting one. We definitely want to be GREAT for IoT, we
also want to become the best way to deliver a desktop app. The latter
means that locales need to be workable.

Mark

Kyle Fazzari (kyrofa) wrote :

> how would a snap access the desktop locales ?

I talked to mvo about this today. It sounds like there are two ways to deal with this: either bind-mount the desktop locales into the ubuntu-core snap, or bind-mount them into /classic in the ubuntu-core snap and redirect things accordingly. I guess both of these solutions would be done by the launcher.

Note that the same type of thing would need to be done for fonts etc.

Sebastien Bacher (seb128) wrote :

On a similar note, we (desktop) opened bug #1576282 "Snaps built from deb can't be gettext translated" with this note:

"- traditional desktop applications are built with calls to 'bindtextdomain ("domain", LOCALEDIR)', where LOCALEDIR is defined at build time and so points to /usr

there seems to be no way to redirect to another directory at runtime"

Which means the mount to /classic solution is not going to work if we want to properly support making snaps from debs, or we would need some hack to "divert" the glibc bindtextdomain() calls and mangle the arguments at runtime...
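
For comparison, here is a minimal sketch of what choosing the translation directory at runtime could look like, written with Python's `gettext` module rather than the C API discussed above. The domain name "ola", the helper `snap_locale_dir` and the `usr/share/locale` layout under `$SNAP` are assumptions made for illustration, not something the snaps in this bug are known to use.

```python
import gettext
import os


def snap_locale_dir(env):
    """Pick the locale directory from $SNAP, falling back to the host's /usr."""
    # Assumption: translations are shipped under $SNAP/usr/share/locale.
    return os.path.join(env.get("SNAP", "/"), "usr/share/locale")


# Hypothetical snap mount point, borrowed from the traceback in the description.
locale_dir = snap_locale_dir({"SNAP": "/snap/ola/100001"})

# bindtextdomain() accepts any directory at runtime and returns the binding.
bound = gettext.bindtextdomain("ola", locale_dir)
print(bound)
```

An application built with a compiled-in LOCALEDIR has no equivalent of the `$SNAP` lookup, which is the gap the quoted note describes.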

Sebastien Bacher (seb128) wrote :

bug #1576303 is about font since that was mentioned as well

erio (eri0) wrote :

I asked a question on Ask Ubuntu, and a comment linked me here: http://askubuntu.com/questions/783758/self-built-snap-run-fail-on-locale-error

Oliver Grawert (ogra) wrote :

for the non bindtextdomain snaps (i.e. any binaries that just use libc's locale functions)
lines 7-21 in
http://bazaar.launchpad.net/~ogra/+junk/nethack/view/head:/nethack.sh

plus the libc-bin and locales packages in "stage-packages" are a solution if you want to ship your own locales inside the snap (this should work fine for most IoT use cases)

this is currently also used by the freeCAD snap in the store.

i assume for desktop use we actually need an interface, though since the locale data itself will be inside the app snap while the locales command will live in the core snap or in the desktop install itself this will likely require some extra work and some bind mounting magic to make the two come together.

Roberto Alsina (ralsina) wrote :

The workaround by Oliver doesn't work for me, apparently because the environment variable names have changed?

Here's a version that is working for me with snapcraft 2.11, snapd 2.0.8

https://github.com/getnikola/nikola/blob/provide-snap/snapcraft/nikola.sh

Kyle Fazzari (kyrofa) wrote :

Any more progress on actually fixing this issue?

Oliver Grawert (ogra) wrote :

@ralsina, yeah, that was more a "how it once worked" example :)
i have updated (and uploaded) nethack now at https://github.com/ogra1/nethack/blob/master/nethack.sh

Nish Aravamudan (nacc) wrote :

@ogra, thank you for your example from nethack! I hit the same issue with a simple python3 script that was failing due to sys.getfilesystemencoding() returning 'ascii'. Modifying my wrapper to do the following:

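# Keep the generated locale data in the snap's per-user writable directory.
export LOCPATH="$SNAP_USER_DATA"

LANG=en_US
ENC=UTF-8
LOC="$LANG.$ENC"

# Generate the locale only once, on first run.
if [ ! -e "$SNAP_USER_DATA/$LOC" ]; then
    localedef --prefix="$SNAP_USER_DATA" -f "$ENC" -i "$LANG" "$SNAP_USER_DATA/$LOC"
fi

export LC_ALL="$LOC"
export LANG="$LOC"
export LANGUAGE="${LANG%_*}"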

seems to be a necessary step for my script. It seems like this is something we would want to eventually internalize to snapcraft/snapd?
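
If this logic were ever internalized, it might look roughly like the following sketch. It restates the wrapper above in Python, assuming the same `localedef` invocation and environment variables; the function name `locale_setup` and the example path are hypothetical, and the sketch only builds the command instead of running it.

```python
import os


def locale_setup(user_data, lang="en_US", enc="UTF-8"):
    """Return (localedef command or None, environment overrides)."""
    loc = f"{lang}.{enc}"
    target = os.path.join(user_data, loc)

    # Only generate the locale if it is not already present (first run).
    command = None
    if not os.path.exists(target):
        command = ["localedef", f"--prefix={user_data}",
                   "-f", enc, "-i", lang, target]

    env = {
        "LOCPATH": user_data,
        "LC_ALL": loc,
        "LANG": loc,
        # Same effect as ${LANG%_*}: drop everything from the last underscore.
        "LANGUAGE": loc.rsplit("_", 1)[0],
    }
    return command, env


# Hypothetical stand-in for $SNAP_USER_DATA.
command, env = locale_setup("/nonexistent/snap-user-data")
print(command)
print(env)
```

A caller would run `command` (when it is not None) and then start the real application with `env` merged into its environment.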

Adam Stokes (adam-stokes) wrote :

I had to work around this issue as well for conjure-up by exporting LC_ALL=C.UTF-8 (https://github.com/conjure-up/conjure-up/tree/master/snapcraft).

Leo Arias (elopio) wrote :

Same problem in errbot, and same workaround to force LC_ALL=C.UTF-8:
https://github.com/elopio/errbot/blob/snapcraft/snapcraft.yaml#L14
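
The LC_ALL=C.UTF-8 workaround used by both projects can be sketched as a small launcher. This is an illustration only, not how conjure-up or errbot implement it; the names `utf8_env` and `run_with_utf8` are hypothetical, and the child command is a stand-in for the real application.

```python
import os
import subprocess
import sys


def utf8_env(base_env):
    """Return a copy of the environment with LC_ALL forced to C.UTF-8."""
    env = dict(base_env)
    env["LC_ALL"] = "C.UTF-8"
    return env


def run_with_utf8(argv):
    """Run a command under the forced locale and return its raw stdout."""
    result = subprocess.run(argv, env=utf8_env(os.environ),
                            stdout=subprocess.PIPE)
    return result.stdout


if __name__ == "__main__":
    # Stand-in child: the script from Example 2, with the accent escaped so
    # the command line itself stays ASCII.
    child = [sys.executable, "-c", r"print('Ol\xe1 Mundo!')"]
    print(run_with_utf8(child))
```

Because C.UTF-8 is one of the two locales the core snap already ships, this avoids generating locale data, at the cost of ignoring the user's language settings.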

Zygmunt Krynicki (zyga) wrote :

From a desktop POV, can we reuse the locale data available on the host somehow? If snapd on classic were to bind mount /usr/share/locale from the host into the snap space (and somehow, which is tricky, merge it with locale data from core and the app snap), would that be the desired outcome?

On core devices we can be as small as anyone wants, on classic devices we can just use what is there.

Changed in snapd:
status: New → Confirmed
Changed in snappy:
milestone: sru-2 → none
no longer affects: snappy
Oliver Grawert (ogra) wrote :

@zyga

i think you would have to call locale-gen on first start of each app (depending on the CPU power of your system that can take quite a while), not sure you can hack around this easily.

Oliver Grawert (ogra) wrote :

(alternatively snapcraft could call it and generate *all* possible locales (well, the UTF-8 variant) for the app and have it shipped in the snap)

Jamie Strandboge (jdstrand) wrote :

If one is calling locale-gen on the first app invocation, it seems plausible on install snapd could:
1. do the equivalent of 'snap run --shell <snap>.<command>'
2. run locale-gen
3. exit the shell and use 'snap-discard-ns <snap>'

John Lenton (chipaca) wrote :

If we're doing this, we should have a core config as to what locales to generate. That way on small core devices we just generate what's needed (often: nothing). On non-core we can make that unsettable and proxy to whatever the host uses for the 'get' step.
