UnicodeEncodeError when creating user with non-ascii chars
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
cloud-init | Fix Released | Medium | Unassigned |
cloud-init (Ubuntu) | Fix Released | Undecided | Unassigned |
livecd-rootfs (Ubuntu) | Fix Released | Undecided | Michael Hudson-Doyle |
Bug Description
I was testing subiquity, and at the user creation prompt I typed in "André D'Silva" for the username and just "andre" for the login.
The installer finished fine, but on the first login attempt I couldn't log in. Booting into rescue mode showed me that the user had not been created.
Checking the cloud-init logs, I found the UnicodeEncodeError.
2018-02-22 12:44:01,386 - __init__.py[DEBUG]: Adding user andre
2018-02-22 12:44:01,387 - util.py[WARNING]: Failed to create user andre
2018-02-22 12:44:01,387 - util.py[DEBUG]: Failed to create user andre
Traceback (most recent call last):
File "/usr/lib/
util.
File "/usr/lib/
env=env, shell=shell)
File "/usr/lib/
restore_
File "/usr/lib/
restore_
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 4: ordinal not in range(128)
user-data contains this:
#cloud-config
hostname: sbqt
users:
- gecos: "Andr\xE9 D'Silva"
  groups: [adm, cdrom, dip, lpadmin, plugdev, sambashare, debian-tor, libvirtd, lxd,
    sudo]
  lock-passwd: false
  name: andre
  passwd: $6$UaxxahbQam4K
  shell: /bin/bash
  ssh_import_id: ['lp:ahasenack']
cloud-init is 17.2-34-
Related branches
- Steve Langasek: Approve
Diff: 35 lines (+17/-0), 2 files modified
debian/changelog (+7/-0)
live-build/auto/build (+10/-0)
- Server Team CI bot: Approve (continuous-integration)
- Scott Moser: Pending requested
Diff: 371 lines (+141/-27), 10 files modified
cloudinit/config/cc_puppet.py (+40/-14)
cloudinit/config/cc_salt_minion.py (+9/-0)
cloudinit/sources/DataSourceGCE.py (+7/-8)
cloudinit/util.py (+7/-2)
debian/changelog (+13/-0)
doc/examples/cloud-config-chef.txt (+2/-2)
tests/cloud_tests/testcases/modules/salt_minion.py (+5/-0)
tests/cloud_tests/testcases/modules/salt_minion.yaml (+5/-0)
tests/unittests/test_datasource/test_gce.py (+19/-1)
tests/unittests/test_util.py (+34/-0)
- Server Team CI bot: Approve (continuous-integration)
- Ryan Harper: Approve
Diff: 72 lines (+40/-1), 2 files modified
cloudinit/util.py (+6/-1)
tests/unittests/test_util.py (+34/-0)
tags: added: id-5a9fa29eda1dc1b22307ed30
Changed in cloud-init:
status: Confirmed → Fix Committed
Changed in livecd-rootfs (Ubuntu):
status: New → In Progress
assignee: nobody → Michael Hudson-Doyle (mwhudson)
I think the issue is:
a.) there is no default locale set in the subiquity-installed system.
b.) python3 subprocess is doing an 'encode' of each argument in the
command list.
python2's default encoding *is* supposed to be based on the environment [1],
but python3's default encoding is not. python3 is supposed to be utf-8.
In the trace above we are down in C code, where it is clearly doing an 'ascii'
encoding.
[1] https://docs.python.org/2/library/sys.html?highlight=getdefaultencoding#sys.getdefaultencoding
[2] https://docs.python.org/3/library/stdtypes.html?highlight=decode#str.encode
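As a small illustration of the encoding point (not part of the original report), the snippet below encodes the gecos string with both codecs. The helper name `try_encode` is made up for this sketch. With 'ascii' it should fail in the same way as the log, since 'é' sits at index 4 of the string. Which codec Python 3 actually applies to command arguments is the one reported by `sys.getfilesystemencoding()`, and that value depends on the locale and the Python version, so it is printed here without an expected result.

```python
import sys

# The gecos value from the user-data, written with an escape for 'é'.
name = "Andr\u00e9 D'Silva"


def try_encode(text, encoding):
    """Hypothetical helper: return the encoded bytes, or the error message as str."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)


utf8_result = try_encode(name, "utf-8")
ascii_result = try_encode(name, "ascii")

print("utf-8 :", utf8_result)
print("ascii :", ascii_result)

# The default str encoding in Python 3 is fixed, but the filesystem
# encoding (used for argv and paths) follows the locale.
print("default encoding   :", sys.getdefaultencoding())
print("filesystem encoding:", sys.getfilesystemencoding())
```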
You can see the problem generally below. I only use 'json' as a convenient
way to pass in utf-8 characters. You can see that either an unset LANG
or LANG=C causes the issue.
I guess I never thought that subprocess would be converting an argument
list of strings to bytes. That does make some sense.
So I think there are actually two changes:
a.) subiquity (via either curtin or cloud-init) should be setting a utf-8
default locale (Ubuntu images generally do that). I'm not sure why the image
being installed didn't have one set.
b.) cloud-init's subp should probably just convert whatever argument list
it gets for the command to bytes itself, always assuming
that strings are to be encoded as utf-8.
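A minimal sketch of change (b), assuming a hypothetical helper named `encode_args`; this is not cloud-init's actual `subp` code. The idea is to encode every str argument as UTF-8 up front, so that subprocess receives bytes and no longer has to encode anything using the locale.

```python
def encode_args(args, encoding="utf-8"):
    """Return a new list where every str item is encoded to bytes.

    Items that are already bytes are passed through unchanged.
    """
    return [a.encode(encoding) if isinstance(a, str) else a for a in args]


# Same command as in the report, expressed as str arguments.
cmd = ["echo", "Andr\u00e9 DSilva"]
encoded = encode_args(cmd)
print(encoded)
```

On POSIX systems, `subprocess.check_call(encode_args(cmd))` accepts the resulting list of bytes directly, so the call should behave the same whether or not LANG is set.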
$ cat go.py
#!/usr/bin/python3
import json, subprocess, sys
cmd = json.loads(sys.argv[1])
print("cmd=%s" % [x.encode("utf-8") for x in cmd])
subprocess.check_call(cmd)
# my default lang is en_US.utf-8
$ ./go.py '["echo", "Andr\u00e9 DSilva"]'
cmd=[b'echo', b'Andr\xc3\xa9 DSilva']
André DSilva
$ LANG=en_US.utf-8 ./go.py '["echo", "Andr\u00e9 DSilva"]'
cmd=[b'echo', b'Andr\xc3\xa9 DSilva']
André DSilva
$ env -u LANG ./go.py '["echo", "Andr\u00e9 DSilva"]'
cmd=[b'echo', b'Andr\xc3\xa9 DSilva']
Traceback (most recent call last):
  File "./go.py", line 5, in <module>
    subprocess.check_call(cmd)
  File "/usr/lib/python3.6/subprocess.py", line 286, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/lib/python3.6/subprocess.py", line 267, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/lib/python3.6/subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1275, in _execute_child
    restore_signals, start_new_session, preexec_fn)
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 4: ordinal not in range(128)
$ LANG=C ./go.py '["echo", "Andr\u00e9 DSilva"]'
cmd=[b'echo', b'Andr\xc3\xa9 DSilva']
Traceback (most recent call last):
  File "./go.py", line 5, in <module>
    subprocess.check_call(cmd)
  File "/usr/lib/python3.6/subprocess.py", line 286, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/lib/python3.6/subprocess.py", line 267, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/lib/python3.6/subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1275, in _execute_child
    restore_signals, start_new_session, preexec_fn)
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 4: ordinal not in range(128)
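To check how a given interpreter reacts to the LANG settings used in the runs above, a sketch along these lines may help. The function name `fs_encoding_under` and the choice of variables to strip are assumptions of this example, not something from the report. It starts a child Python with a modified environment and asks it for its filesystem encoding. No expected output is shown because the result depends on the Python version: on 3.6 the "unset" and "C" cases would be expected to report ascii, while on Python 3.7 and later the C locale is typically coerced to a UTF-8 one (PEP 538) or UTF-8 mode is enabled (PEP 540), so those cases may report utf-8 there.

```python
import os
import subprocess
import sys

# One-liner executed by the child interpreter.
PROBE = "import sys; print(sys.getfilesystemencoding())"


def fs_encoding_under(lang):
    """Return the filesystem encoding a child Python reports for a LANG value.

    lang=None removes LANG from the child's environment entirely.
    LC_ALL and LC_CTYPE are also dropped so that they cannot override LANG.
    """
    env = {k: v for k, v in os.environ.items()
           if k not in ("LANG", "LC_ALL", "LC_CTYPE")}
    if lang is not None:
        env["LANG"] = lang
    out = subprocess.check_output([sys.executable, "-c", PROBE], env=env)
    return out.decode("ascii").strip()


cases = [("unset", None), ("C", "C"), ("en_US.UTF-8", "en_US.UTF-8")]
results = {label: fs_encoding_under(lang) for label, lang in cases}

for label, encoding in results.items():
    print("LANG=%s -> %s" % (label, encoding))
```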