UTF-8 filename causes decoding error
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
SBCL | Fix Released | Undecided | Unassigned |
Bug Description
In a shell, running
touch /tmp/mouse↑.svg
LC_CTYPE=C sbcl --eval '(directory #P"/tmp/*.*")'
causes
debugger invoked on a SB-INT:
#<THREAD "main thread" RUNNING {1001950083}>:
:ASCII c-string decoding error: the octet sequence #(226) cannot be decoded.
(SB-IMPL:
This causes problems in Quicklisp:
https:/
Also
(open (format nil "/tmp/mouse~c.svg" (code-char 8593)))
fails in SBCL, but works in CCL for example.
Versions:
SBCL 1.4.2
SBCL 1.4.2.137-26f361f4a
on
Darwin now 15.6.0 Darwin Kernel Version 15.6.0: Sun Jun 4 21:43:07 PDT 2017; root:xnu-
(CFFI-FEATURES:
CFFI-FEATURES:
ALEXANDRIA.
:ASDF2 :ASDF :OS-MACOSX :OS-UNIX :NON-BASE-
:64-BIT-REGISTERS :ALIEN-CALLBACKS :ANSI-CL :ASH-RIGHT-VOPS :BSD
:C-STACK-
:COMPARE-
:DARWIN9-OR-BETTER :FLOAT-EQL-VOPS :FP-AND-
:IEEE-
:INTEGER-EQL-VOP :LINKAGE-TABLE :LITTLE-ENDIAN :MACH-EXCEPTION
:MEMORY-
:OS-PROVIDES-
:OS-PROVIDES-
:RAW-INSTANCE-
:SB-CORE-
:SB-SOURCE-
:STACK-
:STACK-
:STACK-
:UNBIND-N-VOP :UNDEFINED-
:X86-64)
This is what happens:
1. LC_CTYPE=C tells SBCL that the system encodes characters using the ASCII character encoding. SBCL chooses the default external format, ASCII in this case, accordingly.
2. (directory #P"/tmp/*.*") requires converting (parts of) the operating system filename in question (a sequence of octets) into a Lisp string, which is done using the default external format.
3. Since the default external format is ASCII and the filename contains octets above 127 (starting with 226, the first octet of the UTF-8 encoding of ↑), the filename cannot be converted into a Lisp string and an error is signaled.
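The three steps above can be checked outside of SBCL. The following is a minimal sketch, not SBCL's actual decoder: it assumes the filename is stored as UTF-8 octets, and the variable names `name` and `first_bad` are made up for illustration. It dumps the octets of the filename and reports the first one that lies outside the 7-bit ASCII range.

```shell
# Filename from the report, spelled with octal escapes for the UTF-8
# encoding of the arrow (U+2191) so the script is terminal-independent.
# (Assumption: the filesystem stores the name as UTF-8 octets.)
name=$(printf 'mouse\342\206\221.svg')

# Walk the octets as unsigned decimals and stop at the first one that
# falls outside the 7-bit ASCII range (0..127).
first_bad=$(printf '%s' "$name" | od -An -v -tu1 |
  awk '{ for (i = 1; i <= NF; i++) if ($i > 127) { print $i; exit } }')

echo "first octet ASCII cannot decode: $first_bad"
```

The reported octet is 226, the same value that appears in the "octet sequence #(226) cannot be decoded" message above.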
(open (format nil "/tmp/mouse~c.svg" (code-char 8593))) encounters the same problem, but in the opposite direction: during the conversion from a Lisp string to a sequence of octets.
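To connect the two directions, here is a small worked example of how code point 8593 maps to octets. It is only a sketch under the assumption that the target encoding is UTF-8 (the report does not say which encoding CCL uses), and the names `cp`, `b1`, `b2`, `b3` are illustrative. Since 8593 is greater than 127, ASCII has no octet for it at all, whereas UTF-8 uses three octets for code points in the range 2048..65535.

```shell
# Code point passed to code-char in the report (U+2191, upwards arrow).
cp=8593

# Three-octet UTF-8 layout: 1110xxxx 10xxxxxx 10xxxxxx
b1=$(( 0xE0 | (cp >> 12) ))           # leading octet, top 4 bits
b2=$(( 0x80 | ((cp >> 6) & 0x3F) ))   # continuation octet, middle 6 bits
b3=$(( 0x80 | (cp & 0x3F) ))          # continuation octet, low 6 bits

echo "UTF-8 octets for code point $cp: $b1 $b2 $b3"
```

The first octet is 226 again, so both failures involve the same character: one when its octets are decoded as ASCII, the other when the character has to be encoded as ASCII.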
What did you expect to happen instead?