read-from-string converts UTF-8 parentheses to ASCII parentheses

Bug #2066217 reported by mrkissinger
This bug affects 1 person
Affects  Status   Importance  Assigned to  Milestone
SBCL     Invalid  Undecided   Unassigned

Bug Description

I am using read-from-string to read a string which contains UTF-8 (fullwidth) parenthesis characters "（）", but Lisp converts them into ASCII parenthesis characters "()".

CL-USER> (let ((s "測試（中文）"))
  (format t "~A~%~A~%"
          s
          (read-from-string s)))

測試（中文）
測試(中文)
NIL
CL-USER>

I want to keep the UTF-8 parentheses as normal characters in the string, not as Lisp parentheses that delimit a list.

Is this a bug or a feature? Can I avoid it?

mrkissinger (mrkissinger) wrote :

I tested again in the console; clisp and sbcl gave different results.

$ clisp test.lisp
測試（中文）
測試（中文）
$ sbcl --script test.lisp
測試（中文）
測試(中文)

description: updated
Christophe Rhodes (csr21-cantab) wrote :

You can control Unicode normalization in the Lisp reader using the READTABLE-NORMALIZATION field of your readtable: https://www.sbcl.org/manual/#Symbol-Name-Normalization

READ and READ-FROM-STRING are for converting text into Lisp objects. Just reading the string itself (as text, from a stream) will not perform any normalization, but parsing it as Lisp using READ or one of its relatives will apply the current readtable's normalization setting.
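As a minimal sketch of the readtable approach described above (SBCL-specific, and assuming the accessor is exported as sb-ext:readtable-normalization, as the linked manual section describes), normalization can be switched off on a copy of the current readtable for a single read. The helper name read-from-string-unnormalized is hypothetical and only for illustration; the sample string is built with code-char so that the fullwidth parentheses (U+FF08, U+FF09) cannot be altered by the source file's encoding.

```lisp
;; SBCL-specific sketch. READ-FROM-STRING-UNNORMALIZED is a hypothetical
;; helper name, not part of SBCL or of the original report.
(defun read-from-string-unnormalized (string)
  "Read one object from STRING with symbol-name normalization disabled."
  ;; Bind a copy so the global readtable is left untouched.
  (let ((*readtable* (copy-readtable)))
    (setf (sb-ext:readtable-normalization *readtable*) nil)
    (read-from-string string)))

;; Usage: the resulting symbol's name should keep the fullwidth parentheses.
(let ((s (format nil "測試~C中文~C" (code-char #xFF08) (code-char #xFF09))))
  (format t "~A~%" (read-from-string-unnormalized s)))
```

Binding a copy of the readtable, rather than modifying the global one, keeps the change local to this one call.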

> I want to keep the UTF-8 parentheses as normal characters in the string, not as Lisp parentheses that delimit a list.

When you read from the string in your test, you do not get a list. You get a symbol whose name contains parenthesis characters, either normalized to ASCII parentheses or preserved as fullwidth ones.

If you want a string with UTF-8 parentheses, you can just use it: (char "測試（中文）" 2) returns #\FULLWIDTH_LEFT_PARENTHESIS.
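To illustrate the point about treating the data as text, here is a small sketch that reads the string from a stream with READ-LINE instead of parsing it with the Lisp reader. The helper name first-line-as-text is hypothetical; the sample string is again built with code-char so the fullwidth parentheses are unambiguous.

```lisp
;; FIRST-LINE-AS-TEXT is a hypothetical helper for illustration only.
(defun first-line-as-text (string)
  "Return the first line of STRING, read as plain text from a string stream."
  (with-input-from-string (in string)
    ;; READ-LINE does no tokenizing, so no symbol-name normalization applies.
    (values (read-line in))))

(let* ((s (format nil "測試~C中文~C" (code-char #xFF08) (code-char #xFF09)))
       (line (first-line-as-text s)))
  (format t "~A~%" line)
  ;; Code point of the character at index 2, in hexadecimal: prints FF08.
  (format t "~X~%" (char-code (char line 2))))
```

Because the characters never pass through the reader's symbol handling, the result is an ordinary string with the same characters as the input.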

Changed in sbcl:
status: New → Invalid