Space character printed readably in bad way

Bug #1985814 reported by Mark David
This bug affects 1 person
Affects: SBCL
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

The space character in SBCL is printed in readable representation as "#\ ", i.e., with the space character itself emitted after the "#\" pair of characters. There are better alternatives: #\sp and #\space.

This is first of all very inconvenient in that the character cannot be easily seen in most circumstances. E.g., you cannot tell if it's a space or a tab or some other invisible character or possibly just "#\" followed by nothing.

The biggest problem happens when this printed representation interacts with the pretty printer. When printing a list of characters, for example, the pretty printer does a sort of "word wrapping" on the output such that any whitespace type character at the end of a line is simply eliminated. If you then, say, copy/paste the resulting output, the space characters that happened to be printed at the end of lines lose their original value, typically becoming #\newline characters.

Example in REPL with *print-pretty* = T:

> CL-USER> (progn (print (concatenate 'list (make-list 30 :initial-element #\space))) t)
> (#\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\
> #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ )
> T

Note that the end of the first line of output has no space before the linebreak.

Now copy the first two lines of output. Then type single quote (') and then paste:

> CL-USER> '(#\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\
> #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ )
> (#\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\
> #\Newline #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ #\ )

Note that the character corresponding to the previous last space on the first line is now a #\Newline character.
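The copy/paste steps above can also be reproduced without a terminal, by pretty-printing to a string and reading the string back. This is only a minimal sketch: SPACE-ROUND-TRIP is a hypothetical helper name, and the values it returns depend on the implementation and version, so no particular output is claimed here.

```lisp
;; Sketch only: SPACE-ROUND-TRIP is a hypothetical helper, not part of SBCL.
;; It pretty-prints a list of 30 space characters to a string, reads that
;; string back, and reports whether the list survived the round trip.
(defun space-round-trip ()
  (let* ((chars (make-list 30 :initial-element #\Space))
         (text (let ((*print-pretty* t)
                     (*print-readably* nil))
                 (prin1-to-string chars)))
         (back (read-from-string text)))
    (values (equal chars back)        ; true only if every space survived
            (count #\Newline back)    ; spaces that came back as #\Newline
            text)))                   ; the printed text, for inspection
```

If the first value is false and the second is nonzero, a space printed at the end of a line was dropped by the pretty printer and read back as #\Newline, which is the behavior described in this report.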

A good solution would be to have the default printed output representation be one that uses multiple actual visible glyphs, e.g., #\space or #\sp. This is, by the way, more consistent with the handling of tab, newline, and return. Space is handled in an inconsistent manner with respect to those characters. E.g.,

> CL-USER> #\tab
> #\Tab
> CL-USER> #\newline
> #\Newline
> CL-USER> #\return
> #\Return
> CL-USER> #\space
> #\

SBCL Version: SBCL 2.2.6
uname -a
Darwin mhd-mac-3.fios-router.home 21.6.0 Darwin Kernel Version 21.6.0: Sat Jun 18 17:07:25 PDT 2022; root:xnu-8020.140.41~1/RELEASE_X86_64 x86_64

Christophe Rhodes (csr21-cantab) wrote :

Yes, but...

CLHS 13.1.4.1 says "A graphic character is one that has a standard textual representation as a single glyph, such as A or * or =. Space, which effectively has a blank glyph, is defined to be a graphic." (It also clarifies that Tab, Newline and Return are non-graphic.)

CLHS 22.1.3.2 says "For the graphic standard characters, the character itself is always used for printing in #\ notation---even if the character also has a name[5]."

I agree that the interaction between these two issues is awkward. Regarding the further interaction with *print-pretty*, possibly our initial pprint-dispatch-table should have an entry to make #\Space print as #\Space: I don't think that there's anything that would disallow that.
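As a user-level illustration of the dispatch-table idea, the sketch below installs an entry for the space character in a copy of the current pprint dispatch table. It is not the change made in SBCL itself; *SPACE-BY-NAME-TABLE* and PRINT-WITH-NAMED-SPACE are hypothetical names, and only standard functions (COPY-PPRINT-DISPATCH, SET-PPRINT-DISPATCH) are used.

```lisp
;; Sketch only: the names below are hypothetical, not part of SBCL.
;; Work on a copy so the global dispatch table is left untouched.
(defparameter *space-by-name-table* (copy-pprint-dispatch))

;; When escaping (or readable printing) is requested, print the space
;; character by name; otherwise print the character itself.
(set-pprint-dispatch
 '(eql #\Space)
 (lambda (stream char)
   (if (or *print-escape* *print-readably*)
       (write-string "#\\Space" stream)
       (write-char char stream)))
 0
 *space-by-name-table*)

(defun print-with-named-space (object)
  "Return OBJECT printed with PRIN1 under the modified dispatch table."
  (let ((*print-pprint-dispatch* *space-by-name-table*)
        (*print-pretty* t))
    (prin1-to-string object)))
```

Because the entry lives in a separate table, it only takes effect while *PRINT-PPRINT-DISPATCH* is bound to that table and *PRINT-PRETTY* is true, for example (print-with-named-space (list #\Space #\a)).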

Mark David (mhd-yv) wrote :

That is great info. I did not know (or remember) that rule about #\ printing w.r.t. space. I'm glad to hear the pretty printer could be rigged to work around this awkward interaction. Hope that will be done.

Douglas Katzman (dougk) wrote :

I think this is not the first bug report suggesting that the deletion of explicit spaces coming from printing of a character preceding a pprint-newline is a silly notion. I agree but note that *PRINT-READABLY* meets this particular need in requiring readable printing, versus *print-escape* which just "encourages" the printer to use syntactic markers that would otherwise be absent.

* (let ((*print-pretty* t) (*print-readably* t)) (progn (print (concatenate 'list (make-list 30 :initial-element #\space))) t))

(#\Space #\Space #\Space #\Space #\Space #\Space #\Space #\Space #\Space
 #\Space #\Space #\Space #\Space #\Space #\Space #\Space #\Space #\Space
 #\Space #\Space #\Space #\Space #\Space #\Space #\Space #\Space #\Space
 #\Space #\Space #\Space)

I think the real glitch in the spec is that if *not* printing with #\ notation there is no way to not have space characters be eliminated before a newline when they were not merely incidental whitespace, but actually part of what you really wanted to output. It seems strange.
That said, I'm not opposed to changing the pretty-print-dispatch entry for CHARACTER when using *print-escape* = T.

Douglas Katzman (dougk) wrote :

I checked a few other implementations just to get opinions.

Clozure, ECL, CMU, and ABCL never output the character name under any combination of options.
i.e. for all of them you get
* (write #\space :readably t :escape t :pretty t)
#\
* (write #\space :readably t :escape t :pretty nil)
#\
and for all other permutations of T/NIL to the keywords.
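For anyone who wants to repeat this comparison, here is a minimal sketch that collects the printed form of #\Space for all eight combinations of the three keywords. SPACE-PRINTINGS is a hypothetical helper name, and the strings it returns differ between implementations and versions, so none are listed here.

```lisp
;; Sketch only: SPACE-PRINTINGS is a hypothetical helper name.
;; Returns a list of (readably escape pretty printed-string) rows,
;; one for each T/NIL combination of the three printer keywords.
(defun space-printings ()
  (let ((results '()))
    (dolist (readably '(t nil))
      (dolist (escape '(t nil))
        (dolist (pretty '(t nil))
          (push (list readably escape pretty
                      (write-to-string #\Space
                                       :readably readably
                                       :escape escape
                                       :pretty pretty))
                results))))
    (nreverse results)))

;; Show each combination; ~S makes a trailing space in the result visible.
(dolist (row (space-printings))
  (destructuring-bind (readably escape pretty text) row
    (format t "~&readably=~S escape=~S pretty=~S -> ~S~%"
            readably escape pretty text)))
```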

CLISP: non-spec-compliant. Always outputs the name regardless of printer options.
[1]> (write #\space :readably nil :escape nil :pretty nil)
#\Space

Seems odd that we would have uniquely stumbled upon a correct solution regarding *print-readably* and/or adjustment of the default dispatch table.
Not sure if serious, but what if we could exploit some Unicode spaces to convey behavior through the pretty-printer? The spec should have seen that deletion of significant whitespace is wrong.

Mark David (mhd-yv) wrote :

As pointed out (#4), Clozure outputs #\ followed by space character in pretty print mode. Having just tried Clozure, I see it does not get rid of the space after #\ before a newline. Why can't SBCL follow Clozure's lead on this? Sorry if that was explained above, but if so, I guess I could use a simpler explanation. Thanks.

Changed in sbcl:
status: New → Fix Committed
Changed in sbcl:
status: Fix Committed → Fix Released