read giving #!eof to list->string

Bug #295334 reported by Derick Eddington
Affects: Ikarus Scheme
Status: Fix Committed
Importance: Medium
Assigned to: Abdulaziz Ghuloum
Milestone: 0.0.4

Bug Description

With a debugging print-out that I added to ikarus.strings.ss:

Ikarus Scheme version 0.0.3+ (revision 1661, build 2008-11-07)
Copyright (c) 2006-2008 Abdulaziz Ghuloum

> (define bv
#vu8(10 159 101 118 111 108 117 116 105 111 110 58 108 105
     115 116 95 115 99 114 111 108 108 95 112 111 115 105
     116 105 111 110 141 55 52 50 56 56 46 48 48 48 48 48 48
     151 101 118 111 108 117 116 105 111 110 58 115 101 108
     101 99 116 101 100 95 117 105 100 140 56 102 69 86 90
     85 85 66 57 51 53 128)
)
> (read (open-bytevector-input-port bv (native-transcoder)))
*** list->string
  length = 77
  ls = (#\� #\e #\v #\o #\l #\u #\t #\i #\o #\n #\: #\l #\i #\s #\t #\_ #\s #\c #\r #\o #\l #\l #\_ #\p #\o #\s #\i #\t #\i #\o #\n #\7 #\4 #\2 #\8 #\8 #\. #\0 #\0 #\0 #\0 #\0 #\0 #\e #\v #\o #\l #\u #\t #\i #\o #\n #\: #\s #\e #\l #\e #\c #\t #\e #\d #\_ #\u #\i #\d #\8 #\f #\E #\V #\Z #\U #\U #\B #\9 #\3 #\5 #!eof)
  nice = �evolution:list_scroll_position74288.000000evolution:selected_uid8fEVZUUB935#!eof
Unhandled exception
 Condition components:
   1. &assertion
   2. &who: list->string
   3. &message: "not a character"
   4. &irritants: (#!eof)
>

I had fun hunting this one down :)

Abdulaziz Ghuloum (aghuloum) wrote : Re: [Bug 295334] [NEW] read giving #!eof to list->string

On Nov 7, 2008, at 4:16 PM, Derick Eddington wrote:

>> (define bv
> #vu8(10 159 101 118 111 108 117 116 105 111 110 58 108 105
> 115 116 95 115 99 114 111 108 108 95 112 111 115 105
> 116 105 111 110 141 55 52 50 56 56 46 48 48 48 48 48 48
> 151 101 118 111 108 117 116 105 111 110 58 115 101 108
> 101 99 116 101 100 95 117 105 100 140 56 102 69 86 90
> 85 85 66 57 51 53 128)
> )
>> (read (open-bytevector-input-port bv (native-transcoder)))
> *** list->string
> length = 77
> ls = (#\� #\e #\v #\o #\l #\u #\t #\i #\o #\n #\: #\l #\i #\s #\t #\_ #\s #\c #\r #\o #\l #\l #\_ #\p #\o #\s #\i #\t #\i #\o #\n #\7 #\4 #\2 #\8 #\8 #\. #\0 #\0 #\0 #\0 #\0 #\0 #\e #\v #\o #\l #\u #\t #\i #\o #\n #\: #\s #\e #\l #\e #\c #\t #\e #\d #\_ #\u #\i #\d #\8 #\f #\E #\V #\Z #\U #\U #\B #\9 #\3 #\5 #!eof)
> nice = �evolution:list_scroll_position74288.000000evolution:selected_uid8fEVZUUB935#!eof

Is this the symbol it's supposed to produce?

�evolution:list_scroll_position�4288.000000�volution:selected_uid�fEVZUUB935�

Aziz,,,

Abdulaziz Ghuloum (aghuloum) wrote :

On Nov 7, 2008, at 9:53 PM, Abdulaziz Ghuloum wrote:

> Is this the symbol it's supposed to produce?

I don't think so. Never mind.

Abdulaziz Ghuloum (aghuloum) wrote :

On Nov 7, 2008, at 9:57 PM, Abdulaziz Ghuloum wrote:

>
> On Nov 7, 2008, at 9:53 PM, Abdulaziz Ghuloum wrote:
>
>> Is this the symbol it's supposed to produce?
>
> I don't think so. Never mind.

On second thought, yes, I think so :-)

 > (define bv
#vu8(10 159 101 118 111 108 117 116 105 111 110 58 108 105
      115 116 95 115 99 114 111 108 108 95 112 111 115 105
      116 105 111 110 141 55 52 50 56 56 46 48 48 48 48 48 48
      151 101 118 111 108 117 116 105 111 110 58 115 101 108
      101 99 116 101 100 95 117 105 100 140 56 102 69 86 90
      85 85 66 57 51 53 128))

 > (let ([p (open-bytevector-input-port bv (native-transcoder))])
     (let f ()
       (let ([x (read-char p)])
         (if (eof-object? x) '() (cons x (f))))))

(#\linefeed #\xFFFD #\e #\v #\o #\l #\u #\t #\i #\o #\n #\:
   #\l #\i #\s #\t #\_ #\s #\c #\r #\o #\l #\l #\_ #\p #\o
   #\s #\i #\t #\i #\o #\n #\xFFFD #\7 #\4 #\2 #\8 #\8 #\.
   #\0 #\0 #\0 #\0 #\0 #\0 #\xFFFD #\e #\v #\o #\l #\u #\t
   #\i #\o #\n #\: #\s #\e #\l #\e #\c #\t #\e #\d #\_ #\u
   #\i #\d #\xFFFD #\8 #\f #\E #\V #\Z #\U #\U #\B #\9 #\3
   #\5 #\xFFFD)

Were you supposed to use the latin-1 codec for this, or was
it really supposed to be this malcoded data?

[I fixed the read error in revision 1662]

Abdulaziz Ghuloum (aghuloum) wrote :

On Nov 7, 2008, at 10:02 PM, Abdulaziz Ghuloum wrote:

> [I fixed the read error in revision 1662]

Nope. (Sorry for the flood.)

The bug is in lookahead-char which should not advance the port
position when a decoding error occurs.
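
The invariant described above can be checked with a small sketch: whatever lookahead-char reports, the next read-char on the same port should report the same thing, even when the underlying byte is not valid UTF-8. The names utf8-replace and peek-matches-read? are made up for illustration, and the explicit transcoder is an assumption chosen so the behaviour does not depend on what (native-transcoder) happens to return.

```scheme
(import (rnrs))

;; Assumption: an explicit UTF-8 transcoder that replaces undecodable
;; bytes with U+FFFD, instead of relying on (native-transcoder).
(define utf8-replace
  (make-transcoder (utf-8-codec) (eol-style none)
                   (error-handling-mode replace)))

;; First bytes of the reported bytevector: newline, the invalid
;; UTF-8 byte 159, then the ASCII letter "e".
(define bv #vu8(10 159 101))

;; Hypothetical helper: #t if every lookahead-char result is matched by
;; the read-char that follows it, all the way to end of file.
(define (peek-matches-read? port)
  (let loop ()
    (let* ([peeked (lookahead-char port)]
           [got (read-char port)])
      (cond
        [(eof-object? peeked) (eof-object? got)]
        [(and (char? got) (char=? peeked got)) (loop)]
        [else #f]))))

(display (peek-matches-read? (open-bytevector-input-port bv utf8-replace)))
(newline)
```

If lookahead-char consumed the bad byte, the read-char after it would return the following character (or the end-of-file object) and the check would fail.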

Changed in ikarus:
assignee: nobody → aghuloum
importance: Undecided → Medium
status: New → Confirmed
Abdulaziz Ghuloum (aghuloum) wrote :

In revision 1663, I get the following. Does it look right now?

 > (define bv
#vu8(10 159 101 118 111 108 117 116 105 111 110 58 108 105
      115 116 95 115 99 114 111 108 108 95 112 111 115 105
      116 105 111 110 141 55 52 50 56 56 46 48 48 48 48 48 48
      151 101 118 111 108 117 116 105 111 110 58 115 101 108
      101 99 116 101 100 95 117 105 100 140 56 102 69 86 90
      85 85 66 57 51 53 128))
 > (symbol->string (read (open-bytevector-input-port bv (native-transcoder))))
"�evolution:list_scroll_position�74288.000000�evolution:selected_uid�8fEVZUUB935�"

Derick Eddington (derick-eddington) wrote :

On Fri, 2008-11-07 at 22:02 -0500, Abdulaziz Ghuloum wrote:
> Were you supposed to use the latin-1 codec for this, or was
> it really supposed to be this malcoded data?

It's supposed to be malcoded. I've made a "datum finder" which walks
down directories attempting to read all datums from every file. If
get-datum raises &lexical, it skips the rest of the file. I made this
so I can search Scheme code for forms that match patterns :) I was
testing it by making it eat through my entire home directory :)
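
A minimal sketch of the per-file part of such a datum finder, assuming only standard R6RS procedures: it collects datums with get-datum and abandons the rest of the input at the first &lexical condition. The names read-all-datums and datums-in-bytevector are hypothetical, the directory walk is left out because R6RS has no portable API for it, and the explicit replace-mode transcoder is an assumption rather than what the original tool used.

```scheme
(import (rnrs))

;; Hypothetical helper: return the list of datums readable from PORT.
;; A &lexical condition is treated like end of file, so the rest of
;; the input is skipped.
(define (read-all-datums port)
  (let loop ([acc '()])
    (let ([x (guard (e [(lexical-violation? e) (eof-object)])
               (get-datum port))])
      (if (eof-object? x)
          (reverse acc)
          (loop (cons x acc))))))

;; Hypothetical wrapper: decode BV as UTF-8, replacing undecodable
;; bytes with U+FFFD (an assumption), and read all datums from it.
(define (datums-in-bytevector bv)
  (read-all-datums
    (open-bytevector-input-port
      bv
      (make-transcoder (utf-8-codec) (eol-style none)
                       (error-handling-mode replace)))))

(write (datums-in-bytevector (string->utf8 "(a b) 42")))
(newline)
```

Conditions other than &lexical are deliberately not caught here, so an unexpected failure such as the &assertion in this report still surfaces.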

Derick Eddington (derick-eddington) wrote :

On Fri, 2008-11-07 at 22:11 -0500, Abdulaziz Ghuloum wrote:
> In revision 1663, I get the following. Does it look right now?
>
> > (define bv
> #vu8(10 159 101 118 111 108 117 116 105 111 110 58 108 105
> 115 116 95 115 99 114 111 108 108 95 112 111 115 105
> 116 105 111 110 141 55 52 50 56 56 46 48 48 48 48 48 48
> 151 101 118 111 108 117 116 105 111 110 58 115 101 108
> 101 99 116 101 100 95 117 105 100 140 56 102 69 86 90
> 85 85 66 57 51 53 128))
> > (symbol->string (read (open-bytevector-input-port bv (native-transcoder))))
> "�evolution:list_scroll_position�74288.000000�evolution:selected_uid�8fEVZUUB935�"

I think that's right, because with r1661 still:

> (bytevector->string bv (native-transcoder))
"\n�evolution:list_scroll_position�74288.000000�evolution:selected_uid�8fEVZUUB935�"
> (read (open-string-input-port (bytevector->string bv (native-transcoder))))
�evolution:list_scroll_position�74288.000000�evolution:selected_uid�8fEVZUUB935
> (symbol->string (read (open-bytevector-input-port bv (native-transcoder))))
Unhandled exception
 Condition components:
   1. &assertion
   2. &who: list->string
   3. &message: "not a character"
   4. &irritants: (#!eof)
>

Derick Eddington (derick-eddington) wrote :

On Fri, 2008-11-07 at 21:53 -0500, Abdulaziz Ghuloum wrote:
> Is this the symbol it's supposed to produce?
>
> �evolution:list_scroll_position�4288.000000�volution:selected_uid
> �fEVZUUB935�

I think it is. R6RS says:

<identifier> → <initial> <subsequent>*
         | <peculiar identifier>
<initial> → <constituent> | <special initial>
         | <inline hex escape>
<constituent> → <letter>
         | 〈any character whose Unicode scalar value is greater than
             127, and whose category is Lu, Ll, Lt, Lm, Lo, Mn,
             Nl, No, Pd, Pc, Po, Sc, Sm, Sk, So, or Co〉

and:

> (char-general-category #\xFFFD)
So
>
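
The check above can be written as a small predicate, sketched here under the assumption that only the non-ASCII branch of the quoted <constituent> rule matters. The names constituent-categories and non-ascii-constituent? are made up for illustration.

```scheme
(import (rnrs))

;; Categories listed in the <constituent> rule quoted above.
(define constituent-categories
  '(Lu Ll Lt Lm Lo Mn Nl No Pd Pc Po Sc Sm Sk So Co))

;; Hypothetical predicate: #t if C has a scalar value above 127 and a
;; general category allowed by the rule; ASCII letters are not covered.
(define (non-ascii-constituent? c)
  (and (> (char->integer c) 127)
       (memq (char-general-category c) constituent-categories)
       #t))

(display (non-ascii-constituent? #\xFFFD))
(newline)
```

Since #\xFFFD has category So, the predicate accepts it, which is consistent with the replacement character being allowed to start an identifier.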

Derick Eddington (derick-eddington) wrote :

On Fri, 2008-11-07 at 19:52 -0800, Derick Eddington wrote:
> On Fri, 2008-11-07 at 21:53 -0500, Abdulaziz Ghuloum wrote:
> > Is this the symbol it's supposed to produce?
> >
> > �evolution:list_scroll_position�4288.000000�volution:selected_uid
> > �fEVZUUB935�
>
> I think it is. R6RS says:

Ah, I just noticed that the above symbol has missing characters. The
r1663 fix:

�evolution:list_scroll_position�74288.000000�evolution:selected_uid�8fEVZUUB935

does look right because it corresponds to the read-char at-a-time list
you showed previously.

Changed in ikarus:
status: Confirmed → Fix Committed
Abdulaziz Ghuloum (aghuloum) wrote :

On Nov 7, 2008, at 10:18 PM, Derick Eddington wrote:

> It's supposed to be malcoded. I've made a "datum finder" which walks
> down directories attempting to read all datums from every file. If
> get-datum raises &lexical, it skips the rest of the file. I made this
> so I can search Scheme code for forms that match patterns :) I was
> testing it by making it eat through my entire home directory :)

BTW, I'm glad I have someone stress-testing Ikarus in ways that I could
never have imagined! :-)

Changed in ikarus:
milestone: none → 0.0.4