"unhandled string" error after specific sequence of printing

Bug #306684 reported by Derick Eddington
2
Affects Status Importance Assigned to Milestone
Ikarus Scheme
Fix Committed
Critical
Abdulaziz Ghuloum

Bug Description

Ikarus does this every time, but only if the same sequence of expressions is done, and only if I let the print-outs go for at least a second before ^C interrupting them.

[d@eep:~]-> ikarus
Ikarus Scheme version 0.0.3+ (revision 1713, build 2008-12-09)
Copyright (c) 2006-2008 Abdulaziz Ghuloum

> (define v (make-vector #e1e5))
> v
#(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[...]
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ^C
> (car v)
Unhandled exception
 Condition components:
   1. &assertion
   2. &who: car
   3. &message: "argument does not have required pair structure"
   4. &irritants: (#(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[...]
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ^C
> (car v)
 Unhandled exception
 Condition components:
   1. &assertion
   2. &who: car
   3. &message: "argument does not have required pair structure"
   4. &irritants: (#(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[...]
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ^C
> v
unhandled string 0x00000000b6d611c6 with fst=0x0000000065206465
[d@eep:~]->

Related branches

Revision history for this message
Abdulaziz Ghuloum (aghuloum) wrote : Re: [Bug 306684] [NEW] "unhandled string" error after specific sequence of printing

I can't reproduce this.

There was a bug in the GC that I fixed a few revs back.
I know you're using rev 1713, but, is it possible that
you installed a new boot file and did not install and
update the ikarus executable?

The GC bug had exactly the same symptoms you describe,
and I did trigger it with (make-vector 1000000) every
time. I did track a bug in the gc, fixed it, and the
symptoms did go away. There might be another close by
but I need to get to it.

Aziz,,,

Revision history for this message
Derick Eddington (derick-eddington) wrote :

On Wed, 2008-12-10 at 04:39 +0000, Abdulaziz Ghuloum wrote:
> I can't reproduce this.
>
> There was a bug in the GC that I fixed a few revs back.
> I know you're using rev 1713, but, is it possible that
> you installed a new boot file and did not install and
> update the ikarus executable?

I've made two fresh checkouts, and did "make distclean" in the checkout
I've been using for months, and ran the problem with
[d@eep:~/t3/ikarus.dev]-> src/ikarus -b scheme/ikarus.boot
and: with the first fresh checkout, the bug did not happen 3 of 3 tries,
but after cleaning and building again it did happen 2 of 2 tries; with
my old checkout, after cleaning and building, it did happen 2 of 2
tries; with the second fresh checkout, it did happen and with the first
build but died with a segmentation fault instead 2 of 2 tries.

I've kept all these builds. You can SSH into my computer if you want.

Revision history for this message
Abdulaziz Ghuloum (aghuloum) wrote :

I tried on two linux installations: debian on vmware, and gentoo on a real machine through ssh. I can't get it.

Can you try to come up with a script that exhibits the bug. The bug is GC/guardian related, and has nothing to do with interrupts AFAICT. It's just some crazy allocation pattern that's triggering it. If you can come up with a reliable way of reproducing it (e.g., not wait one sec and press ^C quickly), I'll try to see what I can do with it.

Thanks.

Revision history for this message
Abdulaziz Ghuloum (aghuloum) wrote :

I have added two tests that stress the symbol table and the guardians.
Can you run them and see if they fail or pass on your machine.

$ cd ikarus.dev/scheme
$ ikarus --r6rs-script run-tests.ss guardians symbol-table guardians symbol-table

Revision history for this message
Abdulaziz Ghuloum (aghuloum) wrote :

Try the following change and let me know if the problem goes away please.

=== modified file 'src/ikarus-runtime.c'
--- src/ikarus-runtime.c 2008-12-05 07:19:26 +0000
+++ src/ikarus-runtime.c 2008-12-10 15:14:17 +0000
@@ -81,7 +81,7 @@
     pcb->segment_vector = (unsigned int*)(long)(s - new_lo * pagesize);
     pcb->memory_base = (new_lo * segment_size);
   }
- else if (q > pcb->memory_end){
+ else if (q >= pcb->memory_end){
     unsigned long int lo = segment_index(pcb->memory_base);
     unsigned long int old_hi = segment_index(pcb->memory_end);
     unsigned long int new_hi = segment_index(q+segment_size-1);

Revision history for this message
Derick Eddington (derick-eddington) wrote : Re: [Bug 306684] Re: "unhandled string" error after specific sequence of printing

On Wed, 2008-12-10 at 14:22 +0000, Abdulaziz Ghuloum wrote:
> I have added two tests that stress the symbol table and the guardians.
> Can you run them and see if they fail or pass on your machine.
>
> $ cd ikarus.dev/scheme
> $ ikarus --r6rs-script run-tests.ss guardians symbol-table guardians symbol-table
>

After updating to 1715, I first checked the bug is still happening, and
it is.

[d@eep:~/t2/ikarus.dev/scheme]-> ../src/ikarus -b ikarus.boot --r6rs-script run-tests.ss guardians symbol-table guardians symbol-table
[testing guardians] ... OK
[testing symbol-table] ... OK
[testing guardians] ... OK
[testing symbol-table] ... OK
Happy Happy Joy Joy
[d@eep:~/t2/ikarus.dev/scheme]->

Also all OKs when done with a different checkout.

Revision history for this message
Derick Eddington (derick-eddington) wrote :

On Wed, 2008-12-10 at 15:17 +0000, Abdulaziz Ghuloum wrote:
> Try the following change and let me know if the problem goes away
> please.
>
> === modified file 'src/ikarus-runtime.c'
> --- src/ikarus-runtime.c 2008-12-05 07:19:26 +0000
> +++ src/ikarus-runtime.c 2008-12-10 15:14:17 +0000
> @@ -81,7 +81,7 @@
> pcb->segment_vector = (unsigned int*)(long)(s - new_lo * pagesize);
> pcb->memory_base = (new_lo * segment_size);
> }
> - else if (q > pcb->memory_end){
> + else if (q >= pcb->memory_end){
> unsigned long int lo = segment_index(pcb->memory_base);
> unsigned long int old_hi = segment_index(pcb->memory_end);
> unsigned long int new_hi = segment_index(q+segment_size-1);
>

Tried it with two different checkouts (and verified the patch was
applied) and the bug still happens the same.

Revision history for this message
Derick Eddington (derick-eddington) wrote :

On Wed, 2008-12-10 at 07:34 +0000, Abdulaziz Ghuloum wrote:
> The bug is
> GC/guardian related, and has nothing to do with interrupts AFAICT.

I've narrowed down the sequence. The bug does not happen if the one
interrupt below is not done.

[d@eep:~/t3/ikarus.dev]-> src/ikarus -b scheme/ikarus.boot
Ikarus Scheme version 0.0.3+ (revision 1715, build 2008-12-10)
Copyright (c) 2006-2008 Abdulaziz Ghuloum

> (define v (make-vector 100000))
;; Interrupt this one
> (car v)
Unhandled exception
 Condition components:
   1. &assertion
   2. &who: car
   3. &message: "argument does not have required pair structure"
   4. &irritants: (#(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[...]
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ^C
;; Do not interrupt this one, but still required
> (car v)
0Unhandled exception
 Condition components:
   1. &assertion
   2. &who: car
   3. &message: "argument does not have required pair structure"
   4. &irritants: (#(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[...]
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))
> v
unhandled string 0x00000000b70561c6 with fst=0x0000000065206465
[d@eep:~/t3/ikarus.dev]->

Revision history for this message
Abdulaziz Ghuloum (aghuloum) wrote :

I think this is IO related actually.

Printing the offending string shows that the first few words are the following consistently:

  0x65206465 0x70656378 0x6e6f6974

If you decode this as utf8, you get:
  "ed exception"

Meaning: printing "unhandled exception" to the error port is corrupting the next memory locations which may contain a string (producing the error) or some other pointer value (producing a segfault).

So, the bug might be in "ikarus.io.ss".

I tried commenting out the bytevector-u8-set! in the UNSAFE module in ikarus.io.ss to see what happens, and indeed, using the safe bytevector-u8-set! does signal an error (but that makes the errors recursive, so, it does not help debugging). At least I narrowed it down to there for now and will look into it later.

Aziz,,,

Revision history for this message
Abdulaziz Ghuloum (aghuloum) wrote :

It appears that there is a serious design flaw in the output ports buffering mechanism.

Currently, output ports work as follows:
  * The port has an index and a size. The index starts at 0, and the size is the size of the bytevector buffer.
  * When you do a write-char (assume ascii), a single byte is written at bv[idx], and the index is advanced by 1.
     * After advancing, if the index reaches the size, it means the buffer is full and is flushed immediately
     * After flushing, the index is set to 0.

Now what happens if the flush-output-port is interrupted and a write-char occurs in the interrupt routine?

* The buffer is already full, and its index is set equal to the size. (the invariant that index<size is broken here)
* A byte is written, and the index is advanced (now beyond the size)
* No more flushing occurs since the index != size.
* This keeps on happening until the system is fubar.

Bummer!

Revision history for this message
Abdulaziz Ghuloum (aghuloum) wrote :

Can you please check revision 1716 to see if that fixes things. Thanks.

Changed in ikarus:
assignee: nobody → aghuloum
importance: Undecided → Critical
status: New → Fix Committed
Revision history for this message
Derick Eddington (derick-eddington) wrote :

On Fri, 2008-12-12 at 21:05 +0000, Abdulaziz Ghuloum wrote:
> Can you please check revision 1716 to see if that fixes things. Thanks.

With revision 1716, I tried to cause the bug a number of times, and it
never happened.

Changed in ikarus:
milestone: none → 0.0.4
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.