Steel Bank Common Lisp

long line with byte 0 causes SBCL to crash into LDB

Reported by Rademaker on 2013-07-17
This bug affects 1 person
Affects: SBCL
Importance: Undecided
Assigned to: Unassigned

Bug Description

Consider the attached files. The zero.log file contains:

$ hexdump -C zero.log
00000000 41 42 00 00 00 43 44 45 0a 46 0a 47 0a 48 0a 49 |AB...CDE.F.G.H.I|
00000010 0a 4a 0a |.J.|
00000013

That is, after the letter B I put three 00 bytes (hexadecimal). The file zero300M.log contains:

$ hexdump -C zero300M.log
00000000 41 42 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |AB..............|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
12c00000 00 00 43 44 45 0a 46 0a 47 0a 48 0a 49 0a 4a 0a |..CDE.F.G.H.I.J.|
12c00010

That is, the same content but with roughly 300 MB of 00 bytes instead of three. The first file is read correctly, but the second one causes a crash:

> (with-open-file (in "zero.log")
    (loop for line = (read-line in nil)
          while line
          do (format t "~a~%" line)))

Heap exhausted during garbage collection: 1488 bytes available, 2064 requested.
 Gen StaPg UbSta LaSta LUbSt Boxed Unboxed LB LUB !move Alloc Waste Trig WP GCs Mem-age
   0: 0 0 0 0 0 0 0 0 0 0 0 10737418 0 0 0.0000
   1: 19385 22791 0 0 93 10219 0 0 0 322310352 15593264 10737418 0 0 1.0002
   2: 29967 29962 0 0 469 13849 0 0 37 448042192 21130032 2000000 432 0 0.0000
   3: 0 0 0 0 0 0 0 0 0 0 0 2000000 0 0 0.0000
   4: 0 0 0 0 0 0 0 0 0 0 0 2000000 0 0 0.0000
   5: 0 0 0 0 0 0 0 0 0 0 0 2000000 0 0 0.0000
   6: 0 0 0 0 1192 173 0 0 0 44728320 0 2000000 1075 0 0.0000
   Total bytes allocated = 1026850528
   Dynamic-space-size bytes = 1073741824
GC control variables:
   *GC-INHIBIT* = true
   *GC-PENDING* = in progress
fatal error encountered in SBCL pid 72033:
Heap exhausted, game over.

Allegro Common Lisp (lisp) is able to read both files without any problem.

More info:

$ sbcl --version
SBCL 1.1.9

$ uname -a
Darwin urca.br.ibm.com 12.4.0 Darwin Kernel Version 12.4.0: Wed May 1 17:57:12 PDT 2013; root:xnu-2050.24.15~1/RELEASE_X86_64 x86_64

> *features*
(:SWANK :QUICKLISP :ASDF2 :ASDF :ASDF-UNICODE :ALIEN-CALLBACKS :ANSI-CL
 :ASH-RIGHT-VOPS :BSD :C-STACK-IS-CONTROL-STACK :COMMON-LISP
 :COMPARE-AND-SWAP-VOPS :COMPLEX-FLOAT-VOPS :CYCLE-COUNTER :DARWIN
 :DARWIN9-OR-BETTER :FLOAT-EQL-VOPS :GENCGC :IEEE-FLOATING-POINT
 :INLINE-CONSTANTS :INODE64 :LINKAGE-TABLE :LITTLE-ENDIAN
 :MACH-EXCEPTION-HANDLER :MACH-O :MEMORY-BARRIER-VOPS :MULTIPLY-HIGH-VOPS
 :OS-PROVIDES-BLKSIZE-T :OS-PROVIDES-DLADDR :OS-PROVIDES-DLOPEN
 :OS-PROVIDES-PUTWC :OS-PROVIDES-SUSECONDS-T :PACKAGE-LOCAL-NICKNAMES
 :RAW-INSTANCE-INIT-VOPS :SB-DOC :SB-EVAL :SB-LDB :SB-PACKAGE-LOCKS
 :SB-SIMD-PACK :SB-SOURCE-LOCATIONS :SB-TEST :SB-UNICODE :SBCL
 :STACK-ALLOCATABLE-CLOSURES :STACK-ALLOCATABLE-FIXED-OBJECTS
 :STACK-ALLOCATABLE-LISTS :STACK-ALLOCATABLE-VECTORS
 :STACK-GROWS-DOWNWARD-NOT-UPWARD :UD2-BREAKPOINTS :UNIX
 :UNWIND-TO-FRAME-AND-CALL-VOP :X86-64)

Stas Boukarev (stassats) wrote :

You're trying to create a very long line with READ-LINE. The crash isn't caused by the 0 bytes but by the lack of a newline.
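
A minimal sketch of that point, using an in-memory string instead of the attached files: READ-LINE treats NUL characters like any other character and only stops at a newline. The names first-line-length and *sample* are made up for this illustration, and the sketch assumes (code-char 0) returns the NUL character, as it does in SBCL.

```lisp
;; Illustration only: FIRST-LINE-LENGTH and *SAMPLE* are hypothetical names.
(defun first-line-length (text)
  "Return the length of the first line READ-LINE returns from the string TEXT."
  (with-input-from-string (in text)
    (length (read-line in nil ""))))

;; Same layout as zero.log: "AB", three NUL characters, "CDE", newline, "F", newline.
;; Assumes (code-char 0) yields the NUL character.
(defparameter *sample*
  (concatenate 'string
               "AB"
               (make-string 3 :initial-element (code-char 0))
               "CDE"
               (string #\Newline)
               "F"
               (string #\Newline)))

;; The NULs are counted as part of the first line; only the newline ends it.
(format t "~d~%" (first-line-length *sample*))
```

With more NULs and no earlier newline, the first line simply gets longer, which is the situation in zero300M.log.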

Changed in sbcl:
status: New → Incomplete
status: Incomplete → Invalid
Rademaker (arademaker) wrote :

Stas, but is there a hard limit on the size of a line?

Changed in sbcl:
status: Invalid → New
status: New → Opinion
Rademaker (arademaker) wrote :

In other words: OK, the problem is not the zero byte, but why does SBCL crash when read-line reads a long line?

Paul Khuong (pvk) wrote :

It ran out of space (which is what the first line of error output says "Heap exhausted during garbage collection: 1488 bytes available, 2064 requested").

300 million characters at 4 bytes per character (we support the full Unicode range) = 1.2 GB. Just representing the final string needs roughly 200 MB more space than the 1 GB heap provides.
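
The same arithmetic, worked through with the exact figures from the report. This is only a rough sketch: the line length is read off the hexdump offsets, the heap size is the "Dynamic-space-size bytes" value in the GC output, and it ignores the string header and any intermediate buffers READ-LINE may allocate while the line grows. The names below are made up for this illustration.

```lisp
;; Illustration only: all names here are hypothetical.
;; In zero300M.log the first newline (0a) sits at offset #x12c00005,
;; so the first line is that many characters long.
(defparameter *line-length* #x12c00005)
;; Four bytes per character, the figure quoted in this comment.
(defparameter *bytes-per-character* 4)
;; "Dynamic-space-size bytes" from the GC report.
(defparameter *heap-bytes* 1073741824)

(defun string-payload-bytes (n-chars)
  "Bytes needed for the character data of a string of N-CHARS characters."
  (* n-chars *bytes-per-character*))

(let* ((needed (string-payload-bytes *line-length*))
       (shortfall (- needed *heap-bytes*)))
  (format t "line length:  ~:d characters~%" *line-length*)
  (format t "string data:  ~:d bytes~%" needed)
  (format t "heap size:    ~:d bytes~%" *heap-bytes*)
  (format t "shortfall:    ~:d bytes~%" shortfall))
```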

You can increase the heap size with --dynamic-space-size. You might also want to consider working with octets or base-chars and :ascii or :iso-8859-1 external formats.
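
As a rough sketch of the "working with octets" suggestion: the function below scans a file in fixed-size chunks and reports the length of its longest line in octets, so memory use stays bounded no matter how long a line is. The name longest-line-length and the default chunk size are assumptions made for this example, not something SBCL provides.

```lisp
;; Illustration only: LONGEST-LINE-LENGTH is a hypothetical helper.
(defun longest-line-length (path &key (chunk-size 65536))
  "Return the length in octets of the longest line in the file at PATH.
Reads fixed-size chunks, so memory use does not depend on line length."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let ((buffer (make-array chunk-size :element-type '(unsigned-byte 8)))
          (current 0)
          (longest 0))
      ;; READ-SEQUENCE returns the index of the first element it did not
      ;; fill, so 0 means end of file.
      (loop for end = (read-sequence buffer in)
            until (zerop end)
            do (loop for i from 0 below end
                     do (if (= (aref buffer i) 10) ; 10 = #x0a, the newline octet
                            (setf longest (max longest current)
                                  current 0)
                            (incf current))))
      ;; Account for a final line that has no trailing newline.
      (max longest current))))
```

Calling (longest-line-length "zero300M.log") before reading the file with READ-LINE would show whether any single line is too large for the configured heap.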

Changed in sbcl:
status: Opinion → Invalid

Rademaker <email address hidden> writes:

> Stas, but is there a hard limit on the size of a line?

Characters in sbcl take four bytes in strings unless you work hard to
avoid it. Your 300MB-sized file with no newlines is therefore going to
try to construct a 1.2GB-sized string, which is probably larger than
your heap size.

> ** Changed in: sbcl
> Status: New => Opinion

It may be your opinion that this is a bug, and in the general scheme of
things it might be -- but you'd be on stronger ground arguing that if
you could reproduce it in an environment where the heap size is actually
large enough for your data.

Christophe
