Memory error on Linux when examining certain types of objects

Bug #2009331 reported by JTK
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SBCL
Invalid
Undecided
Unassigned

Bug Description

This is a bug reported on a debian list, but I can't find it on SBCL list.

In summary, the published version is:

=======
(ql:quickload "cl-monad-macros")

(in-package :cl-monad-macros)

(defparameter *foo* ;; capture the object that causes the error when examined/printed
   (with-input-from-string (in "1")
       (with-state-monad
           (run!
               (lambda (s) (cons (read in) in))
           in))))

(type-of *foo*) ==> cons
(type-of (second *foo*)) ==> Unhandled memory fault at #x25.
(print (second *foo*)) ==> Unhandled memory fault at #x1D.
(format t "~A" (second *foo*)) ==> #<sb-kernel:instance {7FE3E5465CD3}>
but
(format t "[ ~A ]" (second *foo*)) ==> Error (TYPE-ERROR) during printing: #<TYPE-ERROR {100E731DC3}>

==========

I actually ran into this problem in a web server trying to output a YASON JSON parsing error object, but I can't replicate it as neatly as the original CL-MONAD example.

For me, it failed in SB-KERNEL::OUTPUT-UGLY-OBJECT when it encountered YASON error:
  There is no applicable method for the generic function
                                #<standard-generic-function yason:encode (12)>
                              when called with arguments
                                (#<sb-kernel:case-failure expected-type:
                                                          (member #\, #\})
                                                          datum: #\C>

But I'm fairly sure it's the same underlying bug.

Software Versions:

SBCL: 2.3.1
uname: Linux xxx.edu 3.10.0-1160.45.1.el7.x86_64 #1 SMP Wed Oct 13 17:20:51 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Revision history for this message
Douglas Katzman (dougk) wrote :

This code is surely not legal. WITH-INPUT-FROM-STRING creates a stream that must not be used after the body form is exited. As it stands the LAMBDA within sees random junk on the stack.
You will have to rewrite it using MAKE-STRING-INPUT-STREAM instead.

Revision history for this message
JTK (jetmonk) wrote :

Addendum: in my case, when the memory error occurred in a webserver (hunchentoot) using YASON, the triggering object is a sb-kernel:case-failure from malformed JSON.

But it does not seem to generate a memory error when running inside SLIME, only when running as a freestanding script using SBCL init arg --script. This makes it harder to reproduce than the CL-MONAD-MACROS example.

My only evidence that the two are linked is this thread (mentioning OUTPUT-UGLY-OBJECT):
https://<email address hidden>/msg00540.html

which actually reduced the error to a much simpler case:

(with-input-from-string (in "1")
  (cons (read in) in))

This also gives me "Unhandled memory fault at #x7FE8."

Revision history for this message
Douglas Katzman (dougk) wrote :

working as intended. You can not return IN from the form that creates IN.

"The input string stream to which the variable var is bound has dynamic extent; its extent ends when the form is exited."

http://www.lispworks.com/documentation/HyperSpec/Body/m_w_in_f.htm#with-input-from-string

Revision history for this message
JTK (jetmonk) wrote :
Download full text (5.6 KiB)

Hello Douglas,

Thank you very much for the quick reply.

Based on the CLHS
http://clhs.lisp.se/Body/m_w_in_f.htm (with-input-from-string)
I agree with your assessment. I'm passing the stream 'in' out of its dynamic extent in the toy example I got off the web.

Now I'm trying to figure out why my web server code is failing. I don't think that the error object that is being printed should have dynamic extent. I do have a with-input-from-stream, but the stream should not be escaping from the w-i-f-s form, unless it gets put into the
#<sb-kernel:case-failure expected-type: (member #\, #\}) datum: #\C>
that is generated, which should not be the case.

Here's what I now suspect. In the backtrace below, report-error-to-client in hunchentoot webserver can take a backtrace argument to print, and the backtrace might contain a reference to a dynamic-extent output stream because by default hunchentoot:*log-lisp-backtraces-p*=T. Is that plausible? This seems a bit unlikely because hitting dynamic extent vars would seem to represent a hazard built into backtrace.

But it does look like it died in logging when a backtrace was to be printed.

Thanks for taking the time to comment on this. This has been a big help.

[2023-02-17 21:27:53 [error]] There is no applicable method for the generic function
                                #<standard-generic-function yason:encode (12)>
                              when called with arguments
                                (#<sb-kernel:case-failure expected-type:
                                                          (member #\, #\})
                                                          datum: #\C>

CORRUPTION WARNING in SBCL pid 2663 tid 4263:
Memory fault at 0x7f8b (pc=0x52c89037 [code 0x52c88e80+0x1B7 ID 0x40a5], fp=0x7f765fdad2b8, sp=0x7f765fdad2a8) tid 4263
The integrity of this image is possibly compromised.
Exiting.
   0: fp=0x7f765fdad2b8 pc=0x52c89037 SB-KERNEL::OUTPUT-UGLY-OBJECT
   1: fp=0x7f765fdad310 pc=0x52a24b03 (LABELS SB-IMPL::HANDLE-IT :IN SB-KERNEL::OUTPUT-OBJECT)
   2: fp=0x7f765fdad338 pc=0x52b1d1a4 (FLET "PPRINT-BLOCK" :IN PPRINT-FILL)
   3: fp=0x7f765fdad3f0 pc=0x52a1b2c0 (LABELS #:BODY-NAME-3 :IN SB-PRETTY::CALL-LOGICAL-BLOCK-PRINTER)
   4: fp=0x7f765fdad4b8 pc=0x52a1adae (FLET "WITH-PRETTY-STREAM0" :IN SB-PRETTY::CALL-LOGICAL-BLOCK-PRINTER)
   5: fp=0x7f765fdad570 pc=0x52a1ab8e SB-PRETTY::CALL-LOGICAL-BLOCK-PRINTER
   6: fp=0x7f765fdad598 pc=0x52b1d0f2 PPRINT-FILL
   7: fp=0x7f765fdad600 pc=0x52c8d919 SB-PRETTY::OUTPUT-PRETTY-OBJECT
   8: fp=0x7f765fdad658 pc=0x52a24b03 (LABELS SB-IMPL::HANDLE-IT :IN SB-KERNEL::OUTPUT-OBJECT)
   9: fp=0x7f765fdad678 pc=0x52a20f18 PRIN1
  10: fp=0x7f765fdad6e0 pc=0x5336a6e0 SB-FORMAT::S-INTERPRETER
  11: fp=0x7f765fdad710 pc=0x52af7c77 SB-FORMAT::INTERPRET-DIRECTIVE-LIST
  12: fp=0x7f765fdad7a8 pc=0x52ead27d (FLET "PPRINT-BLOCK" :IN SB-FORMAT::INTERPRET-FORMAT-LOGICAL-BLOCK)
  13: fp=0x7f765fdad860 pc=0x52a1b2c0 (LABELS #:BODY-NAME-3 :IN SB-PRETTY::CALL-LOGICAL-BLOCK-PRINTER)
  14: fp=0x7f765fdad928 pc=0x52a1adae (FLET "WITH-PRETTY-STREAM0" :IN SB-PRETTY::CALL-LOGICAL-BLOCK-PRINTER)
  15: fp=0x7f765fdad9e0 pc=0x52a1ac3c SB-PRETTY::...

Read more...

Revision history for this message
Douglas Katzman (dougk) wrote :

Yeah, the root cause seems to some error which, in trying to report causes another error.
Maybe you can hack up (LABELS HUNCHENTOOT::REPORT-ERROR-TO-CLIENT :IN HUNCHENTOOT::PROCESS-REQUEST) to show the type of each argument so that you can at least see what error it might have been trying to print. That's not foolproof because if the obsolete dynamic-extent object already reached there, there's no guarantee that you're not already looking at scrambled bits. However if the arguments themselves are reasonable but merely contain some other dynamic-extent object, you might be in a better position to think about what it might have wanted to print.

Revision history for this message
JTK (jetmonk) wrote :

I reduced the issue to a simple case that depends on hunchentoot and yason here:

https://github.com/jetmonk/sbcl-web-backtrace-mem-crash

This is likely reproducible on only some SBCL versions (SBCL 2.3.1 on CentOs Linux x64 in my case).

Perhaps the issue is that the backtrace can see method calls with arguments of #<SB-IMPL::STRING-OUTPUT-STREAM {xxxxx}> which have dynamic extent, and are invalid by the time the backtrace sees them. But wouldn't the stack be intact because it is growing after the error, so these streams are still valid, and become invalid only after the stacks moves back past them? As you can tell, my understanding of these issues is not good.

Anyway, if this a real issue, I hope this might provide someone with a way to reproduce it.

Revision history for this message
Douglas Katzman (dougk) wrote :

thanks for the recipe. I see the crash. Unclear whether it is SBCL's problem or hunchentoot.

Just to be clear, the correct error should be something about no method for YASON:ENCODE ?

Revision history for this message
Douglas Katzman (dougk) wrote :

Based on the example I can say that there is nothing wrong in SBCL, hunchentoot, or yason.
It's in your code.

SBCL *is* able to get a backtrace with dynamic-extent args involved.
In fact it takes some care to try to remove dynamic-extent args from the backtrace.
It's a particular problem though when objects are constructed in the manner of
 (make-instance 'thing :args (list (make-instance 'otherthing :args (list stillmorething))))
which is kind of what's going on. There's a condition of type no-applicable-method which wraps a condition of ecase-error and there are some format args that involve streams that have gone out of scope.
All these things have fancy print methods on them, and the print methods get bogus args. The bogus args in this case are a stack-allocated string-input-stream and stack-allocated string-output-stream. It's virtually impossible to work around all manner of broken-ness when, while backtracing, if it's also trying to print unprintable objects.

At line #136 of your example, don't use WITH-OUTPUT-TO-STRING, because YASON:ENCODE encodes that string into the error that it's trying to report.
At line #186, don't use WITH-INPUT-FROM-STRING. Give both string streams indefinite extent.

Changed in sbcl:
status: New → Invalid
Revision history for this message
JTK (jetmonk) wrote : Re: [Bug 2009331] Re: Memory error on Linux when examining certain types of objects
Download full text (6.1 KiB)

Douglas, thank you very much for solving this.

Using your suggestions, I now know exactly where this happens. It is the output string-stream being
packaged into the no-applicable-method; the input stream is OK because the ecase error
doesn’t package the dynamic stream in the exception.

The lesson seems to be

“Never pass any dynamic extent object, including with-input-to-stream and with-output-to-stream streams,
to a CLOS method or to anything else that throws an exception that saves arguments,
if one expects the resulting exception object to be examined, including within a backtrace.”

My remaining questions are:

1. Do you know the safe operations that can be performed on an error object that has a dynamic-extent
    argument packaged in it? (format t “~A” ..) seems dangerous, but ~S is OK. DESCRIBE seems
    safe in at least one case (below) because it apparently catches the error accessing the defunct object.
    Is DESCRIBE guaranteed safe?

2. Does such an exception object represent a hazard in itself, just by existing, possibly when GC’ed?

I have to admit that this was an unexpected gotcha, particularly for the seemingly benign constructs of
with-input-to-stream and with-output-to-stream.

-JTK

======= Further distilled example for anyone who comes across this thread ==========================

cl-user> (defmethod meth1 ((x string)) (print x)) ;; string is only valid arg type

cl-user> (defparameter *err* ;; capture no-applicable-method error
    (nth-value 1 (ignore-errors
             (with-output-to-string (s)
                 (meth1 s))))) ;; call meth1 with a stream
  *err*

cl-user> *err*
  #<sb-pcl::no-applicable-method-error {10027D0C73}>

cl-user> (format t "~S" *err*)
#<sb-pcl::no-applicable-method-error {10027D0C73}>
nil

cl-user> (format t "~A" *err*)
There is no applicable method for the generic function
                                                    #<standard-generic-function common-lisp-user::meth1 (1)>
                                                  when called with arguments
    ***** MEMORY ERROR: Unhandled memory fault at #x7F42. ******

cl-user> (describe *err*)
#<sb-pcl::no-applicable-method-error {10027D0C73}>
  [condition]

Slots with :instance allocation:
  references = ((:ansi-cl :section (7 6 6)))
  generic-function = #<standard-generic-function common-lisp-user::meth1 (1)>
  method = nil
  args = (error printing type)
  problem = "There is no applicable method"
; No value

;; DESCRIBE seems to catch the error printing invalid ARGS.

 =================================================================================

> On Mar 7, 2023, at 09:08, Douglas Katzman <email address hidden> wrote:
>
> Based on the example I can say that there is nothing wrong in SBCL, hunchentoot, or yason.
> It's in your code.
>
> SBCL *is* able to get a backtrace with dynamic-extent args involved.
> In fact it takes some care to try to remove dynamic-extent args from the backtrace.
> It's a particular problem though when objects are constructed in the manner of
> (make-in...

Read more...

Revision history for this message
Douglas Katzman (dougk) wrote :

> The lesson seems to be [...]
right

> 1. Do you know the safe operations that can be performed on an error object that has a dynamic-extent
technically, no operation accessing the dynamic-extent value in a slot out-of-extent is safe. Printing via ~S does not call the print-object method, so it mostly works.
DESCRIBE is not guaranteed safe - everything is "best effort" in terms of detecting trashed objects during printing. People ask this all the time as in "why can't you detect clobbered objects".
Any attempt to insert code to do so would totally outweigh any gain from using dynamic-extent. i.e. automatic lifetime management/detection is tantamount to either using heap allocation in the first place or inserting unwind-protect to "mark" an object as dead. Why even try, if the goal is efficiency?

2. Does such an exception object represent a hazard in itself, just by existing, possibly when GC’ed?
Mere existence of a heap-to-stack pointer does not pose a problem for GC

Revision history for this message
JTK (jetmonk) wrote :

> On Mar 7, 2023, at 13:02, Douglas Katzman <email address hidden> wrote:
>
> People ask this all the time as in "why can't you detect clobbered objects".
> Any attempt to insert code to do so would totally outweigh any gain from using dynamic-extent. i.e. automatic lifetime management/detection is tantamount to either using heap allocation in the first place or inserting unwind-protect to "mark" an object as dead. Why even try, if the goal is efficiency?

Could one imagine bit-marking all boxed dynamic extent objects upon creation, and having an (sb-kernel:dynamic-check obj) function that would replace a dynamic object with an object like #<sb-kernel:unknown-dynamic-extent-arg>?

Then
(ERROR “The argument ~A is not a groz” foo)
  would become, internally,
(%TRUE-ERROR “The argument ~A is not a groz” (sb-kernel:dynamic-check foo))

I suppose this would be a lot of effort for a modest benefit, even if possible, and perhaps all the bit-slots are used already.

Anyway, the harsher lesson I get from this is “Never treat an exception object as examinable in a production system, because
you don’t know where it has been” but this might be mitigated by catching the memory error itself as

 (defun exception-to-string (exception)
    (or (ignore-errors (format nil "~A" exception)) ;; catch access to out of scope object
        (format nil "#<ERROR of type ~A; unexaminable>"
         (type-of exception))))

But will trying to print an out of scope object merely generate a memory access error, or can it actually cause a real memory corruption by making the printer access parts of memory it shouldn’t?
Is there an “Is this stack space?” test one could apply to an object?

Apologies for dragging this on; I’m trying to write a safe science server for a pretty big project and I want to make it crash-proof while still reporting errors clearly.

====

As an aside, it seems that SBCL does not impose the CLHS standard of dynamic extent on WITH-OPEN-FILE streams,
but it does do so for WITH-INPUT-FROM-STRING streams. I think I would have encountered this problem in a lot of places
if SBCL imposed CLHS’s dynamic-extent standard on WITH-OPEN-FILE as well:

cl-user> (macroexpand '(with-open-file (s "foo") (do-stuff)))

(let ((s (open "foo")) (#:g362 t))
  (unwind-protect (multiple-value-prog1 (progn (do-stuff)) (setq #:g362 nil))
    (when s (close s :abort #:g362))))

cl-user> (macroexpand '(with-output-to-string (s) (do-stuff)))

(let ((#:buf (make-array 31 :element-type 'character)))
  (declare (sb-c::dynamic-extent-no-note #:buf))
  (sb-int:dx-let ((#:stream (sb-impl::%allocate-string-ostream))) ;; s has dynamic extent from dx-let
    (let ((s (sb-impl::%init-string-output-stream #:stream #:buf nil)))
      (declare (ignorable s))
      (do-stuff))
    (get-output-stream-string #:stream)))

Revision history for this message
Douglas Katzman (dougk) wrote :

> Could one imagine bit-marking all boxed dynamic extent objects
No. The point is to explicitly release memory based on scope. If marked on the stack, then the marks can be clobbered just the same as the object itself. If marked on the heap, then why not just allocate the object on the heap?

> Is there an “Is this stack space?” test one could apply to an object?
stack-allocated-p. I can't see many reasonable uses in a print-object method. Surely it does not make sense to say that objects satisfying that test or containing slots satisfying the test must never be printed.

> As an aside, it seems that SBCL does not impose the CLHS standard of dynamic extent on WITH-OPEN-FILE streams,
Not a lot of savings to be had with files, as file streams allocate several KB for buffering which goes to C memory, and finalizers and other heap-allocated sub-parts.
The spec does not mandate any object to definitely be stack-allocated anyway.

Revision history for this message
JTK (jetmonk) wrote :

> On Mar 7, 2023, at 17:50, Douglas Katzman <email address hidden> wrote:
>
>> Could one imagine bit-marking all boxed dynamic extent objects
> No. The point is to explicitly release memory based on scope. If marked on the stack, then the marks can be clobbered just the same as the object itself. If marked on the heap, then why not just allocate the object on the heap?

Sorry. I meant that the marking/testing would be used only within the scope (but stack-allocated-p would work instead of marking)

Errors like

(error “the bad object is ~A” s)

   would then really be

(%real-error “the bad object is ~A”
    (if (stack-allocated-p s)
        (format nil “[stack-allocated-transient-object of type ~A]” (type-of s)) ;; or some #<unknown-stack-object>
   s))

Ie, it would be to prevent any stack allocated object from being put inside an exception object, not the impossible test for stack allocated objects outside the scope.

>
>> Is there an “Is this stack space?” test one could apply to an object?
> stack-allocated-p. I can't see many reasonable uses in a print-object method. Surely it does not make sense to say that objects satisfying that test or containing slots satisfying the test must never be printed.
>
>> As an aside, it seems that SBCL does not impose the CLHS standard of dynamic extent on WITH-OPEN-FILE streams,
> Not a lot of savings to be had with files, as file streams allocate several KB for buffering which goes to C memory, and finalizers and other heap-allocated sub-parts.
> The spec does not mandate any object to definitely be stack-allocated anyway.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.