Memory leak in MAKE-INSTANCE 'STANDARD-CLASS
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
| SBCL |
Undecided
|
Unassigned |
Bug Description
SBCL 2.0.10 on Linux.
Discovered by Michael Fiano.
Creating anonymous standard-class instances causes a memory leak even if the instances become immediately unreachable. The below example causes a 60MB increase in dynamic space usage not reclaimable even with a full GC.
----------------
* (sb-ext:gc :full t)
NIL
* (room nil)
Dynamic space usage is: 26,948,112 bytes.
Immobile space usage is: 13,945,920 bytes (26,688 bytes overhead).
Read-only space usage is: 0 bytes.
Static space usage is: 736 bytes.
Control stack usage is: 1,952 bytes.
Binding stack usage is: 640 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.
* (dotimes (i 100000) (make-instance 'standard-class))
NIL
* (sb-ext:gc :full t)
NIL
* (room nil)
Dynamic space usage is: 84,518,336 bytes.
Immobile space usage is: 26,741,824 bytes (22,592 bytes overhead).
Read-only space usage is: 0 bytes.
Static space usage is: 736 bytes.
Control stack usage is: 1,952 bytes.
Binding stack usage is: 640 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.
Michał "phoe" Herda (phoe-krk) wrote : | #1 |
Michał "phoe" Herda (phoe-krk) wrote : | #2 |
I wondered what would break if we added weakness to the hash tables instantiated in these two places:
And built SBCL with with :WEAKNESS :KEY-AND-VALUE in both hash tables.
The change does not seem to break anything, the resulting hash tables have :WEAKNESS :KEY-AND-VALUE, but it does not fix the issue either: the classoids in question are still not purged from the system.
This is likely because the classoids (or references to them) are allocated in static space (at least this is how I interpret the results given to me by SEARCH-ROOTS) and therefore the classoids are not collectable by the GC, which means that simple hash table weakness is not enough to remove them.
Douglas Katzman (dougk) wrote : | #3 |
Two problems:
1. there a billion other things going on aside from the classoid. The anonymous class is a direct-subclass of standard-object. Are classes allowed to remove unreferenced subclasses? (Is this specified behavior? That direct subclasses are weakly referenced?)
* (let ((so (find-class'
5
6
2. it is virtually impossible to guarantee that something in the MOP has not put your object into a cache of some kind, especially as it is a metaobject. If make-instance on all metaobjects were forced to completely flush all caches in the system, maybe there would be a chance of "fixing" this issue.
Also, generally speaking:
- no new objects ever go into static space, barring some trampolines for alien-lambda
- it's very difficult to get good answers from traceroot where weakness is involved. Pretty much anything it does is going to be wrong. If you assume that weak structures may _not_ participate in any path, then it'll miss some paths. If you assume that they _may_ participate, then it'll act like falsely enlivening the object, not being truly "weak". But if some object that is a-priori alive enlivens a k/v pair in a weak table which makes the path valid, that should be fine. Except that this is beyond the scope of anything I want to solve.
Michał "phoe" Herda (phoe-krk) wrote : | #4 |
Yes, I see now; I remember that this issue was discussed previously, and the MOP does not specify whether unreferenced subclasses are allowed to be removed or whether references are strong.
I know that e.g. CCL does not suffer from this kind of memory leak, while ECL seems to copy SBCL's behavior. But this in turn might be a CCL bug because (make-instance 'standard-class) there does *NOT* seem to push the newly created class into (class-
Regarding the "generally speaking" points - thanks for the explanation, I got slightly confused and a comment from no-defun-allowed on #sbcl helped me:
22:37 < no-defun-allowed> Oh, I remember this.
22:37 < no-defun-allowed> What is happening is that the classes are being linked as subclasses of
22:38 < no-defun-allowed> https:/
22:39 < phoe> hmmmmmmm
22:40 < no-defun-allowed> So they are live unless you make sure those references aren't made.
22:40 < phoe> yes, I see
22:41 < phoe> standard-object, yes
Michał "phoe" Herda (phoe-krk) wrote : | #5 |
I think that at this point, this issue kinda looks like a request for a MOP extension.
I am currently thinking of establishing and documenting some sort of protocol for classes that want to exhibit the behavior that is wanted in this case: to be creatable en masse and fully garbage-
Such a protocol would indirectly require specifying all places that are allowed to hold a strong reference to the class metaobject. Classes that want to be transient can follow this protocol, e.g. by specializing on ADD-DIRECT-SUBCLASS and declining to CALL-NEXT-METHOD, therefore not creating a strong reference from its superclass to itself.
The implementation of this protocol would need to follow into the implementation internals, so e.g. classoids are collectable when the class that they point at becomes unreachable.
Short writeup from my point of view follows. (I'm not a SBCL wizard, so this is going to be painfully incomplete.
Benefits:
* Transient classes don't pollute the image upon being created and collected; therefore, the memory leak is plugged.
Costs:
* All related storages of metametaobjects (e.g. classoids/wrappers case of SBCL) would need to become weak, so they don't get in the way when the strong references to class objects are freed.
* New unit/regression tests.
--------------
> 2. it is virtually impossible to guarantee that something in the MOP has not put your object into a cache of some kind, especially as it is a metaobject. If make-instance on all metaobjects were forced to completely flush all caches in the system, maybe there would be a chance of "fixing" this issue.
I wonder if it should be MAKE-INSTANCE in this case; this seems expensive to flush the caches every time an instance is created.
Maybe there should be a separate function for flushing these caches, e.g. SB-PCL:
This seems feasible because it seems that class creation is usually tied to compilation and/or recompilation in the two concrete use cases I've seen (mfiano's game engine with live code reloading and pve's Smalltalk implementation in Common Lisp). This means that the warm-up costs of regenerating these caches will only need to be paid once after the flushing, and not during runtime of the game engine or generated Smalltalk programs.
Douglas Katzman (dougk) wrote : | #6 |
In addition to this, there's a question of whether DELETE-PACKAGE is allowed to cause all the defstructs and defclasses named by symbols (formerly) in that package to disappear.
Personally I'd like DELETE-PACKAGE to be a way to clean house, but it isn't. There's always the looming possibility of enumerating the descendants of T and finding all the types that have become named by uninterned symbols.
Michał "phoe" Herda (phoe-krk) wrote : | #7 |
I think I could imagine either some sort of dynamic variable that drives the behavior of DELETE-PACKAGE, so we can have:
(let ((sb-ext:
(delete-package "FOO"))
Or, maybe preferably, some separate function, such as SB-EXT:
(sb-ext:
I think this warrants a separate wishlist ticket though.
Douglas Katzman (dougk) wrote : | #8 |
I have to disagree that nuke-package is useful. The issue of object retention is entirely about the type system and nothing but that. A simple test case proves that deleted packages don't accidentally retain objects, but for the type system's hierarchy:
(make-package "TEST" :use '("CL" "SB-EXT"))
(in-package test)
(defstruct test x)
(defvar *testvar* (cons 1 2))
(defun testfun (x) x)
(deftype testtype () 'integer)
(defvar cl-user::*wps*
(list (make-weak-pointer '*testvar*) ; name of var
(in-package cl-user)
(delete-package "TEST")
(gc :full t)
* *wps*
(#<broken weak pointer> #<broken weak pointer> #<broken weak pointer>
#<broken weak pointer> #<weak pointer: #:MAKE-TEST>)
#:MAKE-TEST was retained only because type named #:TEST was retained, which was retained because it is the name of a classoid; a structure-classoid has a defstruct-
* (car(sb-
#<STRUCTURE-CLASS #:TEST>
One reason for this state of matters is because the STANDARD-CLASSOID for SLOT-OBJECT holds possibly permanent strong references to all of its subclasses.
For instance, in the below example after creating some anonymous classes, the value of SUBCLASSES in that classoid is a non-weak hash table that holds 12157 elements, therefore keeping all the classoids and their layouts live and contributing to the aforementioned memory leak.
------- ------- ------- ------
CL-USER> (let* ((slot-object-class (find-class 'sb-pcl: :slot-object) )
(slot- object- wrapper (slot-value slot-object-class 'sb-pcl::wrapper))
(slot- object- classoid (slot-value slot-object-wrapper 'sb-pcl: :classoid) ))
(swank: inspect- in-emacs slot-object- classoid) )
#<SB-KERNEL: STANDARD- CLASSOID {1000052D03}> ------- ------ STANDARD- CLASSOID. SUPERCLASSES: NIL TYPE-HASH- VALUE> :COUNT 12157 {1000250603}> :SLOT-CLASS SB-PCL: :SLOT-OBJECT>
-------
The object is a STRUCTURE-OBJECT of type SB-KERNEL:
%BITS: 1387484958
NAME: SB-PCL::SLOT-OBJECT
LAYOUT: #<SB-KERNEL:LAYOUT for SB-PCL::SLOT-OBJECT {50217103}>
STATE: NIL
DIRECT-
SOURCE-LOCATION: NIL
SUBCLASSES: #<HASH-TABLE :TEST EQ :HASH-FUNCTION #<FUNCTION SB-KERNEL:
PCL-CLASS: #<SB-PCL:
OLD-LAYOUTS: NIL