store._dirty can hide large temporary leaks

Bug #430359 reported by Robert Collins
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Storm
New
Undecided
Unassigned

Bug Description

I'm writing this after debugging why a storm using script was using 3GB of memory, when the entire script logic was built around iterators. This primarily affects large data imports, because find/get/execute can all cause a flush(), but until a flush() is caused, objects which the user has no references to are held in memory.

store._dirty acts like a write-cache: it holds objects in the cache and then pushes them out to the backend.

I think it would make storm's behaviour more clear if this write cache was made explicitly into a cache, much as the read cache is an explicit cache which can have its policy controlled.

Creating a WriteCache object which had:
 - add
 - flush
 - clear
 - block_implicit_flush
 - allow_implicit_flush

methods would move all the related logic for the _dirty dict into a clear self contained object.

This could then be parameterised:
 - set a maximum size (with -1 meaning no cap)
 - set a maximum time

(and so on, if/when a user chooses to).

*even if* noone chooses to use it, it would make the behaviour more clear (that storm accumulates objects until explicitly told to flush, or find/get/execute are run).

I hope this makes sense.

James Henstridge says that a default policy that had a size cap could cause hard to debug issues for users, particularly when making a new object that doesn't meet database constraints, if an implicit flush happened at the wrong time. I think this is a good reason to have the default be the current behaviour. That said, find/get/execute can all cause such bugs already; being able to run with a write cache size of 0 would let developers find out about code that adds invalid objects to the store immediately :)

Not being clear and upfront about the cache also leads to hard to debug issues for users - thats what I experienced.

description: updated
Revision history for this message
Giovanni Pellerano (evilaliv3) wrote :

I think this is a serious flaw that affects every user.

In my understanding it is related only to store.dirty but to the store._cache.

It seems to me that the cache has some circular reference to the store to which is connected.
This causes one keep the other alive and without a store.invalidate() that trigger the internal code store._cache.clear() each allocated store remains allocated undefinitely.

(bug spotted while working on the GlobaLeaks project)

This bug mayve affects security of the softwares using Storm() (launchpad?) were a dos attack easily result in a memory exhaustion.

Revision history for this message
Giovanni Pellerano (evilaliv3) wrote :

The bug could be related to: https://bugs.launchpad.net/storm/+bug/250819

In fact in our use case we have make use also of ReferenceSets.

William Grant (wgrant)
no longer affects: launchpad
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.