saving a project uses too much memory for large graphs

Bug #672071 reported by Guilhem on 2010-11-07
This bug affects 1 person
Affects: Gephi
Status: Fix Released
Importance: Critical
Assigned to: Mathieu Bastian

Bug Description

I have a graph with 20k nodes and 500k edges, laid out with OpenOrd, in a project with 4 workspaces: the first holds the complete graph and the other three hold three different subgraphs extracted from the main one. When I try to save the project in the Gephi format, the progress bar stays stuck, memory usage is about 7 GB (out of 8 GB on my computer), and CPU usage is around 50%.

Windows 7, 64-bit system.

tags: added: project
Changed in gephi:
milestone: none → 0.7beta
importance: Undecided → High
summary: - problem with gephi graph format save
+ saving a project uses too much memory for large graphs

Yes, the saving system uses DOM, which consumes too much memory. A specification to improve this exists: http://wiki.gephi.org/index.php/Core_evolution_-_Project_serialization

Changed in gephi:
status: New → Confirmed
Changed in gephi:
importance: High → Critical
Changed in gephi:
status: Confirmed → In Progress
assignee: nobody → Mathieu Bastian (mathieu.bastian)

The new StAX-based serialization is now implemented in rev 2108. All classes implementing WorkspacePersistenceProvider have been rewritten to use StAX instead of DOM. StAX is much more memory efficient and scales well.
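
For readers unfamiliar with the difference, here is a minimal sketch of streaming output with the standard javax.xml.stream API. It is not the Gephi code from rev 2108: the class name, the "edges"/"edge" element names and the int[][] edge representation are all assumptions made for illustration. The point is that each element is written to the output as soon as it is visited, so no in-memory document tree is built the way DOM requires.

```java
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch, not a Gephi class.
public class StaxEdgeWriterSketch {

    // Streams each {source, target} pair straight to the writer.
    // Element and attribute names are illustrative assumptions.
    static void writeEdges(Writer out, int[][] edges) throws XMLStreamException {
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        try {
            w.writeStartDocument();
            w.writeStartElement("edges");
            for (int[] e : edges) {
                // One self-closing element per edge; nothing is retained afterwards.
                w.writeEmptyElement("edge");
                w.writeAttribute("source", Integer.toString(e[0]));
                w.writeAttribute("target", Integer.toString(e[1]));
            }
            w.writeEndElement();
            w.writeEndDocument();
        } finally {
            w.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        StringWriter out = new StringWriter();
        writeEdges(out, new int[][] {{0, 1}, {1, 2}});
        System.out.println(out);
    }
}
```

In a real save, the Writer would wrap a file or zip stream rather than a StringWriter, so the memory held by the serializer stays roughly constant regardless of the number of edges.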

More testing is needed to check for bugs and to verify compatibility.

I ran a small benchmark on a network with 20K nodes and 50K edges:

Open
-------
DOM: 10 seconds using 1.37 GB memory
StAX: 6 seconds using 399 MB memory

Save
------
DOM: 12 seconds using 900 MB memory
StAX: 37 seconds using 437 MB memory
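
The lower memory use on Open comes from reading the file as a stream of events instead of loading a whole tree. A minimal sketch of that pull-parsing style with the standard XMLStreamReader follows; the class name and the "edge" element name are assumptions for illustration, not the actual Gephi project format or code.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, not a Gephi class.
public class StaxEdgeCountSketch {

    // Counts <edge> elements by pulling parser events one at a time.
    // Only the current event is held in memory, never the full document.
    static int countEdges(Reader in) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        int count = 0;
        try {
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "edge".equals(r.getLocalName())) {
                    count++;
                }
            }
        } finally {
            r.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<edges><edge source=\"0\" target=\"1\"/>"
                   + "<edge source=\"1\" target=\"2\"/></edges>";
        System.out.println(countEdges(new StringReader(xml)));
    }
}
```

A real loader would build graph objects at each START_ELEMENT instead of counting, but the event loop has the same shape.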

Changed in gephi:
status: In Progress → Fix Committed
Changed in gephi:
status: Fix Committed → Fix Released