Forest Fire game has crucial error
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
igraph |
Fix Released
|
Medium
|
Gábor Csárdi |
Bug Description
The current implementation of the Forest Fire network generation game has a subtle bug that results in graphs that are *significantly* denser than what the original research paper proposed. I found it impossible to reproduce the results in the research paper [1], so I compared the igraph implementation with the one in the SNAP network library (which contains a version written presumably by the author of [1]).
The bug is as follows: in the main game loop, although the original paper says that the two generated numbers x and y are geometrically distributed, they are actually geometrically distributed MINUS ONE. In other words, the SNAP implementation allows 0 to be chosen for the number of links to burn at *each iteration* with the highest probability, rather than 1. This means that fires burn *much* longer than they are intended to, and the resultant graphs are much denser than intended.
I have verified that the simple "-1" factor generates graphs with the same quantitative trends as in the paper [1].
The patch is as follows to the trunk of igraph-0.6: lines 199 and 200 of src/forestfire.c should be changed from:
long int neis_out=
long int neis_in=
to:
long int neis_out=
long int neis_in=
For verification, the corresponding lines in the SNAP graph library source are lines 110 and 128 in snap/ff.cpp
References:
[1] "Graphs over time: densification laws, shrinking diameters and possible explanations", Leskovec et al, TKDD 2007.
Also:
The igraph implementation appears to be using a Poisson approximation to a binomial distribution instead of a geometric variate. Any reason for doing this instead of generating a random geometric variate directly?
Related branches
description: | updated |
Thanks for pointing this out. At the time the paper came out, SNAP did not exist (or at least I didn't know about it) and I remember that I had could not manage to replicate the exact results, but since the algorithm was correctly implemented, as in the paper, and the results were qualitatively more or less OK (for slightly different parameters), there wasn't much I could do.
Actually the algorithm is not even the same in the different versions of the paper, see these: cs.stanford. edu/people/ jure/pubs/ powergrowth- kdd05.pdf www.cs. cornell. edu/home/ kleinber/ kdd05-time. pdf
http://
http://
and I remember a third version that I cannot find now. The version I implemented gave more or less OK results.
I'll correct this, and add an argument that chooses between the two versions.
As for the Poisson approximation, the code is coming from the R internals and based on this book:
Devroye, L. (1986) Non-Uniform Random Variate Generation. Springer-Verlag, New York. Page 480.