use random sampling to compute centrality estimates

Bug #1090728 reported by Jan Katins
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
igraph
Confirmed
Wishlist
Unassigned

Bug Description

betweeness centrality is not useable for networks with >100k vertices (a |V|=200k, |E|=1.2mio network and `g.betweenness(cutoff=5)` needs already 40min!). It would be nice if igraph could use a (specified numer of) randomly selected vertices instead of all vertices (or only within a distance of <cutoff>) to estimate betweeness centrality.

CENTRALITY ESTIMATION IN LARGE NETWORKS
ULRIK BRANDES and CHRISTIAN PICH
International Journal of Bifurcation and Chaos 2007 17:07, 2303-2318
http://www.worldscientific.com/doi/abs/10.1142/S0218127407018403
http://www.inf.uni-konstanz.de/algo/publications/bp-celn-06.pdf

Revision history for this message
Gábor Csárdi (gabor.csardi) wrote :

Thanks for the suggestion.

I think that you can easily do this "by hand". Just sample a number of vertices, calculate shortest paths individually from the selected vertices, and aggregate the results. I haven't read those papers tough, so I might be missing something.

Changed in igraph:
status: New → Confirmed
importance: Undecided → Wishlist
Revision history for this message
Jan Katins (jasc) wrote :

No, it's exactly that. I will have a try with this in python but I suspect that this is slow as well. I thought it would be easy to implement in C but I saw that the used algorithm is much different than the "usual" formula, which could have been modified to compute this sampled value.

Revision history for this message
Gábor Csárdi (gabor.csardi) wrote : Re: [Bug 1090728] Re: use random sampling to compute centrality estimates

Btw. what is the correlation between the cutoff=4 and cutoff=5 betweenness
values? If it is reasonably high, then you can just go with the cutoff=5
estimates.

I don't think the random samples in Python would be very slow, but of
course it depends on how big your sample is.

On Fri, Apr 5, 2013 at 9:44 AM, Jan Schulz <email address hidden>wrote:

> No, it's exactly that. I will have a try with this in python but I
> suspect that this is slow as well. I thought it would be easy to
> implement in C but I saw that the used algorithm is much different than
> the "usual" formula, which could have been modified to compute this
> sampled value.
>
> --
> You received this bug notification because you are subscribed to igraph.
> https://bugs.launchpad.net/bugs/1090728
>
> Title:
> use random sampling to compute centrality estimates
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/igraph/+bug/1090728/+subscriptions
>

Revision history for this message
Gábor Csárdi (gabor.csardi) wrote : Continue on github

The development of igraph has moved to github, so please do not comment on this bug here. You are of course welcome to comment on github, here:
https://github.com/igraph/igraph/issues/362

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.