igraph

use random sampling to compute centrality estimates

Bug #1090728 reported by Jan Katins on 2012-12-15

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	igraph	Confirmed	Wishlist	Unassigned

Bug Description

betweeness centrality is not useable for networks with >100k vertices (a |V|=200k, |E|=1.2mio network and `g.betweenness(cutoff=5)` needs already 40min!). It would be nice if igraph could use a (specified numer of) randomly selected vertices instead of all vertices (or only within a distance of <cutoff>) to estimate betweeness centrality.

CENTRALITY ESTIMATION IN LARGE NETWORKS
ULRIK BRANDES and CHRISTIAN PICH
International Journal of Bifurcation and Chaos 2007 17:07, 2303-2318
http://www.worldscientific.com/doi/abs/10.1142/S0218127407018403
http://www.inf.uni-konstanz.de/algo/publications/bp-celn-06.pdf

Revision history for this message

Gábor Csárdi (gabor.csardi) wrote on 2012-12-15:

Thanks for the suggestion.

I think that you can easily do this "by hand". Just sample a number of vertices, calculate shortest paths individually from the selected vertices, and aggregate the results. I haven't read those papers tough, so I might be missing something.

Changed in igraph:
status:	New → Confirmed
importance:	Undecided → Wishlist

Revision history for this message

Jan Katins (jasc) wrote on 2013-04-05:

No, it's exactly that. I will have a try with this in python but I suspect that this is slow as well. I thought it would be easy to implement in C but I saw that the used algorithm is much different than the "usual" formula, which could have been modified to compute this sampled value.

Revision history for this message

Gábor Csárdi (gabor.csardi) wrote on 2013-04-06: Re: [Bug 1090728] Re: use random sampling to compute centrality estimates

Btw. what is the correlation between the cutoff=4 and cutoff=5 betweenness
values? If it is reasonably high, then you can just go with the cutoff=5
estimates.

I don't think the random samples in Python would be very slow, but of
course it depends on how big your sample is.

On Fri, Apr 5, 2013 at 9:44 AM, Jan Schulz <email address hidden>wrote:

> No, it's exactly that. I will have a try with this in python but I
> suspect that this is slow as well. I thought it would be easy to
> implement in C but I saw that the used algorithm is much different than
> the "usual" formula, which could have been modified to compute this
> sampled value.
>
> --
> You received this bug notification because you are subscribed to igraph.
> https://bugs.launchpad.net/bugs/1090728
>
> Title:
> use random sampling to compute centrality estimates
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/igraph/+bug/1090728/+subscriptions
>

Revision history for this message

Gábor Csárdi (gabor.csardi) wrote on 2013-05-06: Continue on github

The development of igraph has moved to github, so please do not comment on this bug here. You are of course welcome to comment on github, here:
https://github.com/igraph/igraph/issues/362

Report a bug

This report contains Public information

Everyone can see this information.

You are

Subscribing...

Edit bug mail

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.