personalized pagerank gives weird results

Bug #1143524 reported by Jiang Bian
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
igraph
New
High
Unassigned

Bug Description

The personalized_pagerank algo gives weird results ... the "[772006261183950.8, 39828382229.333336, 23396778274.272728, 21049547303.39394, 14827224986.848484]" is because of overflow?

INFO: [0.5405405405405475, 0.4594594594594663, -4.331923775621389e-16, -4.331923775621389e-16, -4.331923775621389e-16]
INFO: [772006261183950.8, 39828382229.333336, 23396778274.272728, 21049547303.39394, 14827224986.848484]
INFO: [1415344813215411.8, 72489982570.05556, 32271902572.833332, 27963409220.166668, 27618104896.555557]

I am using python, and here is my code...

adj = [[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]]
g = igraph.Graph.Adjacency(adj, mode=igraph.ADJ_UNDIRECTED)

for node in g.vs:
     personalized_pagerank_scores = sorted(g.personalized_pagerank(directed=False, damping=0.85, reset_vertices=node), reverse=True)
    logger.info(personalized_pagerank_scores[0:5])

Revision history for this message
Tamás Nepusz (ntamas) wrote : Re: [Bug 1143524] [NEW] personalized pagerank gives weird results

> The personalized_pagerank algo gives weird results ... the
> "[772006261183950.8, 39828382229.333336, 23396778274.272728,
> 21049547303.39394, 14827224986.848484]" is because of overflow?
No, it's probably because of occasional failure of convergence on
disconnected graphs. We are working on resolving this. The easiest
workaround at the moment is to ensure that your graph is connected; for
instance, you can extract the subgraph that contains your seed vertex, and
calculate the personalized PageRank on this subgraph only.

--
T.

Revision history for this message
Jiang Bian (jbian) wrote :

Oh, ok. Thanks!

So, igraph's personalized pagerank implementation doesn't deal with dangling nodes to make sure the transition matrix is irreducible, and guaranteed to converge?

I did my own implementation in C++ following these papers...

http://research.taherh.org/pubs/extrapolationII.pdf

http://kamvar.org/assets/papers/comparison.pdf

http://www.emba.uvm.edu/~tlakoba/AppliedUGMath/DeeperInsidePageRank.pdf

Revision history for this message
Tamás Nepusz (ntamas) wrote :

It does deal with dangling nodes and makes sure that the transition matrix is irreducible, but still the ARPACK eigenvector solver fails to converge every now and then and it's pretty hard to figure out what's going on because ARPACK is a black box for us (igraph just multiplies matrices whenever ARPACK asks igraph to do so). We plan to switch to PRPACK (see http://github.com/DavidKurokawa/prpack) soon and this will hopefully resolve the problems with personalized PageRank.

Changed in igraph:
importance: Undecided → High
milestone: none → 0.7
Revision history for this message
Jiang Bian (jbian) wrote :

Thanks for the information. I thought it was a simple power method implementation.

PRPack does seems to be very nice and comprehensive.

Revision history for this message
Gábor Csárdi (gabor.csardi) wrote : Continue on github

The development of igraph has moved to github, so please do not comment on this bug here. You are of course welcome to comment on github, here:
https://github.com/igraph/igraph/issues/134

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.