Comment 15 for bug 404256

Revision history for this message
Tamás Nepusz (ntamas) wrote :

I have a feeling that you get failures when the graph contains isolated vertices (or maybe small isolated components, though I'm not sure about the latter). An isolated vertex results in a zero eigenvalue, and its corresponding eigenvector will contain zeroes only except for the single component that corresponds to the isolated vertex itself. I believe that eigenvalues close to zero take a long time to converge using ARPACK, so when the splitting process reaches the stage when the zero eigenvalue becomes the largest, ARPACK will sometimes exceed the number of iterations allowed. You can easily set a larger limit, though:

>>> igraph.arpack_options.mxiter = 100000

This usually works for me. The default is only 3000, and it was chosen quite arbitrarily, so maybe we should simply increase it in the next release. (Note that the leading eigenvector method uses multiple ARPACK invocations, so checking igraph.arpack_options.iter after a successful run of Graph.community_leading_eigenvector() will tell you the number of iterations for the last invocation only, which can easily be lower than the peak).

I think an even more sophisticated solution that we should consider is to add a pre-processing step before invoking ARPACK that decomposes the original graph to connected components, calls the community detection method on each of the components and merges the results in the end.

On the other error message related to NEV and NCV: yes, a single Arnoldi iteration takes O(n*NCV*NCV) time, that's why the algorithm is slower when NCV is larger, and that's why we try to keep NCV low. I'm not sure what we should do with this issue; I'm reluctant to raise NCV any further as a lower value is satisfactory for most problem instances. It might also happen that removing isolated vertices as a pre-processing step also resolves the NCV issue.