Random segmentation fault when extending a Graph

Bug #1039493 reported by Pablo Haya
This bug affects 1 person
Affects: igraph
Status: Fix Released
Importance: Medium
Assigned to: Tamás Nepusz

Bug Description

We used igraph 0.5.4 for a year, and after upgrading to version 0.6 our programs crash randomly. The following source code excerpt summarizes the error:

# A new class graph that extends Graph
class MyGraph (ig.Graph):
    def __init__(self, directed=True):
        igraph.Graph.__init__(self, directed=directed)

# Module B
g = MyGraph()
print g

(segmentation fault after print g)

The program fails or executes correctly depending on the module in which the MyGraph object is created and inspected (the last two lines). Inspecting the core dump with gdb shows the following cause:

--
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff388acb7 in igraphmodule_Graph_vertex_attributes (self=0x1faf240) at src/graphobject.c:8760
8760 return PyDict_Keys(ATTR_STRUCT_DICT(&self->g)[ATTRHASH_IDX_VERTEX]);

Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
[GCC 4.4.3] on linux2
--

We believe there is some kind of internal initialization problem inside the ig.Graph constructor; however, we have not been able to detect the error pattern.

The execution environment is:
- Linux 2.6.32-42-generic #95-Ubuntu x86_64
- Python 2.6.5
- igraph 0.6
- python-igraph 0.6

and the installation is apparently correct:
$ python
>>> import igraph.test
>>> igraph.test.run_tests()
[...]
Ran 260 tests in 6.020s
OK

Tamás Nepusz (ntamas) wrote :

I haven't checked it yet, but I was wondering whether the above code snippet would compile; e.g., is "ig" the same as "igraph"?
Also, would it be possible to install valgrind, run "valgrind --log-file=valgrind.log python test.py" (where test.py is your code snippet that reproduces the problem) and attach the output of valgrind?

Thanks in advance. I'll try to reproduce the issue myself in the meantime.

Pablo Haya (pablo-haya) wrote :

I am sorry for the typo; ig is actually igraph. The corrected code snippet is:

import igraph

# A new class that extends Graph
class MyGraph (igraph.Graph):
    def __init__(self, directed=True):
        igraph.Graph.__init__(self, directed=directed)

# Module B
g = MyGraph()
print g

I am going to run valgrind as well.

Thank you very much.

Pablo Haya (pablo-haya) wrote :

You can find attached the valgrind output. There is a reference to the segfault at the end of the file.

==7545== Process terminating with default action of signal 11 (SIGSEGV)
==7545== Access not within mapped region at address 0x8
==7545== at 0xAC9BCB7: igraphmodule_Graph_vertex_attributes (graphobject.c:8760)
==7545== by 0x4A7D97: PyEval_EvalFrameEx (in /usr/bin/python2.6)
==7545== by 0x4A9670: PyEval_EvalCodeEx (in /usr/bin/python2.6)
==7545== by 0x4A7808: PyEval_EvalFrameEx (in /usr/bin/python2.6)
==7545== by 0x4A9670: PyEval_EvalCodeEx (in /usr/bin/python2.6)
==7545== by 0x53771C: ??? (in /usr/bin/python2.6)
==7545== by 0x41F0C6: PyObject_Call (in /usr/bin/python2.6)
==7545== by 0x427DFE: ??? (in /usr/bin/python2.6)
==7545== by 0x41F0C6: PyObject_Call (in /usr/bin/python2.6)
==7545== by 0x477BFE: ??? (in /usr/bin/python2.6)
==7545== by 0x46F47E: ??? (in /usr/bin/python2.6)
==7545== by 0x41F0C6: PyObject_Call (in /usr/bin/python2.6)
==7545== If you believe this happened as a result of a stack
==7545== overflow in your program's main thread (unlikely but
==7545== possible), you can try to increase the size of the
==7545== main thread stack using the --main-stacksize= flag.
==7545== The main thread stack size used in this run was 8388608.

Tamás Nepusz (ntamas)
Changed in igraph:
assignee: nobody → Tamás Nepusz (ntamas)
Tamás Nepusz (ntamas) wrote :

Okay, this is weird. I can reproduce your issue on almost the same config (Python 2.7.3 instead of 2.6.x) with the source code example you provided _only_ if I do not call igraph.Graph.__init__(self, directed=directed) in the constructor. If this call is not there, I get almost exactly the same valgrind output as you, with exactly the same stack trace at the end. If this call is there, everything works fine.

Can you please attach a single, self-contained .py file that reproduces the problem on your machine?

Pablo Haya (pablo-haya) wrote :

Hi Tamás

After several trials and some debugging we have managed to isolate the bug. As you can see in the attached script, the segfault appears when you use a thread to create an instance of an igraph.Graph object. The inheritance mechanism is not the source of the error, as we assumed at the beginning.

We have tested the same script using igraph 0.5.4 and it works.
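
The attached script is not reproduced in this thread. A minimal sketch of the pattern described above (creating and printing a graph inside a worker thread) might look like the following. The helper name `build_in_thread` is hypothetical, the code is written for Python 3 rather than the Python 2.6 used in the report, and the `igraph` import is optional so that the sketch also runs with a stand-in factory.

```python
import threading

try:
    import igraph  # python-igraph; optional for this sketch
except ImportError:
    igraph = None


def build_in_thread(factory):
    """Create an object in a worker thread and return its string form.

    `build_in_thread` is a hypothetical helper, not part of igraph.
    """
    result = {}

    def worker():
        obj = factory()
        # str(obj) is the operation that `print g` performs on the object.
        result["text"] = str(obj)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result.get("text")


# Stand-in factory so the sketch runs without igraph installed.
print(build_in_thread(dict))

if igraph is not None:
    # This is the kind of call that the report associates with the crash
    # on the affected 0.6 build.
    print(build_in_thread(igraph.Graph))
```

A segmentation fault terminates the interpreter, so it cannot be caught with `try`/`except`; a sketch like this can only show whether the process survives.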

Thank you very much for your time.

Regards

Pablo Haya (pablo-haya) wrote :

Here you can find attached the valgrind output

Tamás Nepusz (ntamas) wrote :

Thanks, I can reproduce the bug now. I'm marking this bug as confirmed for the time being and will investigate the issue more closely. You should know that the C core of igraph is not thread-safe, so it could theoretically be the case that you are running into issues because of this (however, I also naively assumed that running igraph functions should be fine in a separate thread as long as it is the only thread that is using igraph).

Anyway, I'm gonna take a look at it and then either fix the issue (if it is not related to igraph's non-threadsafe behaviour) or mark it as invalid.

Finally, note that the C core of igraph can be compiled in thread-safe mode by adding --enable-tls to the switches of the configure script when compiling it. The Debian/Ubuntu packages are not thread-safe by default because they depend on ARPACK, and ARPACK itself is not required to be thread-safe. However, if you feel adventurous, you might try compiling igraph from source using --enable-tls and check if the problem persists or not.

Changed in igraph:
status: New → Confirmed
Tamás Nepusz (ntamas)
Changed in igraph:
importance: Undecided → Medium
milestone: none → 0.6.1
Tamás Nepusz (ntamas) wrote :

Seems to be platform-specific; I could not reproduce it with bug-1039493.py on my Mac but it crashes on Ubuntu Linux. Will investigate further.

Pablo Haya (pablo-haya) wrote :

Thanks Tamás. We have compiled igraph using --enable-tls and the problem persists.

We know that igraph is not thread-safe, so our software deals with concurrent access when necessary. As you pointed out (and we agree), an isolated use of igraph in a separate thread should work even if the library is not thread-safe. As far as we know, a library that is not thread-safe does not guarantee that your code works properly when multiple threads manipulate shared data at the same time. But you can still use threads safely if your code avoids dangerous concurrent situations, right? This is the case for the script, isn't it?

Since it seems to be platform-specific, we are wondering whether the Ubuntu version (and/or the installed packages) could have an effect. We are using Ubuntu 10.04 LTS.
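
One common way of "dealing with concurrent access" to a library that is not thread-safe is to funnel every call through a single lock. The sketch below is only an illustration of that idea and is not taken from the reporter's software; `call_serialized`, `_library_lock` and the counter are hypothetical stand-ins.

```python
import threading

# Hypothetical process-wide lock guarding every call into a library
# that is not thread-safe.
_library_lock = threading.Lock()


def call_serialized(func, *args, **kwargs):
    """Run func while holding the lock, so guarded calls never overlap."""
    with _library_lock:
        return func(*args, **kwargs)


# Stand-in for shared state inside a non-thread-safe library.
counter = {"value": 0}


def unsafe_increment():
    # Read-modify-write: not safe if two threads interleave here.
    current = counter["value"]
    counter["value"] = current + 1


def worker(repeats):
    for _ in range(repeats):
        call_serialized(unsafe_increment)


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])
```

Serializing calls in this way only addresses overlapping access to shared data; it says nothing about state that a library keeps separately for each thread.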

Tamás Nepusz (ntamas) wrote :

Okay. I was on the right track with --enable-tls, but it works quite the opposite way; you must compile igraph with --disable-tls instead of with --enable-tls to make it work. Let me elaborate on it a little bit.

In igraph, graph/vertex/edge attribute handling is detached from the C core via an attribute handler interface. The actual attribute handling functions are implemented differently in the Python interface than in the R interface (or in the C interface for that matter), and the igraph library itself contains a pointer to the set of attribute handler routines that are currently active. The problem is that we have accidentally made this pointer thread-local during the transition from igraph 0.5.x to igraph 0.6 - which means that every thread now has its own local attribute handler table. When you start a new thread from Python, the graphs created there will not have access to the Python-specific attribute handler routines, and that's why the program crashes. I was able to fix it by recompiling the C core of igraph with --disable-tls instead of --enable-tls. Of course this will be fixed in the next release of igraph.
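
The effect of a thread-local pointer can be imitated in pure Python. The sketch below is not igraph code: `_thread_local_slot`, `_process_wide_slot` and the helper functions are hypothetical stand-ins for the C core's pointer to the active attribute handler table, and it assumes the table is installed once, from the main thread.

```python
import threading

# Hypothetical stand-ins for the pointer to the active attribute
# handler table; neither name exists in igraph.
_thread_local_slot = threading.local()  # one value per thread
_process_wide_slot = {"table": None}    # one value shared by all threads


def install(table):
    """Register the handler table in both kinds of storage."""
    _thread_local_slot.table = table
    _process_wide_slot["table"] = table


def lookup_thread_local():
    # A thread that never called install() has no 'table' attribute.
    return getattr(_thread_local_slot, "table", None)


def lookup_process_wide():
    return _process_wide_slot["table"]


def run_in_new_thread(func):
    """Call func in a fresh thread and return its result."""
    box = []
    thread = threading.Thread(target=lambda: box.append(func()))
    thread.start()
    thread.join()
    return box[0]


# Assumption: installation happens once, in the main thread.
install({"name": "python-handlers"})

print("main thread, thread-local:", lookup_thread_local())
print("new thread, thread-local: ", run_in_new_thread(lookup_thread_local))
print("new thread, process-wide: ", run_in_new_thread(lookup_process_wide))
```

In this sketch the new thread finds no table in thread-local storage but does find the process-wide one, which is roughly analogous to a graph created in a new thread having no Python-specific attribute handlers.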

Thanks for the bug report once again!

Tamás Nepusz (ntamas) wrote :

The detailed bug report of the above issue is now here:

https://bugs.launchpad.net/igraph/+bug/1042404

Feel free to follow that bug if you'd like to be up-to-date.

Pablo Haya (pablo-haya) wrote :

Thank you very much for your quick response, Tamás. We have compiled using --disable-tls, and it works. Also, thanks for the clear explanation of the bug. We will follow the new bug report. We hope that this bug will not cause you too much trouble.

Tamás Nepusz (ntamas) wrote :

Attribute handlers are not thread-local any more so this should be resolved in the next release.

Changed in igraph:
status: Confirmed → Fix Committed
Changed in igraph:
status: Fix Committed → Fix Released
Gábor Csárdi (gabor.csardi) wrote : Continue on github

The development of igraph has moved to github, so please do not comment on this bug here. You are of course welcome to comment on github, here:
https://github.com/igraph/igraph/issues/232
