using the gcc-4.7.0 prerelease as packaged by Fedora Rawhide, there is a segfault in the program that results from compiling sha512-hash.c

Bug #931542 reported by Zooko Wilcox-O'Hearn
This bug affects 1 person
Affects              Status        Importance  Assigned to  Milestone
gcc                  Invalid       Medium
pycryptopp           Fix Released  Unknown
pycryptopp (Fedora)  Invalid       High
pycryptopp (Ubuntu)  New           Undecided   Unassigned

Bug Description

This doesn't *actually* affect Ubuntu yet, at least not until Ubuntu upgrades to gcc 4.7.0 and only if this bug is still present by then.

Revision history for this message
In , Zooko Wilcox-O'Hearn (zooko) wrote :

Using this version of gcc-4.7.0 prerelease which is shipped in Fedora Rawhide: 4.7.0 20120126 (Red Hat 4.7.0-0.10), the resulting code segfaults. Valgrind reports this:

==9709== Invalid read of size 1
==9709== at 0xB4FEE70: crypto_hash_sha512 (sha512-hash.c:40)
==9709== by 0xB4F8F23: crypto_sign_publickey (ed25519.c:30)
==9709== by 0xB4F7EEB: ed25519_publickey (ed25519module.c:48)
==9709== by 0x4F0A153: PyEval_EvalFrameEx (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4F0B7C0: PyEval_EvalCodeEx (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E9C2BA: ??? (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E78A1D: PyObject_Call (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E86EEF: ??? (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E78A1D: PyObject_Call (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4ECBC41: ??? (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4ECB8DB: ??? (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E78A1D: PyObject_Call (in /usr/lib64/libpython2.7.so.1.0)
==9709== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==9709==
{
   <insert_a_suppression_name_here>
   Memcheck:Addr1
   fun:crypto_hash_sha512
   fun:crypto_sign_publickey
   fun:ed25519_publickey
   fun:PyEval_EvalFrameEx
   fun:PyEval_EvalCodeEx
   obj:/usr/lib64/libpython2.7.so.1.0
   fun:PyObject_Call
   obj:/usr/lib64/libpython2.7.so.1.0
   fun:PyObject_Call
   obj:/usr/lib64/libpython2.7.so.1.0
   obj:/usr/lib64/libpython2.7.so.1.0
   fun:PyObject_Call
}
==9709==
==9709== Process terminating with default action of signal 11 (SIGSEGV)
==9709== Access not within mapped region at address 0x0
==9709== at 0xB4FEE70: crypto_hash_sha512 (sha512-hash.c:40)
==9709== by 0xB4F8F23: crypto_sign_publickey (ed25519.c:30)
==9709== by 0xB4F7EEB: ed25519_publickey (ed25519module.c:48)
==9709== by 0x4F0A153: PyEval_EvalFrameEx (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4F0B7C0: PyEval_EvalCodeEx (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E9C2BA: ??? (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E78A1D: PyObject_Call (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E86EEF: ??? (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E78A1D: PyObject_Call (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4ECBC41: ??? (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4ECB8DB: ??? (in /usr/lib64/libpython2.7.so.1.0)
==9709== by 0x4E78A1D: PyObject_Call (in /usr/lib64/libpython2.7.so.1.0)

More detail, including access to a buildbot which can reliably reproduce the problem on Fedora Rawhide and demonstrate no such problem on several other platforms, is here: https://tahoe-lafs.org/trac/pycryptopp/ticket/80

Revision history for this message
In , Zooko Wilcox-O'Hearn (zooko) wrote :

The host is linux x86_64. More information about the platform is queried automatically by the buildbot and archived here: https://tahoe-lafs.org/buildbot-pycryptopp/builders/Ruben%20Fedora%20syslib/builds/44/steps/show-tool-versions/logs/stdio

platform: Linux-3.2.1-8.fc17.x86_64-x86_64-with-fedora-17-Rawhide
machine: x86_64
linux_distribution: ('Fedora', '17', 'Rawhide')

Revision history for this message
In , Zooko (zooko-redhat-bugs) wrote :

This program (pycryptopp) segfaults when compiled on Rawhide, but not on several other platforms. I've reported the issue on the gcc tracker (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52236). The issue was originally noticed in the pycryptopp project (https://tahoe-lafs.org/trac/pycryptopp/ticket/80). The pycryptopp buildbot reliably reproduces the segfault on Rawhide and the lack of segfault on several other systems.

no longer affects: pycryptopp
Changed in gcc:
importance: Unknown → Medium
status: Unknown → New
Revision history for this message
In , Zooko Wilcox-O'Hearn (zooko) wrote :

I opened a ticket on launchpad.net with which to track the progress of this issue across multiple other issue trackers: pycryptopp, GCC, Fedora, and possibly DJB's "nacl" crypto library if there is any way to track such issues other than emailing the author. https://bugs.launchpad.net/pycryptopp/+bug/931542

Revision history for this message
In , Zooko (zooko-redhat-bugs) wrote :

I opened a ticket on launchpad.net with which to track the progress of this issue across multiple other issue trackers: pycryptopp, GCC, Fedora, and possibly DJB's "nacl" crypto library if there is any way to track such issues other than emailing the author. https://bugs.launchpad.net/pycryptopp/+bug/931542

Revision history for this message
In , Jakub-gcc (jakub-gcc) wrote :

Please read http://gcc.gnu.org/bugs.html. You should provide a self-contained and, if possible, small testcase; it could very well be a bug in the application you are using. If you suspect a gcc bug, you can use either a debugger or brute force, e.g. a binary search between objects compiled with various compilation flags or various versions of the compiler (-O0 vs. standard flags, or standard flags + -fno-strict-aliasing, etc.).
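
The binary search over objects suggested here can be sketched roughly as follows. This is only an illustration, not something taken from the thread: find_culprit and the fails callback are hypothetical names, and the sketch assumes that exactly one source file is responsible. The callback would rebuild the project with the listed files compiled with the suspect flags (and the rest with -O0), run the failing test, and report whether it still crashes.

```python
from typing import Callable, List, Sequence


def find_culprit(units: Sequence[str], fails: Callable[[List[str]], bool]) -> str:
    """Bisect a list of source files down to the one that triggers a failure.

    `fails(subset)` is a hypothetical callback: it should rebuild with only
    the files in `subset` compiled with the suspect flags, run the test, and
    return True if the failure still occurs.
    Assumption: exactly one file is responsible.
    """
    candidates = list(units)
    if not fails(candidates):
        raise ValueError("failure does not reproduce with all units selected")
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if fails(first_half):
            # The culprit is among the first half.
            candidates = first_half
        else:
            # Under the single-culprit assumption it must be in the rest.
            candidates = candidates[middle:]
    return candidates[0]


if __name__ == "__main__":
    # Stand-in callback for demonstration; a real one would rebuild and rerun.
    files = ["ed25519module.c", "sha512-hash.c", "ed25519.c"]
    print(find_culprit(files, lambda subset: "sha512-hash.c" in subset))
```

Each call to the callback costs a rebuild and a test run, so the number of rebuilds grows with the logarithm of the number of files rather than linearly. The search only narrows down which file is sensitive to the flags; it does not by itself say whether the fault lies in the compiler or in the application.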

Changed in pycryptopp:
status: Unknown → New
description: updated
Revision history for this message
In , Jakub (jakub-redhat-bugs) wrote :

You've filed lots of bug reports, but haven't provided easy steps to reproduce: what exactly fails when built with gcc 4.7, and what (if possible minimal) command reproduces it.

Revision history for this message
In , Zooko (zooko-redhat-bugs) wrote :

By the way, I don't know for sure that this is a bug in gcc-4.7.0-prerelease. It could also be a bug in our code which is uncovered by a recent change in gcc, for example. I do know that the same segfault doesn't happen on the other buildbots, none of which have gcc >= 4.7.0-prerelease.

I can explain how to reproduce it, but I can't conveniently generate a minimal case, since I don't currently have access to a system (e.g. Rawhide) with gcc-4.7.0-prerelease on it.

To reproduce (not minimally):

git clone git://github.com/tahoe-lafs/pycryptopp.git
cd pycryptopp
git reset --hard 36a666d4514e21a71c934bcfc62438b8bab97f32

# Note that an equivalent git checkout is done automatically by the buildbot. The exact command-line, environment, and stdout+stderr are logged, e.g. here: https://tahoe-lafs.org/buildbot-pycryptopp/builders/Ruben%20Fedora/builds/49/steps/git/logs/stdio

python setup.py build --test-double-load

# This builds the C and C++ modules. I would assume that the option --test-double-load to the command-line is irrelevant to this bug (it causes another module to be built which isn't used by the program which segfaults). Here's a log of the buildbot executing this step and the resulting output: https://tahoe-lafs.org/buildbot-pycryptopp/builders/Ruben%20Fedora/builds/49/steps/compile/logs/stdio

valgrind --error-exitcode=1 --log-file=valgrind.log.txt --suppressions=misc/coding_helpers/python.supp --gen-suppressions=all python setup.py test

# This runs the pycryptopp unit tests, which trigger the segfault. Here's a log of the command-line, env var, and stdout+stderr of buildbot executing this step: https://tahoe-lafs.org/buildbot-pycryptopp/builders/Ruben%20Fedora/builds/49/steps/test%20valgrind/logs/stdio
# Here's the valgrind log file that results from that command-line as executed by the buildbot: https://tahoe-lafs.org/buildbot-pycryptopp/builders/Ruben%20Fedora/builds/49/steps/test%20valgrind/logs/valgrind

# By the way, here is some information about the system on which that buildbot test runs:
# https://tahoe-lafs.org/buildbot-pycryptopp/buildslaves/buildbot.rubenkerkhof.com
# https://tahoe-lafs.org/buildbot-pycryptopp/builders/Ruben%20Fedora/builds/49/steps/show-tool-versions/logs/stdio

If I were going to minimize this, I would start by looking at the Python unit test which reliably triggers the bug: pycryptopp.test.test_ed25519.Basic.test_OOP . First I would run that test alone, without the other tests, by changing "python setup.py test" to "python setup.py test -s pycryptopp.test.test_ed25519.Basic.test_OOP", just to be sure that running the other tests doesn't change the behavior. If it doesn't, then I would add some debug print statements to the source code of test_OOP (https://github.com/tahoe-lafs/pycryptopp/blob/36a666d4514e21a71c934bcfc62438b8bab97f32/pycryptopp/test/test_ed25519.py#L124) to see which of its calls to ed25519, and with what arguments, triggers the segfault. Then I would write equivalent C code and see if that triggers the segfault.
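
As a rough illustration of the first step (running one test on its own and checking whether it crashes), here is a minimal sketch that runs a command in a child process and reports whether it was killed by a signal. The helper name run_isolated is hypothetical and not part of pycryptopp. The sketch assumes Python 3 on a POSIX system, where subprocess reports death by signal N as a return code of -N.

```python
import signal
import subprocess
import sys


def run_isolated(argv):
    """Run argv in a child process and return (returncode, signal_name).

    signal_name is None unless the child was terminated by a signal
    (POSIX only: subprocess then reports a negative return code).
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if proc.returncode < 0:
        return proc.returncode, signal.Signals(-proc.returncode).name
    return proc.returncode, None


if __name__ == "__main__":
    # Stand-in child that deliberately sends itself SIGSEGV, for demonstration.
    crasher = [
        sys.executable,
        "-c",
        "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)",
    ]
    print(run_isolated(crasher))
```

To apply this to the case above, the command passed in would be the single-test invocation already quoted, e.g. ["python", "setup.py", "test", "-s", "pycryptopp.test.test_ed25519.Basic.test_OOP"], run from the pycryptopp checkout. Keeping the crash in a child process means the driver script survives the segfault and can go on to try other tests or argument combinations.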

Revision history for this message
Zooko Wilcox-O'Hearn (zooko) wrote :

Thanks to Samuel Neves, the bug was identified in pycryptopp. See the pycryptopp ticket for details.

Revision history for this message
In , Zooko Wilcox-O'Hearn (zooko) wrote :

Thanks to Samuel Neves, the bug was identified in the application (pycryptopp), not in gcc. Thanks!

Revision history for this message
In , Zooko (zooko-redhat-bugs) wrote :

Thanks to Samuel Neves, the bug was identified in pycryptopp. Thanks!

Changed in gcc:
status: New → Invalid
Changed in pycryptopp:
status: New → Fix Released
Changed in pycryptopp (Fedora):
importance: Unknown → High
status: Unknown → Invalid