openssl faults with "illegal instruction" for e.g. squid, tor, etc.

Bug #1154042 reported by Folkert van Heusden
Affects   Status   Importance  Assigned to  Milestone
Raspbian  Invalid  Undecided   Unassigned

Bug Description

OpenSSL has issues on the RPI.
With all kinds of software it fails in OPENSSL_cpuid_setup with a SIGILL.

E.g. with the squid proxy:
Program received signal SIGILL, Illegal instruction.
0xb6e585e0 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
(gdb) bt
#0 0xb6e585e0 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
Cannot access memory at address 0x0
#1 0xb6e54fc4 in OPENSSL_cpuid_setup () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
#2 0xb6fe8254 in ?? () from /lib/ld-linux-armhf.so.3
#3 0xbefff7e6 in ?? ()
#4 0xbefff7e6 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

A quick web search shows that lots of people have issues like this, e.g. with tor.

Tags: openssl
peter green (plugwash) wrote :

Afaict openssl probes the capabilities of the user's CPU by trying to do things and trapping the illegal instruction errors. So a couple of sigills during startup is normal. When using a debugger in order to find the real failure in your application you must continue past the startup sigills.

root@raspberrypi:/home/pi# gdb openssl
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/openssl...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/openssl

Program received signal SIGILL, Illegal instruction.
0xb6e655e0 in _armv7_neon_probe ()
   from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
(gdb) continue
Continuing.

Program received signal SIGILL, Illegal instruction.
_armv7_tick () at armv4cpuid.S:17
17 armv4cpuid.S: No such file or directory.
(gdb) continue
Continuing.
OpenSSL>

There has recently been a new openssl version uploaded which, according to Debian, fixes a crash bug. Please upgrade to that version. If your application still crashes, please load it into a debugger, continue past the SIGILLs in the startup code, and get a backtrace from the real failure.

I'm setting this bug as invalid. If you can show a backtrace with a real failure (not mere CPU capability probing) in the current Raspbian version of openssl, then please file a new bug.

Also when recording backtraces please install the corresponding -dbg packages so that people can make sense of them.

Changed in raspbian:
status: New → Invalid
Folkert van Heusden (folkert) wrote : Re: [Bug 1154042] Re: openssl faults with "illegal instruction" for e.g. squid, tor, etc.

> Afaict openssl probes the capabilities of the user's CPU by trying to do
> things and trapping the illegal instruction errors. So a couple of
> sigills during startup is normal. When using a debugger in order to find
...

Peter, ok thanks for the elaborate reply.

Folkert van Heusden

--
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com

Folkert van Heusden (folkert) wrote :

> Afaict openssl probes the capabilities of the user's CPU by trying to do
> things and trapping the illegal instruction errors. So a couple of
> sigills during startup is normal. When using a debugger in order to find
> the real failure in your application you must continue past the startup
> sigills.
...
> I'm setting this bug as invalid, if you can show a backtrace with real
> failure (not mere CPU capability probing) in the current raspbian
> version of openssl then please file a new bug.
> Also when recording backtraces please install the corresponding -dbg
> packages so that people can make sense of them.

This is a really difficult problem.
When I run it from a debugger and continue when it sees a SIGILL, it is
as if the stack gets corrupted or something.
I'm trying to build squid (proxy server) with SSL support. Squid
has a helper program called 'ssl_crtd'. When invoked with:
 -c -s /usr/local/squid/var/lib/ssl_db -d
it should create a database. This never happens.

int main(int argc, char *argv[])
{
... [ variable init ] ...
        printf("here1\n"); fflush(NULL);
        // process options.
        while ((c = getopt(argc, argv, "dcghvs:M:b:n:")) != -1) {
            switch (c) {
            case 'd':
                debug_enabled = 1;
                break;
...
            case 's':
                db_path = optarg;
                break;
...
            case 'c':
                create_new_db = true;
                break;
...
            default:
                printf("%d fail\n", c); fflush(NULL);
                exit(0);
            }
        }
        printf("here2\n"); fflush(NULL);

As you can see, getopt should be able to process all options correctly.
In reality the program jumps to the default case with c == 255:

Reading symbols from /usr/local/src/squid-3.3.2/src/ssl/ssl_crtd...done.
(gdb) set args -c -s /usr/local/squid/var/lib/ssl_db -d
(gdb) r
Starting program: /usr/local/src/squid-3.3.2/src/ssl/ssl_crtd -c -s
/usr/local/squid/var/lib/ssl_db -d

Program received signal SIGILL, Illegal instruction.
0xb6e585e0 in _armv7_neon_probe () from
/usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
(gdb) bt
#0 0xb6e585e0 in _armv7_neon_probe () from
/usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
#1 0xb6e54fc4 in OPENSSL_cpuid_setup () at armcap.c:69
#2 0xb6fe8254 in ?? () from /lib/ld-linux-armhf.so.3
#3 0xbefff7c8 in ?? ()
Cannot access memory at address 0x2d006324
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
_armv7_tick () at armv4cpuid.S:17
17 armv4cpuid.S: No such file or directory.
(gdb) bt
#0 _armv7_tick () at armv4cpuid.S:17
#1 0xb6e54fe4 in OPENSSL_cpuid_setup () at armcap.c:74
#2 0xb6fe8254 in ?? () from /lib/ld-linux-armhf.so.3
#3 0xbefff7c8 in ?? ()
Cannot access memory at address 0x2d006324
(gdb) c
Continuing.
here1
255 fail
[Inferior 1 (process 9658) exited normally]

As you can see above, it suddenly takes a strange path.
If I add
        for(int i=0; i<argc; i++)
                printf("%d %s\n", i, argv[i]);
just before the printf("here1\n");, then it shows me the correct
commandline.

Folkert van Heusden

--
MultiTail is a flexible tool for following logfile...


peter green (plugwash) wrote :

What is the type of c ?

Changed in raspbian:
status: Invalid → New
Folkert van Heusden (folkert) wrote :

> What is the type of c ?

Ah shit: indeed that was the problem.
Changed it from char c to int c and now it runs fine.
I don't understand it though. Shouldn't a char -1 be the same as an
int -1?
Please explain so that I can either submit a bug report to the Squid
people or go sit in a corner with a "too dumb to code" hat on my head.
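
The change from `char c` to `int c` can be sketched as follows. This is a reduced illustration, not Squid's actual source: the `struct options` type, the `parse_options` function, and the shortened option string `"dcs:"` are assumptions made for the example. It relies on POSIX `getopt()`, which returns an `int` and signals the end of the options with -1.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical container for the three options used in the report. */
struct options {
    bool debug_enabled;
    bool create_new_db;
    const char *db_path;
};

/* Returns 0 on success, -1 if getopt() reported an unknown option. */
static int parse_options(int argc, char *argv[], struct options *out)
{
    int c;  /* int, not char: it must be able to hold getopt()'s -1 */

    out->debug_enabled = false;
    out->create_new_db = false;
    out->db_path = NULL;

    optind = 1;  /* restart scanning so the function can be called again */
    while ((c = getopt(argc, argv, "dcs:")) != -1) {
        switch (c) {
        case 'd':
            out->debug_enabled = true;
            break;
        case 'c':
            out->create_new_db = true;
            break;
        case 's':
            out->db_path = optarg;
            break;
        default:
            return -1;  /* '?' for an unrecognised option */
        }
    }
    return 0;
}
```

With `c` declared as `int`, the loop ends when `getopt()` returns -1 on every platform, regardless of whether plain `char` is signed or unsigned there.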

Folkert van Heusden

--
Always wondered what the latency of your webserver is? Or how much more
latency you get when you go through a proxy server/tor? The numbers tell
the tale and with HTTPing you know them!
                                     http://www.vanheusden.com/httping/
-----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com

peter green (plugwash)
Changed in raspbian:
status: New → Invalid
peter green (plugwash) wrote :

The problem is that the C language doesn't specify whether plain char is to be signed or unsigned, and the convention seems to be "choose the one that can be implemented in the fewest instructions on the platform in question".

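On x86 char is signed by default while on ARM char is unsigned by default. Assigning -1 to an unsigned char results in a value of 255. In the comparison with -1 that 255 is promoted to the int 255, so the loop condition stays true and the switch falls through to the default case.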
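
The conversion can be reproduced on any platform by spelling out the signedness explicitly. The sketch below is only an illustration: the helper names are hypothetical, and it assumes 8-bit chars.

```c
#include <stdio.h>

/* Each helper stores r in a variable of the given type and returns it as
   it would appear in a comparison, i.e. after promotion to int. */
static int as_unsigned_char(int r)
{
    unsigned char c = (unsigned char)r;  /* -1 wraps to 255 with 8-bit chars */
    return c;
}

static int as_signed_char(int r)
{
    signed char c = (signed char)r;      /* -1 stays -1 */
    return c;
}

/* Mirrors the loop test `(c = getopt(...)) != -1`. */
static int looks_like_end_of_options(int promoted)
{
    return promoted == -1;
}

/* Prints how getopt()'s end marker (-1) looks under each choice. */
static void show_end_marker(void)
{
    printf("unsigned char: %d\n", as_unsigned_char(-1));
    printf("signed char:   %d\n", as_signed_char(-1));
}
```

To reproduce the ARM behaviour with unmodified code on x86, GCC's `-funsigned-char` option makes plain `char` unsigned; `-fsigned-char` does the opposite.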
