Name collision causes apache gateway modules to fail when mod_shib is installed

Bug #1999823 reported by Jason Boyer
16
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Evergreen
Fix Released
Medium
Unassigned
OpenSRF
Fix Released
Medium
Unassigned
3.3
Fix Released
Medium
Unassigned

Bug Description

If the libapache2-mod-shib package is installed on Ubuntu and mod_shib enabled, any call to the opensrf json gateway or http translator will result in an apache process crash. Doing a bit of digging, this is because the libcurl3-gnutls package on Ubuntu is built with libssh while on Debian it's built with libssh2.

That sure does sound like an odd statement to make without any supporting evidence so lets explore that a little bit. One of the dependencies of mod_shib, because SAML is thick with XML, is libxerces. And because Xerces is a validating XML parser, it could theoretically need to retrieve an external DTD over the network, so it in turn depends on libcurl. Libcurl can speak a whole pile of protocols, including http/s, ftp, lots of others, and scp/sftp, so it has a dependency on libssh (or libssh2).

I feel like now is a good time to take a look at some of the functions exported from libopensrf, which is used by the OpenSRF json gateway and http translator apache modules. Because Strings Are Hard (at least in C), we have our own string buffer manipulation functions which are really handy:
buffer_init,
buffer_add,
some others,
and buffer_free.

In this setup the apache processes die totally silently, so we get to play around in gdb to see what's going on.
As root:
# systemctl stop apache2
# . /etc/apache2/envvars
# gdb /usr/sbin/apache2
(gdb) r -X
... Send a message to the json gateway from elsewhere
Thread 1 "/usr/sbin/apach" received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x7) at malloc.c:3102
3102 malloc.c: No such file or directory.
(gdb) bt
#0 __GI___libc_free (mem=0x7) at malloc.c:3102
#1 0x00007ffff4baa15a in ssh_buffer_free () from /lib/x86_64-linux-gnu/libssh.so.4
#2 0x00007ffff7ab5dee in osrf_json_gateway_method_handler (r=<optimized out>) at ./osrf_json_gateway.c:329
#3 0x00005555555b3a88 in ap_run_handler ()
#4 0x00005555555b4036 in ap_invoke_handler ()
#5 0x00005555555cc803 in ap_process_async_request ()
#6 0x00005555555cc9d2 in ap_process_request ()
#7 0x00005555555c8a66 in ?? ()
#8 0x00005555555bd7c8 in ap_run_process_connection ()
#9 0x00007ffff7ad4997 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#10 0x00007ffff7ad4c30 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#11 0x00007ffff7ad5e29 in ?? () from /usr/lib/apache2/modules/mod_mpm_prefork.so
#12 0x0000555555595488 in ap_run_mpm ()
#13 0x000055555558d499 in main ()

The r -X gdb command means "run /usr/sbin/apache2 and pass it -X," which is apache's "debug" param. Very handy.

Don't like the look of this guy, though:
#1 0x00007ffff4baa15a in ssh_buffer_free () from /lib/x86_64-linux-gnu/libssh.so.4
Pretty sure we're not calling that.

Installing libssh-dev will give us the headers that will show us what functions are exported from the library:
grep buffer_free /usr/include/libssh/*
/usr/include/libssh/legacy.h:SSH_DEPRECATED LIBSSH_API void buffer_free(ssh_buffer buffer);
...
Hey, I know that guy! Oh...

So we get to change the names of some of our functions to un-break our apache modules. The good news is that done correctly there's no need to make any changes to code that uses libopensrf that's not also dynamically loaded into apache's process space. (i.e. no need to change any Evergreen code if we *add* names, rather than change them.)

Branch coming soon.

Tags: pullrequest
Revision history for this message
Jason Boyer (jboyer) wrote :

Did I say soon? I feel like we all know I have a pretty elastic definition of that.

Anyway, a branch lives here: https://git.evergreen-ils.org/?p=working/OpenSRF.git;a=shortlog;h=refs/heads/user/jboyer/lp1999823_funky_names / working/user/jboyer/lp1999823_funky_names
that does 3 things:
1. It renames the canonical growing_buffer functions from whatever() to osrf_whatever() and adds stubs for whatever() to avoid breaking existing libopensrf users.
2. It also increments the library version in a way that signals this. (i.e. things built against previous versions without osrf_whatever() should be able to use this without a rebuild).
3. And finally, it changes all growing_buffer calls in both apache modules to use the osrf_* versions of the calls, no longer exploding when libssh is linked into apache processes.

Testing!
1. Install OpenSRF and Evergreen onto a recent Ubuntu
2. Install and enable mod_shib (no need to actually setup SSO)
3. Send a request that hits an OpenSRF translator (a version check is a good one: http://localhost/gateway?format=json&service=open-ils.actor&method=opensrf.open-ils.system.ils_version )
4. You should get nothing except possibly an error message along the lines of "The server hung up; rude." from whatever client you use.
5. Install OpenSRF with this branch
6. Restart things and send your request again
7. You should get a much more useful response and no crashes.
8. Celebrate

Now, I know what I said at the bottom of that long explanation up there about not having to change the calls in Evergreen, and that's true in the sense that it will work, but we really should do that at some point so these plain calls can be deprecated and someday removed. I'll open an Evergreen bug for that.

We should also look at only exporting select functions from our libraries instead of everything but that's homework for another day.

Revision history for this message
Jeff Davis (jdavis-sitka) wrote :

Jason's branch fixed the problem in my Ubuntu 20.04 test environments. OpenSRF working branch user/jeffdavis/lp1999823_funky_names_signoff has my signoff.

tags: added: pullrequest signedoff
Changed in opensrf:
milestone: none → 3.2.3
status: New → Confirmed
importance: Undecided → Medium
Changed in opensrf:
milestone: 3.2.3 → 3.2.4
Revision history for this message
Jeff Davis (jdavis-sitka) wrote :

Working branch user/jeffdavis/lp1999823_funky_names_rebased has the same fixes, rebased to main:

https://git.evergreen-ils.org/?p=working/OpenSRF.git;a=shortlog;h=refs/heads/user/jeffdavis/lp1999823_funky_names_rebased

I'm hesitant to go ahead and commit this myself -- it touches a lot of code, so additional eyes would be appreciated -- but we've been using my previous signoff branch in production for four months with no known issues. Note that Evergreen's Shibboleth-based SSO will not work on Ubuntu 20.04 without this fix.

Revision history for this message
Jason Boyer (jboyer) wrote :

Unfortunately the growing_buffer calls in Evergreen's mod_idlchunk apache module were missed so this will require changes in both projects and may end up changing the minimum OpenSRF release that a fixed Eg release can run on. :(

Revision history for this message
Jason Boyer (jboyer) wrote (last edit ):

And here's an Evergreen branch that works fine on a non-Ubuntu non-Shibboleth setup, that should also work on an Ubuntu + Shibboleth system. It also includes a commit to make sure that the installed version of libopensrf also contains the necessary functions.

https://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/jboyer/lp1999823_also_funky_names / working/user/jboyer/lp1999823_also_funky_names

Changed in evergreen:
status: New → Confirmed
importance: Undecided → Medium
Changed in evergreen:
assignee: nobody → Jeff Davis (jdavis-sitka)
Revision history for this message
Jeff Davis (jdavis-sitka) wrote :

Jason's fix works for me in multiple Ubuntu 20.04 environments with Shibboleth set up and mod_shib enabled.

Working branch user/jeffdavis/lp1999823-even-funkier-names contains my signoffs for the two commits from the previous branch. It also has an additional commit that changes buffer_* funcs to osrf_buffer_* everywhere else in EG.

https://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/jeffdavis/lp1999823-even-funkier-names

This branch works for me so far on a test server that also has the user/jboyer/lp1999823_funky_names OpenSRF branch installed, but more eyeballs are needed.

Changed in evergreen:
assignee: Jeff Davis (jdavis-sitka) → Jason Stephenson (jstephenson)
Revision history for this message
Jeff Davis (jdavis-sitka) wrote :

We've had the commits from my working branch and the OpenSRF branch in production for 2-3 weeks, and so far nothing has exploded.

Changed in evergreen:
assignee: Jason Stephenson (jstephenson) → nobody
Changed in evergreen:
milestone: none → 3.12-beta
tags: removed: signedoff
Changed in evergreen:
milestone: 3.12-beta → 3.12-rc
Galen Charlton (gmc)
Changed in evergreen:
assignee: nobody → Galen Charlton (gmc)
Changed in opensrf:
assignee: nobody → Galen Charlton (gmc)
Galen Charlton (gmc)
Changed in opensrf:
milestone: 3.2.4 → 3.3-beta
milestone: 3.3-beta → 3.2.4
Revision history for this message
Galen Charlton (gmc) wrote :

I've verified that the OpenSRF side works, including as a drop-in replacement for Evergreen compiled without the EVG side of this bugfix. Consequently, pushed to OpenSRF for including in 3.2.4.

Changed in opensrf:
milestone: 3.2.4 → 3.3-beta
milestone: 3.3-beta → 3.2.4
Galen Charlton (gmc)
Changed in opensrf:
assignee: Galen Charlton (gmc) → nobody
status: Confirmed → Fix Committed
Revision history for this message
Jason Boyer (jboyer) wrote :

And with the OpenSRF side of this landing and being released soonish let's put this bug to bed. Pushed to main and rel_3_12 for release in 3.12.0. Thanks Galen and Jeff!

Galen Charlton (gmc)
Changed in opensrf:
status: Fix Committed → Fix Released
Changed in evergreen:
status: Confirmed → Fix Committed
assignee: Galen Charlton (gmc) → nobody
Changed in evergreen:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.