Comment 32 for bug 1630413

Brian Morton (rokclimb15) wrote : Re: [Bug 1630413] Re: segfault in server/mpm/event/event.c:process_socket

Andreas,

I think patching this in Ubuntu only rather than upstream makes sense for
the reasons you've outlined. However, I would prefer that someone with more
Apache experience review the fix.

Thanks,

Brian

On Fri, Dec 7, 2018 at 10:21 AM Christophe Meron <email address hidden>
wrote:

> Unfortunately, not really.
>
> I can explain why we use Trusty: we deploy storage software that runs
> for years in a controlled environment, so we never upgrade OSes to new
> releases. Our older platforms are still on Trusty, and that makes sense
> to me.
>
> But that is not an argument for why they should fix an old version
> of Apache.
>
> We can work around our issue by using backports or hand-made packages.
> But as it seems to affect anyone using the event MPM with a not-so-heavy
> parallel workload, it seems worth fixing this in the distribution by
> default.
>
> https://bugs.launchpad.net/bugs/1630413
>
> Title:
> segfault in server/mpm/event/event.c:process_socket
>
> Status in apache2 package in Ubuntu:
> Triaged
>
> Bug description:
> We have seen consistent but infrequent segfaults of apache on a trusty
> production server with 2.4.7-1ubuntu4.13 (for more examples, see [1])
>
> ---
> Oct 2 19:01:03 static kernel: [8029151.932468] apache2[10642]: segfault at 7fac797803a8 ip 00007fac90b345e0 sp 00007fac84ff8e20 error 6 in mod_mpm_event.so[7fac90b2e000+d000]
> ---
>
> Subtracting the module's base address from the faulting ip seems to put
> us at a consistent offset:
>
> ---
> (gdb) p/x 0x7fac90b345e0 - 0x7fac90b2e000
> $1 = 0x65e0
>
> $ addr2line -e ./mod_mpm_event.so 0x65e0
> /build/apache2-Rau9Dr/apache2-2.4.7/server/mpm/event/event.c:1064
> ---
>
> which is at the bottom of process_socket(), which looks like
>
> ---
> 1058     /*
> 1059      * Prevent this connection from writing to our connection state after it
> 1060      * is no longer associated with this thread. This would happen if the EOR
> 1061      * bucket is destroyed from the listener thread due to a connection abort
> 1062      * or timeout.
> 1063      */
> 1064     c->sbh = NULL;
> 1065     return;
> 1066 }
> ---
>
> Line 1064 seems at least plausible as a faulting location...
>
> Some digging through httpd history reveals that this assignment was
> removed on the 2.4 branch with commit [2], which seems to be largely
> based on [3]. Things have been shuffled around so much that it's hard
> to tell exactly which change would have kept us from going down this
> path. Even so, I'm honestly not sure how to reproduce it -- on a fairly
> busy server it's seen at most a few times a day.
>
> [1] http://paste.openstack.org/show/584330/
> [2] https://github.com/apache/httpd/commit/043eba1a0a190829c073d9ef084358f6693dbbd2
> [3] https://github.com/apache/httpd/commit/285e67883e396f97dc3aad50d9dc345f15220827
>
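For anyone repeating the offset calculation from the quoted bug description on other crash lines, here is a minimal sketch in Python. It assumes the kernel message keeps the x86-64 layout shown in the report (addresses and the error code printed in hexadecimal). The names `SEGFAULT_RE`, `module_offset` and `decode_error_code` are made up for this illustration and are not part of any existing tool. It computes ip minus the module base, which is the value to hand to addr2line, and decodes the low bits of the page-fault error code.

```python
import re

# Layout of the kernel segfault message as quoted in the bug description.
# Assumption: all numeric fields, including the error code, are hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) in "
    r"(?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def module_offset(log_line):
    """Return (module, offset, error_code), where offset = ip - base."""
    m = SEGFAULT_RE.search(log_line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    # Sanity check: the faulting ip should lie inside the reported mapping.
    if not 0 <= offset < size:
        raise ValueError("ip lies outside the reported mapping")
    return m.group("module"), offset, int(m.group("err"), 16)


def decode_error_code(err):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "protection_violation": bool(err & 1),  # bit 0 clear: page not present
        "write": bool(err & 2),                 # bit 1 set: write access
        "user_mode": bool(err & 4),             # bit 2 set: fault in user mode
    }


line = ("apache2[10642]: segfault at 7fac797803a8 ip 00007fac90b345e0 "
        "sp 00007fac84ff8e20 error 6 in mod_mpm_event.so[7fac90b2e000+d000]")
module, offset, err = module_offset(line)
print(module, hex(offset))
print(decode_error_code(err))
```

For the line in the report this gives the same 0x65e0 that gdb printed. Error code 6 decodes as a user-mode write to a page that is not present, which would at least be consistent with a store such as `c->sbh = NULL` through a stale pointer, though it does not prove that this is what happened.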