race condition on rmm for module ldap (ldap cache)
Affects | Status | Importance | Assigned to | Milestone | ||
---|---|---|---|---|---|---|
| Apache2 Web Server |
Fix Released
|
Unknown
|
|||
apache2 (Ubuntu) | Status tracked in Bionic | |||||
| Trusty |
Medium
|
Rafael David Tinoco | |||
| Xenial |
Medium
|
Rafael David Tinoco | |||
| Artful |
Medium
|
Rafael David Tinoco | |||
| Bionic |
Medium
|
Rafael David Tinoco |
Bug Description
[Impact]
* Apache users using ldap module might face this if using multiple threads and shared memory activated for apr memory allocator (default in Ubuntu).
[Test Case]
* Configure apache to use ldap module, for authentication e.g., and wait for the race condition to happen.
* Analysis made out of a dump from a production environment.
* Bug has been reported multiple times upstream in the past 10 years.
[Regression Potential]
* ldap module has broken locking mechanism when using apr mem mgmt.
* ldap would continue to have broken locking mechanism.
* race conditions could still exist.
* could could brake ldap module.
* patch is upstreamed in next version to be released.
[Other Info]
ORIGINAL CASE DESCRIPTION:
Problem summary:
apr_rmm_init acts as a relocatable memory management initialization
it is used in: mod_auth_digest and util_ldap_cache
From the dump was brought to my knowledge, in the following sequence:
- util_ldap_
- util_ald_strdup()
- apr_rmm_calloc()
- find_block_
Had a "cache->rmm_addr" with no lock at "find_block_
cache->
And an invalid "next" offset (out of rmm->base-
This rmm_addr was initialized with NULL as a locking mechanism:
From apr-utils:
apr_rmm_init()
if (!lock) { <-- 2nd argument to apr_rmm_init()
lock = &nulllock;
}
From apache:
# mod_auth_digest
sts = apr_rmm_
# util_ldap_cache
result = apr_rmm_
It appears that the ldap module chose to use "rmm" for memory allocation, using
the shared memory approach, but without explicitly definiting a lock to it.
Without it, its up to the caller to guarantee that there are locks for rmm
synchronization (just like mod_auth_digest does, using global mutexes).
Because of that, there was a race condition in "find_block_
touching "rmm->base-
apache environment, since there were no lock guarantees inside rmm logic (lock
was "apr_anylock_none" and the locking calls don't do anything).
In find_block_of_size:
apr_rmm_off_t next = rmm->base-
We have:
rmm-
Decimal:356400
Hex:0x57030
But "next" turned into:
Name : next
Decimal:
Hex:0x73797363
Causing:
struct rmm_block_t *blk = (rmm_block_
if (blk->size == size)
To segfault.
Upstream bugs:
https:/
https:/
https:/
Changed in apache2 (Ubuntu): | |
status: | New → In Progress |
assignee: | nobody → Rafael David Tinoco (inaddy) |
importance: | Undecided → Medium |
tags: | added: sts |
Rafael David Tinoco (inaddy) wrote : | #1 |
Rafael David Tinoco (inaddy) wrote : | #2 |
PPA containing a testfix for Xenial: https:/
Rafael David Tinoco (inaddy) wrote : | #5 |
Rafael David Tinoco (inaddy) wrote : | #6 |
Rafael David Tinoco (inaddy) wrote : | #7 |
Rafael David Tinoco (inaddy) wrote : | #8 |
Waiting confirmation that the fix is good in order to subscribe sponsors for the patches to be uploaded to T/X/A/B in reverse order.
Changed in apache2: | |
status: | Unknown → New |
Rafael David Tinoco (inaddy) wrote : | #9 |
I have proposed the same fix to Debian BUG:
Rafael David Tinoco (inaddy) wrote : | #10 |
I've got the confirmation that the patch fixes the issue. End user installed patched apache2 into multiple servers and the crashes stopped occurring.
tags: | added: sts-sponsor |
Eric Desrochers (slashd) wrote : | #11 |
I see a current upload in Xenial from cpaelzer for LP: #1466926 with a few automatic testing failures[1].
While the failures are only found in Xenial release, I prefer to sync-up with cpaelzer first tomorrow to make sure we won't interfere with his current apache2 work.
Once I can confirm everything is clear, I'll go ahead and sponsor bionic.
Thanks Rafael !
[1] - https:/
- Eric
Eric Desrochers (slashd) wrote : | #12 |
I couldn't talk to cpaelzer, but after double-checking, the only release on which cpaelzer seems to work for now for apache2 is Xenial.
The bug he is working on has been fix on Bionic a while ago, so I don't think he'll need extra upload in bionic (at least for this particular bug).
# Bionic : debian/changelog
* mpm_event: Fix "scoreboard full" errors. Closes: #834708 LP: #1466926
-- Stefan Fritsch <email address hidden> Wed, 21 Dec 2016 23:46:06 +0100
With that being said I think I can safely say, our way is clear for bionic.
Note that it would be best before proceeding with the SRU to re-evaluate the situation (especially for Xenial).
- Eric
Eric Desrochers (slashd) wrote : | #14 |
Sponsored for bionic.
Changed in apache2 (Ubuntu Trusty): | |
status: | New → In Progress |
Changed in apache2 (Ubuntu Xenial): | |
status: | New → In Progress |
Changed in apache2 (Ubuntu Artful): | |
status: | New → In Progress |
Changed in apache2 (Ubuntu Trusty): | |
assignee: | nobody → Rafael David Tinoco (inaddy) |
Changed in apache2 (Ubuntu Xenial): | |
assignee: | nobody → Rafael David Tinoco (inaddy) |
Changed in apache2 (Ubuntu Artful): | |
assignee: | nobody → Rafael David Tinoco (inaddy) |
Changed in apache2 (Ubuntu Trusty): | |
importance: | Undecided → Medium |
Changed in apache2 (Ubuntu Xenial): | |
importance: | Undecided → Medium |
Changed in apache2 (Ubuntu Artful): | |
importance: | Undecided → Medium |
Changed in apache2 (Ubuntu Bionic): | |
status: | In Progress → Fix Committed |
Launchpad Janitor (janitor) wrote : | #15 |
This bug was fixed in the package apache2 - 2.4.29-1ubuntu4
---------------
apache2 (2.4.29-1ubuntu4) bionic; urgency=medium
* Avoid crashes, hangs and loops by fixing mod_ldap locking: (LP: #1752683)
- added debian/
-- Rafael David Tinoco <email address hidden> Fri, 02 Mar 2018 02:19:31 +0000
Changed in apache2 (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
Dan Streetman (ddstreet) wrote : | #17 |
Once the LP: #1466926 apache2 SRU in xenial -proposed is either promoted or removed, I'll upload this to affected releases.
Changed in apache2: | |
status: | New → Fix Released |
Hello Rafael, or anyone else affected,
Accepted apache2 into artful-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
Changed in apache2 (Ubuntu Artful): | |
status: | In Progress → Fix Committed |
tags: | added: verification-needed verification-needed-artful |
Brian Murray (brian-murray) wrote : | #19 |
Hello Rafael, or anyone else affected,
Accepted apache2 into xenial-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
Changed in apache2 (Ubuntu Xenial): | |
status: | In Progress → Fix Committed |
tags: | added: verification-needed-xenial |
Changed in apache2 (Ubuntu Trusty): | |
status: | In Progress → Fix Committed |
tags: | added: verification-needed-trusty |
Brian Murray (brian-murray) wrote : | #20 |
Hello Rafael, or anyone else affected,
Accepted apache2 into trusty-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
Rafael David Tinoco (inaddy) wrote : | #21 |
I have asked end user for verification and was informed that after deployed no more crashes were seen (intermittent and hard to reproduce, based on workload and several variables). I think it is safe to mark this as verification-done.
tags: |
added: verification-done removed: verification-needed verification-needed-artful verification-needed-trusty verification-needed-xenial |
Dan Streetman (ddstreet) wrote : | #22 |
Notes on the apache-deps pending sru autopkgtest failures:
For trusty, the 'puppet' autopkgtest is failing:
http://
the failure for this sru is here:
https:/
it appears to have been failing since 2016, however:
https:/
so it's not related to this sru and can be ignored.
For xenial, the gvfs (s390x) autopkgtest fails, but the log indicates it's a testbed setup script failure - so almost certainly some transient issue with the s390x autopkgtest setup, not anything caused by this sru.
https:/
Additionally, the gvfs autopkgtest has been failing for over a month, so it seems safe to ignore its failures w.r.t this sru:
http://
Also for xenial, libapache2-
http://
For artful, resource-agent (s390x) autopkgtest fails:
http://
however the specific failing tests are first 'IPaddr2' which fails with what looks like a broken test (and certainly seems unrelated to apache2):
autopkgtest [20:43:48]: test IPaddr2: [------
sed: -e expression #1, char 11: unterminated `s' command
autopkgtest [20:43:48]: test IPaddr2: -------
autopkgtest [20:43:49]: test IPaddr2: - - - - - - - - - - results - - - - - - - - - -
IPaddr2 FAIL non-zero exit status 1
autopkgtest [20:43:49]: test IPaddr2: - - - - - - - - - - stderr - - - - - - - - - -
sed: -e expression #1, char 11: unterminated `s' command
and the tests 'command4' and 'command6' which are both mysql tests, not for apache2. So appear unrelated to this.
Also for artful, the cacti autopkgtest fails:
http://
This failure is during a test that has something to do with apache2, so possibly is related:
autopkgtest [20:31:42]: test check-all-pages: [------
--2018-04-05 20:31:42-- http://
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)
HTTP request sent, awaiting response... 200 OK
Length: 134 [text/html]
Saving to: ‘/tmp/tmp.
0K 100% 24.8M=0...
tags: | added: verification-done-artful verification-done-trusty verification-done-xenial |
Dan Streetman (ddstreet) wrote : | #23 |
I reproduced the 'cacti' autopkgtest on my local system, without this sru apache2 package; so the test failure is not related to this sru.
This and the last comment provide justification for ignoring all the pending-sru autopkgtest failures.
Launchpad Janitor (janitor) wrote : | #24 |
This bug was fixed in the package apache2 - 2.4.27-2ubuntu4
---------------
apache2 (2.4.27-2ubuntu4) artful; urgency=medium
* Avoid crashes, hangs and loops by fixing mod_ldap locking: (LP: #1752683)
- added debian/
-- Rafael David Tinoco <email address hidden> Fri, 02 Mar 2018 02:14:42 +0000
Changed in apache2 (Ubuntu Artful): | |
status: | Fix Committed → Fix Released |
The verification of the Stable Release Update for apache2 has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Launchpad Janitor (janitor) wrote : | #26 |
This bug was fixed in the package apache2 - 2.4.18-2ubuntu3.7
---------------
apache2 (2.4.18-2ubuntu3.7) xenial; urgency=medium
* Avoid crashes, hangs and loops by fixing mod_ldap locking: (LP: #1752683)
- added debian/
-- Rafael David Tinoco <email address hidden> Thu, 01 Mar 2018 18:29:12 +0000
Changed in apache2 (Ubuntu Xenial): | |
status: | Fix Committed → Fix Released |
Launchpad Janitor (janitor) wrote : | #27 |
This bug was fixed in the package apache2 - 2.4.7-1ubuntu4.19
---------------
apache2 (2.4.7-1ubuntu4.19) trusty; urgency=medium
* Avoid crashes, hangs and loops by fixing mod_ldap locking: (LP: #1752683)
- added debian/
-- Rafael David Tinoco <email address hidden> Fri, 02 Mar 2018 01:48:33 +0000
Changed in apache2 (Ubuntu Trusty): | |
status: | Fix Committed → Fix Released |
I have found the cause of why the race condition happened and was able to verify that apache2 upstream version (ldap module) also suffers from the same issue. Instead of re-describing the problem here I'll give you the URLs of where I'm working upstream:
This bug was already opened in 2015:
https:/ /bz.apache. org/bugzilla/ show_bug. cgi?id= 58483
And again in 2016:
https:/ /bz.apache. org/bugzilla/ show_bug. cgi?id= 60296
My notes are here:
https:/ /bz.apache. org/bugzilla/ show_bug. cgi?id= 60296#c7
And actually the notes were the ones that helped me finding all the upstream bits for this particular bug. As you can see in apache upstream bug, an engineer from EA.COM proposed a small patch solving the external lock need that I also identified. He has also reported this in debian project:
https:/ /bugs.debian. org/cgi- bin/bugreport. cgi?bug= 814980# 15
I sent my findings upstream as a new comment to the existing bug in apache2 project because I wanted to check on what was going to be done by the maintainer, since there are multiple ways of solving this (internal and/or external lock), and I could help in any of them.
Maintainer responded:
https:/ /bz.apache. org/bugzilla/ show_bug. cgi?id= 60296#c8
Saying that he has already fixed this behaviour:
2 Changes with Apache 2.5.1
3
4 *) mod_ldap: Avoid possible crashes, hangs, and busy loops due to
5 improper merging of the cache lock in vhost config.
6 PR 43164 [Eric Covener]
by changing an external (to rmm allocator logic) lock that wasn't working. The lock was being copied in a callback after it was being used, causing the race condition because there was no external (to rmm) lock and rmm lock wasn't being used.