race condition on rmm for module ldap (ldap cache)

Bug #1752683 reported by Rafael David Tinoco on 2018-03-01
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Apache2 Web Server
apache2 (Ubuntu)
Rafael David Tinoco
Nominated for Artful by Rafael David Tinoco
Nominated for Bionic by Rafael David Tinoco
Nominated for Trusty by Rafael David Tinoco
Nominated for Xenial by Rafael David Tinoco

Bug Description

[Impact]

 * Apache users running the ldap module may hit this race when multiple threads are in use and shared memory is enabled for the apr memory allocator (the default in Ubuntu).

[Test Case]

 * Configure apache to use the ldap module (e.g. for authentication) and wait for the race condition to happen.
 * The analysis is based on a dump from a production environment.
 * Bug has been reported multiple times upstream in the past 10 years.

[Regression Potential]

 * The ldap module has a broken locking mechanism when using apr memory management.
 * The ldap module could continue to have a broken locking mechanism.
 * Race conditions could still exist.
 * The change could break the ldap module.
 * The patch has been upstreamed and is in the next version to be released.

[Other Info]


Problem summary:

apr_rmm_init() initializes the relocatable memory management (rmm) allocator.

It is used in mod_auth_digest and util_ldap_cache.

The dump showed the following call sequence:

- util_ldap_compare_node_copy()
- util_ald_strdup()
- apr_rmm_calloc()
- find_block_of_size()

At "find_block_of_size()", "cache->rmm_addr" had no lock:

cache->rmm_addr->lock { type = apr_anylock_none }

It also had an invalid "next" offset (read from rmm->base->firstfree).

This rmm_addr was initialized with NULL as its lock:

From apr-util:


    if (!lock) {                          /* lock is the 2nd argument to apr_rmm_init() */
        nulllock.type = apr_anylock_none; /* this value was found in the dump */
        nulllock.lock.pm = NULL;
        lock = &nulllock;
    }

From apache:

# mod_auth_digest

    sts = apr_rmm_init(&client_rmm,
                       NULL, /* no lock, we'll do the locking ourselves */
                       shmem_size, ctx);

# util_ldap_cache

        result = apr_rmm_init(&st->cache_rmm, NULL,
                              apr_shm_baseaddr_get(st->cache_shm), size,

It appears that the ldap module chose to use "rmm" for memory allocation, using
the shared memory approach, but without explicitly defining a lock for it.
Without one, it's up to the caller to guarantee that there are locks for rmm
synchronization (just like mod_auth_digest does, using global mutexes).
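The caller-side locking described above can be sketched roughly as follows. This is a minimal illustration assuming a POSIX system with pthreads; `toy_arena`, `toy_alloc`, and `locked_alloc` are hypothetical names invented here, not APR or httpd APIs, and a process-local pthread mutex stands in for the global mutex a real module would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an allocator that was created without a lock. */
typedef struct {
    unsigned char pool[256];
    size_t used;            /* shared state: must not be updated concurrently */
} toy_arena;

/* Not thread-safe on its own: it reads and then updates arena->used. */
static void *toy_alloc(toy_arena *arena, size_t size)
{
    if (size > sizeof(arena->pool) - arena->used)
        return NULL;                       /* out of space */
    void *p = arena->pool + arena->used;
    arena->used += size;
    return p;
}

/* The caller owns the lock, because the allocator itself has none. */
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

static void *locked_alloc(toy_arena *arena, size_t size)
{
    void *p;
    pthread_mutex_lock(&arena_lock);       /* serialize every allocator call */
    p = toy_alloc(arena, size);
    pthread_mutex_unlock(&arena_lock);
    return p;
}
```

The point of the wrapper is that every path into the allocator goes through the same lock; a single unlocked caller is enough to reintroduce the race.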

Because of that, there was a race between "find_block_of_size()" and another call
touching "rmm->base->firstfree", possibly "move_block()", in a multi-threaded
apache environment: there were no lock guarantees inside the rmm logic (the lock
was "apr_anylock_none", so the locking calls do nothing).
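To illustrate why a lock of type "none" gives no protection, here is a simplified model of a lock wrapper that dispatches on its type. It is only a sketch under assumptions: `toy_anylock`, `toy_anylock_lock`, and `toy_anylock_unlock` are hypothetical names, and the real apr_anylock code is not reproduced here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified model of a "lock of any type" wrapper. */
typedef enum { TOY_LOCK_NONE, TOY_LOCK_MUTEX } toy_lock_type;

typedef struct {
    toy_lock_type    type;
    pthread_mutex_t *mutex;        /* only used when type == TOY_LOCK_MUTEX */
    int              acquisitions; /* counts real acquisitions, for illustration */
} toy_anylock;

/* Returns 0 on success. With TOY_LOCK_NONE nothing is acquired at all. */
static int toy_anylock_lock(toy_anylock *l)
{
    if (l->type == TOY_LOCK_NONE)
        return 0;                  /* "succeeds" without excluding anyone */
    int rc = pthread_mutex_lock(l->mutex);
    if (rc == 0)
        l->acquisitions++;
    return rc;
}

static int toy_anylock_unlock(toy_anylock *l)
{
    if (l->type == TOY_LOCK_NONE)
        return 0;                  /* nothing was held, nothing to release */
    return pthread_mutex_unlock(l->mutex);
}
```

In this model both calls report success for the "none" type, so code that checks return values cannot tell that no mutual exclusion took place.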

In find_block_of_size:

    apr_rmm_off_t next = rmm->base->firstfree;

We have:


But "next" turned into:

Name : next


        struct rmm_block_t *blk = (rmm_block_t*)((char*)rmm->base + next);

        if (blk->size == size)

This dereference of the invalid offset caused the segfault.
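The failure mode can be shown with a simplified offset-based free list. This is a hedged sketch, not the apr-util code: `toy_segment`, `toy_block`, `toy_offset_valid`, and `toy_find_exact` are hypothetical, and the bounds check is added here purely to make the effect of a corrupted "next" offset visible instead of crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: blocks are linked by byte offsets from the segment
 * base, so the segment can be mapped at different addresses. */
typedef size_t toy_off_t;

typedef struct {
    size_t    size;   /* block size in bytes */
    toy_off_t next;   /* offset of the next free block, 0 ends the list */
} toy_block;

typedef struct {
    unsigned char *base;      /* start of the segment */
    size_t         length;    /* total bytes in the segment */
    toy_off_t      firstfree; /* offset of the first free block, 0 if none */
} toy_segment;

/* An offset is usable only if a whole block header fits inside the segment. */
static int toy_offset_valid(const toy_segment *seg, toy_off_t off)
{
    return off != 0 && off <= seg->length
        && seg->length - off >= sizeof(toy_block);
}

/* Returns the offset of the first free block of exactly `size` bytes,
 * or 0 if there is none or the list looks corrupted. */
static toy_off_t toy_find_exact(const toy_segment *seg, size_t size)
{
    toy_off_t next = seg->firstfree;
    size_t steps = 0;

    while (next != 0) {
        toy_block blk;
        if (!toy_offset_valid(seg, next))
            return 0;               /* corrupted offset: do not dereference */
        if (++steps > seg->length / sizeof(toy_block))
            return 0;               /* too many hops: the list has a cycle */
        memcpy(&blk, seg->base + next, sizeof blk);
        if (blk.size == size)
            return next;
        next = blk.next;
    }
    return 0;
}
```

Without the validity check, a "next" value written by a concurrent, unlocked update would be added to the base pointer and dereferenced directly, which is the pattern that crashed in the dump.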

Upstream bugs:


Tags: sts
Changed in apache2 (Ubuntu):
status: New → In Progress
assignee: nobody → Rafael David Tinoco (inaddy)
importance: Undecided → Medium
tags: added: sts
Rafael David Tinoco (inaddy) wrote :

I have found the cause of the race condition and verified that the upstream apache2 version (ldap module) suffers from the same issue. Instead of re-describing the problem here, I'll give you the URLs of where I'm working upstream:

This bug was already opened in 2015:


And again in 2016:


My notes are here:


The notes are actually what helped me find all the upstream bits for this particular bug. As you can see in the apache upstream bug, an engineer from EA.COM proposed a small patch addressing the need for an external lock, which I also identified. He has also reported this in the Debian project:


I sent my findings upstream as a new comment on the existing bug in the apache2 project because I wanted to check what the maintainer was going to do, since there are multiple ways of solving this (internal and/or external lock), and I could help with any of them.

Maintainer responded:


Saying that he has already fixed this behaviour:

Changes with Apache 2.5.1

  *) mod_ldap: Avoid possible crashes, hangs, and busy loops due to
     improper merging of the cache lock in vhost config.
     PR 43164 [Eric Covener]

by fixing an external (to the rmm allocator logic) lock that wasn't working. The lock was being copied in a callback after it had already been used, which caused the race condition: there was no working external (to rmm) lock, and the rmm internal lock wasn't being used.

Rafael David Tinoco (inaddy) wrote :

PPA containing a testfix for Xenial: https://launchpad.net/~inaddy/+archive/ubuntu/lp1752683

Rafael David Tinoco (inaddy) wrote :
description: updated
Rafael David Tinoco (inaddy) wrote :

Waiting for confirmation that the fix is good before subscribing sponsors for the patches to be uploaded to T/X/A/B in reverse order.

Changed in apache2:
status: Unknown → New