Eric, Even after your fix... LDAP_CACHE_LOCK() is either missing a barrier or it is not enough for subsequent calls to APR with NULL locking (passed to APR_RMM_INIT). After patch for this bug has been applied, https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/1752683 and the new bug https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/1767105, end user still complains about seg faults and core dumps show same issue: race condition for rmm->base->firstfree in function "find_block_of_size". In the dump, in find_block_of_size(): apr_rmm_off_t next = rmm->base->firstfree; ... while(next) { struct rmm_block_t *blk = (rmm_block_t*)((char*)rmm->base + next); blk gets value 0x5772e56b36226557 because "next" was corrupted (value: 0x57726573553d554f). This happens because the lock APR_ANYLOCK_LOCK(&rmm->lock); in apr_rmm_calloc() is apr_anylock_none, like previously reported by me. For the sake of exercising possibilities, if mod_ldap is calling APR RMM with external locking, it would be using LDAP_CACHE_LOCK. My current stack trace is this: Thread #19 7092 (Suspended : Container) kill() at syscall-template.S:84 0x7ff7e9911767 () at 0x7ff7e9cb7390 find_block_of_size() at apr_rmm.c:106 0x7ff7ea10e25a apr_rmm_calloc() at apr_rmm.c:342 0x7ff7ea10ea68 util_ald_alloc() at util_ldap_cache_mgr.c:105 0x7ff7e3369b3d util_ldap_compare_node_copy() at util_ldap_cache.c:257 0x7ff7e3369784 util_ald_cache_insert() at util_ldap_cache_mgr.c:501 0x7ff7e336a310 uldap_cache_compare() at util_ldap.c:1,183 0x7ff7e33662d3 ldapgroup_check_authorization() at mod_authnz_ldap.c:925 0x7ff7e8459937 apply_authz_sections() at mod_authz_core.c:737 0x7ff7e4bb99fa apply_authz_sections() at mod_authz_core.c:751 0x7ff7e4bb9c01 authorize_user_core() at mod_authz_core.c:840 0x7ff7e4bb9dca ap_run_auth_checker() at request.c:91 0x56127e692f00 ap_process_request_internal() at request.c:335 0x56127e695d57 ap_process_async_request() at http_request.c:408 0x56127e6b4690 ap_process_request() at http_request.c:445 0x56127e6b4850 ap_process_http_sync_connection() at http_core.c:210 0x56127e6b091e ap_process_http_connection() at http_core.c:251 0x56127e6b091e ap_run_process_connection() at connection.c:41 0x56127e6a6bf0 ap_process_connection() at connection.c:213 0x56127e6a7000 process_socket() at worker.c:631 0x7ff7e2f51f8b worker_thread() at worker.c:990 0x7ff7e2f51f8b start_thread() at pthread_create.c:333 0x7ff7e9cad6ba clone() at clone.S:109 0x7ff7e99e341d Which means uldap_cache_compare() would have synchronized access to APR RMM calls through LDAP_CACHE_LOCK() macro. This doesn't seem to be the case as the lock doesn't seem to be acquired. LDAP_CACHE_LOCK() translates into: do { if (st->util_ldap_cache_lock) apr_global_mutex_lock(st->util_ldap_cache_lock); } while (0); After the change proposed for this bug (where "util_ldap_cache_lock" would come from the ldap_merge_config), it seems that st has util_ldap_cache_lock and util_ldap_cache all set: Name : util_ldap_cache_lock Hex:0x7ff7ea75aee0 Name : util_ldap_cache Hex:0x7ff7e0e51038 Meaning that it got the ldap_cache and ldap_cache_lock from the merge config function. From the mutex acquire logic, for the apr_global_mutex_lock() -> apr_proc_mutex_lock(): apr_status_t apr_proc_mutex_lock(apr_proc_mutex_t *mutex) { return mutex->meth->acquire(mutex); } And, from my dump, it would translate into: st->util_ldap_cache_lock->proc_mutex->meth->acquire == proc_mutex_fcntl_acquire() And from that logic: static apr_status_t proc_mutex_fcntl_acquire(apr_proc_mutex_t *mutex) { int rc; do { rc = fcntl(mutex->interproc->filedes, F_SETLKW, &proc_mutex_lock_it); } while (rc < 0 && errno == EINTR); if (rc < 0) { return errno; } mutex->curr_locked=1; return APR_SUCCESS; } Again, from the dump, We would guarantee mutex lock through a file descriptor to the file: "/var/lock/apache2/ldap-cache.1368" (filedes == 15) And the "mutex->curr_locked" would be 1. Unfortunately, considering my stack trace, during the cache insertion: find_block_of_size() at apr_rmm.c:106 0x7ff7ea10e25a apr_rmm_calloc() at apr_rmm.c:342 0x7ff7ea10ea68 util_ald_alloc() at util_ldap_cache_mgr.c:105 0x7ff7e3369b3d util_ldap_compare_node_copy() at util_ldap_cache.c:257 0x7ff7e3369784 util_ald_cache_insert() at util_ldap_cache_mgr.c:501 0x7ff7e336a310 uldap_cache_compare() at util_ldap.c:1,183 0x7ff7e33662d3 Name : st->util_ldap_cache_lock Details:0x7ff7ea75aee0 Default:0x7ff7ea75aee0 Decimal:140702767230688 Hex:0x7ff7ea75aee0 Binary:11111111111011111101010011101011010111011100000 Octal:03777375235327340 Name : proc_mutex Details:0x7ff7ea75aef8 Default:0x7ff7ea75aef8 Decimal:140702767230712 Hex:0x7ff7ea75aef8 Binary:11111111111011111101010011101011010111011111000 Octal:03777375235327370 Name : curr_locked Details:0 Default:0 Decimal:0 Hex:0x0 Binary:0 Octal:0 I have curr_locked = 0 So far.. I can see: The functions (from fcntl type of locking) touching curr_locked are: proc_mutex_fcntl_acquire() -> fcntl(proc_mutex_lock_it), curr_locked = 1 proc_mutex_fcntl_tryacquire() -> unblocking fcntl(proc_mutex_lock_it), curr_locked = 1 proc_mutex_fcntl_create() -> apr_file_open(), curr_locked = 0 proc_mutex_fcntl_release() -> curr_locked = 0, fcntl(proc_mutex_unlock_it) proc_mutex_fcntl_cleanup() -> if (cur_locked = 1) { mutex_fcntl_release() } -> apr_file_close() Meaning that likely _release() might have raced with _acquire() somehow. One thing that comes to my thoughts, right away, is the fact that function proc_mutex_fcntl_release(), from apr, sets "curr_locked" to 0 before actually the fcntl() F_SETLKW succeeds BUT the locking is guaranteed using the file lock with fcntl() call and not due to the variable value change, so it would be okay-ish. I can't see any other obvious reason why the lock is not acquired in this current execution. Still investigating... any ideas ?