0. The apparmor_parser consuming a core or more during a compile is expected. See step 1 above. Questions: Are you seeing cache messages being logged? Would you be willing to add a flag to parser.conf that would log more info about caching of the compile?

1. This message is logged during the duplicate elimination check, so the profile being loaded is being dropped. It is possible this will cause the profile just loaded to use rcu to free the profile or some of the components allocated during unpack. Question: Does the exact same $profile_name get logged multiple times? If so, this would indicate userspace is swamping the kernel with multiple replacement requests when they are not necessary.

2. Agreed, this does sound like resource contention. The goal is to track down what exactly the bottleneck is, so it can be fixed. The apparmor_parser, when invoked, will immediately consume resources.

2.1. Compiles can be cpu and memory intensive, so it would be good to get access to the policy on the machine to test what kind of resources are being used. Compiles, however, do not cause locking issues or rcu problems.

2.2. Most of load phases 1-4 is attributed to the apparmor_parser: phase 1 in userspace and phases 2-4 in the kernel.

2.3. Phase 4 does take spinlocks, so this could certainly be part of the problem. The load is structured to do as much of the work as possible before the locks are taken, and the locks are released and then re-acquired where possible to allow for preemption points.

2.4. a) Replacement will immediately degrade performance, whether due to the compile consuming resources or due to the load and locking. I would assume the first part of the recovery was due to some of the replacement being done, e.g. maybe the phase 1 compile. This would free up cpu resources and reduce some contention.

2.5. b) The periodic 10s slowdown does sound like rcu. The message being logged is odd in that this is not usually considered the most resource intensive part of a replacement.
Indeed, the duplicate elimination is done to avoid some of the locking, resource, and cache churn caused by phases 4-6. Duplicate elimination does hash comparisons first, and only if the hash is an exact match does it do a full memory compare. The full memory compare does take some cpu time but is generally very fast. With that said, this phase is done under a mutex lock, which will block other replacements behind it. If there is a lot of resource contention, this could account for the behavior seen, in that each outstanding replacement will be queued up waiting on the mutex lock. One will be processed at a time, and kernel scheduling will determine when they wake up. If enough of these apparmor_parser tasks are queued up waiting to perform a replacement, they could consume a significant amount of memory just sitting there waiting, because in general userspace programs only return memory to the system when they exit. So, for example, if there are 50 processes queued up and each used 50 MB for a compile, we are looking at 2.5 GB of memory not available for caches etc. This does not explain all the behavior you have seen but is a potential source of problems. This phase could potentially be broken up more to reduce the time the mutex lock is held.