aa-logprof: doesn't handle large logs

Bug #387657 reported by Tom Metro on 2009-06-16
This bug affects 2 people
Affects Status Importance Assigned to Milestone
apparmor (Ubuntu)

Bug Description

Binary package hint: apparmor

Ubuntu 8.04.2
Package: apparmor-utils
Version: 2.1+1075-0ubuntu9.2

My first experience with AppArmor was finding a kernel log file that was full of 800 MB of AppArmor warnings from Samba and CUPS. I'm not sure what suddenly enabled AppArmor on this LTS system, but while fixing the problem I noticed....

aa-logprof, while processing an 800 MB log file, drove the load average up into the high 40's. I suspect it was trying to load the whole thing into memory (on a system with 1 GB of RAM). I'd recommend revising the architecture so that it processes the lines as it sees them.

I worked around the problem by splitting the file into a dozen 75MB chunks.

Changed in apparmor (Ubuntu):
status: New → Confirmed
importance: Undecided → Wishlist
Jamie Strandboge (jdstrand) wrote :

Is this still an issue on Ubuntu 14.04 or 14.10?

Changed in apparmor (Ubuntu):
status: Confirmed → Incomplete
tags: added: aa-tools
Changed in apparmor:
importance: Undecided → Wishlist
status: New → Incomplete
Christian Boltz (cboltz) wrote :

I just checked the code - see aa.py do_logprof_pass(). Shortened quote (comments removed):

    log_reader = apparmor.logparser.ReadLog(pid, filename, existing_profiles, profile_dir, log)
    log = log_reader.read_log(logmark)

    for root in log:
        handle_children('', '', root)

    for pid in sorted(profile_changes.keys()):
        set_process(pid, profile_changes[pid])


So it seems first the full log is read, then handle_children processes the log entries, set_process() changes the profiles of running processes (if they have null-XY subprofiles) and finally collapse_log() is called.

handle_children() loops over all log events, so it should be easy to change it to get one call per log entry.

handle_children() changes profile_changes at various places, so integrating set_process() causes some work. The solution is probably to change all "profile_changes[pid] = ..." to call a helper function that
- checks if profile_changes[pid] is already set and, if it is, is identical to the new value
- if there is a real change, call set_process() for that pid
- and of course include profile_changes[pid] = ...

After that, integrating collapse_log() shouldn't be too hard.

With this change, only events that cause questions for profile changes will be kept in memory.

Changed in apparmor:
status: Incomplete → Triaged
Christian Boltz (cboltz) wrote :

A good (and even easier) start would be to check for duplicates in add_to_tree() before blindly adding events to self.pid[loc_pid].

Changed in apparmor (Ubuntu):
status: Incomplete → Confirmed
Christian Boltz (cboltz) wrote :

Just for the records - bzr trunk contains some patches that make aa-logprof faster *a lot* (~70% time saved) (thanks Peter!). Those patches didn't even touch the functions I mentioned above, so there's still room for improvement ;-)

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers