OOM scoring kills the browser's render process while the browser is running

Bug #1478853 reported by Michael Zanetti
This bug affects 40 people
Affects                      Status     Importance  Assigned to  Milestone
Canonical System Image       Won't Fix  High        David Barth
Oxide                        New        Undecided   Unassigned
ubuntu-app-launch (Ubuntu)   Invalid    Critical    Unassigned
webbrowser-app (Ubuntu)      Confirmed  Undecided   Unassigned

Bug Description

According to people's reports, the OOM killer kills the browser's renderer process while the browser is in focus.

https://plus.google.com/u/1/106915287544632473346/posts/CoA8mLLvsJk?cfem=1

Also, the OOM scoring mechanism really should take more things into account. It frequently happens to me that an app sits unused in the right edge spread for days without getting killed, while the 2 most recent apps I switch between get killed all the time.

Changed in ubuntu-app-launch (Ubuntu):
importance: Undecided → Critical
description: updated
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ubuntu-app-launch (Ubuntu):
status: New → Confirmed
Changed in canonical-devices-system-image:
status: New → Confirmed
importance: Undecided → Critical
milestone: none → ww34-2015
Changed in canonical-devices-system-image:
assignee: nobody → kevin gunn (kgunn72)
Changed in qtmir (Ubuntu):
status: New → Confirmed
Pat McGowan (pat-mcgowan) wrote :

@olivier, is there anything we can do to reduce memory consumption and/or ensure the renderer is not doing something in the background?

Changed in webbrowser-app (Ubuntu):
assignee: nobody → Olivier Tilloy (osomon)
Olivier Tilloy (osomon) wrote :

When in the background, the renderer process should be stopped together with its parent. However, it still holds on to its memory.
The OOM killer should be instructed to kill stopped processes first; I'm not sure how feasible that is, though.

Gerry Boland (gerboland) wrote :

There's not enough information here to make a proper attack on this issue, and this is a complex topic.

We need someone to sit down and determine things like:
1. Why did the OOM killer strike a foreground app? Did it use an insane amount of memory? If a webapp, is it a QtWebkit-based app, or Oxide? Does Oxide cache much in memory?
2. Would more graded OOM scoring really make an impact? How much does the OOM killer take this score into account? Maybe the scores currently being set are not strong enough.
3. Is the OOM killer cgroup-aware? Can we be sure it won't kill any process in the cgroup that contains the app currently focused by the user?
4. Is the default kernel OOM killer really the right thing to rely on for a phone? I believe it tries to prioritize killing memory-greedy apps over longer-running ones, i.e. if your app starts eating lots of memory, it'll kill it first. I don't think Android uses it.
5. Has anyone profiled these apps to see if easy memory savings could be made? Are they loading large high-res graphics?
6. Has anyone profiled unity8/maliit/dash and all the system services to try to reduce their memory usage?

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in webbrowser-app (Ubuntu):
status: New → Confirmed
Josh Arenson (josharenson) wrote :

@greyback, it seems that upstart-app-launch just sets the score for paused apps at 900, and changes them back to 100 after being resumed. I wonder where these numbers came from, and how OOM Killer further prioritizes processes with the same score. 900 seems pretty high, but I really don't have much experience with this so I'm not sure. As a hack, would reducing the score of Oxide (or whatever is actually affected by this) by 30, or so, points have an impact?
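
For illustration, the pause/resume behaviour described in this comment could look roughly like the sketch below. It is not the upstart-app-launch code. It only assumes that the adjustment is applied by writing to /proc/<pid>/oom_score_adj, the standard Linux interface. The function names, the constant names and the proc_root parameter (there so the sketch can be tried against a scratch directory) are hypothetical. Only the values 900 and 100 come from the comment.

```python
from pathlib import Path

# Values quoted in the comment above; the constant names are hypothetical.
PAUSED_ADJ = 900
RESUMED_ADJ = 100


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write an OOM score adjustment for one process and return the file path."""
    # The kernel accepts values from -1000 (never kill) to 1000 (kill first).
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    path = Path(proc_root) / str(pid) / "oom_score_adj"
    path.write_text(f"{value}\n")
    return path


def on_pause(pid, proc_root="/proc"):
    """Make a paused app a more likely OOM victim."""
    return set_oom_score_adj(pid, PAUSED_ADJ, proc_root)


def on_resume(pid, proc_root="/proc"):
    """Make a resumed (focused) app a less likely OOM victim."""
    return set_oom_score_adj(pid, RESUMED_ADJ, proc_root)
```

On a real system, lowering the value of a process again may require extra privileges, so a launcher doing this would presumably need to run with them.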

Gerry Boland (gerboland) wrote :

You're correct: the OOM score is set to one of 2 arbitrary values. They're hardcoded and set by upstart-app-launch. I believe your question is my question 2 above, to which I don't have an answer.

Gerry Boland (gerboland) wrote :
Ted Gould (ted) wrote : Re: [Bug 1478853] Re: OOM scoring kills the browser's render process while the browser is running

On Tue, 2015-08-04 at 20:43 +0000, Gerry Boland wrote:

> You're correct the OOM score set to 2 arbitrary values. They're
> hardcoded and set by upstart-app-launch. I believe your question is my
> question 2 above, to which I don't have an answer.

They're not entirely arbitrary, more or less built to try and have a
continuum of values. Basically trusted services (unity, etc.) are zero.
Then we put the focused application below that at 100. The dash is 200
so if there is no focused application it is the lowest. Then unfocused
applications are 900.

Hand-wavingly that means that a focused application would be allowed to
use 8 times as much memory as an unfocused application if memory size
was the only thing taken into account.
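
To make the hand-waving a bit more concrete, here is a minimal sketch of how the adjustment and memory use combine. It assumes a simplified model: the score is roughly the process's share of total memory in permille, plus oom_score_adj, clamped to 0..1000. The real kernel heuristic has more inputs (swap and page tables, for instance). The function names and the 1 GiB device in the usage note are hypothetical.

```python
def approx_oom_score(mem_bytes, total_bytes, oom_score_adj):
    """Simplified model: memory share in permille plus the adjustment."""
    permille = mem_bytes * 1000 // total_bytes
    return max(0, min(1000, permille + oom_score_adj))


def headroom_permille(protected_adj, exposed_adj):
    """Extra memory share (in permille of total RAM) the protected process
    can use before it scores as high as the exposed one, all else equal."""
    return exposed_adj - protected_adj
```

Under this model, on a 1 GiB device a focused app using 100 MiB at adjustment 100 scores about 197, while an unfocused app using 20 MiB at adjustment 900 scores about 919. The gap between 100 and 900 behaves as an additive offset of 800 permille (80% of RAM) rather than as a strict factor of 8.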

Gerry Boland (gerboland) wrote :

20921 ? Ssl 0:02 webbrowser-app
20966 ? Sl 0:00 /usr/bin/python3 /usr/lib/dbus-property-service/propertyservice
20977 ? S 0:00 /usr/lib/arm-linux-gnueabihf/oxide-qt/chrome-sandbox /usr/lib/arm-linux-gnueabihf/oxide-qt/oxide-renderer --t
20980 ? S 0:00 /usr/lib/arm-linux-gnueabihf/oxide-qt/oxide-renderer --type=zygote --form-factor=phone --limit-max-decoded-im
20985 ? S 0:00 /usr/lib/arm-linux-gnueabihf/oxide-qt/oxide-renderer --type=zygote --form-factor=phone --limit-max-decoded-im
21005 ? Sl 0:00 /usr/lib/arm-linux-gnueabihf/oxide-qt/oxide-renderer --type=renderer --disable-touch-editing --enable-overlay

phablet@arale:~$ l /proc/20980/oom_*
/proc/20980/oom_adj /proc/20980/oom_score /proc/20980/oom_score_adj

phablet@arale:~$ cat /proc/20985/oom_*
2
111
110
phablet@arale:~$ cat /proc/21005/oom_*
5
313
300
phablet@arale:~$ cat /proc/20921/oom_*
2
142
110
phablet@arale:~$ cat /proc/20977/oom_*
2
110
110
phablet@arale:~$ cat /proc/20980/oom_*
2
114
110

Why does the renderer process have such different OOM score settings? Is that intended?
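
For anyone repeating this check, the same values can be collected with labels attached. Because the shell expands oom_* alphabetically (as the listing above shows), the three numbers per process are oom_adj, oom_score and oom_score_adj, in that order. The sketch below reads them by name. The function names and the proc_root parameter are hypothetical, and treating "score minus adjustment" as the memory share in permille is a simplifying assumption about the kernel heuristic.

```python
from pathlib import Path

OOM_FILES = ("oom_adj", "oom_score", "oom_score_adj")


def read_oom_values(pid, proc_root="/proc"):
    """Return the three OOM-related values of a process as a dict of ints."""
    base = Path(proc_root) / str(pid)
    return {name: int((base / name).read_text().strip()) for name in OOM_FILES}


def memory_component(values):
    """Part of the score not explained by the adjustment (assumed to be
    roughly the memory share in permille of total RAM)."""
    return values["oom_score"] - values["oom_score_adj"]
```

Applied to the output above, the renderer (PID 21005) has an adjustment of 300 and a memory component of 13, while webbrowser-app (PID 20921) has an adjustment of 110 and a memory component of 32. So the renderer's higher score comes almost entirely from its adjustment, not from its memory use.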

Olivier Tilloy (osomon) wrote :

I tentatively added an oxide task to check whether oxide itself could be tinkering with the OOM scores of its render processes.

kevin gunn (kgunn72)
Changed in canonical-devices-system-image:
milestone: ww34-2015 → ww40-2015
Ilonka (ilonka-o) wrote :

I don't know if this is relevant, but today I found a URL which causes multiple crashes of the browser on my BQ4.5:
http://fossbytes.com/salaries-of-software-engineers-top-paying-skills-roles-and-best-engineering-schools/

After opening the link, the browser crashes within the next 5 minutes. Other URLs in the same session were stable.

Gerry Boland (gerboland) wrote :

QtMir relies on ubuntu-app-launch to set OOM scoring.

Changed in qtmir (Ubuntu):
status: Confirmed → Invalid
kevin gunn (kgunn72)
no longer affects: qtmir (Ubuntu)
kevin gunn (kgunn72) wrote :

@dbarth - I assigned this to you, since I think it makes more sense; let's sync on IRC if you disagree.

Changed in canonical-devices-system-image:
assignee: kevin gunn (kgunn72) → David Barth (dbarth)
David Barth (dbarth) wrote :

At this stage we have a series of optimizations to deal with that situation:

- a crash handler for webapps, which prevents the appearance of blank webviews - https://bugs.launchpad.net/ubuntu/+source/oxide-qt/+bug/1375215
- a memory limit adjustment in Oxide - https://bugs.launchpad.net/oxide/+bug/1498953
- another opportunity to limit memory usage with the Skia font cache - https://bugs.launchpad.net/oxide/+bug/1499479

But at the root of this, we would need to receive a signal in low-memory situations, i.e.:
- adding the API to the app lifecycle
- implementing the mechanisms to detect and notify apps of the low-memory condition
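
As a rough illustration of the "detect and notify" half, a low-memory check could be sketched as below. This is not an existing Ubuntu or Oxide API. It assumes simple polling of /proc/meminfo, a hypothetical 10% threshold, and plain Python callables standing in for whatever lifecycle API apps would eventually get. MemAvailable is only reported by newer kernels, so the sketch falls back to MemFree.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def is_low_memory(info, threshold_fraction=0.10):
    """True if available memory is below the given fraction of total memory."""
    # Fall back to MemFree where the kernel does not report MemAvailable.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    return available < info["MemTotal"] * threshold_fraction


def notify_if_low(meminfo_text, callbacks, threshold_fraction=0.10):
    """Invoke every registered callback when memory is low; return the verdict."""
    low = is_low_memory(parse_meminfo(meminfo_text), threshold_fraction)
    if low:
        for callback in callbacks:
            callback()
    return low
```

A service would call notify_if_low(open("/proc/meminfo").read(), callbacks) periodically. An event-driven kernel notification would avoid the polling cost, but that choice is left open here.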

David Barth (dbarth) wrote :

That last item is already recorded and tracked as bug: https://bugs.launchpad.net/oxide/+bug/1267132

Changed in canonical-devices-system-image:
milestone: ww40-2015 → ww46-2015
Pat McGowan (pat-mcgowan) wrote :

Lowering priority based on comment #16.

Changed in canonical-devices-system-image:
importance: Critical → High
milestone: ww46-2015 → backlog
Ted Gould (ted) wrote :

I think that, considering the age of this, we're happy with the values today?

Changed in ubuntu-app-launch (Ubuntu):
status: Confirmed → Invalid
Jean-Baptiste Lallement (jibel) wrote :

Actually the tuning of the memory consumption of oxide/chromium fixed the problem.

Changed in canonical-devices-system-image:
status: Confirmed → Won't Fix
Olivier Tilloy (osomon)
Changed in webbrowser-app (Ubuntu):
assignee: Olivier Tilloy (osomon) → nobody