oom_adj and -DLINUX_OOM_ADJ=0 should be used

Bug #854590 reported by Franck
This bug affects 1 person
Affects                     Status        Importance  Assigned to  Milestone
postgresql-9.1 (Ubuntu)     Fix Released  Low         Unassigned
postgresql-common (Ubuntu)  Fix Released  Low         Martin Pitt

Bug Description

I just had a bad experience with the OOM killer killing the postmaster, and I am wondering whether the Debian/Ubuntu packages should use the oom_adj trick to avoid this.

As I understand it, this would imply:
1) set oom_adj to -17 in the init script (to make the postmaster unkillable)
2) compile postgresql with the -DLINUX_OOM_ADJ=0 cflag (to allow the child processes to be killed)
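
The two steps can be sketched roughly as follows. This is only an illustration in Python, not the init script or PostgreSQL's C code: the helper names `set_oom_adj` and `child_after_fork` are hypothetical, and it assumes a kernel that exposes `/proc/<pid>/oom_adj` (range -17 to 15, where -17 exempts the process from the OOM killer). Lowering the value normally requires root.

```python
OOM_DISABLE = -17  # kernel value meaning "never pick this process"


def set_oom_adj(value, path="/proc/self/oom_adj"):
    """Write an oom_adj value for the current process (hypothetical helper)."""
    if not -17 <= value <= 15:
        raise ValueError("oom_adj must be between -17 and 15")
    with open(path, "w") as f:
        f.write(f"{value}\n")


def child_after_fork(path="/proc/self/oom_adj"):
    """Roughly what a backend built with -DLINUX_OOM_ADJ=0 does after fork:
    reset the inherited value so the child stays killable."""
    set_oom_adj(0, path)
```

In this sketch the init script side would call `set_oom_adj(OOM_DISABLE)` just before starting the postmaster. Because oom_adj is inherited across fork, every backend would start out unkillable too, which is why step 2 is needed.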

Does this make sense, and could it be considered?

Revision history for this message
Franck (alci) wrote :

Here is a link to a discussion on pgsql-hackers(at)postgresql(dot)org about the subject, in case that could help:

http://archives.postgresql.org/pgsql-hackers/2010-01/msg00170.php

description: updated
Revision history for this message
Franck (alci) wrote :

Also see 17.4.3. Linux Memory Overcommit at http://www.postgresql.org/docs/9.1/static/kernel-resources.html

Revision history for this message
Franck (alci) wrote :

Just a few more remarks:
- Setting oom_adj should probably be an option (maybe in /etc/postgresql/xx/start.conf), defaulting to -17; at least for the 9.x series, that is probably what the average user wants.
- For versions that cannot set their child processes' oom_adj value, a default of 0 would probably be fine (the user could easily set it to another value).
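
To make the proposed defaulting rule concrete, here is a small sketch. Everything in it is hypothetical: `default_oom_adj` is an illustrative name, `configured` stands for a user-supplied setting that does not exist in start.conf today, and `resets_children` stands for "this server version was built so that backends reset their own oom_adj".

```python
def default_oom_adj(resets_children, configured=None):
    """Pick the oom_adj for the postmaster under the proposal above."""
    if configured is not None:
        # An explicit user setting always wins.
        return configured
    # Protect the postmaster only if its children can make themselves
    # killable again; otherwise leave everything at the kernel default.
    return -17 if resets_children else 0
```

For example, `default_oom_adj(True)` gives -17, `default_oom_adj(False)` gives 0, and `default_oom_adj(True, configured=-5)` gives -5.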

Revision history for this message
Martin Pitt (pitti) wrote :

This is certainly an interesting hint to give the OOM killer. Adding a postgresql-common task, as the init script lives there.

Changed in postgresql-9.1 (Ubuntu):
status: New → Triaged
importance: Undecided → Low
Changed in postgresql-common (Ubuntu):
importance: Undecided → Low
status: New → Triaged
Revision history for this message
Martin Pitt (pitti) wrote :

I committed the -DLINUX_OOM_ADJ=0 part to postgresql-9.1 packaging bzr. Will look at the init script counterpart soon.

Changed in postgresql-9.1 (Ubuntu):
status: Triaged → Fix Committed
Changed in postgresql-common (Ubuntu):
assignee: nobody → Martin Pitt (pitti)
Revision history for this message
Martin Pitt (pitti) wrote :

Committed the pg_ctlcluster side and integration tests to p-common.

Changed in postgresql-common (Ubuntu):
status: Triaged → Fix Committed
Revision history for this message
Martin Pitt (pitti) wrote :

BTW, I set it to -16, so that it is still the "last against the wall". Making it entirely unkillable doesn't seem desirable to me: if there is a bug somewhere which keeps allocating memory in a loop, this could very easily lock you out of your server.
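
As an illustration of the resulting rule (start as root, server version 9.1 or later, value -16), here is a sketch in Python. It is not the actual pg_ctlcluster code, which is written in Perl; the function name and the version parsing are assumptions made for the example.

```python
POSTMASTER_OOM_ADJ = -16  # one step above -17, so still killable as a last resort


def postmaster_oom_adj(version, is_root):
    """Return the oom_adj to apply before starting the postmaster, or None
    if the value should be left alone."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    if is_root and major_minor >= (9, 1):
        return POSTMASTER_OOM_ADJ
    return None
```

With -16 the postmaster is only chosen after practically everything else, but the kernel can still reclaim its memory if nothing else is left to kill.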

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package postgresql-9.1 - 9.1.1-3

---------------
postgresql-9.1 (9.1.1-3) unstable; urgency=low

  * debian/rules: Build with LINUX_OOM_ADJ=0 on Linux, to allow the OOM killer
    to slay the backends when the postmaster gets marked as unkillable.
    (LP: #854590)

 -- Martin Pitt <email address hidden> Wed, 19 Oct 2011 09:43:13 +0200

Changed in postgresql-9.1 (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Franck (alci) wrote :

Thanks, Martin, for your awesome work and for your responsiveness on this request. I love postgresql packaging on Debian/Ubuntu.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package postgresql-common - 125

---------------
postgresql-common (125) unstable; urgency=low

  * Add debian/backport-ppa: Script to generate and upload backport packages
    to my Ubuntu PPA. Only for personal use.
  * Add t/160_alternate_confroot.t: Test creation, operation, upgrading, and
    removal of clusters as user nobody using $PG_CLUSTER_CONF_ROOT. This
    reproduces LP#835630 and other bugs.
  * PgCommon.pm: If $PG_CLUSTER_CONF_ROOT is set, untaint it.
  * pg_upgradecluster: Don't hardcode /etc/postgresql/, use
    $PgCommon::confroot to respect $PG_CLUSTER_CONF_ROOT. (LP: #835630)
  * pg_upgradecluster: Add --logfile option to specify a custom log file for
    the upgraded cluster. Necessary if you want to run this on
    per-user clusters and can't write into /var/log/postgresql/.
  * pg_ctlcluster: When starting as root for >= 9.1, adjust the OOM killer
    protection to -16, so that the postmaster does not get OOM-killed so
    easily (as it appears to claim all the shared memory). 9.1.1-3 and later
    resets oomadj of child processes to 0, so that the client backends can
    still get OOM-killed. Add tests to t/020_create_sql_remove.t.
    (LP: #854590)
  * debian/control: Add Breaks: to postgresql-9.1 versions before 9.1.1-3, as
    they do not reset oomadj for child processes. This is a precaution to
    avoid running all the client backends with -16 as well.
  * Add t/170_extensions.t: Check that all shipped extensions install and
    remove.
  * Add t/180_ecpg.t: Check that ecpg works. In t/001_packages.t, check that
    libecpg-dev is installed.

 -- Martin Pitt <email address hidden> Thu, 20 Oct 2011 12:17:30 +0200

Changed in postgresql-common (Ubuntu):
status: Fix Committed → Fix Released