init: failure to set oom_adj fails process (implement soft value?)
The current version of upstart implements a great feature that prevents critical services from being killed by the Out Of Memory (OOM) killer.
The issue with the oom_adj option is that if setting the priority fails, the whole job fails.
This doesn't happen in most cases as upstart runs as root (obviously) and so should have access to values from -17 (== never) to 15.
In containers (at least with OpenVZ), oom_adj is restricted so that a container can't start processes that are exempt from the OOM killer; this prevents a container from bringing down the host. In this case, the value "-17" is invalid and the write returns "Operation not permitted" when the user tries to set it.
For the current ssh job (and others) in Ubuntu, this basically means that the container will start just fine but sshd will never start, because setting oom_adj fails and therefore the whole job does.
The attached branch changes oom_adj handling a bit so that if it gets an "Operation not permitted" error, it increases the score and tries again, until it reaches 15, in which case it just fails as it currently does. Every time the score is increased, it logs the old and new scores as a warning.
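To illustrate the retry behaviour described above, here is a rough Python sketch. It is not the code from the attached branch (upstart itself is written in C). The names `set_oom_adj_with_fallback`, `write_adj` and `write_proc_oom_adj` are made up for this example, as is the assumption that a refused value surfaces as an `OSError` with `errno.EPERM`.

```python
import errno
import logging

OOM_ADJ_MAX = 15  # upper bound of the oom_adj range mentioned in this report


def set_oom_adj_with_fallback(requested, write_adj,
                              log=logging.getLogger(__name__)):
    """Try `requested`; on EPERM, raise the score by one and retry.

    `write_adj` is a hypothetical callable that applies a score and raises
    OSError on failure. Returns the score that was finally accepted.
    """
    score = requested
    while True:
        try:
            write_adj(score)
            return score
        except OSError as exc:
            # Any other error, or EPERM at the top of the range,
            # fails just as it does today.
            if exc.errno != errno.EPERM or score >= OOM_ADJ_MAX:
                raise
            log.warning("oom_adj %d not permitted, trying %d",
                        score, score + 1)
            score += 1


def write_proc_oom_adj(score, path="/proc/self/oom_adj"):
    """Hypothetical writer for the current process (path is an assumption)."""
    with open(path, "w") as f:
        f.write(str(score))
```

A job asking for -17 would then call something like `set_oom_adj_with_fallback(-17, write_proc_oom_adj)` and end up with the lowest value the container allows, instead of failing outright.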
I've been testing this change in an Ubuntu 10.04 container and it works as expected.
|Scott James Remnant (scott) wrote : Re: [Bug 693264] Re: Restricted oom_adj causes job to fail starting completely||#3|
|Changed in upstart:|
|status:||New → Triaged|
|importance:||Undecided → Low|
- Restricted oom_adj causes job to fail starting completely; add support
- for hard/soft value
+ init: failure to set oom_adj fails process
- init: failure to set oom_adj fails process
+ init: failure to set oom_adj fails process (implement soft value?)