[ubuntu-touch] system not recovering automatically when a critical service reaches the upstart respawn limit
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Canonical System Image | | Undecided | Unassigned | |
| Ubuntu RTM | | Critical | Mathieu Trudel-Lapierre | |
| upstart-watchdog (Ubuntu) | | Critical | Mathieu Trudel-Lapierre | |
Bug Description
current build number: 26
device name: mako
channel: ubuntu-
last update: 2014-11-19 16:43:47
version version: 26
version ubuntu: 20141119
version device: 20141119
version custom: 20141119
Useful for ubuntu-touch mainly.
Currently the system has no way to recover after a critical service reaches the upstart respawn limit (e.g. ofono crashing in a loop). This is bad for users: they will never really know that core services are no longer running, and might not be able to receive calls or messages.
| Ricardo Salveti (rsalveti) wrote : | #2 |
We discussed possible ways to fix this issue during our last device sprint, and the suggestion for the short term was to have a watchdog job that could watch critical system and session services, and automatically reboot the phone when a process reaches the upstart respawn limit.
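A rough sketch of the decision such a watchdog job would have to make, assuming it is triggered by Upstart's `stopped` event, whose environment carries `JOB`, `RESULT` and `PROCESS` (with `PROCESS=respawn` when the job was stopped for hitting its respawn limit). The list of critical jobs, the function names and the `REBOOT_CMD` override are hypothetical, chosen for illustration; this is not the actual upstart-watchdog implementation.

```shell
#!/bin/sh
# Hypothetical list of jobs considered critical; adjust for the device.
CRITICAL_JOBS="ofono network-manager"

# Return 0 if the stopped job described by the arguments warrants a reboot.
# The arguments mirror the JOB, RESULT and PROCESS variables of a "stopped" event.
should_reboot() {
    job=$1 result=$2 process=$3
    # Only failed jobs that hit the respawn limit are of interest.
    [ "$result" = "failed" ] || return 1
    [ "$process" = "respawn" ] || return 1
    for critical in $CRITICAL_JOBS; do
        [ "$job" = "$critical" ] && return 0
    done
    return 1
}

# Entry point a watchdog job could call.
# REBOOT_CMD (an assumption of this sketch) can be overridden for dry runs.
watchdog_handle() {
    if should_reboot "$1" "$2" "$3"; then
        ${REBOOT_CMD:-reboot}
    fi
}
```

A job definition could then presumably use something like `start on stopped RESULT=failed PROCESS=respawn` and call `watchdog_handle "$JOB" "$RESULT" "$PROCESS"` from its script section. Session services run under a separate Upstart session instance, so they would need their own watcher.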
| Changed in ubuntu: | |
| assignee: | nobody → Mathieu Trudel-Lapierre (mathieu-tl) |
| Changed in ubuntu-rtm: | |
| assignee: | nobody → Mathieu Trudel-Lapierre (mathieu-tl) |
| Changed in ubuntu: | |
| importance: | Undecided → Critical |
| Changed in ubuntu-rtm: | |
| importance: | Undecided → Critical |
| Changed in ubuntu: | |
| status: | New → Confirmed |
| Changed in ubuntu-rtm: | |
| status: | New → Confirmed |
| Ricardo Salveti (rsalveti) wrote : | #3 |
| affects: | ubuntu → upstart-watchdog (Ubuntu) |
| Changed in upstart-watchdog (Ubuntu): | |
| status: | Confirmed → Fix Released |
| Changed in canonical-devices-system-image: | |
| status: | New → In Progress |
| Changed in ubuntu-rtm: | |
| status: | Confirmed → Won't Fix |
| Changed in canonical-devices-system-image: | |
| status: | In Progress → Fix Released |


You can easily reproduce this by hand by killing ofono in a loop (e.g. 'pkill -9 ofono' a few times in a row). Upstart will give up and ofono will never be started again.
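The manual steps could be scripted roughly as follows. This is only a sketch: the function name, the default kill count and the delay are illustrative assumptions, and the number of kills actually needed depends on the job's respawn limit (Upstart's default is 10 respawns within 5 seconds unless the job overrides it). Setting `DRY_RUN=1` only prints the commands instead of running them.

```shell
#!/bin/sh
# Kill a process repeatedly so that Upstart hits the job's respawn limit.
# Usage: kill_in_loop NAME [COUNT] [DELAY_SECONDS]
kill_in_loop() {
    name=$1 count=${2:-12} delay=${3:-0.2}
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ -n "$DRY_RUN" ]; then
            echo "pkill -9 $name"
        else
            # pkill exits non-zero when nothing matched (e.g. the service
            # has not been respawned yet); ignore that and keep going.
            pkill -9 "$name" || true
            # Fractional delays assume a sleep that accepts them (GNU coreutils does).
            sleep "$delay"
        fi
        i=$((i + 1))
    done
}
```

Run as root, `kill_in_loop ofono` followed by `initctl status ofono` would be expected to report the job as `stop/waiting` once Upstart has given up.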