Activity log for bug #1447756

Date Who What changed Old value New value Message
2015-04-23 18:32:00 Oliver Grawert bug added bug
2015-04-23 18:46:25 Pat McGowan upstart (Ubuntu): importance Undecided Critical
2015-04-23 18:46:25 Pat McGowan upstart (Ubuntu): status New Confirmed
2015-04-23 18:46:48 Pat McGowan bug task added canonical-devices-system-image
2015-04-23 18:47:14 Pat McGowan canonical-devices-system-image: importance Undecided Critical
2015-04-23 18:47:14 Pat McGowan canonical-devices-system-image: status New Confirmed
2015-04-23 18:47:14 Pat McGowan canonical-devices-system-image: milestone ww17-2015
2015-04-23 18:47:25 Pat McGowan tags hotfix
2015-04-24 09:42:00 James Hunt upstart (Ubuntu): assignee James Hunt (jamesodhunt)
2015-04-24 12:53:49 James Hunt bug task added upstart
2015-04-24 12:53:58 James Hunt upstart: assignee James Hunt (jamesodhunt)
2015-04-24 12:58:00 Launchpad Janitor branch linked lp:~jamesodhunt/upstart/bug-1447756
2015-04-24 13:13:31 Launchpad Janitor branch linked lp:~jamesodhunt/ubuntu/vivid/upstart/bug-1447756
2015-04-24 17:35:04 Nicolas Delvaux bug added subscriber Nicolas Delvaux
2015-04-30 16:48:21 Pat McGowan canonical-devices-system-image: milestone ww17-2015 ww19-ota
2015-04-30 20:56:35 Pat McGowan canonical-devices-system-image: assignee Ondrej Kubik (w-ondra)
2015-05-01 08:08:59 James Hunt bug added subscriber James Hunt
2015-05-01 08:09:22 James Hunt bug added subscriber Ondrej Kubik
2015-05-06 11:59:10 Pat McGowan canonical-devices-system-image: status Confirmed Fix Committed
2015-05-12 15:04:25 James Hunt upstart (Ubuntu): status Confirmed In Progress
2015-05-12 15:04:31 James Hunt upstart: status New In Progress
2015-05-12 15:49:25 Launchpad Janitor branch linked lp:~jamesodhunt/upstart/bug-1447756-the-actual-fix
2015-05-12 18:12:53 Pat McGowan canonical-devices-system-image: status Fix Committed In Progress
2015-05-12 18:12:53 Pat McGowan canonical-devices-system-image: milestone ww19-ota ww22-2015
2015-05-13 15:51:46 James Hunt attachment added bug-1447756-both-fixes.diff https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/1447756/+attachment/4396882/+files/bug-1447756-both-fixes.diff
2015-05-13 16:16:57 Ubuntu Foundations Team Bug Bot tags hotfix hotfix patch
2015-05-15 09:42:19 James Hunt upstart: status In Progress Fix Committed
2015-05-15 09:57:48 James Hunt nominated for series Ubuntu Utopic
2015-05-15 09:57:48 James Hunt bug task added upstart (Ubuntu Utopic)
2015-05-15 09:57:48 James Hunt nominated for series Ubuntu Wily
2015-05-15 09:57:48 James Hunt bug task added upstart (Ubuntu Wily)
2015-05-15 09:57:48 James Hunt nominated for series Ubuntu Vivid
2015-05-15 09:57:48 James Hunt bug task added upstart (Ubuntu Vivid)
2015-05-18 10:14:08 Launchpad Janitor branch linked lp:ubuntu/upstart
2015-05-19 06:34:33 Launchpad Janitor upstart (Ubuntu Wily): status In Progress Fix Released
2015-05-21 17:44:20 Steve Langasek upstart (Ubuntu Utopic): status New Won't Fix
2015-05-21 17:49:25 Steve Langasek bug task added upstart (Ubuntu RTM)
2015-05-21 17:49:37 Steve Langasek upstart (Ubuntu RTM): status New Fix Released
2015-05-21 18:20:20 Pat McGowan canonical-devices-system-image: status In Progress Fix Released
2015-05-22 12:25:12 Launchpad Janitor branch linked lp:~jamesodhunt/ubuntu/vivid/upstart/sru-bug-1447756
2015-05-22 12:26:03 James Hunt upstart (Ubuntu Vivid): status New In Progress
2015-05-22 12:26:10 James Hunt upstart (Ubuntu Vivid): assignee James Hunt (jamesodhunt)
2015-05-22 13:23:54 James Hunt description
Old value:
We recently started getting reports from phone users that their devices go into a reboot loop after changing the language or getting an OTA upgrade (both of which end with a reboot of the phone).

After a bit of research we collected the log at http://pastebin.ubuntu.com/10872934/

This shows a segfault of Upstart's init binary in the log.c code:

[ 6.999083]init: log.c:819: Assertion failed in log_clear_unflushed: log->unflushed->len
[ 7.000279]init: Caught abort, core dumped
[ 7.467176]Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000600

New value:
= Summary =

The version of Upstart in vivid is affected by a couple of bugs relating to the flushing of data from early-boot jobs to disk, both of which can result in a crash:

== Problem 1 ==

An internal list is mishandled, meaning a crash could occur randomly.

== Problem 2 ==

Jobs which spawn processes in the background and then themselves exit can cause a crash.

= Explanation of how Upstart flushes early job output =

If an Upstart job starts *and ends* early in the boot sequence (before the log partition is mounted and writable) and produces output to its stdout/stderr, Upstart will cache the output for later flushing by adding the 'Log' object associated with the 'Job' to a list.

When the log partition is mounted writable, the /etc/init/flush-early-job-log.conf job is run, which calls "initctl notify-disk-writeable". This is a signal to Upstart to flush its cache of early-boot job output, which takes the form of iterating the 'log_unflushed_files' list and flushing all the 'Log' entries to disk.

= Code Specifics =

There are 2 issues (note that the numbers used below match those used in the Summary).
== Problem 1 detail ==

Due to a bug in the way the 'log_unflushed_files' list is handled (the 'Log' cannot be added to the list directly, so is added via an intermediary ('NihListElem') node), a crash can result when iterating the list since the 'Log' is freed, but NOT the intermediary node. The implication is that it is possible for an attempt to be made to dereference already-freed data through the intermediary node, resulting in a crash.

== Problem 2 detail ==

If a job spawns a process in the background, then itself exits, that job's 'Log' entry will be added to the 'log_unflushed_files' list. But, if the background process produces output and then exits before Upstart attempts to flush the original job's data to disk, the 'NihIo' corresponding to the log will be serviced automatically and the data flushed to disk. The problem comes when Upstart receives the notification to flush the 'log_unflushed_files' list, since that list now contains an entry which has already been freed (since all its data has already been flushed). The result is an assertion failure.

= Fix =

== Problem 1 fix ==

Correct the 'log_unflushed_files' list handling by freeing the 'NihListElem' (which will automatically free the 'Log' object), not by simply freeing the 'Log' object itself.

* Branch: lp:~jamesodhunt/ubuntu/vivid/upstart/bug-1447756/
* New Upstart test added to avoid regression?: Yes.

== Problem 2 fix ==

Correct the assumption that entries in the 'log_unflushed_files' list will always have data to flush by checking if there is in fact any data to flush; if not, remove the entry from the 'log_unflushed_files' list since it has already been handled automatically by the 'NihIo'.

* Branch: lp:~jamesodhunt/upstart/bug-1447756-the-actual-fix
* New Upstart test added to avoid regression?: Yes.

= Workarounds =

If a system is affected by this bug, it will be manifested by a crash early in the boot sequence. To overcome the issue, either:

a) Boot by adding "--no-log" to the kernel command-line.
b) Disable the flush-early-job-log job (assuming the machine is bootable) by running the following:

$ echo manual | sudo tee -a /etc/init/flush-early-job-log.override

= Impact =

The issue has been present in Upstart since logging was introduced, but no known instances of crashes relating to these problems had been reported prior to this bug (which relates to the issue being seen on a very small subset of specific Ubuntu Touch phone hardware where Upstart is used as the system init daemon).

Note that vivid still uses Upstart for managing the graphical session, but now uses systemd by default for the system init daemon. Since the session (Upstart) init does not even require a flush-early-job-log, the exposure to both the bug and the updated fix codepath is extremely limited.

= Test Case =

This bug is extremely hard to surface, so the approach is simply to check that the internal list can be iterated correctly by:

1) Booting the system with Upstart (select the Upstart option from the grub menu or add "init=/sbin/upstart" to the kernel command-line).
2) Running the following on a system booted with Upstart:

$ for i in $(seq 17); do sudo start flush-early-job-log; done

= Regression Potential =

None expected:

- As noted in Impact, the problems fixed by this version of Upstart have not been observed on server/desktop systems before.
- The fix is already in wily and no problems have been reported.
- See Impact.
= Original Description =

We recently started getting reports from phone users that their devices go into a reboot loop after changing the language or getting an OTA upgrade (both of which end with a reboot of the phone).

After a bit of research we collected the log at http://pastebin.ubuntu.com/10872934/

This shows a segfault of Upstart's init binary in the log.c code:

[ 6.999083]init: log.c:819: Assertion failed in log_clear_unflushed: log->unflushed->len
[ 7.000279]init: Caught abort, core dumped
[ 7.467176]Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000600
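
The two fixes in the description above (free the intermediary node together with the 'Log' it owns, and tolerate list entries that no longer have data to flush) can be illustrated with a minimal, self-contained C sketch. This is not Upstart's code and does not use libnih: the types 'Log' and 'Elem' and the functions 'log_new', 'list_add', 'elem_free' and 'flush_unflushed' are hypothetical stand-ins chosen only to mirror the terms used in the description.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Upstart's 'Log': cached, not yet flushed output. */
typedef struct {
    char  *unflushed;  /* cached bytes, or NULL when nothing is cached */
    size_t len;        /* number of cached bytes */
} Log;

/* Hypothetical stand-in for the intermediary list node ('NihListElem'). */
typedef struct Elem {
    struct Elem *next;
    Log         *log;  /* owned by this node */
} Elem;

/* Create a log holding a copy of 'text'; an empty string caches nothing. */
static Log *log_new(const char *text)
{
    Log *log = calloc(1, sizeof *log);
    assert(log);
    if (text && *text) {
        log->len = strlen(text);
        log->unflushed = malloc(log->len);
        assert(log->unflushed);
        memcpy(log->unflushed, text, log->len);
    }
    return log;
}

/* Add a log to the front of the unflushed list via an intermediary node. */
static void list_add(Elem **head, Log *log)
{
    Elem *e = malloc(sizeof *e);
    assert(e);
    e->log = log;
    e->next = *head;
    *head = e;
}

/* Problem 1 idea: release the node and the log it owns together,
 * so that no node is left pointing at freed memory. */
static void elem_free(Elem *e)
{
    free(e->log->unflushed);
    free(e->log);
    free(e);
}

/* Flush every cached log to 'out' and empty the list.
 * Returns the number of entries that still had data to write. */
static int flush_unflushed(Elem **head, FILE *out)
{
    int flushed = 0;

    while (*head) {
        Elem *e = *head;
        *head = e->next;  /* unlink before freeing */

        /* Problem 2 idea: an entry may already be empty, so skip it
         * rather than asserting that it has data. */
        if (e->log->len > 0) {
            fwrite(e->log->unflushed, 1, e->log->len, out);
            flushed++;
        }
        elem_free(e);
    }
    return flushed;
}
```

In this sketch, adding three entries of which one has no cached data and then calling flush_unflushed() writes the two non-empty entries, frees all three nodes, and leaves the list empty; the empty entry is dropped silently instead of triggering an assertion.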
2015-05-22 13:24:11 James Hunt summary segfault in log.c code causes phone reboot loops [SRU] segfault in log.c code causes phone reboot loops
2015-06-26 19:54:35 Steve Langasek upstart (Ubuntu Vivid): status In Progress Fix Committed
2015-06-26 19:54:39 Steve Langasek bug added subscriber Ubuntu Stable Release Updates Team
2015-06-26 19:54:44 Steve Langasek bug added subscriber SRU Verification
2015-06-26 19:54:49 Steve Langasek tags hotfix patch hotfix patch verification-needed
2015-07-02 19:36:56 Ubuntu Foundations Team Bug Bot bug added subscriber Brian Murray
2015-07-02 19:37:01 Ubuntu Foundations Team Bug Bot tags hotfix patch verification-needed hotfix patch verification-failed verification-needed
2015-07-21 22:32:46 Adam Conrad tags hotfix patch verification-failed verification-needed hotfix patch verification-needed
2015-09-18 05:02:54 Mathew Hodson upstart (Ubuntu Vivid): importance Undecided Critical
2015-09-18 05:02:57 Mathew Hodson upstart (Ubuntu RTM): importance Undecided Critical
2015-09-18 05:02:59 Mathew Hodson upstart (Ubuntu Utopic): importance Undecided Critical
2016-04-24 05:29:37 Mathew Hodson upstart (Ubuntu Vivid): status Fix Committed Won't Fix
2016-05-12 13:48:03 Martin Pitt removed subscriber Ubuntu Stable Release Updates Team
2016-05-12 13:48:04 Martin Pitt removed subscriber SRU Verification
2016-05-12 13:48:04 Martin Pitt tags hotfix patch verification-needed hotfix patch
2016-09-01 14:12:22 Michael Schaller bug added subscriber Michael Schaller
2016-11-17 19:17:54 Justin King-Lacroix bug added subscriber Goobuntu Team
2016-11-17 19:18:02 Justin King-Lacroix bug added subscriber Justin King-Lacroix