After touch /forcefsck and reboot: Assertion failed in log_clear_unflushed

Bug #1096531 reported by Blair Zajac
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
upstart
Fix Released
High
James Hunt
upstart (Ubuntu)
Fix Released
High
James Hunt
Nominated for Trusty by Philipp Kern

Bug Description

I made a copy of raring-server-cloudimg-amd64-disk1.img, launched in KVM on my 12.10 system, updated all the packages on it and then rebooted it after touching /forcefsck. Even though its a clean reboot, it still core dumpes.

ProblemType: Bug
DistroRelease: Ubuntu 13.04
Package: upstart 1.6.1-1ubuntu1
ProcVersionSignature: Ubuntu 3.7.0-7.15-generic 3.7.0
Uname: Linux 3.7.0-7-generic x86_64
ApportVersion: 2.7-0ubuntu2
Architecture: amd64
Date: Sun Jan 6 06:39:07 2013
MarkForUpload: True
ProcEnviron:
 TERM=screen
 SHELL=/bin/bash
 PATH=(custom, no user)
 LANG=en_US.UTF-8
SourcePackage: upstart
UpgradeStatus: No upgrade log present (probably fresh install)

Related branches

Revision history for this message
Blair Zajac (blair) wrote :
summary: After touch /forcefsck and reboot: Assertion failed in
- log_clear_unflushed: after
+ log_clear_unflushed
Revision history for this message
Blair Zajac (blair) wrote :

The one issue with the KVM image is that it'll hang for 2 minutes upon boot and not tell you its 'ubuntu' user password.

I can reproduce this using the daily server ISO:

- Download rsync://cdimages.ubuntu.com/cdimage/ubuntu-server/daily/current/raring-server-amd64.iso
- Start VirtualBox
- Make a new VM with the raring-server-amd64.iso
- Do a minimal install
- Reboot into the new OS
- sudo touch /forcefsck
- sudo /sbin/shutdown -r now

Revision history for this message
James Hunt (jamesodhunt) wrote :

Thanks for reporting this issue Blair. I can confirm the problem is recreatable using the i386 raring-server image.

Changed in upstart (Ubuntu):
status: New → Confirmed
importance: Undecided → High
status: Confirmed → In Progress
Revision history for this message
James Hunt (jamesodhunt) wrote :

Investigation shows that this problem is only triggered by '/forcefsck': the actual problem results from jobs that satisfy both conditions below:

- *end* before /var/log/upstart/ becomes writeable.
- has spawned one or more processes that continue to run *after* the job has finished and after /var/log/upstart/
  has become writeable.

Problem is more reliably recreatable if a job such as the following is created:
____________________

start on startup
    # spawn a process quickly that allows the job itself to end fast,
    # but which also produces output after the job has finished.
    /bin/sh -c 'while :; do cat /etc/fstab; done' &
script

end script
____________________

On my test server system, the standard job that triggered the problem was network-interface: it calls ifup which spawns off dhclient which is still running after the network-interface instance job has finished.

Revision history for this message
James Hunt (jamesodhunt) wrote :

Work-arounds for this problem:

= Work-Around 1 =

Boot with '--no-log' on kernel command-line.

EFFECT: This will disable all job log output. This is potentially rather extreme, so users may wish to use work-around 2 below.

= Work-Around 2 =

1) Boot with '--no-log' on kernel command-line.

2) Disable the flush-early-job-log job:

    echo manual | sudo tee /etc/init/flush-early-job-log job.override

3) Reboot.

EFFECT: This will throw away data that jobs produce before the disk becomes writeable, which tends to be minimal since only a very few jobs start this early in the boot process.

File /etc/init/flush-early-job-log job.override should be deleted once a fix for this bug is available in Ubuntu.

Changed in upstart:
status: New → Fix Committed
status: Fix Committed → In Progress
importance: Undecided → High
Changed in upstart (Ubuntu):
assignee: nobody → James Hunt (jamesodhunt)
Changed in upstart:
assignee: nobody → James Hunt (jamesodhunt)
James Hunt (jamesodhunt)
Changed in upstart:
status: In Progress → Fix Committed
James Hunt (jamesodhunt)
Changed in upstart (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Please don't mark ubuntu bugs as "fix released", instead wait for the package to build and migrate into -release pocket, then the bug should get autoclosed from the changelog entry.

Changed in upstart (Ubuntu):
status: Fix Released → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package upstart - 1.6.1-1ubuntu2

---------------
upstart (1.6.1-1ubuntu2) raring; urgency=low

  [ Stéphane Graber ]
  * Call autoreconf so we can use the same packaging for snapshots as for
    release tarballs. Add depends on dh-autoreconf and autopoint.

  [ James Hunt ]
  * init/log.c: Cherry-pick of upstream fix for jobs which end before
    the log disk becomes writeable and which spawn off additional
    processes which persist after the main job process has ended
    (LP: #1096531).
 -- James Hunt <email address hidden> Tue, 22 Jan 2013 15:18:30 +0000

Changed in upstart (Ubuntu):
status: Fix Committed → Fix Released
James Hunt (jamesodhunt)
Changed in upstart:
status: Fix Committed → Fix Released
Revision history for this message
Philipp Kern (pkern) wrote :

My attempt to target this bug was obviously in error. I meant to update https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/1447756

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.