Some services not started on boot, runlevel returns "unknown"

Bug #543506 reported by Brian J. Murrell
This bug affects 93 people
Affects: upstart (Ubuntu)
Status: Confirmed
Importance: High
Assigned to: Unassigned
Nominated for Karmic by Mike Bianchi
Nominated for Lucid by Mike Bianchi

Bug Description

Binary package hint: upstart

On Lucid with:

# apt-cache policy upstart
upstart:
  Installed: 0.6.5-4
  Candidate: 0.6.5-4
  Version table:
 *** 0.6.5-4 0
        500 http://apt.interlinx.bc.ca lucid/main Packages
        100 /var/lib/dpkg/status

during boot, upstart just seems to stop starting jobs. A number of jobs are started, but the full set of jobs that should be started never completes. I end up (through having a job that starts a shell on a tty) having to manually run:

# service ssh start
# service rc-sysinit start
(at least)

to get the system up into a semblance of a multi-user run-level.

I'd be happy to provide more information but I'm at a bit of a loss as to how to debug this situation.

FWIW, my "debug shell" job which gives me a shell that I can debug upstart from is:

description "Start a debug shell"

start on startup

expect fork

script
    exec /root/open -c 12 /bin/bash
end script

I have no doubt it's probably not 100% correct as I'm really only just cribbing it up from what I can see in other jobs. Hopefully I have not exacerbated the problem with that job.

Scott James Remnant (Canonical) (canonical-scott) wrote : Re: Upstart not starting jobs on boot

Thanks for the report.

At your debug shell, could you run the "initctl list" command for me, and provide the output.

Also when booting, add --verbose to the kernel command line, and capture the output from that.

Thanks very much

Changed in upstart (Ubuntu):
status: New → Incomplete
importance: Undecided → Medium
summary: - not all jobs are started
+ Upstart not starting jobs on boot
Brian J. Murrell (brian-interlinx) wrote :

(Un)fortunately (for the sake of hunting this bug down) this does not seem to be occurring any more.

I guess we can just close this bug and I will reopen if I see it again.

Scott James Remnant (Canonical) (canonical-scott) wrote :

Thanks, please do!

Changed in upstart (Ubuntu):
status: Incomplete → Invalid
Wout (wout-bliepbliep) wrote :

Boot scripts under /etc/rc2.d seem to be run sometimes and other times not.

I found this problem as /etc/rc2.d/S99rc.local was not executed. To verify I added a logger statement (logger "Running rc.local") as the first command in rc.local some weeks ago (I should have filed this bug earlier but applications like apport and launchpad make this a very frustrating experience). Checking /var/log/syslog I find the log entry from rc.local for some boots but not for others.

It is not only rc.local that is not executed. The /etc/rc2.d/S20virtualbox-ose script is also not always run. This becomes blatantly obvious as VirtualBox can't run VMs in those cases and I have to run the virtualbox-ose script manually.

These cases seem to coincide (so both virtualbox-ose and rc.local are either both run or both not run).
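
The check described above can be sketched as a small shell function. This is only an illustration, not the script actually used here: the function name count_rc_local_runs is made up, and it assumes that every boot writes one kernel "Linux version" line to the same log file that receives the logger marker.

```shell
#!/bin/sh
# Sketch only. Assumptions not taken from the report: the log file to scan is
# passed as the first argument, and every boot logs exactly one kernel
# "Linux version" line to that same file.
count_rc_local_runs() {
    logfile="$1"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    boots=$(grep -c 'kernel:.*Linux version' "$logfile" || true)
    runs=$(grep -c -F 'Running rc.local' "$logfile" || true)
    echo "boots=$boots rc_local_runs=$runs"
}

# Example call on a live system (not run here):
#   count_rc_local_runs /var/log/syslog
```

If the two numbers differ, rc.local was skipped on at least one of the boots covered by that log file. Rotated logs are not examined.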

This morning rc.local and virtualbox-ose were not run. I ran virtualbox-ose manually later. This is the output of 'initctl list' after about 5 hours of uptime:

alsa-mixer-save stop/waiting
avahi-daemon start/running, process 3403
mountall-net stop/waiting
rc stop/waiting
rpc_pipefs stop/waiting
rsyslog start/running, process 3286
tty4 start/running, process 3506
udev start/running, process 2903
upstart-udev-bridge start/running, process 2883
ureadahead-other stop/waiting
apport stop/waiting
console-setup stop/waiting
hwclock-save stop/waiting
irqbalance stop/waiting
plymouth-log stop/waiting
tty5 start/running, process 3509
statd start/running, process 3454
atd start/running, process 3526
dbus start/running, process 3375
failsafe-x stop/waiting
plymouth stop/waiting
portmap start/running, process 3415
ssh start/running, process 3313
control-alt-delete stop/waiting
hwclock stop/waiting
network-manager start/running, process 3453
module-init-tools stop/waiting
bridge-network-interface (vboxnet0) start/running
bridge-network-interface (lo) start/running
bridge-network-interface (eth0) start/running
bridge-network-interface (wlan0) start/running
cron start/running, process 3527
gdm start/running, process 3579
mountall stop/waiting
acpid start/running, process 3522
plymouth-stop start/running
rcS stop/waiting
ufw start/running
mounted-varrun stop/waiting
rc-sysinit stop/waiting
anacron stop/waiting
tty2 start/running, process 3515
udevtrigger stop/waiting
mounted-dev stop/waiting
tty3 start/running, process 3516
udev-finish stop/waiting
hostname stop/waiting
mountall-reboot stop/waiting
gssd stop/waiting
mountall-shell stop/waiting
mounted-tmp stop/waiting
network-interface (vboxnet0) start/running
network-interface (lo) start/running
network-interface (eth0) start/running
network-interface (wlan0) start/running
plymouth-splash stop/waiting
tty1 start/running, process 3920
udevmonitor stop/waiting
dmesg stop/waiting
network-interface-security start/running
networking stop/waiting
procps stop/waiting
tty6 start/running, process 3518
idmapd stop/waiting
ureadahead stop/waiting

Let me know if you need more info.

Wout (wout-bliepbliep) wrote :

This is on 10.04 with all available updates installed.
Running non standard kernel (2.6.33-020633-generic) for hardware support.

Changed in upstart (Ubuntu):
status: Invalid → New
David (dogge2k-deactivatedaccount) wrote :

I've encountered the same problem with Mythbuntu 10.04 with all updates installed. I noticed it because lirc sometimes started and sometimes did not.
After some investigation I saw the following line in my syslog:
init: Failed to spawn rc main process: unable to open console: Input/output error

apt-cache policy upstart
upstart:
  Installed: 0.6.5-6
  Candidate: 0.6.5-6
  Version table:
 *** 0.6.5-6 0
        500 http://de.archive.ubuntu.com/ubuntu/ lucid/main Packages
        100 /var/lib/dpkg/status

Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 543506] Re: Upstart not starting jobs on boot

On Sat, 2010-04-24 at 09:15 +0000, Wout wrote:

> This is on 10.04 with all available updates installed.
> Running non standard kernel (2.6.33-020633-generic) for hardware support.
>
> ** Changed in: upstart (Ubuntu)
> Status: Invalid => New
>
Please do not reopen bugs if you are not the original reporter; it's
quite likely that you are experiencing a different problem. I'm closing
this bug again - if you are having problems, please file a *NEW* bug.

 status invalid

Scott
--
Scott James Remnant
<email address hidden>

Changed in upstart (Ubuntu):
status: New → Invalid
Wout (wout-bliepbliep) wrote :

Thanks for the message!

I have the same spawn rc message in my logs.
I see that in /etc/init/rc.conf it mentions 'console output' which makes
upstart try to open the console.
That fails. Probably a race condition. Possibly /dev/console is not yet
available when upstart wants to start rc?

I see you have a ticket open for this so no need for me to add anything (a
relief seeing how hostile Ubuntu appears to be to bug reporting).

Kind regards,
Wout


Javier López (javier-lopez) wrote : Re: Upstart not starting jobs on boot

I've just installed Lucid and I have the feeling that my problem is related to this bug. Sometimes, when Ubuntu starts I get this message:

dev/sda2: clean, 2312/231242 files, 213123/2311434 blocks (check in 4 mounts)
init: Failed to spawn rc-sysinit main process: unable to open console: Input/output error

The only way to fix it is to disable Plymouth by removing the "quiet splash" options from the kernel boot parameters.

For the record: I've installed Lucid from the Minimal CD Image and I'm using Slim as the login manager.

Should I open a new bug report?

Changed in upstart (Ubuntu):
status: Invalid → New
Brian J. Murrell (brian-interlinx) wrote :

Reopening this. I am seeing this problem very consistently on a Karmic->Lucid upgrade I did today. I will include the previously requested information. Before I do, however, I want to note that I can consistently make this machine hang on startup, after filesystems are mounted, by simply booting with the default kernel arguments.

However! I can consistently get this machine to boot to a full multi-user desktop by adding "init=/bin/bash" to the kernel command line and then, once the kernel has booted and started bash, exec(1)ing /sbin/init like so:

# exec /sbin/init
[ upstart starts and boots up normally ]

So to further clarify, something about having the kernel start bash as its init, and then starting upstart from that bash, makes it all boot just fine.

Here's "initctl list" when the boot has mounted filesystems but hung there:

alsa-mixer-save stop/waiting
avahi-daemon stop/waiting
mountall-net stop/waiting
qemu-kvm stop/waiting
rc stop/waiting
rpc_pipefs start/running
tty4 stop/waiting
udev start/running, process 639
upstart-udev-bridge start/running, process 621
ureadahead-other stop/waiting
apport stop/waiting
console-setup stop/waiting
hwclock-save stop/waiting
irqbalance stop/waiting
plymouth-log stop/waiting
tty5 stop/waiting
debug_shell start/running, process 589
statd start/running, process 1583
atd stop/waiting
dbus stop/waiting
failsafe-x stop/waiting
plymouth start/running, process 590
portmap start/running, process 1429
ssh stop/waiting
autofs stop/waiting
control-alt-delete stop/waiting
hwclock stop/waiting
network-manager stop/waiting
usplash stop/waiting
module-init-tools stop/waiting
bridge-network-interface (lo) start/running
bridge-network-interface (eth0) start/running
cron stop/waiting
gdm stop/waiting
mountall start/running, process 594
acpid stop/waiting
plymouth-stop stop/waiting
rcS stop/waiting
ufw start/running
mounted-varrun stop/waiting
rc-sysinit stop/waiting
anacron stop/waiting
tty2 stop/waiting
udevtrigger stop/waiting
mounted-dev stop/waiting
tty3 stop/waiting
udev-finish stop/waiting
hostname stop/waiting
mountall-reboot stop/waiting
mythtv-backend stop/waiting
gssd start/running, process 1579
mountall-shell stop/waiting
mounted-tmp stop/waiting
network-interface (lo) start/running
network-interface (eth0) start/running
plymouth-splash stop/waiting
tty1 stop/waiting
udevmonitor stop/waiting
network-interface-security start/running
networking stop/waiting
procps stop/waiting
tty6 stop/waiting
bootchart start/running, process 87
idmapd stop/waiting
ureadahead stop/waiting

Here's the boot output with "--verbose" on the kernel command line:

Begin: Running /scripts/local-bottom ...
Done.
Done.
Begin: Running /scripts/init-bottom ...
Done.
udevd-work[98]: inotify_add_watch(6, /dev/md0, 10) failed: No such file or directory

init: Handling startup event
init: debug_shell goal changed from stop to start
init: debug_shell state changed from waiting to starting
init: mountall goal changed from stop to start
init: mountall state changed from waiting to starting
init: hostname goal changed from stop to start
init: hostname state changed from waiting to starting
init: Handling starting event
in...

TuxInvader (tuxinvader) wrote :

Me too. Upstart is not starting all processes. My box has been running since last night without a firewall!

runlevel returns "unknown", and it seems to have forgotten a lot of scripts even exist....

root@swift:/etc/rc2.d# initctl list | grep postfix
root@swift:/etc/rc2.d# initctl list | grep cyrus

How do I set my default runlevel? Where is the upstart configuration? If I create an inittab will upstart read it?

TuxInvader (tuxinvader) wrote :

running: telinit 2
and then rebooting seems to have resolved the problem.

TuxInvader (tuxinvader) wrote :

It looks as though upstart couldn't figure out what the previous runlevel was (or maybe it was trying to use 6?). The output of "last" says the system crashed, but I don't think it did; maybe upstart/reboot failed to write to utmp properly?

My last log.....

runlevel (to lvl 2) 2.6.32-22-generi Tue May 11 14:33 - 14:56 (00:22)
reboot system boot 2.6.32-22-generi Tue May 11 14:33 - 14:56 (00:22)
shutdown system down 2.6.32-22-generi Tue May 11 14:32 - 14:33 (00:00)
shutdown system down 2.6.32-22-generi Tue May 11 14:32 - 14:32 (00:00)
runlevel (to lvl 6) 2.6.32-22-generi Tue May 11 14:32 - 14:32 (00:00)
runlevel (to lvl 2) 2.6.32-22-generi Tue May 11 14:31 - 14:32 (00:01)
reboot system boot 2.6.32-22-generi Tue May 11 14:31 - 14:32 (00:01)
myth tty7 :0 Tue May 11 13:59 - crash (00:32)
shutdown system down 2.6.32-22-generi Tue May 11 13:58 - 14:31 (00:33)
shutdown system down 2.6.32-22-generi Tue May 11 13:58 - 13:58 (00:00)
runlevel (to lvl 6) 2.6.32-22-generi Tue May 11 13:58 - 13:58 (00:00)
reboot system boot 2.6.32-22-generi Tue May 11 13:58 - 13:58 (00:00)
myth pts/0 :0.0 Mon May 10 20:48 - 23:26 (02:37)
myth tty7 :0 Mon May 10 20:48 - crash (17:10)
shutdown system down 2.6.32-22-generi Mon May 10 20:47 - 13:58 (17:11)
shutdown system down 2.6.32-22-generi Mon May 10 20:47 - 20:47 (00:00)
runlevel (to lvl 6) 2.6.32-22-generi Mon May 10 20:47 - 20:47 (00:00)
myth tty7 :0 Sun May 9 19:58 - down (1+00:48)
runlevel (to lvl 2) 2.6.32-22-generi Sun May 9 19:58 - 20:47 (1+00:49)
reboot system boot 2.6.32-22-generi Sun May 9 19:58 - 20:47 (1+00:49)
shutdown system down 2.6.32-21-generi Sun May 9 19:57 - 19:58 (00:00)

Brian J. Murrell (brian-interlinx) wrote :

This continues to happen on more than one of my karmic->lucid upgrades, very consistently, so I can gather data if needed.

Running telinit 2 doesn't help. That just runs the scripts in /etc/rc2.d, but the jobs in /etc/init are left un-run.

When the system has stopped starting jobs, here's what initctl list reports:

alsa-mixer-save stop/waiting
mountall-net stop/waiting
rc stop/waiting
rpc_pipefs stop/waiting
rsyslog stop/waiting
tty4 stop/waiting
udev start/running, process 455
upstart-udev-bridge start/running, process 429
ureadahead-other stop/waiting
apport stop/waiting
console-setup stop/waiting
hwclock-save stop/waiting
plymouth-log stop/waiting
tty5 stop/waiting
statd stop/waiting
dbus stop/waiting
failsafe-x stop/waiting
plymouth stop/waiting
portmap start/running, process 736
ssh stop/waiting
control-alt-delete stop/waiting
hwclock stop/waiting
module-init-tools stop/waiting
cron stop/waiting
mountall start/running, process 408
acpid stop/waiting
plymouth-stop stop/waiting
rcS stop/waiting
mounted-varrun stop/waiting
rc-sysinit stop/waiting
tty2 stop/waiting
udevtrigger stop/waiting
mounted-dev stop/waiting
tty3 stop/waiting
udev-finish stop/waiting
hostname stop/waiting
mountall-reboot stop/waiting
mysql stop/waiting
mythtv-backend stop/waiting
gssd stop/waiting
mountall-shell stop/waiting
mounted-tmp stop/waiting
network-interface (lo) start/running
network-interface (eth0) start/running
plymouth-splash stop/waiting
tty1 stop/waiting
udevmonitor stop/waiting
dmesg stop/waiting
network-interface-security start/running
networking stop/waiting
procps stop/waiting
tty6 stop/waiting
idmapd stop/waiting
ureadahead stop/waiting

And ps axf reports:

  PID TTY STAT TIME COMMAND
    2 ? S 0:00 [kthreadd]
    3 ? S 0:00 \_ [migration/0]
    4 ? S 0:00 \_ [ksoftirqd/0]
    5 ? S 0:00 \_ [watchdog/0]
    6 ? S 0:00 \_ [events/0]
    7 ? S 0:00 \_ [cpuset]
    8 ? S 0:00 \_ [khelper]
    9 ? S 0:00 \_ [netns]
   10 ? S 0:00 \_ [async/mgr]
   11 ? S 0:00 \_ [pm]
   12 ? S 0:00 \_ [sync_supers]
   13 ? S 0:00 \_ [bdi-default]
   14 ? S 0:00 \_ [kintegrityd/0]
   15 ? S 0:00 \_ [kblockd/0]
   16 ? S 0:00 \_ [kacpid]
   17 ? S 0:00 \_ [kacpi_notify]
   18 ? S 0:00 \_ [kacpi_hotplug]
   19 ? S 0:00 \_ [ata/0]
   20 ? S 0:00 \_ [ata_aux]
   21 ? S 0:00 \_ [ksuspend_usbd]
   22 ? S 0:00 \_ [khubd]
   23 ? S 0:00 \_ [kseriod]
   24 ? S 0:00 \_ [kmmcd]
   29 ? S 0:00 \_ [khungtaskd]
   30 ? S 0:00 \_ [kswapd0]
   31 ? SN 0:00 \_ [ksmd]
   32 ? S 0:00 \_ [aio/0]
   33 ? S 0:00 \_ [ecryptfs-kthrea]
   34 ? S 0:00 \_ [crypto/0]
   38 ? S 0:00 \_ [kstriped]
   39 ? S 0:00 \_ [kmpathd/0]
   40 ? S 0:00 \_ [kmpath_handlerd]
   41 ? S 0:00 \_ [ksnapd]
   42 ? S 0:00 \_ [kondemand/0]
   43 ? S 0:00 \_ [kconservative/...


staedtler-przyborski (staedtler-przyborski-deactivatedaccount) wrote :

Seems I'm affected by this bug too...

Sometimes (not always) sound and printing (CUPS) are missing; also, I can shut down only by pressing the power button.

Output:

initctl list
alsa-mixer-save stop/waiting
avahi-daemon start/running, process 751
mountall-net stop/waiting
rc stop/waiting
rsyslog start/running, process 725
tty4 start/running, process 811
udev start/running, process 372
upstart-udev-bridge start/running, process 369
ureadahead-other stop/waiting
apport stop/waiting
console-setup stop/waiting
hwclock-save stop/waiting
irqbalance stop/waiting
plymouth-log stop/waiting
tty5 start/running, process 814
atd start/running, process 839
dbus start/running, process 736
failsafe-x stop/waiting
plymouth stop/waiting
control-alt-delete stop/waiting
hwclock stop/waiting
network-manager start/running, process 744
module-init-tools stop/waiting
cron start/running, process 838
gdm start/running, process 873
mountall stop/waiting
acpid start/running, process 826
plymouth-stop stop/waiting
rcS stop/waiting
ufw start/running
mounted-varrun stop/waiting
rc-sysinit stop/waiting
anacron stop/waiting
tty2 start/running, process 819
udevtrigger stop/waiting
mounted-dev stop/waiting
tty3 start/running, process 820
udev-finish stop/waiting
hostname stop/waiting
mountall-reboot stop/waiting
mountall-shell stop/waiting
mounted-tmp stop/waiting
network-interface (lo) start/running
network-interface (eth0) start/running
network-interface (wlan0) start/running
plymouth-splash stop/waiting
tty1 start/running, process 1281
udevmonitor stop/waiting
dmesg stop/waiting
network-interface-security start/running
networking stop/waiting
procps stop/waiting
tty6 start/running, process 822
ureadahead stop/waiting

Ash (abostock) wrote :

I'm seeing the same problem after doing a mythbuntu 9.10 to 10.04 upgrade.
apache2 sometimes starts at boot other times it doesn't.

Paul Done (pkdone-hotmail) wrote :

Same problem for me on a fresh 10.04 install. For me the problematic boot job is cupsd. Sometimes cups starts at boot time, sometimes it doesn't. There's a cups-specific boot job bug report here: https://bugs.launchpad.net/ubuntu/+source/cups/+bug/554172?comments=all

Ash (abostock) wrote :

I've seen cups fail to start on my father's machine. I'm also noticing issues with LCDd and yesterday lirc didn't start, so it's definitely affecting several different boot scripts.

Laurent Van Basselaere (lvanbass) wrote :

Same problem as comment #16, with apache2 not starting, but on a fresh 10.04 install.

> initctl list
alsa-mixer-save stop/waiting
avahi-daemon start/running, process 744
mountall-net stop/waiting
rc stop/waiting
rsyslog start/running, process 725
tty4 stop/waiting
udev start/running, process 307
upstart-udev-bridge start/running, process 305
ureadahead-other stop/waiting
apport stop/waiting
console-setup stop/waiting
hwclock-save stop/waiting
irqbalance stop/waiting
plymouth-log stop/waiting
tty5 stop/waiting
atd stop/waiting
dbus start/running, process 731
failsafe-x stop/waiting
plymouth stop/waiting
control-alt-delete stop/waiting
hwclock stop/waiting
network-manager start/running, process 741
module-init-tools stop/waiting
cron stop/waiting
gdm start/running, process 740
mountall stop/waiting
acpid stop/waiting
plymouth-stop stop/waiting
rcS stop/waiting
ufw start/running
mounted-varrun stop/waiting
rc-sysinit stop/waiting
anacron start/running, process 1288
tty2 stop/waiting
udevtrigger stop/waiting
mounted-dev stop/waiting
tty3 stop/waiting
udev-finish stop/waiting
hostname stop/waiting
mountall-reboot stop/waiting
mountall-shell stop/waiting
mounted-tmp stop/waiting
network-interface (lo) start/running
network-interface (eth0) start/running
network-interface (wlan0) start/running
plymouth-splash stop/waiting
tty1 stop/waiting
udevmonitor stop/waiting
dmesg stop/waiting
network-interface-security start/running
networking stop/waiting
procps stop/waiting
tty6 stop/waiting
ureadahead stop/waiting

Ben Lings (ben-lings) wrote :

I'm having intermittent problems with some services not starting on boot. From what I can tell, when it doesn't work, anything run as a result of the upstart rc-sysinit job doesn't happen.

When the problem happens, I get the following symptoms:

runlevel 'unknown' (from running `runlevel`)
cron not started
apache2 not started

When the problem hasn't happened, these services do start and runlevel is 'N 2'.
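
The distinction between the two outcomes can be sketched as a small shell function. This is only an illustration: the function name classify_runlevel and the labels it prints are made up for this example, and it only looks at the two values mentioned in this report.

```shell
#!/bin/sh
# Sketch only: classify the text printed by `runlevel`.
# The labels "affected", "normal" and "other" are made up for this example.
classify_runlevel() {
    case "$1" in
        unknown) echo "affected" ;;   # symptom described in this report
        "N 2")   echo "normal" ;;     # value reported on good boots
        *)       echo "other" ;;      # anything else needs a closer look
    esac
}

# Example call on a live system (not run here):
#   classify_runlevel "$(runlevel)"
```

Passing the text as an argument, rather than calling runlevel inside the function, keeps the check usable on output copied from another machine.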

The last time this happened was after a kernel update (maybe this makes a race condition happen?). Restarting fixed the problem. The system is running MythTV so only gets restarted when required for updates.

Ben Lings (ben-lings) wrote :

Another data point. As mentioned above, I'm running mythtv. The backend service for mythtv has the following in /etc/init/myth-backend.conf:

start on (local-filesystems and net-device-up IFACE=lo)

This service gets started even when none of the runlevel 2 services get started.

Oscar Forsström (biggles) wrote :

I am experiencing the exact same problem (well, different services, but the same symptoms). Sometimes all of the services in rc2 start, sometimes just a few.

Runlevel outputs:
runlevel 'unknown'

If I change to runlevel 2 by issuing init 2, the services start.

I tried both an upgrade from Karmic to Lucid and then a fresh reinstall from the minimal 10.04 image.

Pascal W. (info-p-development) wrote :

I'm also experiencing this problem on a fresh 10.04 desktop installation.

Quite often none of my installed services (dropbear ssh, tightvncserver and sabnzbd) are started. This is very annoying to me because my Ubuntu box is running headless.

Wout (wout-bliepbliep) wrote : Re: [Bug 543506] Re: Upstart not starting jobs on boot

A long way back in this thread it was reported that the below message would
appear in /var/log/messages when upstart had problems running
/etc/rc.sysinit. Is anyone still seeing this?

init: Failed to spawn rc-sysinit main process: unable to open console:
Input/output error

Wout


Scott James Remnant (Canonical) (canonical-scott) wrote : Re: Upstart not starting jobs on boot

Thanks for the reports.

Could everybody who is affected by this bug confirm that when they have missing services, runlevel *always* returns "unknown" and never "N 2"? If you see "N 2" you may have a different problem, so please shout out.

Please also attach /var/log/boot.log
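
For anyone gathering this data, the pieces asked for in this report (the runlevel output, the "initctl list" output and /var/log/boot.log) could be collected in one step with something like the sketch below. The function name collect_boot_diagnostics, the output file names and the directory argument are all made up for this example; nothing here is an official tool.

```shell
#!/bin/sh
# Sketch only. The function name, the output file names and the directory
# argument are assumptions made for this example.
collect_boot_diagnostics() {
    outdir="$1"
    mkdir -p "$outdir" || return 1

    # Save the output of `runlevel`; ignore its exit status.
    if command -v runlevel >/dev/null 2>&1; then
        runlevel > "$outdir/runlevel.txt" 2>&1 || true
    else
        echo "runlevel: command not found" > "$outdir/runlevel.txt"
    fi

    # Save the Upstart job list.
    if command -v initctl >/dev/null 2>&1; then
        initctl list > "$outdir/initctl-list.txt" 2>&1 || true
    else
        echo "initctl: command not found" > "$outdir/initctl-list.txt"
    fi

    # Copy the boot log if it is readable by the current user.
    if [ -r /var/log/boot.log ]; then
        cp /var/log/boot.log "$outdir/boot.log"
    else
        echo "/var/log/boot.log: not readable" > "$outdir/boot.log"
    fi
}

# Example call right after a failed boot (not run here):
#   collect_boot_diagnostics /tmp/bug-543506
```

Running it as root right after a failed boot, before starting any services by hand, should give the most useful snapshot.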

Changed in upstart (Ubuntu):
status: New → Incomplete
importance: Medium → High
Ash (abostock) wrote :

Booting up apache2 failed to start.

$ runlevel
unknown

/var/log/boot.log is attached

Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 543506] Re: Upstart not starting jobs on boot

On Mon, 2010-05-17 at 12:00 +0000, Ash wrote:

> Booting up apache2 failed to start.
>
> $ runlevel
> unknown
>
> /var/log/boot.log is attached
>
Thanks, could you (and everyone else) also attach the output of
"initctl list" after the failed boot

Scott
--
Scott James Remnant
<email address hidden>

Ash (abostock) wrote : Re: Upstart not starting jobs on boot

This is from a different boot but same symptoms...

$ cat /var/log/boot.log
fsck from util-linux-ng 2.17.2
/dev/sda1: clean, 217843/45604864 files, 93960226/182391961 blocks
init: ureadahead-other main process (769) terminated with status 4

$ initctl list
alsa-mixer-save stop/waiting
mountall-net stop/waiting
nmbd start/running, process 1094
rc stop/waiting
rpc_pipefs start/running
rsyslog start/running, process 804
tty4 stop/waiting
udev start/running, process 396
upstart-udev-bridge start/running, process 393
ureadahead-other stop/waiting
apport stop/waiting
console-setup stop/waiting
hwclock-save stop/waiting
irqbalance stop/waiting
plymouth-log stop/waiting
smbd start/running, process 789
tty5 stop/waiting
statd start/running, process 753
atd stop/waiting
dbus start/running, process 811
failsafe-x stop/waiting
plymouth stop/waiting
portmap start/running, process 701
ssh start/running, process 806
control-alt-delete stop/waiting
hwclock stop/waiting
network-manager start/running, process 814
usplash stop/waiting
module-init-tools stop/waiting
cron stop/waiting
gdm start/running, process 882
mountall stop/waiting
acpid stop/waiting
plymouth-stop stop/waiting
rcS stop/waiting
ufw start/running
mounted-varrun stop/waiting
rc-sysinit stop/waiting
tty2 stop/waiting
udevtrigger stop/waiting
mounted-dev stop/waiting
tty3 stop/waiting
udev-finish stop/waiting
hostname stop/waiting
mountall-reboot stop/waiting
mysql start/running, process 794
mythtv-backend start/running, process 1372
gssd stop/waiting
mountall-shell stop/waiting
mounted-tmp stop/waiting
network-interface (lo) start/running
network-interface (eth0) start/running
plymouth-splash stop/waiting
tty1 stop/waiting
udevmonitor stop/waiting
dmesg stop/waiting
network-interface-security start/running
networking stop/waiting
procps stop/waiting
tty6 stop/waiting
idmapd start/running, process 902
ureadahead stop/waiting

summary: - Upstart not starting jobs on boot
+ Some services not started on boot, runlevel returns "unknown"
Mike Bianchi (mbianchi-foveal) wrote :

initctl list attachment
from an AMD 64-bit quad core, running 3 cores

Mike Bianchi (mbianchi-foveal) wrote :

I'm moving my participation in the runlevel unknown problem to here from Bug #581291.

See that bug for my partial success with using init='/bin/sbin --verbose' .

Oscar Forsström (biggles) wrote :

cat /var/log/boot.log
fsck from util-linux-ng 2.17.2
fsck from util-linux-ng 2.17.2
udevd[345]: BUS= will be removed in a future udev version, please use SUBSYSTEM= to match the event device, or SUBSYSTEMS= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:9

udevd[345]: SYSFS{}= will be removed in a future udev version, please use ATTR{}= to match the event device, or ATTRS{}= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:9

udevd[345]: BUS= will be removed in a future udev version, please use SUBSYSTEM= to match the event device, or SUBSYSTEMS= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:10

udevd[345]: SYSFS{}= will be removed in a future udev version, please use ATTR{}= to match the event device, or ATTRS{}= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:10

udevd[345]: BUS= will be removed in a future udev version, please use SUBSYSTEM= to match the event device, or SUBSYSTEMS= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:12

udevd[345]: SYSFS{}= will be removed in a future udev version, please use ATTR{}= to match the event device, or ATTRS{}= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:12

udevd[345]: BUS= will be removed in a future udev version, please use SUBSYSTEM= to match the event device, or SUBSYSTEMS= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:13

udevd[345]: SYSFS{}= will be removed in a future udev version, please use ATTR{}= to match the event device, or ATTRS{}= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:13

udevd[345]: BUS= will be removed in a future udev version, please use SUBSYSTEM= to match the event device, or SUBSYSTEMS= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:15

udevd[345]: SYSFS{}= will be removed in a future udev version, please use ATTR{}= to match the event device, or ATTRS{}= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:15

udevd[345]: BUS= will be removed in a future udev version, please use SUBSYSTEM= to match the event device, or SUBSYSTEMS= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:16

udevd[345]: SYSFS{}= will be removed in a future udev version, please use ATTR{}= to match the event device, or ATTRS{}= to match a parent device, in /lib/udev/rules.d/40-alsa-firmware-loaders.rules:16

/dev/sda1: clean, 80063/1221600 files, 553749/4883752 blocks (check in 3 mounts)
/dev/sda5: clean, 49534/1160992 files, 786663/4638760 blocks (check in 5 mounts)
udevd[345]: SYSFS{}= will be removed in a future udev version, please use ATTR{}= to match the event device, or ATTRS{}= to match a parent device, in /etc/udev/rules.d/70-persistent-lirc.rules:1

udevd[345]: SYSFS{}= will be removed in a future udev version, please use ATTR{}= to match the event device, or ATTRS{}= to match a parent device, in /etc/udev/rules.d/70-persistent-lirc.rules:2

init: ureadahead-other main process (616) terminated with status 4
lircd-0.8.6[649]: lircd(default) ready, using /va...

Revision history for this message
Oscar Forsström (biggles) wrote :

runlevel
N 2

 initctl list
alsa-mixer-save stop/waiting
avahi-daemon start/running, process 753
mountall-net stop/waiting
nmbd start/running, process 995
rc stop/waiting
rsyslog start/running, process 697
tty4 start/running, process 794
udev start/running, process 353
upstart-udev-bridge start/running, process 343
ureadahead-other stop/waiting
apport stop/waiting
console-setup stop/waiting
hwclock-save stop/waiting
irqbalance stop/waiting
plymouth-log stop/waiting
smbd start/running, process 683
tty5 start/running, process 797
atd start/running, process 811
dbus start/running, process 737
failsafe-x stop/waiting
plymouth stop/waiting
ssh start/running, process 979
control-alt-delete stop/waiting
hwclock stop/waiting
module-init-tools stop/waiting
cron start/running, process 810
mountall stop/waiting
acpid start/running, process 809
plymouth-stop stop/waiting
rcS stop/waiting
ufw start/running
mounted-varrun stop/waiting
rc-sysinit stop/waiting
tty2 start/running, process 801
udevtrigger stop/waiting
mounted-dev stop/waiting
tty3 start/running, process 802
udev-finish stop/waiting
hostname stop/waiting
mountall-reboot stop/waiting
mountall-shell stop/waiting
mounted-tmp stop/waiting
network-interface (lo) start/running
network-interface (eth0) start/running
plymouth-splash stop/waiting
tty1 start/running, process 997
udevmonitor stop/waiting
dmesg stop/waiting
network-interface-security start/running
networking stop/waiting
procps stop/waiting
tty6 start/running, process 804
ureadahead stop/waiting

Revision history for this message
Oscar Forsström (biggles) wrote :

And now, after another restart, I get: runlevel: unknown

Revision history for this message
Pascal W. (info-p-development) wrote :

Runlevel unknown; for the initctl output, see the attachment.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 543506] Re: Some services not started on boot, runlevel returns "unknown"

On Mon, 2010-05-17 at 17:41 +0000, Oscar Forsström wrote:

> runlevel
> N 2
>
Was this a working boot?

Scott
--
Scott James Remnant
<email address hidden>

Revision history for this message
Oscar Forsström (biggles) wrote :

No, it wasn't. For me it's often like this.

1. Non-working boot with runlevel N 2.
2. Reboot. Non-working boot. runlevel unknown.
3. Reboot. Non-working boot. runlevel unknown.
4. Reboot. Working boot. Runlevel N 2

Revision history for this message
Christoph A. (ceage) wrote :

Problem occurred after 10.04 upgrade...
The details:

$ cat /var/log/boot.log
fsck from util-linux-ng 2.17.2
fsck from util-linux-ng 2.17.2
/dev/mapper/Odin-root: clean, 113846/29908992 files, 3292621/119608320 blocks
/dev/sda5: clean, 33/124496 files, 32040/248976 blocks
init: ureadahead-other main process (875) terminated with status 4

Attached is a file containing two initctl list outputs, the first one after booting into runlevel unknown and the second after a working boot.

However, on my system, each time after reinstalling the resolvconf package via sudo apt-get install --reinstall resolvconf, I ended up with a working boot (it was no longer installed after the upgrade). There is also a bug/question about this package under Bug #488945.

Don't know if this helps at all...

Revision history for this message
Laurent Van Basselaere (lvanbass) wrote :

First boot of the morning: apache2 not started

lvanbass@navets:~$ runlevel
unknown

lvanbass@navets:~$ cat /var/log/boot.log
fsck from util-linux-ng 2.17.2
/dev/sda1: clean, 182097/915712 files, 980283/3661824 blocks
fsck from util-linux-ng 2.17.2
/dev/sda6: clean, 85816/1917600 files, 3940320/7670272 blocks

lvanbass@navets:~$ initctl list
alsa-mixer-save stop/waiting
avahi-daemon start/running, process 787
mountall-net stop/waiting
rc stop/waiting
rsyslog start/running, process 758
tty4 stop/waiting
udev start/running, process 325
upstart-udev-bridge start/running, process 323
ureadahead-other stop/waiting
apport stop/waiting
console-setup stop/waiting
hwclock-save stop/waiting
irqbalance stop/waiting
plymouth-log stop/waiting
tty5 stop/waiting
atd stop/waiting
dbus start/running, process 773
failsafe-x stop/waiting
plymouth stop/waiting
control-alt-delete stop/waiting
hwclock stop/waiting
network-manager start/running, process 782
module-init-tools stop/waiting
cron stop/waiting
gdm start/running, process 778
mountall stop/waiting
acpid stop/waiting
plymouth-stop stop/waiting
rcS stop/waiting
ufw start/running
mounted-varrun stop/waiting
rc-sysinit stop/waiting
anacron stop/waiting
tty2 stop/waiting
udevtrigger stop/waiting
mounted-dev stop/waiting
tty3 stop/waiting
udev-finish stop/waiting
hostname stop/waiting
mountall-reboot stop/waiting
mountall-shell stop/waiting
mounted-tmp stop/waiting
network-interface (lo) start/running
network-interface (eth0) start/running
network-interface (wlan0) start/running
plymouth-splash stop/waiting
tty1 stop/waiting
udevmonitor stop/waiting
dmesg stop/waiting
network-interface-security start/running
networking stop/waiting
procps stop/waiting
tty6 stop/waiting
ureadahead stop/waiting

Revision history for this message
Sean Mc Allister (sean-gmx) wrote :

I reconfigured my network interface to use static settings and seem to get reliable boots now.

Revision history for this message
Christoph A. (ceage) wrote :

All my network interfaces were configured static by default, still I get the 'runlevel unknown'. There might be different problems around, each resulting in a 'runlevel unknown'.
As I posted already, for me it looks like reinstalling the resolvconf package each time really helps. I haven't had a single runlevel failure after doing so. Figuring out what happens during the (re)install of the resolvconf package might help fix the runlevel problem...

Revision history for this message
Michael Goetze (mgoetze) wrote :

Looking at the ttyX jobs as well as xinetd (a service started by a legacy initscript), I've so far seen all of the following combinations:
- None started
- xinetd started, but none of the ttys
- both xinetd and all the ttys started
If this is caused by a race condition, then network-manager is not involved, as that is not installed on my servers.
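
The tty comparison above can be done from saved "initctl list" output. What follows is only a rough sketch: the helper names stopped_jobs and stopped_tty_count are made up for illustration, and it assumes the two-column "name state" layout shown in the listings in this report.

```shell
# Hypothetical helper: read `initctl list` output on stdin and print the
# names of all jobs that are in the stop/waiting state.
stopped_jobs() {
    awk '$2 == "stop/waiting" { print $1 }'
}

# Hypothetical helper: count how many ttyN jobs are stopped.
stopped_tty_count() {
    stopped_jobs | grep -c '^tty[0-9][0-9]*$'
}

# Possible usage on an affected machine:
#   initctl list | stopped_tty_count
```

A count of 0 would correspond to the "all the ttys started" case, and a count equal to the number of tty jobs to the "none of the ttys" case.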

Revision history for this message
Mike Bianchi (mbianchi-foveal) wrote : Re: <OK> [Bug 543506] Re: Some services not started on boot, runlevel returns "unknown"

On Tue, Jun 01, 2010 at 03:49:30PM -0000, Michael Goetze wrote:
> Looking at the ttyX jobs as well as xinetd (a service started by a legacy initscript), I've so far seen all of the following combinations:
> - None started
> - xinetd started, but none of the ttys
> - both xinetd and all the ttys started
> If this is caused by a race condition, then network-manager is not involved, as that is not installed on my servers.

Look at earlier comments in this Bug about commenting out "console output" in /etc/init/*.conf files.

--
 Mike Bianchi
 Foveal Systems

 973 822-2085

 <email address hidden>
 http://www.AutoAuditorium.com
 http://www.FovealMounts.com

Revision history for this message
Christoph A. (ceage) wrote :

Thanks Mike, I've had reliable boots so far with your workaround! By forcing ureadahead to reprofile on system boot I was somehow avoiding that race condition.

Revision history for this message
Francisco.X (fxbeta) wrote :

Same problem here. Runlevel "unknown", failed to start most services. /var/log/boot.log:

fsck from util-linux-ng 2.17.2
fsck from util-linux-ng 2.17.2
/dev/sdb1: clean, 151983/499968 files, 714882/1999872 blocks
/dev/sdb3: clean, 1183/29900800 files, 101518470/119596544 blocks
init: ureadahead-other main process (625) terminated with status 4

This is on a single core Celeron, fresh 10.04 install + apt-get upgrade. Posting in response to Mike's question #47 wondering if it only happens on multi-core machines.

I've commented out the "console output" in /etc/init/*.conf files and will reboot when I am able to. Hopefully that will help until a proper fix is rolled out. Any updates from the devs on a real fix?

Revision history for this message
Brian J. Murrell (brian-interlinx) wrote : Re: [Bug 543506] Re: Some services not started on boot, runlevel returns "unknown"

Why have the developer(s) gone totally silent on this issue? Surely
there is enough data in this report now to understand the problem(s) and
issue (a) fix(es), yes?

Revision history for this message
Francisco.X (fxbeta) wrote :

Sigh. I commented out the "console output" line in the four files in /etc/init that had it:

/etc/init: grep "console output" *
mountall.conf:#console output
rc.conf:#console output
rc-sysinit.conf:#console output
ufw.conf:#console output

I have rebooted three times in a row and every time it's the same runlevel unknown. So that wasn't a fix.
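
For anyone who wants to repeat the "console output" workaround without editing each file by hand, here is a minimal sketch. The function name comment_console_output is hypothetical, and it assumes the stanza sits on a line of its own, as in the grep output above. As this comment shows, it is not a guaranteed fix.

```shell
# Hypothetical helper: comment out every "console output" stanza in the
# *.conf files of the given directory (for example /etc/init).
comment_console_output() {
    dir=$1
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue
        # Skip files with no active (uncommented) stanza.
        grep -q '^[[:space:]]*console output' "$f" || continue
        tmp=$(mktemp) || return 1
        sed 's/^\([[:space:]]*\)console output/\1#console output/' "$f" > "$tmp" &&
            cat "$tmp" > "$f"   # overwrite in place, keeping owner and mode
        rm -f "$tmp"
    done
}

# Possible usage, as root, after taking a copy of the directory:
#   cp -a /etc/init /root/init.backup
#   comment_console_output /etc/init
```

Running it a second time changes nothing, because lines that are already commented out no longer match.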

Other bits of info (in case they help):

- this machine has several firewire and usb hard drives attached. the OS is on an internal SATA drive.
- this machine has two network interfaces, one is static IP, the other dynamic (DSL - this is my firewall)

 I just installed the OS a few days ago and I don't think I've had a successful boot even once. I'll give this a few more days before I just blow away 10.04 and reinstall 08.04 which took longer to boot but was rock solid. In the meantime, I could do some tests on Saturday and try to dig up more clues. It does appear my case is 100% repeatable at the moment.

Revision history for this message
Mike Bianchi (mbianchi-foveal) wrote : Re: <OK> [Bug 543506] Re: Some services not started on boot, runlevel returns "unknown"

On Thu, Jun 03, 2010 at 02:36:14PM -0000, Francisco.X wrote:
> Sigh. I commented out the "console output" line in the four files in
> /etc/init that had it:
>
> /etc/init: grep "console output" *
> mountall.conf:#console output
> rc.conf:#console output
> rc-sysinit.conf:#console output
> ufw.conf:#console output
>
> I have rebooted three times in a row and every time it's the same
> runlevel unknown. So that wasn't a fix.
> :

Look at /var/log/boot.log and /var/log/syslog after a
 runlevel unknown boot and see if there are any error messages labeled
"init:"

Also, my fix included making /etc/init/rc-sysinit.conf sensitive to
 rsyslog .

 start on ( filesystem
  and started rsyslog
  and net-device-up IFACE=lo
 )

Please try that.
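
To confirm that the edit above really ended up in the job file, something like the following sketch could be used. The helper names start_on_stanza and has_rsyslog_condition are invented here, and the sketch assumes the stanza begins with "start on" at the start of a line and may continue over several lines inside parentheses, as in the example above.

```shell
# Hypothetical helper: print the "start on" stanza of an Upstart job file,
# following it across lines while parentheses remain open.
start_on_stanza() {
    awk '
        /^[ \t]*start on/ { show = 1; depth = 0 }
        show {
            print
            # gsub returns the number of matches; the text is unchanged.
            depth += gsub(/[(]/, "(") - gsub(/[)]/, ")")
            if (depth <= 0) show = 0
        }
    ' "$1"
}

# Hypothetical helper: succeed if the stanza mentions "started rsyslog".
has_rsyslog_condition() {
    start_on_stanza "$1" | grep -q 'started rsyslog'
}

# Possible usage:
#   start_on_stanza /etc/init/rc-sysinit.conf
#   has_rsyslog_condition /etc/init/rc-sysinit.conf && echo "rsyslog condition present"
```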

If it still fails, attach /var/log/boot.log and /var/log/syslog to this
Bug.
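
A small sketch for pulling the "init:" lines out of both logs before attaching them. The function name init_messages is hypothetical, and the pattern assumes the messages contain "init: " either at the start of the line (as in boot.log) or after the usual syslog prefix.

```shell
# Hypothetical helper: print lines logged by init from the given log
# files. The extra /dev/null argument makes grep prefix each match with
# its file name even when only one log file is given.
init_messages() {
    grep -E '(^|[[:space:]])init: ' "$@" /dev/null
}

# Possible usage after a "runlevel unknown" boot:
#   init_messages /var/log/boot.log /var/log/syslog
```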

Revision history for this message
Francisco.X (fxbeta) wrote :

/var/log/boot.log:
init: ureadahead-other main process (635) terminated with status 4

There are a number of init messages in /var/log/syslog from my last reboot, related to ssh respawning too fast. I'll double check the differences between the default ssh config files and my edited version to see if there's anything there.

I have made the change to rc-sysinit.conf and will try rebooting tonight and see how it goes. Thank you.

Revision history for this message
Mike Bianchi (mbianchi-foveal) wrote :

On Thu, Jun 03, 2010 at 03:32:25PM -0000, Francisco.X wrote:
> /var/log/boot.log:
> init: ureadahead-other main process (635) terminated with status 4
>
> There are a number of init messages in /var/log/syslog from my last
> reboot, related to ssh respawning too fast. I'll double check the
> differences between the default ssh config files and my edited version
> to see if there's anything there.
>
> I have made the change to rc-sysinit.conf and will try rebooting tonight
> and see how it goes. Thank you.

There seems to be some sort of issue with ureadahead :
 https://bugs.launchpad.net/ubuntu/+source/ureadahead/+bug/501715

You might want to read that bug and see if there is a related issue there.
(I know nothing about it, except that we have seen it mentioned in comments
#28 and #69).

Revision history for this message
Francisco.X (fxbeta) wrote :

I have been reading about ureadahead today as well. I was already thinking of trying to disable it to see if it helps, but I'll try the reboot with the tweaked rc-sysinit.conf first, so if things work I at least will know which of the two variables made the difference. Thanks.

Revision history for this message
Francisco.X (fxbeta) wrote :

No luck. I've tried the suggested rc-sysinit.conf modifications. Tried other modifications suggested elsewhere. Tried disabling ureadahead. Tried several other things and I still get the same thing on every reboot. I haven't tried unplugging all the USB and firewire drives and/or the second network card, but since those are the main functions of this particular computer it wouldn't be a satisfactory solution for me anyway. Wiping the OS partition and reinstalling 08.04 seems like the most pain free way to go.

In the old init.d system, it was easy for me to add something like an echo or syslog message to any script to know when it was being executed. How can I do the same so I can know if /etc/init/rc-sysinit.conf is even run during boot?

Revision history for this message
Francisco.X (fxbeta) wrote :

I'm attaching a syslog from one of the earlier bad bootups. The /var/log/boot.log file has been empty since I disabled ureadahead.

Revision history for this message
Mike Bianchi (mbianchi-foveal) wrote :

On Fri, Jun 04, 2010 at 02:29:49AM -0000, Francisco.X wrote:
> :
> In the old init.d system, it was easy for me to add something like an
> echo or syslog message to any script to know when it was being executed.
> How can I do the same so I can know if /etc/init/rc-sysinit.conf is even
> run during boot?

See Bug #581291.

To get entries in /var/log/boot.log I've been running with

    init='/sbin/init --verbose'

added to the linux /boot/vmlinuz/... line in /boot/grub/grub.cfg .
Modifying /etc/default/grub is one way to do that.
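
As a sketch of the /etc/default/grub route: the function name add_kernel_arg is hypothetical, and the sketch assumes the file has a GRUB_CMDLINE_LINUX="..." line on a single line, which is the usual layout but is not shown in this report. How GRUB and the kernel treat the quoting of the init= argument has not been checked here.

```shell
# Hypothetical helper: append one argument to GRUB_CMDLINE_LINUX in the
# given file, unless that line already contains it. The argument must
# not contain "&" or backslashes, which awk's sub() treats specially.
add_kernel_arg() {
    file=$1
    arg=$2
    tmp=$(mktemp) || return 1
    awk -v arg="$arg" '
        /^GRUB_CMDLINE_LINUX="/ && index($0, arg) == 0 {
            sub(/"$/, " " arg "\"")
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Possible usage, as root, with the argument quoted in this comment:
#   add_kernel_arg /etc/default/grub "init='/sbin/init --verbose'"
#   update-grub    # regenerates /boot/grub/grub.cfg
```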

Revision history for this message
Olivier Debon (olivier-debon) wrote :

Happens to me as well from time to time, really unpredictable. But failure is related to console existence. This also sometimes makes X fail (NVidia card here).
What I could diff from syslog between a successful startup and a no runlevel situation is that only the 2 following messages are missing in the failure case:
- lp: driver loaded but no devices found
- Console: switching to colour frame buffer device 80x30
Sure that lp driver has nothing to do here.

I'll add suggested verbose init option above and come back with log if that happens again.

Revision history for this message
Mike Bianchi (mbianchi-foveal) wrote : Re: [Bug 543506] Re: Some services not started on boot, runlevel returns "unknown"

On Wed, Jun 09, 2010 at 01:06:20PM -0000, Olivier Debon wrote:
> What I could diff from syslog between a successful startup and a no runlevel situation is that only the 2 following messages are missing in the failure case:

For many of us, the solution is already in this Bug report. Namely:

 From comment #54
Comment out "console output" in all /etc/init/*.conf files that
have it.

 From comment #73
Add "started rsyslog" to the "start on" stanza in /etc/init/rc-sysinit.conf:
  start on ( filesystem
   and started rsyslog
   and net-device-up IFACE=lo
  )

 From comment #79
Add "init='/sbin/init --verbose'" to the linux boot line in grub.

Folks:
Have any of you found this insufficient? Was there anything else you had
to do to achieve reliable boots?

I can imagine that items in /etc/init.d/* and/or /etc/rc2.d/* that attempt
to write to /dev/console could get caught in the race condition between
/dev/console becoming writable and them running.

To my mind there should be an event when /dev/console _is_ writable,
but I don't know if one exists or how to make it happen.

--
 Mike Bianchi
 Foveal Systems

 973 822-2085

 <email address hidden>
 http://www.AutoAuditorium.com
 http://www.FovealMounts.com

Revision history for this message
Blas Lopez (blas) wrote :

I have this problem too, and I notice that every time the init scripts fail I get these lines in dmesg:

[ 0.000000] HPET: 4 timers in total, 0 timers will be used for per-cpu timer
[ 0.000000] Fast TSC calibration failed
[ 0.000000] TSC: Unable to calibrate against PIT
[ 0.000000] TSC: using HPET reference calibration
[ 0.000000] Detected 1994.959 MHz processor.
....
[ 0.514332] tty tty28: hash matches
....
[ 0.849208] ATL1E 0000:09:00.0: enabling device (0000 -> 0003)
[ 0.849490] ATL1E 0000:09:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 0.849689] ATL1E 0000:09:00.0: setting latency timer to 64
....
[ 46.072567] ov519: Can't determine sensor slave IDs
[ 46.072570] ov519: OV519 Config failed
[ 46.072584] ov519: probe of 2-4.4:1.0 failed with error -16

Revision history for this message
Mike Bianchi (mbianchi-foveal) wrote :

On Wed, Jun 09, 2010 at 01:06:20PM -0000, Olivier Debon wrote:
> What I could diff from syslog between a successful startup and a no runlevel situation is that only the 2 following messages are missing in the failure case:

For many of us, the solution is already in this Bug report. Namely:

 From comment #54
Comment out "console output" in all /etc/init/*.conf files that
have it.

 From comment #73
Add "started rsyslog" to the "start on" stanza in /etc/init/rc-sysinit.conf:
         start on ( filesystem
          and started rsyslog
          and net-device-up IFACE=lo
         )

 From comment #79
Add "init='/sbin/init --verbose'" to the linux boot line in grub.

Folks:
Have any of you found this insufficient? Was there anything else you had
to do to achieve reliable boots?

I can imagine that items in /etc/init.d/* and/or /etc/rc2.d/* that attempt
to write to /dev/console could get caught in the race condition between
/dev/console becoming writable and them running.

To my mind there should be an event when /dev/console _is_ writable,
but I don't know if one exists or how to make it happen.

Revision history for this message
Martin West (martin-objectgizmos) wrote :

Well I added changes for #54, #73 and #79 but no joy - still get runlevel unknown

I have intel i3 machine with 4GB.

I normally don't complain about free software but the Ubuntu quality control sucks. The upgrade from 8 to 9 was a disaster, the upgrade from 9 to 10 was a disaster so I did a fresh install and now just when I thought I had my system working again I have this problem.

Revision history for this message
Martin West (martin-objectgizmos) wrote :

boot.log and syslog

Revision history for this message
Martin West (martin-objectgizmos) wrote :

If I follow the instructions in #50

initctl emit filesystem [have to C-C to end]

initctl emit net-device-up IFACE=lo & [corrected typo]

I then get runlevel N 2
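
The two manual steps above could be wrapped up roughly as follows. This is only a sketch: the names run and emit_boot_events and the DRY_RUN variable are invented for illustration. It uses initctl's --no-wait option so that the first emit returns immediately instead of having to be interrupted.

```shell
# Hypothetical wrapper: print the command when DRY_RUN is set, otherwise
# run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Emit the two events named in the default rc-sysinit "start on"
# condition. --no-wait returns without waiting for the event to finish.
emit_boot_events() {
    run initctl emit --no-wait filesystem
    run initctl emit --no-wait net-device-up IFACE=lo
}

# Possible usage, as root, after a "runlevel unknown" boot:
#   emit_boot_events
#   runlevel
```

Setting DRY_RUN=1 first shows what would be run without touching the system.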

What a mess - sigghhhh

Revision history for this message
Jimmy Xu (jimmyxu) wrote :

So many reproductions, marking this as confirmed.

Changed in upstart (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
jdobry (jdobry) wrote :

I have one more reproduction.

Revision history for this message
Sepero (cowpie2000) wrote :

I also have this problem. Thanks Mike for post #81 and #83. Will try.

Revision history for this message
lavda (lavda) wrote :

I am also still having the problem after commenting all instances of "console output" in /etc/init/*. Replacing "IFACE=lo" by "IFACE=eth0" in /etc/init/rc-sysinit.conf (as suggested in a few posts) did not fix it either. The only way I can get the system to display the correct runlevel and have my services started correctly is by removing completely the " and net-device-up IFACE=lo" condition from "start" in /etc/init/rc-sysinit.conf (ie: start on filesystem and started rsyslog) and commenting all instances of "console output". Probably not a good idea but it works for now. Hope an adequate fix is coming soon.

Revision history for this message
Olivier Debon (olivier-debon) wrote :

Finally happened after 10 days (1 boot per day btw).
corresponding syslog attached.

Revision history for this message
Joel Kelley (cmdrfierce-deactivatedaccount) wrote :

https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/554172?comments=all <--- This bug was initially classified as a CUPS bug, but it has been moved to an Upstart bug.

It is very similar to this bug. I am posting a link here since there may be more useful information there.

To summarize what I observe in both bugs:
-services sometimes do not start on boot. irqbalance, winbindd, cupsd are the major ones. runlevel command will report 'unknown'
-on some boots, all services will start ok. runlevel command will report 'N 2'
-if you think you have it fixed, boot 10 times at least to see if you can get it to fail.
-if you have a bad boot, 'sudo start rc-sysinit' seems to start the services that did not start.
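
The check-and-recover step from the summary above can be sketched like this. The function name runlevel_reached is hypothetical. Note that elsewhere in this report some boots showed "N 2" and still had services missing, so a passing check does not prove the boot was good.

```shell
# Hypothetical helper: succeed if the given `runlevel` output names a
# runlevel, fail if it is empty or "unknown".
runlevel_reached() {
    case $1 in
        unknown|"") return 1 ;;
        *)          return 0 ;;
    esac
}

# Possible usage after each boot:
#   runlevel_reached "$(runlevel)" || sudo start rc-sysinit
```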

Revision history for this message
Olaf Leidinger (leidola) wrote :

My bug is also related/the same:
https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/584855

There is also some information here.

Revision history for this message
Graham T (grahamt-manichostingservices) wrote :

Adding additional confirmation.

Clean Lucid 10.04 Desktop 32-bit install
2.6.32-22-generic
All updates applied

I noticed proftpd and cupsd not starting up on some system restarts. I did over 50 restarts and there are two absolutely consistent results:

a. On the restarts when proftpd and cupsd start up successfully the output of the
runlevel command looks normal i.e.

graham@gt-desktop:~$ runlevel
N 2

b. On the restarts when proftpd and cupsd do not start up the output of the
runlevel command is not normal, it gives this:

graham@gt-desktop:~$ runlevel
unknown

The services can be started manually fine after startup in situation b.

Revision history for this message
Daniel Bernal Yhama (dbernal-yhama) wrote :

I found that the "runlevel unknown" problem is related to /etc/network/interfaces. Check whether this file exists; if it does not, you may need to reinstall ifupdown using "aptitude --install ifupdown_0.6.8ubuntu29_amd64.deb" to force it to rewrite and create the missing files. Download the package from http://packages.ubuntu.com/es/lucid/amd64/ifupdown/download
This file defines the eth0 and lo interfaces.
Excuse my English.

Revision history for this message
Lynnos (lynnos) wrote :

I tried to PXE boot Ubuntu 10.04, and it shows that the runlevel is unknown.

initctl list:

mountall-net stop/waiting
rc stop/waiting
rsyslog start/running
test stop/waiting
tty4 stop/waiting
udev start/running, process 4296
upstart-udev-bridge start/running, process 4291
console-setup stop/waiting
hwclock-save stop/waiting
plymouth-log stop/waiting
tty5 stop/waiting
plymouth stop/waiting
ssh start/running, process 4298
control-alt-delete stop/waiting
hwclock stop/waiting
network-manager stop/waiting
module-init-tools stop/waiting
cron stop/waiting
mountall stop/waiting
plymouth-stop stop/waiting
rcS stop/waiting
mounted-varrun stop/waiting
rc-sysinit stop/waiting
tty2 stop/waiting
udevtrigger stop/waiting
mounted-dev stop/waiting
tty3 stop/waiting
udev-finish stop/waiting
hostname stop/waiting
mountall-reboot stop/waiting
mountall-shell stop/waiting
mounted-tmp stop/waiting
network-interface (lo) start/running
network-interface (eth0) start/running
network-interface (eth1) start/running
plymouth-splash stop/waiting
tty1 stop/waiting
udevmonitor stop/waiting
dmesg stop/waiting
network-interface-security start/running
procps stop/waiting
tty6 stop/waiting
ttyS0 stop/waiting

I made a small change in /etc/init/rc-sysinit.conf

from
start on filesystem and net-device-up IFACE=lo

to
start on filesystem

and now runlevel is N 2.

Is there any final solution for this issue?

Revision history for this message
Nick Davis (argoneus) wrote :

I'm seeing this issue as well. I don't see the "unknown" when running runlevel, but several services aren't starting up automatically.

Any word on a fix for this?

Revision history for this message
Nick Davis (argoneus) wrote :

I should add to my previous comment that I've tried some of the workarounds mentioned in this thread, including commenting out "console output" in all /etc/init/*.conf files. This had no effect, and services such as lighttpd aren't automatically starting that should be.

Revision history for this message
Mike Bianchi (mbianchi-foveal) wrote : Re: <OK> [Bug 543506] Re: Some services not started on boot, runlevel returns "unknown"

On Wed, Jun 30, 2010 at 07:16:47PM -0000, phireph0x wrote:
> I should add to my previous comment that I've tried some of the
> workarounds mentioned in this thread, including commenting out "console
> output" in all /etc/init/*.conf files. This had no effect, and services
> such as lighttpd aren't automatically starting that should be.

You said "some of the workarounds".

I found that -ALL THREE- specific workarounds spelled out in
https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/81
were necessary for success, at least in my circumstance.

Did you try that specific combination?

Revision history for this message
John Edwards (john-cornerstonelinux) wrote :

I've upgraded about 20 machines from Ubuntu 8.04 to 10.04 and have only seen this problem on one machine.

That was on an old Pentium 3 machine running a "linux-generic" kernel on a 32-bit install. The other machines which do not show this problem were either running a 64-bit install or a "linux-generic-pae" kernel.

So has anyone seen it on a machine which is not a 32-bit install running a "generic" (not "generic-pae") kernel?

Revision history for this message
didit21 (didit21) wrote :

> So has anyone seen it on a machine which is not a 32-bit install running a
> "generic" (not "generic-pae") kernel?

I have an amd64 install here with the same problem. So it's not specific to 32-bit systems.

Revision history for this message
Tony McConnell (amnw14545) wrote :

We've just been bitten by this, on a delivery of 20 machines to site; it does seem to be fixed by #99, fortunately - I think it must be pretty widespread, also happens on my mini-311c at home, which is up2date with lucid-updates.

Must give one hell-of-a-bad user experience to the user-base though!

Revision history for this message
scottmuz (scottmuz) wrote :

This bug must be the worst I've come across in my 6-7 years with ubuntu.

Some services not starting is an absolute show stopper security issue that
must surely be treated with the utmost urgency.

Revision history for this message
Rômulo de Barros Correia Jales (romulo-romulojales) wrote :

I also had this problem, but the bug was in the script I created to automate the customization process [1]. I didn't remove the /sbin/initctl link when exiting the chroot, as described in [2]. I resolved my problem by removing this link.

[1] - https://help.ubuntu.com/community/LiveCDCustomization
[2] - https://help.ubuntu.com/community/LiveCDCustomization#Cleanup

Revision history for this message
gdowle (garb-dowle) wrote :

@Mike Bianchi

I just wanted to thank you very much. In the last time I was struggling hard with boot hangs on all our Ubuntu 10.04 Servers, and after I followed your advice and commented out the "console output" in /etc/init/*.conf, our servers are booting normally. Finally! I hope I won't get into any problems with that thing called "upstart" in future any more...

Revision history for this message
Michael Gorven (mgorven) wrote :

I ran into this on a machine upgraded from Karmic to Lucid. The problem was that the ifupdown package hadn't been upgraded (it was on 0.6.8nmu+1 IIRC) and the upstart init files for networking weren't in /etc/init (which meant that lo was never started and so rc-sysinit was never started). Installing Lucid's ifupdown package fixed the problem.
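
A quick way to look for this situation, as a sketch only: the function name missing_jobs is hypothetical, and the sketch assumes each Upstart job is defined by a file named JOB.conf in /etc/init. The job names in the usage line are taken from the initctl listings earlier in this report.

```shell
# Hypothetical helper: print each named job that has no JOB.conf file
# in the given directory.
missing_jobs() {
    dir=$1
    shift
    for job in "$@"; do
        [ -f "$dir/$job.conf" ] || echo "$job"
    done
}

# Possible usage:
#   missing_jobs /etc/init networking network-interface rc-sysinit
```

Any name it prints is a job file to look for in the package that should have shipped it.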
