Ubuntu

system services using "console output" not starting at boot

Reported by Mike Basinger on 2010-04-02
This bug affects 244 people
Affects            Status  Importance  Assigned to                      Milestone
linux (Ubuntu)             High        Unassigned
  Lucid                    High        Andy Whitcroft
  Maverick                 High        Andy Whitcroft
upstart (Ubuntu)           Medium      Scott James Remnant (Canonical)
  Lucid                    Medium      Scott James Remnant (Canonical)
  Maverick                 Medium      Scott James Remnant (Canonical)

Bug Description

Binary package hint: cups

Cups is not loading on my machine at boot; I must run sudo /etc/init.d/cups start after booting in order to print.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: cups 1.4.2-10
ProcVersionSignature: Ubuntu 2.6.32-19.28-generic 2.6.32.10+drm33.1
Uname: Linux 2.6.32-19-generic i686
NonfreeKernelModules: nvidia
Architecture: i386
Date: Fri Apr 2 13:07:35 2010
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Alpha i386 (20100401)
Lpstat: Error: command ['lpstat', '-v'] failed with exit code 1: lpstat: Connection refused
MachineType: Dell Inc. Studio XPS 1340
Papersize: letter
PpdFiles: Brother-HL-2170W-series: Brother HL-2170W Foomatic/pxlmono (recommended)
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-19-generic root=UUID=615bbe85-506a-4152-af5a-a5c2da303d83 ro quiet splash
ProcEnviron:
 LANG=en_US.utf8
 SHELL=/bin/bash
SourcePackage: cups
dmi.bios.date: 09/08/2009
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A11
dmi.board.name: 0Y279R
dmi.board.vendor: Dell Inc.
dmi.board.version: A11
dmi.chassis.asset.tag: 1234567890
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: A11
dmi.modalias: dmi:bvnDellInc.:bvrA11:bd09/08/2009:svnDellInc.:pnStudioXPS1340:pvrA11:rvnDellInc.:rn0Y279R:rvrA11:cvnDellInc.:ct8:cvrA11:
dmi.product.name: Studio XPS 1340
dmi.product.version: A11
dmi.sys.vendor: Dell Inc.

Mike Basinger (mike.basinger) wrote :
Brock Riedell (kbrcbc) wrote :

I confirm the same behaviour ever since installing Lucid 10.04 beta 1 on a Lenovo ThinkPad T400. CUPS does not start up on booting; requiring a manual start with "sudo service cups start".

Same problem exists on Lucid 10.04 beta 1 (updated as of 4/11) running on a Zotac IONITX / Intel Atom 230. Printer is connected by USB and is turned off at boot time -- turning the printer on and rebooting does not help. After every reboot cups is not running.

Till Kamppeter (till-kamppeter) wrote :

Can you attach the file

/var/log/boot.log

Does the file /etc/rc2.d/S50cups exist for you?

ls -l /etc/rc2.d/S50cups
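
The symlink check above can be scripted so that it gives a clear yes/no answer. This is only a minimal sketch: check_rc_link is a hypothetical helper name (not part of any package), and the directory is passed as a parameter so the function can also be pointed at a test tree.

```shell
# check_rc_link DIR NAME
# Hypothetical helper: reports whether DIR/NAME is a symlink and,
# if so, where it points. Returns non-zero when the link is missing.
check_rc_link() {
    link="$1/$2"
    if [ -L "$link" ]; then
        echo "present: $link -> $(readlink "$link")"
    else
        echo "missing: $link"
        return 1
    fi
}

# Example call with the path from this report:
#   check_rc_link /etc/rc2.d S50cups
```

A present link only shows that the SysV start script is registered for runlevel 2; it says nothing about whether runlevel 2 is actually reached at boot.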

Changed in cups (Ubuntu):
status: New → Incomplete
Till Kamppeter (till-kamppeter) wrote :

Can you also have a look at bug 497299, perhaps your problem is the same.

Mike Basinger (mike.basinger) wrote :

Re-installing all cups-* packages fixed the problem for me. I will do a fresh install again for the Lucid RC release, to see if this is still an issue.

Mike Basinger (mike.basinger) wrote :
Mike Basinger (mike.basinger) wrote :

lrwxrwxrwx 1 root root 14 2010-04-01 10:23 /etc/rc2.d/S50cups -> ../init.d/cups

Changed in cups (Ubuntu):
status: Incomplete → Confirmed
status: Confirmed → New
Mike Basinger (mike.basinger) wrote :

The problem has returned.

None of the comments points me toward a solution.

runlevel
N 2

garry@Hiro:~$ ls -l /etc/rc2.d/S50cups
lrwxrwxrwx 1 root root 14 2010-04-01 13:49 /etc/rc2.d/S50cups -> ../init.d/cups

I have not tried downgrading Upstart yet --- if a race condition exists I'm guessing that it may be a race on disk i/o; I have a humdrum 1.6 GHz cpu / 3.2 GB memory much like many others (Zotac IONITX / Intel Atom 230 with Nvidia GPU) but maybe faster than average disk i/o (Intel X25 M 80GB SSD on SATA2). That could "reveal" a race condition that others do not see. The problem did not occur on this system until I installed this SSD and simultaneously installed the Lucid Beta-1.

I reinstalled all of cups with no joy. Cups does not run on boot and I can not get it to run following any of the advice that has been posted so far.

Here's a further clue:

exactly the same problem occurs on my personal computer --- IONITX with Intel Atom 330, 3.2 GB memory, _and_ an even faster SSD, the Intel X25E. CUPS worked fine on that computer under Karmic; it was upgraded to Lucid when Beta 1 was released and the cups daemon has been failing to start since then.

Good luck bug hunters! This might be a tough one. It is real. Find it!

p.s. of course if it IS caused by a dreaded race condition then of course it sometimes goes away when software is re-installed or any odd change is made that happens to change the timing of events at boot time.

pinch150g (rob-mcdole) wrote :

I am having the same problem. I have tried all of the above with no help.

Changed in cups (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium

Thanks for your bug report.
All people with this problem please attach the file /var/log/apt/term.log

Changed in cups (Ubuntu):
status: Confirmed → Incomplete

Pascal,
Here is the requested log file. Since I originally reported that this
bug was affecting me, routine Lucid Beta updates have partially resolved
the problem for me -- i.e., on some reboots cups starts up fine; on
others I have to start it manually.

Regards
Brock Riedell
<email address hidden>

On 10-04-17 03:48 AM, Pascal De Vuyst wrote:
> Thanks for your bug report.
> All people with this problem please attach the file /var/log/apt/term.log
>
> ** Changed in: cups (Ubuntu)
> Status: Confirmed => Incomplete
>
>

Here's the requested term.log --- after several updates the problem went away on both of my machines and cups printing just worked, and then the updates of the last few days brought the problem back and I'm sad again :-<

After a reboot neither computer is running cups even though examining the init script setup shows that it should be running cups. This means for the moment that my wife generally can not print because she does not have sudo privilege and wouldn't know how to start a service anyway.

Nothing special in the term logs.
Do you get more information in /var/log/boot.log if you boot with "quiet splash" options removed as described here: https://help.ubuntu.com/community/Grub2#Editing%20Menus%20During%20Boot

Removed the "quiet splash" options from the grub config and rebooted. Now THAT looks like a linux system. :-D

boot.log is attached. There's nothing interesting there. cups is not mentioned in dmesg.

But I did grep for "cups" in all of my system logs for the past couple of days and was surprised to find that cupsd did segfault last night, apparently while or after I printed some xmonad/dzen documentation to the local networked printer (to read on my night shift :-)

In syslog.1 I find this entry:

Apr 24 21:44:10 Medium kernel: [18688.981257] cupsd[4938]: segfault at 0 ip 0027c7a0 sp bfebbb88 error 4 in libc-2.11.1.so[209000+153000]

Of course cups had been started manually on the command line on both the client and server computers :-(

21:44 would be about the moment that I shutdown my computer so I think that the segfault may have occurred during the shutdown process rather than during printing. I doubt that this segfault relates to this bug 554172 but it is the first error message I have seen regarding cups. The daemon has been silently failing to run at boot time.

The last time cups loaded at boot time I saw these messages in kern.log

kern.log.1:Apr 21 06:52:06 Medium kernel: [ 7.215772] type=1505 audit(1271850726.855:13): operation="profile_load" pid=934 name="/usr/lib/cups/backend/cups-pdf"
kern.log.1:Apr 21 06:52:06 Medium kernel: [ 7.216856] type=1505 audit(1271850726.859:14): operation="profile_load" pid=934 name="/usr/sbin/cupsd"

Since Apr 21 there are no such messages in kern.log about loading cupsd or cups-pdf. All that changed in my configuration after Apr 21 was routine updates. According to what I see in kern.log cupsd was loading properly on Apr 19, 20, and 21, but not before or after. cupsd managed to run during the window of 3 days before another update spoiled it again. (I update my system daily.)
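
The day-by-day comparison described above can be repeated on other machines with a small filter. This is a sketch only: cupsd_profile_loads is a hypothetical name, and it assumes, as the comment above does, that a profile_load audit line for /usr/sbin/cupsd is a usable hint that the cups start script ran on that boot.

```shell
# cupsd_profile_loads FILE...
# Hypothetical helper: prints the month and day of every log line that
# records an AppArmor profile_load for /usr/sbin/cupsd, once per day.
cupsd_profile_loads() {
    grep -h 'operation="profile_load"' "$@" |
        grep 'name="/usr/sbin/cupsd"' |
        awk '{ print $1, $2 }' |
        uniq
}

# Example call (log file names as used in this comment):
#   cupsd_profile_loads /var/log/kern.log.1 /var/log/kern.log
```

Days on which the machine was booted but which are missing from the output are candidates for boots where cupsd was never started.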

Joshua Martin (jsmartin) wrote :

This problem occurs on a fresh install of Ubuntu 10.04 x86 final. Had to perform sudo /etc/init.d/cups start in order to install printer.

Changed in cups (Ubuntu):
status: Incomplete → Confirmed

I re-installed all the CUPS-* packages and this solved my problem. I don't know why or for how long, but it worked.

Mario Vukelic (mario-vukelic) wrote :

I upgraded a friend's system from Ubuntu 8.04 to 10.04 and the same problem occurs. I don't have access to that system right now, but I'll try the reinstall too. If any one is interested in any details I should capture before doing so, please let me know.

cdyring (clausdyring) wrote :

term.log

Tom Inglis (tominglis) wrote :

I can confirm that this issue affects me on Mythbuntu 10.04. I reinstalled the cups package using Synaptic, and it seems to have corrected the problem.

Nils Naumann (nau) wrote :

This is probably a dupe of bug #497299. To test this, check the runlevel; if the runlevel is undefined, check whether the file /etc/network/interfaces contains the lines:

auto lo
iface lo inet loopback

Creating this file solved the issue on my system.
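
A quick way to test for the two lines quoted above is sketched here. has_loopback_stanza is a hypothetical helper name, and the patterns only recognise the simple form shown in this comment (an "auto lo" line and an "iface lo inet loopback" line); other valid spellings of the stanza are not covered.

```shell
# has_loopback_stanza FILE
# Hypothetical helper: succeeds if FILE contains both an "auto lo" line
# and an "iface lo inet loopback" line, allowing flexible whitespace.
has_loopback_stanza() {
    grep -Eq '^[[:space:]]*auto[[:space:]]+lo([[:space:]]|$)' "$1" 2>/dev/null &&
        grep -Eq '^[[:space:]]*iface[[:space:]]+lo[[:space:]]+inet[[:space:]]+loopback([[:space:]]|$)' "$1" 2>/dev/null
}

# Example call:
#   if has_loopback_stanza /etc/network/interfaces; then
#       echo "loopback stanza found"
#   else
#       echo "loopback stanza missing"
#   fi
```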

The failure of CUPS to start should not be caused by CUPS itself.

affects: cups (Ubuntu) → upstart (Ubuntu)
Luca Aluffi (aluffilu) wrote :

For what it's worth, I've seen that the cups service mostly fails to start when there are too many spaces in the kernel's grub command line, e.g.:
THIS WORKS: linux /vmlinuz-2.6.32-22-generic root=UUID=a96105c8-5ffd-4962-9f4e-93d87dc58aab ro quiet splash acpi_osi=Linux

THIS MOSTLY NOT (note multiple spaces between "ro" and "quiet"): linux /vmlinuz-2.6.32-22-generic root=UUID=a96105c8-5ffd-4962-9f4e-93d87dc58aab ro quiet splash acpi_osi=Linux

Luca Aluffi (aluffilu) wrote :

Uhm...

the other duplicate fellows ( https://bugs.launchpad.net/bugs/497299) told us to get back here, and they are right since their bug is for ifupdown.

Could someone remove the duplicate status from this bug?

Steve Langasek (vorlon) wrote :

Right, this is not a duplicate of bug #497299. Un-duping and reassigning back to cups - there's nothing at all here pointing at an upstart problem.

affects: upstart (Ubuntu) → cups (Ubuntu)
Zimmer (zimm-z) wrote :

FWIW . I WAS using the NOUVEAU video driver for my NVIDIA card.

I have installed the NVIDIA proprietary driver and have spent last 2 hours rebooting and booting from cold. CUPS now starting at boot ! I will continue to monitor the situation.

I also have this problem. I would like to add that it's intermittent -- sometimes I will boot up and CUPS is started. This post in the forums describes it well: http://swiss.ubuntuforums.org/showthread.php?p=9281178

E. Lewis (ed-lewis) wrote :

For what it is worth, I have the same problem on my 32-bit Lucid machine. I do not have the problem on my 64-bit Lucid machine. CUPS starts properly on the 64-bit. It starts intermittently on boot on the 32-bit. As observed by others, "sudo service cups start" starts CUPS on the 32-bit after booting.

Diez B. Roggisch (deets-web) wrote :

I can add that on my dad's machine (upgraded from 8.04 to 10.04 yesterday, 32bit) the problem happens with an HPLIP-based printer when the printer is *not* turned on at boot.

If it is turned on, cups is started + a HP-logo appears in the gnome sys tray.

"sudo service cups start" works as well of course. Then no HP-logo appears, but printing is ok.

I did a brand new install of 10.04 on this laptop and am affected by the same bug. Manual start works, but everything else I've tried doesn't (including attempting to restart cups in local.start -- it was worth a shot).

For a seasoned Linux user like me this is no big deal. For my wife, who is new to Linux (and who does not have wheel access on this device, to prevent her from doing anything potentially destructive), it's a deal-breaker to have to call me over every time she wants to print something.

Paul Done (pkdone-hotmail) wrote :

I'm on brand new 10.04 install. Sometimes cupsd starts for me, sometimes it doesn't. However, since then I've realised that there are other jobs also intermittently/randomly starting or not starting. As a result, perhaps this is not cups specific and is a general problem as highlighted at https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506 ?

Sorry about the duplicate state; I set it on all the "cups not starting" bugs on Lucid after reading that someone got cups running at boot just by creating the loopback interface.

Now, with the loopback interface set up by default in ifupdown, it seems that we are still lacking cups at boot.

Zimmer (zimm-z) wrote :

Reporting Back!!
CUPS still intermittent... will NOT usually start from cold boot but is more likely to start from RESTART ...

fig_wright (fig-wright) wrote :

I have this problem also. I can start cups manually. Cups also starts automatically when I run telinit manually!

 ~ > sudo runlevel
Unknown

 ~ > sudo telinit 2
 ~ > sudo runlevel
N 2

~ > sudo initctl list

Warning: Fake initctl called, doing nothing.

~ > service --status-all
 [ ? ] acpi-support
 [ ? ] acpid
 [ ? ] alsa-mixer-save
 [ ? ] anacron
 [ - ] apparmor
 [ ? ] apport
 [ ? ] atd
 [ ? ] avahi-daemon
 [ ? ] binfmt-support
 [ - ] bluetooth
 [ - ] bootlogd
 [ - ] brltty
 [ ? ] console-setup
 [ ? ] cron
 [ ? ] cryptdisks
 [ ? ] cryptdisks-early
 [ ? ] cryptdisks-enable
 [ ? ] cryptdisks-udev
 [ + ] cups
 [ ? ] dbus
 [ ? ] dmesg
 [ ? ] dns-clean
 [ ? ] ecryptfs-utils-restore
 [ ? ] ecryptfs-utils-save
 [ ? ] failsafe-x
 [ - ] fancontrol
 [ ? ] gdm
 [ ? ] gkrellmd
 [ - ] grub-common
 [ ? ] hostname
 [ ? ] hwclock
 [ ? ] hwclock-save
 [ ? ] irqbalance
 [ - ] kerneloops
 [ ? ] killprocs
 [ - ] lm-sensors
 [ ? ] module-init-tools
 [ ? ] network-interface
 [ ? ] network-interface-security
 [ ? ] network-manager
 [ ? ] networking
 [ ? ] nmbd
 [ ? ] ondemand
 [ ? ] pcmciautils
 [ ? ] plymouth
 [ ? ] plymouth-log
 [ ? ] plymouth-splash
 [ ? ] plymouth-stop
 [ ? ] pppd-dns
 [ ? ] procps
 [ + ] pulseaudio
 [ ? ] rc.local
 [ - ] rsync
 [ ? ] rsyslog
 [ - ] saned
 [ ? ] screen-cleanup
 [ ? ] sendsigs
 [ ? ] smbd
 [ ? ] speech-dispatcher
 [ ? ] stop-bootlogd
 [ ? ] stop-bootlogd-single
 [ ? ] ubiquity
 [ ? ] udev
 [ ? ] udev-finish
 [ ? ] udevmonitor
 [ ? ] udevtrigger
 [ ? ] ufw
 [ ? ] umountfs
 [ ? ] umountnfs.sh
 [ ? ] umountroot
 [ ? ] unattended-upgrades
 [ - ] urandom
 [ ? ] wpa-ifupdown
 [ - ] x11-common

polarbeardog2 (polarbeardog-2) wrote :

I can confirm that I have had this issue from just before upgrading to 10.04 from 9.10. I am able to get printing functionality back when I run 'sudo /etc/init.d/cups start' but this needs to be done every time the computer is booted. I have also run 'sudo update-rc.d cups enable' and the service was confirmed added but absent from the subsequent boot.

Corrado (corrado-zanella) wrote :

Same intermittent problem.
'sudo /etc/init.d/cups start' works

cdyring (clausdyring) wrote :

I also have the problem using an atom cpu and ssd harddisk.

I am using two usb printers, attached via a usb hub.

Also, the printer troubleshooting wizard in the printer administration app detects that cups is not running, but it then directs me to a menu item in the system menu which does not seem to be there... system->administration->services .. this app should supposedly allow the user to start the service. Maybe it is the locale translation that is bad, but I cannot see any such menu item.

Sven Loncaric (sven-loncaric) wrote :

I also have the same problem running 10.04 on Thinkpad T400 with Intel X25M SSD.
(I did upgrade to 10.04 from 9.10, not a fresh install.)

cupsd does not start automatically. I can start it manually with /etc/init.d/cups start.

vboxdrv also does not start automatically - I can manually start it with /etc/init.d/vboxdrv start.

There is nothing special in /var/log/boot.log. cups is not mentioned in any of the log files in /var/log.

piviul (piviul) wrote :

I also have the same problem on an Atom ion330 running 10.04 (not an upgrade, a fresh installation).

Sometimes cups starts automatically, but normally it doesn't. Furthermore, freepops doesn't start automatically either, even though it is installed correctly...

On another Atom ion330, after the upgrade from 9.10 to 10.04, I have similar trouble. Normally everything seems to work well, but sometimes amule-daemon doesn't start automatically and sometimes even the audio doesn't start (I have found no service to restart to get it working again; I have to restart the current session to get the audio working).

In both cases the logs don't seem to help, or rather didn't help me.

In my opinion the importance of this bug should be changed to high, not medium!

Have a great day

Piviul

Kai Krakow (hurikhan77) wrote :

I can confirm this. Upgraded a production machine from 9.10 to 10.04 since Lucid is marked as LTS and I thought it would be a nice step to a long-lasting stable system. Now different instabilities show up, one of them being that cups sometimes starts on boot, sometimes not. I do not always have access to this machine and it's a show stopper. The printer needs to be used to print delivery forms and invoices for a food delivery service and they call me every second or third day that the system does not print again. The system is shut down every evening and booted every late morning. This needs to be fixed asap.

Kai Wollweber (wollw) wrote :

I set this bug as duplicate. The workaround from yannubuntu solved the problem for my
ubuntu 10.04 LTS (Linux version 2.6.32-21-generic-pae)
Missing /etc/init/cups.conf was the cause for not starting cups at startup
https://bugs.launchpad.net/ubuntu/+source/cups/+bug/524186/comments/2

spider (spidersoft) on 2010-05-22
description: updated
Zimmer (zimm-z) wrote :

Tried the workaround mentioned in bug 524186.

Worked for next boot.... failed on next two subsequent cold boots...
I must try to find time to investigate the workings of upstart, from what I have read so far I am guessing either the CUPS start process has not been fully ported and fails if certain 'events' occur/do not occur (maybe along with other services that I have not noticed because I do not use them..) or.. at some point upstart is provided with a NULL runlevel argument and skips processing subsequent events.

A thought.. what if there are many more instances of CUPS not starting but not reported as the users have no printer and have not noticed..

Kai Wollweber (wollw) wrote :

After Zimmer and other did not succeed with the /etc/init/cups.conf file I observed:
(1) With /etc/init/cups.conf cupsd starts at reboot with ppid 1 and pid 882
(2) Without /etc/init/cups.conf cupsd starts at reboot with ppid 1 and pid 1020
At least condition (2) does not guarantee that cups will start.

I am not at the location of the server. Therefore I cannot test a cold boot.

Zimmer (zimm-z) wrote :

Compared initctl list output of a NON CUPS boot with one that worked and found several other items that appear to fail to start with it.
tty4 tty5 atd cron acpid cups tty2 tty3 tty1 tty6

Will attach file zimmcups.ods which shows the comparison. May give someone a clue...

Don Myers (donmyers) wrote :

I've been fighting the same issue. I've upgraded 11 machines from 9.10 to 10.04. They were all upgrades and not fresh installs. In most cases after the upgrade, the printers did not show up. I would reinstall cups and everything seemed fine. But what I'm finding is that randomly on at least 3 machines one time when you boot, the printers will be there, then the next time they are gone. My one machine at home has an AMD 4800 dual core processor and 4 gigs of ram. On Sunday they were gone again, so I did a complete uninstall of cups in synaptic package manager, and then a complete reinstall of cups assuming this would take care of the problem. The computer was fine yesterday, but tonight my printers were gone again. Two of the problem computers are on machines with almost new Linux users, and they are not very impressed. I hope you get a workaround real fast!!!!!!!

Kai Krakow (hurikhan77) wrote :

Maybe for a similar reason the xorg-server sometimes does not start, logging a message in Xorg.0.log about a wait timeout on VT7 or something, as I reported in Bug #584187. (I will upload logs there later)

Olaf Leidinger (leidola) wrote :

Assuming the problem is hpijs, wouldn't it be possible to use alternative drivers for those HP printers, e.g. gutenprint drivers as a work-around? They seem to work for me.

Olaf, this is certainly not caused by HPLIP or by any choice of printer driver. To me it looks like the problem is caused by the switchover to Upstart. It seems that Upstart's backward-compatibility facility for services still using the old System V init scripts is not working very well. Especially since users have the same problem with services other than CUPS, to me it looks like an Upstart problem rather than a CUPS problem.

Zimmer, unfortunately, we cannot accept files of OpenOffice.org or any other desktop application to supply text-only information. Please post your list of initctl output comparison directly into a comment in this bug report, not using attachments, so that your results get directly displayed by the reader's web browser.

Ideally, you generate the initctl output for both the CUPS and non-CUPS case into two text files. Then run the command

diff -uraNb cups.txt non-cups.txt > comparison.txt

and post the content of comparison.txt directly into a comment.

Zimmer, please do not do what I asked for in the previous comment. Do the following instead:

Boot your system repeatedly, something like 20 times. Each boot has a number, your first boot is number 1, your second number 2, and so on.

Whenever you get a boot where CUPS is NOT running, direct the output into a file named "no-cups-##.txt" with ## replaced by the 2-digit number of the boot, like "no-cups-02.txt"

Whenever you get a boot where CUPS is running, direct the output into a file named "cups-##.txt" with ## replaced by the 2-digit number of the boot, like "cups-05.txt"

Attach all these files, uncompressed. You will need one comment per file.
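
The naming scheme above can be followed mechanically with a small helper. This is a sketch under stated assumptions: snapshot_name is a hypothetical function name, detecting CUPS with pgrep is one possible choice, and piping through sort is an extra step (not requested above) that keeps later comparisons stable.

```shell
# snapshot_name BOOT_NUMBER CUPS_RUNNING
# Hypothetical helper: prints "cups-NN.txt" when CUPS_RUNNING is "yes",
# otherwise "no-cups-NN.txt". Pass the boot number without a leading
# zero (2, not 02), since some shells read a leading zero as octal.
snapshot_name() {
    if [ "$2" = "yes" ]; then
        printf 'cups-%02d.txt\n' "$1"
    else
        printf 'no-cups-%02d.txt\n' "$1"
    fi
}

# Example for boot number 5 (run after logging in):
#   if pgrep -x cupsd >/dev/null; then up=yes; else up=no; fi
#   sudo initctl list | sort > "$(snapshot_name 5 "$up")"
```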

Everyone who has problems with CUPS not starting on boot, please do the following:

Add "--verbose" to the kernel command line, proceeding as described here

https://help.ubuntu.com/community/BootOptions#Change%20Boot%20Options%20Permanently%20On%20An%20Existing%20Installation

Reboot your system and check whether CUPS is running, if it is running, reboot again, until you have CUPS not running. Remove workarounds if you have applied some. Do a cold boot (system shut down and after that turning on the machine again).

If your system is up with CUPS not running, attach your

/var/log/boot.log

file to this bug report.

You can find general info about Upstart at

https://help.ubuntu.com/community/UpstartHowto

Changed in cups (Ubuntu):
status: Confirmed → Incomplete
Changed in cups (Ubuntu):
importance: Medium → High
piviul (piviul) wrote :

This is my boot.log with kernel --verbose option.

Thank you very much.

Piviul

Steve Langasek (vorlon) wrote :

piviul,

Everything looks correct in this attached log. Can you please post the output of the commands 'sudo initctl list' and 'runlevel' from this same session?

Don Myers (donmyers) wrote :

When I booted up tonight, my printers were gone again. I don't really understand everything above about creating and sending the reports.

I added --verbose to kernel line. First boot I had cups not start. Second boot I had cups start ok. Attaching /var/log/boot.log, output of 'sudo initctl list' and 'runlevel' below. Note that runlevel output on cups failure is "unknown", whereas it is "N 2" when cups successfully starts.
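
Since the runlevel output separates the failing boots from the working ones here, it may be worth recording it on every boot. The sketch below uses a hypothetical helper, classify_runlevel, and assumes only the two output shapes quoted in this thread ("unknown" and a previous/current pair such as "N 2").

```shell
# classify_runlevel OUTPUT
# Hypothetical helper: takes the text printed by `runlevel` and reduces
# it to "no-runlevel", "runlevel X", or "unrecognized".
classify_runlevel() {
    case "$1" in
        [Uu]nknown) echo "no-runlevel" ;;
        *" "[0-6S]) echo "runlevel ${1##* }" ;;
        *)          echo "unrecognized" ;;
    esac
}

# Example call:
#   classify_runlevel "$(runlevel)"
```

A result of "no-runlevel" right after boot matches the failure case described in this comment, where cups and other SysV-started services are missing.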

Steve Langasek (vorlon) wrote :

Joel,

Please attach the output of 'grep rc-sysinit /var/log/syslog' following a failure of cups to start, booting again with --verbose; and attach both that output and the matching /var/log/boot.log. Please also check whether running 'start rc-sysinit' causes cups to start.

Your results show that the rc-sysinit job is being started (ruling out a recurrence of previous /etc/network/interfaces bugs), but that it doesn't succeed in getting runlevel 2 to start. We need to see why.

Steve,

After a boot where cups failed to start (it took 3 tries), I executed above commands and am attaching output. Also, 'start rc-sysinit' does cause cups to start.

Olaf Leidinger (leidola) wrote :

I think bug #584855 is related. I put the logs there.

Steve Langasek (vorlon) wrote :

Ok, it seems we still haven't gone deep enough; everything still looks reasonable in the logs. It's clear by this point that the bug you're seeing isn't a cups bug, Joel, though we don't know that others on this bug report have the same bug.

Next step: can you boot with --debug instead of --verbose, and attach the same logs? If this is an upstart bug, it's a pretty deep one.

Steve Langasek (vorlon) wrote :

Sorry, not "the same logs": please attach /var/log/boot.log and the full, *unfiltered* /var/log/syslog.

Steve, booting with --debug now. On the first boot, cups failed to start (in syslog below this boot started at May 30 16:00:06). Thanks for the help!

Also, like someone else above, my virtual consoles, ctrl-alt-F[1-6], are not working, so I can see how this is going beyond just cups like you said.

Kai Krakow (hurikhan77) wrote :

> Also, like a someone else above, my virtual consoles,
> ctrl-alt-F[1-6], are not working so I can see how this
> is going beyond just cups like you said.

I think this (VTxx not available) is why Xorg sometimes doesn't start on boot for me either. (see bug #584187, attached logs there)

piviul (piviul) wrote :

boot.log

piviul (piviul) wrote :

output of sudo initctl list

piviul (piviul) wrote :

syslog

piviul (piviul) wrote :

Furthermore this is the output of runlevel

italo@bianchino:~$ runlevel
N 2

All the files I have attached were taken after a boot with the --verbose kernel option, when cups was not started automatically.

Thank you very much

Piviul

PDP (pdp) wrote :

I used following workaround for the problem:

Open a terminal and run: sudo visudo
Then add a line like this at the end of the file:
username ALL=NOPASSWD: /usr/bin/service

Of course changing username to the actual username.
After that, that user should be able to run 'sudo service cups start' without supplying a password.

Add a new startup application (system --> preferences --> startup applications) that executes the following command during login: sudo service cups start

I find it a very important bug because it's basic printing functionality that doesn't work. I hope one can find as soon as possible a good solution so that above workaround isn't necessary anymore. With ubuntu lucid I've had also problems with other services that didn't startup automatically so it could be a more general problem.

piviul (piviul) wrote :

And this is the boot log with --debug kernel option

piviul (piviul) wrote :

And this is the syslog log with --debug kernel option

..that's all. I hope you can find the bug, because my father can't use the PC: he can't print and can't receive mail, because even freepops doesn't start when cups doesn't start.

Have a great day

Piviul

fig_wright (fig-wright) wrote :

Can confirm that my virtual consoles ctrl-alt-F[1-6] are also not working, even after I put the machine into runlevel 2 manually with "~ > sudo telinit 2". Also, I'm getting an unhelpful response from initctl:
~ > sudo initctl list
Warning: Fake initctl called, doing nothing.

On Tue, Jun 01, 2010 at 09:00:43PM -0000, fig_wright wrote:
> ~ > sudo initctl list
> Warning: Fake initctl called, doing nothing.

Please post the output of 'which initctl' and 'dpkg -S `which initctl`'.
You appear to have broken your upstart setup somehow.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

piviul,

The error in your logs is this:

Jun 1 19:18:43 bianchino init: rc state changed from starting to pre-start
Jun 1 19:18:43 bianchino init: rc state changed from pre-start to spawned
Jun 1 19:18:43 bianchino init: Failed to spawn rc main process: unable to open console: Input/output error

You should be able to work around this by commenting out the line 'console output' in /etc/init/rc.conf. However, this doesn't look like the same bug other people are reporting. Can you please file a new bug report against the upstart package including the same logs and citing this explanation?
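
Others can check their own logs for the same error with a filter along these lines. It is a sketch only: console_spawn_failures is a hypothetical helper name, and the pattern is taken from the single log line quoted above, so it covers "main process" failures in exactly that wording and nothing else.

```shell
# console_spawn_failures FILE...
# Hypothetical helper: prints the name of every Upstart job for which
# the log contains "Failed to spawn JOB main process: unable to open
# console", one name per line, without duplicates.
console_spawn_failures() {
    sed -n 's/.* init: Failed to spawn \(.*\) main process: unable to open console.*/\1/p' "$@" |
        sort -u
}

# Example call (such messages only appear when booting with --verbose
# or --debug, as requested earlier in this report):
#   console_spawn_failures /var/log/syslog
```

For the excerpt above the output would be the single job name rc.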

piviul, if you report the console output bug, could you please CC me or put
the link here? I think this is related to different bugs I see in upstart.

2010/6/2 Steve Langasek <email address hidden>

> piviul,
>
> The error in your logs is this:
>
> Jun 1 19:18:43 bianchino init: rc state changed from starting to pre-start
> Jun 1 19:18:43 bianchino init: rc state changed from pre-start to spawned
> Jun 1 19:18:43 bianchino init: Failed to spawn rc main process: unable to
> open console: Input/output error
>
> You should be able to work around this by commenting out the line
> 'console output' in /etc/init/rc.conf. However, this doesn't look like
> the same bug other people are reporting. Can you please file a new bug
> report against the upstart package including the same logs and citing
> this explanation?

I tested all the workarounds, but after a few boots I noticed that cups was again not being started.
This bug is related to upstart, because almost all old-style start scripts no longer work.
Sometimes the PC can't be turned off and ends up in GDM (some old init scripts have not been started). So all old scripts need to be converted to upstart and configured on upgrade to start with upstart.

pothos (pothos) wrote :

You can test it by typing "runlevel" in the terminal. For me it is "unknown". I upgraded from 9.10.

pothos (pothos) wrote :

Hehe, I had no runlevel because of a space at the ending of the kernel line in /boot/grub/menu.lst:

now good:
title Ubuntu 10.04 LTS, kernel 2.6.32-22-generic
uuid 0ee5d7e9-5e85-45c1-bd37-5dfb65814fca
kernel /boot/vmlinuz-2.6.32-22-generic root=UUID=0ee5d7e9-5e85-45c1-bd37-5dfb65814fca ro quiet splash
initrd /boot/initrd.img-2.6.32-22-generic
quiet

before (bad), with a space after splash:
title Ubuntu 10.04 LTS, kernel 2.6.32-22-generic
uuid 0ee5d7e9-5e85-45c1-bd37-5dfb65814fca
kernel /boot/vmlinuz-2.6.32-22-generic root=UUID=0ee5d7e9-5e85-45c1-bd37-5dfb65814fca ro quiet splash
initrd /boot/initrd.img-2.6.32-22-generic
quiet
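
A trailing space like this is hard to spot by eye. Here is a minimal sketch for finding one, assuming a GRUB legacy menu file at the path mentioned above. The function name find_trailing_space is made up for illustration and is not part of any Ubuntu tool.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example): print every
# "kernel" line that ends in whitespace, prefixed with its line number.
find_trailing_space() {
    # $1: path to a GRUB legacy menu file; defaults to the path from the comment above
    grep -nE '^[[:space:]]*kernel[[:space:]].*[[:space:]]+$' "${1:-/boot/grub/menu.lst}"
}
```

Calling find_trailing_space with no argument checks /boot/grub/menu.lst; no output means no kernel line ends in whitespace.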

Olaf Leidinger (leidola) wrote :

Do you have a runlevel when inserting this very space character again and add --debug to the kernel commandline?

Bru Baldoví (bru) wrote :

My two cents: clean Ubuntu 10.04 installation, updated, CUPS fails to start sometimes.

When CUPS doesn't start, neither does "vboxdrv": "The VirtualBox Linux kernel driver (vboxdrv) is either not loaded or there is a permission problem with /dev/vboxdrv. Please reinstall the kernel module by executing".

Rebooting solves it. In my opinion, it's a severe bug, and a show stopper for the non-techies. Thanks a lot to the people working on solving this.

piviul (piviul) wrote :

Hi Steve, hi Kai. I opened a new bug as you suggested. This is the link: https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/589272

After commenting out the line "console output" in /etc/init/rc.conf as you suggested, the upstart problem seems to disappear.

Thank you very much

Piviul

Ola (erik-o-engstrom) wrote :

Hi. Commenting out the line "console output" in /etc/init/rc.conf solved the problem, with a twist: for the workstations on my home network to see the printers, I have to log on to my workstation that doubles as a print server. In 8.04 the server just had to start up for the attached printers to be seen on the workstations.

/ola

Even though it may be unrelated, I tried commenting out the line 'console output' in /etc/init/rc.conf. I was able to boot three times without issue, but then on the fourth boot cups didn't start again.

Ran 'start rc-sysinit' at command line to get cups to start. I noticed several other processes start when I do this. Here's a snippet of ps aux of the processes that started with 'start rc-sysinit' when I ran at 09:48:

root 1311 0.0 0.0 1788 564 tty4 Ss+ 09:48 0:00 /sbin/getty -8 38400 tty4
root 1316 0.0 0.0 1788 564 tty5 Ss+ 09:48 0:00 /sbin/getty -8 38400 tty5
root 1323 0.0 0.0 1788 560 tty2 Ss+ 09:48 0:00 /sbin/getty -8 38400 tty2
root 1327 0.0 0.0 1788 564 tty3 Ss+ 09:48 0:00 /sbin/getty -8 38400 tty3
root 1329 0.0 0.0 1788 564 tty6 Ss+ 09:48 0:00 /sbin/getty -8 38400 tty6
root 1330 0.0 0.0 2820 528 ? Ss 09:48 0:00 /usr/sbin/irqbalance
root 1331 0.0 0.0 2044 864 ? Ss 09:48 0:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
root 1337 0.0 0.0 2372 876 ? Ss 09:48 0:00 cron
daemon 1338 0.0 0.0 2244 428 ? Ss 09:48 0:00 atd
root 1371 0.0 0.0 11732 1360 ? Ss 09:48 0:00 /usr/sbin/winbindd
root 1374 0.0 0.0 11732 1168 ? S 09:48 0:00 /usr/sbin/winbindd
root 1385 2.0 0.1 6828 2456 ? Ss 09:48 0:00 /usr/sbin/cupsd -C /etc/cups/cupsd.conf
root 1456 0.0 0.0 1828 560 ? S 09:48 0:00 /bin/sh /etc/init.d/ondemand background
root 1463 0.0 0.0 1768 436 ? S 09:48 0:00 sleep 60
root 1465 0.0 0.0 1788 568 tty1 Ss+ 09:48 0:00 /sbin/getty -8 38400 tty1

So it appears that the ttys, irqbalance, acpid, cron, atd, winbindd (a wine dependency, I believe) and cupsd are all affected.

Clownfishy (clownfishy) wrote :

I have run 10.04 through Alpha and did not see the problem even after 10.04 was released. I only had the problem after the first official update to the live 10.04. I continually have to run

sudo service cups start

to start the printer.

Are there any log files or details I can upload to help with the problem?

Novastorm (novastorm87) wrote :

I'm also experiencing this problem on a fresh install Lucid x86_64. I can provide any logs if needed. Just ask.

Don Myers (donmyers) wrote :

I have installed or upgraded 15 systems since 10.04 was released. All 32-bit versions. Most were upgrades from 9.10. In almost all instances when an upgrade was done, after the installation was finished and the machine rebooted, CUPS was not running. Some of these machines have not been a problem after that. Several continue to be a major issue. One of those is a co-worker's computer. Several times he has talked to me about wiping Ubuntu and just going back to Windows. I'm dragging my feet as much as possible in the hope that this gets fixed.

I converted two systems that had this problem to an Ubuntu 10.04 derivative,
Linux Mint 9, and both systems work perfectly. This could provide a clue to
the source of the problem, and could provide an emergency solution to people
like Don who are in a tough situation and must have a working Ubuntu-like
system immediately. BTW, the main menu in Linux Mint would probably feel
very familiar to Windows users. It's a very pleasant system (and of course
steals shamelessly from all of the work done in creating Ubuntu). It does
not come with Ubuntu One or the Ubuntu Music Store plugin pre-installed.

On Thu, Jun 10, 2010 at 9:05 AM, Don Myers <email address hidden> wrote:

> I have installed or upgraded 15 systems since 10.04 was released. All 32
> bit versions. Most were upgrades from 9.10. In almost all instances when
> an upgrade was done, after the installation was finished and the machine
> rebooted, cups was not running. Some of these machines have not been a
> problem after that. Several continue to be a major issue. One of those
> is a co-worker's computer. Several times has said to me about wiping of
> his Ubuntu and his just going back to Windows. I'm dragging my feet as
> much as possible in the hope that this gets fixed.

Don Myers (donmyers) wrote :

Thank you Garry. I think Linux Mint 9 is based on either Ubuntu 9.04 or
9.10. I never saw this issue come up until 10.04.

On 06/10/2010 06:01 PM, Garry Roseman wrote:
> I converted two systems that had this problem to an Ubuntu 10.04 derivative,
> Linux Mint 9, and both systems work perfectly. This could provide a clue to
> the source of the problem, and could provide an emergency solution to people
> like Don who are in a tough situation and must have a working Ubuntu-like
> system immediately. BTW, the main menu in Linux Mint would probably feel
> very familiar to Windows users. It's a very pleasant system (and of course
> steals shamelessly from all of the work done in creating Ubuntu). It does
> not come with Ubuntu One or the Ubuntu Music Store plugin pre-installed.

@ Till Kamppeter

Sorry Till, I have been away since starting to test by rebooting many times. On my return I tried booting the installation and it will not boot: I get a quick purple splash screen, then the screen goes blank and flickers. (A Mepis install on the same drive boots OK.)

I put a fresh install onto another (brand new) external disk, and CUPS had been starting OK for the last 3 days. I was going to write a post saying the fresh install had sorted the problem... but it has not: CUPS failed to load on that very boot!
I will try to spend some time today (between World Cup matches :) ) rebooting and collecting text files as you directed.
Regards
Zimmer

Jarige (jarikvh) wrote :

Another affected person here. I can confirm that vboxdrv doesn't start at boot either, tty1 to 6 don't work and that 'sudo start rc-sysinit' fixes it. Manually executing that is, of course, not an ideal situation.

Don Myers (donmyers) wrote :

My apologies to Garry! I see that Linux Mint 9 is based on Ubuntu 10.04. I am truly sorry. After using Ubuntu 8.10, 9.04, 9.10, and now 10.04, I am somewhat disgruntled with 10.04. In theory, each version should be an improvement over the last. I've had a few relatively minor issues with past versions, and while I like the additional boot speed and appearance of 10.04, I've had 4 significant issues with it. There is the recurring problem where some hardware will not wake from sleep. Some systems work perfectly, and some not at all. Then I had the issue of web pages, scans, and PDF creator pages printing with a cream-colored background. This has now been fixed. Then we have the issue of the printers disappearing, which still happens for me on two computers. To top things off, one would assume that OpenOffice would work to perfection. I found last week that if you install the individual OpenOffice applications or the entire suite from the repository, the spell checker doesn't function. It pretends it does. OpenOffice in a Linux distribution should be rock solid. Any simple testing would reveal this problem. So I downloaded and installed OpenOffice directly from openoffice.org and have a fully functioning spell checker and office suite. Quality control is certainly lacking! Then I tried to report this bug. The two preferred ways in their instructions didn't work, so I made up a Launchpad bug report, which crashed when it was trying to upload. At that point I gave up! Frustration. I admire what Ubuntu has done for Linux, and their goals. But quality control and something that works out of the box are of great importance in converting new people to Linux. Printers that disappear and spell checkers that pretend to function and do nothing don't cut it for converting new users to either Linux or Ubuntu. If I were a brand new user, I would look at all of these issues and say it is back to Windows. Linux isn't ready for the desktop yet.

Don Myers: Keep in mind this is an open source project and we all need to work together to fix this. I understand your frustration though; I have gone back to 9.10 because of this issue and it's working great for me. I think it's a good idea to let new releases mature somewhat before adopting them anyway. But it is a little disconcerting that this is happening on an LTS.

Developers: What can we do to get this bug addressed? I notice it is still in "Incomplete" status. What information do you need to move this back to "Confirmed" and get this assigned? I will gladly help in any way whatsoever.

Digression: Perhaps you should have the less technical users volunteer to help with QA. I realize it's a tedious process that wastes developers' valuable time. Basic things like testing printing are something intermediate users can do. We could define a list of basic functions to check, to be sure they are all working OK with new releases. This can be for all functions, not just printing, with multiple results from many users on the same function to corroborate.

I've tried removing the proprietary NVIDIA drivers to see if it would make a difference: No, it does not.

I think this may be a general upstart problem. I have noticed that the boot sequence is highly inconsistent and seemingly unpredictable on consecutive boots, even when nothing major has changed on the system. For example, the text that scrolls by on the screen is frequently different during subsequent boots. Sometimes I get the Ubuntu splash screen with the dots showing progress, sometimes not. Sometimes I get the Nvidia proprietary logo screen, sometimes not. I am booting with the --debug kernel option and the last line of text in the boot sequence is frequently different. I agree with Steve Langasek above that this is more than likely a general upstart problem, and not just a cups issue. See comment #69.

My suggestion: everyone affected by this could do the following in a terminal:

1. If you have a boot where printers do not appear in System->Administration->Printers, run 'sudo start rc-sysinit' in your terminal, then run 'ps aux' and see whether more services have started. Virtual terminals (ttys), irqbalance, acpid, cron, atd and winbindd all fail to start for me when cupsd doesn't start (see my comment #92 on this bug).

2. Please post what you find here.

If the majority of users here have several services not starting, in addition to cupsd, perhaps we could re-file this as an upstart bug?

Novastorm (novastorm87) wrote :

In response to comment #101:

I found that on my machine both 'cupsd' and 'winbindd' started after running 'sudo start rc-sysinit'

Don Myers (donmyers) wrote :

Hi Joel,

Thank you for your very well written comments. I've been using Ubuntu for about 15 months. I've migrated everything to Ubuntu except my video editing and a Windows forms program. I do a lot of picture editing and some websites, as well as sell real estate and farm. I've installed Ubuntu on 20 different computers. For the power save problem I turn the power save off even though I wish it would work. The cream colored background issue has been fixed. The cups issue has been major, but for me personally it is a nuisance. I can live with it until it is fixed. For a co-worker in the office who I'm trying to convince to like Ubuntu and Linux, having his machine as one where the printers don't work is a major issue with him. But quite honestly when I found that the openoffice spell checker was non-functional on my own machine, and then checked a second machine and it had the same issue, and then checked on line and found others having the exact same issue, I was just dumbfounded that something like this could ever slip by and be released. Most of the people who I've installed Ubuntu for would never understand how to install the openoffice.org version. This kind of problem with the spell checker is something that reflects badly on openoffice, Ubuntu, and Linux. I tried installing several additional different packages from synaptic. That didn't help. At that point I gave up and went to openoffice.org. I didn't have time to play with something that should have never been broken in the first place. But for me to try to explain to the first time user why a spell checker goes through the motions but doesn't do anything is beyond anything I can do.

My printers did show up on this computer tonight. But the next time they don't I will do what you suggested. I'm comfortable running the command line as long as someone tells me what to enter!

Hi Don Myers and everyone experiencing this problem,

Perhaps you can reboot several times until you get a boot where printers do not show up in System->Administration->Printers.

Then:

1. At a terminal, run 'sudo start rc-sysinit' (without quotes). This command should start any processes/services that did not start at boot time. Note the time at which you ran it.
2. At the terminal, run 'ps aux'.
3. Look for processes/services that started at the time you ran 'sudo start rc-sysinit'. The 9th column in the output is the time when the process/service started. If it equals the time at which you ran 'sudo start rc-sysinit', that process/service did not start at boot as it should have.
4. Please post those processes/services here.
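
Step 3 can be done with a filter instead of by eye. This is a rough sketch, assuming the 'ps aux' layout shown in this thread, where the 9th whitespace-separated field is the start time. The function name started_at is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example): read `ps aux`
# output on stdin and keep the header line plus every process whose
# START column (field 9) equals the HH:MM given as the first argument.
started_at() {
    awk -v t="$1" 'NR == 1 || $9 == t'
}
```

For example, 'ps aux | started_at 15:15' would list what came up in the minute 15:15. Processes that start a minute later are missed, so it may be worth running it for the following minute as well.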

This is definitely not a CUPS bug, the problem is that rc-sysinit does not get executed on startup. Moving to upstart ...

affects: cups (Ubuntu) → upstart (Ubuntu)
summary: - cups not starting at boot
+ CUPS and other system services not starting at boot

I don't know what exactly solved my problem:

1. re-installation of the package cups
or
2. sudo service cups start

I did both of them at the same time and the problem never occurred again (so far).

Don Myers (donmyers) wrote :

Hi Jens,

Some machines have not had issues. When I first noticed it, after doing an upgrade, I had no printers. The help in the printer dialog boxes indicated CUPS was not running and said to start it. Almost all of the upgrade installs had this issue. Many of them have not had the issue after an initial restart. It made me wonder whether it made a difference if the printers were turned on or off during the upgrade. Two systems continue to give me problems. On both of them I completely uninstalled CUPS, completely reinstalled it, and then started it, but the issue remains. I hope your system turns out better than mine!

Don Myers (donmyers) wrote :

Hi Joel,

I hit the sack last night before you made your last post. Here is the strange thing. On my machine at home the printers were showing up 50% of the time. The last few days they were there. On my co-worker's machine at the office the printers were missing 98% of the time. I made a document that I put on his desktop with instructions for starting CUPS in the terminal in case I wasn't here to help him. He was off at the end of last week and is off this week. I booted his computer 3 times on Saturday, primarily to take off the Ubuntu version of OpenOffice with the non-working spell checker and to put on the openoffice.org version with the working spell checker. I DID NOT want him to see another issue with Ubuntu or Linux. But while I was doing his reboots on Saturday, I would check the printers, and they were always there. The same is true this morning. I did a cold boot and then 2 reboots, and they were there all three times. I wonder if any of the Ubuntu updates fixed the issue! Especially since over the weekend and early this week I've had no problems with either machine.

Jens Janssen (jayjay) wrote :

So I conclude that this problem is a bit tricky and needs a professional to keep an eye on it.

Clownfishy (clownfishy) wrote :

So here is the strange thing, periodically my printer is available so CUPS must start under certain circumstances. It is not very often but occasionally my printer is there. I will try and work out under what conditions CUPS is starting.

There are not many times I moan about open source, as I am a huge fan of everything open source. However, if an organisation rolled out 10.04 and their users could not print, I think they would be amazed at how long this is taking to fix. Maybe it is just affecting a few people who run a certain type of system? I am getting a Dell Mini 10 in a week or so, so I will install 10.04 on that and see if I get the same issue.

I am happy to upload any log files which may help.

Jarige (jarikvh) wrote :

This bug should be of higher importance, as it could be affecting other programs. VirtualBox is affected as well.

I wanted to point out that after executing 'sudo start rc-sysinit' tty1 still won't work, but tty2 and up will.

Don Myers (donmyers) wrote :

Hi Joel and Others,

I just turned on my co-worker's computer to test things again. From the cold boot everything was fine. On the reboot, no printers. When running 'sudo start rc-sysinit' at 3:15PM, this is what I got:
dale@dale-desktop:~$ sudo start rc-sysinit
[sudo] password for dale:
rc-sysinit stop/waiting
dale@dale-desktop:~$

The printers are now available.

Here are the results from 'ps aux' from 3:14 (15:14) on:
dale 1428 0.3 0.4 49184 13364 ? Rl 15:14 0:01 gnome-terminal
dale 1429 0.0 0.0 1984 696 ? S 15:14 0:00 gnome-pty-helper
dale 1430 0.0 0.1 5808 3156 pts/0 Ss 15:14 0:00 bash
root 1525 0.0 0.0 2824 576 ? Ss 15:15 0:00 /usr/sbin/irqbalance
root 1558 0.0 0.0 11732 1360 ? Ss 15:15 0:00 /usr/sbin/winbindd
root 1573 0.0 0.0 11732 1164 ? S 15:15 0:00 /usr/sbin/winbindd
root 1581 0.0 0.0 6856 2676 ? Ss 15:15 0:00 /usr/sbin/cupsd -C /etc/cups/cupsd.conf
dale 1663 0.0 0.0 2712 1044 pts/0 R+ 15:20 0:00 ps aux
dale@dale-desktop:~$

I hope this helps!

Don Myers (donmyers) wrote :

I just did 3 more reboots, then 1 cold boot, and then 1 more reboot, and the printers were there each time. This is the exact opposite of the way this machine was performing last week and the week before.

To me it looks like a timing issue. During boot several processes run in parallel, and whether rc-sysinit gets started depends on whether a certain process A finishes before a certain process B. The timing can be influenced by whether a cold boot or a reboot is done, and also by reinstalling packages, since files then get placed at different locations on the disk and access can get faster or slower. What is strange is that reinstalling CUPS causes any change at all, because the problem happens before any CUPS file gets touched.

Kai Krakow (hurikhan77) wrote :

I can't remember where, but I read hints that NetworkManager could be involved:

Sometimes NM starts fast enough to initialize the "lo" network interface itself, but this does not create an event in upstart. Sometimes the script that usually initializes "lo" runs first and generates the event.

Something else I see is that various scripts just don't start because they complain about an I/O error accessing the console. Could that be another clue, one that also explains why the loopback may not come up? At least it relates to why Xorg sometimes does not start for me, with an I/O error accessing the console in the Xorg log.

Bru Baldoví (bru) wrote :

Missing printers once again, executing 'sudo start rc-sysinit' at about 16:39 and 'ps aux' at about 16:43 the last lines read:

bru 1730 0.1 0.1 5976 3192 pts/1 Ss 16:39 0:00 bash
root 1880 0.0 0.0 11728 1344 ? Ss 16:39 0:00 /usr/sbin/winbindd
root 1901 0.0 0.1 6828 2856 ? Ss 16:39 0:00 /usr/sbin/cupsd -C /etc/cups/cupsd.conf
root 1937 0.0 0.0 11728 1152 ? S 16:39 0:00 /usr/sbin/winbindd
bru 1995 0.0 0.0 2680 1028 pts/1 R+ 16:43 0:00 ps aux

The printers are available again. Thanks.

Garry Leach (garry-leach) wrote :

I am also the initiator of bug #580755, Log-in screen locks up.

The lock-up is actually that the keyboard & mouse are not enabled, so I can't proceed with booting.

Is it possible that these are not "started" during the upstart process?

Regards, Garry.

Garry Leach (garry-leach) wrote :

One of my sons is a long-time user of Unix & Ubuntu. He upgraded to 10.04, a few weeks after it became available.

He has not had any problem with his printer (or other services).

However, his printer is set-up on his network (i.e. connected to his router).

Perhaps this may provide some insight into the area of the problem...

Regards, Garry.

Don Myers (donmyers) wrote :

Garry,

It makes no difference whether the printer is connected through a network or directly to the computer. I've had experience with both failing.

Don Myers (donmyers) wrote :

To Joel and others,

My home computer decided not to have any printers tonight. The time was 23:39. Here is the information Joel requested:

don 2133 0.5 0.3 48156 13340 ? Rl 23:38 0:00 gnome-terminal
don 2134 0.0 0.0 1984 696 ? S 23:38 0:00 gnome-pty-helpe
don 2135 0.3 0.0 6012 3364 pts/0 Ss 23:38 0:00 bash
root 2229 0.0 0.0 2824 580 ? Ss 23:39 0:00 /usr/sbin/irqba
root 2264 0.0 0.0 11732 1360 ? Ss 23:39 0:00 /usr/sbin/winbi
root 2267 0.0 0.0 11732 1168 ? S 23:39 0:00 /usr/sbin/winbi
root 2286 0.0 0.0 6796 2456 ? Ss 23:39 0:00 /usr/sbin/cupsd
root 2330 0.0 0.0 1828 564 ? S 23:39 0:00 /bin/sh /etc/in
root 2337 0.0 0.0 1768 448 ? S 23:39 0:00 sleep 60
don 2338 0.0 0.0 2712 1040 pts/0 R+ 23:39 0:00 ps aux

Ola (erik-o-engstrom) wrote :

For whatever it's worth: I've installed 8.10 server as a print server on an old laptop, and CUPS works on every startup.

Thanks for all the posts of ps aux output after doing a 'sudo start rc-sysinit' after a bad boot.

From the ps aux reports that I requested in #104 for when cupsd does not start, it appears that irqbalance and winbindd are also not starting for most people.

Since we moved this bug from a cups bug to an upstart bug I see that:
http://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506
is very similar to this one.

I am also posting a summary in the other bug that there are a lot of reports to analyze here.

Changed in upstart (Ubuntu):
status: Incomplete → Confirmed

Ola, the bug is caused by the new upstart infrastructure, which had not yet been introduced in 8.10.

Joel, I have marked the other bug as a duplicate of this one now, so that all discussion will continue in this bug now.

Changed in upstart (Ubuntu):
importance: High → Critical

I had the same problem: CUPS didn't start when the machine booted, and neither did Apache.
This was helpful:
https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/497299
(comment no. 12)

The error was in my /etc/network/interfaces.
I had this in it:

auto eth0
iface eth0 inet static
address 192.168.5.10
netmask 255.255.255.0
network 192.168.5.0
broadcast 192.168.5.255
gateway 192.168.5.1

_________________________________
Instead, it should look like this:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
address 192.168.5.10
netmask 255.255.255.0
network 192.168.5.0
broadcast 192.168.5.255
gateway 192.168.5.1

(Well, I hope that after 3-4 restarts this problem will not appear again.)

BTW, Lucid looks really great; I hope it will be stable in use.
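
The difference between the two files is the loopback stanza. Here is a small sketch for checking whether a file has it, assuming the ifupdown syntax shown above. The function name has_loopback is invented for this example, and it only looks for "auto lo" and "iface lo inet loopback" lines, nothing more.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example): succeed only if
# the file both brings "lo" up automatically and defines it as loopback.
has_loopback() {
    # $1: interfaces file to check; defaults to the path named above
    f="${1:-/etc/network/interfaces}"
    grep -Eq '^[[:space:]]*auto[[:space:]]+(.*[[:space:]])?lo([[:space:]]|$)' "$f" &&
    grep -Eq '^[[:space:]]*iface[[:space:]]+lo[[:space:]]+inet[[:space:]]+loopback' "$f"
}
```

Something like 'has_loopback || echo "loopback stanza missing"' would flag the first file above and stay quiet for the second.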
All the best,

Joel, Till

Re comment #122. I believe you have it backwards.

I believe that if you apply the workaround spelled out in comment #83 of
http://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506
you will find that your CUPS starts will be reliable.

I've seen the native 10.04 upstart boot go to "unknown" after many boots to "N 2" and vice versa.

By applying the workaround from comment #83, my boots became 100% reliable; 100% "N 2".

Till,
Re comment #123. I believe that is exactly right.
The Upstart infrastructure seems to be inherently unstable because there is no fixed order of startup scripts.

Just because it boots correctly to runlevel "N 2" once, you cannot assume that it won't boot to runlevel "unknown"
sometime in the future. This is because the order in which the /etc/init/*.conf and /etc/rc2.d/S* scripts are run is
not fixed.

On Mon 2010-06-21 at 11:21 +0000, Till Kamppeter wrote:
> Ola, the bug is caused by the new upstart infrastructure, this was not
> yet introduced in 8.10.
Sorry, a slip of the finger: the version of the Ubuntu server is 10.04, not 8.10.
Regards
/ola

For what it's worth: I seem to have this bug also.
Maybe it is also affecting my VMware startup (VMware Workstation 6.5.4); the vmware init script sometimes seems not to run at boot.

pothos (pothos) wrote :

Removing the spaces in the grub config did not help. I had to comment out the upstart "console output" lines because of a wrong starting order. Even now the PC sometimes can't be shut down, although the runlevel seems to be OK.

vindex (niklas-correnz) wrote :

I have this problem, too, but not only with cups, rc.local and some custom startup-scripts are not starting, either.

On Fri, Jun 25, 2010 at 10:33:02PM -0000, vindex wrote:
> I have this problem, too, but not only with cups, rc.local and some
> custom startup-scripts are not starting, either.

Look at
https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/83

That has worked for many of us.

Mike,
Re #131: I tried the solution stated in the other bug report but it doesn't work for me. The first reboot after the change already came up in runlevel "unknown" without starting services like CUPS.

sudo start rc-sysinit
at startup worked for me, but this cannot be the solution!
Additionally, this command killed my internet connection and I had to restart /etc/init.d/networking!?

On Sat, Jun 26, 2010 at 04:27:35PM -0000, Tim Scheurenbrand wrote:
> Re #131: I tried the solution stated at the other bug-report but it doesn't work for me. First reboot after change already started in unknown runlevel without starting services like CUPS.

runlevel returning "unknown" appears to be a clear indication that the boot was
incomplete. A clean boot gives "N 2".

If you add "init='/sbin/init --verbose'" to the linux boot line
you are more likely to see error messages from "init" in /var/log/syslog
that might indicate a specific failure.
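
The check described here is easy to script. This is a minimal sketch that treats "N 2" as a clean boot and "unknown" as an incomplete one, as reported in this thread. The function name classify_runlevel is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example): classify the
# text printed by the `runlevel` command.
classify_runlevel() {
    case "$1" in
        "N 2")   echo "complete" ;;    # the clean result reported in this thread
        unknown) echo "incomplete" ;;  # boot stopped before the runlevel was set
        *)       echo "other: $1" ;;   # any other runlevel is reported as-is
    esac
}
```

It would be called as 'classify_runlevel "$(runlevel)"' shortly after logging in.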

Do you people with the CUPS problem also have no sound/audio and no shutdown option after login? I could be totally wrong, but to me it seemed inappropriate to merge this bug with this one https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506?

The problem I have with sound/shutdown is now FIXED after doing what is here:
http://ubuntuforums.org/showpost.php?p=9480172&postcount=20

Also, I did this:
apt-get install ifupdown --reinstall --purge

I ran a scripted reboot on my system testing for the runlevel, and after about an hour of boots/reboots, it correctly came up "runlevel 2" every time.

Michael Doube (michael-doube) wrote :

The ondemand governor, vaiopower and vaiofand also fail to start for me. Adding the ondemand bug as a dupe of this one.

My understanding of this bug is that System V services fail to start because the rc-sysinit or rc jobs fail because /dev/console cannot be opened. A workaround appears to be to comment out "console output" from those jobs.

I also understand that in all cases, /dev/console actually exists -- it's a fundamental node that always exists in the devtmpfs filesystem, and is created by the initramfs for a tmpfs-on-/dev filesystem. In the case where neither of these is used, the node must exist on the root filesystem and is created as part of the installer process.

And since it exists, but returns an error when it's being opened, this strongly implies some kind of kernel bug.
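
The "comment out console output" workaround mentioned above can be sketched roughly as follows. This is only an illustration, not an official fix: the function name comment_console_output is hypothetical, the directory is passed as an argument so the edit can be tried on a copy first, and the in-place edit assumes a sed that supports -i (GNU sed does).

```shell
# Comment out every line that starts with "console output" in the
# *.conf files of the given directory.
# $1: directory holding the job files (a copy of /etc/init, for example).
comment_console_output() {
    dir=$1
    for file in "$dir"/*.conf; do
        # Skip the unexpanded pattern when the directory has no .conf files.
        [ -f "$file" ] || continue
        sed -i 's/^console output/#console output/' "$file"
    done
}
```

Trying it on a copy first (cp -a /etc/init /tmp/init-copy, then comment_console_output /tmp/init-copy) lets you diff the result before touching the real job files.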

Andy (andy-xillean) wrote :

This is definitely an rc-sysinit --> upstart problem. List of services not starting up for me: cups, bind9, ddclient,

Nick Davis (argoneus) wrote :

As I mentioned in #543506, I'm also seeing this issue on Ubuntu Server 10.04. I've tried some of the workarounds mentioned in this thread (https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/83), including commenting out "console output" in all /etc/init/*.conf files. This had no effect, and services such as lighttpd aren't automatically starting that should be.

When I run 'runlevel', the output is N 2, but again, my services still aren't being started properly.

That doesn't sound like the exact same issue. Well over 500 restarts
here and absolute consistency found, in that either I get runlevel N 2
and all services started fine or I get runlevel unknown and services not
started (in my case proftpd and cupsd).

No doubt they are similar and most likely caused by race conditions
being created in the way upstart handles starting up services.

Anyone else happily go back to a slower bootup time if it gave them some
control over what order services started up? :D

On 30/06/10 21:08, phireph0x wrote:
> As I mentioned in #543506, I'm also seeing this issue on Ubuntu Server
> 10.04. I've tried some of the workarounds mentioned in this thread
> (https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/83),
> including commenting out "console output" in all /etc/init/*.conf files.
> This had no effect, and services such as lighttpd aren't automatically
> starting that should be.
>
> When I run 'runlevel', the output is N 2, but again, my services still
> aren't being started properly.
>
>

This is a serious problem. At this point 10.04 is not production worthy. How can one deploy machines and not know if a service will start or not? 10.04 is turning out to be a disaster on the order of Vista: services not starting up, black screen on boot, HP printer regressions, scanner regressions. Just very basic stuff, really, that should just work, especially if it's something that used to work in a previous version. This is embarrassing: you convince a lot of people to switch from Windows, they upgrade, and regressions start happening all over the place. Very disappointing indeed. It feels as if I'm running out of reasons to convince people to leave Windows. Only security is left. Stability is out the window.

Nick Davis (argoneus) wrote :

I agree it's probably not the exact same issue, but definitely in the "upstart has a race condition" bucket of issues.

I'm kinda impressed that you've gone through the heroics of rebooting your machine 500 times in the name of bug-squashing... that's persistence :-P

Hi Graham,

A fast boot time is not critical for me (& I think that we are only
talking of seconds, anyway). In any case, my current boot process with
10.04 involves using several steps in the fail-safe mode, so it is not
only slow, but requires my interaction several times.

However, I am not an expert, so any changes need to be provided to me in
easy-to-follow steps. And, I am somewhat wary, after I tried to
implement a change to my Mythbuntu PC that someone suggested, & got into
an awful mess (the boot failed miserably, but then I could not get the
file up to edit it back the way it was, as the fsck system wanted to
check the HDD, but failed; it was all just stuck in a horrible loop).

Garry.

On Wed, 2010-06-30 at 23:01 +0000, Graham T wrote:
> That doesn't sound like the exact same issue. Well over 500 restarts
> here and absolute consistency found, in that either I get runlevel N 2
> and all services started fine or I get runlevel unknown and services not
> started (in my case proftpd and cupsd).
>
> No doubt they are similar and most likely caused by race conditions
> being created in the way upstart handles starting up services.
>
> Anyone else happily go back to a slower bootup time if it gave them some
> control over what order services started up? :D

So /dev/console exists but isn't immediately "ready for work". It appears it
becomes fully functional later (otherwise it would not be possible to start
the failing services manually).
Can we figure out at what point /dev/console becomes functional?

In the meantime two workarounds seem doable, both assuming /dev/console
doesn't depend on anything started during boot:
1. A boot script that doesn't complete until /dev/console works, make
rc/rc-sysinit boot sequence depend on it.
2. A patch in upstart that waits until /dev/console works.
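
Workaround 1 could look roughly like the sketch below: keep trying to open the device for writing and give up after a number of attempts. The function name wait_for_writable and the one-second retry interval are assumptions for illustration, and how such a script would be wired into the rc/rc-sysinit sequence is left open.

```shell
# Wait until a character device can be opened for writing, or give up.
# $1: device path (for example /dev/console)
# $2: maximum number of attempts, one second apart
wait_for_writable() {
    dev=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        # The redirection performs the open(); ":" itself writes nothing.
        # The -c test avoids creating a regular file if the node is missing.
        if [ -c "$dev" ] && ( : >> "$dev" ) 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

For example, wait_for_writable /dev/console 30 would return success as soon as the open works, or failure after about 30 seconds.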

Any thoughts?

Wout

On Mon, Jun 28, 2010 at 18:43, Scott James Remnant <email address hidden>wrote:

> My understanding of this bug is that System V services fail to start
> because the rc-sysinit or rc jobs fail because /dev/console cannot be
> opened. A workaround appears to be to comment out "console output" from
> those jobs.
>
> I also understand that in all cases, /dev/console actually exists --
> it's a fundamental node that always exists in the devtmpfs filesystem,
> and is created by the initramfs for a tmpfs-on-/dev filesystem. In the
> case where neither of these is used, the node must exist on the root
> filesystem and is created as part of the installer process.
>
> And since it exists, but returns an error when it's being opened, this
> strongly implies some kind of kernel bug.

On 07/01/2010 10:18 AM, Wout wrote:
> So /dev/console exists but isn't immediately "ready for work". It appears it
> becomes fully functional later (otherwise it would not be possible to start
> the failing services manually).
> Can we figure out at what point /dev/console becomes functional?
>
> In the meantime two workarounds seem doable, both assuming /dev/console
> doesn't depend on anything started during boot:
> 1. A boot script that doesn't complete until /dev/console works, make
> rc/rc-sysinit boot sequence depend on it.
> 2. A patch in upstart that waits until /dev/console works.
>
> Any thoughts?

I think if Scott's analysis is correct then the correct solution would
be to have the kernel buffer anything received on /dev/console until
it's ready to display it.

Anyone for reassigning this bug to the kernel?

Regards,
Michael

On Thu, Jul 1, 2010 at 11:09, Michael Goetze <email address hidden> wrote:

> I think if Scott's analysis is correct then the correct solution would
> be to have the kernel buffer anything received on /dev/console until
> it's ready to display it.

That's why I wrote workaround not solution.

> Anyone for reassigning this bug to the kernel?
>
> Regards,
> Michael

On Thu, Jul 01, 2010 at 04:42:53AM -0000, Andy wrote:
> This is a serious problem. At this point 10.04 is not production worthy. ...

I agree.

But the evidence is that Canonical is just not paying any attention to this
line of bugs OR somehow does not think they _are_ serious.

I have a service contract with Canonical and have had a Support Case open with
them about this issue since 24 May. On 21 June they said:

"... we will follow up with Scott regarding
the bug #543506 for you. We will also present him with your comment #83
from the bug and have his opinion on that. If he agreed with the workaround
and can find out the root cause for these issue, that will be great, it
means we might be able to resolved this issue. If he disagreed with the
workaround or needs more information, we will let you know as well."

What do we do now to get their attention?

Start a picket line on the Isle of Man?

Write to the magazines?

Does anyone have Mark Shuttleworth's phone number?

I agree also. Most of you are much better than I when you are getting deep into the working of an operating system. I have not tried the "comment 83" fix for fear of breaking something and creating a royal mess.

I was new to Ubuntu and Linux in February 2009. Since then I've installed Ubuntu on 20 machines. I started with 8.10, and then progressed through 9.04, 9.10, and now 10.04. I've converted about 10 people to Ubuntu from Windows. Everything was fine until 10.04. I originally thought that each version would be building on and improving the previous release. That held true until 10.04.

I filed the initial bug about color printers printing with a cream colored background instead of white. Seeing the garbled word Ubuntu on systems which were upgraded, and long periods of a black screen, makes it seem like it was released before it was ready, just like Vista. My experience with Vista is one of the reasons I'm running Ubuntu. To have seen the repository builds of OpenOffice not having a functioning spell checker was a disgrace. The same goes for some systems, through all releases, not being able to come out of hibernation without powering down and rebooting, and that issue never being fixed.

But how do I explain to those whom I converted to Ubuntu that a major issue like their printers not working isn't fixed for over 2 months after the version was released? And this is an LTS?????? The spell checker was fixed. Within a week there was a workaround for the cream colored background, and in about 3 weeks a fix was put out in the updates. The spell checker now works in OpenOffice. A clean install is the only thing which gives a somewhat decent view of the boot process during boot. It seems quite ridiculous though that this printing issue isn't fixed. I have two machines affected by this. The strange thing is some of my machines have never had the issue.

I think the problem is that nobody really wants to take responsibility for
this. I have another bug open which describes Xorg not starting because it
cannot open the console (which is probably the same reason cups does not
start sometimes). I posted this bug against upstart, but it was rejected as
probably being a bug with nvidia, and that's where the bug is stuck now.

As far as I followed all these bug reports about services not starting etc.,
everything seems to come together around /dev/console and tty devices which
cannot be opened, or the loopback devices initialized by the wrong service.
So to me it is either a kernel bug (unlikely) or an upstart bug (wrong deps,
or events fired too early or missing / timing issues).

The biggest problem however is that I have to find arguments for my customers
why we didn't choose Windows, which "works much better" (which I don't believe
in the long run, as it usually does not "just work" for long with
inexperienced users, who have to invest in costs like regular
maintenance, Windows license, antivirus license, bigger hardware).

2010/7/1 Don Myers <email address hidden>

> But how do I explain to those whom I converted
> to Ubuntu that a major issue like their printers not working isn't fixed
> for over 2 months after the version was released. And this is an
> LTS??????

I had the problem after installing the NVIDIA proprietary driver. Uninstalling this driver solved the problem. Maybe it is not specifically a cups problem: as already mentioned above, virtual consoles did not work while the NVIDIA proprietary driver was installed.
This happened on a 64-bit machine with 32-bit Linux installed.

I had the problem with 64-bit Linux and the NVIDIA proprietary driver installed too. I don't know if uninstalling the NVIDIA proprietary driver could be a solution for this; I tried reinstalling Linux with 32 bit.

Jarige (jarikvh) wrote :

I don't think this is an NVIDIA bug though, I don't have NVIDIA drivers installed and I'm still affected.

I've upgraded about 20 machines from Ubuntu 8.04 to 10.04 and have only seen this problem on one machine.

That was on an old Pentium 3 machine running a "linux-generic" kernel on a 32-bit install. The other machines which do not show this problem were either running a 64-bit install or a "linux-generic-pae" kernel. The Nvidia drivers were not installed on any of them.

So has anyone seen it on a machine which is not a 32-bit install running a "generic" (not "generic-pae") kernel?

On Thu, Jul 01, 2010 at 07:26:47PM -0000, John Edwards wrote:
> So has anyone seen it on a machine which is not a 32-bit install running
> a "generic" (not "generic-pae") kernel?

In my case, I was running a quad-core AMD with a 64-bit Linux. For a while I
thought it was a problem that showed up on "fast" machines. Then someone at
our Linux group had it on an old 32-bit Pentium-4 laptop.

I would stress that whether the machine boots to runlevel "N 2" or "unknown"
appears to be largely random. Several times I thought I had found a fix
(several good boots in a row) and then I'd get burned again.

For Mike Bianchi - what kernel was your 64-bit machine running? To find out run 'uname -a' and post the output.

joschi (meyer-email) wrote :

I found yesterday that I have the same problem. I do not need my printer much, but I think printing worked when I installed Lucid (fresh install). I was very surprised when there was no printer in my kmail printing dialog yesterday.
I tried the workaround in https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/83 and so far that seems to work. But I have to say that I have done only about 5 boots since yesterday.

Menno (m-tjoelker) wrote :

Occasionally Cups is not started after boot.
As far as I have noticed, this always coincides with runlevel 'unknown'.
The problem occurs randomly, as if the system is throwing a dice at each system start.

System: AMD Athlon 64 X2 dual core 3800+, 2GB DDR2 memory

A remark to my fellow reporters:

It is not helpful at all to start blaming Canonical and its developers, to make comparisons with Vista, refer to other, unrelated problems, etc. Too bad a problem like this happens, but such is life. I've seen even worse problems with commercial software in a professional setting - a problem that did not occur in most systems, but persistently hit a large customer and proved very difficult to solve, in spite of serious efforts from the provider. Management perception: bad software, bad software company. Nonsense!

By the way: a better bypass than making changes to system files is to start the missing service(s) manually - in particular for those who do not have a real understanding of what they're doing (and afterwards, don't remember what they did).

On Thu, Jul 01, 2010 at 08:06:52PM -0000, John Edwards wrote:
> For Mike Bianchi - what kernel was your 64-bit machine running? To find
> out run 'uname -a' and post the output.

I don't have access to that machine now, but this is from a syslog kept while
debugging. You can see the "Linux version" and later the "init=/sbin/init --verbose"

 May 25 nn:nn:nn autoaud-broad1 kernel: imklog 4.2.0, log source = /proc/kmsg started.
 May 25 nn:nn:nn autoaud-broad1 rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="989" x-info="http://www.rsyslog.com"] (re)start
 May 25 nn:nn:nn autoaud-broad1 rsyslogd: rsyslogd's groupid changed to 103
 May 25 nn:nn:nn autoaud-broad1 rsyslogd: rsyslogd's userid changed to 101
 May 25 nn:nn:nn autoaud-broad1 rsyslogd-2039: Could no open output file '/dev/xconsole' [try http://www.rsyslog.com/e/2039 ]
 May 25 nn:nn:nn autoaud-broad1 kernel: [ nn.nnnnnn] Initializing cgroup subsys cpuset
 May 25 nn:nn:nn autoaud-broad1 kernel: [ nn.nnnnnn] Initializing cgroup subsys cpu
 May 25 nn:nn:nn autoaud-broad1 kernel: [ nn.nnnnnn] Linux version 2.6.32-22-generic (buildd@yellow) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #33-Ubuntu SMP Wed Apr 28 nn:nn:nn UTC 2010 (Ubuntu 2.6.32-22.33-generic 2.6.32.11+drm33.2)
 May 25 nn:nn:nn autoaud-broad1 kernel: [ nn.nnnnnn] Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.32-22-generic root=UUID=c2e45336-da96-4d8e-bd38-3caa7bded803 ro init=/sbin/init --verbose
 May 25 nn:nn:nn autoaud-broad1 kernel: [ nn.nnnnnn] KERNEL supported cpus:
 May 25 nn:nn:nn autoaud-broad1 kernel: [ nn.nnnnnn] Intel GenuineIntel
 May 25 nn:nn:nn autoaud-broad1 kernel: [ nn.nnnnnn] AMD AuthenticAMD
 May 25 nn:nn:nn autoaud-broad1 kernel: [ nn.nnnnnn] Centaur

Since we are looking at old syslogs, notice the "Failed to spawn rc-sysinit" line of this snippet ...
 May 26 09:37:42 autoaud-broad1 kernel: [ 0.901072] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input1
 May 26 09:37:42 autoaud-broad1 kernel: [ 0.901105] ACPI: Power Button [PWRF]
 May 26 09:37:42 autoaud-broad1 init: Failed to spawn rc-sysinit main process: unable to open console: Input/output error
 May 26 09:37:42 autoaud-broad1 init: rc-sysinit goal changed from start to stop
 May 26 09:37:42 autoaud-broad1 init: rc-sysinit state changed from spawned to stopping
 May 26 09:37:42 autoaud-broad1 init: Handling stopping event
 May 26 09:37:42 autoaud-broad1 init: rc-sysinit state changed from stopping to killed
 May 26 09:37:42 autoaud-broad1 init: rc-sysinit state changed from killed to post-stop
 May 26 09:37:42 autoaud-broad1 init: rc-sysinit state changed from post-stop to waiting
 May 26 09:37:42 autoaud-broad1 init: Handling stopped event
 May 26 09:37:42 autoaud-broad1 init: Handling net-device-up/failed event
 May 26 09:37:42 autoaud-broad1 init: Handling started/failed event

This is one of the things that leads me to believe that we are dealing with an upstart bug.
Failure to open the console does not sound to me like a good reason to kill an /etc/init/*.conf script.
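
To check whether your own machine logged the same failure, the syslog can be searched for that message. A minimal sketch, assuming the wording matches the snippet above and that the log file path is passed in; find_console_failures is a made-up helper name.

```shell
# Print "unable to open console" spawn failures logged by init.
# $1: log file to search (for example /var/log/syslog).
find_console_failures() {
    grep 'init: Failed to spawn .* main process: unable to open console' "$1"
}
```

Any line printed names the job that could not be spawned, rc-sysinit in the snippet above.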

Menno, it's not just CUPS. The server I met this problem with does not even have CUPS installed.

If runlevel 2 is not entered then it is *all* services that are started by rc scripts that fail. That is almost everything.

The only things that are running are udev, rsyslog and sshd. You don't even get a login getty on the console. If sshd did not start for any reason, then the machine would appear completely dead and you would have to boot from a rescue CD (or similar). That's a bit difficult when the server is in a data centre in a different city.

Andy (andy-xillean) wrote :

On a new AMD quad-core box with 4GB RAM and nVidia GeForce 6150 integrated graphics running 10.04 AMD64, I had a problem with X not starting up on boot (black screen). It was just booting to a console logon. If I logged on to the console and ran /etc/init.d/gdm start, then X would start up without issue. I disabled the on-board 6150 and installed a 9800GTX, and the problem on that box disappeared. Now that I see this thread, I think that was an upstart problem too.

On another, older Intel Celeron 800MHz box with Intel integrated graphics, I had a black screen and cups not starting up. That machine is still crippled at the moment, only running in VESA mode, and still sometimes gets a black screen. It worked perfectly in 9.10.

On another high-end AMD64 box with 8GB RAM and a 9800GTX video card this issue is showing up too. On this box bind9, ddclient and cups fail to start on boot about 50% of the time.

On a new HP server with 10.04 AMD64 and 8 CPUs, bind starts 50% of the time. Thankfully it's a server and they won't have to be rebooting it much.

On a Dell Inspiron 1520n that came with Ubuntu installed from Dell: this was upgraded to 10.04 and now cups starts about 70% of the time. This is down from 100% of the time on 9.10.

On a Dell Inspiron Mini (netbook) with Intel GMA500, cups starts about 70% of the time that system is booted. The GMA500 is using Lucazade's hacked GMA500 driver, else Ubuntu wouldn't even run on this netbook. As far as GMA500 and 2D acceleration go, Lucazade and Jbernardo (http://ubuntuforums.org/showthread.php?p=9324361) fixed what Canonical and Intel could not. Upstart is still not starting up cups properly on this machine, though. Maybe Lucazade and Jbernardo could look at it.

On Thu, 01 Jul 2010 22:14:43 -0000
John Edwards <email address hidden> wrote:

> Menno, it's not just CUPS. The server I met this problem with does not
> even have CUPS installed.
>
> If runlevel 2 is not entered then it is *all* services that are started
> by rc scripts that fail. That is almost everything.
>
> The only things that are running are udev, rsyslog and sshd. You don't
> even get a login getty on the console. If sshd did not start for any
> reason, then the machine would appear completely dead and you would have
> to boot from a rescue CD (or similar). That's a bit difficult when the
> server is in a data centre in a different city.
>

Hello,

If it concerns a server located in a data centre and it is not remotely accessible,
that is pretty serious indeed.

As far as I have noticed, if the runlevel is not set and Cups is not started, nothing
important beyond that is missing. Even services that are started in the same way
(from /etc/rc2.d/) are running normally (e.g., ntpd, sshd, pulseaudio).

Menno

In reply to comment #104 - I just couldn't start a VirtualBox virtual machine, so I ran "start rc-sysinit". Afterwards, ps aux showed me the following processes, which hadn't been started before: VirtualBox, CouchDB, CUPS, Apache2 and ondemand.

Michael Goetze (mgoetze) on 2010-07-06
affects: upstart (Ubuntu) → linux (Ubuntu)
Chris (cdf) wrote :

I'm running 10.04.
This problem only started in the last two weeks for me. I noticed one 'update' recently included 'cups' and since then it's been a problem. Apparently the 'CUPS service is stopped' and has to be started via a terminal. Initially a 'restart' would get it working, but now it never seems to start on boot.

I have tried booting my Pentium 3 server (see comments 151 and 157 above) with both a generic and a generic-pae kernel, and both have problems entering run level 2.

Closer inspection of the kernel messages shows that ureadahead, plymouthd and init were being killed during boot by the kernel's out-of-memory killer. I disabled ureadahead by deleting /etc/init/ureadahead.conf and /etc/init/ureadahead-other.conf, and the server has rebooted several times without problems.

The problem of ureadahead being out-of-memory could be related to bug #590963:
  https://bugs.launchpad.net/ubuntu/+source/ureadahead/+bug/590963

I suspect that the init process being killed by the oom-killer is why this server is not entering run level 2.
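
To see whether a machine is hitting the same thing, the kernel messages can be searched for oom-killer activity. A minimal sketch, assuming the relevant lines contain "oom-killer" or "Out of memory" (the exact wording varies between kernel versions) and that the log file path is passed in; find_oom_kills is a made-up name.

```shell
# Print kernel log lines that mention the out-of-memory killer.
# $1: log file to search (for example /var/log/kern.log or /var/log/syslog).
find_oom_kills() {
    grep -i -E 'oom-killer|out of memory' "$1"
}
```

If init, plymouthd or ureadahead show up in the matching lines during boot, that would be consistent with the explanation above.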

Don Myers (donmyers) wrote :

The cups update didn't change anything relative to this problem for me with the printers not being available after a startup. The machines that have had the problem continue to have the problem. The ones that don't have the issue haven't had a problem yet. But one thing I have noticed with 10.04, as compared to Ubuntu 8.10, 9.04, and 9.10, is that occasionally the printers show as available, but when you click to print a document it takes 15 to 20 seconds before it prints. It says it is searching for the printer even though cups appears to be started and the printers are displayed. This happens both on the machines where cups fails to start at times and on machines where cups always starts.

Don Myers (donmyers) wrote :

Hi,

I did a clean install on a new computer about 3 weeks ago. It had never had an issue of the printers not being loaded. I booted this morning, and about 15 minutes ago went to print for the first time today, and no printers. Doing the 'sudo start rc-sysinit' and the 'ps aux' commands per response 104 here yielded the following information:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2796 1632 ? Ss 09:31 0:00 /sbin/init
root 2 0.0 0.0 0 0 ? S 09:31 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 09:31 0:00 [migration/0]
root 4 0.0 0.0 0 0 ? S 09:31 0:00 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S 09:31 0:00 [watchdog/0]
root 6 0.0 0.0 0 0 ? S 09:31 0:00 [migration/1]
root 7 0.0 0.0 0 0 ? S 09:31 0:00 [ksoftirqd/1]
root 8 0.0 0.0 0 0 ? S 09:31 0:00 [watchdog/1]
root 9 0.0 0.0 0 0 ? S 09:31 0:00 [migration/2]
root 10 0.0 0.0 0 0 ? S 09:31 0:00 [ksoftirqd/2]
root 11 0.0 0.0 0 0 ? S 09:31 0:00 [watchdog/2]
root 12 0.0 0.0 0 0 ? S 09:31 0:00 [migration/3]
root 13 0.0 0.0 0 0 ? S 09:31 0:00 [ksoftirqd/3]
root 14 0.0 0.0 0 0 ? S 09:31 0:00 [watchdog/3]
root 15 0.0 0.0 0 0 ? S 09:31 0:00 [events/0]
root 16 0.0 0.0 0 0 ? S 09:31 0:00 [events/1]
root 17 0.0 0.0 0 0 ? S 09:31 0:00 [events/2]
root 18 0.0 0.0 0 0 ? S 09:31 0:00 [events/3]
root 19 0.0 0.0 0 0 ? S 09:31 0:00 [cpuset]
root 20 0.0 0.0 0 0 ? S 09:31 0:00 [khelper]
root 21 0.0 0.0 0 0 ? S 09:31 0:00 [netns]
root 22 0.0 0.0 0 0 ? S 09:31 0:00 [async/mgr]
root 23 0.0 0.0 0 0 ? S 09:31 0:00 [pm]
root 25 0.0 0.0 0 0 ? S 09:31 0:00 [sync_supers]
root 26 0.0 0.0 0 0 ? S 09:31 0:00 [bdi-default]
root 27 0.0 0.0 0 0 ? S 09:31 0:00 [kintegrityd/0]
root 28 0.0 0.0 0 0 ? S 09:31 0:00 [kintegrityd/1]
root 29 0.0 0.0 0 0 ? S 09:31 0:00 [kintegrityd/2]
root 30 0.0 0.0 0 0 ? S 09:31 0:00 [kintegrityd/3]
root 31 0.0 0.0 0 0 ? S 09:31 0:00 [kblockd/0]
root 32 0.0 0.0 0 0 ? S 09:31 0:00 [kblockd/1]
root 33 0.0 0.0 0 0 ? S 09:31 0:00 [kblockd/2]
root 34 0.0 0.0 0 0 ? S 09:31 0:00 [kblockd/3]
root 35 0.0 0.0 0 0 ? S 09:31 0:00 [kacpid]
root 36 0.0 0.0 0 0 ? S 09:31 0:00 [kacpi_notify]
root 37 0.0 0.0 0 0 ? S 09:31 0:00 [k...

Sepero (cowpie2000) wrote :

A Possible Solution

This bug seems very related to this one:
https://bugs.launchpad.net/ubuntu/+source/consolekit/+bug/544139

This solution here is what worked for me and a few others. If it works for you, please help spread it to others.
http://ubuntuforums.org/showpost.php?p=9480172&postcount=20

Gavin Munday (gmunday-shaw) wrote :

Upgrading to Ubuntu 10.04 and trying to get printing to work must be really frustrating for new Ubuntu users.

I had to deal with 2 bugs:

First was bug #595650 (error message: "Printer 'hp-LaserJet3030' may not be connected") related to package libusb. The fix listed worked for me.

Second was bug #554172 discussed here. Sometimes (but not always) after booting up the printer does not show up in System > Administration > Printing and I am unable to print. Starting CUPS manually by typing "sudo service cups start" in terminal solved the problem.

Hendrik Knackstedt (hennekn) wrote :

If this bug is related to https://bugs.launchpad.net/ubuntu/+source/consolekit/+bug/544139 you should have a message like:
"WARNING: Error waiting for native console 5 activation: Invalid argument" in your daemon.log. Could somebody please check?

Xiang (hsiang-liu) wrote :

Every time runlevel returned "unknown", I found that some services had not started at boot.

But if I delete the file /var/run/utmp and reboot, runlevel returns 'N 2', which means all services have started as usual.

As far as I know, the runlevel command is a good indicator of this sort of situation.

Don Myers (donmyers) wrote :

In response to Sepero, message 165 above, could you please clarify the instructions in the second link in your post.

When I run sudo su -, it puts me in root.

In the next step, I'm assuming the words done & exit are not to be included in what I enter into the terminal.
When I run:
sed -i 's/^console output/\#console output/' $file;

I get:
sed: no input files

Obviously I'm doing something wrong. If you would be kind enough to post exactly what needs to be entered into the terminal, I will see if this works.

Thank you.

Don Myers, enter each of the shown lines completely, one after the other, into a terminal window and press Enter after each line.

First enter:

sudo su -

You will be asked for your password. After entering it correctly you will get a root prompt, which means that all subsequent commands in this terminal will be executed as root.

Then enter the next line completely (copy and paste it into the terminal):

for file in /etc/init/*.conf; do sed -i 's/^console output/\#console output/' "$file"; done

This line executes the sed command in a loop, applying it to each .conf file in the /etc/init/ directory.

After that, enter

exit

to get out of the root console back into normal mode.
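
If you would rather keep a copy of each file before changing it, the same loop can be written with a backup suffix. This is only a sketch, not a tested fix for the bug: the function name comment_console_output and the .bak suffix are my own choices, and it assumes a sed that accepts -i with a suffix (GNU sed does).

```shell
# Hypothetical variant of the loop above: comment out "console output"
# in every .conf file of a directory, leaving FILE.bak copies behind.
# The directory defaults to /etc/init; pass another one to try it safely.
comment_console_output() {
    dir=${1:-/etc/init}
    for file in "$dir"/*.conf; do
        [ -f "$file" ] || continue      # skip if the glob matched nothing
        sed -i.bak 's/^console output/#console output/' "$file"
    done
}
```

As root you would call it as: comment_console_output /etc/init. To undo the change, copy each .bak file back over its .conf file.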

DAP (douglas-peale) wrote :

Some behaviors I'm seeing on my system that might be related to this problem:

Occasionally I get a message box saying "Camera Monitor is already running". Camera Monitor is only started via "Startup Applications Preferences" and should only start once. When this happens, cupsd is running, and runlevel is N 2 (though I have only tested today's boot, since I did not know about the runlevel command before today).

I have also had mythbackend fail to start.

Now that I know about the runlevel command, I will watch the runlevel and see if there are correlations to "Camera Monitor is already running".

Menno (m-tjoelker) wrote :

It looks like the problem in my case has disappeared after replacing kernel version 2.6.32-23 by version 2.6.32-24.
I am not absolutely sure: it could be a coincidence, and it could also be caused by some other system modification or software upgrade (my usual procedure is to install updates immediately), but since that moment I haven't experienced the problem any more. (I have made it a habit to execute runlevel and ps ax|grep cups after every boot.)
If some sort of race condition is causing the problem, any change could slightly modify the timings and so make the problem disappear.

Don Myers (donmyers) wrote :

First, for Till and Post 170, thank you so much for your clarification. With that help I have done all of the items suggested by Sepero in post 165 above at this link:
http://ubuntuforums.org/showpost.php?p=9480172&postcount=20
on the machine that has given me the most problems. I've done 3 reboots and 1 cold boot so far and everything has worked fine. I'll do the second machine tonight and keep you updated on how both are doing.

On 07/16/2010 04:34 PM, Don Myers wrote:
> I'll do the second machine tonight and keep you updated on how both are
> doing.

Please don't. This is a bug report and not a personal support forum. I
have no idea how you Ubuntu users expect developers to sift through bug
reports when you constantly spam them with repetition and confirmation
of workarounds...

I had no idea I was spamming this board. I'm sorry. The computer I put this workaround on belongs to a co-worker who has been frustrated enough to tell me at least 3 times to put Windows back on his machine and forget Ubuntu and Linux. For 2 1/2 months now I've convinced him to hang in there. He is not overly computer savvy, and gets frustrated easily when things don't work. I was trying to keep an Ubuntu user and not lose him, and I was following the instructions in post 165 to see if this would work. I am sorry this was considered spamming.

DAP (douglas-peale) wrote :

I have had another instance of the titleless message box with the text:
Another instance of Camera Monitor is already running!

runlevel reports: N 2

Is there any way to tell if this bug, when it doesn't cause daemons to fail to start, attempts to start them more than once?

dsainty (dave-dtsp) wrote :

I applied the first two items of https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/83 and that seems to have helped.

This seems like a well understood problem, why is it not well fixed?

Don Myers (donmyers) wrote :

I did things listed here in the link in Post 165:
http://ubuntuforums.org/showpost.php?p=9480172&postcount=20

My co-worker's computer, which would load the printers at best 50% of the time, has loaded them every time in 15 boots.

Nick Davis (argoneus) wrote :

The workaround described in http://ubuntuforums.org/showpost.php?p=9480172&postcount=20 doesn't work for me. I still have services that don't start on system reboot.

On Mon, Jul 19, 2010 at 08:04:03PM -0000, phireph0x wrote:
> The workaround described in
> http://ubuntuforums.org/showpost.php?p=9480172&postcount=20 doesn't work
> for me. I still have services that don't start on system reboot.

See
   https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/99
and
   https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/81

Did you follow those instructions?

I don't think that any workarounds could really fix this - the
workarounds just modify timing or reduce impact. I think the problem
is really much more lowlevel - device nodes that are not ready, or
events emitted by upstart in wrong order or wrong timing.

I don't want to be mean, but I'd suggest to move such repeating
discussions to the support forum (or stay there), this is a bug
tracker. Add yourself to CC to express your urge for a fix but move
"did you try this and that" to a forum and report back if a
work-around really fixes a problem reproducibly for multiple people.
Especially since the backref is actually a forum link!

On Tue, Jul 20, 2010 at 09:49:13AM -0000, Kai Krakow wrote:
> I don't think that any workarounds could really fix this - the
> workarounds just modify timing or reduce impact.

You are absolutely right.
These are workarounds.

But they are the best we can do because I, for one, do not have the time
to chase this problem as a bug in /bin/init
nor the knowledge to chase it as a bug in the kernel.

The workarounds that work for me (follow the links in comment 180) allowed
me to deliver a working machine to a customer. But I have given up on 10.04
until this bug is properly addressed.

> I think the problem is really much more lowlevel - device nodes that are not ready, or
> events emitted by upstart in wrong order or wrong timing.

I think that point has already been made and has been apparent for months.
See
  https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/54
and
  https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/58
and
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/554172/comments/136

> I don't want to be mean, but I'd suggest to move such repeating
> discussions to the support forum (or stay there), this is a bug
> tracker. Add yourself to CC to express your urge for a fix but move
> "did you try this and that" to a forum and report back if a
> work-around really fixes a problem reproducibly for multiple people.
> Especially since the backref is actually a forum link!

This bug is labeled Confirmed, of High Importance, and _Unassigned_ !

We are trying desperately to get someone actively working on the problem.
I've been told by Canonical Support that they are not working on it because
they have not been able to reproduce it!

The activity here is an attempt to get their attention,
demonstrating that it is a real bug affecting more than a few people.
My experience has been that discussions in the forums do not get any
attention at all from Canonical.

If you read all the activity in this and
duplicate bugs (especially #543506) some of us have put a lot of effort
into narrowing down the specific symptoms that cause this class of
"(something) does not start on boot" bug.

My hope is that by accumulating enough evidence here we may get someone
to modify /bin/init so it will determine that /dev/console is writable
before progressing with the /etc/init/*.conf scripts
_or_ fix the underlying kernel bug.

My understanding of the problem is that /dev/console exists but isn't
writable.
Also, it appears it becomes writable after a little while.

Would waiting for /dev/console to become writable be a possible workaround?
For example by adding something like this to the relevant /etc/init/*
scripts:

pre-start script
  while ! /bin/dd if=/dev/null of=/dev/console 2>/dev/null
  do
    sleep 1
  done
end script

If that works it may even be possible to restore the 'console output' lines
(though this depends on whether they take effect after pre-start is
executed).
Note that this will hang if /dev/console does not become writable!!

Another way to accomplish the above would be to create a
/etc/init/console-ready job with the above pre-start section.
Jobs that require /dev/console to be writable could then be made dependent
on console-ready.
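
To avoid the hang mentioned above, the wait could be given an upper bound. The following is a rough sketch of that idea, not something tested against this bug: the function name wait_for_writable and the default of 10 tries are assumptions of mine, and the body would still have to be placed in a pre-start script section like the one above.

```shell
# Hypothetical bounded version of the wait loop above.
# Usage: wait_for_writable PATH [TRIES]; returns 0 once PATH can be
# opened for writing, 1 if it never could within TRIES attempts.
# Only point this at device nodes: dd truncates regular files.
wait_for_writable() {
    target=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        if dd if=/dev/null of="$target" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

Called as: wait_for_writable /dev/console 10 || true, the job would continue after about ten seconds even if the console never becomes writable, instead of blocking the boot forever.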

I'd try the above myself but I've moved to Debian after too many issues with
Ubuntu 10.04.
Sad as generally I really like Ubuntu.

Wout

OK, I installed 10.04 Server AMD64 fresh on a brand-new Dell 2950 dual 6-core AMD Opteron box with 16GB of RAM.

Report:
Sometimes on boot I get a hang right after this message:
"init: ureadahead-other main process terminated with status 4"

Sometimes the server will boot up fully to the console login but some services have not started
such as libvirtd.

Sometimes when this happens the runlevel command shows (unknown)
At other times it shows N 2

Not sure what is going on, but this problem has put a halt to all 10.04 server deployments on our side. We cannot deploy servers in this state. An OS whose critical services fail to start reliably (at least 99% of the time) is fundamentally not ready for production deployment, especially for virtual machine hosts and the like. I have now seen this problem on many machines, both workstations and servers, some of which I have reported on here. It is usually the faster ones, if that helps. I hope this gets fixed, as we are a die-hard Ubuntu shop and are now feeling pressured to start evaluating Debian for our customers' needs. I hope Mark Shuttleworth is aware of this issue, because it is becoming very critical out here in the trenches.

Richard Driscoll (rjdriscoll) wrote :

Ubuntu 10.04 32 bit. I have two USB printers. If both printers are off at boot time CUPS does not start and turning on a printer fails to start it. If a printer is on at boot time then CUPS starts and both printers are available (even though one of them may be turned off). This was not a problem on 9.04 and I suspect that it wasn't initially a problem on 10.04 so one of the updates may have broken something.

Don Myers (donmyers) wrote :

I never had an issue with there being any difference whether the printers were on or not on boot. On 1 computer cups would not be started 80% of the time. On another about 30% of the time. The workaround in:
http://ubuntuforums.org/showpost.php?p=9480172&postcount=20
has worked flawlessly for me, with no issues of cups not starting since I applied it. I did all of the items listed there.

Don Myers (donmyers) wrote :

I found today that a third computer that I upgraded from 9.10 to 10.04 is also having this same issue.

Nick Davis (argoneus) wrote :

RE: Mike Bianchi Yes, I tried ALL of the instructions suggested: commenting out 'console output' in /etc/init/*.conf files, adding the rsyslog config line to /etc/init/rc-sysinit.conf, and adding the "init='/sbin/init --verbose'" line to the grub config. This didn't solve my problem: I still have services that don't start on reboot, including nginx and a couple fastcgi init scripts.

An interesting, possibly related, thing that I've noticed is when I issue 'shutdown -r now', sometimes the machine doesn't reboot. I've had to issue the command again to get the machine to actually reboot.

I'll add my 2 cents that this issue is quite dire (what organization wants a server that doesn't reliably start services on boot?), and hope that Canonical is working on a fix.

On Thu, Jul 22, 2010 at 03:18:10PM -0000, phireph0x wrote:
> RE: Mike Bianchi Yes, I tried ALL of the instructions suggested:
> ... .... I still have services that don't start on reboot,

Thanks for all your efforts.

Do you get runlevel "unknown" or "N 2" when your services do not start?

If they do not start _but_ you have runlevel "N 2" then I'd go looking
in /var/log/syslog and /var/log/messages for hints as to why?

I had this problem on one of my systems after upgrading to lucid - about 50% of boots left cups not running and nothing meaningful in the logs. It would always start manually. I have several other lucid systems that never encounter this problem.

I applied the circumvention from my own bug report https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/604283 (adding lo:0 to /etc/network/interfaces as described) and since then, cups has started on the system in 5 out of 5 boots.

If anyone has this problem, please try the same circumvention and update both bug reports with your observations.

Don Myers (donmyers) wrote :

I've done 20 installs of 10.04, mostly upgrades from 9.10, all 32 bit. Initially only 2 machines were problem machines: my computer at home and my co-worker's computer at the office. None of the others had this issue for months. The upgrades were done shortly after 10.04 was released. On the two problem machines, about a week and a half ago, I did the things listed here in the link in Post 165:
http://ubuntuforums.org/showpost.php?p=9480172&postcount=20
So far those machines have performed perfectly.

But I am now finding this happening to machines which had not had the issue: my main machine and my backup machine at the office last week, and last night my brother's machine. He had not had any issues since I upgraded him in early May. Each time I run into this, I now apply the workaround above. It looks like many computers are going to face this issue even if they seemed fine for a period of time. It doesn't matter whether the computer is a newer, high-spec machine or an older, low-spec machine. It really looks like I had best install the workaround in post 165 as soon as I do any new installs (I have one I will be doing soon), just so people don't have this issue and I don't have problems to fix for them.

Menno in post 172 above said "replacing kernel version 2.6.32-23 by version 2.6.32-24" seemed to solve the issue for him. That kernel came in my updates last night, so this issue happened to the three other computers before the updated kernel. I don't know if that makes a difference or not.

Robbie Williamson (robbiew) wrote :

Before diving into this comment quagmire, I'd like to know if anyone experiencing this bug, also has the latest SRU kernel (2.6.32-24.38) installed.

I have the latest kernel:
Linux version 2.6.32-24-generic (buildd@yellow) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #38-Ubuntu SMP Mon Jul 5 09:20:59 UTC 2010

This version worsened the bug. Before, cups randomly refused to start, but now it never starts at boot time.
Further symptoms of the same behaviour (these occurred only now and then with the previous kernel, and now occur always):
- sound system is not working either
- Trying to reboot from Gnome does not work: the login screen is displayed. Rebooting with "sudo reboot" works fine.

I have an SSD installed, and I suspect that that might be part of the problem (the fast boot time).

Andy (andy-xillean) wrote :

Same problem here with this kernel. The problem also shows up more frequently on the faster systems here. All the fast systems we set up exhibit this problem intermittently; see my post #158 above for typical specs.

Robbie Williamson - I can confirm occasional problems entering runlevel 2 with the 2.6.32-24 kernel.

--------
$ uname -a
Linux aa000039.uk.fabit.net 2.6.32-24-generic-pae #38-Ubuntu SMP Mon Jul 5 10:54:21 UTC 2010 i686 GNU/Linux
$ runlevel
unknown
--------

This is on a Pentium 3 server with 1.2GB of RAM. The files in /etc/init/ do not have "console output" commented out, and ureadahead has not been removed. Those are the two usual things I use to provide a fix to this problem.

In this configuration the server fails to boot about 50% of the time.

I have attached the dmesg, output of 'ps auxwwf', meminfo, cpuinfo and contents of /etc/init/. I hope this helps you find the problem. Let me know if you need to test anything else.

Steve Oliver (pdt163r) wrote :

Ubuntu 10.04 32 bit with the latest kernel.

Brian Burch's circumvention in post 190 works for me.

Changed in linux (Ubuntu Lucid):
importance: Undecided → High
status: New → Confirmed
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Changed in linux (Ubuntu Lucid):
importance: High → Medium
importance: Medium → High
Robbie Williamson (robbiew) wrote :

To summarize the 190+ comments on this bug:
1) There is definitely a problem that needs fixing...no arguing this.
2) It has *something* to do with when lo0 is activated, i.e. a race condition
3) There are apparently a few hacky workarounds, but no fix
4) There is no concrete evidence that this is a kernel issue
5) There is no concrete evidence that this is an upstart issue

This bug is officially "on the radar", so we will figure out what's causing the issue and address it in an SRU, but it's too late for 10.04.1.

affects: linux (Ubuntu) → ubuntu
Changed in ubuntu:
importance: Critical → High

Hello Robbie,

If you plan to dive into this quagmire now you should have a look at bug 543506 as well (if you haven't already). It was marked a duplicate of this one.
In 543506, Mike Bianchi posted some lines from his logs (May 26 email):

May 26 09:37:42 autoaud-broad1 init: Failed to spawn rc-sysinit main process: unable to open console: Input/output error
May 26 09:37:42 autoaud-broad1 init: rc-sysinit goal changed from start to stop
May 26 09:37:42 autoaud-broad1 init: rc-sysinit state changed from spawned to stopping
I think the 'unable to open console' message points to the core of all this.
Why would /dev/console not be available?
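
To see which jobs were hit on a given machine, the job names can be pulled out of log lines of this form. This is just a sketch: the function name failed_console_jobs is made up, /var/log/syslog as the default log file is an assumption, and the pattern is taken only from the "Failed to spawn" line quoted above.

```shell
# Hypothetical helper: print the unique names of Upstart jobs that
# failed to spawn because the console could not be opened.
# The log path defaults to /var/log/syslog (an assumption).
failed_console_jobs() {
    logfile=${1:-/var/log/syslog}
    sed -n 's/.*init: Failed to spawn \(.*\) main process: unable to open console.*/\1/p' "$logfile" | sort -u
}
```

An empty result after a bad boot would suggest that the failure on that machine shows up in the logs in some other form.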

Wout

On Mon, Aug 02, 2010 at 03:24:28AM -0000, Robbie Williamson wrote:
> :
> 2) It has *something* to do with when lo0 is activated, i.e. a race condition

I disagree, it has something to do with not being able to write /dev/console.
See my analysis in bugs 581291 and 543506.

> 3) There are apparently a few hacky workarounds, but no fix

They _are_ hacky, _BUT_ they give some of us reliable boots.
They should be distributed until a fix is found.

> 4) There is no concrete evidence that this is a kernel issue
> 5) There is no concrete evidence that this is an upstart issue

I disagree. There is strong, but inconclusive, evidence that it is EITHER
a kernel or upstart issue (maybe both). Again see 543506.

> This bug is officially "on the radar", so we will figure out what's
> causing the issue and address it in an SRU, but it's too late for
> 10.04.1.

I do not see why the hacky workarounds can not be part of the official
distribution until the real fixes are implemented.

I strongly recommend distributing the workarounds, including the
 init='/sbin/init --verbose'
I had some evidence that it _helped_ improve boot reliability.
It certainly helped with boot sequence visibility.

I would think you would want to increase the population of reliable boots.
The evidence is that over the past months newer 10.04 updates have been
reducing that population.

Workarounds are hocus pocus and should not be distributed. They do not work
in all cases, so are at best an indicator of where to look for a developer.
Shouting even when frustrated does not solve problems faster. I will wait
patiently for a fix that works reliably, or use something else for the
interim.

On Mon, Aug 02, 2010 at 12:16:08PM -0000, Phil Went wrote:
> Workarounds are hocus pocus and should not be distributed. They do not work
> in all cases, so are at best an indicator of where to look for a developer.
> Shouting even when frustrated does not solve problems faster. I will wait
> patiently for a fix that works reliably, or use something else for the
> interim.

Allow me to disagree.

For those of us able to implement the workarounds, once we discover them for ourselves or in the bug reports, waiting is an acceptable choice.

But, for the many others out there who don't even know there is a problem and just discover that CUPS, /etc/getty, or whatever are suddenly not working, for no apparent reason, it becomes frustration without explanation.

So I stick by my recommendations -- distribute these "workarounds" from https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/81

 add "init='/sbin/init --verbose'" to the linux boot line in grub

 comment out the "console output" lines in /etc/init/*.conf

 add "and started rsyslog" to /etc/init/rc-sysinit.conf

To my mind, that last one is not a workaround. It is a crucial dependency.
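
After applying the second item, it may be worth checking that no job file was missed. A minimal sketch of such a check follows; the function name jobs_with_console_output is hypothetical, and it only reads the files, it changes nothing.

```shell
# Hypothetical check: list job files in a directory that still contain
# an active (uncommented) "console output" line.
# The directory defaults to /etc/init.
jobs_with_console_output() {
    dir=${1:-/etc/init}
    for file in "$dir"/*.conf; do
        [ -f "$file" ] || continue      # skip if the glob matched nothing
        if grep -q '^console output' "$file"; then
            basename "$file"
        fi
    done
}
```

If it prints nothing, every "console output" line in that directory has been commented out or was never there.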

I forgot about the /dev/console bit, thanks for the reminder. I did recall an issue where /dev/console was being created, but not writable at creation, which I believe would be a kernel bug. I will talk to our plumbing and kernel developers this week to try and find the problem and a proper fix to it. As for why we don't practice releasing workarounds as "fixes", I'm sorry if you disagree with the policy, but I feel it's a good one to have and don't want to waste bug comment space defending it.

On Mon, Aug 02, 2010 at 03:58:54PM -0000, Robbie Williamson wrote:
> I'm sorry if you disagree with the policy, but I feel it's a
> good one to have and don't want to waste bug comment space defending it.

I made my argument and have been heard. My argument did not convince.
That's the way it is sometimes.
Onward.

I'll place this comment here with great trepidation. I've had this problem for some time and I've been starting cups manually. I do not have the skill to do the workarounds that have been suggested. My initial approach was to try the old uninstall and reinstall using synaptic on my kubuntu 64bit installation, but to no avail. Cups would just not load at start up. Eventually I reinstalled from the console using

$> sudo apt-get install --reinstall cups

 (a solution I found on http://hardc0l2e.wordpress.com/2010/05/21/cups-not-running-on-boot-ubuntu-10-04/)

and for some reason it worked. It has been two days and I have done 10 cold restarts and cups has loaded each time. I can't explain it, it just worked. I do not place this here as a solution but just to say that for some basic users such as myself this has worked and it may be worth looking at why.

On Tue, Aug 03, 2010 at 02:38:04AM -0000, Lindsay Barclay wrote:
> ... and for some reason it worked. It has been two days and I have done 10
> cold restarts and cups has loaded each time. I can't explain it, ...

This bug, which at its base is unreliable boots where scripts in
/etc/init/*.conf and /etc/rc2.d do not get executed, has _always_ proven to
be random. We have seen dozens of successful boots in a row, followed by a
failure, and vice versa.

The workarounds _do_not_ promise you will never see a bad boot. I can only
say that, for me, it _appears_ to have fixed the problem.

You can determine if the boot was probably good by running the runlevel
command. If it returns "N 2", the evidence is that you are _very_likely_ to
have had a good boot and all the services are started. If it returns "unknown"
the evidence is that many services did not start, CUPS among them.

I added the runlevel command as the last line of my .bashrc profile,
so I saw the state as soon as I opened a terminal.
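
The runlevel check described above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and it assumes that "N 2" is the only output that counts as a good boot, as reported in this thread. Other comments in this bug note that services can fail even when runlevel reports "N 2", so treat the result as a hint, not proof.

```python
import subprocess


def boot_looks_good(runlevel_output: str) -> bool:
    """True when the output is the "N 2" reported after good boots in this thread."""
    return runlevel_output.split() == ["N", "2"]


def read_runlevel() -> str:
    """Run `runlevel` and return its output, or "unknown" if it cannot be run."""
    try:
        result = subprocess.run(
            ["runlevel"], capture_output=True, text=True, check=False
        )
    except OSError:
        # Assumption: treat a missing command the same as an unknown runlevel.
        return "unknown"
    return result.stdout.strip() or "unknown"
```

Calling boot_looks_good(read_runlevel()) from a login script would give the same at-a-glance signal as putting runlevel at the end of .bashrc.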

I managed to eliminate this issue merely by commenting out all the "console output" lines in the five init files mentioned above. What is also interesting is this....

I did not have any problems with this machine while it was on my desk. I booted it countless times without problems while configuring it. After packing it up and shipping it to a data center where it was merely hooked up (ethernet/power, no monitor) it fails to boot pretty much all the time (but stops at different points). Occasionally it will stop when SSH is up but the consoles are still down, allowing me to get into the machine.

The machine has a DRAC. As I said, the _only_ thing I changed were the "console output" lines. Could the lack of a monitor actually plugged into the DRAC have anything to do with it (Dell Remote Administration Card)? Maybe there are three workarounds because there are different causes for the same "race" conditions?

Grasping but the difference was immediate and abrupt. And the same server was working _unchanged_ but just in a different physical location with absolutely _zero_ changes. It wasn't even configured to use DHCP, it was static IP in both locations so there shouldn't really be any material "network" effects...

Hardware: Dell R200 DRAC4p

> . . . Could the lack of a monitor actually plugged into
> the DRAC have anything to do with it (Dell Remote Administration Card)?

I would say "yes". While working out that /dev/console was part of the problem I did many boots with "verbose" turned on and could see that the order of the start-up processes varied all over the place from boot to boot, even when the list of start-up processes was identical. This was a 4-core machine, so I wasn't all that surprised. So having something major like a video card not going through configuration because there is no monitor to query sounds like it would affect the work load, which in turn could affect the start-up sequence.

And my evidence is that a start-up process that attempts to write /dev/console before it is writable blows away most of the following boot services.

Anthony Glenn (aglenn-pcug) wrote :

Alas, I too am being regularly bitten by this bug. I have a fully up to date version of Lucid, V10.04. Computer: mobo Asus M2N-SLI DELUXE, nVidia chipset, CPU dual core Athlon 64, main memory 1GiB, video chipset nVidia 7600GS, monitor 19-inch Hitachi CM766ET CRT. The computer was built by me in 2006; it would be considered very humble and ordinary these days.

INCIDENCE: I cold boot every day, and the bug usually happens. If I restart, the bug usually does not happen, however, restarts are usually useless to clear the bug, once it does happen.

It has only happened since around late July 2010. Previously, Ubuntu would boot satisfactorily.

DETECTION: After every cold boot or restart, I do System > Administration > Printing. If there is no printer shown in the Printing - localhost window, then CUPS did not start and I cannot print. Then I do the workaround (see below), that fixes it. If the printer is there in the window, the bug is not present and I continue as normal.

WORKAROUND: Do Applications > Accessories > Terminal, Gnome Terminal starts. Enter command:

sudo telinit 2

Put in the user password, command returns, then quit Terminal. Recheck System > Administration > Printing to verify that CUPS is now running. Continue as normal.

RUNLEVEL: I have spent many hours doing CLI commands to try to investigate this bug. At all times when the bug is present, doing the runlevel command gives the result "unknown". When the bug is not present, runlevel reports "N 2".

This suggests that the Upstart job /etc/init/rc.rc-sysinit.conf is not running. In an attempt to make it run more reliably, I changed the line in rc-sysinit.conf reading:

start on filesystem and net-device-up IFACE=lo

to:

start on filesystem

I had a theory that the parsing for events in Upstart was not working properly, so simplifying the event might help. Nope. That did not fix the bug.

SCRIPT "rc.local" NOT RUNNING: When the bug happens, /etc/rc.local does not run. I verified that by putting into rc.local, just after the initial comments, the following two lines:

# Put a status line in /var/log/syslog
logger "Running /etc/rc.local"

Then, whenever /etc/rc.local is run, there gets to be a line in /var/log/syslog which includes the text:

localhost logger: Running /etc/rc.local

Such lines can easily be found by doing gedit /var/log/syslog (or gedit /var/log/syslog.1, if syslog is too recent), then search for "logger".

When the bug happens, there is no such line in /var/log/syslog for the buggy start.
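
For anyone who would rather not search syslog by hand, the check can be sketched as below. The function name is invented for illustration and the sample lines are made up; real syslog timestamps and hostnames will differ. The only thing taken from the description above is the marker text written by logger.

```python
MARKER = "Running /etc/rc.local"


def rc_local_entries(syslog_lines, marker=MARKER):
    """Return the syslog lines that contain the logger marker."""
    return [line.rstrip("\n") for line in syslog_lines if marker in line]


# Made-up sample lines; real syslog timestamps and hostnames will differ.
sample = [
    "Aug  4 08:01:10 localhost kernel: example kernel message\n",
    "Aug  4 08:01:12 localhost logger: Running /etc/rc.local\n",
    "Aug  5 08:03:40 localhost kernel: example kernel message\n",
]
print(rc_local_entries(sample))
```

On a real system the lines would come from open("/var/log/syslog"); a boot with no matching entry is one where rc.local did not run.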

IMPORTANCE: I am not happy that the importance of this bug has been downgraded to High from Critical. This is a very serious bug for users who do not know a workaround (as given above). It is the sort of thing that gets people giving up on Linux and going back to Windows. Please put it back up to Critical.

SUSPICIONS: I have a low level of suspicion about the kernel and Plymouth. I have a high level of suspicion about Upstart. I think Scott James Remnant should take more interest in this bug, even though he is pointing the finger at the kernel in comment #136.

COMMENT FOR SCOTT: How about making all Upstart job files have the extension .upjob ?

Anthony Glenn (aglenn-pcug) wrote :

Oops, hasty proofreading. In comment #208 of mine, the sentence:

This suggests that the Upstart job /etc/init/rc.rc-sysinit.conf is not running.

should read:

This suggests that the Upstart job /etc/init/rc-sysinit.conf is not running.

Wrong filename. Sorry about that folks.

Jan Müller (info-teilnehmer) wrote :

I agree about Critical being the right category. Even people knowing the workaround will have problems when they supervise a system where not every user can sudo. Which would be EVERY system where some new user is being helped by someone else (partner, friend, co-worker). These new users will indeed switch if they can't use their computer as a typewriter.

I will third the Critical. Even though I was able to get it booting quickly again reliably with one of the workarounds (thanks Mike and others) it is disconcerting to me to know that it could just start doing this again pretty much whenever. This is a server that we shipped out to a client site and it was quite embarrassing to give them a server that wouldn't boot (it didn't exhibit this behavior until the new location).

Robbie Williamson (robbiew) wrote :

This bug is marked "High" in accordance with our guidelines: https://wiki.ubuntu.com/Bugs/Importance.

Robbie Williamson (robbiew) wrote :

I'm going to try and recreate this using a 32bit Ubuntu Server installation. Can those experiencing the problem confirm that this occurs on fresh installs, as well as upgrades? I need to determine if I need to install 9.10 and upgrade, or just 10.04 directly.

Sacha Kagan (sacha-kagan) wrote :

As a very "average" nonexpert user, I want to point out that I have been switching to Windows for 2 months now, each time I wanted to print, because I didn't have time to hunt through the forums. It is only a couple of days ago that I got so frustrated with Lucid Lynx not printing, that I finally found this "bug report" page (& learned the different "workaround" tips)... Many users probably, like I did, just reboot into Windows to print, and some, maybe many, probably abandon the Lynx altogether (this bug is really annoying). You may be losing a number of non-expert users if you continue to perceive this bug as only "high" priority, based on the hundred or so users following this bug report... It may be affecting many more people (i.e. "a severe impact on a large portion of Ubuntu users"), but not necessarily the people who populated the user forums & "launchpad" webpages... Of course, it's hard to tell... Good luck with solving this bug asap, which is a nasty thorn in your lynx's paw! (Of course, I am one of the many people incapable of applying the more cumbersome "workarounds" described by some users on this webpage... I just reinstalled CUPS & hope that will do for now...)

Fresh and upgrades of 32bit lucid confirmed
> I'm going to try and recreate this using a 32bit Ubuntu Server
> installation. Can those experiencing the problem confirm that this
> occurs on fresh installs, as well as upgrades? I need to determine if I
> need to install 9.10 and upgrade, or just 10.04 directly.

  I've had the issue on both fresh installs and upgrades from 9.10.

On 08/04/2010 04:38 PM, Phil Went wrote:
> Fresh and upgrades of 32bit lucid confirmed
>> I'm going to try and recreate this using a 32bit Ubuntu Server
>> installation. Can those experiencing the problem confirm that this
>> occurs on fresh installs, as well as upgrades? I need to determine if I
>> need to install 9.10 and upgrade, or just 10.04 directly.

Critical vs High

"For example, if the system fails to boot, or X fails to start, on a certain make and model of computer"

This is not a certain make and model. It appears to be any make and model. It also appears that it can occur almost spontaneously as a response to even normally very benign environmental conditions (did your DHCP server take longer to respond than normal?). Also, given how many issues are considered duplicates of this one, and that it appears it could unpredictably strike anyone without notice, it is pretty serious indeed. That's my case and I'll leave it at that.

For me it appeared on a fresh 10.04 Server install, after upgrading on 7/28, and only after a change of physical environment (disconnected monitor, different network). Yesterday, I attempted to duplicate it on similar hardware and could not.

Akaname (akaname) wrote :

* Confirmed for fresh Lucid 64bit with amd dual and quad core machines.
* Not confirmed for fresh Lucid 32bit on old intel single core.

As suggested in https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/83 the problem was solved by removing "console output" in all /etc/init/*.conf files and modifying /etc/init/rc-sysinit.conf .
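
As a rough illustration of the "console output" part of that workaround, the sketch below comments out the stanza in the text of an Upstart job file. The function name is hypothetical, the sample job is invented, and the code only transforms a string; it does not touch /etc/init, and it does not cover the rc-sysinit.conf change, whose details are in the linked comment. Back up any real file before editing it.

```python
import re

# Matches a line that holds only the "console output" stanza.
_CONSOLE_OUTPUT = re.compile(r"^(\s*)(console\s+output\s*)$")


def comment_out_console_output(conf_text: str) -> str:
    """Return conf_text with every "console output" line prefixed by '#'."""
    out = []
    for line in conf_text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]
        match = _CONSOLE_OUTPUT.match(body)
        if match:
            out.append(f"{match.group(1)}#{match.group(2)}{ending}")
        else:
            out.append(line)
    return "".join(out)


# Invented job text, for illustration only.
sample_job = "start on filesystem\nconsole output\nexec /bin/true\n"
print(comment_out_console_output(sample_job))
```

Lines that merely mention the words inside a longer statement are left alone, because the pattern requires the stanza to be the whole line.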

Robbie Williamson (robbiew) wrote :

Thanks for the confirmation. Since it will make people happy, I'll change to "Critical", but please understand that doing this doesn't trigger some sort of all points bulletin or ninja bug fixer squad to jump on it....as acting release manager, and lead of the Canonical team responsible for the OS plumbing layer...you have the right person's attention on it ;).

Robbie Williamson (robbiew) wrote :

One more question, does anyone affected by this bug NOT have an nvidia graphics card, i.e. ATI/Radeon or Intel?

The Dual CPU Pentium 3 server (see comment 151, 162 and 195) that has this problem has a very old ATI Rage chipset on the motherboard:
--------
$ runlevel
unknown
$ lspci | grep -i vga
01:00.0 VGA compatible controller: ATI Technologies Inc Rage XL AGP 2X (rev 27)
--------

It does not enter runlevel 2 about 25% to 50% of the time. It does not run X in any form, and would normally run postfix, apache, and slapd. SSH is started by upstart but console logins are not.

Luckily it is semi-retired and currently only used as a test server.

What? No ninjas?

The Dell R200 dual core server I was using has an ATI ES1000 ("Rage" not "Radeon" I guess). However, with the Dell Remote Administration Card, output on the ATI is disabled and is instead sent through the DRAC. Not sure what that means for you. Further, I did not install X at all. 64bit Lucid, fresh install. Everyone I have seen who specified their hardware had a machine with at least 2 cores.

The machine I had with the problem had a Radeon 4xxx card.

Robbie Williamson (robbiew) wrote :

I'll try to recreate on my laptop this week, which has an Intel card. I'm currently away from my machine with nvidia, but will be home this weekend...where I can do a recreate there as well. If possible, does anyone have the time/resources to try and recreate with 10.10 Alpha 3, which is scheduled to release tomorrow? The data would be useful for debugging purposes...and don't worry, if the bug isn't present in 10.10, we won't say "then just upgrade to 10.10" as the solution. ;)

On Wed, Aug 04, 2010 at 11:15:32PM -0000, Robbie Williamson wrote:
> I'll try to recreate on my laptop this week, which has an Intel card.

I don't understand all this concern about which video card is in the
machine.

The evidence is that if /sbin/init tries to write to /dev/console too early, it fails.

I did a step-by-step analysis in Bug 543506. Start at comment #54.
https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/54

--
 Mike Bianchi
 Foveal Systems

 973 822-2085

 <email address hidden>
 http://www.AutoAuditorium.com
 http://www.FovealMounts.com

Hi Robbie, yes, I have an ATI/Radeon RV350 AS [Radeon 9550] graphics
card.

Possibly due to this, I can't boot normally; I lose the use of the
keyboard & mouse at the sign-on stage & have to use the restart button &
do the following. (I submitted bug report #580755 on this, but I am the
only subscriber, so it is still in the new/undecided/unassigned states).

I have to use a very strange boot process.
(select the failsafe kernel, select boot in failsafe mode, start X, then
basic sign-on with only my user (acceptable)). For the first failsafe
boot after "shut-down, shut-down", I then need to start CUPS; I also
start hplip. When I usually shut-down, I use "hibernate"; then the PC
starts with all of what I need working.

My PC is 5 years old, with a single core 2.8 Ghz Intel processor & 2 GB
of RAM. I have been using Ubuntu for 4 of those years (upgrading
regularly to the latest version), previously with far fewer problems
than with 10.04. I am not much more than a basic end-user, who has had
to learn much more than I did with MS Windows.

I submitted my CUPS problem on #590551, but this was recognised as a
duplicate of #554172.

I have not tried the 3-step workaround, as I am not confident of
recovering if I make an error (this happened recently on my Myth PC,
when I tried to implement a suggested fix/workaround on 9.10. It failed
miserably, & just left me in some strange state, with lots of lines of
output. I couldn't fix things because the HDD had reached its limit for
checking (fsck), & the system wouldn't mount (?) it, so I couldn't
restore the back-up of the file I changed).

Garry Leach.

On Wed, 2010-08-04 at 22:01 +0000, Robbie Williamson wrote:
> One more question, does anyone affected by this bug NOT have an nvidia
> graphics card, i.e. ATI/Radeon or Intel?
>

Intel IGP here, Q33/Q45/Q43.

As I've stated in #543506, this affected 20 machines on the day that we were due to send them to site; only the /dev/console work-around allowed these machines to boot reliably, and start our application.

In those circumstances, try explaining to your boss why you recommended moving from Red Hat EL 5 to Ubuntu 10.04 LTS...

It occurs to me that if you want to force the /dev/console bug, it might be
as simple as adding "console output" to all the /etc/init/*.conf files.
If "successful" it could leave the installation unbootable, so I would
advise doing this in a disposable installation on a separate partition.

Brian Burch's circumvention in post 190 just failed me after about 20 good boots.

Robbie Williamson (robbiew) wrote :

Still trying to recreate...and failing. Is anyone seeing this with the LATEST kernel, which is 2.6.32-24-generic (-pae if needed) #39...not #38. Running a 'uname -a' will tell you if you have #39.

Robbie Williamson, I can confirm that this affects 2.6.32-24-generic #39.

I upgraded the Dual CPU Pentium 3 server (see comment 151, 162 and 195) yesterday to test this, and it failed to enter runlevel 2:
--------
$ runlevel
unknown
$ uname -a
Linux aa000039.uk.fabit.net 2.6.32-24-generic-pae #39-Ubuntu SMP Wed Jul 28 07:39:26 UTC 2010 i686 GNU/Linux
--------

About 20 other machines (virtual and real hardware) have been upgraded to 10.04 without meeting this problem, so it may take some time and different hardware combinations to recreate. If you need any more details on this server then let me know or look at the attachment to comment #195.

Robbie Williamson (robbiew) wrote :

John Edwards, thanks!

summary: - CUPS and other system services not starting at boot
+ system services not starting at boot

@John Edwards
A question about your machines (real and virtual) that are not affected by this bug: how many processor cores or virtual processor cores do they have?

@Robbie Williamson:

Just upgraded to #39, and I was lucky this time:
mart@Bluebird:~$ runlevel
N 2
mart@Bluebird:~$ uname -a
Linux Bluebird 2.6.32-24-generic #39-Ubuntu SMP Wed Jul 28 05:14:15 UTC 2010 x86_64 GNU/Linux

First boot was ok, while #38 always failed.

Michael Doube (michael-doube) wrote :

#39 broken here after 3 boots:
mdoube@doris:~$ runlevel
unknown
mdoube@doris:~$ uname -a
Linux doris 2.6.32-24-generic #39-Ubuntu SMP Wed Jul 28 05:14:15 UTC 2010 x86_64 GNU/Linux

Also have a suspend-to-ram failure since 2.6.32-23, bug #602049
Vaio SZ6 notebook, Core 2 Duo + Intel graphics + Intel X25-M SSD

Hey folks,

I've done some investigation based on the hypothesis that this bug is caused by open("/dev/console") returning the EIO error. This is supported by comments including this log message from Upstart and by evidence that removing "console output" from Upstart jobs appears to correct the problem.

First an important clarification. There is some suggestion that Upstart should "wait" for this device to be ready, that is nonsense. /dev/console is a virtual device supplied by the kernel that represents an active system console, whatever that may be - or the bit bucket if there isn't one. It's always available, and trivial operations on it such as open() and close() should always succeed. Waiting for /dev/console makes as little sense as waiting for /dev/null. open("/dev/console") is _not_supposed_to_fail_.

So I've read through the kernel source code and I have found a pattern which *would* cause opening /dev/console to fail with EIO, and there is also a good explanation of why this only started appearing in Ubuntu 10.04.

Opening /dev/console for the first time allocates memory within the kernel, and future opens take a reference count to this allocated memory. Closing /dev/console reduces this reference count, and should it hit zero, it frees the memory.

The trouble seems to be that the kernel doesn't free the memory within the tty mutex; instead it marks the allocated tty information as TTY_CLOSING, releases the mutex, then frees the memory later. The open() code checks for this flag, and bails out with EIO when present.

This is a clear SMP bug within the kernel, a race condition exists where if you open /dev/console "just after" the last file descriptor is closed from a process running on another core, that process gets a reference to the *being freed* console information rather than referencing it and re-using it.
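
The sequence described above can be pictured with a small toy model. This is not kernel code: the class and method names are invented, and the two-step close simply stands in for "mark TTY_CLOSING under the mutex, free the memory later". It only shows why an open that lands between those two steps gets EIO.

```python
import errno


class ToyTty:
    """Toy model of one tty's open count and closing flag (not kernel code)."""

    def __init__(self):
        self.count = 0
        self.closing = False

    def open(self):
        # The open path checks the flag and bails out with EIO when it is set.
        if self.closing:
            return -errno.EIO
        self.count += 1
        return 0

    def close_begin(self):
        # The last closer marks the tty as closing while holding the mutex ...
        self.count -= 1
        if self.count == 0:
            self.closing = True

    def close_finish(self):
        # ... and the real teardown completes later, outside the mutex.
        self.closing = False


tty = ToyTty()
first = tty.open()            # normal open
tty.close_begin()             # last close has started on one core
racing = tty.open()           # another core opens during the window
tty.close_finish()            # teardown completes
later = tty.open()            # opens succeed again
print(first, racing, later)
```

In the model, the failing open is purely a matter of timing, which fits the random good and bad boots reported throughout this bug.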

As to why this has only relatively recently appeared - previously the kernel seems to have done all of this under the BKL ("Big Kernel Lock"), the last commits to this code were attempts to remove the BKL. This may be a resulting bug of reducing locking and increasing pre-emptiveness.

Also an Upstart bug may have been hiding the problem; Upstart gets passed /dev/console or opens it on initialisation so it can set up the console with sane parameters, however it failed to close it again and kept the console device open at all times. This was a bug, and meant that the SysRq-K SAK key killed init and caused a kernel panic. With this bug fix though, it became once again possible for the console device to be released from memory (init always had a reference before) so exposed this bug underneath.

Reassigning to the kernel team to make the tty code SMP safe.

affects: Ubuntu Lucid → linux (Ubuntu Lucid)
tags: added: kernel-core kernel-needs-review

The kernel patch that introduces this regression is f278a2f7bbc2239f479eaf63d0b3ae573b1d746c, which even notes in the commit log:

    Due to tty release routines run in a workqueue now, error like the
    following will be reported while booting:

    INIT open /dev/console Input/output error

Andy Whitcroft (apw) wrote :

It should be noted that the commentary in #238 is from the fix which eliminated that error, not from the commit which introduced it. This fix is also already applied to both Lucid and Maverick kernels.

commit f278a2f7bbc2239f479eaf63d0b3ae573b1d746c
Author: Dave Young <email address hidden>
Date: Sun Sep 27 16:00:42 2009 +0000

    tty: Fix regressions caused by commit b50989dc

    The following commit made console open fails while booting:

        commit b50989dc444599c8b21edc23536fc305f4e9b7d5
        Author: Alan Cox <email address hidden>
        Date: Sat Sep 19 13:13:22 2009 -0700

        tty: make the kref destructor occur asynchronously

    Due to tty release routines run in a workqueue now, error like the
    following will be reported while booting:
[...]
    Fix it as per the following Alan's suggestion:
[...]

akaname, I've only seen it on one machine and that had two Pentium 3 CPUs (separate chips).

Of the machines which have not seen this problem, the virtual machines are all single processor and the servers all had multiple core CPUs (Pentium D, AMD X2, Core2 Duo and Quad).

There were a few desktop machines upgraded to Ubuntu 10.04 that had single core CPUs - Pentium 4, AMD Athlon, and Intel Atom. The desktops were only upgraded a few weeks ago and are not used by me day-to-day, but have not reported problems so far. I'll try to grab one and power cycle it a bit to recreate the problem.

I suspect Scott James Remnant is correct and this is an SMP problem. Most machines built over the past few years have had multi-core CPUs.

  I've installed Ubuntu on 20 machines for individual desktop use. Most
have been on AMD processors, and most of those have been dual core,
plus 1 quad core. The single core machines have rarely been used to print
from. I've seen the issue on both dual core and quad core.

I hope this helps.

On 08/05/2010 11:53 AM, akaname wrote:
> @John Edwards
> A Question to your machines (real and virtual) that are not affected by this bug: How many processor cores or virtual processors cores do they have?
>


Phil Went (phl-wnt) wrote :

Scott,

> I've done some investigation based on the hypothesis that this bug is
> caused by open("/dev/console") returning the EIO error, this is
> supported by comments including this log message from Upstart and by
> evidence that removing "console output" from Upstart jobs appears to
> correct the problem.
>

Not sure if this is relevant, but the workaround of commenting out the
/dev/console statements may _seem_ to be a fix, but it is not as I have seen
the bug reproduced with this workaround applied, albeit less frequently.

It does remove the console write error from the log, but some services may
still not run. No obvious signs of failure does not mean there has been no
failure, as I have found machines that seem OK but it is not obvious as they
have a small number of services. From memory, there is also an option to
turn off the console.

I tried each workaround individually with multiple reboots until failure,
including the disabling of ureadahead. The only workaround with which I have
not seen the bug is disabling ureadahead, but this may mean nothing given
the intermittent nature of the bug.

Thanks Robbie.
This problem shows up more often on faster systems.

every quad phenom/nvidia box I have put together does it.
all quad opteron servers (NO GUI) does it regardless of video card.
Sometimes on the servers bind9 will fail. ddclient also fails
vboxdrv also fails to start on boot and sometimes the login
screen fails to show up

Some slower intel boxes with intel video cards do it.

on some systems with GeForce fx5200 you get a black screen and login fails to load sometimes while other times
you get a console login but X fails to load.

on some systems with integrated intel 845G same problem with X not starting and login on console not starting sometimes
while other times login starts at the console.

list of services known not to start because of this problem on various machines:
cups
X
bind9
vboxdrv
console login
ddclient.
nmbd
just about any other 3rd party service.

This problem did NOT show up on any boxes I setup during the Karmic release.

It does not matter whether the runlevel command shows N 2 or unknown: I have had services fail to start
regardless of the output of the runlevel command. It usually shows N 2.

Ola (erik-o-engstrom) wrote :

My twopennies.
I'm running 10.04 server w. the following kernel:
2.6.32-21-generic-pae #32-Ubuntu SMP
The machine is an old Toshiba laptop w: 250 MB and a Pentium III processor (~600 MHz).
It has worked flawlessly w-out the recommended fixes as a printerserver for some months.

Andy Whitcroft (apw) wrote :

First some background. Analysis above indicates that upstart jobs are
failing because opens of /dev/console are failing, and that this should
not be possible:

 "... open("/dev/console") is _not_supposed_to_fail_."

As previously indicated /dev/console is a virtual device which represents
the currently active console. However, there is no guarantee that this
device will successfully open, and no guarantee that there is a system
console device. This device will only open successfully if there is
a real active system console defined; if there is no system console at
open time then the open will fail with errno set to ENODEV. There is no
guarantee that it will ever become openable.

Also while /dev/console is a virtual device representing the current active
console, it is implemented as a direct open of the real console device.
Opens of /dev/console are redirected to opens of the actual active console
device at the time the open occurs. The open is therefore to the real
underlying device, the returned file descriptor has all the semantics of
the real tty device.

I have managed to trivially reproduce open failures to /dev/console,
returning errno of EIO, by running two open/close loops on a tty device
in parallel, in my case /dev/tty10:

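    #include <stdio.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <unistd.h>	/* close() */

    /* Open and close the tty named in argv[1] forever, reporting failed opens. */
    int main(int argc, char *argv[])
    {
     int fd;

     if (argc < 2) {
      fprintf(stderr, "usage: %s <tty-device>\n", argv[0]);
      return 1;
     }

     while (1) {
      fd = open(argv[1], O_RDWR);
      if (fd < 0) {
       printf("fd<%d> errno<%d>\n", fd, errno);
       fflush(stdout);
       continue;	/* nothing to close */
      }
      close(fd);
     }
    }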

Looking at the open code, the suspected source of the error (EIO during open)
is the code fragment below, which is reached when the console is in the process
of being closed while the open is occurring:

    static int tty_reopen(struct tty_struct *tty)
    {
     struct tty_driver *driver = tty->driver;

     if (test_bit(TTY_CLOSING, &tty->flags))
      return -EIO;
    [...]

Nominally open/close handling is protected by tty_mutex; this prevents
parallel opens and closes from racing with each other. However once we
close a device for the last time (ie all sharers have closed the device)
a real shutdown of the device occurs. For tty devices this may involve
an extended handshake at the hardware level. During this close process
it is not safe to initiate a reopen, but we also do not wish to block all
tty opens. Thus the kernel only holds the tty_mutex long enough to mark
the device as in the process of closing (sets TTY_CLOSING in the device
flags) and releases tty_mutex before executing the potentially extended
close handling. In the single-threaded case the close/reopen race is avoided
as tty shutdown processing is executed in the context of the closer, thus the
device close has progressed sufficiently far to prevent a subsequent open
from seeing this partially closed state, and triggering the EIO return.
It should be noted that open/close processing is also covered by the BKL,
however this is not proof against parallel execution: should the close
handling sleep (which can occur should there be any mutexes in the path)
the BKL is dropped and reacquired (as it is a preemptable lock). For the
common case the console is a VT device, which...


Andy Whitcroft (apw) on 2010-08-09
Changed in linux (Ubuntu Lucid):
assignee: Canonical Kernel Team (canonical-kernel-team) → Andy Whitcroft (apw)
Changed in linux (Ubuntu Maverick):
assignee: Canonical Kernel Team (canonical-kernel-team) → Andy Whitcroft (apw)
Anthony Glenn (aglenn-pcug) wrote :

Many thanks to Andy Whitcroft for his truly excellent analysis above in comment #245.

REPRODUCING THE BUG: It looks to me as though what is required to reproduce this bug is a fast multi-core machine with a slow video card. It just so happens that is exactly what my machine is like. When I build a machine, I always buy a high-end gamer-style motherboard and put a close-to-top-of-the-line CPU in it. My experience has been that motherboards are a reliability problem area, so I spend up on the motherboard. I tend to prefer motherboards and CPUs that have been out for a little while, so the bugs have been worked out of them. However, I do not like the noise and power consumption of high-end video cards, so I buy quiet (preferably silent) video cards. Such cards are inevitably slow.

Many server machines also have fast CPUs and slow video. Most servers spend almost all of their lives with their video never being looked at by a human being, so fast video performance is unimportant in that market. So server computer manufacturers put in cheap slow video systems. Meanwhile, the CPU performance is critical, so server manufacturers put in big fast multi-core CPUs.

This explains the prominent presence of the server guys in the comments on this bug. They have the kind of machines that show the bug.

FLOW CONTROL PROBLEM: One problem not mentioned by Andy is the possibility of flow control happening. For example, suppose the console was something really slow such as an actual teletype, joined to the computer via modems. The word "teletype" is where "tty" came from. Of course, few of you young people know what a real teletype looks, sounds or smells like, but there are some of us who remember that they only went at 10 characters per second. Now suppose that the telephone line connecting the modems has got signal quality problems, maybe it has noise or crosstalk. So there are hangups and redialling going on as the modems struggle with the telephone line. Then do a few restarts and generate a lot of console traffic. The poor old console could end up hours behind. Linux should cope with all that and keep right on working properly.

If there is spooling for the console, there will be short hesitations in the flow of data as the data is written to disk. Then there has to be a flow-control-asserted signal back to whatever is writing the data, to say, "Hey, wait up, my buffer is nearly full." Then the writer has to wait until flow control is deasserted and writing may resume. Flow control should always be present between any two asynchronous processes transferring data.

If there is no spooling for the console, the short hesitations can become quite long waits, as the console labours to catch up.

I do not know whether there is a spooling option for console traffic. Perhaps someone more knowledgeable might comment.

However, the problem will always be there, of the writer to the console possibly getting to be faster than the console device can take the data. CPUs keep on getting faster. Spooling disks or video cards cannot necessarily keep up. So there will always be the necessity for flow control. Linux in general, and upstart in particular, must cope satisfa...


If I can reproduce the bug when the quiet kernel option is set (this
disables console output to speed up boot), does this analysis still hold
water? Same for commenting out output to /dev/console in init files. There
may be an issue according to the analysis, but am not sure whether it is the
root cause of the symptoms in this bug.

eg, in all combinations, quiet/not quiet commented/not commented the bug
occurs(some services fail to start sometimes) on two giada n10U devices
(dual atom 330 nvidia ion), as well as asus eee 1201n (also dual atom nvidia
ion).

@Phil -- setting quiet would likely make the failure more likely, as it simply prevents visible printing of the output; it does not reduce the contention on the console_sem, and it's likely to increase it. Commenting out the open of /dev/console from upstart's configuration files, however, should have an effect, and indeed is reported to in this bug by a number of people. It is quite possible there are two issues here with similar symptoms.

Hobson Lane (hobs) wrote :

Similar symptoms for me on fully upgraded Ubuntu 10.04 on Dell Inspiron laptop dual-booting with Win7 using grub2. A triple-workaround that worked for others here (but not me) is listed below.

sudo -s
for file in /etc/init/*.conf; do sed -i 's/^console output/\#console output/' $file; done
sed -i 's/start on filesystem and net-device-up IFACE=lo/start on filesystem and started rsyslog and net-device-up IFACE=lo/' /etc/init/rc-sysinit.conf
sed -i 's/GRUB_CMDLINE_LINUX=""/GRUB_CMDLINE_LINUX="init='\''\/sbin\/init --verbose'\''"/' /etc/default/grub
update-grub

My problem may be related to a failed hibernate resume due to laptop battery draining below hibernate threshold just as I was correcting a permissions misconfiguration on my /var or /usr directory that was preventing proper OS operation. I think others have mentioned the possibility of a stale file lock problem, but can't remember where or how to correct it.

Andy Whitcroft (apw) wrote :

Having talked to upstream and clarified the plans with the BTM, it
seems that they are intending on closing some of the races as they are
deemed unhelpful. Overall however some slow TTY devices will indeed
still legitimately return EIO when a slow close is in operation. That is
expected behaviour.

Looking at upstart it does appear that a change there has exposed us to
this issue. Until recently upstart used to hold /dev/console open in its
own name. That was stopped (quite reasonably) to avoid another issue,
from the changelog:

    0.6.5 2010-02-04 "Our last, best hope for victory"
    [...]
     * No longer holds /dev/console open, so the SAK SysRq key will not
       kill Upstart. (Bug: #486005)
    [...]

If we look at upstart itself, it seems to be using a plain open which
likely should be more robust in any case:

    system_setup_console (ConsoleType type,
    [...]
     switch (type) {
     case CONSOLE_OUTPUT:
     case CONSOLE_OWNER:
      /* Ordinary console input and output */
      fd = open (CONSOLE, O_RDWR | O_NOCTTY);
      if (fd < 0)
       nih_return_system_error (-1);
    [...]

More robust, how?

If the job is configured to require its standard three file descriptors to the system console, and Upstart is unable to open the system console, it is unable to satisfy the requirements of the job configuration so will terminate the attempt to start the job.

Read through the entire job_process_spawn() function and you'll see that the code is already safe from EINTR due to the signal disposition, and all other permissible error returns from open() are non-transient.

open(2) does not document EIO as a valid return from this function, and I'm not even sure this error is appropriate - where it's used elsewhere it nearly always refers to a filesystem error - there are few exceptions. If the intent is that the calling process should just try again, shouldn't it instead return EAGAIN?

Also please bear in mind that "should" implies that it should somehow have been anticipated that the kernel was going to change an interface and introduce an undocumented non-transient error code where none existed before? :-)

Also, let's consider the other effects of this kernel change. For example, the following code from the initramfs that actually exec's init in the first place:

  exec run-init ${rootmnt} ${init} "$@" <${rootmnt}/dev/console >${rootmnt}/dev/console 2>&1

This opens /dev/console to be bound to init's file descriptors, if the console has recently been closed, these shell redirects can now fail with EIO. That means it's not just init that has to be fixed, it's every single possible shell out there, including the shells inside things like busybox?

This is why the kernel can't just push its own laziness down to userspace like this.

Another point to consider (I discussed this with a few people here at LinuxCon):

open() is supposed to be an inherently blocking system call, just like connect(), creat(), etc. If the kernel hasn't finished hanging up the tty from last time, it's *okay* for the subsequent open() to block for a while while it hangs up the tty and reinitializes it. The app will be expecting that.

If the app calls open() with the O_NONBLOCK flag, which it accepts today already, then it's a non-blocking open - and in that case it would be acceptable for the kernel to fail the open with the EAGAIN or EWOULDBLOCK error - *NOT* EIO.

(not EIO because it turns out that that error code is already returned in some cases to indicate filesystem corruption or disk error, neither of which are transient and acceptable to loop on)
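The distinction being argued for here can be sketched as a small wrapper around open(). This is an illustration of the proposed semantics, not code from Upstart or libc: `open_retrying` and `open_error_is_transient` are hypothetical names, and the attempt limit and 10 ms pause are arbitrary choices. It loops only on errors that unambiguously mean "try again" and gives up at once on EIO, since EIO may also signal a real I/O failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Nonzero if `err` is an error userspace may safely loop on. EIO is
 * deliberately excluded. Hypothetical helper. */
static int open_error_is_transient(int err)
{
    if (err == EINTR || err == EAGAIN)
        return 1;
#if EWOULDBLOCK != EAGAIN
    if (err == EWOULDBLOCK)
        return 1;
#endif
    return 0;
}

/* Try to open `path` up to `max_attempts` times, pausing briefly after
 * each transient failure. Returns a file descriptor, or -1 with errno
 * set (EAGAIN if every attempt failed transiently). */
static int open_retrying(const char *path, int flags, int max_attempts)
{
    const struct timespec pause = { 0, 10 * 1000 * 1000 };  /* 10 ms */

    assert(path != NULL && max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        int fd = open(path, flags);

        if (fd >= 0)
            return fd;
        if (!open_error_is_transient(errno))
            return -1;      /* EIO, ENOENT, ...: do not loop */
        nanosleep(&pause, NULL);
    }
    errno = EAGAIN;
    return -1;
}
```

With the kernel behaviour discussed in this bug, such a wrapper would not help, because the failing open reports EIO; it only becomes useful if the kernel returns EAGAIN (or blocks) for the "still closing" case.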

Robbie Williamson (robbiew) wrote :

Scott,

Could we consider carrying a short-term hack in Lucid/Maverick for upstart (but not put in the main upstart tree), and then remove it for 11.04...when the issue is supposedly fixed in the kernel?

Andy Whitcroft (apw) wrote :

> open(2) does not document EIO as a valid return from this function, and
> I'm not even sure this error is appropriate - where it's used elsewhere it
> nearly always refers to a filesystem error - there are few exceptions. If
> the intent is that the calling process should just try again, shouldn't
> it instead return EAGAIN?

Though open(2) does indeed not document this error, it is a documented
POSIX return and it has been possible that this could get returned on
open for a TTY for a very long time. Yes EIO is not a very intuitive
return but actually they chose a different return code as it does indeed
indicate something different than an EAGAIN might. EAGAIN generally meaning
"just do it again" and EIO meaning "this is stuck closing at the moment".

> Also please bear in mind that "should" implies that it should somehow
> have been anticipated that the kernel was going to change an interface
> and introduce an undocumented non-transient error code where none existed
> before? :-)

This interface has _not_ changed, an open on a TTY which has recently been
closed has always had the possibility of returning EIO, the /dev/console
device is a TTY and therefore could trigger this behaviour; you have
been lucky up to now. Two things have changed. Firstly, the window in
which it can be triggered has widened slightly in the kernel. Secondly,
upstart recently stopped holding /dev/console open in the main thread (to
avoid the REISUB death), holding it open mitigates this issue completely.
(And we might consider this as a mitigation option.)

> Also, let's consider the other effects of this kernel change. For example,
> the following code from the initramfs that actually exec's init in the
> first place:
>
> exec run-init ${rootmnt} ${init} "$@" <${rootmnt}/dev/console >${rootmnt}/dev/console 2>&1
>
> This opens /dev/console to be bound to init's file descriptors, if the
> console has recently been closed, these shell redirects can now fail with
> EIO. That means it's not just init that has to be fixed, it's every single
> possible shell out there, including the shells inside things like busybox?

Actually the race can only be triggered by parallel execution, so for the
init process up to this point we are likely protected by being singly threaded.
If the thread has recently closed the console it will have paid the cost of
closing it before continuing and we are not affected.

> This is why the kernel can't just push its own laziness down to
> userspace like this.

We commonly hand off unfortunate semantics to userspace and let that
handle things. EINTR is a classic example.

> Another point to consider (I discussed this with a few people here
> at LinuxCon):
>
> open() is supposed to be an inherently blocking system call, just like
> connect(), creat(), etc. If the kernel hasn't finished hanging up the
> tty from last time, it's *okay* for the subsequent open() to block for
> a while while it hangs up the tty and reinitializes it. The app will be
> expecting that.
>
> If the app calls open() with the O_NONBLOCK flag, which it accepts today
> already, then it's a non-blocking open - and in that case it would be
> acceptable for the kernel to fa...


David Haskins (surfari) wrote :

I am running a rack-mounted remote server that I was going to upgrade while the students are away. I cannot rely on ssh to start, or apache2 or mysql so I am somewhat stuffed chaps. Any news on this getting sorted or should I leave it with 7.04 server?

Robbie: Way ahead of you on that one, I put a workaround in Upstart together yesterday, and just uploaded it - now we understand the bug, and the implications, it's safe to do so.

Upstart will now fall back to using /dev/null for jobs with "console output". Jobs with "console owner" will still fail, because those are expressing a stronger desire for the console.
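The behaviour described for the workaround can be sketched roughly as follows. This is not the actual Upstart patch: the enum `job_console`, the function `open_job_console` and the explicit `console_path` parameter are assumptions made so the example is self-contained and can be tried against any path.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-ins for the two job console settings discussed. */
enum job_console {
    JOB_CONSOLE_OUTPUT,   /* job merely wants its output on the console */
    JOB_CONSOLE_OWNER     /* job insists on owning the console */
};

/* Open the console for a job. If the console cannot be opened, an
 * "output" job falls back to /dev/null so it can still start, while an
 * "owner" job fails. Returns a file descriptor or -1. */
static int open_job_console(const char *console_path, enum job_console type)
{
    int fd;

    assert(console_path != NULL);
    fd = open(console_path, O_RDWR | O_NOCTTY);
    if (fd >= 0)
        return fd;

    if (type == JOB_CONSOLE_OUTPUT)
        return open("/dev/null", O_RDWR);

    return -1;   /* JOB_CONSOLE_OWNER: no substitute is acceptable */
}
```

The trade-off is that a job started this way loses whatever it would have printed to the console, which is why this is a workaround rather than a fix for the underlying open failure.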

Changed in upstart (Ubuntu Lucid):
assignee: nobody → Scott James Remnant (scott)
Changed in upstart (Ubuntu Maverick):
importance: Undecided → Medium
Changed in upstart (Ubuntu Lucid):
importance: Undecided → Medium
Changed in upstart (Ubuntu Maverick):
assignee: nobody → Scott James Remnant (scott)
Changed in upstart (Ubuntu Lucid):
status: New → Triaged
Changed in upstart (Ubuntu Maverick):
status: New → Triaged
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package upstart - 0.6.6-2

---------------
upstart (0.6.6-2) maverick; urgency=low

  * Apply patch from trunk to use /dev/null when /dev/console is unavailable
    due to kernel bugs. This isn't a fix for those bugs, but it does work
    around it for now. LP: #554172.
 -- Scott James Remnant <email address hidden> Thu, 12 Aug 2010 09:52:07 -0400

Changed in upstart (Ubuntu Maverick):
status: Triaged → Fix Released

> Though open(2) does indeed not document this error, it is a documented
> POSIX return and it has been possible that this could get returned on
> open for a TTY for a very long time.
>
No it isn't, the current and previous editions of POSIX don't document EIO as a return for open() - are you sure you're not reading the XSI STREAMS specification? :-) That being said, as I discussed above, open() has always apparently returned EIO in Linux for other reasons. We should probably make sure this is documented.

> Yes EIO is not a very intuitive
> return but actually they chose a different return code as it does indeed
> indicate something different than an EAGAIN might. EAGAIN generally meaning
> "just do it again" and EIO meaning "this is stuck closing at the moment".
>
The problem is that EIO is *already* returned from open() to mean "omg, the filesystem/block device is on fire! stop! stop! stop!"

> It seems that this is predicated only on your dislike of EIO as a return.
>
No! You dangerously misunderstand.

This is predicated on my discovery that open() can already return EIO for different errors that are catastrophic, and thus code that loops on EIO isn't possible. A different error would mean we could retry in userspace - right now the only option is to fail.

Andy: if we can change the returned error code so it's not EIO, and something more like EAGAIN that unequivocally tells userspace that it can loop, we can deal with that in userspace. That's just a one-line change.

For example we could patch libc to always loop on those open()s, which would fix everything at once

Robbie Williamson (robbiew) wrote :

FYI, the fix for Lucid has been uploaded into -proposed:
  http://launchpadlibrarian.net/53562040/upstart_0.6.5-7_source.changes
and is waiting for approval.

Accepted into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in linux (Ubuntu Lucid):
status: Confirmed → Fix Committed
Changed in upstart (Ubuntu Lucid):
status: Triaged → Fix Committed
tags: added: verification-needed
Changed in linux (Ubuntu Lucid):
status: Fix Committed → Confirmed
Martin Pitt (pitti) on 2010-08-13
Changed in upstart (Ubuntu Lucid):
milestone: none → ubuntu-10.04.1

Seems to work fine on my PC, all services start now.

Ditto
> Seems to work fine on my PC, all services start now.

I installed the proposed upstart binary, and since then booted my system several times: the bug didn't reappear. All services started well.

Serge Rivoallan (srivoallan) wrote :

CUPS still not starting after update of the proposed upstart package

Serge Rivoallan (srivoallan) wrote :

Oops! My mistake! All services start but since smb start before cups, printers are still not available in samba! I thought at first that the new upstart package would solve that!

Steve Langasek (vorlon) wrote :

Serge, please file a separate bug against the samba package for your issue. This is not related to this bug preventing the startup of cups.

The patched version of upstart looks good so far. 25 reboots on the dual CPU Pentium 3 machine and it has entered runlevel 2 each time.

Kai Krakow (hurikhan77) wrote :

Can also confirm this working. The system is taking longer to boot now - who wonders: all services start now. However, I had no chance to do excessive reboots since the machine is in production.

Steve Langasek (vorlon) on 2010-08-15
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package upstart - 0.6.5-7

---------------
upstart (0.6.5-7) lucid-proposed; urgency=low

  * Apply patch from trunk to use /dev/null when /dev/console is unavailable
    due to kernel bugs. This isn't a fix for those bugs, but it does work
    around it for now. LP: #554172.
 -- Scott James Remnant <email address hidden> Thu, 12 Aug 2010 10:45:46 -0400

Changed in upstart (Ubuntu Lucid):
status: Fix Committed → Fix Released
Phil Went (phl-wnt) wrote :

On one machine I have applied the later upstart from lucid-proposed (0.6.5-7) and still get runlevel unknown on reboot.

Steve Langasek (vorlon) wrote :

"runlevel unknown" is a separate error. Please check that your /etc/network/interfaces is correct and has successfully initialized your loopback interface (as shown by 'status network-interface INTERFACE=lo').

Status returns start/running , services start which is nice tho not
conclusive, just the runlevel is unknown.
On 17 Aug 2010 03:12, "Steve Langasek" <email address hidden> wrote:

> "runlevel unknown" is a separate error. Please check that your
> /etc/network/interfaces is correct and has successfully initialized your
> loopback interface (as shown by 'status network-interface
> INTERFACE=lo').

I updated to upstart 0.6.5-7, rebooted, and have the same issues with services not starting. Specifically, nginx doesn't start, and I have two CGI services that interact with nginx, that don't start on boot. I've checked with rcconf, and all 3 are labelled to start on boot. Checking some of the logs (/var/log/messages, /var/log/syslog, and /var/log/boot{.log}) doesn't reveal anything that would indicate a problem with these services, as far as I can tell.

A related issue, which I mentioned earlier, is that 'shutdown -r now' doesn't work. I get the "System is shutting down" message on my console, but nothing happens. I have to use 'reboot', and this seems to work consistently. From some of the Ubuntu forum comments, it seems like this is possibly related to the upstart problems.

Robbie Williamson (robbiew) wrote :

For anyone who is still experiencing problems with services not starting at boot, I suggest you open a new bug (feel free to subscribe me - robbie.w is my ID). There could be any number of reasons why a particular service won't start, and we'd rather you open a new bug than add to this one. If we determine the issue is the same, we will just dup your bug to this one.

Thanks,
Robbie

summary: - system services not starting at boot
+ system services using "console output" not starting at boot
Changed in upstart (Ubuntu Maverick):
milestone: none → ubuntu-10.10-beta
Changed in linux (Ubuntu Lucid):
importance: Critical → High
status: Confirmed → Won't Fix
Changed in linux (Ubuntu Maverick):
status: Confirmed → Won't Fix
ingo (ingo-steiner) wrote :

On my box it helped to just un-install plymouth and all services like cups, apcupsd ... are starting fine since then.
For how to perform this (artificially blocked) uninstall, see https://bugs.launchpad.net/bugs/556372

Rogi (rogi) wrote :

Just deinstalling plymouth solves this problem on my system too.

ingo (ingo-steiner) wrote :

Great, thanks for your feedback, Rogi.

Plymouth in my opinion is just a fancy boot screen which is still at an alpha stage. Though the Ubuntu team always points out it also does all logging during the boot process - that is the root cause of all that and other troubles, and you again did confirm that. Did you ever see any fsck properly logged in /var/log/fsck/* with Lucid?

The ugly side is that Canonical has intentionally built in a fake dependency on plymouth in the packages 'mountall' and 'cryptsetup' to force users to act as alpha/beta-testers for their buggy plymouth. I'd be willing to do such tests (to make Ubuntu better) if it is clearly explained and documented and the choice to participate is left to the user. But I do not accept the current attitude in an LTS release!

These are hard words from a disappointed Ubuntu user, and I will revoke my statements if somebody proves the opposite.

Anthony Glenn (aglenn-pcug) wrote :

The revised version of Upstart fixed it for me. Thanks to Scott James Remnant and Andy Whitcroft. I have done many cold boots and restarts since the fix, and the bug is gone.

To avoid problems with Plymouth, get rid of "quiet splash" in the Grub kernel command line. Alas, Grub and Grub 2 are different. The following instructions apply to Grub.

Start a root terminal. Enter:

grub-install -v

You should get the response "0.97". That is the Grub version number. If it reads "1.96" or higher, you have Grub 2, check the documentation at:

https://help.ubuntu.com/community/Grub2

Assuming you have Grub, edit /boot/grub/menu.lst , find the line like:

# kopt=root=UUID=87c74523-fd13-4bca-97e4-5aba28218222 ro quiet splash

Notice that the 32 hex digits on your UUID will be different to mine. Keep your own. Delete "quiet splash".

Look down further in the file and you will find a line like:

kernel /boot/vmlinuz-2.6.32-24-generic root=UUID=87c74523-fd13-4bca-97e4-5aba28218222 ro quiet splash

Delete the "quiet splash". Save the file. Restart the computer. Now you should have a Linux boot that looks like a Linux boot, with real console messages and none of that wimpy graphical splash screen rubbish. However, then your graphical interface (X Windows) will start normally.

Feel proud.

ingo (ingo-steiner) wrote :

> To avoid problems with Plymouth, get rid of "quiet splash"

Better: get rid of Plymouth

Nick Davis (argoneus) wrote :

>For anyone who is still experiencing problems with services not starting at boot, I suggest you open a new bug (feel free to subscribe me - robbie.w is my ID). There could be any number of reasons why a particular service won't start, and we'd rather you open a new bug than add to this one. If we determine the issue is the same, we will just dup your bug to this one.

>Thanks,
>Robbie

I've created a new bug, as requested, detailing the issues I'm still experiencing regarding services not starting on boot. I've subscribed you to the bug (https://bugs.launchpad.net/ubuntu/+bug/642555).

I'm running the updated upstart and plymouth-free grub2 config on two newly installed amd64 boxes (bog-standard Dell 2950s) with today's 2.6.32-24-server kernel, and I'm still finding a whole bunch of services fail to start (cron, apache2, mysql, fail2ban, sysstat, postfix, openntpd, mysql-mmm), along with the unknown runlevel issue. All of them (except sysstat for some reason) start fine if I run telinit 2. One of them reported the localhost interface breakage, but the other did not.
It doesn't seem likely that all of these major, common packages have bugs with identical symptoms - it's got to be a problem with upstart - so reporting bugs on the packages is probably a waste of time.
This is a really major problem as it can (and does for us) render servers non-functional and insecure by default - why is it marked as 'wontfix' and downgraded from critical with no obvious pointer to a better bug?

On Tue, Sep 21, 2010 at 04:09:39PM -0000, Marcus Bointon wrote:
> I'm running the updated upstart and plymouth-free grub2 config on two
> newly installed amd64 boxes (bog-standard Dell 2950s) with today's
> 2.6.32-24-server kernel, and I'm still finding a whole bunch of services
> fail to start (cron, apache2, mysql, fail2ban, sysstat, postfix, openntpd,
> mysql-mmm), along with the unknown runlevel issue. All of them (except
> sysstat for some reason) start fine if I run telinit 2. One of them
> reported the localhost interface breakage, but the other did not.

If the runlevel is not set at boot, the issue you're seeing is unrelated to
this bug. The rc-sysinit job is failing to trigger, either because you have
filesystems configured in /etc/fstab that are not being found and mounted at
boot time, or because you don't have a properly configured loopback
interface in /etc/network/interfaces.

> It doesn't seem likely that all of these major, common packages have bugs
> with identical symptoms - it's got to be a problem with upstart - so
> reporting bugs on the packages is probably a waste of time.

Yes, in that case the appropriate thing to do would be to file a *single*
bug against upstart describing the problem. In any case, following up to
this bug, which has been fixed, is not the correct approach.

> This is a really major problem as it can (and does for us) render servers
> non-functional and insecure by default - why is it marked as 'wontfix' and
> downgraded from critical with no obvious pointer to a better bug?

What is marked as 'wontfix' for the lucid and maverick releases is the
kernel issue that led to this problem. The user-facing issue has been
addressed.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

tags: added: kernel-key
David Haskins (surfari) wrote :

I cannot see that this has been fixed. I am using 10.04 LTS on my Toshiba Tecra and do Apache development. Most times, but not always, apache2 does not start on boot, so I need to start it manually with "sudo /etc/init.d/apache2 start".

David Haskins, could you run the 'runlevel' command to check if this is related to this particular bug, or if it is another problem?

On Wed, Jan 12, 2011 at 03:14:41PM -0000, John Edwards wrote:
> David Haskins, could you run the 'runlevel' command to check if this is
> related to this particular bug, or if it is another problem?

A better approach might be to put logging into /etc/init.d/apache2
and figure out:
 If it is actually called.
 And, if it is, why does it fail.

I freely admit my prejudices on this type of problem.

 /sbin/init under upstart does not guarantee the order of execution of the
startup scripts. In fact it almost forces them to be random if you have a
multi-core processor. (Again this is my prejudice, based on my prior problems
with upstart).

I strongly suspect that there is some service that /etc/init.d/apache2
depends upon that is sometimes completely started when it runs, and
sometimes is not. Since upstart runs /etc/init/* services
depending on the "start on" prerequisite "event" list being _exactly_ right,
if I didn't know about one or more, or got the "start on" stanza slightly wrong
I saw unreliable service starts.

My two cents: I think it would be good to have access to an old-fashioned,
one-step-at-a-time boot sequence (as an option).
I would gladly give up boot speed for boot sequence certainty. I do not
see a way to do that with upstart. Boot sequence certainty was easy to
accomplish when everything was started by numbered filenames under
 /etc/init.d .

I commented on the lack of upstart event documentation in other bug reports:
https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/45
https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/46
https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/543506/comments/49

ingo (ingo-steiner) wrote :

Probably things will turn better in future, Scott will leave Canonical and go to Google by next week, leaving behind the half done upstart in Ubuntu for others to clean up.

David Haskins (surfari) wrote :

I get:
$ runlevel
N 2
whatever that means.

Luckily my remote production unattended 64bit virtual Ubuntu 10.04 LTS server seems to start Apache, MySQL and SSH when it boots otherwise this would be a disaster. Why do this kind of thing in a LTS release?? Doesn't anyone understand the words "server" "stable" "conservative" - next time I meet Mark Shuttleworth (he has an honorary doctorate from my University) I tell him this is ANOTHER fault caused by unnecessary fiddling by youths. The last one was about Tomcat clobbered by a missing catalina-ant link.

ingo (ingo-steiner) wrote :

@David,

try executing the following command (it simulates an upgrade of libc6):

apt-get install --reinstall libc6

and shut down your server, then check your root filesystem,
or reboot and check with "grep orphan /var/log/messages"

That's Bug #672177

David, if you get runlevel 2 it means that you have a different problem to this bug (which is about upstart not entering runlevel 2).
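
For reference, runlevel prints the previous runlevel followed by the current one; "N" in the first field means there was no previous runlevel since boot, and the command prints "unknown" when no runlevel has been recorded. A rough sketch of reading that output follows; the function name describe_runlevel is invented for illustration, and treating "unknown" as the symptom of this bug is an assumption based on the description above.

```shell
# describe_runlevel "PREV CUR": explain a line of runlevel(8) output.
# Returns non-zero when no runlevel has been recorded.
# (Hypothetical helper for illustration only.)
describe_runlevel() {
    case "$1" in
        unknown|"")
            echo "no runlevel recorded: boot has not reached a runlevel"
            return 1
            ;;
    esac
    prev="${1%% *}"   # first field: previous runlevel, or N for none
    cur="${1##* }"    # last field: current runlevel
    if [ "$prev" = "N" ]; then
        echo "current runlevel $cur, no previous runlevel since boot"
    else
        echo "current runlevel $cur, previous runlevel $prev"
    fi
}
```

It would be called as describe_runlevel "$(runlevel)".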

If it only affects Apache then I would suggest opening a new bug against the relevant Apache package (if one does not already exist). If it does turn out to be an upstart bug then it can be reassigned to that package.

You might also want to try some of the suggestions in this and related bug reports, such as commenting out the "console output" lines in the files in the /etc/init/ directory.
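
As a rough sketch of that workaround, the lines can be commented out with sed while keeping a backup of every file that is changed. The function name comment_console_output and the .bak suffix are arbitrary choices for this example; the directory must be given explicitly so that nothing under /etc/init is modified by accident, and it is worth trying on a copy of the directory first.

```shell
# comment_console_output DIR: comment out "console output" lines in
# DIR/*.conf, saving the original of each changed file as FILE.bak.
# (Hypothetical helper, not part of upstart.)
comment_console_output() {
    dir="$1"
    [ -d "$dir" ] || { echo "usage: comment_console_output DIR" >&2; return 2; }
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue
        if grep -q '^[[:space:]]*console output' "$f"; then
            cp -p "$f" "$f.bak"
            # Keep the indentation and put '#' in front of the stanza
            sed 's/^\([[:space:]]*\)\(console output\)/\1#\2/' "$f.bak" > "$f"
        fi
    done
    return 0
}
```

On a real system this would be run as root, for example comment_console_output /etc/init, followed by a reboot to see whether the affected services now start.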

As for your other comments, Ubuntu servers are package-stable for LTS releases, but not as conservative as Debian or Red Hat - that is one of the reasons why Ubuntu exists. I run over 50 Ubuntu servers and only 1 machine was affected by the upstart bug, and that was 9 years old and only used for testing. The fact that only one of your machines has this problem does suggest that it is a little obscure and not something that is commonly met.

Jens Schødt (jens-schoedt) wrote :

Any news on this bug? Is anyone trying to fix it?

Considering moving to another distro without upstart....

Jens Schødt, this bug is in the kernel console device not being writeable early enough in the boot. It appears to be in upstart because that is the first thing that tries to write to it. A workaround for the problem has been available for several months, and it seems to have fixed the boot for those people who reported on this bug.

So if you have a fully updated machine and still see problems like 'runlevel' reporting 'N' then I recommend you put the details of your problem in a new bug report and place a link to this bug report as a possible duplicate (but don't mark as confirmed duplicate).

On Sat, 2011-02-12 at 12:11 +0000, Jens Schødt wrote:
> Any news on this bug? Is anyone trying to fix it?
>
> Considering moving to an other distro without upstart....

Yeah, IMHO, upstart was waaaaayyyy rushed into Ubuntu. It was clearly
not ready considering how many problems it has caused for anyone not
running a configuration that fits within what has got to be some pretty
narrow testing scenarios (i.e. desktop systems with the entire system in
a single partition?)

I guess it wouldn't have seemed so bad if it were not taking literally
many months and several releases to fix the problems. Again, IMHO it
was a mistake to put it in so soon. But everyone makes mistakes, that's
fair enough. But when one does make a mistake, the right thing to do is
get all hands on deck so that your mistake does not become other people's
problem.

Just my $0.02.

Jens Schødt (jens-schoedt) wrote :

Thanks John,

I see problems in a fresh 10.10 install (both 32 and 64bit). init.d services are not always started. eg. hostapd and sabnzbdplus.
I don't know about the 'runlevel' reporting 'N' because I am currently trying another distro without this problem. I will create a bug if I get back to it.

> > On Sat, 2011-02-12 at 12:11 +0000, Jens Schødt wrote:
> > Any news on this bug? Is anyone trying to fix it?
> On Sat, Feb 12, 2011 at 02:15:20PM -0000, Brian J. Murrell wrote:
> Yeah, IMHO, upstart was waaaaayyyy rushed into Ubuntu. ...

I agree.

The evidence strongly suggests that Upstart still is not ready for prime time.

So may I suggest, at the very least, /bin/init be given a detailed logging
function that announces _everything_ it does, and _why_, into syslog
with the _option_ of quieting it if someone sees the need to save the file
space. By "_everything_ it does" I mean not only starting and stopping.
I mean announcing that it is waiting on some event for some /etc/init/*.conf
stanza. Announcing every event that arrives and (if it can) from where.

As I see it the problem is that nobody can present clear evidence of why
their favorite start-up script is not performing as expected because you
have to go in and instrument (read "put print statements into") everything
that upstart touches. Then, if it doesn't touch something you have to
guess a probable cause, hack, and reboot.

This problem has a long history that looks like inattention out here in
user land.

The evidence is that it IS NOT FIXED.

And don't pass it off as a problem in the kernel or somewhere else.
When Ubuntu does not work, it is Ubuntu that is not working.

And, if a fix still eludes you, put back an old fashioned /bin/init with the
old fashioned /etc/init.d/S??<startupscript> sequence,
at least as an option.

I am still on 8.04 because this is not fixed.

I also am looking to go straight Debian or elsewhere.

Brian Rogers (brian-rogers) wrote :

Upstart has been working fine for me ever since the patch for this bug.

If you're still experiencing the symptoms in a closed bug report, the best thing to do is file a new bug. Otherwise fixing the remaining bug isn't on anyone's to-do list, and it's kind of silly to complain that it isn't being done.

On Sat, 2011-02-12 at 21:22 +0000, Mike Bianchi wrote:
> > > On Sat, Feb 12, 2011 at 02:15:20PM -0000, Brian J. Murrell wrote:
> > > Any news on this bug? Is anyone trying to fix it?
> > On Sat, 2011-02-12 at 12:11 +0000, Jens Schødt wrote:
> > Yeah, IMHO, upstart was waaaaayyyy rushed into Ubuntu. ...
>
> I agree.
>
> The evidence strongly suggests that Upstart still is not ready for prime
> time.
>

Ready or not, it's out there in RHEL and Ubuntu as well as other places.

>
> So may I suggest, at the very least, /bin/init be give an detailed logging
> function that announces _everything_ it does, and _why_, into syslog
> with the _option_ of quieting it if someone sees the need to save the file
> space. By "_everything_ it does" I mean not only starting and stopping

Steve Langasek (vorlon) wrote :

On Sat, Feb 12, 2011 at 09:22:20PM -0000, Mike Bianchi wrote:
> > > On Sat, Feb 12, 2011 at 02:15:20PM -0000, Brian J. Murrell wrote:
> > > Any news on this bug? Is anyone trying to fix it?
> > On Sat, 2011-02-12 at 12:11 +0000, Jens Schødt wrote:
> > Yeah, IMHO, upstart was waaaaayyyy rushed into Ubuntu. ...

> I agree.

> The evidence strongly suggests that Upstart still is not ready for prime
> time.

This is not a forum for general gripes about how upstart works. It is a bug
report about a specific issue, /one which has been resolved/.

If anyone is having problems with getting an init script to start up
reliably at boot time on your system, please file a new bug report against
the upstart package in Ubuntu.

> I am still on 8.04 because this is not fixed.

As has been stated repeatedly by the developers, *this* bug is fixed, and by
your own admission you are not in a position to provide feedback about any
further issues with startup in Ubuntu releases that use native upstart jobs
by default. If you had an actual bug to report regarding boot reliability,
you would find the Ubuntu developers very responsive to the issue. But you
aren't reporting a bug, you're instead repeating hearsay about boot being
unreliable with upstart when the record does not support this. Please do
not abuse the bug reporting system for general complaints about the
technology decisions that have been made in Ubuntu. Please *do* file new
bug reports for outstanding issues in upstart or elsewhere.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world


We users are NOT experts in the Launchpad rules and regulations.
We just want our problems addressed.

Please do whatever is necessary to open up a bug that addresses the
long standing and still present condition of init scripts not being executed
reliably.

For "Title" and "Description" I propose:

        Boot service starts are sometimes, randomly, unreliable.
        Execution of /etc/init/* and /etc/init.d/* boot scripts sometimes fails.

        Since the introduction of the upstart version of /bin/init we continue
        to see unreliable boot sequences where it appears boot scripts in
        /etc/init/ and /etc/init.d/ are not always executed to completion.

        Bug #554172, Comment #295 2011-02-12
                "... in a fresh 10.10 install (both 32 and 64bit). init.d
                services are not always started. eg. hostapd and sabnzbdplus."

        It is crucial to note that evidence suggests
                the behavior is somewhat random, and
                that it appears to be more prevalent on multi-core processors.

        This bug was once thought to be based on the "console output" stanza in
        /etc/init/*.conf noted in Bug #554172 and duplicates, but ongoing
        evidence is that the problem has not been completely solved.

I ask this based on the following observations from this bug (#554172) ...

> John Edwards wrote on 2011-02-12: #293
> Jens Schødt, this bug is in the kernel console device not being writeable
> early enough in the boot.

This bug's Description starts with:
   "Cups is not loading on my machine at boot, must run sudo /etc/init.d/cups
   start to after booting to print."

It became titled 'system services using "console output" not starting at boot'
later on when that seemed to be the root cause of all the problems.

While the "console output" fix improved the situation it obviously is not
the complete solution.

> Steve Langasek wrote on 2011-02-12: #299
> If anyone is having problems with getting an init script to start up
> reliably at boot time on your system, please file a new bug report against
> the upstart package in Ubuntu.
> :
> ... *this* bug is fixed, ...

> Jens Schødt wrote on 2011-02-12: #295
> I see problems in a fresh 10.10 install (both 32 and 64bit). init.d services
> are not always started. eg. hostapd and sabnzbdplus.

To my mind, Jens is saying "this bug still exists in 10.10", and that while
this _bug_ is marked "Fix Released", the evidence is that the _problem_ is not.

There are 20 duplicates of this bug, some going back to 9.10 in early 2010.
That says to me that another bug would just get marked "possible duplicate"
(John Edwards, comment #293) and would again not be investigated or
attended to.

In bug 642555: (we can see this problem has a long and convoluted history)
> Colin Watson wrote on 2010-11-22: Comment #10
> The thing is that we don't know what the root cause of this problem
> (i.e. the one at the start of this bug report) is. Without data, it's
> just speculation. At the moment, we haven't proven that your missing
> boot messages actually have anything to do with services not starting -
> the link is only circ...


Clint Byrum (clint-fewbar) wrote :

On Mon, 2011-02-14 at 15:13 +0000, Mike Bianchi wrote:
> We users are NOT experts in the Launchpad rules and regulations.
> We just want our problems addressed.
>
> Please do whatever is necessary to open up a bug that addresses the
> long standing and still present condition of init scripts not being executed
> reliably.
>
> For "Title" and "Description" I propose:
>
> Boot service starts are sometimes, randomly, unreliable.
> Execution of /etc/init/* and /etc/init.d/* boot scripts sometimes fail.
>
>
> Since the introduction of the upstart version of /bin/init we continue
> see unreliable boot sequences where it appears boot scripts in
> /etc/init/ and /etc/init.d/ are not always executed to completion.
>
> Bug #554172, Comment #295 2011-02-12
> "... in a fresh 10.10 install (both 32 and 64bit). init.d
> services are not always started. eg. hostapd and sabnzbdplus."
>
> It is crucial to note that evidence suggests
> the behavior is somewhat random, and
> that it appears to be more prevalent on multi-core processors.
>
> This bug was once thought to be based on the "console output" stanza in
> /etc/init/*.conf noted in Bug #554172 and duplicates, but ongoing
> evidence is that the problem has not been completely solved.
>
>
> I ask this based on the following observations from this bug (#554172) ...
>
> > John Edwards wrote on 2011-02-12: #293
> > Jens Schødt, this bug is in the kernel console device not being writeable
> > early enough in the boot.
>
> This bug's Description starts with:
> "Cups is not loading on my machine at boot, must run sudo /etc/init.d/cups
> start to after booting to print."
>
> It became titled 'system services using "console output" not starting at boot'
> later on when that seemed to be the root cause of all the problems.
>
> While the "console output" fix improved the situation it obviously is not
> the complete solution.
>

Mike, sorry about my earlier reply, some of it got cut off by how
launchpad handles quoted email. Steve said some of what I was saying.

The request to create a new bug report, marking it as possibly related,
is a common practice and it has proven useful in the past to get things
fixed.

Right now the issue that was reported originally is definitely fixed
from the point of view of many of those who reported they were affected
by it, and from the point of view of the developer(s?) who found and
applied a fix.

Some of the duplicates/comments may not be related to what was fixed
there, but it's not helpful to anyone to keep pounding on this bug
report, even if some of the original reports were actually not about
this bug.

So I agree with Steve, and recommend that users who are affected by a
similar problem (even if it's not easily or predictably reproducible),
report a new bug, and post here saying "I reported bug X, possibly
related to this one".

Then users experiencing this problem will see that comment, go to that
bug, and possibly see that their issues are more closely related to that
problem, and hopefully help by adding some data p...


On Mon, Feb 14, 2011 at 11:56 AM, Clint Byrum <email address hidden> wrote:

>
> Right now the issue that was reported originally is definitely fixed from
> the point of view of many of those who reported they were affected

I took Ubuntu 10.04 off my little home network because of this bug after
suffering with it for too long and trying all "fixes". I don't know if the
bug would still persist on my machines with the latest Ubuntu but a flaw in
the startup code is not something I would tolerate for long. (I came to
Ubuntu from NetBSD --- you know its reputation for quality --- and moved on
from Ubuntu to Archlinux which has worked out well for me).

On Mon, Feb 14, 2011 at 05:56:44PM -0000, Clint Byrum wrote:
> The request to create a new bug report, marking it as possibly related,
> is a common practice and it has proven useful in the past to get things
> fixed.
>
> Right now the issue that was reported originally is definitely fixed
> from the point of view of many of those who reported they were affected
> by it, and from the point of view of the developer(s?) who found and
> applied a fix.

I am one of those who saw the same problem in 10.04 in mid-2010 that many
others were and are reporting. It still isn't fixed.
No matter what the status of this bug says, it still is not fixed.

See
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/554172/comments/295

It seems the only way to get the attention of the developers that there is a
problem is to generate a detailed bug report, but /bin/init does not give
enough logging information to prove the case.

You want evidence of the bug?
Give us the ability to generate the detailed log files that will let us demonstrate it.

Please!

On Mon, Feb 14, 2011 at 03:13:57PM -0000, Mike Bianchi wrote:
> Please do whatever is necessary to open up a bug that addresses the
> long standing and still present condition of init scripts not being executed
> reliably.

No, we do not proxy create bug reports. The user /experiencing the problem/
should open a bug report. Bug reports are an important tool for developers
to have a dialogue with the user in order to isolate and resolve bugs. That
doesn't work if the user isn't part of that dialogue.

> For "Title" and "Description" I propose:

> Boot service starts are sometimes, randomly, unreliable.
> Execution of /etc/init/* and /etc/init.d/* boot scripts sometimes
> fail.

That is a generic title that only underscores the disconnect here. You are
starting from the assumption that there is some general bug in upstart that
causes boot scripts to be unreliable. There is no evidence of this. We
have in the past identified bugs that had such an effect, and those have
been fixed. For the vast majority of users, boot in Ubuntu is now reliable.

Any bug report about remaining boot problems should be *specific* in its
description of the problem. If there is a general bug, the developers can
extrapolate from the specific to the general. But if you can't give
specifics, it's not a bug report at all - it's a rumor.

> Bug #554172, Comment #295 2011-02-12
> "... in a fresh 10.10 install (both 32 and 64bit). init.d
> services are not always started. eg. hostapd and sabnzbdplus."

The author of comment #295 says he is not currently in a position to help us
debug this problem and that he will file a bug report if and when he is
able. Once again, there is no need to proxy creation of a report.

I understand that some people are still finding their boot unreliable. But
there is no evidence that it's related to *this* bug report, and these boot
reliability problems are not reproducible for any of the developers. We
want to help make Ubuntu more reliable for users but the first step in doing
that is for the affected users to file bug reports for their specific issues
and *not* misuse this bug report for that purpose.

ingo (ingo-steiner) wrote :

> I understand that some people are still finding their boot unreliable. But
> there is no evidence that it's related to *this* bug report, and these boot
> reliability problems are not reproducible for any of the developers. We
> want to help make Ubuntu more reliable for users but the first step in doing
> that is for the affected users to file bug reports for their specific issues
> and *not* misuse this bug report for that purpose.

Please also consider that by now (almost 1 year after the release of LTS Lucid) people, including me, are tired of submitting bug reports which after some months will probably get marked "won't fix in Lucid", "incomplete", "will fix in Natty" or whatever.

I meanwhile switched to Debian-Squeeze => all troubles gone, very similar to Lucid, but reliable.

tags: added: iso-testing
tags: removed: kernel-key

Mike Basinger, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? If so, could you please test for this with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ .

If it remains an issue, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect -p linux <replace-with-bug-number>

Also, could you please test the latest upstream kernel available (not the daily folder) following https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional upstream developers to examine the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.13-rc4

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description. As well, please remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.

Changed in linux (Ubuntu):
assignee: Andy Whitcroft (apw) → nobody
status: Confirmed → Incomplete