Ubuntu

system hangs on boot after updates from 2011-02-22

Reported by C de-Avillez on 2011-02-23
This bug affects 13 people
Affects: alsa-utils (Ubuntu)
Importance: High
Assigned to: Colin Watson

Bug Description

Binary package hint: mountall

Frankly I do not know to which package I should attribute this. Candidates are mountall, upstart, ureadahead, and libc6.

After today's updates, I cannot boot my laptop anymore. Another natty user -- nick proti on #ubuntu+1 -- also has this problem on 3 different machines. His/hers are Sun workstations; mine is a Core i7 laptop. The common ground is that both of us have multiple filesystems.

Boot stops after mountall is activated (fscks are scheduled, and seemingly most complete).

After rebooting with init=/bin/sh and activating additional terminals, I ran 'exec init -v' on tty1.

init runs for a while, and then hangs. On tty2, 'mount' shows only 3 filesystems mounted:
* /
* /usr
* /tmp
and nothing is listed for /var, /src, /boot, /srv, /home, and /opt.

This suggests mountall is hanging.
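
The comparison above (what fstab asks for versus what is actually mounted) can be scripted. This is only a sketch: missing_mounts is a hypothetical helper, not part of mountall, and it assumes the usual whitespace-separated fstab and /proc/mounts layouts.

```shell
# Hypothetical helper: list fstab mount points that are not currently mounted.
# usage: missing_mounts [FSTAB] [MOUNTS]   (defaults: /etc/fstab, /proc/mounts)
missing_mounts() {
    fstab="${1:-/etc/fstab}"
    mounts="${2:-/proc/mounts}"
    awk '
        # First file (mounts): remember every mounted path.
        NR == FNR { mounted[$2] = 1; next }
        # Second file (fstab): skip comments and short or blank lines.
        /^[ \t]*#/ || NF < 3 { next }
        # Report real paths (not swap/none) that are not noauto and not mounted.
        $2 ~ /^\// && $4 !~ /(^|,)noauto(,|$)/ && !($2 in mounted) { print $2 }
    ' "$mounts" "$fstab"
}
```

Running missing_mounts with no arguments from a rescue shell would print one line per filesystem that mountall has not yet mounted; an empty result means everything in fstab is up.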

C de-Avillez (hggdh2) wrote :

proti states his home system was still usable today until applying the last sequence of updates:

   1. 2011-02-22 21:05:19 status installed man-db 2.5.9-3
   2. 2011-02-22 21:05:21 status installed libpam-modules 1.1.2-2ubuntu3
   3. 2011-02-22 21:05:32 status installed man-db 2.5.9-3
   4. 2011-02-22 21:05:33 status installed gcc-4.5-base 4.5.2-3ubuntu3
   5. 2011-02-22 21:05:36 status installed libstdc++6 4.5.2-3ubuntu3
   6. 2011-02-22 21:05:37 status installed libc-bin 2.13-0ubuntu1
   7. 2011-02-22 21:05:45 status installed man-db 2.5.9-3
   8. 2011-02-22 21:05:46 status installed libgcc1 1:4.5.2-3ubuntu3
   9. 2011-02-22 21:05:47 status installed libc-bin 2.13-0ubuntu1
  10. 2011-02-22 21:05:50 status installed man-db 2.5.9-3
  11. 2011-02-22 21:05:52 status installed libpam-runtime 1.1.2-2ubuntu3
  12. 2011-02-22 21:05:55 status installed libpam0g 1.1.2-2ubuntu3
  13. 2011-02-22 21:05:55 status installed libc-bin 2.13-0ubuntu1
  14. 2011-02-22 21:07:08 status installed man-db 2.5.9-3
  15. 2011-02-22 21:07:09 status installed install-info 4.13a.dfsg.1-6ubuntu3
  16. 2011-02-22 21:07:09 status installed doc-base 0.9.5ubuntu2
  17. 2011-02-22 21:07:10 status installed ureadahead 0.100.0-9
  18. 2011-02-22 21:07:11 status installed bamfdaemon 0.2.78-0ubuntu1
  19. 2011-02-22 21:07:11 status installed desktop-file-utils 0.18-0ubuntu1
  20. 2011-02-22 21:07:13 status installed python-gmenu 2.30.5-0ubuntu2
  21. 2011-02-22 21:07:19 status installed hicolor-icon-theme 0.12-1
  22. 2011-02-22 21:07:19 status installed libglib2.0-0 2.28.1-1ubuntu1
  23. 2011-02-22 21:07:25 status installed python-support 1.0.10ubuntu3
  24. 2011-02-22 21:07:33 status installed gconf2 2.32.1-2ubuntu2
  25. 2011-02-22 21:07:35 status installed shared-mime-info 0.90-1
  26. 2011-02-22 21:07:35 status installed ufw 0.30.0-2ubuntu1
  27. 2011-02-22 21:07:36 status installed libx11-data 2:1.3.3-3ubuntu5
  28. 2011-02-22 21:07:36 status installed libx11-6 2:1.3.3-3ubuntu5
  29. 2011-02-22 21:07:36 status installed libgomp1 4.5.2-3ubuntu3
  30. 2011-02-22 21:07:37 status installed lib32gcc1 1:4.5.2-3ubuntu3
  31. 2011-02-22 21:07:37 status installed lib32stdc++6 4.5.2-3ubuntu3
  32. 2011-02-22 21:07:37 status installed cpp-4.5 4.5.2-3ubuntu3
  33. 2011-02-22 21:07:37 status installed gcc-4.5 4.5.2-3ubuntu3
  34. 2011-02-22 21:07:37 status installed gpgv 1.4.11-3ubuntu1
  35. 2011-02-22 21:07:39 status installed gnupg 1.4.11-3ubuntu1
  36. 2011-02-22 21:07:39 status installed libfreetype6 2.4.4-1
  37. 2011-02-22 21:07:39 status installed libfreetype6-dev 2.4.4-1
  38. 2011-02-22 21:07:40 status installed alsa-utils 1.0.24.2-0ubuntu1
  39. 2011-02-22 21:07:40 status installed apache2-utils 2.2.17-1ubuntu1
  40. 2011-02-22 21:07:40 status installed binfmt-support 2.0.3
  41. 2011-02-22 21:07:41 status installed espeak-data 1.44.05-1ubuntu1
  42. 2011-02-22 21:07:41 status installed libespeak1 1.44.05-1ubuntu1
  43. 2...
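
A list like the one above can be reduced to one line per package straight from /var/log/dpkg.log. The following is a sketch only: installed_on is a hypothetical helper, and it assumes the log uses the "DATE TIME status installed PACKAGE VERSION" layout seen in this comment.

```shell
# Hypothetical helper: print "package version" for everything that reached
# the "installed" state on a given day, with repeated entries collapsed.
# usage: installed_on YYYY-MM-DD [LOGFILE]
installed_on() {
    day="$1"
    log="${2:-/var/log/dpkg.log}"
    awk -v d="$day" '
        $1 == d && $3 == "status" && $4 == "installed" { print $5, $6 }
    ' "$log" | sort -u
}
```

For example, installed_on 2011-02-22 would show each package once, which makes the candidates (mountall, ureadahead, libc-bin, alsa-utils and so on) easier to scan than the raw log.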

tags: added: natty
Sébastien Bernard (sbernard) wrote :

I'm indeed proti from freenode.
I made some progress regarding this issue.
I found that if you comment out all the entries in /etc/fstab that refer to a UUID (in my case, /boot and the swap), then mountall behaves normally and the system is able to boot up to the gdm login.
The mystery remains, but it should be much easier to debug now.

Crashbit (crashbit-gmail) wrote :

I have a similar problem

After today's updates, I cannot boot my laptop anymore.

I boot by selecting Ubuntu's safe mode, and when the laptop hangs, I press SysRq+i, which gives me a shell.

I see:
init: ureadahead-other main process (903) terminated with status 4

# df -h
/dev/sda2 /
none /dev
none /dev/shm
none /var/run
none /var/lock
/dev/sda5 /home
/dev/sda7 /usr

The UUIDs in /etc/fstab agree with those shown by the blkid command.
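
For anyone who wants to repeat that check without comparing by eye, here is a rough sketch. Both function names are hypothetical, and it assumes fstab entries of the form UUID=<value> in the first field.

```shell
# Hypothetical helper: print the UUIDs referenced in an fstab file.
# Commented-out entries are ignored because their first field starts with '#'.
fstab_uuids() {
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); gsub(/"/, "", $1); print $1 }' \
        "${1:-/etc/fstab}"
}

# Hypothetical helper: print fstab UUIDs that are absent from a list of known
# UUIDs (one per line). No output means every fstab UUID was found.
# usage: check_fstab_uuids FSTAB KNOWN_UUIDS_FILE
check_fstab_uuids() {
    fstab_uuids "$1" | while read -r u; do
        grep -qxF -- "$u" "$2" || echo "$u"
    done
}

# On a real system (as root) the list could come from blkid:
#   blkid -s UUID -o value > /tmp/uuids
#   check_fstab_uuids /etc/fstab /tmp/uuids
```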

Last three dmesg messages are:
EXT4-fs (sda7): mounted filesystem with ordered data mode. Opts: (null)
SysReq : Kill all tasks
udev[940]: starting version 167

Sorry for my bad English

Sébastien Bernard (sbernard) wrote :

Here's an ltrace of mountall.

Sébastien Bernard (sbernard) wrote :

Unfortunately, commenting out the UUID entries did not fix the boot problem.
I realized it after a reboot: mountall started to hang again.
I made various attempts.
The behaviour is inconsistent. Sometimes mountall mounts 1 fs, sometimes 3-4, sometimes nothing.
After a few tries, mountall mounts everything and returns OK. Then one can start the service via initctl start mountall, and the whole system comes to life again.
I really have no clue about what is wrong.
I was only able to start the system by launching mountall on another tty, suspending it, and then running kill -9 %1, several times.

Colin Watson (cjwatson) wrote :

Can you add --verbose to the 'exec mountall' line in /etc/init/mountall.conf and report the last screenful of output?

Colin Watson (cjwatson) wrote :

Also, could you attach the full /var/log/dpkg.log?

Sébastien Bernard (sbernard) wrote :

Here's the dpkg.log (easy part).

C de-Avillez (hggdh2) wrote :

Here's a snapshot of the console at the hang point.

I've tried to grab as much state as I could. Attached is the fstab from anacoluthon, a T42 Thinkpad that suffers from this problem.

When booting, I choose the normal (i.e., non-recovery) GRUB entry, since the goal is to bring the system up. I edit the entry before booting, removing the quiet, splash, and vt.handoff kernel parameters. I add the --verbose parameter so that upstart's progress is visible. The system boots partially, then hangs (see anacoluthon-boot-blocked.jpg).

I then hit SysRq-i to get a prompt. I record the mount state from /proc/mounts (see anacoluthon-proc-mounts.1). I run "mountall --verbose", redirecting the output to anacoluthon-mountall.log; it runs and then hangs (anacoluthon-mountall-blocked.jpg). I interrupt the program with Ctrl-C and record the mount state (anacoluthon-proc-mounts.2). I then append a blank line to anacoluthon-mountall.log so the individual runs can be told apart.

Finally, I run "mountall --verbose" again, appending the output to anacoluthon-mountall.log. The system boots fully, but some mount permissions are incorrect, such as /dev/shm (1755 instead of 1777). The state is captured in anacoluthon-proc-mounts.3.

(The /dev/shm bug is unrelated, it seems -- see Bug #566788)

Muharem Hrnjadovic (al-maisan) wrote :

A T61 of mine was affected by this bug as well. Downgrading mountall from rev. 2.22 to 2.21 did not help, no luck with downgrading ureadahead from 0.100.0-10 to 0.100.0-9 either.

I then tried booting into Ubuntu 10.10 which was installed on the same machine (but on a different root partition) and that would also hang at the Ubuntu screen with the 5 dots underneath (and alternating).

Re-doing a grub-install from the Ubuntu 11.04 root partition did not help or make any difference.

So, then I booted into a rescue system, chrooted into the Ubuntu 10.10 root partition and re-did grub-install. That restored the system to a point where I could boot into Ubuntu 10.10 at least.

I hope this helps.

Eric B (ebischoff) wrote :

I have two machines on which I do the alpha natty updates at the same time. One of them has few partitions, and boots fine. The other one has many partitions, and can't boot.

Using /dev/sda* device files instead of UUIDs does not help.

A workaround is to kill the mountall process manually during init and restart it by hand several times until the system boots normally.

Sébastien Bernard (sbernard) wrote :

To help people who are stuck, here are the operations I need to do at each boot:
In the grub chooser (hold Shift while booting if it does not show up), edit the boot configuration:
go to the line that says:
linux /vmlinuz-2.6.38-5-generic root=/dev/<your-root-device> ro crashkernel=384M-2G:64M,2G-:128M quiet splash vt.handoff=7
add break=bottom at the end and remove the splash option.
It should read :
linux /vmlinuz-2.6.38-5-generic root=/dev/<your-root-device> ro crashkernel=384M-2G:64M,2G-:128M quiet vt.handoff=7 break=bottom
and then hit ctrl-x to boot.
Once at the initramfs prompt do :
chroot /root /bin/bash
getty -8 -n -l /bin/bash 38440 tty8&
getty -8 -n -l /bin/bash 38440 tty6&
exit
exit
Upstart should be starting now, and you should have 2 shells available on consoles 6 and 8 (alt+f6, alt+f8).
When mountall gets stuck, go to one of the consoles and do:
initctl stop mountall
The command will get stuck (if not, you're lucky). Then hit <ctrl-z> to get back to the shell prompt, then:
kill -9 %1
fg
initctl start mountall

Loop over initctl stop mountall and initctl start mountall until all of the filesystems are mounted and the upstart daemon resumes its boot.
My $0.02 or 0.02€ contrib.

UpsI Forgot (123itsame) wrote :

does not work for me.
1.: although I have all recent upgrades, I still have kernel 2.6.37 (I do not know why; 2.6.38 is just not used and does not even show up in the list)
2.: there is nothing like that "crashkernel=..." thing in my grub boot line
3.: when executing the chroot /root /bin/bash command, I get error messages like:
- cannot set terminal process group (-1): Inappropriate ioctl for device
- no job control in this shell
- groups: command not found

when executing the getty commands I get [1]343 for tty8 and [2]344 for tty6

finally I get the same as before: no startup, no shells on tty6 or tty8, and the only way to access the system is via SysRq+i

Sébastien Bernard (sbernard) wrote :

It should read :
getty -8 -n -l /bin/bash 38400 tty8&
getty -8 -n -l /bin/bash 38400 tty6&

If you don't give getty that magic value, it won't work.

Eric B (ebischoff) wrote :

Here is the technique I use to work around the problem. It might not be any better than Sébastien's technique, but here it is anyway, in the hope it might help Herbert:
 - on grub screen, hit 'e' to edit boot commands
 - add init=/bin/bash to the kernel command line
 - press F10 to boot the kernel
 - at the prompt, type the following:
    getty -8 -n -l /bin/bash 38400 tty2 &
    getty -8 -n -l /bin/bash 38400 tty3 &
and make sure with Ctrl-Alt-F2 and Ctrl-Alt-F3 that you can switch terminals
 - type the following to resume the init sequence:
    exec init
 - switch terminal, get the process number of mountall with
     ps aux
then kill -9 <process number>
 - switch terminal again, and type
   mountall
to restart it. After a while, stop it with Ctrl-C, then restart it again. On my machine, it starts booting normally at this point.

As for the kernel version being outdated in your natty alpha release, you should probably try using "aptitude dist-upgrade" instead of "aptitude safe-upgrade" once you have managed to boot (just a guess, I'm not sure). I do not have the "crashkernel" option on my grub command line either. I don't think the kernel version or having room reserved for a crash kernel matters at all for the mountall problem, anyway.

I hope that helps.

Here's the quickest way I've found (this requires that you've set a root password):

* Edit the normal (non-recovery) grub entry, removing "quiet", "splash", and "vt.handoff". Optionally add "--verbose".

* Press Ctrl-X and let the system run through the boot process until it hangs.

* Hit SysRq-i, and type the root password to log in.

* Execute "mountall" and wait for it to block. Break it with Ctrl-C and execute it again.

On my system, I have to execute "mountall" only twice to get it to boot completely, but this may vary with the configuration. You may well need to do it more than twice, or perhaps only once.
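
The "run it, break it, run it again" step could in principle be scripted. The sketch below is untested against the actual hang: retry_blocked is a hypothetical helper, the attempt count and time limit are arbitrary, and it assumes timeout(1) from coreutils is available in the rescue shell.

```shell
# Hypothetical helper: run a command with a time limit, retrying if it blocks.
# usage: retry_blocked ATTEMPTS SECONDS COMMAND [ARGS...]
retry_blocked() {
    attempts="$1"; limit="$2"; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # timeout sends SIGINT after $limit seconds, much like pressing Ctrl-C.
        if timeout -s INT "$limit" "$@"; then
            return 0
        fi
        echo "attempt $i did not finish cleanly, retrying" >&2
        i=$((i + 1))
    done
    return 1
}

# Possible use at the root prompt (values are guesses):
#   retry_blocked 5 60 mountall
```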

My machine that has separate /var and /usr is affected by exactly this problem.
The other one (a laptop), which has many mountpoints but no separate /usr or /var, doesn't suffer from it and boots normally.
At first I chrooted from GRML to remove ureadahead, and it didn't help.
Now I'll try some of the advice above and let everybody here know the results. o/

Hi again
The #24 solution is fast and it works (you don't need to edit any grub options, by the way).
My opinion: it's mountall!

Hi
I've upgraded another machine (netbook) yesterday. This one has one ext4 partition only and is affected by the bug also.

Sébastien Bernard (sbernard) wrote :

I tried to set up a natty server in a VirtualBox machine.
Unfortunately, the VM is working OK.
I'm now trying to set up maverick and then upgrade to natty to see if it breaks.
Stay tuned.

Sébastien Bernard (sbernard) wrote :

Ok, I managed to reproduce the breakage.
1- Install maverick minimal (I just selected openssh server) with multiple separate FS on LVM
2- update maverick -> latest update.
3- change apt to point to natty
4- upgrade -> linux-image, mountall, libc6, grub-pc

then hang....

Sébastien Bernard (sbernard) wrote :

ok, maybe not.
I got a different behaviour but not a hang.
The system switches to vt7 and then stays on this vt.
Nothing happens then.
But the system runs; switching back to tty1 shows the login as usual.
So no luck.

Sébastien Bernard (sbernard) wrote :

Installing ubuntu-desktop after upgrading to natty gives me the hang.
So the path to reproduce is:
Install maverick (minimal install) with multiple separate filesystems
upgrade to natty
then install ubuntu-desktop

Sébastien Bernard (sbernard) wrote :

apt-get install alsa-utils
hangs the system at the next reboot.
Renaming /etc/init/alsa-restore.conf to /etc/init/alsa-restore.conf.disabled makes it boot again.
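
As a sketch of that rename, with an existence check so that a typo does nothing (disable_upstart_job is a hypothetical helper name; the workaround relies on Upstart only reading files in /etc/init whose names end in .conf):

```shell
# Hypothetical helper: disable an Upstart job by renaming its .conf file.
# usage: disable_upstart_job /etc/init/JOB.conf
disable_upstart_job() {
    conf="$1"
    if [ ! -f "$conf" ]; then
        echo "no such job file: $conf" >&2
        return 1
    fi
    mv -- "$conf" "$conf.disabled"
}

# On an affected system (as root):
#   disable_upstart_job /etc/init/alsa-restore.conf
```

Moving the file back to its original name undoes the change.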

Colin Watson (cjwatson) on 2011-03-01
Changed in mountall (Ubuntu):
status: New → Confirmed
Eric B (ebischoff) wrote :

I confirm that after renaming

  /etc/init/alsa-restore.conf

to

  /etc/init/alsa-restore.conf.disabled

the system does not hang anymore at reboot.

Sébastien and Eric: Thanks! Confirmed here as well; normal system boot was restored by moving alsa-restore.conf.

Colin Watson (cjwatson) wrote :

I am working on debugging this. For the record, I think that the alsa-restore.conf workaround is essentially a coincidence; it seems to be tickling some kind of race condition elsewhere. It's quite difficult to extract debugging output from this stage of the boot process, which is slowing me down - I'm going to build special debugging versions of a few things so that I can get output dumped to files rather than to the console.

Changed in mountall (Ubuntu):
assignee: nobody → Colin Watson (cjwatson)
Colin Watson (cjwatson) wrote :

Actually, never mind the previous comment - it was obvious once I thought about it the right way. The mounted event is SYNCHRONOUS: anything that says 'start on mounted ...' will block mountall until it comes back, and this is by design (see mounted(7)). Thus the effect of alsa-restore's 'start on runlevel [2345] and (mounted MOUNTPOINT=/usr) and (mounted MOUNTPOINT=/var)' is to deadlock mountall.

Furthermore, the current job specification means that 'alsactl restore' will never be run unless /usr and /var are both separate mountpoints.

The right way to specify this is not '(mounted MOUNTPOINT=/usr) and (mounted MOUNTPOINT=/var)', but simply 'filesystem'. I'll fix this shortly.

affects: mountall (Ubuntu) → alsa-utils (Ubuntu)
Changed in alsa-utils (Ubuntu):
status: Confirmed → Triaged
Colin Watson (cjwatson) wrote :

Furthermore, the whole thing is redundant anyway since runlevel [2345] implies that all filesystems listed in fstab have already been mounted!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package alsa-utils - 1.0.24.2-0ubuntu2

---------------
alsa-utils (1.0.24.2-0ubuntu2) natty; urgency=low

  * Remove '(mounted MOUNTPOINT=/usr) and (mounted MOUNTPOINT=/var)' from
    /etc/init/alsa-restore.conf's 'start on' condition. Not only is this
    redundant, because runlevel [2345] already implies that /usr and /var
    have been mounted, but it deadlocks mountall in the event of /usr and
    /var both being separate filesystems (LP: #723482).
 -- Colin Watson <email address hidden> Wed, 02 Mar 2011 00:43:01 +0000
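
Read against the job file, the changelog entry above amounts to the following change to the 'start on' stanza. This is a sketch built only from the conditions quoted in this bug; the rest of /etc/init/alsa-restore.conf is omitted, and the comments are annotations, not text from the package.

```
# Before: each 'mounted' event blocks mountall until the job returns, and the
# job cannot start until the other conditions are also met, so mountall
# deadlocks when /usr and /var are both separate filesystems.
start on runlevel [2345] and (mounted MOUNTPOINT=/usr) and (mounted MOUNTPOINT=/var)

# After: the runlevel condition alone, since runlevel [2345] already implies
# that the filesystems in fstab are mounted.
start on runlevel [2345]
```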

Changed in alsa-utils (Ubuntu):
status: Triaged → Fix Released

alsa-utils (1.0.24.2-0ubuntu2) fixes my system.

Thanks, Colin!

Eric B (ebischoff) wrote :

Thanks Colin, but also thank you to all the people who helped report and analyze this bug! Nice teamwork!
