[14.04] Boot stuck in "Starting configuring network devices" after upgrade

Bug #1412671 reported by Jaak Ristioja
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: High
Assigned to: Unassigned

Bug Description

One of these upgrades yesterday resulted in an unbootable system:

2015-01-19 20:32:45 upgrade libc-dev-bin:amd64 2.19-0ubuntu6.4 2.19-0ubuntu6.5
2015-01-19 20:32:47 upgrade libc6-dev:amd64 2.19-0ubuntu6.4 2.19-0ubuntu6.5
2015-01-19 20:32:48 upgrade libc6-dbg:amd64 2.19-0ubuntu6.4 2.19-0ubuntu6.5
2015-01-19 20:32:50 upgrade libc-bin:amd64 2.19-0ubuntu6.4 2.19-0ubuntu6.5
2015-01-19 20:32:53 upgrade libc6:i386 2.19-0ubuntu6.4 2.19-0ubuntu6.5
2015-01-19 20:32:56 upgrade libc6:amd64 2.19-0ubuntu6.4 2.19-0ubuntu6.5
2015-01-19 20:33:05 upgrade libclutter-gtk-1.0-0:amd64 1.4.4-3ubuntu2 1.4.4-3ubuntu2.2
2015-01-19 20:33:06 upgrade libssh-4:amd64 0.6.1-0ubuntu3 0.6.1-0ubuntu3.1
2015-01-19 20:33:07 upgrade multiarch-support:amd64 2.19-0ubuntu6.4 2.19-0ubuntu6.5
2015-01-19 20:33:10 upgrade libcgmanager0:i386 0.24-0ubuntu7.1 0.24-0ubuntu7.2
2015-01-19 20:33:11 upgrade libcgmanager0:amd64 0.24-0ubuntu7.1 0.24-0ubuntu7.2
2015-01-19 20:33:12 upgrade linux-firmware:all 1.127.10 1.127.11
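For anyone narrowing down a similar regression, the lines above follow the `date time upgrade package:arch old-version new-version` layout of /var/log/dpkg.log, so the set of packages upgraded in a given time window can be pulled out mechanically. What follows is only a minimal sketch under that assumption; the function names `parse_upgrades` and `upgrades_between` are made up for illustration and are not part of any Ubuntu tool.

```python
from datetime import datetime


def parse_upgrades(lines):
    """Return (timestamp, package, old_version, new_version) per 'upgrade' line."""
    upgrades = []
    for line in lines:
        fields = line.split()
        # Assumed layout: date, time, the word 'upgrade', package:arch, old, new.
        if len(fields) != 6 or fields[2] != "upgrade":
            continue
        stamp = datetime.strptime(fields[0] + " " + fields[1], "%Y-%m-%d %H:%M:%S")
        upgrades.append((stamp, fields[3], fields[4], fields[5]))
    return upgrades


def upgrades_between(lines, start, end):
    """Keep only the upgrades whose timestamp lies in [start, end]."""
    return [entry for entry in parse_upgrades(lines) if start <= entry[0] <= end]
```

Feeding it the lines of dpkg.log (for example via `open("/var/log/dpkg.log").read().splitlines()`) together with the start and end of the suspect window gives the candidate packages to downgrade or inspect first.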

I'm unable to debug this any further because the desktop system is in another part of the country. I got the error report in a phone call from a non-technical, non-English-speaking user, on whose machine I had installed Kubuntu. After a while I got him to boot into the recovery prompt and enable SSH access for me so I could downgrade these packages. After the downgrade the system restarted and came up fine.

The user reported that the boot process was stuck at a "Starting configuring network connections" prompt or something. I'm not entirely sure, but I might have heard him muttering something about some cgroup messages.

Anyway, I still have SSH access, but I can't do on-site debugging until February 7, and I'd rather not restart via SSH and risk having to contact the user to recover from a broken boot.

Jaak Ristioja (jotik) wrote :

I remembered that another error message during boot was something about being unable to mount /home. Since there was nothing in the logs, I think that the boot process failed before properly mounting filesystems from /etc/fstab.

Serge Hallyn (serge-hallyn) wrote :

Is it possible to attach /var/log/syslog, /var/log/apt/term.log, /var/log/dpkg.log, and a tarball of /var/log/upstart/ ? (If so, please mark this bug private before attaching them).
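In case it helps with gathering these, here is a rough sketch of packing a log directory into a tarball while leaving out files that should not be shared. It uses only Python's standard `tarfile` module; the function name `make_log_tarball` and the `skip_prefixes` parameter are hypothetical and chosen just for this example.

```python
import os
import tarfile


def make_log_tarball(src_dir, out_path, skip_prefixes=()):
    """Pack src_dir into a gzip tarball, skipping files whose name starts with a prefix."""
    prefixes = tuple(skip_prefixes)

    def keep(info):
        # Returning None from a tarfile filter drops that member from the archive.
        if info.isfile() and os.path.basename(info.name).startswith(prefixes):
            return None
        return info

    top_level = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(src_dir, arcname=top_level, filter=keep)
    return out_path
```

Something like `make_log_tarball("/var/log/upstart", "upstart-logs.tar.gz", skip_prefixes=("ureadahead",))` would then produce the archive without any logs whose names begin with "ureadahead". Reading the logs will typically require root.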

Serge Hallyn (serge-hallyn) wrote :

Bug may not be in the kernel package, but it seems very unlikely to be in cgmanager, and more clueful people are likely to see it this way.

affects: cgmanager (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
status: New → Incomplete
Jaak Ristioja (jotik) wrote :

I can attach the logs, but please note that these do not cover the events of the unsuccessful bootups, because / was mounted as read-only and hence no logs were saved.

Jaak Ristioja (jotik) wrote :

/var/log/upstart excluding the ureadahead logs which contained sensitive information.

information type: Public → Private
Serge Hallyn (serge-hallyn) wrote :

Just to be clear (and separate this from another bug in my mind) - did the user try powering the machine off and on before the packages were downgraded, and it always failed to boot?

Serge Hallyn (serge-hallyn) wrote :

Also, could the user reboot a few more times to make sure that reboots are in fact reliable with the downgraded packages?

Jaak Ristioja (jotik) wrote :

Ok, after a remote session with the user, I got the following info:

> Just to be clear (and separate this from another bug in my mind) - did the user try powering the machine off and on before the packages were downgraded, and it always failed to boot?

Yes, so he held the power button for 4 secs to reboot or something like that to try to restart. And it failed every time.

> Also, could the user reboot a few more times to make sure that reboots are in fact reliable with the downgraded packages?

Hmm... even with downgraded packages it didn't reboot any more... oh well, doing an "apt-get upgrade" then.

Jaak Ristioja (jotik) wrote :

However, I managed to extract a new set of logs, which may help solve this. For one, there are errors about / being read-only.

Changed in linux (Ubuntu):
status: Incomplete → New
Serge Hallyn (serge-hallyn) wrote :

The network-interface-security logfile shows errors which come from the /sbin/apparmor_parser program:

Warning: unable to find a suitable fs in /proc/mounts, is it mounted?
Use --subdomainfs to override.

This suggests securityfs is not available. So it's more than / being read-only; mountall is having serious problems.
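Both conditions can be checked from a recovery shell by reading /proc/mounts, whose lines have the form `device mountpoint fstype options dump pass`. The snippet below is only a sketch of such a check, assuming the warning really is about a missing securityfs entry; the helper names are invented for illustration.

```python
def parse_mounts(mounts_text):
    """Yield (device, mountpoint, fstype, options) from /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            yield fields[0], fields[1], fields[2], fields[3].split(",")


def has_fs_type(mounts_text, fstype):
    """True if any mounted filesystem has the given type, e.g. 'securityfs'."""
    return any(entry[2] == fstype for entry in parse_mounts(mounts_text))


def is_read_only(mounts_text, mountpoint):
    """Read-only state of the last entry for mountpoint, or None if it is not listed."""
    result = None
    for _device, point, _fstype, options in parse_mounts(mounts_text):
        if point == mountpoint:
            # Later entries shadow earlier ones (e.g. rootfs, then the real root).
            result = "ro" in options
    return result


def read_proc_mounts(path="/proc/mounts"):
    """Return the current mount table as text."""
    with open(path) as handle:
        return handle.read()
```

On the affected machine, `has_fs_type(read_proc_mounts(), "securityfs")` returning False together with `is_read_only(read_proc_mounts(), "/")` returning True would match the symptoms described here.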

You say that reboots are not reliable with downgraded packages either? So this really sounds like there is a race in the startup.

Has another 'apt-get upgrade' or 'apt-get dist-upgrade' helped?

Jaak Ristioja (jotik) wrote :

Yes, reboots with downgraded packages were not reliable either.

I haven't remotely rebooted the system, because the users need the desktop to be fully functional. As I said earlier, I'm not on-site. I will be on-site February 6 13:00-19:00 UTC. Until then I will not risk any remote reboots.

Jaak Ristioja (jotik) wrote :

I'm on-site, I updated a bunch of packages and I'm now unable to reproduce the issue. Attaching the dpkg.log entries for the entire time-span for this bug report.

Serge Hallyn (serge-hallyn) wrote :

I've seen no sensitive data in the logfiles, so re-marking the bug public.

information type: Private → Public
Changed in linux (Ubuntu):
importance: Undecided → High
Serge Hallyn (serge-hallyn) wrote :

Since you're now unable to reproduce, I guess I'll mark this fix released. Please re-open if it re-occurs.

When you say 'unable to reproduce', you did try several reboots which all succeeded?

Changed in linux (Ubuntu):
status: New → Fix Released
Jaak Ristioja (jotik) wrote :

I tried several reboots and several shutdowns all of which succeeded.

The issues started on January 20, after the system updates on January 19. I don't know how the Ubuntu startup system (Upstart) works, but my guess is that one of those updates caused the race, and a subsequent update caused the race to disappear. Maybe some of those updates cause upstart service/task dependencies to be recalculated/cached, and this failed on January 19 (causing the unbootable system). A subsequent update may have attempted the same and succeeded, fixing the boot.

Have we eliminated this possibility so that the resolution to mark this bug report as "Fix Released" is valid?

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1412671] Re: [14.04] Boot stuck in "Starting configuring network devices" after upgrade

If you prefer we can mark it Incomplete meaning we're waiting for more
info (i.e. other people running into it, or it re-occurring on the original
system) to be able to debug.

There were other known badnesses in the updates on January 20, which is
why I'm pretty sure this is in fact fix-released. But I'll change it to
Incomplete. Please let us know if something shows up again.

 status: incomplete

Changed in linux (Ubuntu):
status: Fix Released → Incomplete
Jaak Ristioja (jotik) wrote :

The users complained that the issue re-occurred today. I remotely upgraded the system yesterday (apt-get dist-upgrade), so it's very likely something in the upgrade process broke the system again. They will bring the machine to me today, so I will have physical access. If we don't manage to fix it today, I'll probably switch it to Debian or back to 12.04.

Jaak Ristioja (jotik) wrote :

I will confirm this from the logs when I get access to the machine, but if I remember correctly, at least one of the updates was to libc, just as the previous time this happened.

Jaak Ristioja (jotik) wrote :

Ok, never mind. This seems to be another issue. Namely, /home fails to mount at boot for some reason: "The disk drive for /home is not ready yet or not present. Continue to wait; or press s to skip mounting or M for manual recovery." The workaround appears to be to delete all LVM snapshot volumes.

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired