[Lucid] fsck cannot be cancelled in Plymouth

Bug #562811 reported by Swâmi Petaramesh
This bug affects 16 people
Affects              Status        Importance  Assigned to                      Milestone
mountall (Ubuntu)    Fix Released  High        Scott James Remnant (Canonical)
  Lucid              Fix Released  High        Scott James Remnant (Canonical)
plymouth (Ubuntu)    Fix Released  High        Steve Langasek
  Lucid              Fix Released  High        Steve Langasek

Bug Description

Binary package hint: plymouth

During fsck, Plymouth displays "Press C to cancel checks" but this is inoperative.

Neither lowercase nor uppercase "C" has any effect ([Esc] has no effect either); fsck runs to completion, and then the system boot continues.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: plymouth 0.8.1-4ubuntu1
ProcVersionSignature: Ubuntu 2.6.32-20.30-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-20-generic i686
Architecture: i386
Date: Wed Apr 14 08:43:53 2010
DefaultPlymouth: /lib/plymouth/themes/kubuntu-logo/kubuntu-logo.plymouth
MachineType: ASUSTeK Computer INC. 1005PE
ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.32-20-generic root=/dev/mapper/VG1-UBUNTU ro clocksource=hpet acpi_osi=Linux acpi_backlight=vendor quiet splash
ProcEnviron:
 LANGUAGE=
 PATH=(custom, user)
 LANG=fr_FR.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
SourcePackage: plymouth
TextPlymouth: /lib/plymouth/themes/ubuntu-text/ubuntu-text.plymouth
dmi.bios.date: 03/18/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1003
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: 1005P
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: x.xx
dmi.chassis.asset.tag: 0x00000000
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTeK Computer INC.
dmi.chassis.version: x.x
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1003:bd03/18/2010:svnASUSTeKComputerINC.:pn1005PE:pvrx.x:rvnASUSTeKComputerINC.:rn1005P:rvrx.xx:cvnASUSTeKComputerINC.:ct10:cvrx.x:
dmi.product.name: 1005PE
dmi.product.version: x.x
dmi.sys.vendor: ASUSTeK Computer INC.

Swâmi Petaramesh (swami-petaramesh) wrote :

Steve Langasek (vorlon) wrote :

This appears to be reproducible for me here.

Changed in plymouth (Ubuntu):
importance: Undecided → High
status: New → Triaged

Michael Wayne Goodman (goodmami) wrote :

I believe this is the same problem I'm experiencing. I have not found a solution and must reinstall Ubuntu. I notice that before the killer-reboot there are several other symptoms (perhaps red herrings). The first one was an XKB error, and now I am having a strange issue where modal windows (e.g. the Edit Connections or Reboot windows) display nothing in the form, just the window border and titlebar. I also noticed that while I could connect to wireless networks, no data would be transmitted, as far as I could tell: ping tests failed without even sending anything (host unknown error). These symptoms did not appear except when rebooting resulted in the (seemingly) infinite loop of failed fsck checks.

Michael Wayne Goodman (goodmami) wrote :

Upon further inspection it seems that the primary error I'm experiencing is different from the described bug, although I also experience the described one (being unable to cancel fsck with C). I was able to recover my root partition by booting into a live CD (well, USB stick) and running fsck from there, fixing the errors. I don't know why, but fsck was continuously failing while being run in Plymouth. Every time it found an error, it reported some failure (sorry, I don't recall what) and restarted the disk check. As a result, it would say "Checking disk (n-1/n)", and every time the check failed it would increment n (e.g. 1/2, 2/3, etc.). Should I file a separate bug for this under Plymouth, or does it concern a different project?

Steve Langasek (vorlon) wrote :

Michael,

That sounds like bug #501801, fixed in mountall 2.13. If you can recover the system (probably easiest to boot from external media and run fsck by hand), and install this update, it should fix the problem for you.

Steve Langasek (vorlon) wrote :

I've traced plymouth for this; plymouth is correctly detecting and sending the keypress, but mountall is ignoring it. Reassigning to mountall.

I suspect a bug in plymouth_answer() regarding plymouth_mnt->error vs. plymouth_error.

affects: plymouth (Ubuntu) → mountall (Ubuntu)

Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 562811] Re: [Lucid] fsck cannot be cancelled in Plymouth

On Tue, 2010-04-20 at 16:42 +0000, Steve Langasek wrote:

> I've traced plymouth for this; plymouth is correctly detecting and
> sending the keypress, but mountall is ignoring it. Reassigning to
> mountall.
>
> I suspect a bug in plymouth_answer() regarding plymouth_mnt->error vs.
> plymouth_error.
>
> ** Package changed: plymouth (Ubuntu) => mountall (Ubuntu)
>
If you've been able to replicate this (I haven't), could you trace the
mountall side too? Alternatively let me know how to replicate it :-)

Scott
--
Scott James Remnant
<email address hidden>

Changed in mountall (Ubuntu):
status: Triaged → Confirmed

Michael Wayne Goodman (goodmami) wrote :

Thanks Steve, bug #501801 is exactly the problem I'm experiencing. I suppose the other, pre-reboot, symptoms I was experiencing were because of the errors in / caused by an upgrade (which sounds like another bug...). I'll look for the update.

Steve Langasek (vorlon) wrote :

OK, after tracing back and forth, I think it's a plymouth bug after all.

The problem appears to be in ply_boot_client_process_incoming_replies: whenever it receives data from the server, it calls ply_list_get_first_node (client->requests_waiting_for_replies) and passes the returned data to this client requester. This means that if watch_for_keystroke() is called, and then another request is made that generates a reply, the keystroke handler gets the reply to the *next* request, discards it as "Received odd keys <foo>", and never gets the real keypress when it happens!

So I guess our requests_waiting_for_replies list needs to be tagged with the type of reply each one is expecting; the responses will be FIFO within each class, but may be out of order wrt responses of other types.
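
To make the proposal above concrete, here is a minimal sketch of a pending-request list tagged with the reply type each request expects. It is not the actual libplymouth code: every name in it (reply_type_t, pending_queue_t, queue_push, queue_dispatch, capture_reply) is hypothetical, and the two reply classes are an assumption made for illustration. Replies are delivered FIFO within a type, but a keystroke watch registered earlier is no longer handed the reply to a later request.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reply classes; the real protocol may define others. */
typedef enum { REPLY_ACK, REPLY_KEYSTROKE } reply_type_t;

typedef void (*reply_handler_t)(const char *payload, void *user_data);

typedef struct {
    reply_type_t    expects;    /* class of reply this request waits for */
    reply_handler_t handler;
    void           *user_data;
} pending_request_t;

#define MAX_PENDING 16

typedef struct {
    pending_request_t slots[MAX_PENDING];  /* kept in submission order */
    size_t            count;
} pending_queue_t;

/* Append a request; returns 0 on success, -1 if the queue is full. */
static int queue_push(pending_queue_t *q, reply_type_t expects,
                      reply_handler_t handler, void *user_data)
{
    if (q->count == MAX_PENDING)
        return -1;
    q->slots[q->count++] = (pending_request_t){ expects, handler, user_data };
    return 0;
}

/* Hand a reply to the OLDEST request expecting this type, rather than to
 * whatever request happens to be first in the list.
 * Returns 0 if a handler was found, -1 otherwise. */
static int queue_dispatch(pending_queue_t *q, reply_type_t type,
                          const char *payload)
{
    for (size_t i = 0; i < q->count; i++) {
        if (q->slots[i].expects != type)
            continue;
        pending_request_t req = q->slots[i];
        /* Close the gap so the remaining requests keep their order. */
        memmove(&q->slots[i], &q->slots[i + 1],
                (q->count - i - 1) * sizeof q->slots[0]);
        q->count--;
        if (req.handler)
            req.handler(payload, req.user_data);
        return 0;
    }
    return -1;
}

/* Demo handler: records the payload it was given. */
struct capture { char text[32]; int calls; };

static void capture_reply(const char *payload, void *user_data)
{
    struct capture *c = user_data;
    snprintf(c->text, sizeof c->text, "%s", payload);
    c->calls++;
}
```

With a keystroke watch pushed first and an ordinary request pushed second, dispatching the ordinary acknowledgement leaves the keystroke watch untouched, and a later keystroke reply still reaches it.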

And the reason other keys aren't missed (for 'skip' or 'maintenance shell') is that only during fsck do we send more messages to plymouth after setting a key watch.

Scott, given this, I'm not sure why it *wasn't* reproducible for you... am I overlooking something?

affects: mountall (Ubuntu) → plymouth (Ubuntu)

Steve Langasek (vorlon) wrote :

Turns out that changes are *also* needed to mountall for this; as soon as I got libplymouth2 fixed to pass the right responses to the right callbacks, mountall started segfaulting. ;) Fix committed to bzr.

Changed in mountall (Ubuntu):
importance: Undecided → High
status: New → Fix Committed
Steve Langasek (vorlon)
Changed in plymouth (Ubuntu):
status: Confirmed → Fix Committed
Changed in mountall (Ubuntu Lucid):
assignee: nobody → Steve Langasek (vorlon)
Changed in plymouth (Ubuntu Lucid):
assignee: nobody → Steve Langasek (vorlon)

Scott James Remnant (Canonical) (canonical-scott) wrote :

Your fix was wrong and introduced bugs (and it isn't at all obvious to me why this would cause a segfault) - I've had to revert it.

Changed in mountall (Ubuntu Lucid):
status: Fix Committed → Triaged

Scott James Remnant (Canonical) (canonical-scott) wrote :

Chatted with Steve on IRC, and realised why there was a segfault; new fix applied.

Changed in mountall (Ubuntu Lucid):
status: Triaged → Fix Committed
assignee: Steve Langasek (vorlon) → Scott James Remnant (scott)

Jim Robinson (jimbo2150) wrote :

I am having a similar issue, but ESC did cancel the disk check and continued booting Ubuntu.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package plymouth - 0.8.2-2ubuntu1

---------------
plymouth (0.8.2-2ubuntu1) lucid; urgency=low

  * src/main.c: if the splash screen isn't up yet, queue message requests
    instead of discarding them. LP: #507881.
  * src/client/ply-boot-client.c: some replies may be sent out of order
    because they depend on user input, so pay attention to the message type
    when picking the handler instead of handing the response to the first
    handler in the list; without this, cancelling fsck in mountall will
    never work. LP: #562811.
  * src/client/ply-boot-client.c: instead of trying to read from the server
    pipe if there are any outstanding requests, call
    ply_event_loop_process_pending_events() which already knows whether we
    can read from the pipe. LP: #559761.
  * add the pixel display bpp symbols to libplymouth2.symbols with a correct
    version, so that packages using them don't wind up with overly-strict
    dependencies on libplymouth2.
 -- Steve Langasek <email address hidden> Sun, 25 Apr 2010 16:15:37 +0100
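
The first changelog entry (queue message requests until the splash screen is up, rather than discarding them) can be illustrated with a minimal sketch. This is not the code from src/main.c: the names message_state_t, display_message, splash_became_ready and append_to_log are hypothetical, and the linked-list queue is just one plausible way to hold the messages.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct queued_message {
    char                  *text;
    struct queued_message *next;
} queued_message_t;

typedef struct {
    int               splash_is_up;
    queued_message_t *head, *tail;   /* messages waiting for the splash */
    void            (*show)(const char *text, void *user_data);
    void             *user_data;
} message_state_t;

/* Show the message now if the splash is up; otherwise keep a copy for
 * later instead of dropping it. Returns 0 on success, -1 on allocation
 * failure. */
static int display_message(message_state_t *s, const char *text)
{
    if (s->splash_is_up) {
        s->show(text, s->user_data);
        return 0;
    }
    queued_message_t *m = malloc(sizeof *m);
    if (m == NULL)
        return -1;
    m->text = malloc(strlen(text) + 1);
    if (m->text == NULL) {
        free(m);
        return -1;
    }
    strcpy(m->text, text);
    m->next = NULL;
    if (s->tail != NULL)
        s->tail->next = m;
    else
        s->head = m;
    s->tail = m;
    return 0;
}

/* Called once the splash is ready: replay queued messages in order. */
static void splash_became_ready(message_state_t *s)
{
    s->splash_is_up = 1;
    while (s->head != NULL) {
        queued_message_t *m = s->head;
        s->head = m->next;
        s->show(m->text, s->user_data);
        free(m->text);
        free(m);
    }
    s->tail = NULL;
}

/* Demo sink: appends each shown message, followed by ';', to a buffer. */
struct message_log { char text[128]; };

static void append_to_log(const char *text, void *user_data)
{
    struct message_log *l = user_data;
    size_t used = strlen(l->text);
    snprintf(l->text + used, sizeof l->text - used, "%s;", text);
}
```

Messages sent before splash_became_ready() are held back and replayed in their original order; messages sent afterwards are shown immediately.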

Changed in plymouth (Ubuntu Lucid):
status: Fix Committed → Fix Released

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mountall - 2.14

---------------
mountall (2.14) lucid; urgency=low

  [ Scott James Remnant ]
  * Flush updates to Plymouth before emitting Upstart events, in case
    the event kills Plymouth. LP: #559761.
  * Don't mark a filesystem "nodev" just because it's got "none" in the
    device column; this will block the "virtual-filesystems" event which
    is the one that can't use Plymouth to prompt. LP: #507881.
  * When cancelling in-progress fsck, don't dereference the NULL mount
    record. LP: #562811.
  * mountall is missing a very important line of code that increases the
    udev buffer size; without this it's possible we may miss events during
    busy periods. LP: #561390.

  [ Steve Langasek ]
  * If we're not marking all nodev filesystems as virtual, we need to
    at least mark our placeholder filesystem entries (type=none && dev=none)
    this way.
 -- Steve Langasek <email address hidden> Sun, 25 Apr 2010 23:36:01 +0100

Changed in mountall (Ubuntu Lucid):
status: Fix Committed → Fix Released