fsck progress stalls at boot, plymouthd/mountall eats CPU

Bug #571707 reported by mikbini
This bug affects 153 people
Affects             Status         Importance   Assigned to        Milestone
Linux Mint          Fix Released   High         Clement Lefebvre
mountall (Ubuntu)   Fix Released   High         Unassigned
  Lucid             Fix Released   High         Unassigned
plymouth (Ubuntu)   Invalid        Undecided    Unassigned
  Lucid             Invalid        Undecided    Unassigned

Bug Description

PROBLEM

When a disk check is performed, the progress stalls somewhere around 70% and will then take a very long time finishing the remaining percent (10 minutes or more).

PATCH

A patch for mountall has now been pushed as an update for Lucid. If you are still seeing this problem, make sure you have mountall 2.15 installed before commenting or reporting a new bug.

[Earlier patch comments:]
Tero Mononen has published a patch for Bug #553745 which applies to the issue described here as well (see https://bugs.launchpad.net/ubuntu/+source/plymouth/+bug/553745/comments/76 and https://bugs.launchpad.net/ubuntu/+source/plymouth/+bug/553745/comments/77 )

I have created corresponding packages which are available through my PPA: https://launchpad.net/~arand/+archive/unstable

!!!Do note that this is an unofficial, untested, preliminary patch!!!
However testing and feedback is welcome, please especially report if there are ANY (new) problems seen when using the patched version.

TEST CASE:

(sudo aptitude install bootchart)
sudo touch /forcefsck && sudo reboot

POSSIBLE TEMPORARY WORKAROUNDS

1. Removing "quiet" and "splash" from the kernel boot line

2. When the progress has stalled, switch away from the splash screen using the left arrow key (presumably any arrow key works).

* Both of these approaches speed up the boot process to ~1 minute instead.

OBSERVATIONS

The fsck message "(...) non-contiguous (...)", which I assume indicates the end of the fsck, is printed in the Virtual Terminal ("outside" plymouth) at around 70% + ~10-20 seconds.

Disk activity is null from this point on (presumed end of fsck above).

Bootchart crashes if trying to catch the whole boot at once with plymouth (at least for my 1h boot).

This problem seems to occur in both plymouthd and mountall, semi-simultaneously:
If you are in the plymouth screen, plymouthd is the CPU-gobbler; if you switch away from it using the arrow keys, mountall takes over the CPU-eating instead.

#####

ORIGINAL REPORT

Binary package hint: mountall

On my system when fsck runs at boot plymouth % completion count goes up quickly (<10 seconds) up to about 80% and then slows down considerably: the complete fsck of my 125GB HD, 30% full takes more than 5 minutes.

While this goes on the text VTs are all completely blank: just a blinking cursor.

An fsck from a recovery disk completes in ~10 seconds so it doesn't look like "fsck just being slow".

This slowdown was *not* happening on 2010-04-14 with the PPA described by this comment: https://bugs.launchpad.net/ubuntu/+source/plymouth/+bug/554737/comments/25

The fix in the PPA is now in the mainline lucid but somewhere in between then and today (2010-04-29) something introduced this slowdown.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: mountall 2.14
ProcVersionSignature: Ubuntu 2.6.32-21.32-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-21-generic i686
NonfreeKernelModules: wl
Architecture: i386
Date: Thu Apr 29 15:38:56 2010
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta i386 (20100317.1)
ProcEnviron:
 LANGUAGE=en_IE:en_GB:en
 PATH=(custom, user)
 LANG=en_IE.utf8
 SHELL=/bin/bash
SourcePackage: mountall

Revision history for this message
mikbini (mikbini) wrote :
Revision history for this message
Ernst (ernst-blaauw) wrote :

As four people are affected, I set the status to confirmed.

I experience this behavior on 64 bit. The 'problematic' area starts at 74%. I'm running a fully up to date Lucid.

Changed in mountall (Ubuntu):
status: New → Confirmed
mikbini (mikbini)
description: updated
Revision history for this message
D J Eddyshaw (david-eddyshaw) wrote :

I get this on a Dell Mini 9; the slowdown starts at 70% and gets even worse at 90%, to the point that I eventually just shut down the machine.

The odd behaviour from 70% on has been present ever since I installed the beta. Initially there was a complete hang at 70%; this was "fixed" (#554737) inasmuch as the machine no longer locked up, but there was clearly something still wrong; fsck got to 70% and stopped and then the login screen came up.

Over the past couple of days something has changed; fsck proceeds beyond 70% but impossibly slowly.

Revision history for this message
Fabio Marzocca (thesaltydog) wrote :

I have been waiting 7 minutes now for fsck to finish while booting my other PC... still at 95%, very very slow.

Revision history for this message
Fabio Marzocca (thesaltydog) wrote :

This is not happening only on ext4: on ext3 it is even worse.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Thanks for the report, guys.

It _sounds_ like this could be a plymouth issue, but I'd like to collect some more information to be sure.

Could you install the "bootchart" package, reboot (with fsck forced, I guess), and attach the resulting image from /var/log/bootchart?

Thanks

Changed in mountall (Ubuntu):
importance: Undecided → High
Revision history for this message
D J Eddyshaw (david-eddyshaw) wrote :

Bootchart logs

Revision history for this message
Barry Drake (b-drake) wrote :

Dell Inspiron Mini 10v 8Gb SSD running Lucid 10.04 with latest updates.
I have exactly the same fault. As with the other reports, three or so updates back, the boot process froze completely. Taking out quiet splash showed that it froze after fsck had completed. The last two updates have stopped it freezing, but fsck now takes in excess of 20 mins to complete on this netbook.

My suggestion for now would be to kill plymouth when /forcefsck is detected. I don't think this would be a popular suggestion, but for me, it would be nice!
Barry.

Revision history for this message
Barry Drake (b-drake) wrote :

Forgot to say: I'm running an ext2 partition!!!

Revision history for this message
Martin Erik Werner (arand) wrote :

I'm sorry if this ends up as a hijack, but I'm assuming this is the same issue...

This seems to be a very common thing.
I would suspect plymouth for the problem, since if you do jump out to a TTY during this then the boot seems to complete nicely.

In fact, if you jump to a tty and then jump back to plymouth, it's noticeable that it has managed to get much further in the percentage than it would have if you had just stayed on the plymouth screen...

Each time I jump to tty the normal "fsck...clean...non-contiguous #%" message is repeated (one extra each jump) which presumably indicates that fsck has finished, and plymouth (or something else) is messing about with other things...

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

srv-1-lucid-20100430-1.tgz and srv-1-lucid-20100430-1.png
'quiet splash' set in grub kernel commandline
takes 6:30 to complete

srv-1-lucid-20100430-2.tgz and srv-1-lucid-20100430-2.png
*NO* 'quiet splash' set in grub kernel commandline
takes 00:26 to complete

Revision history for this message
Martin Erik Werner (arand) wrote :

Also notable is that if I boot with quiet and splash disabled, everything is fine and I'm up in less than a minute.

One thing worth noticing is that no fsck progress is given when booting without the splash.

All my testing done on a virtualbox instance of Lucid

Revision history for this message
Barry Drake (b-drake) wrote :

Here is the picture from bootchart. Again, boot process completed after some 20mins or so. Bootchart also produces a compressed archive containing some logs - do you need this as well?

Revision history for this message
Barry Drake (b-drake) wrote :

Just to confirm - with quiet splash removed from grub, my netbook boots in seconds rather than minutes when fsck is forced.

Revision history for this message
Anders Kaseorg (andersk) wrote :

I saw this too, and can reproduce with touch /forcefsck; reboot. Here’s a bootchart; it shows mountall spinning at 100% CPU for about 15s after the first fsck finishes, and again for over 200s after the last fsck finishes. I wonder what it’s doing with all that CPU…

Revision history for this message
Anders Kaseorg (andersk) wrote :
description: updated
Revision history for this message
Anders Kaseorg (andersk) wrote :

I think removing ‘quiet splash’ is a red herring. I can reproduce the problem by creating /forcefsck, whether or not ‘quiet splash’ is in the boot flags. Here is a bootchart without ‘quiet splash’ that demonstrates the same problem (mountall spins at 100% CPU for 200 seconds after all the fscks are complete).

description: updated
description: updated
Revision history for this message
Martin Erik Werner (arand) wrote :

@Anders Kaseorg:
Sorry, I was in the process of updating the bug description and inadvertently overwrote your changes.

I am however most definitely able to work around the issue removing quiet and splash...

Maybe we're even bunching two or more separate bugs here...

Revision history for this message
Martin Erik Werner (arand) wrote :

Yup, just tested now, and disabling quiet and splash makes this virtualbox able to boot with no problem...

I'm trying to figure out the arrow-out workaround now... it seems to be very fickle.

description: updated
description: updated
description: updated
Revision history for this message
Anders Kaseorg (andersk) wrote :

Here’s a backtrace from mountall while it is spinning. It starts with
#0 0x00007ff69c9fc6e3 in ply_list_find_node (list=0x7ff69e0af860,
    data=0x7ff69faf6460) at ply-list.c:105

Revision history for this message
Martin Erik Werner (arand) wrote :

Comparing letstrynl's and Anders Kaseorg's bootcharts, it seems like there are two separate issues here.
On letstrynl's bootchart it's plymouthd that's eating the CPU, whereas on Anders Kaseorg's it's mountall.

This could account for our disagreement as to the workarounds.

We should maybe split off the plymouthd instance into a new bug, to avoid confusion.

Revision history for this message
Martin Erik Werner (arand) wrote :

Okay. Let me first start out retracting pretty much everything I've said so far... there, now let's start anew:

(Using a virtualbox Lucid 32bit guest on 32bit Karmic host)

PROBLEM

When a disk check is performed, the progress stalls somewhere around 70% and will then take a very long time finishing the remaining percent (in my case, around an hour).

TEST CASE:

(sudo aptitude install bootchart)
sudo touch /forcefsck && sudo reboot

WORKAROUNDS

1. Removing "quiet" and "splash" from the kernel boot line

2. When the progress has stalled, switch away from the splash screen using the left arrow key (presumably any arrow key works).

* Both of these approaches speed up the boot process to ~1 minute instead.

OBSERVATIONS

The fsck message "somethingsomething non-contiguous somethingsomething", which I assume indicates the end of the fsck, is printed in the Virtual Terminal (not plymouth) at around 70% + ~10-20 seconds.

Disk activity is null from this point on (presumed end of fsck above).

Bootchart crashes if trying to catch the whole boot at once with plymouth (at least for my 1h boot).

This problem seems to occur in both plymouthd and mountall, semi-simultaneously:
If you are in the plymouth screen, plymouthd is the CPU-gobbler; if you switch away from it using the arrow keys, mountall takes over the CPU-eating instead.

BOOTCHARTS
(attached along with complete bootchart log as arand_bootcharts.tar.gz)

0arand_clean
######
Reference clean boot, with plymouth and no fsck.

1arand_switch_to_vt_early
######
In this boot I switched to VT (arrow key out from splash) quite early, as seen in that the shift in CPU-hogging from plymouthd to mountall happens early. mountall takes a little over 100 seconds to finish.

2arand_switch_to_vt_later
######
In this boot I switched to VT later on.
It might be noteworthy that the time that mountall cpu-hogs is approximately the same (100s)

3arand_no_quiet_splash
#####
mountall still hogs the cpu, but for a considerably shorter time, overall boot finishes much faster.

Please do tell if there is anything else useful I could provide.

description: updated
summary: - fsck at bootstrap is too slow
+ fsck progress stalls at boot, plymouthd/mountall eats CPU
frnstefano (frnstefano)
description: updated
Revision history for this message
frnstefano (frnstefano) wrote :

The installation on my desktop computer was a fresh install and it is NOT AFFECTED by this bug.
The installation on my laptop is an upgrade from karmic and it is AFFECTED by this bug.
Both installations were done starting from the release candidate and are now up to date.

could the bug be related to DISTRIBUTION UPGRADE (i.e. from karmic)?

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

I did a fresh installation on my desktop machine (64-bit) also.
Still has the bug. Removing 'splash' fixed the problem, as before.

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

Could it be that it is hardware related?

commands I used:
  bios: hwinfo --bios | grep Socket:
  video: hwinfo --gfxcard | grep Modules:

desktop 64-bit:
  bios: Socket: "LGA1366"
  video: Driver Modules: "nvidia"

server 64-bit:
  bios: Socket: "Socket437"
  graphics: Model: "Intel 945G"

frnstefano, can you check this, and tell us if you're using the 'nouveau' driver?

Revision history for this message
Martin Erik Werner (arand) wrote :

I would guess that it is not hardware related, since I'm seeing this on a virtualbox 32bit.
This one is not an upgraded system but it has been around since beta somewhere.

Revision history for this message
pelm (pelle-ekh) wrote :

I'm having exactly the same problem: fsck stalls at 70% and then finishes after a very long time. I've upgraded from karmic and use the nvidia closed driver, not the nouveau one. The process eats my CPU and it's very disturbing.

Revision history for this message
frnstefano (frnstefano) wrote :

i will post something more complete in late afternoon because i cannot connect to the desktop system now.

laptop 32-bit:
$ hwinfo --bios |grep Socket
    Socket: "uFCPGA2"
    Socket Type: 0x04 (ZIF Socket)
    Socket Status: Populated
    Location: 0x00 (Internal, Not Socketed)
$ hwinfo --gfxcard
22: PCI(AGP) 100.0: 0300 VGA compatible controller (VGA)
  [Created at pci.318]
  UDI: /org/freedesktop/Hal/devices/pci_1002_4e50
  Unique ID: VCu0.PkvpIaxQqbF
  Parent ID: vSkL.oF7y00qHwA3
  SysFS ID: /devices/pci0000:00/0000:00:01.0/0000:01:00.0
  SysFS BusID: 0000:01:00.0
  Hardware Class: graphics card
  Model: "ATI RV350 NP"
  Vendor: pci 0x1002 "ATI Technologies Inc"
  Device: pci 0x4e50 "RV350 NP"
  SubVendor: pci 0x1025 "Acer Incorporated [ALI]"
  SubDevice: pci 0x005d
  Memory Range: 0xd8000000-0xdfffffff (rw,prefetchable)
  I/O Ports: 0x3000-0x3fff (rw)
  Memory Range: 0xd0100000-0xd010ffff (rw,non-prefetchable)
  Memory Range: 0xd0120000-0xd013ffff (ro,prefetchable,disabled)
  IRQ: 11 (116070 events)
  I/O Ports: 0x3c0-0x3df (rw)
  Module Alias: "pci:v00001002d00004E50sv00001025sd0000005Dbc03sc00i00"
  Driver Info #0:
    XFree86 v4 Server Module: radeon
  Driver Info #1:
    XFree86 v4 Server Module: radeon
    3D Support: yes
    Extensions: dri
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #11 (PCI bridge)

I have this fsck stall problem with both dri (KMS disabled) and dri2.

desktop 64-bit:
Core Duo Q9400 (socket 775)
Ati Radeon 4870x2 (fglrx driver from ubuntu packages)

I hope this helps for the moment... I'll post more information about the 64-bit system later.

Revision history for this message
D J Eddyshaw (david-eddyshaw) wrote :

Interesting thought that this could be related to upgrading from Karmic rather than a fresh install. Both my machines suffer from the problem and were upgraded, for what it's worth. How about everyone else?

Revision history for this message
Chow Loong Jin (hyperair) wrote : Re: [Bug 571707] Re: fsck progress stalls at boot, plymouthd/mountall eats CPU

On Monday 03,May,2010 06:49 PM, D J Eddyshaw wrote:
> Interesting thought that this could be related to upgrading from Karmic
> rather than a fresh install. Both my machines suffer from the problem
> and were upgraded, for what it's worth. How about everyone else?
>
I think if you read some of the comments made before yours, you'll notice that
there was a fresh install test case which suffered from this problem as well.

--
Kind regards,
Chow Loong Jin

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

> Arand:
> I would guess that it is not hardware related, since I'm seeing this on a virtualbox 32bit.

I just tested a fresh 32-bit installation within virtualbox.
No problems whatsoever (with or without quiet/splash).
I don't see any progressmeter, by the way.

This could mean it *IS* hardware related.

Plymouth has hooks into graphics drivers, so...

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

People, can everyone (with or without problems) run the two hwinfo commands mentioned (with grep please) , so we can see if there's a connection between hardware and the behavior of plymouth ?

As already said, mine are:

desktop 64-bit:
  bios: Socket: "LGA1366"
  video: Driver Modules: "nvidia"

server 64-bit:
  bios: Socket: "Socket437"
  graphics: Model: "Intel 945G"

commands:
  bios: hwinfo --bios | grep Socket:
  video: hwinfo --gfxcard | grep Modules:

Revision history for this message
frnstefano (frnstefano) wrote :

to letstrynl: hwinfo --gfxcard | grep Modules doesn't produce any output. do you mean hwinfo --gfxcard | grep Model??

laptop 32-bit
   bios: Socket: "uFCPGA2"
   video: Model: "ATI RV350 NP"

Revision history for this message
frnstefano (frnstefano) wrote :

Only the splash directive reproduces the bug.
Removing the quiet directive or not makes no difference in my case.

quiet AND splash --> fsck stalls
ONLY splash --> fsck stalls
ONLY quiet --> all works fine
NEITHER --> all works fine

Revision history for this message
Fabio Marzocca (thesaltydog) wrote :

Linux Plato 2.6.32-21-generic-pae #32-Ubuntu SMP Fri Apr 16 09:39:35 UTC 2010 i686 GNU/Linux

 hwinfo --bios | grep Socket:
(nothing. Blank)

hwinfo --gfxcard | grep Modules:
  Driver Modules: "nvidia"

Revision history for this message
Peter Ries (peterriesde) wrote :

Just investigating on the long time fsck during boot on my netbook with ubuntu 10.04 (around 15 minutes)

I want to confirm, but can't give you hwinfo output, as I'm not @home right now.

It's a Samsung NC10 Netbook on Atom platform with Intel Video.
Maybe somebody else can supply additional information if necessary. Thx.

Revision history for this message
frnstefano (frnstefano) wrote :

hwinfo --bios | grep Socket requires root permission to produce the
output.

Fabio Marzocca try using sudo hwinfo --bios | grep Socket

Revision history for this message
Peter Ries (peterriesde) wrote :

addition to comment #36:

updated from 9.10 karmic (and still use grub legacy if this should affect boottime)

Revision history for this message
frnstefano (frnstefano) wrote :

hwinfo --bios | grep Socket requires root permission to produce the
output.

try using sudo hwinfo --bios | grep Socket

Revision history for this message
Barry Drake (b-drake) wrote :

On Mon, 2010-05-03 at 13:34 +0000, letstrynl wrote:
> People, can everyone (with or without problems) run the two hwinfo
> commands mentioned (with grep please) , so we can see if there's a
> connection between hardware and the behavior of plymouth ?

I looked at this, and hwinfo doesn't seem to exist on my netbook. I
looked at installing it, but it seems to be asking for a crazy amount of
drive space that I actually don't have. I only have 8 Gig on this
little netbook. Dell Inspiron Mini 10v - 8Gig SSD

I do have the problem. Will the output from lshw help? If so, do you
want that piped through grep? Please give the command you would like
and I'll run it.

BTW - this morning, fsck was forced automatically. I pressed an arrow
key when it stalled, and the process completed in < 1min.

Revision history for this message
Fabio Marzocca (thesaltydog) wrote :

Gotcha!

sudo hwinfo --bios | grep Socket
[sudo] password for fabio:
    Socket: "Socket 775"
    Socket Type: 0x04 (ZIF Socket)
    Socket Status: Populated
    Location: 0x00 (Internal, Not Socketed)
    Location: 0x00 (Internal, Not Socketed)
    Location: 0x00 (Internal, Not Socketed)

Revision history for this message
D J Eddyshaw (david-eddyshaw) wrote :

dje@dell:~$ sudo hwinfo --bios | grep Socket:
    Socket: "Microprocessor"
dje@dell:~$ sudo hwinfo --gfxcard | grep Modules:
  Driver Modules: "drm"

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

I don't think this is going to be hardware specific.

Peering through the source code for mountall I think there may be at least one failure case that is not handled. If plymouth dies after the first connection to it is made the failure will be reported every time progress is supposed to be updated. Perhaps it should be reported only the first time?

I'm not quite sure why mountall is not spotting the fsck has quit. This is surprisingly difficult to debug too...

Revision history for this message
D J Eddyshaw (david-eddyshaw) wrote :

dje@mini:~$ sudo hwinfo --bios | grep Socket:
    Socket: "U1"
dje@mini:~$ sudo hwinfo --gfxcard | grep Modules:
  Driver Modules: "drm"

Revision history for this message
hpu (zippy888) wrote :

First time automatically checking my disk on Lucid slowed down considerably at 70% too, but finally ended after like 20 minutes; while on Karmic it used to take less than 5 minutes.

No fresh install: I updated from Karmic to Lucid.

These are my outputs:

$ sudo hwinfo --bios | grep Socket
    Socket: "Socket 478"
    Socket Type: 0x04 (ZIF Socket)
    Socket Status: Populated
    Location: 0x00 (Internal, Not Socketed)
    Location: 0x01 (External, Not Socketed)

$ sudo hwinfo --gfxcard | grep Modules
    (no output)

$ sudo hwinfo --gfxcard | grep Model
    Model: "Silicon Integrated SiS315PRO"

Revision history for this message
Martin Erik Werner (arand) wrote :

It may be related to hardware. But since we already know that it's a very large set, it's likely quite irrelevant. So if not requested, please don't paste any more hwinfo logs.

Since this bug seems to affect a LOT of people could you please avoid "Me too" comments and just mark the bug as affecting you instead, thanks.

@Anders Kaseorg:
How did you manage to get the backtrace from mountall there, and do you (or whomever is tasked to look at the bug) think it would be useful from a debugging point of view to gather it at some specific point, if possible?

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

The problem is a bit strange. Watching mountall run with -v inside a screen session shows that it does notice the fsck finishing and goes on to remount "/" and mount all the other filesystem components. For example I see:
fsck from util-linux-ng 2.17.2
ubuntu1004: 149164/494832 files (0.1% non-contiguous), 754483/1984027 blocks
fsck / [3034] exited normally
remounting /
...
local 3/3 remote 0/0 virtual 11/11 swap 0/0

However it is hanging waiting for something to finish.

Running gdb on it shows the last thing gdb can spot it doing is ply_boot_client_flush(). On plymouth screen, the percentage count goes up very slowly. A few seconds after killing plymouthd mountall spews a lot of errors before finally saying it has disconnected from plymouth.

If I had to hazard a guess I'd say a huge backlog is occurring...

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

For others who are trying, I've been dropping to runlevel 1 then doing:
service rsyslog stop
screen
mount -o ro,remount /; mountall -v --force-fsck --no-events

You can then use screen to get extra terminals and strace or gdb mountall.

Revision history for this message
Paul Crawford (psc-sat) wrote :

Seeing the same thing on my PC with fresh 10.04 install on an ext4 partition on Areca HW RAID card.

$ sudo hwinfo --bios | grep Socket
    Socket: "J3E1"
    Socket Type: 0x01 (Other)
    Socket Status: Populated
    Location: 0x00 (Internal, Not Socketed)
    Location: 0x00 (Internal, Not Socketed)
$ sudo hwinfo --gfxcard | grep Modules
  Driver Modules: "drm"
$ sudo hwinfo --gfxcard | grep Model
  Model: "PC Partner Radeon X300 (PCIE)"
  Model: "PC Partner Radeon X300SE"

Also should comment that:

(1) the results are not logged properly, messages in boot.log but not in fsck/checkfs or fsck/checkroot

(2) The system tried to check my CIFS mounts in fstab that were configured with '0' for the <pass> and <dump> fields.

The logging problem also occurs on 9.10 (maybe 9.04?) but worked fine with 8.10 & usplash.

Revision history for this message
Anders Kaseorg (andersk) wrote :

This patch to libplymouth2 makes the long pause go away. It perhaps hides the real problem, though: why is the client->requests_to_send list growing to many thousands of nodes in the first place? It sounds like there is a problem with the last Ubuntu patch to plymouth (0.8.2-2ubuntu2, “Don't call ply_boot_client_process_pending_requests on flush”, bug 570289).
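
To get a feel for why a list of many thousands of pending requests could turn into a long CPU-bound pause, given that the backtrace above lands in ply_list_find_node, here is a rough Python model. It assumes that every append is followed by a linear search from the head of the list for the node just added. That assumption is not confirmed anywhere in this thread, the function name is hypothetical, and this is not plymouth code.

```python
def comparisons_for_appends(n):
    """Count node visits if each of n appends triggers a head-to-tail search.

    Hypothetical model only: it assumes the node being searched for is the
    one that was just appended, so it always sits at the tail of the list.
    """
    total = 0
    for length_after_append in range(1, n + 1):
        # A scan from the head must walk the whole list to reach the tail.
        total += length_after_append
    return total


if __name__ == "__main__":
    for n in (20, 1_000, 10_000):
        print(n, "appends ->", comparisons_for_appends(n), "node visits")
```

Under that assumption the work grows as n(n+1)/2, so a few dozen updates cost almost nothing while tens of thousands cost hundreds of millions of node visits.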

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

I agree with Anders - something else is suspect. Does fsck really output that many updates? On my system I think fsck produces about 20 updates and yet there seem to be hundreds of calls to ply_boot_client_update_daemon...

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Further inspection says fsck really does produce that many lines of output so I was wrong. Here it produces about 32000 lines of output.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

An optimisation could be simply to only send progress when it has changed:

In fsck_reader() (line 2701 of mountall.c) an old_progress variable could be set to the previous iteration's progress and plymouth_progress() only called when progress differed from old_progress...
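
The idea can be sketched in a few lines of Python. This is only an illustration of the "send only when changed" logic, not the mountall C code; changed_progress is a hypothetical name, and the 32000-line figure is the one measured in the previous comment.

```python
def changed_progress(values):
    """Yield a progress value only when it differs from the previous one."""
    old_progress = None  # nothing has been sent yet
    for progress in values:
        if progress != old_progress:
            yield progress
            old_progress = progress


if __name__ == "__main__":
    # Simulate about 32000 raw fsck updates mapped to whole percentages.
    raw = [i * 100 // 32000 for i in range(32001)]
    sent = list(changed_progress(raw))
    print(len(raw), "raw updates ->", len(sent), "updates actually sent")
```

With whole-number percentages at most 101 distinct values (0 to 100) ever need to be forwarded, however many lines fsck prints.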

Revision history for this message
Jussi Kivilinna (jukivili) wrote :

Same problem, upgraded from Karmic, RAID1+LVM setup.

hwinfo --gfxcard|grep Modules
  Driver Modules: "nvidia"
hwinfo --gfxcard|grep Model
  Model: "nVidia VGA compatible controller"
hwinfo --bios|grep Socket:
    Socket: "Socket 775"

I didn't have problems at first when plymouth was running in text/console mode. After I switched on graphical splash with the 'vga=' kernel setting, I got stuck at 70% when fscking.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Darn it! Someone coded up my idea hours before I thought of it in Bug 553745 (see Bug 553745 comment #77). And yes plymouth in graphical mode is easier to overwhelm than in text mode.

Revision history for this message
Martin Erik Werner (arand) wrote :

Above mentioned patch fixes the problem, seemingly completely, for me.

Revision history for this message
Martin Erik Werner (arand) wrote :
Revision history for this message
Martin Erik Werner (arand) wrote :

likewise for plymouth (filterdiffed)

Revision history for this message
Martin Erik Werner (arand) wrote :

I've uploaded packages with the patch by Tero Mononen to my "unstable" PPA, feel free to test and report back, especially if any other issues are seen: https://edge.launchpad.net/~arand/+archive/unstable

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

Problem seems to be fixed here.
Progress runs up to 100%, like it should, no delays anymore.
My 64-bit desktop starts up fine now. Great work.

When will this be updated in the repository (when no problems are found)?

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

Just as a sidenote, because the problems seem to be fixed now.
For anyone who would like to fully use plymouth now on desktop systems:

in synaptic:
  purge plymouth-theme-*
  install plymouth-theme-solar

For a high-res startup (even on nvidia and ati), see:
  http://idyllictux.wordpress.com/2010/04/26/lucidubuntu-10-04-high-resolution-plymouth-virtual-terminal-for-atinvidia-cards-with-proprietaryrestricted-driver/

Cheers

Revision history for this message
Martin Erik Werner (arand) wrote :

@letstrynl:
The presumed patch will need to go through the SRU process https://wiki.ubuntu.com/StableReleaseUpdates

Testing the fix and confirming that it has no adverse side-effects may get it accepted into updates sooner.
But more importantly it will make sure that the patch doesn't break the boot process in some other weird and wonderful way.

tags: added: patch
tags: removed: apport-bug i386
Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

Thanks for the link, arand.

Good to see canonical takes this seriously.

But, is this a critical update? Guess not.

So we can't expect an official (non-PPA) update very soon?

Revision history for this message
D J Eddyshaw (david-eddyshaw) wrote :

Patch seems to solve the problem completely on my Dell Mini 9 too, both a "spontaneous" and forced fsck having completed fine with the progress bar also behaving just as expected, continuing up to 100% at a regular pace and proper speed.

No surprising side-effects so far.

Many thanks to all involved.

Revision history for this message
Mariusz Kielpinski (kielpi) wrote :

@letstrynl

The patch should be issued in the official repo very soon. My system has 8 partitions. With this bug it is almost unusable, because I have to wait many hours to log in. Running the PC like a server is not a solution.

Revision history for this message
D J Eddyshaw (david-eddyshaw) wrote :

All seems well on my other affected machine now, too (a Dell M1330).

Revision history for this message
Paul Crawford (psc-sat) wrote :

I have tried the patch from Arand's "unstable" PPA and it seems to work fine, went quickly to 100% on my ext4 partition after forcing a test with 'sudo touch /forcefsck'

Revision history for this message
Mariusz Kielpinski (kielpi) wrote :

Patch works for me. No additional problem. I also forced all 8 partitions to check.

description: updated
description: updated
Revision history for this message
Paulo J. S. Silva (pjssilva) wrote :

Count me as one more happy user of the patched packages. No additional problems.

Revision history for this message
Peter (nitep) wrote :

Same problem with a fresh install on Intel 32-bit with ext4, and pressing C to cancel the check is ignored.

Revision history for this message
Martin Erik Werner (arand) wrote :

@Peter
So this is without the patch, no?

@Anders Kaseorg:
Oops, I forgot to mention this earlier, but I tested your patch and at least for me I saw no difference in behaviour on my system, meant to comment on that sooner, sorry.

Changed in plymouth (Ubuntu):
status: New → Confirmed
Revision history for this message
Mike.lifeguard (mikelifeguard) wrote :

Suggested fix resolves the issue for me.

But if you want to get feedback on proposed changes, shouldn't this go in lucid-proposed?

Revision history for this message
Peter (nitep) wrote :

> @Peter
> So this is without the patch, no?
The cancel failure is without the patch.

Revision history for this message
Kees Cook (kees) wrote :

The intent of the mountall patch is correct -- no reason to flood plymouth with updates, however, the "lastprogress" tracking needs to be attached to the mount structure, rather than the callback, since the callback can be used for multiple mounts.
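
A small Python sketch of that concern, with hypothetical names (Mount, last_progress, make_shared_filter, should_send_per_mount) that are not taken from mountall: when the remembered value lives with the callback, interleaved updates from two filesystems interfere with each other; when it lives on each mount, they do not.

```python
from dataclasses import dataclass


@dataclass
class Mount:
    mountpoint: str
    last_progress: int = -1  # last value forwarded for this mount


def make_shared_filter():
    """Problematic variant: one remembered value shared by every mount."""
    state = {"last": -1}

    def should_send(mount, progress):
        if progress == state["last"]:
            return False
        state["last"] = progress
        return True

    return should_send


def should_send_per_mount(mount, progress):
    """Variant with the remembered value stored on the mount itself."""
    if progress == mount.last_progress:
        return False
    mount.last_progress = progress
    return True


if __name__ == "__main__":
    root, home = Mount("/"), Mount("/home")
    updates = [(root, 10), (home, 10), (root, 10), (home, 10)]
    shared = make_shared_filter()
    print("shared:   ", [shared(m, p) for m, p in updates])
    print("per mount:", [should_send_per_mount(m, p) for m, p in updates])
```

In the shared variant /home's 10% is never forwarded because / reported the same number first; with per-mount state each filesystem's first report gets through and only true repeats are dropped.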

Changed in mountall (Ubuntu Lucid):
status: New → Triaged
Changed in mountall (Ubuntu):
status: Confirmed → Triaged
Changed in plymouth (Ubuntu Lucid):
status: New → Triaged
Changed in plymouth (Ubuntu):
status: Confirmed → Triaged
Revision history for this message
Martin Erik Werner (arand) wrote :

I just noticed that the debdiff for plymouth I attached earlier was not filtered properly (debian/patches/series is missing, so it does not apply properly if used independently). The PPA packages have it right, though.
I was going to post new, cleaner and reformatted debdiffs targeted at -proposed, but will hold off because of Kees' concerns with the current solution (which I have no idea how to address; I can package, but not code...).

Revision history for this message
Anders Kaseorg (andersk) wrote :

How about this?

Revision history for this message
Kees Cook (kees) wrote :

Looks about right. I'd probably name it "fsck_progress" and group it with the other fsck-related variables (_pid, _fix).

Revision history for this message
Anders Kaseorg (andersk) wrote :

Sure.

Revision history for this message
Steve Grace (sgrace) wrote :

I have this issue as well, on a fresh install of 10.04 using an ext3 file system. Pressing left arrow during the disk check appears to work around the issue for me.

Revision history for this message
Barry Drake (b-drake) wrote :

I've been at a conference and only had time last night to look at the
patched binaries. I installed the patched mountall, but both
libplymouth and plymouth i386 binaries that I got were faulty (looking
inside the packages, the two executables were only 20 bytes). Taking a
fresh download got the same thing.

Out of curiosity, I forced fsck on boot, and was surprised that it
worked OK.

Summary: mountall=installed && !plymouth=installed == !bug

Revision history for this message
Martin Erik Werner (arand) wrote :

@Barry Drake:
I do not see that, when downloading the .deb files I get e.g. ~54k for the plymouthd executable, so that sounds like some kind of corruption in the download on your side, I think.

I'm currently in the process of updating the mountall package in my PPA with Anders Kaseorg's new patch. So everyone currently using the PPA should see the update shortly.

Revision history for this message
Barry Drake (b-drake) wrote :

On Fri, 2010-05-07 at 10:28 +0000, arand wrote:
> I do not see that, when downloading the .deb files I get e.g. ~54k for the plymouthd executable, so that sounds like some kind of corruption in the download on your side, I think.

Just tried again from
https://edge.launchpad.net/~arand/+archive/unstable/+build/1715057 after
deleting the previous packages. Still cannot install plymouth. I have
tried in various ways, including the GUI package manager. One of the
commands I tried gives:

barry@netbook:~/Downloads$ aptitude install -S plymouth_0.8.2-2ubuntu3~ppa1_i386.deb
Reading package lists... Done
Building dependency tree
Reading state information... Done
Reading extended state information
Initialising package states... Error!
E: Unable to parse package file plymouth_0.8.2-2ubuntu3~ppa1_i386.deb (1)

You are right - the plymouthd executable is 54k. I was tired and
looking at the wrong file. Sorry.

Revision history for this message
Benjamin Kay (benkay) wrote :

I tried arand's plymouth and mountall packages from his "unstable" PPA and they work... sort of. With the patches:

* fsck now proceeds at normal speed
* Pressing C is no longer ignored

That's the good news. The "sort of" goes with the second point. Pressing C ought to resume the normal boot process, preferably with some sort of user-friendly message like, "File check skipped." Instead, pressing C produces the following message:

Serious errors were found while checking the disk drive for /
Press I to ignore, S to skip mounting or M for manual recovery

First of all, no, serious errors were not found. Plymouth is just saying that. Secondly, pressing I, S, or M is ignored. This behavior is mentioned in the forums, so it's probably a separate (as of yet unreported) bug. Just thought I'd bring it to everyone's attention. http://ubuntuforums.org/showthread.php?p=9255657

Revision history for this message
Martin Erik Werner (arand) wrote :

@Benjamin
I got that as well. Just reported Bug #577331
Also to note, this is not only seen with the patch here. If you are using the unpatched version, and press cancel early enough (so that you avoid the hang from this bug), it is present there as well. So I'm considering that a separate (no less serious) issue.

Revision history for this message
Steve Langasek (vorlon) wrote :

On Thu, May 06, 2010 at 10:34:07PM -0000, Anders Kaseorg wrote:
> Sure.

> ** Patch added: "mountall_2.14_lp571707.debdiff"
> http://launchpadlibrarian.net/47964636/mountall_2.14_lp571707.debdiff

This patch doesn't apply cleanly against the current bzr branch (which
happens to already have a fsck_progress member added by Scott in a pending
change), but the overall idea is sound. I've fixed up the patch and will
put it through its paces here, and submit to SRU if it checks out. Thanks!

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

Steve Langasek (vorlon)
Changed in mountall (Ubuntu Lucid):
status: Triaged → In Progress
importance: Undecided → High
Revision history for this message
Colin Watson (cjwatson) wrote : Please test proposed package

Accepted mountall into lucid-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in mountall (Ubuntu Lucid):
status: In Progress → Fix Committed
tags: added: verification-needed
description: updated
Revision history for this message
tankdriver (stoneraider-deactivatedaccount) wrote :

The fix works for me.
I upgraded mountall to v2.15, ran "sudo touch /forcefsck" and rebooted:
the filesystem check took ~15 sec for 400GB. Thank you!

Martin Pitt (pitti)
tags: added: verification-done
removed: verification-needed
Revision history for this message
Martin Erik Werner (arand) wrote :

I've tested with the proposed mountall, both with and without the plymouth change in my ppa.
I've noticed no distinct difference in speed or otherwise between the two cases.

So from a /strictly/ /superficial/ point of view, the plymouth patch does not seem to change anything when it comes to this bug, at least for me.

Proposed mountall is good.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mountall - 2.15

---------------
mountall (2.15) lucid-proposed; urgency=low

  [ Scott James Remnant ]
  * Fix an obvious thinko error that meant that the "I"gnore fsck error key
    for a "hard" failure was ignored.
  * When cancelling filesystem checks, only cancel those that are actually
    checking filesystems; otherwise those that are merely verifying the
    superblock will return an "unrecoverable error" rather than "cancelled".
    LP: #577331.

  [ Steve Langasek ]
  * Only send plymouth a progress update when there's actual progress to
    report; otherwise we flood plymouthd with redundant events, and the
    progress will spin for minutes after the fsck itself is finished. Thanks
    to Tero Mononen and Anders Kaseorg for the patch. LP: #571707.
 -- Steve Langasek <email address hidden> Sun, 09 May 2010 01:04:24 +0200

Changed in mountall (Ubuntu Lucid):
status: Fix Committed → Fix Released
Revision history for this message
Steve Langasek (vorlon) wrote :

copied from lucid-proposed to maverick.

Changed in mountall (Ubuntu):
status: Triaged → Fix Committed
status: Fix Committed → Fix Released
Changed in mountall (Ubuntu Lucid):
status: Fix Released → Fix Committed
Revision history for this message
Michael Hampson (michael-hampson-mobile) wrote :

Deeply depressing. I just feel like a newbie reading all this stuff, even though I've been completely M$-free for almost two years. I have the same problem: I've been watching the boot screen for nearly an hour. This really ought to be a top priority for an automatic update; it's completely unacceptable to have so many people affected. My PC is eight years old, and my 10.04 install (updated from 9.10) was updated this very morning. Right now I'm off to investigate the various methods for ensuring that fsck never even begins, ever...

Revision history for this message
Michael Hampson (michael-hampson-mobile) wrote :

Further to the immediately preceding post, as an experiment, I forced fsck on my one-year-old Acer Aspire One Atom-based Netbook with fresh 10.04 install. It has *exactly* the same bug: the check slows to snail's pace, and C doesn't cancel. You could hardly find two more diverse machines with the same error. This really should be being taken *very* seriously. 9.04 and 9.10 were both brilliant on the netbook.

Revision history for this message
Steve Langasek (vorlon) wrote : Re: [Bug 571707] Re: fsck progress stalls at boot, plymouthd/mountall eats CPU

Michael,

On Sun, May 09, 2010 at 02:34:30PM -0000, Michael Hampson wrote:
> Further to the immediately preceding post, as an experiment, I forced
> fsck on my one-year-old Acer Aspire One Atom-based Netbook with fresh
> 10.04 install. It has *exactly* the same bug: the check slows to snail's
> pace, and C doesn't cancel. You could hardly find two more diverse
> machines with the same error. This really should be being taken *very*
> seriously. 9.04 and 9.10 were both brilliant on the netbook.

This *is* being taken seriously, a candidate fix has already been pushed to
the lucid-proposed archive. You are welcome to help with testing this fix
following the directions in comment #86. Otherwise, you can wait for the
fix to be available in lucid-updates, which will happen in a few days once
we are confident that no regressions have been introduced with this change.


Revision history for this message
Michael Hampson (michael-hampson-mobile) wrote :

Sorry ... I'll check it and report back once it comes through on automatic updates. I missed the note #86 because so many of these 90-something comments are in note form rather than sentences, with abbreviations and technical terms that mean nothing to me: I find these pages very difficult to read. Even the helpful-looking table at the top means nothing to me, with no dates to say what happened when or explanations of the terms used. It presently says "fix released", but my two machines are still broken ... I didn't realise the nuance about "proposed" until you pointed it out. I guess simplifying the system for reporting and discussing bugs is not a priority when there's firefighting going on :( ... sorry.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Hi Michael,

As one of the random people who has posted quite a few of those technical comments, I must apologise. The aim wasn't to make other people's lives harder...

It took pretty much an entire day of non-stop work last week for me to get a handle on the problem (I'm not the world's fastest/smartest programmer), and I was working on a netbook with limited space. I was doing this on my own time (I have no connection to Canonical), but even if I were being paid I could not have gone faster. While I was going along, I was posting notes in the hope that they would help others quickly reproduce what I had done and help people guess at new reasons for the problem occurring. Arand and others were working on the problem too.

Having spent many hours, I finally came up with an idea I couldn't test, as I lacked the appropriate setup (I was working by reading the code and thinking about what might help). I posted it here, and it was only then that I started searching around other bugs and recognised that someone else (Tero!) had independently posted a tested solution to the problem much earlier, so I posted yet another comment making the link. Arand appears to have quickly turned this into a package people could test, and from there things seem to have moved a lot faster.

Part of the problem with bugs like this is that they take time to diagnose. As a programmer I can tell you some of the most difficult problems to fix are the ones you can't reproduce on your own machine on demand. Is it something everyone will see? Why haven't I seen it myself? Is there something special about the setup of the people seeing the problem (e.g. disks that take a short time to check, the speed of the computer and the graphical splash settings)?

Once you know the cause, you then have to come up with an idea for the fix. If someone presents you with a fix, it is often quicker to look at it and say it is right than to come up with an idea from scratch. Once you've done that, you have to test the fix to make sure it doesn't cause any new problems (sometimes this leads to the fix being split into two). But who (else) wants to do testing and risk breaking their system? Somehow we need more programmers, testers and community liaisons, because these can be thankless tasks. Whatever you do, it all takes extra time and carries risks...

You also raised some good general issues too. This is a long bug (it affects many people and attracts a lot of comments because people want to help whichever way they can). The thing is it's hard to know which comments to show people. Often when I am searching for bugs I need to see all the comments to be able to find the one I need but this is clearly not the general case... Perhaps you could file a new bug explaining this and how it could be improved (perhaps comment voting? I don't have a good suggestion there :) ). You can use this link https://bugs.launchpad.net/malone/+filebug .

A further issue as you've pointed out is the Fix Committed -> Fix Released wording. You are not the first to have this issue. Again perhaps you could file a new bug on that (if there isn't one already)? I don't think it's fair to ask for dates (unless I'm paying whoever is going to ...


Revision history for this message
Bernat (berarma) wrote :

I think this tool is more oriented towards developers and beta-testers; users who feel uneasy here might post a question or go to the forums instead. I guess the problem is that this bug is going to hit almost everyone sooner or later, and it can be troublesome for the naive user. It happened to me in a meeting at work, and it's a bit painful even for the non-naive (these things always happen like this). It's a pity this bug wasn't caught before release, but everyone's been working hard to fix it since it was detected. Certainly something to include in the test run for future releases.

Revision history for this message
Ron S (ronshere-people) wrote :

Upgrading to mountall - 2.15 in proposed fixed the problem for me. Disk check now takes less than 30 secs. compared to about 30 minutes before. New package has not caused any problems that I'm aware of.
Nice work thanks.

Revision history for this message
Bernat (berarma) wrote :

Upgrading here solved the problem too. I see a slow down at 95% but it finishes in a reasonable time. Good work.

Revision history for this message
Jane Atkinson (irihapeti) wrote :

Upgrading to 2.15 reduced the time from about 10 minutes (or more, even) to about 1 minute or so.

Revision history for this message
Michael Hampson (michael-hampson-mobile) wrote :

[notes #91 and #92]

Upgrading to mountall 2.15 resolved the problem on both machines.
Netbook scan about 10 seconds, desktop about one minute
(down from 10 minutes and one hour).

Using lucid-proposed can be *far* easier than the complex Wiki page suggests, but it's been made immutable/uneditable! You can do the entire thing from within Synaptic:
1. System / Administration / Synaptic Package Manager
2. Settings / Repositories / Updates / check "Proposed" / Close / click "Reload"
3. Quick-search for item, mark for upgrade, click Apply
4. Settings / Repositories / Updates / uncheck "Proposed" / Close / click "Reload" / close synaptic

Anyone can try this now to test mountall 2.15 ... but after that, who has the authority to put this simple procedure at the top of the 'immutable' Wiki page instead of the current scary paragraphs of terminal hacks?

Revision history for this message
Michael Hampson (michael-hampson-mobile) wrote :

@ Sitsofe Wheeler, Steve Langasek etc: you are, of course, heroes all.

Revision history for this message
Benjamin Kay (benkay) wrote :

I can verify that mountall 2.15 in lucid-proposed fixes this bug for my x86_64 Thinkpad T61 laptop without introducing any new regressions. Good work everyone!

Although the new mountall doesn't seem to fix Bug #577331, may I suggest it be pushed to lucid-updates anyway as soon as verification is complete? Considering the severity of this bug, it should probably be fixed ASAP. Given how fast fsck runs, fewer users are likely to encounter 577331 than encounter this bug.

Revision history for this message
Barry Drake (b-drake) wrote :

On Mon, 2010-05-10 at 10:53 +0000, Michael Hampson wrote:
> @ Sitsofe Wheeler, Steve Langasek etc: you are, of course, heroes all.

Can I add my praise?? I've done a bit of development and know how time
consuming and exasperating debugging can be. I've been following the
process and am sooo impressed. Congratulations and a very heartfelt
thank-you.

Barry

--
Sent from my Dell Netbook using Ubuntu - the Windows-free environment
that gives me real fresh air.

Revision history for this message
Mike.lifeguard (mikelifeguard) wrote :

On 10-05-10 06:29 AM, Bernat wrote:
> I see a slow down at 95%

I'm pretty sure that's normal

Revision history for this message
theadmin (theadmin-) wrote :

The problem is a bit different over here. It gets stuck at 90% and takes about half an hour to finish after that. Adding "nosplash" to grub.cfg somehow seems to resolve it. Maybe the bug should be moved to the plymouth section?

Revision history for this message
theadmin (theadmin-) wrote :

Uhm. Ignore the above. I now have read other comments and see that it's the same.

Revision history for this message
kgsuarez (kgsuarez) wrote :

13thSlayer, have you checked if the patch fixes it for you?

And also.. any other general updates on this issue?

zellfaze (zellfaze)
Changed in mountall (Ubuntu Lucid):
status: Fix Committed → Fix Released
status: Fix Released → Fix Committed
Revision history for this message
Michael Lazarev (milaz) wrote :

I have just updated only the mountall package from -proposed, and I can confirm that for me the disk check time is reduced from 20 minutes to less than one minute.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mountall - 2.15

---------------
mountall (2.15) lucid-proposed; urgency=low

  [ Scott James Remnant ]
  * Fix an obvious thinko error that meant that the "I"gnore fsck error key
    for a "hard" failure was ignored.
  * When cancelling filesystem checks, only cancel those that are actually
    checking filesystems; otherwise those that are merely verifying the
    superblock will return an "unrecoverable error" rather than "cancelled".
    LP: #577331.

  [ Steve Langasek ]
  * Only send plymouth a progress update when there's actual progress to
    report; otherwise we flood plymouthd with redundant events, and the
    progress will spin for minutes after the fsck itself is finished. Thanks
    to Tero Mononen and Anders Kaseorg for the patch. LP: #571707.
 -- Steve Langasek <email address hidden> Sun, 09 May 2010 01:04:24 +0200

Changed in mountall (Ubuntu Lucid):
status: Fix Committed → Fix Released
Revision history for this message
Martin Erik Werner (arand) wrote :

Since the fix is still in -proposed and not released to normal updates, I'm changing status to "Fix Committed" for Lucid.

Changed in mountall (Ubuntu Lucid):
status: Fix Released → Fix Committed
Revision history for this message
Benjamin Kay (benkay) wrote :

Are you sure, arand? I recently upgraded to mountall 2.15 on a machine without -proposed enabled. It does look as if packages.ubuntu.com is still reporting mountall at version 2.14, but that would be a separate issue.

Revision history for this message
papukaija (papukaija) wrote :

I just upgraded to mountall 2.15 from lucid-updates.

Changed in mountall (Ubuntu Lucid):
status: Fix Committed → Fix Released
Revision history for this message
Martin Erik Werner (arand) wrote :

Ah, yes, my bad. I was simply going by the janitor's message above.

Great to see this fix out the door.

Revision history for this message
Benjamin Drung (bdrung) wrote :

nothing to sponsor left, unsubscribing ubuntu-sponsors

mikbini (mikbini)
Changed in plymouth (Ubuntu):
status: Triaged → Invalid
Changed in plymouth (Ubuntu Lucid):
status: Triaged → Invalid
Steve Langasek (vorlon)
Changed in plymouth (Ubuntu):
status: Invalid → Triaged
Changed in plymouth (Ubuntu Lucid):
status: Invalid → Triaged
description: updated
Revision history for this message
Arjan (iafilius) wrote :

Hello,

On my production Ubuntu server (HP380G6) running 10.04 LTS I noticed this same issue.
The console was hanging on an fsck, and when I logged in (via ssh) I saw
"/sbin/plymouthd --mode=boot --attach-to-session"
running.

Ubuntu is completely up to date (as of today) and has mountall 2.15 installed.

So the mountall fix is probably only a partial fix/workaround.
As a possible (not yet tested) workaround I set GRUB_CMDLINE_LINUX_DEFAULT="" in /etc/default/grub (removing "quiet", which was the only option).

I haven't tested the workaround yet, and I'm not 100% sure that everything that has to be started is actually started (apart from the gettys).

PS: I actually have no indication that plymouthd was eating CPU at all. I didn't look at that specifically, but when I checked, the CPU time consumed was still 0, so below 1 second.

Regards,

Arjan Filius

Revision history for this message
Keith Humm (keith-spronkey) wrote :

I'm getting the same issue on Server 10.04 LTS.

Installing mountall 2.15 appeared to fix the issue, but it has returned.

Running touch /forcefsck does not trigger the issue; it (seemingly) only appears when the check runs without intervention. I also noticed it started occurring again when the fsck on boot reports 'clean' as opposed to an "x% non-contiguous" type message, so this may be related.

Revision history for this message
Keith Humm (keith-spronkey) wrote :

On further investigation, it appears that no *actual* fsck is being performed when I get this error.

When I use /forcefsck, it returns 0.1% non-contiguous after a short delay, and boots normally.

When my boot stalls with a fsck message, no vterms, but sshd active, on the connected display it returns 'clean' for the same partition *IMMEDIATELY*.

Interestingly, after installing bootchart my machine boots to login every time without fail, but also during the boot prints the fsck: clean message with no delays for actually checking the drive.

Revision history for this message
luojie-dune (luojie-dune) wrote :

Yesterday my Linux Mint would not start up (with no information printed on screen), and recovery mode doesn't work either: fsck printed the check results and then the system stalled. I can restart but cannot get any further...

Revision history for this message
rupert (r-plumridge) wrote :

@luojie-dune I updated yesterday (Ubuntu Lucid) and noticed that on reboot the grub menu started showing up again, even though I had edited the configuration files to hide it. Clearly a recent update meddled with those files without asking the user. I would imagine that is your issue too. If you boot your PC from a LiveCD or USB and then read through the Grub conf files, you might be able to fix your issue.

Revision history for this message
hseuming (hseuming) wrote :

Hi,
I just came across this bug report after searching the web for the problem I'm experiencing. The symptom is that at boot time the disk check reaches 70% within one or two minutes. From 70% to 74%, it takes about 5 minutes. From 75% to 100%, it takes a few hours, with the worst case so far being 15 hours. Other pertinent info is as follows:

* mountall version installed is 2.15.2.

* the file system format is ext3

% cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.1 LTS"

% uname -a
Linux ubuntu 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:17:33 UTC 2010 i686 GNU/Linux

Thanks

Revision history for this message
Billy Silver (billysilver) wrote :

I also have mountall 2.15.2 installed and I am receiving the error described many times in this report:

- fsck does not run correctly at boot - stuck at 70% or so, no disk activity
- must hit C to cancel, then boot proceeds as expected
- file system is ext3, SATA
- running Mint 9 x64
- mountall 2.15.2

The problem is critical for me, as this system is used remotely over long periods of time while I'm on assignment in other countries, and this bug renders the system unusable until I can come home and deal with it several weeks later.

Revision history for this message
axel (axel334) wrote :

I don't even get a progress bar. I don't know how long I have to wait, so I always press C to stop the fsck process.
http://forums.linuxmint.com/viewtopic.php?f=29&t=53170

Revision history for this message
papukaija (papukaija) wrote :

@axel: That issue is probably not related to this bug, and may even be related to your graphics card driver or plymouth (you did not mention whether you see the pink background with the Ubuntu logo correctly or not). Please open a new bug for your issue (if you can't find an existing one), since this bug was fixed nearly half a year ago.

Revision history for this message
Richard Postlewait (sandman6471) wrote :

Have switched to Ubuntu 10.10. No issues at all. Thanks. I wasn't having the
problems you're listing anyway.


Revision history for this message
sacha@ubuntu (sacha-b77) wrote :

This bug is not really fixed.

On Mint 9 KDE, when the system requests an fsck at boot, it progresses to 100% and then plymouth keeps running endlessly.

Pressing C or the right arrow key makes the boot continue.

When I hit "Esc" in plymouth, I see this message: "GLib-WARNING **: getpwuid_r(): failed due to unknown user id (0)", followed by the result of the scan: /dev/sda1: 1627009/3571712 files (0.3% non-contiguous), 6708217/14284032 blocks.

The boot stalls with these messages. When I press C or the right arrow key, the boot resumes and I can log in to the session through KDM.

Please see this bug: https://bugs.launchpad.net/ubuntu/+bug/531027

Thanks for your investigation.

Revision history for this message
axel (axel334) wrote :

Now I have installed Linux Mint 10 Julia 64-bit:
Linux 2.6.35-22-generic #35-Ubuntu SMP Sat Oct 16 20:45:36 UTC 2010 x86_64 GNU/Linux
I checked with the command
sudo touch /forcefsck && sudo reboot
Unfortunately the problem still exists: I don't get a progress bar showing the percentage. The check was very quick and the system started, but I couldn't see what was actually happening, only a message that the disk would be checked.

Revision history for this message
ingo (ingo-steiner) wrote :

And, of course nothing logged under /var/log/fsck :-(

My logs are absolutely empty since install of Lucid:

cat /var/log/fsck/checkfs
(Nothing has been logged yet.)

cat /var/log/fsck/checkroot
(Nothing has been logged yet.)

Revision history for this message
ingo (ingo-steiner) wrote :

and this is how it looks in Debian-Squeeze (after touch /forcefsck):

squeeze:/var/log/fsck# cat checkroot
Log of fsck -C -f -a -t ext3 /dev/sda1
Sat Nov 6 20:02:33 2010

fsck from util-linux-ng 2.17.2
/dev/sda1: 163080/498736 files (5.8% non-contiguous), 1624972/1994060 blocks

Sat Nov 6 20:04:14 2010
----------------

and the known progress bar is also presented at boot-up

Mint Debian-Edition as well!

Revision history for this message
Alexey Loukianov (lexa2) wrote :

Just got a report from my clients using Linux Mint 9. Today "three of the workstations stalled at boot displaying plymouth animation screen and doing nothing" (this is roughly what the client's complaint was). Knowing about this bug, I suggested that they press the "C" key on the keyboard. Shortly after the key press, all the stalled workstations continued to boot normally. So it looks like the bug is still here and is waiting to be fixed.

Revision history for this message
Andreas Jürgens (andreas-mk) wrote :

Hello,

This bug is nearly 7 months old and I can't understand why it is not fixed for LM 9.

Regards
Andreas

Revision history for this message
Richard Postlewait (sandman6471) wrote :

It's fixed in Mint 10.


Revision history for this message
Andreas Jürgens (andreas-mk) wrote :

Thanks for the answer. But why not in Mint 9? A lot of users will keep working with Mint 9 until its support ends in 2013.

I am one of them. ;)

Revision history for this message
Richard Postlewait (sandman6471) wrote :

I can't answer that one; sorry. I just know it's fixed in Mint 10, cause it
came up on my system the other day, took like maybe 3 to 5 minutes at the
most. Then it finished booting. I tried Ubuntu 10.10, before Mint 10 RC came
out. If I'm not mistaken, Ubuntu still has the same problem. lol Mint's just
an awesome distro, you can't do better than Mint.

On Sun, Nov 14, 2010 at 12:13 PM, Andreas Jürgens <email address hidden> wrote:

> Thanks for the answer and why not in Mint 9, because a lot of users will
> work with Mint 9 till ending support in 2013?
>
> I am one of it. ;)

Revision history for this message
Chow Loong Jin (hyperair) wrote :

On Monday 15,November,2010 01:38 AM, Richard Postlewait wrote:
> I can't answer that one; sorry. I just know it's fixed in Mint 10, cause it
> came up on my system the other day, took like maybe 3 to 5 minutes at the
> most. Then it finished booting. I tried Ubuntu 10.10, before Mint 10 RC came
> out. If I'm not mistaken, Ubuntu still has the same problem. lol Mint's just
> an awesome distro, you can't do better than Mint.

No, I don't believe this issue is present in Ubuntu any longer, see this bug's
status on mountall, which was flooding plymouth with events. All the recent
noise here was caused by Mint users and Mint users alone. Please Get The Facts™
before posting next time.

--
Kind regards,
Loong Jin

Revision history for this message
Alexey Loukianov (lexa2) wrote :

14.11.2010 19:59, Richard Postlewait wrote:
> It's fixed in Mint 10.
>

Awesome! Then why the hell is this bug not fixed in the latest LTS release? Regular
one-year-support releases are just "toys" for home Linux users, while corporate
users tend to run LTS releases in production. The fact that a major bug in an LTS
release went unfixed for months, and that it turns out to be fixed in a new "toy"
release while still being present in the LTS release, gives a very good reason to
consider something like CentOS/RHEL instead of Ubuntu/Mint LTS in production
environments.

I will take this into account when consulting my clients in the future.

--
Best regards,
Alexey Loukianov mailto:<email address hidden>
System Engineer, Mob.:+7(926)218-1320
*nix Specialist

Revision history for this message
Chow Loong Jin (hyperair) wrote :

On Monday 15,November,2010 04:42 AM, Alexey Loukianov wrote:
> 14.11.2010 19:59, Richard Postlewait wrote:
>> It's fixed in Mint 10.
>>
>
> Awesome! Then why the hell is this bug not fixed in latest LTS release? Regular
> one-year support releases are just a "toys" for home linux users while corporate
> one tend to use LTS releases for production use. And the fact that the major bug
> in LTS release wasn't fixed for months and when a new "toy" release comes out it
> happens that the bug is fixed in it while still being present in LTS release
> gives a very good reason to reconsider using something like CentOS/RHEL instead
> of Ubuntu/Mint LTS in production environment.
>
> Will take this in account in the future consulting my clients.
>

Status in “mountall” source package in Lucid: Fix Released

Again, Get The Facts™. I don't care what Mint LTS does, but Ubuntu LTS has it
fixed, so please don't confuse the two again.

--
Kind regards,
Loong Jin

Revision history for this message
ingo (ingo-steiner) wrote :

> Status in “mountall” source package in Lucid: Fix Released

But when? Too late!

Such things happen when releases are time-based instead of "when it's done". What is an LTS release worth when it takes months to become ready for use? These releases are only really finished around the second point release, so the remaining service period is far shorter than advertised.

Why is Mint considering a Debian-based version?

Revision history for this message
Alexey Loukianov (lexa2) wrote : Re: [Bug 571707] Re: fsck progress stalls at boot, plymouthd/mountall eats CPU

15.11.2010 00:11, Chow Loong Jin wrote:
> Status in “mountall” source package in Lucid: Fix Released
>
> Again, Get The Facts™. I don't care what Mint LTS does, but Ubuntu LTS has it
> fixed, so please don't confuse the two again.
>

Thanks for pointing this out. So, what is the _real_ current status of this
bug? I know how it behaves in Linux Mint 9 LTS (the bug is still there). But
what about Ubuntu 10.04 LTS? I've seen reports here that it is still not fixed
there either (the most recent was by Richard Postlewait). So whom to believe?

P.S. Needless to say, the argument that it took way too long to fix it in
Ubuntu (in case it really is fixed) still counts. And another thing to note: Get
The Facts™, this bug is not about Ubuntu only. So I don't care whether you care about
what Mint LTS does: this bug applies to Mint LTS, so all the people complaining
here have every right and reason to do so.

--
Best regards,
Alexey Loukianov mailto:<email address hidden>
System Engineer, Mob.:+7(926)218-1320
*nix Specialist

Revision history for this message
Chow Loong Jin (hyperair) wrote : Re: [Bug 571707] Re: fsck progress stalls at boot, plymouthd/mountall eats CPU

On Monday 15,November,2010 05:43 AM, ingo wrote:
>> Status in “mountall” source package in Lucid: Fix Released
>
> But when - too late!
>
> Such things happen when releases are time based instead of "when it's
> done". What is a LTS-release worth when takes months to become ready for
> use.They are almost done approximately with second point release, so
> remaining sevice period is by far less then advertised.

The bug was reported on 2010-04-29. When was Lucid released? 2010-04-29. Given
the complexity of this bug, as you would have discovered if you had actually
read through the original comments, it was not an easy one to identify. When did
it get fixed *in Lucid*? 2010-05-09. That's 10 days after release. Ten days for
a tough bug like this one. I believe we deserve a little credit.

Then Mint appears, without deploying our patched packages properly, and a whole
bunch of Mint users come here to complain that Mint doesn't work, and blame
Ubuntu for it, and then talk about Mint's superiority. Oh, the irony. Thank you
for trolling, guys. That was fun, wasn't it?

> Why is Mint considering a Debian-based version?

Hell if I know, and I don't give a damn. Maybe some other Ubuntu developers do,
but I don't. Mint can do whatever they want.

--
Kind regards,
Loong Jin

Revision history for this message
Chow Loong Jin (hyperair) wrote :

On Monday 15,November,2010 08:54 AM, Alexey Loukianov wrote:
> 15.11.2010 00:11, Chow Loong Jin wrote:
>> Status in “mountall” source package in Lucid: Fix Released
>>
>> Again, Get The Facts™. I don't care what Mint LTS does, but Ubuntu LTS has it
>> fixed, so please don't confuse the two again.
>>
>
> Thanks for pointing out on this. So, what is the _real_ current status for this
> bug? I know how does it behave in Linux Mint 9 LTS (bug is still there). But
> what about Ubuntu 10.04 LTS? I've seen reports here that it is still not fixed
> too (most recent was by Richard Postlewait). So whom too believe?

Well, given that it is a pretty noticeable bug, and a very annoying one when you
notice it, and that the only users complaining about this bug now are Mint
users, I would believe that Ubuntu has it fixed, even in 10.04 LTS, but perhaps
not in Mint.

And Ubuntu 10.04 LTS had it fixed since May 9th, which is 10 days after release,
and also 10 days after this bug was reported (it was reported on release date).

> P.S. Needless to say that an argument that it took a way to long to fix it in
> Ubuntu (in case it is really fixed) still counts.

10 days is too long? Really? If you used Ubuntu, you should have only
encountered this once, if at all, since fsck only happens every 30 days or after
a set number of mounts I can't remember. Unless you do reboot your computer that
many times in a day..

> And another thing to note: Get The Facts™, this bug is not about Ubuntu only.
> So I don't care if you care about what Mint LTS does - this bug applies to
> Mint LTS so all people here blaming about have got all rights and reasons to
> do so.

Then you can use this as basis to report about Mint's failures to your clients,
but please keep in mind that Ubuntu is not Mint.

--
Kind regards,
Loong Jin
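
For anyone unsure how often the periodic check mentioned above really triggers: the thresholds are per-filesystem settings on ext2/3/4, so they can be read rather than remembered. A minimal sketch follows; the helper name `fsck_schedule` and the device `/dev/sda1` are illustrative assumptions, not something taken from this report.

```shell
#!/bin/sh
# Filter `tune2fs -l` output down to the fields that decide when a
# periodic fsck is due. Reads stdin, so it also works on saved output.
# (fsck_schedule is a hypothetical helper name used for illustration.)
fsck_schedule() {
    grep -E '^(Mount count|Maximum mount count|Last checked|Check interval|Next check after):'
}

# Typical use (assumption: the filesystem of interest is ext2/3/4 on /dev/sda1):
#   sudo tune2fs -l /dev/sda1 | fsck_schedule
```

A check is due once "Mount count" reaches "Maximum mount count" or the "Check interval" has elapsed since "Last checked", whichever comes first.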

Revision history for this message
Andreas Jürgens (andreas-mk) wrote :

Sorry for my bad English; the following was translated by Google:

At certain intervals Mint 9 tries to check the hard drives for errors at startup. The process has run for several hours at a time without ever finishing. At the beginning I can still read "Mint is checking drive 1 of 2 (progress". The text appears exactly like that, without the closing parenthesis and without a progress bar. At some point this line disappears, and today I cancelled the process after about an hour.

When the line disappears, the hard drive LED stops blinking. After that nothing happens for hours.

Many Mint 9 users have this problem; you can read the thread in the German community at www.linuxmintusers.de

I hope the problem will be fixed soon.

Thanks
Andreas

Revision history for this message
Alexey Loukianov (lexa2) wrote : Re: [Bug 571707] Re: fsck progress stalls at boot, plymouthd/mountall eats CPU

15.11.2010 06:14, Chow Loong Jin wrote:
> 10 days is too long? Really?
>
If it really was fixed within those 10 days, and all the people here blaming
Ubuntu simply hadn't installed the latest updates or are using Mint
instead, then it was fast enough to call it a "good level of support". If not,
it's a shame for Ubuntu. And it is certainly a shame for Mint that this bug is
still unfixed.

> Then you can use this as basis to report about Mint's failures to your clients,
> but please keep in mind that Ubuntu is not Mint.
>
It is certainly not. As for clients, unfortunately they prefer working
workstations to reports about Mint's failures. About half of the Linux
installations in production environments I did last year were based on Mint
LTS (the rest were CentOS 5 based). Looking at the history of this bug, I would
reconsider using Ubuntu instead of Mint in the near future, probably switching to
CentOS 6 as soon as it is released.

--
Best regards,
Alexey Loukianov mailto:<email address hidden>
System Engineer, Mob.:+7(926)218-1320
*nix Specialist

Revision history for this message
Andreas Jürgens (andreas-mk) wrote :

It is a shame that this bug is not yet solved for Mint 9

It's all very sad, after almost seven months.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Hello,

If people using an up-to-date 10.04 (or later) of stock Ubuntu are still seeing the symptoms described in this bug I would STRONGLY recommend they file a new bug report. Mint users are similarly better off filing a bug report in their distribution's bug tracker, as it is difficult to tell whether the exact same problems they are seeing exist in stock (updated) Ubuntu too. It is possible an issue still exists in Ubuntu, but if so posting to this bug report is not going to help, as it has become quite long and too confused. A new bug report for Linux Mint users might serve them better.

For what it is worth, in Ubuntu 10.04 the 2.15 version of mountall resolved the problem I was seeing. I have just retested the system that showed the problem and the fix still seems to be holding. This suggests that anyone seeing the symptoms described in this bug report with a version of mountall of *2.15 or later* is almost certainly seeing a bug stemming from a different root cause and as such should definitely be reporting the issue in a different bug report (a good bug report should generally only cover one root cause so that separate issues can be closed individually).

Good luck!
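
The "2.15 or later" check above can be scripted before deciding where to report. This is only a sketch: `version_ge` is a hypothetical helper built on GNU `sort -V`, which approximates Debian version ordering but is not identical to it. On a real Ubuntu or Mint system `dpkg --compare-versions "$installed" ge 2.15` is the authoritative comparison.

```shell
#!/bin/sh
# Succeeds if version $1 is greater than or equal to version $2.
# Assumption: GNU `sort -V` ordering is close enough for plain
# upstream versions such as 2.15 and 2.19.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Installed mountall version; empty if dpkg-query or the package is missing.
installed=$(dpkg-query -W -f='${Version}' mountall 2>/dev/null || true)

if [ -z "$installed" ]; then
    echo "mountall: installed version could not be determined"
elif version_ge "$installed" 2.15; then
    echo "mountall $installed: fix is present, remaining stalls belong in a new report"
else
    echo "mountall $installed: older than 2.15, install pending updates first"
fi
```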

Revision history for this message
Richard Postlewait (sandman6471) wrote : Re: [Bug 571707] Re: fsck progress stalls at boot, plymouthd/mountall eats CPU

Well pardon me. I'm sorry for the wrong info. I just know what took place on
my system. So until you have walked in my shoes, you need to keep your
attitude and that know-it-all bull crap to yourself. So cancel my account,
who really cares. It's just a Linux distro, life still goes on.

On Nov 14, 2010 3:32 PM, "Chow Loong Jin" <email address hidden> wrote:

> On Monday 15,November,2010 01:38 AM, Richard Postlewait wrote:
>> I can't answer that one; sorry. I ...
> No, I don't believe this issue is present in Ubuntu any longer, see this bug's
> status on mountall, which was flooding plymouth with events. All the recent
> noise here was caused by Mint users and Mint users alone. Please Get The Facts™
> before posting next time.
>
> --
> Kind regards,
> Loong Jin

Revision history for this message
Chow Loong Jin (hyperair) wrote :

On Tuesday 16,November,2010 12:17 AM, Richard Postlewait wrote:
> Well pardon me. I'm sorry for the wrong info. I just know what took place on
> my system. So until you have walked in my shoes, you need to keep your
> attitude and that I know all bull crap to yourself. So cancel my account,
> who really cares. Its just a linux distro, life still goes on.

Don't get me wrong. Nobody's interested in cancelling your account. I just
requested that you try your best to keep the spreading of misinformation to a
minimum.

My issue was with how you phrased "If I'm not mistaken, Ubuntu still has the
same problem." If you don't know, please don't pretend that you have an idea. It
was pretty much that comment there that sparked the whole bunch of scathing
false accusations about Ubuntu not bothering to fix serious bugs within
reasonable time for its LTS releases.

If it was not your intention to cause this situation, then I genuinely am sorry
for lashing out at you like that. Otherwise, please find better uses of your
energy and time that do not involve wasting others' time.

--
Kind regards,
Loong Jin

Revision history for this message
ingo (ingo-steiner) wrote :

@ Loong Jin

please accept my apology for the statement that "fixing the bug in Lucid did take too long". 10 days is really fast.

On the other hand, and that is probably why people including me get upset so easily, the real root cause is something different and in no way directed at you personally: it is related to plymouth.

I have followed the development of Lucid since the alpha stage and saw that an enormous effort was put into plymouth to get it working somehow. This eye-candy splash screen had priority over functional features, and plymouth is still causing trouble today (a normal user can't uninstall it because it is artificially tied to mountall and cryptsetup, and Canonical refuses to correct the dependencies).

A lot of bug reports like this one are, in my opinion, related to plymouth, which requires patching 'mountall' and 'cryptsetup'. And these really are patches and not fixes, because they all have to take care not to disturb the buggy plymouth. The logical way to solve these issues/bugs would be to purge plymouth and set up a good functional system.

Best regards, Ingo

Revision history for this message
Alexey Loukianov (lexa2) wrote : Re: [Bug 571707] Re: fsck progress stalls at boot, plymouthd/mountall eats CPU

16.11.2010 00:25, ingo wrote:
> The logical way to solve this issues/bugs would
> be to purge plymouth and set up a good functional system.

Correct me if I'm wrong, but it is perfectly possible to get rid of plymouth in
the initramfs and the rest of the boot process by simply not installing the plymouth
theme packages. Yes, it might require deleting metapackages like "mint-meta-*", but I
don't see much trouble in having them uninstalled anyway, because the default set
of packages that Mint tends to install is far from the "minimal" set required to get
a slim Linux installation perfectly fitting "normal office workstation"
requirements.

--
Best regards,
Alexey Loukianov mailto:<email address hidden>
System Engineer, Mob.:+7(926)218-1320
*nix Specialist

Revision history for this message
Billy Silver (billysilver) wrote :

Enough. Your points are both understood. Quit spamming.

Revision history for this message
der_alex1980 (beckstrinker) wrote :

I activated level 4 and 5 updates in mintUpdate and updated all packages that are possibly related to this bug (mountall, upstart, plymouth), so they are now the same versions as in Ubuntu 10.04. But the bug still remains.

Does anybody know if Ubuntu 10.04 users still suffer from this bug?

Revision history for this message
Richard C. (linuxrichard1) wrote :

mountall 2.19 is available for Ubuntu 10.10. The description of the bug fixes between 2.15 and 2.19 appears to address this bug. The dependencies of 2.19 are satisfied by Ubuntu 10.04. This bug hasn't gone away with 2.15-3. So why has 2.19 not been provided for 10.04? This bug is very annoying and potentially dangerous, since inevitably one has to cancel fsck. Can we get 2.19 into the 10.04 repositories please?

Revision history for this message
axel (axel334) wrote :

When a disk check is performed, the progress stalls (I don't see any progress bar or percentage, and I can't hear the hard disk working), and when I press F2 I see: plymouth main process terminated with status 1.

Revision history for this message
Justin Krehel (jkrehel) wrote :

Good day.

Clem please action the introduction of mountall 2.19 into the repos for Linux Mint 9 LTS so this issue may be resolved.

Thanks

- Justin

Changed in linuxmint:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Clement Lefebvre (clementlefebvre)
Changed in linuxmint:
status: Confirmed → Fix Released
Revision history for this message
Clement Lefebvre (clementlefebvre) wrote :

mountall 2.19 added to Mint 9 repository as a level 4 upgrade.

@Justin: Let's keep an eye on it so we can eventually reduce the level to 3 or even 2 in a week or so.

Revision history for this message
axel (axel334) wrote :

I installed the new mountall 2.19 and did "sudo touch /forcefsck && sudo reboot"
but nothing has changed. It gets stuck and I can't see any % progress, only the information ...press C...
But when I press F1 I can see some info about ... plymouth main process terminated with status ...
I wish I could send a log report but I don't know how to.

Revision history for this message
queo (kurt-quehenberger) wrote :

Dear Clement,

first of all I would like to thank you very much for the creation of this great OS Linux Mint.
What I'm using:
HW: Lenovo Thinkpad T410
OS: Linux Mint Isadora 64bit with ext4 filesystem
I've updated the mountall package to version 2.19, but the problem still exists.
The fsck process on boot is faster now but when it stops at 100% only the splash screen with no additional info appears on the screen.
When hitting ESC the message "/dev/sda1: .../... files (0,2% discontiguous) , .../... blocks" appears
The boot process can only be continued by pressing the "c" button.

* I'm getting the same problem on an Acer Laptop with Linux Mint Isadora 32bit!

Thanks in advance for your help!
Best regards,
Kurt

Revision history for this message
queo (kurt-quehenberger) wrote :

the problem is not fixed yet

Revision history for this message
queo (kurt-quehenberger) wrote :

Is there any help available regarding this bug, please??
It has been open since 2010-04-29 (!)

Revision history for this message
ingo (ingo-steiner) wrote :

On 22.08.2011 17:31, queo wrote:
> Is there any help available regarding this bug please??
> It is open since 2010-04-29 (!)

Here: http://www.debian.org/devel/debian-installer/ - works reliably.

Revision history for this message
Clint Byrum (clint-fewbar) wrote : Re: [Bug 571707] Re: fsck progress stalls at boot, plymouthd/mountall eats CPU

Excerpts from ingo's message of Mon Aug 22 16:07:50 UTC 2011:
> On 22.08.2011 17:31, queo wrote:
> > Is there any help available regarding this bug please??
> > It is open since 2010-04-29 (!)
>
> Here: http://www.debian.org/devel/debian-installer/ - works reliably.
>

Bug reports are for helping developers fix and/or prioritize bugs,
including dialog with affected users. Ingo, if you are no longer affected,
and/or have nothing constructive to add, please refrain from posting
comments.

Revision history for this message
queo (kurt-quehenberger) wrote :

this fix doesn't work

Revision history for this message
Mr. Aljoriz Dublin (aljoriz) wrote :

Oddly this was fixed in Ubuntu 10.04.2 and its subsequent releases. Mint
maintains that the bug lies upstream. Oh wth I'm using LMDE now

On Wed, Aug 24, 2011 at 3:34 PM, queo <email address hidden> wrote:

> this fix doesn't work

tags: added: testcase
Revision history for this message
queo (kurt-quehenberger) wrote :

Please, are there any updates on this issue?
The mountall package 2.19 doesn't fix the problem.
Is anybody working on resolving this bug???

Revision history for this message
cduv (martagenisll) wrote :

Just installed Ubuntu 10.04.3 64-bit; mountall is 2.15 and I'm getting the same problem. From this thread it looks like it's solved, is it?
I tried the workaround but it is not working.

Any hints?

Thanks in advance
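
When workaround 1 from the bug description (removing "quiet" and "splash" from the kernel boot line) seems to have no effect, it is worth confirming that the running kernel was actually booted without those options. A minimal sketch, assuming a Linux system that exposes `/proc/cmdline`; `has_boot_flag` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds if flag $1 appears as a whole word in kernel command line $2.
# Padding both sides with spaces keeps "splash" from matching "nosplash".
has_boot_flag() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Command line of the running kernel; empty if /proc/cmdline is unavailable.
cmdline=$(cat /proc/cmdline 2>/dev/null || true)

for flag in quiet splash; do
    if has_boot_flag "$flag" "$cmdline"; then
        echo "$flag is still on the kernel command line"
    else
        echo "$flag is not set"
    fi
done
```

If either option is still reported, the edit to the boot entry did not take effect for this boot, so the workaround has not really been tested yet.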

Revision history for this message
queo (kurt-quehenberger) wrote :

Please, are there any updates on this issue?
The bug has been open since 2010-04-29!
I'm using Linux Mint Isadora 64bit and 32bit; both have the same bug.
Please update the Status to "Won't fix"!!!

Revision history for this message
Chow Loong Jin (hyperair) wrote :

On 25/10/2011 21:31, queo wrote:
> Please, are there any updates on this issue?
> The bug is open since 2010-04-29!
> I'm using Linux Mint Isadora 64bit and 32bit, both are having the same bugs.
> Please update the Status to "Won't fix"!!!
>

But it's fixed in Ubuntu Oneiric.

--
Kind regards,
Loong Jin

Revision history for this message
Alexey Loukianov (lexa2) wrote :

25.10.2011 18:38, Chow Loong Jin wrote:
>
> But it's fixed in Ubuntu Oneiric.
>

And what does that change for Linux Mint 9 Isadora users? Who cares if
this bug was fixed in some fresh-n-shiny Ubuntu and/or Mint release while it is
still not fixed in the so-called "Long Term Support" version of Linux Mint? And, to
be honest, I still occasionally hit this bug on Ubuntu 10.04 LTS despite the
fact that it was marked as "fixed" for Ubuntu months ago. The released "fix"
only changed how often this bug happens: before the fix I hit
this bug every time I forced my system to do an fsck on the next boot. After the "fix" I
hit this bug about once in 4-5 fsck-enabled reboots. Better than nothing but
still smells like crap.

--
Best regards,
Alexey Loukianov mailto:<email address hidden>
System Engineer, Mob.:+7(926)218-1320
*nix Specialist

Revision history for this message
queo (kurt-quehenberger) wrote :

Dear Loong Jin,

thanks for your reply.
I always thought LTS releases were supported for 3 years (Ubuntu 10.04 is an LTS).
Do I have the wrong information here?

Best regards,
Kurt

Revision history for this message
Chow Loong Jin (hyperair) wrote :

On 25/10/2011 22:59, queo wrote:
> Dear Loong Jin,
>
> thanks for your reply.
> I always thought there is a support for 3 years for LTS releases (Ubuntu 10.04 is LTS).
> Do I have the wrong information here?
>
> Best regards,
> Kurt
>

Sorry, let me correct myself. It's been fixed from lucid-updates all the way up
to oneiric, and I have not experienced this ever since the fix landed in
lucid-updates. That said, I had no part in the creation of the patch that fixed
this bug, and don't have a clear idea on what's going wrong where. And I don't
use Linux Mint, so I have no idea what's going wrong there either.

The status of the bug is correct -- it's fixed in mountall on Ubuntu, and
mountall in Lucid (via lucid-updates). If it's still broken in Linux Mint, then
open a task there and set it to the appropriate status. There is no reason to
demand that the "Fix released" status in Ubuntu be changed to "Won't Fix" when
it clearly has been fixed over here.

So to sum it all up in a nutshell, it's all fixed on Ubuntu, but not fixed on
Linux Mint, so obviously something went wrong somewhere in porting the fix from
Ubuntu to Mint, so please stop blaming Canonical or Ubuntu for not paying
attention to this bug report. If you need someone to badger about this bug
occurring in Mint, find a Mint developer.

--
Kind regards,
Loong Jin

Revision history for this message
Steve Langasek (vorlon) wrote :

On Tue, Oct 25, 2011 at 05:42:18PM -0000, Chow Loong Jin wrote:
> The status of the bug is correct -- it's fixed in mountall on Ubuntu, and
> mountall in Lucid (via lucid-updates). If it's still broken in Linux Mint, then
> open a task there and set it to the appropriate status. There is no reason to
> demand that the "Fix released" status in Ubuntu be changed to "Won't Fix" when
> it clearly has been fixed over here.

> So to sum it all up in a nutshell, it's all fixed on Ubuntu, but not fixed on
> Linux Mint, so obviously something went wrong somewhere in porting the fix from
> Ubuntu to Mint, so please stop blaming Canonical or Ubuntu for not paying
> attention to this bug report. If you need someone to badger about this bug
> occurring in Mint, find a Mint developer.

There is a task for Linux Mint on this bug already, and its status has been
set to 'fix released' by a Mint developer, apparently on the basis that the
lucid-updates version of mountall was copied into Linux Mint 9. I believe
it's the status of *this* task that the Mint users are concerned with.
Unfortunately, the Launchpad bug workflow doesn't make this at all clear.
(I think for this reason it might be better for derivatives to use separate
bugs for tracking issues, instead of tasks on Ubuntu bugs.)

There are also comments that the bug is still reproducible on 10.04, though
much less frequently. This may be true; when this bug was being worked on,
the analysis was that there was still a bug in plymouth here, just one with
much less user impact now that the mountall side has been fixed. However
(as you know, but it appears the users subscribed to this bug do not), "LTS"
does not mean that all bugs reported against that release will be fixed; it
means that security support, upgrade support, and commercial support are
provided, and that bugfixes will be made on a best-effort basis.

And there are a number of other plymouth bugs present that are higher-impact
than this one, so it is unlikely that this bug will receive further
attention for 10.04.

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
<email address hidden>                              <email address hidden>

ingo (ingo-steiner) wrote :

Steve Langasek (vorlon) wrote on 2011-10-25:

> And there are a number of other plymouth bugs present that are higher-impact
> than this one, so it is unlikely that this bug will receive further
> attention for 10.04.

That's true for sure. With this in mind, it is definitely a shame that Canonical tries to prevent
users from uninstalling plymouth by artificially setting plymouth as a dependency for mountall.
For workarounds, see bug #556372, which is marked "won't fix".

Steve Langasek (vorlon) wrote :

On Sun, Oct 30, 2011 at 06:49:14PM -0000, ingo wrote:
> Steve Langasek (vorlon) wrote on 2011-10-25:

> > And there are a number of other plymouth bugs present that are higher-impact
> > than this one, so it is unlikely that this bug will receive further
> > attention for 10.04.

> That's true for sure. With this in mind it is definitely a shame that
> Canonical tries to prevent users from un-installing plymouth by
> artificially setting plymouth as dependency for mountall.

It is not an artificial dependency. *mountall cannot communicate with the
user without plymouth.* There must be some framework for multiplexing
boot-time I/O in order to let packages like mountall communicate with the
user, and that framework is plymouth. There aren't even any other
contenders in the field.

And your persistent sniping here about Ubuntu design decisions is an
inappropriate use of the bug system. Please stop.

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
<email address hidden>                              <email address hidden>

bth73 (bth1969) wrote :

I'm using 9x64 with all updates, and I've posted questions on the Mint forums over the last 6 months and have 0 replies. Auto fsck has never finished for me on this install. Seems everyone is working on the newest disaster, ahh, distribution. The problem with all Linux distributions is that no one cares about a finished product. They just barely get it to work, then it is on to the next thing.

PLEASE FINISH ONE THING COMPLETELY BEFORE GOING ON TO THE NEXT, BECAUSE THE NEXT THING WILL JUST HAVE ANOTHER SET OF BUGS, AND YOU WILL NEVER HAVE AN UN-BUGGED, CLEAN-RUNNING PROGRAM.

dino99 (9d9) wrote :

EOL is very close, so it's time to use a newer release.

Changed in plymouth (Ubuntu):
status: Triaged → Invalid
Changed in plymouth (Ubuntu Lucid):
status: Triaged → Invalid