Comment 9 for bug 397745

rew (r-e-wolff) wrote : Re: [Bug 397745] Re: fsck takes ages with one large partition

On Mon, May 02, 2011 at 02:01:54PM -0000, Phillip Susi wrote:
> > Just the fact that /I/ happen to have 67 million files makes this bug
> > invalid in your eyes?

> No, it is the fact that this is a highly unusual configuration that
> Ted Tso seems to think is an abusive and unsupported use of the fs,
> and therefore, Ubuntu should not be trying to optimize its defaults
> for.

My configuration is maybe a bit extreme. I'm happy to have provided a
patch to e2fsck that improved fsck performance from about three months
to "only a day". Ted, however, hasn't moved to integrate my patch.

> > My last "fight" with fsck resulted in an fsck time of something like
> > three months. Provided your assumption of linear fsck time with file
> > system parameters is valid, this would mean that 100x fewer files would
> > result in an fsck time of 1 day. Unacceptable to many people. Or 1000
> > times fewer files may take 2.4 hours. How do you explain your "not much"
> > answer to your boss after he asks: "What did you do this morning?" "I
> > turned on my workstation this morning and it decided to fsck the root
> > partition, and the fsck took 2.4 hours. So this morning I mostly waited
> > for my machine to boot..."

> If it scaled linearly then I would expect 67 million inodes to take
> about 1.5 hours, given that checking an fs with 300k inodes takes
> half a minute. Perhaps this bug should be reassigned to e2fsprogs and
> refactored to focus just on the pathological slowdown of fsck with
> extreme numbers of inodes with many hard links.

I HAVE found the problem in fsck. And fixed it. And provided the
patch.

Still, it pays to think about the future. What if 1% of Ubuntu users
happen to use backuppc and end up with lots of files and hard links,
like me? Bad luck to them!

> You could also just disable the periodic fsck.

You think it doesn't serve any purpose? Why not disable it for
everybody?
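
For the record, that suggestion amounts to something like the
following, run once per filesystem (/dev/sdXN is just a placeholder
for the device holding it):

    # Turn off both periodic-check triggers: the mount-count check
    # (-c 0) and the time-based check (-i 0).
    tune2fs -c 0 -i 0 /dev/sdXN

That just trades a long boot for never being warned about creeping
corruption.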

> > Over the years people will have more and more files. This will continue
> > to grow. I mentioned the fact that I was "beyond reasonable" back in
> > 1987 with only 2200 files.

> Maybe in another 10-20 years your average person might get that
> many, but by then I'm sure we'll be using a different fs entirely.
> At present this isn't anywhere close to being an issue.

And f... bad luck to those who upgrade their ext3/4/5 filesystem over
the years and keep their data. And those who happen to install Ubuntu
as a backup server using backuppc can go f... themselves.

> > I adapt the boot stuff of my fileservers so that they will boot first,
> > (i.e. start all services that depend on just the root) and then they
> > start fscking the data partitions. When done they will mount them and
> > restart services that depend on them (nfs).
> >
> > This allows me to monitor their progress remotely. Stuff like that. But
> > also I can already use the services that they provide that are not
> > dependent on the large partition with most of the files.
> >
> > If we have just one partition this is no longer possible: ALL services
> > depend on "fsck of /".
>
> That is the kind of thing that requires manual planning; it can not be
> set up like that by default.

Right. But having the system set up sensibly by default means it is
possible to realize that this is necessary and adapt a system to this
setup, instead of requiring a full reinstall.
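
A minimal sketch of the kind of setup I described above, with
hypothetical names (/dev/sdb1, /srv/data and the NFS service name are
placeholders): keep the data partition out of the boot-time fsck, then
check and mount it once the base system is up.

    # /etc/fstab: "noauto" plus fsck pass 0 keeps boot from touching it
    /dev/sdb1  /srv/data  ext4  noauto  0  0

    # Run late in boot (e.g. from rc.local): check, mount, then start
    # the services that depend on the data partition.
    fsck -y /dev/sdb1 && mount /srv/data && service nfs-kernel-server start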

> > Another reason why splitting up the disk helps is that fsck is not
> > linear with the disk space. Especially when fsck's working set is beyond
> > the amount of RAM available.
>
> That sounds like the problem. If you are swapping heavily that would
> explain the pathological slow down.

No, it is not the swapping that is the problem. If it fits in RAM+swap,
things slow down (a lot), but not unreasonably.

When it exceeds RAM+swap or the addressing space, e2fsck can be
configured to use a temporary file instead. That file ends up holding
on the order of "number of inodes" entries, but it was configured to
be efficient when used with on the order of 131 entries....

If your "big data" partition is the same as your root partition, you
can't configure fsck to use a temporary file for the fsck of the big
partition: there is no other, already-checked filesystem to put that
file on.
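
That knob lives in /etc/e2fsck.conf; a minimal sketch (the directory
shown is only an example, and it must sit on a writable filesystem
other than the one being checked):

    [scratch_files]
    # Keep e2fsck's inode/directory tracking tables in scratch files
    # in this directory instead of in memory.
    directory = /var/cache/e2fsck

The directory has to exist and be writable when e2fsck runs, which is
exactly why having a separate root partition matters here.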

FYI, I encountered the "fsck takes too long" problem over a year and a
half ago. I then chose to more or less ignore it and just mounted the
filesystem. I intended to look into the problem, but forgot about it
once the machine was running fine again.

Then the filesystem got corrupted. It would remount read-only within
24 hours after I had forced it to run RW. So now the fsck wasn't
optional. Then I had to reconfigure fsck to use temporary files on my
root filesystem to fsck the big data partition. Then I had to fix fsck
to finish in under "months". Finally the patched fsck finished in 24
hours and I could remount my filesystem.

All this would have put someone with one big partition in serious
trouble: they would have had to install new hardware just to have
somewhere to put fsck's temporary files.

Installing new hardware is impossible if you rent a dedicated server
at, say, "amenworld.com".

> Any sane person will not keep millions of pictures because you could
> not possibly ever look at them all. Just taking a million pictures
> at a rate of one every minute for 18 waking hours a day would take
> 2.5 years. They also won't create 100 hard links to each one.

I've done a commercial data-recovery job where I had to recover 250 GB
of images for a professional photographer. As technology improves, it
will become easier and easier to just press the shutter and take 5
images per second for, say, 10 seconds. Maybe people will decide to
keep all the originals, intending (forever) to look at them later. Or
they select the "best" image and use it, but keep all the originals.

I have decided for myself that when we look at the family albums from
40 years ago, we do NOT look at the pictures for the reasons they were
taken and kept. Here is Uncle Eric skiing. Here is Uncle Franklin
skiing. Here is Uncle Dicky skiing. After a few of these images we
know the style of their clothes and that they were all good skiers.

The interesting things are the backgrounds: hey, that building already
existed, and that other one didn't! Look at how empty the village was!
That sort of stuff.

So when I take pictures now, I don't know what my grandchildren will
find interesting in them 50 years from now. So I've decided not to
delete any of them. Storage is cheap. How they will sort through the
thousands of pictures I take is a problem technology might solve when
we get there. Or not.

But images are only one area where file sizes don't grow as fast as
the disks do.

And if you have one of those applications, run backuppc, and/or have a
server that backs up a whole bunch of workstations, you'll end up with
lots of files, like me.

But yes, you can maintain that everybody who ends up with these things
is doing something extreme and has to manually configure his system.