Heavy Disk I/O harms desktop responsiveness

Bug #131094 reported by Jamie McCracken
This bug affects 180 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Low
Assigned to: Unassigned
Nominated for Hardy by gururise
Nominated for Intrepid by unggnu
Nominated for Jaunty by Jeffery Davis
Nominated for Karmic by daneel
Nominated for Lucid by geek
Nominated for Maverick by Montblanc

Bug Description

Binary package hint: linux-source-2.6.22

When compared with 2.6.15 in feisty, heavy disk I/O causes increased iowait times and affects desktop responsiveness in 2.6.22

this appears to be a regression from 2.6.15 where iowait is much lower and desktop responsiveness is unaffected with the same I/O load

Easy to reproduce with tracker - index the same set of files with 2.6.15 kernel and 2.6.22 kernel and the difference in desktop responsiveness is massive

I have not confirmed if a non-tracker process which does heavy disk i/o (especially writing) replicates this yet - will do further investigation soon

Tags: cft-2.6.27
Revision history for this message
Jamie McCracken (jamiemcc-blueyonder) wrote :

Further investigation has led me to conclude that this bug is no longer valid

Slowdown in system can be eliminated by:

1) Clean install of tribe 4. I originally had tribe 3 when problem occurred and it persisted when upgrading but clean install somehow fixes the desktop responsiveness issues

2) Apps still feel slow but this is not a kernel issue - disabling esd sound in sound preferences makes gutsy as fast as feisty (see https://bugs.launchpad.net/ubuntu/+source/libgnome/+bug/115652)

Changed in linux-source-2.6.22:
status: New → Invalid
Revision history for this message
Jamie Lokier (jamie-shareable) wrote :

I have esd sound disabled, and performance is still incredibly slow when trackerd is running on a 2.6.22-{7,8,9} kernel. When I want to actually get some work done, I "killall -STOP trackerd".

The effect on desktop performance is weird: it feels exactly like heavy swapping. Menus etc. take seconds to appear. New apps take ages. Dragging a window can even take 10 seconds or more before it responds.

But there is free RAM, and especially there's plenty of reclaimable (i.e. not used by programs) RAM. I have 1GB.

It's not using much CPU either. (I have a Core Duo; neither core sees much usage while trackerd is running).

So it may be in some way dependent on I/O. But this is with the trackerd set to maximum throttling, i.e. slowest scanning.

Interestingly, the disk activity monitoring applet shows very little activity (little spikes every second or two), but the disk light is constantly on.

There's something else fishy: strace -p on the trackerd process shows expected system calls, but sometimes killing the strace prints "Process xxx detached" but then strace doesn't terminate, even with kill -9.

Revision history for this message
Jamie McCracken (jamiemcc-blueyonder) wrote :

I'm reopening this

Other users have experienced this (see comments in https://bugs.launchpad.net/ubuntu/+source/tracker/+bug/135115) and reported that a fresh install cures the problem

This indicates there's a bug when upgrading to gutsy which causes the high iowait times which can only be solved by doing a clean install.

I can't say whether this bug only occurs when upgrading from older gutsy versions or from feisty...

All I can say is that it started from clean install of tribe 3 and persisted when upgraded and did not go away until clean install of tribe 4

Changed in linux-source-2.6.22:
status: Invalid → Confirmed
Revision history for this message
Tom Badran (tom-badran) wrote :

I've marked the bug I filed against trackerd as a dup of this bug.

Like I say, a fresh install has made a substantial difference (completely unusable machine with trackerd -> usable). I do however still hear my disk being hit fairly often. It's not impacting interactivity as severely as it used to, but there are still noticeable short stalls when doing fairly trivial things such as opening menus.

Revision history for this message
Miguel Martinez (el-quark) wrote :

I'm also experiencing the slowdowns during large dist-upgrades involving several packages. This is a dist-upgraded Gutsy. Furthermore, I've seen firefox crashing pretty often during those heavy I/O periods. Sometimes it has taken thunderbird down with it.

Revision history for this message
Michael Vogt (mvo) wrote :

I milestone this bug as it is important to get this fixed if we use tracker by default.

Changed in linux-source-2.6.22:
importance: Undecided → High
Revision history for this message
Ben Collins (ben-collins) wrote :

Please try booting with elevator=deadline and tell me if that helps any.
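
For testers who would rather not reboot for every scheduler, the elevator can usually also be inspected and switched at runtime through sysfs. The following is only a minimal sketch: it assumes the disk is /dev/sda (substitute your own device) and that the kernel exposes /sys/block/<dev>/queue/scheduler. The helper name active_scheduler is made up for illustration.

```shell
#!/bin/sh
# Print the active I/O scheduler from a sysfs scheduler file.
# The kernel marks the active one with square brackets, e.g.
#   noop anticipatory deadline [cfq]
# active_scheduler is a hypothetical helper name, not part of any tool.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# /dev/sda is an assumption; change it to the disk holding /.
SCHED_FILE=/sys/block/sda/queue/scheduler
if [ -r "$SCHED_FILE" ]; then
    echo "active scheduler: $(active_scheduler "$SCHED_FILE")"
    # To switch without rebooting (needs root), for example:
    #   echo deadline | sudo tee "$SCHED_FILE"
fi
```

A runtime switch only lasts until the next reboot, so elevator= on the kernel command line is still the way to make a choice persistent.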

Changed in linux-source-2.6.22:
assignee: nobody → ben-collins
status: Confirmed → In Progress
Revision history for this message
Jamie McCracken (jamiemcc-blueyonder) wrote :

if anything elevator=deadline seems to cause higher iowait and for longer periods (I even saw a 100% for it with that setting) when running trackerd

average iowait values when tracker is flushing to disk during heavy indexing of same files:

for feisty 2.6.20-15 : 90-95%
for 2.6.22-9 : 90-99%
for 2.6.22-9 with elevator=deadline: 95-100%
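
For anyone who wants comparable numbers, iowait can be computed from two samples of the aggregate cpu line in /proc/stat. This is a rough sketch, not necessarily how the figures above were obtained: the function name iowait_pct and the 1 second sampling window are assumptions chosen for illustration.

```shell
#!/bin/sh
# iowait_pct BEFORE AFTER
# Each argument is one aggregate "cpu ..." line from /proc/stat.
# Fields after the label: user nice system idle iowait irq softirq steal
# (later guest fields are already counted in user/nice, so they are skipped).
# Prints the share of elapsed CPU time spent in iowait, as a percentage.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= NF && i <= 9; i++) t0 += $i; w0 = $6 }
        NR == 2 { for (i = 2; i <= NF && i <= 9; i++) t1 += $i; w1 = $6 }
        END {
            dt = t1 - t0
            if (dt > 0) printf "%.1f\n", 100 * (w1 - w0) / dt
            else print "0.0"
        }'
}

# Sample over a 1 second window (the interval is an arbitrary choice).
if [ -r /proc/stat ]; then
    before=$(grep '^cpu ' /proc/stat)
    sleep 1
    after=$(grep '^cpu ' /proc/stat)
    echo "iowait: $(iowait_pct "$before" "$after")%"
fi
```

Running it in a loop while trackerd is flushing, once per kernel or scheduler, gives figures that can be compared directly.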

Revision history for this message
Miguel Martinez (el-quark) wrote :

Same here. elevator=deadline doesn't seem to help, although I don't have
any objective data to complement Jamie's

--
----------------------------------------
Miguel Martínez Canales
    Dto. Física de la Materia Condensada
    UPV/EHU
    Facultad de Ciencia y Tecnología
    Apdo. 644
    48080 Bilbao (Spain)
Fax: +34 94 601 3500
Tlf: +34 94 601 5437
----------------------------------------

  "If you have an apple and I have an apple and
  we exchange these apples then you and I will
  still each have one apple. But if you have an
  idea and I have an idea and we exchange these
  ideas, then each of us will have two ideas."

  George Bernard Shaw

Revision history for this message
Ben Collins (ben-collins) wrote :

Ok, for the fun of it, please also try elevator=anticipatory

Revision history for this message
Jeff Schroeder (sejeff) wrote :

The latest gutsy kernels have the right settings to use blktrace. Try these commands:
sudo apt-get install blktrace
sudo mount -t debugfs debugfs /sys/kernel/debug/

# If /dev/sda is the disk that / is located on
sudo btrace /dev/sda

# Let it run for a few seconds and then kill it with CTRL C.

That will show the top processes using your disk.

Revision history for this message
Jeff Schroeder (sejeff) wrote :

Make that:
sudo btrace -s /dev/sda

It gives a summary of the disk usage of each process.

Revision history for this message
Jeff Schroeder (sejeff) wrote :

Also note that gutsy has an 'ionice' command that you can use to slow down IO for a process like trackerd. See man ionice.

Revision history for this message
Julien Olivier (julo) wrote :

Hi,

I have upgraded from feisty to gutsy and also noticed that my GNOME desktop felt way slower than on feisty. I tried disabling esd, but it didn't help. The thing is that I also tried to disable trackerd, but the slowness remains when I open F-Spot, or when I use Firefox. Is there a way to know if the problem really comes from the kernel? Is it safe to re-install linux-image-2.6.20 from feisty? If yes, are there any other packages I should downgrade too?

Revision history for this message
Tobias Oetiker (tobi-oetiker) wrote :

Julien,

it seems that kernels 2.6.18 to 2.6.21 have some serious issues with
heavy disk io especially when multiple processes are fighting over
io and if read and write are going on in parallel ...

for us the upgrade to 2.6.22 helped a lot ...

there were changes to the io schedulers and massive changes to the
default values of the /proc/sys/vm/dirty_* tunables ...

we also found that the problems were more pronounced when using lvm
... unfortunately this is all anecdotal and non conclusive.

so if you have the chance, you might want to try 2.6.22 ...

cheers
tobi

--
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten
http://it.oetiker.ch <email address hidden> ++41 62 213 9902

Revision history for this message
Julien Olivier (julo) wrote :

Tobias,

as I said, I have upgraded to gutsy recently, so I do have kernel 2.6.22, and I still have speed problems. Whether or not the kernel is the culprit is still a mystery to me though.

Someone said that the problems seem to persist when you upgrade from feisty (versus a fresh install), so maybe I have inherited wrong values in /proc/sys/vm/dirty_* ?

I would be really pleased to help, so if there is anything I can test, I'm ready to help.

PS: I installed kernel 2.6.20 from feisty and booted on it, and it didn't change anything.

Revision history for this message
Tobias Oetiker (tobi-oetiker) wrote :

Today Julien Olivier wrote:

> Tobias,
>
> as I said, I have upgraded to gutsy recently, so I do have kernel
> 2.6.22, and I still have speed problems. Whether or not the kernel is
> the culprit is still a mystery to me though.
>
> Someone said that the problems seem to persist when you upgrade from
> feisty (versus a fresh install), so maybe I have inherited wrong values
> in /proc/sys/vm/dirty_* ?

this is highly unlikely ... check /etc/sysctl.conf to see if there
are any explicit settings
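
That check can be sketched roughly as follows: list any uncommented vm.dirty_* lines in /etc/sysctl.conf, then print the values the running kernel is actually using. The function name explicit_dirty_settings is made up for illustration, and only /etc/sysctl.conf is inspected here.

```shell
#!/bin/sh
# explicit_dirty_settings FILE
# Print uncommented vm.dirty_* assignments found in a sysctl-style file.
# Prints nothing when the file sets none (or does not exist).
explicit_dirty_settings() {
    [ -r "$1" ] || return 0
    grep -E '^[[:space:]]*vm\.dirty_' "$1" || true
}

echo "Explicit settings in /etc/sysctl.conf:"
explicit_dirty_settings /etc/sysctl.conf

echo "Values currently in effect:"
for f in /proc/sys/vm/dirty_*; do
    if [ -r "$f" ]; then
        echo "$(basename "$f") = $(cat "$f")"
    fi
done
```

If the first list is empty, the values in the second list are simply the defaults of whichever kernel is booted, so they would not be carried over by an upgrade.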

> I would be really pleased to help, so if there is anything I can test,
> I'm ready to help.
>
> PS: I installed kernel 2.6.20 from feisty and booted on it, and it
> didn't change anything.

in that case I am fresh out of ideas unfortunately.

cheers
tobi


Revision history for this message
Jamie McCracken (jamiemcc-blueyonder) wrote :

I think there are two separate issues here

1) something in old tribes affects disk access (HAL or UDEV?) and on some occasions they persist when upgraded and only a fresh install cures the problem. This is what affected me and all disk IO read and writes were affected very badly even without tracker running. This only happens rarely as only a few people had this...

2) Ext3 write performance is very poor on both feisty and Gutsy - as soon as pdflush starts it tends to hog the disk. Putting $HOME/.cache/tracker on a different FS like XFS improves things a lot (I only did this on feisty, not on gutsy)

If default pdflush params have changed in the gutsy kernel, that could also affect write performance negatively.

Another thing is my hard disk is whisper quiet on feisty but extremely noisy on gutsy - I had to use hdparm to lower the noise. It would be nice to make it quiet by default too, especially as tracker makes it very noisy at times

Revision history for this message
Julien Olivier (julo) wrote :

Jamie,

about #1: any idea what exactly went wrong, and is there a chance that it might still be unfixed for some users ?

Revision history for this message
Jamie McCracken (jamiemcc-blueyonder) wrote :

julien: I've no idea what caused it, but the effect was very noticeable even with light disk access. Only two people (myself included) have done a fresh install to solve the issue, so I think it's quite rare.

I'm not sure whether it's recommended to dist-upgrade from feisty or not? (I've read of a few cases on osnews where it did not work properly)

Revision history for this message
Julien Olivier (julo) wrote :

OK, I will try to re-install everything from scratch then.

Revision history for this message
Martin (martin615) wrote :

I disabled Tracker as a result of all the disk thrashing. Yes, Tracker is nice. But I seriously question enabling it by default while this problem is still around (wherever the problem might lie).

Revision history for this message
Jamie McCracken (jamiemcc-blueyonder) wrote :

Martin,

if the disk IO issues are only tracker related then that's OK, as we have a fix for that in the latest version (not yet in gutsy though) which should reduce the problem and prevent tracker from hogging the disk for long periods.

Revision history for this message
Jeff Fortin Tam (kiddo) wrote :

Please, don't tell me this will be unfixed for users who went the dist-upgrade way.
This is not as rare as you think, and clean installing is not something lots of people want to do all the time. Isn't it possible to fix that with upgrades? If some config broke at some point, it should be possible to reverse it for everyone, no?

I actually don't even know what is going on exactly anymore, but the thing I do see is that all my gutsy computers have really horrible performance whenever I do anything that uses the hard drive.

Revision history for this message
Martin (martin615) wrote :

Jamie,

Ok, that sounds great. I'll try enabling it again when the fix hits Gutsy.

Revision history for this message
Alexey Borzenkov (snaury) wrote :

I can confirm strange disk-related performance problems too, and I dist-upgraded to gutsy way after tribe5 was already out (thus I don't think it could be something from previous tribes). Also I wonder if other problems (like desktop often not showing after I relogin [so I always have to restart if I logout, not even /etc/init.d/gdm restart helps], and login sound not playing the first time, even after I installed esound) could be cured by a fresh install, but I won't have time to do it for several weeks... I guess it will be after gutsy is already released.

And somehow I don't believe it's rare... I wonder how many people actually dist-upgraded, as opposed to fresh install of tribe5?

Revision history for this message
Lukas Kolbe (lukas-einfachkaffee) wrote :

I can confirm this problem on latest Gutsy. It has bothered me for a while, but shamefully I haven't yet taken the time to report it, and I forgot whether it first appeared in feisty or in gutsy. My system has been upgraded at least since feisty, possibly also since dapper. I actually can't remember when I last installed ubuntu from scratch.

Attached are the outputs of dmesg, hdparm -tT, smartctl -a, lspci -vvn and a vmstat 2 during my latest dist-upgrade that made the system heavily unresponsive (again). Also, while tracker is indexing, or evolution is starting or any other normal disk-io is happening, the system becomes unusable. Dist-upgrades of only a few packages take ages.

If there's anything I can do to help identify the root cause, please ask.

Revision history for this message
Lukas Kolbe (lukas-einfachkaffee) wrote :

And as this was mentioned before I thought it might be important: I'm using LVM. Attached is the complete disk-layout on my system.

Revision history for this message
Amit Kucheria (amitk) wrote :

This thread seems to be catching fire :-)

I did some IO testing of the Feisty and Gutsy kernels on Gutsy userspace. Results are at https://wiki.ubuntu.com/GutsyFeistySchedulerShootout?action=show

If someone can repeat these tests and posts the results, it would help drill down into the problem. Currently, it seems like only users doing dist-upgrades are having problems. Unfortunately, my machine was a fresh install.

Revision history for this message
Lukas Kolbe (lukas-einfachkaffee) wrote :

I ran your test; the numbers seem about equal to yours, but during the test my system became extremely unresponsive. Switching desktops (from web to evolution) took more than 20 seconds (probably due to swapping, I have 768MB RAM), and subsequent switches took up to five seconds. I could watch the redrawing while I tried scrolling in evolution's folder list. vim took ages to load, etc. All in all, very sluggish.

Revision history for this message
Tobias Oetiker (tobi-oetiker) wrote :

I don't think the problem is entirely Ubuntu-made ... Other people
are looking at IO performance too.

This does look interesting

  http://lkml.org/lkml/2007/8/16/77

and this ... http://lkml.org/lkml/diff/2007/8/23/218/1

cheers
tobi


Revision history for this message
Jamie McCracken (jamiemcc-blueyonder) wrote :

Could this be sata related?

Can everyone who has this problem indicate if this is so?

just wondering if it's related to https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/119730

Revision history for this message
Tom Badran (tom-badran) wrote :

I am on a sata machine, however I never had a problem with file copy
throughput speed etc., it's just interactivity.

--
Tom Badran
http://badrunner.net

Revision history for this message
Miguel Martinez (el-quark) wrote :

I don't think it's sata-related as I have an "old" Pentium-M (735) that
doesn't support SATA, and my laptop does suffer from the I/O issue.


Revision history for this message
Jamie McCracken (jamiemcc-blueyonder) wrote :

Also forgot to mention tracker 0.6.3 is now in gutsy (it's not in the beta) - this version is designed to work around the issues here, as well as being much better optimised as far as disk access goes.

Revision history for this message
Jeff Fortin Tam (kiddo) wrote :

Nope. My desktop only has IDE drives, and so does my laptop, so not sata-related.

Revision history for this message
Tobias Oetiker (tobi-oetiker) wrote :

Jamie,

I run sata with lvm

cheers
tobi


Revision history for this message
Amit Kucheria (amitk) wrote :

As pointed out by Jeff above, can someone having the problems run trackerd with ionice.

e.g. ionice -c3 -p<pid of trackerd>

Revision history for this message
Tom Badran (tom-badran) wrote :

I had already tried the ionice in one of the bugs closed off as a dup, it
makes absolutely no difference whatsoever


Revision history for this message
Jamie McCracken (jamiemcc-blueyonder) wrote :

Amit,

trackerd uses the best effort 7 disk io schedule by default (it tries idle class first but as that needs root it will fail and default to BE 7)

note disk writes are not affected by the schedule as they are controlled by pdflush and heavy writing is where the problem lies (pdflush tends to go crazy)

tracker 0.6.3 mitigates the pdflush problems by intermittently calling fsync when merging indexes to prevent pdflush from taking over the disk and starving other apps
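
The general idea of flushing in small steps, rather than letting dirty pages pile up for pdflush, can be illustrated with a shell analogue. This is not tracker's code (tracker calls fsync from within its own process); it is only a sketch that copies a file chunk by chunk with GNU dd and has dd fsync the output after each chunk. The function name and the chunking scheme are assumptions made for illustration.

```shell
#!/bin/sh
# write_in_synced_chunks SRC DST CHUNK_MB
# Copy SRC to DST one chunk at a time, and have dd fsync the output
# after every chunk (conv=fsync) so dirty pages never pile up.
# Function name and chunking scheme are illustrative assumptions.
write_in_synced_chunks() {
    src=$1 dst=$2 chunk_mb=$3
    size=$(wc -c < "$src")
    chunk_bytes=$((chunk_mb * 1024 * 1024))
    chunks=$(( (size + chunk_bytes - 1) / chunk_bytes ))
    : > "$dst"                      # start from an empty destination
    i=0
    while [ "$i" -lt "$chunks" ]; do
        # skip/seek address the same chunk in source and destination;
        # notrunc keeps the chunks already written.
        dd if="$src" of="$dst" bs="$chunk_bytes" skip="$i" seek="$i" \
           count=1 conv=notrunc,fsync 2>/dev/null || return 1
        i=$((i + 1))
    done
}
```

For example, write_in_synced_chunks big.db copy.db 4 would flush after every 4 MB. Smaller chunks trade some throughput for shorter stalls, which is the same trade-off being discussed for the desktop.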

Changed in linux:
status: New → Incomplete
Changed in linux-source-2.6.22:
status: In Progress → Won't Fix
Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
status: Incomplete → Triaged
Tim Gardner (timg-tpi)
Changed in linux:
status: Triaged → Confirmed
Changed in linux:
status: Unknown → Confirmed
Changed in linux:
status: Confirmed → Invalid
Bryan Wu (cooloney)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Jim Lieb (lieb)
Changed in linux (Ubuntu):
assignee: nobody → Jim Lieb (lieb)
status: Confirmed → In Progress
Changed in linux (Ubuntu):
assignee: Jim Lieb (lieb) → Ubuntu Kernel Team (ubuntu-kernel-team)
status: In Progress → Confirmed
Changed in linux (Ubuntu):
assignee: Ubuntu Kernel Team (ubuntu-kernel-team) → nobody
Changed in linux-source-2.6.22 (Ubuntu):
assignee: Ben Collins (ben-collins) → nobody
Changed in linux:
status: Invalid → Confirmed
Changed in linux:
importance: Unknown → High
339 comments hidden view all 419 comments
Revision history for this message
AvitarX (ddwornik) wrote :

That's what I assumed, but the previous post tricked me.
On Mar 15, 2011 12:34 PM, "Omer Akram" <email address hidden> wrote:
>>
>> Is ubuntu going to throw out upstart?
>>
>>
> Simple answer: no.

Revision history for this message
tankdriver (stoneraider-deactivatedaccount) wrote :

Testing oneiric beta + updates,
Under high I/O load the mouse pointer now feels very choppy (e.g. freezes for 1 second).
Can someone confirm this change from natty > oneiric?

Revision history for this message
cometdog (ericctharley) wrote :

Incredibly bad responsiveness under heavy IO for me on Oneiric. My only recent point of comparison is Lucid. Unfortunately it's not completely fair since I had a different HDD setup then. But in any case, desktop gets nearly unusable when starting up a program, etc. Freezes for multiple seconds at a time.

Revision history for this message
Vadim Peretokin (vperetokin) wrote :

Yeah. Anytime a system has to swap, you know it because your desktop
freezes.

Revision history for this message
Ofer Chen (oferchen) wrote :

I switched to using zramswap-enabler instead of a real swap partition. It makes things a lot better if you have the RAM.

sudo add-apt-repository ppa:shnatsel/zram && sudo apt-get update && sudo apt-get install zramswap-enabler

Changed in linux:
status: Confirmed → Fix Released
Revision history for this message
Francisco J. Yáñez (fjyaniez) wrote :

5 years later... too late :(

I had to change to another OS after 8 years using linux... I won't get back now.

Revision history for this message
Vadim Peretokin (vperetokin) wrote :

I don't think it was actually fixed, if you look at the upstream report.

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Well, yes, many years ago on much less powerful machines, I could play MP3s, and my friends and I had a sort of competition to see who could interrupt the music by doing I/O. It was quite hard. Since then, as far as I can tell, the situation has become steadily worse, which is especially odd given that I have been using more and more powerful machines. Now almost anything I do that generates some I/O stops the whole desktop: gnome-terminal windows stay white (not updated) for long seconds (sometimes even a minute!), and sometimes even the mouse can't be moved. And no, it can't be a hardware problem, as I have noticed it on many different machines with totally different hardware (SCSI, "normal" IDE/PATA, SATA; both 32 and 64 bit kernels/systems; AMD/Intel CPUs, etc.) and very different kernels and even distributions (well, Ubuntu and Debian to be precise) over the years. It is true, however, that the worst has come in the last 1-2 years, as far as I can remember, though I could notice things getting worse even before that.

Revision history for this message
Mike Mestnik (cheako) wrote :

I had this issue, I've always had this issue. It gets really bad if your disk is doing bad sector relocation(s)... then the desktop/GUI and mouse can freeze for 15 minutes.

Revision history for this message
laksdjfaasdf (laksdjfaasdf) wrote :

@Canonical: Why don't you make the lowlatency kernel the default instead of the generic kernel? This should solve the problem of bad responsiveness in the graphical user interface.

Even if the throughput isn't getting better with lowlatency kernel - it feels much faster if your mouse pointer moves _without_ dropouts or menus pop up instantly under heavy disk I/O.

On graphical desktops it's not always the real throughput that makes the system feel fast, but the responsiveness! Even if it takes a second longer to copy a big file, your system "feels" much faster if the mouse pointer still moves _without_ dropouts or menus pop up instantly.

Revision history for this message
Ronan Jouchet (ronj) wrote :

Interesting proposal. Are you sure about that claim, Felix? Do you
have data to support it?

Now that linux-lowlatency is in universe and is just a build with
different option of the same kernel, it might not be risky at all, and
if that's a real win for responsiveness (which is definitely an
important metric), using -lowlatency by default can be something to
suggest to the kernel team.

Revision history for this message
Jakob Lenfers (jakob-drss) wrote :

Thanks a lot Felix, just as an FYI for others: This helped me a lot. Writing this from an old (was a 08.04 IIRC) and often updated Ubuntu server 12.04 installation and I switched from the server kernel to the lowlatency one. Now I can run updatedb and start Thunderbird while music is running. I'm embarrassed to say that, but I haven't been able to do that (without a lot of ionice -c3) for quite some time. This makes this computer usable for me again. I just hope that the nvidia driver stops making problems with my onboard card soon and my old server & desktop is golden again. :)

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

OK, but the odd thing is that in the "old times" everything was much, much better even with the regular kernel (so no special low-latency one etc.) on much, much weaker hardware than now :(

Revision history for this message
yarly (ih8junkmai1) wrote :

I agree with comments by Vorname Nachname (post #390). The low-latency kernel provides for a much more responsive desktop. Differences between linux-meta-lowlatency and linux-meta-generic are profound when running in a LUKS environment with FDE.

Revision history for this message
Adam Porter (alphapapa) wrote :

It's very true that years ago I/O latency was much less of a problem
with Linux. When I first started using Debian full-time about ten
years ago, I never had problems with music skipping or anything like
that. I guess in the kernel development since then, throughput has
been prioritized over latency. Nowadays with 3.8 kernels and the same
hardware, it's trivial to make my music player skip under load, even
when its buffer is set to 30000 ms.

I haven't thought of trying the lowlatency kernel, so thanks for that
idea. I will be trying that!

Besides that, I wish Ubuntu would make BFQ the default I/O scheduler
(or at least build it in by default so we can easily switch to it,
instead of having to build kernels or install from third-party repos).
 Check out this video from a year ago:

http://youtu.be/J-e7LnJblm8

Seems obvious to me that BFQ is the way to go for desktops.

I have noticed lately that Deadline seems to result in less music
skipping than CFQ, so I can see why Deadline is the default now. But
Deadline doesn't support ionice, so I can't do things like run backups
or upgrades in the background at minimum I/O priority.

Revision history for this message
penalvch (penalvch) wrote :

Jamie McCracken, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? If so, could you please test for this with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ .

If it remains an issue, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect -p linux <replace-with-bug-number>

Also, could you please test the latest upstream kernel available (not the daily folder, but the one at the top) following https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional upstream developers to examine the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.13-rc4

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description. As well, please remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.

no longer affects: linux-source-2.6.22 (Ubuntu)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Vadim Peretokin (vperetokin) wrote :

IO is still an issue on every Ubuntu machine I've used - whenever it
becomes heavily used, everything else slows down, sometimes drastically.
What is there to test - has anything been done to address it?

Revision history for this message
penalvch (penalvch) wrote :

Vadim Peretokin, so your hardware may be tracked, could you please file a new report by executing the following in a terminal while booted into an Ubuntu repository kernel (not a mainline one) via:
ubuntu-bug linux

For more on this, please read the official Ubuntu documentation:
Ubuntu Bug Control and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Kernel Team: https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Thank you for your understanding.

Revision history for this message
vsuarez (vsuarez) wrote :

Can this be related with this issue?

http://lwn.net/Articles/572911/

Revision history for this message
penalvch (penalvch) wrote :

vsuarez, so your hardware may be tracked, could you please file a new report by executing the following in a terminal while booted into an Ubuntu repository kernel (not a mainline one) via:
ubuntu-bug linux

For more on this, please read the official Ubuntu documentation:
Ubuntu Bug Control and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Kernel Team: https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Thank you for your understanding.

Revision history for this message
Vadim Peretokin (vperetokin) wrote :

I don't think it is related to http://lwn.net/Articles/572911/ because it
is a 32bit machine.

I'll file the report later when I've got access to the said machine.

Revision history for this message
Adam Niedling (krychek) wrote :

Christopher M. Penalver: are you going to tell all the 165 people that are affected by this bug to open a new bug report for the same issue which is not even hardware related?

If you just took a minute you could test this bug yourself instead of require us to do all that work to test the latest mainline kernel.

I think you are just mass closing linux kernel related bugs that are still valid and affect many people. Some of them have upstream bug reports which indicate that no actual work has been done to address those issues. So why do testing? Even if someone does the testing most likely no work will be done by downstream to fix the issue. So what's the point? I think doing what you're doing is just making more harm than good.

Revision history for this message
penalvch (penalvch) wrote :

Adam Niedling, thank you for your comments regarding them:
"...are you going to tell all the 165 people that are affected by this bug to open a new bug report..."

Given the Bug Description is so vague it's largely useless "heavy disk I/O causes increased iowait times", if one has a performance problem, and for hardware tracking purposes, then one would want to file a new report. For more on this, please read the official Ubuntu documentation:
Ubuntu Bug Control and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Kernel Team: https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

"...for the same issue which is not even hardware related?"

This is speculation at best.

"If you just took a minute you could test this bug yourself instead of require us to do all that work to test the latest mainline kernel."

I've never had heavy disk I/O affect desktop responsiveness with my hardware, both with a HDD 3GB RAM, and now SSD with 8GB.

"I think you are just mass closing linux kernel related bugs that are still valid and affect many people."

This is also speculation at best, and incorrect. I've never mass closed any bugs anywhere, and your baseless accusations are not appreciated.

"Some of them have upstream bug reports which indicate that no actual work has been done to address those issues."

One having filed an upstream bug report, on a tracker which has no permission restrictions on who can file, is largely irrelevant if the full hardware isn't known, it hasn't been tested in the latest mainline kernel, it hasn't been bisected if a regression, and doesn't have specific, objective metrics demonstrating the issue.

"So why do testing?"

Testing gets a bug report one step closer to a fix. The best question is why do the complaining, which gets you nowhere? ;)

"Even if someone does the testing most likely no work will be done by downstream to fix the issue."

More incorrect speculation. Downstream has the same information requirements as upstream, as previously noted. No developer is going to take a strong interest in working on any problem, up or down, without it.

"So what's the point? I think doing what you're doing is just making more harm than good."

Wasting time arguing about things previously documented and discussed ad nauseam would be considered doing more harm than good, with the time better spent actually doing the testing and bug report filing previously requested.

If you have further comments, please refrain from making them in this report, as you are not the original reporter, and it already has quite enough "Me too!" and "Why isn't this fixed already?" comments. Instead, you are welcome to contact me directly, and/or redirect them to the appropriate mailing list or forum. http://www.ubuntu.com/support/community/mailinglists might be a good start for determining which mailing list to use.

Thank you for your understanding.

Revision history for this message
Adam Niedling (krychek) wrote :

Thanks for analysing each and every sentence of mine one by one.
Who says only the original reporter can comment on bugs? I'm not the original reporter, I'm just somebody who is affected by this bug which you are trying to close in a very crafty way. It's not a speculation that you're doing this all the time, you did this to 2 or 3 of my own bugs. I'm getting tired of you pasting the same text everywhere. Maybe you're pasting it to hundreds of bugs. There is no effort in pasting some text. However you are asking people to do a lot of work which takes huge effort. Most of the time it's completely unnecessary cause no one has made anything to fix the issue.

"Hey! No developer has ever touched this bug but let's ask the poor user who is suffering from it a ton of questions and half day of working and testing the latest mainline kernel maybe he won't be able to do it or just simply has no idea how to do it so we can close this completely valid bug! And let's just ignore the bug even if the poor user does all that work ha ha ha..... Oh yeah and make sure to paste lots of links about etiquette and what not so I will look official even though I'm not working for Canonical I'm just messing around with people's bugs."

Revision history for this message
Ronan Jouchet (ronj) wrote :

Adam Niedling wrote:
  "I'm just somebody who is affected by this bug which you are trying to close in a very crafty way. It's not a speculation that you're doing this all the time, you did this to 2 or 3 of my own bugs. I'm getting tired of you pasting the same text everywhere. Maybe you're pasting it to hundreds of bugs. There is no effort in pasting some text. However you are asking people to do a lot of work which takes huge effort. Most of the time it's completely unnecessary cause no one has made anything to fix the issue.
  "Hey! No developer has ever touched this bug but let's ask the poor user who is suffering from it a ton of questions and half day of working and testing the latest mainline kernel maybe he won't be able to do it or just simply has no idea how to do it so we can close this completely valid bug! And let's just ignore the bug even if the poor user does all that work ha ha ha..... Oh yeah and make sure to paste lots of links about etiquette and what not so I will look official even though I'm not working for Canonical I'm just messing around with people's bugs."

>> I can definitely recognize some of the behavior described here by Adam, and also suffered from it in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/908691 . In my case I could even pinpoint a specific mainline commit, but my inability to do the non-mainline git bisect requested by M. Penalver meant my request fell on deaf ears. I closed my own bug diplomatically, but it was an extremely disappointing experience to see so little response for all the effort I put in.

I understand Canonical must have lots of bug triage to do, but I'd too love a little more humanity in processing them. Canned answers and strict protocol don't show a lot of empathy, and don't echo into much user love.

Revision history for this message
penalvch (penalvch) wrote :

Quoting from https://bugs.launchpad.net/ubuntu/+source/linux/+bug/336652/comments/15 :
"this is a serious issue but only affects limited hardware..."

Revision history for this message
Adam Niedling (krychek) wrote :

And who is to say that comment #15 is not just a mere speculation at best? What does he mean by limited hardware? Every comp that has HDD and not SSD?

You really had someone's absolutely valid bug report closed because he wasn't able to do a git bisect? Just how many times did you do that? Who gave you the authority? How do you benefit from these kinds of things?

Just as Ronan has said: please show a little more empathy and stop talking to people like a robot with your canned comments.

Revision history for this message
Vadim Peretokin (vperetokin) wrote :

I'm surprised this is being debated. Look at Google:
https://www.google.com.au/search?q=linux+high+io+desktop&oq=linux+high+&aqs=chrome.0.69i59j69i57j69i64l2.1936j0j1&sourceid=chrome&ie=UTF-8

You will clearly see that high enough IO will harm desktop responsiveness.
Surely all of these people aren't making it up?

Revision history for this message
Adam Niedling (krychek) wrote :

Now Christopher is onto me. He started vandalizing another of my bug reports. Bug #1247189.

Changed in linuxmint:
status: New → Invalid
Revision history for this message
Davide Depau (depau) wrote :

This issue is not getting enough attention. I don't know if you all have SSDs but most people don't. On hard disk drives this is a huge issue. System responsiveness drops when tracker is running and pretty much nothing else can run smoothly while it's running, even on computers with fast CPU/large amounts of RAM. The I/O is often the cause of system slowdown and this needs to be reduced as much as possible.
I'm sure this issue can be fixed, a background daemon doesn't need to run at full speed, it can be niced to 19, and internal fixes can be made.
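
As a rough illustration of the "niced to 19" idea, the sketch below lowers both the CPU and the I/O priority of an already running process with the standard renice and ionice tools. The function name deprioritize and the RUN variable (set RUN=echo for a dry run) are made up for this example, and the idle I/O class is only honoured by I/O schedulers that implement priorities, such as CFQ.

```shell
#!/bin/sh
# Sketch: lower the CPU and I/O priority of a running process.
# "deprioritize" and "RUN" are hypothetical names used only in this example.
deprioritize() {
    pid=$1
    # CPU: nice value 19 is the lowest scheduling priority.
    ${RUN:-} renice -n 19 -p "$pid" || return 1
    # I/O: class 3 ("idle") is served only when no other process wants the disk.
    ${RUN:-} ionice -c 3 -p "$pid" || return 1
}
```

For example, deprioritize "$(pidof trackerd)" would apply this to the tracker daemon, assuming a single trackerd process is running. This only changes how competing requests are ordered; it does not limit how much I/O the daemon submits.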

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
penalvch (penalvch) wrote :

Davide Depau, it would help immensely if you filed a new report via a terminal:
ubuntu-bug linux

Please feel free to subscribe me to it.

no longer affects: linux (Ubuntu)
affects: linuxmint → linux (Ubuntu)
no longer affects: linux (Ubuntu)
affects: linux → linux (Ubuntu)
Changed in linux (Ubuntu):
importance: High → Undecided
status: Fix Released → New
importance: Undecided → Low
status: New → Incomplete
information type: Public → Public Security
information type: Public Security → Public
Revision history for this message
god (humper) wrote :

I can observe this even on ssd with both ubuntu and mainline kernels. Especially when some background task like updatedb.mlocate, which spits out an fs-wide find, is triggered.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
penalvch (penalvch) wrote :

god (humper), please file a new report via a terminal:
ubuntu-bug linux

Feel free to subscribe me to it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
god (humper) wrote :

done.

Revision history for this message
AZ (m-dev) wrote :

@Christopher: This is not incomplete. Thanks.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

It might not be good to stir up such an old bug, but it regularly gets updates and new complaints, so maybe a new approach might help.

So let us make one thing clear, IMHO if something overloads your machine with disk I/O it has to stall it.
So the solution paths are more like this:
a) beat it with more Processing / IO HW
b) mitigate the effect as far as possible
c) avoid the overload before it starts

The issue is a common one - so I'll keep my explanations general and not specific to trackerd or any other case that was mentioned before.

### a) beat it with more Processing / IO HW ###
There are far more expensive machines out there which can handle far more I/O without being slowed down. The reason is that they have more I/O cards, virtual functions to spread the handling over CPUs and, at the high end, servers with totally different I/O IRQ designs.
We should agree that on cheap/slow or even medium machines I/O overload just *IS* an issue for responsiveness.
But that isn't important - the question is what a normal user can do about it, and spending x000000 $ on a machine isn't the solution.

### b) mitigate the effect as far as possible ###
So regarding mitigation there were already some approaches in this bug discussion,
like using ionice and several dirty ratio tunings, but none of these prevent the I/O overload.
E.g. if you overload the system with only "Best Effort" I/O class, the only difference it makes is that "other I/O" might pass faster, but your system is still fairly busy => unresponsive.
Dirty ratios, in turn, come down to making a process spend its remaining time slice cleaning up dirty memory as soon as a certain level is reached. While you can configure higher ratios (at the price of endangering integrity), that won't stop the burst of I/O either. Instead it allows more data to be submitted to dirty the page cache, and thereby indirectly more I/O that overloads the system again.
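
To make the "certain level" above more tangible, here is a small sketch that estimates the dirty-page thresholds implied by the current ratio settings. The helper names pct_of_kb and show_dirty_thresholds are invented for this example, and MemAvailable from /proc/meminfo is only an approximation of the memory the kernel actually counts as dirtyable; if vm.dirty_bytes is in use instead of the ratios, the ratio files read as 0 and this estimate does not apply.

```shell
#!/bin/sh
# Sketch: estimate the dirty-page thresholds implied by the vm.dirty_* ratios.
# pct_of_kb and show_dirty_thresholds are hypothetical helper names.
pct_of_kb() {
    # $1 = memory in kB, $2 = percentage; prints the resulting size in kB
    echo $(( $1 * $2 / 100 ))
}

show_dirty_thresholds() {
    # MemAvailable is an approximation of what the kernel treats as dirtyable memory.
    avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
    bg=$(cat /proc/sys/vm/dirty_background_ratio)
    fg=$(cat /proc/sys/vm/dirty_ratio)
    echo "background writeback starts near $(pct_of_kb "$avail_kb" "$bg") kB of dirty pages"
    echo "writing processes are made to flush near $(pct_of_kb "$avail_kb" "$fg") kB of dirty pages"
}
```

Calling show_dirty_thresholds before and after changing a ratio shows how much more dirty data a burst of writes may accumulate before the writer is slowed down.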

### c) avoid the overload before it starts ###
It must be said that this bug dates back to 2007 and a lot of the reports are related to I/O+*sync; since then, various filesystem and general kernel improvements have been made just for sync & journaling. Several posts in this bug confirm this already.
What I didn't see is people trying to throttle the processes that overload the system.
Throttling is described at => https://www.kernel.org/doc/Documentation/cgroups/blkio-controller.txt
Like any approach, this one has certain limitations, but it is a new way to tackle the overall issue.
It also needs certain cgroup and filesystem features (like accounting writeback through the page cache) which might only be available in modern Ubuntu releases.
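
A minimal sketch of such throttling, assuming a cgroup v1 blkio hierarchy mounted at /sys/fs/cgroup/blkio and root privileges. The blkio.throttle.* files and the "major:minor bytes_per_second" format come from the blkio-controller.txt document linked above; the function name throttle_pid, the BLKIO_ROOT override and the group name are made up for this example.

```shell
#!/bin/sh
# Sketch: cap a process's disk bandwidth with the cgroup v1 blkio throttle files.
# throttle_pid and BLKIO_ROOT are hypothetical names used only in this example.
throttle_pid() {
    group=$1   # name of the cgroup to create, e.g. "indexer"
    dev=$2     # block device as major:minor, e.g. "8:0"
    bps=$3     # limit in bytes per second
    pid=$4     # process to move into the group
    root=${BLKIO_ROOT:-/sys/fs/cgroup/blkio}
    mkdir -p "$root/$group" || return 1
    echo "$dev $bps" > "$root/$group/blkio.throttle.read_bps_device" || return 1
    echo "$dev $bps" > "$root/$group/blkio.throttle.write_bps_device" || return 1
    # Move the whole process into the group; children started later inherit it.
    echo "$pid" > "$root/$group/cgroup.procs"
}
```

For instance, throttle_pid indexer 8:0 1048576 "$(pidof trackerd)" would limit a single trackerd process to roughly 1 MiB/s of reads and writes on the device with major:minor 8:0 (the numbers for a given disk can be looked up with lsblk). Whether buffered writes are covered depends on the writeback accounting features mentioned above.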

### Experiment ###
As an experiment to prove the solution I use the tools fio and latencytop to compare:
1. no background load checking latencytop
2. running a random read/write multithreaded fio in background checking latencytop
3. running a throttled random read/write multithreaded fio in background checking latencytop

# Background Load #
A fio job file like this:

[global]
ioengine=libaio
rw=randrw
bssplit=1k/25:4k/50:64k/25
size=512m
directory=/home/paelzer/latencytest
iodepth=8

[dio]
direct=1
numjobs=8

[pgc]
direct=0
numjobs=8

# Case 1 - No backgroun...


Revision history for this message
AZ (m-dev) wrote :

Thanks for driving this forward.

You argue from
> So let us make one thing clear, IMHO if something overloads your machine with disk I/O it has to stall it.

This is a bit tricky, because overload means that the machine will not be able to complete all tasks in the time given, i.e. tasks will accumulate until the resources are exhausted. Though, we usually do not have this situation on desktop machines. There we have tasks to do and want them to complete as fast as possible, though some tasks may take longer than others. For example, copying a 5 GB DVD will take some minutes or so, but refreshing the browser window or switching windows never should. Overload here would mean the user will turn off the machine and buy a Windows licence.

So this bug is mostly about having too big delays in applications using only a small bit of the available resources (like when switching back to a libreoffice window) when some other applications (like background file indexing) are asking for the remaining disk io resource capacities.

> Code improves to mitigate effects but can never be perfect for *ALL* users at once (especially in the default config)

I do not agree. Desktop responsiveness was achieved with older Ubuntu versions on the given hw and is achieved with other operating systems (Windows) on a broad range of hardware. I believe desktop responsiveness is something sufficiently specific that a cpu and io scheduler can be tuned for it. Using cgroups and the like might help, but should be configured by Ubuntu by default if necessary.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi AZ,
thanks for your feedback.

>> IMHO if something overloads your machine with disk I/O it has to stall it.
> This is a bit tricky, because overload means that the machine will not be able to complete all tasks in the time given, i.e. tasks will accumulate until the resources are exhausted. Though, we usually do not have this situation on desktop machines.

Excuse me - I didn't want to phrase it too harshly - it is surely ok to assume that a system stays responsive :-)
But when you add a background indexer like in the initial example you add some serious load.
The system might add a few other things and at some point it is overloaded.
Would you agree to modify your "Though, we usually do not have this situation on desktop machines." to "Though, we usually *should* not have this situation on desktop machines."?

Because that is the point where my suggestion of "throttling the few loads that cause these situations" kicks in.

> So this bug is mostly about having too big delays in applications using only a small bit of the available resources (like when switching back to a libreoffice window) when some other applications (like background file indexing) are asking for the remaining disk io resource capacities.

I think of an overload case as one where e.g. a process submits requests as fast as it can (especially with asynchronous I/O, a process can quickly queue up hundreds of I/Os at the block device layer).
Now what should a process scheduler or I/O scheduler do?
1. handle them asap -> achieve good throughput, but might add some stalls to the system
2. throttle them -> improves responsiveness by avoiding overload, but this comes at certain prices
2a) if the process scheduler stalls it people start to ask "there is nothing else on the runqueue, why isn't it running? I want to get all I can from my HW".
2b) if the I/O scheduler stalls it people start to ask "hey my device could handle way more, why isn't it fully utilized with the request queue being filled" (remember all the "fun" people had with anticipatory scheduler)

Both 2a and 2b existed in various patches/tunings and almost every time the decision was that "the default" can not be to stall too much because there are different demands.

That was the reason why I personally didn't think about a cool new piece of code (which surely someone could write), but instead about a good mitigation of the most frequent cases with tools that are already there (like the cgroup io throttling I suggested).

>> Code improves to mitigate effects but can never be perfect for *ALL* users at once (especially in the default config)
>I do not agree.

Long story short - a default configuration has to be a tradeoff trying to make everyone happy but no one sad (hard job).

> Desktop responsiveness was achieved with older ubuntu versions on the given hw and is achieved with other operating systems (windows) on a broad range of hardware.

I'm coming from the server world, and there I/O throughput, I/O latency and even process latency and fairness are clearly superior compared to older releases as well as compared to other OSes.
But that doesn't negate your experience - it is just a different one.

> I believe desktop ...


Revision history for this message
god (humper) wrote :

In my case ( see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1460985 ) the culprit generating huge I/O throughput was in /etc/cron.daily/man-db

It's such a long-standing and persistent bug that the default advice I give nowadays to people complaining about their ubuntu "got stuck again" is to run "sudo killall -9 find".

That's really a shame:
- it's not some random IO spike coming from nowhere
- it's not 3rd-party, it's in default install
- it's reproducible

Yet, we still don't even have a workaround, let alone proper policing of the I/O of all the background tasks shipped in the default ubuntu install.

Hopefully migration to systemd timer units would help tackle it.

Displaying first 40 and last 40 of 419 comments.