Frequent swapping causes system to hang

Bug #595047 reported by Pete Goodall on 2010-06-16
This bug affects 6 people
Affects: linux (Ubuntu) | Importance: Undecided | Assigned to: Unassigned

Bug Description

Periodically I notice my system slows to a near standstill, and the hard drive light is constantly on. There seems to be a massive amount of disk I/O, and it lasts for a long time (let's say 30 minutes, to put a number on it). I installed and ran iotop (`iotop -a`) and it seems to point to jbd2. From what I can see, jbd2 is related to ext4 journaling, but I cannot figure out how to kill this operation. It might even be a red herring, because I have also stopped the disk activity by killing either chromium or firefox. I need to understand what else I can do to troubleshoot this.
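As a starting point (a sketch, assuming a typical ext4 setup): jbd2 kernel threads cannot be killed, but their names reveal which device's journal they service, which narrows down the busy filesystem.

```shell
# Sketch: list jbd2 kernel threads; a name like "jbd2/sda1-8" means the
# thread services the ext4 journal of /dev/sda1.
ps -eo pid,comm | grep 'jbd2/' || echo "no jbd2 threads found"
```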

$ lsb_release -rd
Description: Ubuntu maverick (development branch)
Release: 10.10

Up-to-date as of 16th June 2010.
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC272X Analog [ALC272X Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC272X Analog [ALC272X Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: pgoodall 1372 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0x56440000 irq 44'
   Mixer name : 'Realtek ALC272X'
   Components : 'HDA:10ec0272,1025022c,00100001'
   Controls : 14
   Simple ctrls : 8
DistroRelease: Ubuntu 10.10
Frequency: Once a day.
HibernationDevice: RESUME=UUID=145f27a9-859a-4987-8132-ac878c832747
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Alpha i386 (20100602.2)
MachineType: Acer AO531h
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.35-6-generic root=UUID=11f96f8b-5e04-4e20-a201-0fa5d0fc07fa ro quiet splash
ProcEnviron:
 PATH=(custom, user)
 LANG=en_GB.utf8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.35-6.9-generic 2.6.35-rc3
Regression: Yes
RelatedPackageVersions: linux-firmware 1.37
Reproducible: No
Tags: maverick ubuntu-une kconfig regression-potential needs-upstream-testing
Uname: Linux 2.6.35-6-generic i686
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
dmi.bios.date: 12/22/2009
dmi.bios.vendor: Acer
dmi.bios.version: v0.3304
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.vendor: Acer
dmi.board.version: Base Board Version
dmi.chassis.type: 1
dmi.chassis.vendor: Chassis Manufacturer
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAcer:bvrv0.3304:bd12/22/2009:svnAcer:pnAO531h:pvr1:rvnAcer:rn:rvrBaseBoardVersion:cvnChassisManufacturer:ct1:cvrChassisVersion:
dmi.product.name: AO531h
dmi.product.version: 1
dmi.sys.vendor: Acer

This is an attempt at bringing sanity to bug #7372. Please only comment here if you are experiencing high I/O wait times and poor interactivity on reasonable workloads.

Latest working kernel version: 2.6.18?

Problem Description:
I/O operations on large files tend to produce extremely high iowait times and poor system I/O performance (degraded interactivity). This behavior can be seen to varying degrees in tasks such as:
 - Backing up /home (40GB with numerous large files) with diffbackup to external USB hard drive
 - Moving messages between large maildirs
 - updatedb
 - Upgrading large numbers of packages with rpm

Steps to reproduce:
The best synthetic reproduction case I have found is,
$ dd if=/dev/zero of=/tmp/test bs=1M count=1M
During this copy, I/O wait times are very high (70-80%) with extremely degraded interactivity, although throughput averages about 29MB/s (about the disk's capacity, I think). Even starting a new shell takes minutes, especially after letting the machine copy for a while without being actively used. Could this mean it's a caching issue?
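A rough way to quantify the stall (a sketch; the write is scaled down to 1 GiB, and the "wa" column position can vary between vmstat versions):

```shell
# Sketch: run a buffered write in the background and sample the "wa"
# (iowait) column of vmstat once a second while it proceeds.
dd if=/dev/zero of=/tmp/test bs=1M count=1024 2>/dev/null &
DD_PID=$!
# With the default 17-column vmstat layout, field 16 is "wa".
vmstat 1 5 | awk 'NR > 2 { print "iowait%: " $16 }'
wait "$DD_PID"
rm -f /tmp/test
```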

For the record, this is even reproducible with Linus's master.

I'm also having this problem.

Latest working kernel version: 2.6.18.8 with config:
http://svn.pardus.org.tr/pardus/2007/kernel/kernel/files/pardus-kernel-config.patch

Currently working on 2.6.25.20 with config:
http://svn.pardus.org.tr/pardus/2008/kernel/kernel/files/pardus-kernel-config.patch

Tested also with 2.6.28 and felt no significant performance improvement.

--

Heavy disk I/O, such as running 'svn up', hogs the system and prevents starting a new shell, browsing the internet, doing some text editing in vim, etc.

For example, after managing to open a text buffer in vim, 4-5 second delays occur between consecutive search attempts.


Hello Ben,

I don't know where to post it exactly. Why Linux Memory Management? And why -mm and not mainline? Can you do it for me, please?

I have added a second test case, which uses threads with pthread_mutex and pthread_cond instead of processes with pipes for communication, to confirm it is a CPU scheduler issue.

I have repeated the tests with some vanilla kernels again, as there is a remark in the bug report about tainted or distro kernels. As I got a segmentation fault with the 2.6.28 kernel, I added the result of the Ubuntu 9.04 kernel (see attachment). The results are not comparable to the results posted before, as I have changed the time handling (doubles instead of int32_t, as some echo messages take more than one second).
The first three results are 2*100, 2*50 and 2*20 processes exchanging 100k, 200k and 1M messages over a pipe. The last three results are 2*100, 2*50 and 2*20 threads exchanging 100k, 200k and 1M messages with pthread_mutex and pthread_cond. I have added a 10 second pause at the beginning of every thread/process to ensure the 2*100 processes or threads are all created and start to exchange messages at nearly the same time. This was not the case in the old test case with 2*100 processes, as the first thread was already destroyed before the last was created.

With the second test case using threads, I hit the problems (threads:2*100/msg:1M) immediately with kernel 2.6.22.19. Kernel 2.6.20.21 was fine with both test cases.

The meaning of the results:
- min message time
- average message time (80% of the messages)
- message time at median
- maximal message time
- test duration

Here are the results.
Linux balrog704 2.6.20.21 #1 SMP Wed Jan 14 10:11:34 CET 2009 x86_64 GNU/Linux
min:0.000ms|avg:0.241-0.249ms|mid:0.244ms|max:18.367ms|duration:25.304s
min:0.002ms|avg:0.088-0.094ms|mid:0.093ms|max:17.845ms|duration:19.694s
min:0.002ms|avg:0.030-0.038ms|mid:0.038ms|max:564.062ms|duration:38.370s
min:0.002ms|avg:0.004-0.007ms|mid:0.004ms|max:1212.746ms|duration:33.137s
min:0.002ms|avg:0.004-0.005ms|mid:0.004ms|max:1092.045ms|duration:31.686s
min:0.002ms|avg:0.004-0.007ms|mid:0.004ms|max:4532.159ms|duration:59.773s

Linux balrog704 2.6.22.19 #1 SMP Wed Jan 14 10:16:43 CET 2009 x86_64 GNU/Linux
min:0.003ms|avg:0.394-0.413ms|mid:0.403ms|max:19.673ms|duration:42.422s
min:0.003ms|avg:0.083-0.188ms|mid:0.182ms|max:13.405ms|duration:37.038s
min:0.003ms|avg:0.056-0.075ms|mid:0.070ms|max:656.112ms|duration:72.943s
min:0.003ms|avg:0.005-0.010ms|mid:0.007ms|max:1756.113ms|duration:49.163s
min:0.003ms|avg:0.005-0.010ms|mid:0.007ms|max:11560.976ms|duration:52.836s
min:0.003ms|avg:0.008-0.010ms|mid:0.010ms|max:5316.424ms|duration:111.323s

Linux balrog704 2.6.24.7 #1 SMP Wed Jan 14 10:21:04 CET 2009 x86_64 GNU/Linux
min:0.003ms|avg:0.223-0.450ms|mid:0.428ms|max:8.494ms|duration:46.123s
min:0.003ms|avg:0.140-0.209ms|mid:0.200ms|max:12.514ms|duration:39.100s
min:0.003ms|avg:0.068-0.084ms|mid:0.076ms|max:38.778ms|duration:78.157s
min:0.003ms|avg:0.454-0.784ms|mid:0.625ms|max:11.063ms|duration:65.619s
min:0.004ms|avg:0.244-0.399ms|mid:0.319ms|max:21.018ms|duration:64.741s
min:0.003ms|avg:0.061-0.138ms|mid:0.111ms|max:23.861ms|durati...


Created attachment 19795
test case with processes and pipes

Created attachment 19796
test case with threads and mutexes

Pete Goodall (pgoodall) wrote :

I think this is related to swap, actually. I have decreased swappiness to 10 and will see if I encounter this again. Not sure what the next steps are if that works.

Pete Goodall (pgoodall) wrote :

Apparently reducing the swappiness didn't help. My system is currently hopelessly locked (and I'm writing this on another device on my desk). I'm ssh'ed into my netbook and running `iotop -a`. Here are the top two items:

    1 be/4 root 10.34 M 0.00 B ?unavailable? init
   33 be/4 root 16.00 K 340.00 K ?unavailable? [kswapd0]

The init read numbers keep going up (currently 10.86 MB) and the kswapd write numbers are going up as well.

Pete Goodall (pgoodall) wrote :

To add to the last comment, kswapd has just leaped to over 7 MB written and is rising fast! This is within a span of a couple of minutes.

affects: ubuntu → linux (Ubuntu)
Pete Goodall (pgoodall) wrote :

Ok, from what I can tell the system is swapping way too easily. If I'm running Chromium + XChat I'm fine. As soon as I open OpenOffice or Evolution or Rhythmbox that seems to push the memory over the edge and the system starts swapping. If I can manage to get to a terminal in time and kill the last application I started my system will return to normal. If I don't, I might as well just hard power off. I have set the swappiness to 0 in /etc/sysctl.conf.

Afaict there is no one program that is causing the system to swap. It just seems that Ubuntu is oversensitive. There was no problem with this workload on Lucid, so I don't think I'm overstressing the system.
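For reference, the swappiness change described above can be sketched like this (a sketch; the value 10 mirrors the earlier attempt, and /etc/sysctl.conf is the usual Ubuntu location):

```shell
# Sketch: change swappiness at runtime and persist it across reboots.
sudo sysctl vm.swappiness=10                              # immediate effect
echo 'vm.swappiness = 10' | sudo tee -a /etc/sysctl.conf  # persists
cat /proc/sys/vm/swappiness                               # verify
```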

Jeremy Foshee (jeremyfoshee) wrote :

Hi Pete,

Please be sure to confirm this issue exists with the latest development release of Ubuntu. ISO CD images are available from http://cdimage.ubuntu.com/daily/current/ . If the issue remains, please run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 595047

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Pete Goodall (pgoodall) wrote :

Per some advice I received on #ubuntu-kernel I'm attaching some debugging output. Here is a list of the files:

vmstat-log-7-July-2010.txt: Output of `vmstat 1 60` once the system started swapping
free.txt: Output of `free -m` a minute later
free-2.txt: Output of `free -m` about 10 seconds after I ran free the first time
top-ouput.txt: Output of `top -b -n 1`

In case it isn't reflected in the top output, I'm running Chromium with four (?) tabs open (no Flash site or anything like that), Xchat connected to two servers and with about seven channels open, Evince viewing a simple pdf document, a Nautilus window, one gnome-terminal window and Banshee playing an mp3. Again, this is a normal load for me that didn't have a problem with previous versions of Ubuntu on the same device.

Pete Goodall (pgoodall) wrote :

I made the last comment before I read Jeremy's comments (should have refreshed the bug report). Anyway, I'll attach the output he is looking for and be sure to test the upstream kernel as well.

apport information

tags: added: apport-collected
description: updated

I have reproduced the bug using the mainline kernel. I installed linux-image-2.6.35-999-generic from http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/, and rebooted running that kernel. I was running Chromium, Xchat and Banshee w/ no problem, so I decided to try to stress the system a bit more. I started Tomboy and opened one of my notes with no problem. I opened Nautilus and the system started swapping. I tried to get a vmstat by ssh'ing into the system, but by the time I was able to login the swapping had subsided. It started again and I ran vmstat to collect some stats. Don't know if this is useful, but will attach it.

tags: removed: needs-upstream-testing
Pete Goodall (pgoodall) wrote :

I noticed there is a 'needs kernel logs' tag. Is there something else you need attached? If so, do you need it attached running both the current maverick kernel and the mainline kernel?

Pete Goodall (pgoodall) wrote :

Marking as 'New' since I think I have supplied all the required information.

Changed in linux (Ubuntu):
status: Incomplete → New
Pete Goodall (pgoodall) wrote :

In desperation I tried re-installing to see if I just had something installed with a serious memory leak. Unfortunately, this has not improved the situation. My device is next to useless until this is resolved.

Pete Goodall (pgoodall) on 2010-08-09
summary: - massive i/o renders the system unusable
+ Frequent swapping causes system to hang
Changed in linux (Ubuntu):
assignee: nobody → Jeremy Foshee (jeremyfoshee)
tags: added: kernel-core kernel-needs-review
removed: needs-kernel-logs
Pete Goodall (pgoodall) wrote :

Apparently there are some patches for what appears to be this issue. Linked from the upstream kernel bug report: http://www.phoronix.com/scan.php?page=news_item&px=ODQ3Mw

Changed in linux (Ubuntu):
assignee: Jeremy Foshee (jeremyfoshee) → nobody
Brad Figg (brad-figg) on 2010-12-03
tags: added: acpi-namespace-lookup
tags: added: acpi-parse-exec-fail
Changed in linux:
status: Unknown → Confirmed
Changed in linux:
importance: Unknown → High
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux:
status: Confirmed → Fix Released

Pete Goodall, thank you for reporting this and helping make Ubuntu better. Maverick reached EOL on April 10, 2012.
Please see this document for currently supported Ubuntu releases:
https://wiki.ubuntu.com/Releases

We were wondering if this is still an issue in a supported release? If so, could you please test for this with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect -p linux <replace-with-bug-number>

Also, could you please test the latest upstream kernel available following https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional upstream developers to examine the issue. Please do not test the kernel in the daily folder, but the one all the way at the bottom. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested and remove the tag:
needs-upstream-testing

This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the text:
needs-upstream-testing

If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested.

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested.

If you are unable to test the mainline kernel, please comment as to why specifically you were unable to test it and add the following tags:
kernel-unable-to-test-upstream
kernel-unable-to-test-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested.

Please let us know your results. Thank you for your understanding.

Helpful Bug Reporting Tips:
https://help.ubuntu.com/community/ReportingBugs

tags: added: maverick
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete

I'm still seeing this.

Setup: Debian 7 Wheezy, amd64 backports kernel (3.11-0.bpo.2-amd64), ~45MB/s write of a low number of large files by rsync (fed through a GBit ethernet link) on an ext3 FS (rw,noatime,data=ordered) in a LVM2 partition on a hardware RAID5.

Observation: The machine (32-core Xeon E5-4650, 192 GB RAM), primarily servicing multiple interactive users via SSH, x2go and SunRay sessions, gets completely unusable during and quite some time after the rsync transfer. TCP connections to SunRay clients time out, IRC connections are dropped, even simple tools like "htop" don't do anything but clear the screen after being started. "iotop" shows a [jbd2/dm-1-8] process on top, reportedly doing "99.99%" I/O (but not reading or writing a single byte, maybe because it's a kernel thread?).

Once I switch from the default CFQ I/O scheduler to "deadline" (echo deadline > /sys/block/sdb/queue/scheduler), the symptoms disappear completely.
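The scheduler switch described above can be sketched as follows (sdb is an example device; adjust to the disk in question):

```shell
# Sketch: the active scheduler is the bracketed entry in this file.
cat /sys/block/sdb/queue/scheduler
# Extract just the active scheduler name:
sed -n 's/.*\[\(.*\)\].*/\1/p' /sys/block/sdb/queue/scheduler
# Switch to deadline (root required; lost on reboot unless set via the
# elevator= kernel parameter or a udev rule):
echo deadline | sudo tee /sys/block/sdb/queue/scheduler
```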

Still facing this bug. Kernel 3.16.

Is it possible to reserve 5% of I/O for the needs of user/other processes? Any fast download or copy eats 99.99% of I/O and the system is hard to use.

I'm curious: this bug was ostensibly fixed years ago, yet I dare everyone who owns an Android smartphone to run a simple test. Invoke any terminal emulator and execute this command:

$ cat < /dev/zero > /sdcard/EMPTY

What's terribly unpleasant is that _all_ CPU cores become busy (more than 75% load), and the CPU jumps into the highest performance state, i.e. frequency, i.e. power consumption. Obviously this is wrong, bad and shouldn't happen. This test is kinda artificial as no Android app can create such a high IO load, but then there are multiple phones out there with either 5GHz MIMO 802.11n or 802.11ac chips which allow up to 80MB/sec throughput which can easily saturate most if not all internal MMC cards and have the same effect as the above command.

Perhaps vanilla kernel bugzilla is not a place to discuss bugs in Android, but latest Android releases usually feature kernels 3.10.x and 4.1.x without that many patches, so this bug is still there. Both these kernels are currently maintained and supported. Android by default never uses SWAP (one of the reasons for this bug).

Go figure.

P.S. Sample apps from Google Play:

* CPU Stats by takke
* Terminal Emulator by Jack Palevich

I've just experienced this issue with 3.19.0-32-generic on Ubuntu.
My KTorrent downloaded files to an NTFS filesystem on a SATA3 drive (fuse; download speed was about 100Mbit/s); simultaneously I copied files from that filesystem to a USB 3.0 flash drive with an NTFS filesystem. That resulted in poor interactive performance: mouse and window lags. The workaround was to suspend the torrent download until the files were copied.

Hardware: One AMD FX(tm)-8320 Eight-Core Processor, 8 GB RAM.

This bug is definitely not fixed. A simple cp from one drive to another makes a huge impact on my desktop. Trying to do an rsync is even worse. It seems to mainly be a problem with large files. My system is old (Athlon II 250), but even an old P3 running Win98 doesn't lag this badly from just copying files.

I'm trying to copy 50GB from one tower to another via USB 3.0 and it is really no fun. If I copy all files at once, the speed decreases constantly; after 30 minutes it copies at 1.0MB/s. If I copy a bunch of directories it is a little bit better, but the speed also decreases. For 2GB my Linux system needs more than an hour. This bug is definitely not fixed. On Windows this USB stick works without that speed loss.

OS: Fedora 24
Kernel: 4.8.15

I've noticed that this doesn't happen every time, even when I use exactly the same USB stick. For my 2GB of files (Eclipse with a workspace and a project) I needed one hour to copy; it started at 60MB/s and decreased to 500KB/s. Now I'm copying 16GB (Android Studio and some other projects) and it only needs about 15 minutes; the copy speed started at 70MB/s and at the end it was 22MB/s. So it also decreases, but not as fast as in my 2GB copy.

It seems kernel developers don't read this topic here; it is much better to write to the mailing list.

Does Jens' buffered writeback throttling patchset solve your issue?

(In reply to bes1002t from comment #642)
> I'm trying to copy 50gb from one tower to another via USB 3.0 and it is
> really no fun. If I would copy all files at once the speed is decreasing
> constantly. After 30 minutes it copies with 1.0MB/s. If I copy a bunch of
> directories it is a littlebit better but also decreases in speed. For 2GB my
> Linux system needs more than an hour. This bug is definitely not fixed. On
> Windows this USB Stick is working without that speed loss.
>
> OS: Fedora 24
> Kernel: 4.8.15

This bug report has nothing to do with the speed of copying data to a USB flash drive. It's about substantially degraded interactivity, which manifests as slowness, and it's hard to believe you can perceive that via an SSH session.

I'm inclined to believe your bug is related to other subsystems like USB.

> It seems Kernel developers not look this topic here, much better to write to
> the mailing list.

Kernel bugzilla has always been neglected. Thousands of bug reports which have zero comments from prospective developers. LKML is a hit and miss too. Your developer skipped your e-mail because he/she was busy? Bad luck.

@bes1002t: I think throughput is a different issue than this, although it might well be related.

But most important would be for someone to create an I/O concurrency/latency benchmark. Maybe the Phoronix Test Suite is an adequate tool for that? It can also be used for automatic bisecting.

I clearly remember pre-2.6.18 times when I had a much inferior machine, and while Gentoo's emerge was compiling stuff in the background with multiple threads, I could browse the web, switch between programs and play an HD stream without any hiccups or stalling.

@bes1002t: Copying to a USB device always starts at the speed of the hard drive, as everything is cached until the write cache is full, and ends at the speed of the USB drive. The write process has to wait until all data is written.

@Artem S. Tashkinov: The stall problems on an SSH session exist, or existed. I migrated an old server with CentOS 6 and copied some VM images; the SSH responsiveness was very bad. I had to wait up to 20 seconds for tab completion.

In many cases it was a swap problem, as the buffers are full and the caches need a long time to be written to a slow USB device. The server starts to swap out process data, even though it's only a very small amount of data. I could increase the overall desktop performance with a RAM upgrade.

Try Kernel 4.10.

>Improved writeback management
>
>Since the dawn of time, the way Linux synchronizes to disk the data written to
>memory by processes (aka. background writeback) has sucked. When Linux writes
>all that data in the background, it should have little impact on foreground
>activity. That's the definition of background activity...But for a long as it
>can be remembered, heavy buffered writers have not behaved like that. For
>instance, if you do something like $ dd if=/dev/zero of=foo bs=1M count=10k,
>or try to copy files to USB storage, and then try and start a browser or any
>other large app, it basically won't start before the buffered writeback is
>done, and your desktop, or command shell, feels unresponsive. These problems
>happen because heavy writes -the kind of write activity caused by the
>background writeback- fill up the block layer, and other IO requests have to
>wait a lot to be attended (for more details, see the LWN article).
>
>This release adds a mechanism that throttles back buffered writeback, which
>makes more difficult for heavy writers to monopolize the IO requests queue,
>and thus provides a smoother experience in Linux desktops and shells than what
>people was used to. The algorithm for when to throttle can monitor the
>latencies of requests, and shrinks or grows the request queue depth
>accordingly, which means that it's auto-tunable, and generally, a user would
>not have to touch the settings. This feature needs to be enabled explicitly in
>the configuration (and, as it should be expected, there can be regressions)


> Try Kernel 4.10.
It doesn't help with my workload :(
The mouse pointer and keyboard input still freeze.

Make sure your kernel has that option enabled.

>This feature needs to be enabled explicitly in
>the configuration (and, as it should be expected, there can be regressions)

Created attachment 255491
$ cat /boot/config-`uname -r`

(In reply to Mikhail from comment #651)

First, I'd recommend trying to disable SWAP completely - it might help:

$ sudo swapoff -a

If you compile your own kernel or your distro hasn't enabled them for you, here's the list of the options you need to enable:

BLK_WBT, enable support for block device writeback throttling
BLK_WBT_MQ, multiqueue writeback throttling
BLK_WBT_SQ, single queue writeback throttling

They are all under "Enable the block layer".

If disabling swap and enabling these options have no effect, please ***create a new bug report*** and provide the following information:

CPU
Motherboard and BIOS version
RAM type and volume
Storage and its type
Kernel version and its .config

And also the complete output of these utilities:

dmesg
lspci -vvv
lshw
free
vmstat (when the bug is exposed)

cat /proc/interrupts
cat /proc/iomem
cat /proc/meminfo
cat /proc/mtrr
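The collection steps above can be sketched as a single script (filenames and the archive path are illustrative):

```shell
# Sketch: gather the requested diagnostics into one archive.
out=$(mktemp -d)
dmesg                > "$out/dmesg.txt"      2>&1
lspci -vvv           > "$out/lspci.txt"      2>&1
free                 > "$out/free.txt"       2>&1
cat /proc/interrupts > "$out/interrupts.txt" 2>&1
cat /proc/meminfo    > "$out/meminfo.txt"    2>&1
tar czf /tmp/bugreport.tar.gz -C "$out" .
echo "wrote /tmp/bugreport.tar.gz"
```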

>CONFIG_BLK_WBT=y
># CONFIG_BLK_WBT_SQ is not set
>CONFIG_BLK_WBT_MQ=y

So writeback throttling is enabled only for multi queue devices in your case. I suppose you need to use blk-mq for your sd* devices to activate writeback throttling (scsi_mod.use_blk_mq=1 boot flag) or to recompile kernel with CONFIG_BLK_WBT_SQ enabled.
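Both conditions can be checked like this (a sketch; sda and the /boot config path are common defaults, not guaranteed on every distro):

```shell
# Sketch: is writeback throttling compiled in, and is it active for sda?
grep 'CONFIG_BLK_WBT' "/boot/config-$(uname -r)"
# On kernels with WBT, a non-zero value here means throttling is active:
cat /sys/block/sda/queue/wbt_lat_usec 2>/dev/null || echo "WBT not active here"
```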

Created attachment 255501
all required files in one archive

After setting the boot flag "scsi_mod.use_blk_mq=1", the freezes became much shorter. I'm no longer sure they are at the kernel level; it looks more like the window manager (GNOME mutter) is written in such a way that it freezes the mouse while loading the list of applications. To finally defeat the freezes, it seems the window manager needs to be kept from being paged out to the swap file.

I also captured vmstat output when a freeze occurred:
# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r b swpd free buff cache si so bi bo in cs us sy id wa st
 2 6 15947052 205136 112592 4087608 32 41 93 119 7 23 43 19 37 1 0

Twice I asked you to try disabling SWAP altogether and you still haven't.

I'm unsubscribing from this bug report.

Created attachment 274511
Per device dirty ratio configuration support

Per device dirty bytes configuration

Per device dirty bytes configuration. The patch is not ideal; I made it to smooth flash drive writing by setting a smaller dirty_bytes value per removable device.

>> Path
# ls /sys/block/sdc/bdi/
dirty_background_bytes dirty_background_ratio dirty_bytes dirty_ratio max_ratio min_pages_to_flush min_ratio power read_ahead_kb stable_pages_required subsystem uevent

>> udev rule for removable devices
# cat /etc/udev/rules.d/90-dirty-flash.rules
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{removable}=="1", ATTR{bdi/dirty_bytes}="4194304"
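Assuming the patched kernel above (the bdi/dirty_bytes attribute is not in mainline), the rule can be applied and checked without replugging the device (sdc is an example):

```shell
# Sketch: reload udev rules, re-trigger the device, and verify that
# dirty_bytes now holds 4194304 bytes (= 4 MiB).
sudo udevadm control --reload-rules
sudo udevadm trigger --action=change --sysname-match=sdc
cat /sys/block/sdc/bdi/dirty_bytes
```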

Hello Ben,

I was trying to figure out the issue, but I'm not really sure what exactly the error is here. Is the bug already fixed? Or do we have some sources that might help us? Thank you

Carlo B.

Was this bug actually fixed? The status shows CLOSED CODE_FIX with a last-modified date of Dec 5 2018, but I don't see any updates as to what was corrected or which version the fix will go into.

Created attachment 282477
attachment-6179-0.html

This was never fixed, and since the bug status was changed with no commit info ever provided, even when asked directly, it never will be. Nobody cares, and I guess nobody ever figured out who broke the kernel, with which changeset, and when. Just buy another couple of Xeons for your super-duper web-surfing desktop and pray it's enough for the long waits when you format your diskette. Another approach is to buy enough RAM to hold your whole set of block devices, so write-outs are quick enough and you won't see microsecond lags. This is the complete workaround list they have provided since the bug was opened.


As far as I understand, this is a kind of meta-bug: there are multiple causes and multiple fixes.

"I do bulk IO and it gets slow" sounds rather general, and the problem can resurface at any time due to some new underlying issue. So the problem cannot really be "closed for good", no matter how much technical progress is made.

For me, 12309 basically stopped happening unless I deliberately tune the "/proc/sys/vm/dirty_*" values to atypical ranges and forget to revert them. I see the system controllably slowing down processes doing bulk I/O so that the system in general stays responsive. This behaviour is one of the outcomes of this bug.
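For reference, the knobs in question can be inspected and set like this (a sketch; the byte values are illustrative, not recommendations):

```shell
# Sketch: inspect the current ratio-based limits, then set byte-based ones.
sysctl vm.dirty_background_ratio vm.dirty_ratio
sudo sysctl vm.dirty_background_bytes=$((16 * 1024 * 1024))  # 16 MiB
sudo sysctl vm.dirty_bytes=$((64 * 1024 * 1024))             # 64 MiB
# Note: setting a *_bytes knob zeroes its *_ratio counterpart.
```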

I don't expect meaningful technical discussion to happen in this thread. It should just serve as a hub for linking to specific new issues.

Sure, it's a meta-bug, but for me 12309 is still present, and I don't use any tuning for the I/O subsystem at all.

Not as bad as years ago when it happened for the first time, but I still have to throttle rtorrent to download at 2.5MB/s maximum instead of the usual 10MB/s if I want to watch films in mplayer at the same time without jitter/freezes/lag. And that's on a powerful and modern enough system with kernel 4.19.27, CPU i7-2600K @ 4.5GHz, RAM 24GB, and HDD 3TB Western Digital Caviar Green WD30EZRX-00D. This is annoying, and I remember the time before 12309 when rtorrent without any throttling wouldn't make mplayer freeze on less powerful hardware.

Created attachment 282483
attachment-22369-0.html

Well, I've tried to report a new bug to investigate my own case of "my CPU does nothing because waiting is too hard for it". It was of no interest to any kernel dev. So, just as Linus once said "f**k you, Nvidia", the very same goes back to Linux itself. It's a pity some devs think that making their software Linux-bound (via udev-only bindings or ALSA-only sound output) is a good idea (GNOME, and even parts of KDE). They forget that 15 years ago they picketed Adobe for shipping Flash for Windows only. Now one has to use 12309-bound crap for want of a way to run this software on another platform.

Tue, 23 Apr 2019, 21:29, <email address hidden>:

> https://bugzilla.kernel.org/show_bug.cgi?id=12309
>
> --- Comment #665 from _Vi (<email address hidden>) ---
> […]

Can 'someone' please open a bounty for the creation of a VM test case, e.g. with `vagrant` or the `phoronix test suite`?
Basically, a way to reproduce and quantify the perceived/actual performance difference between
> Linux 2.6.17 Released 17 June, 2006
and
> Linux 5.0 Released Sun, 3 Mar 2019
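As a starting point for such a test case, here is a minimal sketch of the usual reproduction shape: one process doing a bulk synchronous write while another times a small read. The sizes here are deliberately tiny so the script finishes quickly; to actually provoke the 12309 symptoms you would scale `BULK_MB` up toward RAM size on a spinning disk. All names and sizes are illustrative assumptions, not part of any existing test suite.

```shell
#!/bin/sh
# Latency-under-load probe (sketch; sizes are illustrative, not a benchmark).
set -eu
DIR=$(mktemp -d)
BULK_MB=32   # scale this up (toward RAM size) to actually provoke stalls

# Background bulk writer: stands in for rtorrent/cp doing heavy I/O.
dd if=/dev/zero of="$DIR/bulk" bs=1M count="$BULK_MB" conv=fdatasync 2>/dev/null &
WRITER=$!

# Foreground small-I/O probe: stands in for mplayer/the desktop.
printf 'probe' > "$DIR/small"
START=$(date +%s%N)
cat "$DIR/small" > /dev/null
END=$(date +%s%N)

wait "$WRITER"
echo "small-read latency: $(( (END - START) / 1000000 )) ms"
rm -rf "$DIR"
```

Running the same probe on two kernels (e.g. in two VMs) and comparing the reported latency while the writer is active would give the before/after numbers the bounty asks for.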

(In reply to Alex Efros from comment #666)
> Not as bad as years ago […]
> And that's on powerful and modern enough system with
> kernel 4.19.27, CPU i7-2600K @ 4.5GHz, RAM 24GB, and HDD 3TB […]
> This is annoying, and I remember time before
> 12309 when rtorrent without any throttling won't make mplayer to freeze on
> less powerful hardware.

Oh yeah, this... I can clearly remember back then, on a then mid-range machine doing a lot of compiling (Gentoo => 100% CPU 🤣) and filesystem work, VLC used to play an HD video stream even under heavy load without any hiccups or micro-stuttering. It was impressive at the time... and then it broke 🤨

tags: added: latest-bios-3304

Based on my attempts to fix this bug, I totally disagree with you.

This bug is caused by the poor design of the current block device layer. Methods that are good for developing code are absolutely improper for developing ideas. That is probably the key problem of the Linux community. Currently, there is a merged workaround for block devices with a good queue, such as the Samsung Pro NVMe.

WBR,

Vitaly

(In reply to _Vi from comment #665)
> […]

Had this again 20 minutes ago.
I was copying 8.7 GiB of data from one directory to another on the same filesystem (ext4, rw,relatime,data=ordered) on the same disk (a Western Digital WDC WD30EZRX-00D8PB0 spinning disk).

The KDE UI became unresponsive (everything other than /home and user data is on an SSD), and I could not launch any new applications. Opening a new tab in Firefox to go to YouTube didn't load the page; it kept saying "waiting for youtube.com" in the status bar (does the network get halted?).

dmesg shows these entries; are they important?

[25013.905943] INFO: task DOMCacheThread:17496 blocked for more than 120 seconds.
[25013.905945] Tainted: P OE 4.15.0-54-generic #58-Ubuntu
[25013.905947] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25013.905949] DOMCacheThread D 0 17496 2243 0x00000000
[25013.905951] Call Trace:
[25013.905954] __schedule+0x291/0x8a0
[25013.905957] schedule+0x2c/0x80
[25013.905959] jbd2_log_wait_commit+0xb0/0x120
[25013.905962] ? wait_woken+0x80/0x80
[25013.905965] __jbd2_journal_force_commit+0x61/0xb0
[25013.905967] jbd2_journal_force_commit+0x21/0x30
[25013.905970] ext4_force_commit+0x29/0x2d
[25013.905972] ext4_sync_file+0x14a/0x3b0
[25013.905975] vfs_fsync_range+0x51/0xb0
[25013.905977] do_fsync+0x3d/0x70
[25013.905980] SyS_fsync+0x10/0x20
[25013.905982] do_syscall_64+0x73/0x130
[25013.905985] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[25013.905987] RIP: 0033:0x7fc9cb839b07
[25013.905988] RSP: 002b:00007fc9a7aeb200 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
[25013.905990] RAX: ffffffffffffffda RBX: 00000000000000a0 RCX: 00007fc9cb839b07
[25013.905992] RDX: 0000000000000000 RSI: 00007fc9a7aeaff0 RDI: 00000000000000a0
[25013.905993] RBP: 0000000000000000 R08: 0000000000000000 R09: 72732f656d6f682f
[25013.905994] R10: 0000000000000000 R11: 0000000000000293 R12: 00000000000001f6
[25013.905995] R13: 00007fc97fc5d038 R14: 00007fc9a7aeb340 R15: 00007fc987523380
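The trace above shows the task stuck in state D (uninterruptible sleep) waiting on a jbd2 journal commit. During a stall, a quick way to see which tasks are in that state is standard `ps` plus a filter (nothing here is specific to this bug; it just surfaces the same condition the hung-task watchdog flagged):

```shell
# List all tasks currently in uninterruptible sleep (state D) -- the same
# state the hung-task watchdog reported for DOMCacheThread above.
# The state column may carry suffixes (e.g. "Ds"), so match the leading D.
ps -eo state,pid,comm | awk '$1 ~ /^D/'
```

On a healthy system this prints little or nothing; during a 12309-style stall you would expect to see the copying process, jbd2, and anything calling fsync() pile up here.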

KDE has the problem too; the same copy via the CLI or via Ultracopier (GUI) has no problem.
I also notice the KDE UI gets slower, with Plasma using more CPU whenever I use the HDD...

What's the value of `vm.dirty_writeback_centisecs`? i.e.
$ sysctl vm.dirty_writeback_centisecs

Try setting it to 0 to disable the periodic writeback wakeup, i.e.
`$ sudo sysctl -w vm.dirty_writeback_centisecs=0`

I found that this keeps my network transfer from stalling or stopping (it would stall for a few seconds at a time when the value was 1000, for example) while some kind of non-async `sync`-like flushing goes on periodically, when transferring GiB of data files over sftp to an SSD (via Midnight Commander, on a link limited to 10 MiB per second).

vm.dirty_writeback_centisecs controls how often (in hundredths of a second) the pdflush/flush/kdmflush threads wake up and check whether work needs to be done.

Coupled with the above, I've been using another value:
`vm.dirty_expire_centisecs=1000`
in both cases (stalling and not stalling), so this one remained fixed at =1000.

vm.dirty_expire_centisecs is how long something can sit in the cache before it needs to be written. In this case it's 10 seconds. When the pdflush/flush/kdmflush threads kick in, they check how old a dirty page is, and if it's older than this value they write it asynchronously to disk. Since holding a dirty page in memory for too long is unsafe, this also serves as a safeguard against data loss.

Well, with the above, at least I'm not experiencing network stalls when copying GiB of data via Midnight Commander's sftp to my SSD while some kernel-triggered sync-ing completes in the background.

I don't know if this will work for others, but if you're curious about any of my other sysctl settings, they're available for perusal [here](https://github.com/howaboutsynergy/q1q/tree/0a2cd4ba658067140d3f0ae89a0897af54da52a4/OSes/archlinux/etc/sysctl.d)
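For anyone wanting to persist these two values across reboots, the standard mechanism is a drop-in file under /etc/sysctl.d. The file name below is arbitrary, and the values are the ones from this comment, not a general recommendation:

```shell
# Write a sysctl drop-in with the values discussed above, then reload.
sudo tee /etc/sysctl.d/90-writeback.conf >/dev/null <<'EOF'
vm.dirty_writeback_centisecs = 0
vm.dirty_expire_centisecs = 1000
EOF
sudo sysctl --system    # re-applies every sysctl.d drop-in, highest number wins
```

Files in /etc/sysctl.d are applied in lexical order, so a `90-` prefix keeps this override from being shadowed by lower-numbered distro defaults.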

correction:

> In this case it's 1 seconds.

*In this case it's 10 seconds.

Also, a heads up:
I found that `tlp` (configured via `/etc/default/tlp`, on Arch Linux) will overwrite the values set in `/etc/sysctl.d/*.conf` files if its options are set to non-`0` values, i.e.
MAX_LOST_WORK_SECS_ON_AC=10
MAX_LOST_WORK_SECS_ON_BAT=10
will set:
vm.dirty_expire_centisecs=1000
vm.dirty_writeback_centisecs=1000

regardless of what values you set in the `/etc/sysctl.d/*.conf` files.

/etc/default/tlp is owned by tlp 1.2.2-1

Not setting them (e.g. commenting them out) makes tlp apply its default of 15 seconds (i.e. =1500). So the workaround is to set them to =0, which makes tlp not touch them at all, so the values from the `/etc/sysctl.d/*.conf` files are allowed to remain as set.
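The unit conversion tlp performs here is just seconds times 100, since the kernel tunables are in centiseconds. A hypothetical one-liner to check what a given MAX_LOST_WORK_SECS value turns into:

```shell
#!/bin/sh
# tlp's MAX_LOST_WORK_SECS_* options are in seconds; the kernel's
# vm.dirty_*_centisecs tunables are in hundredths of a second,
# so tlp multiplies by 100: 10 s -> 1000, 15 s (tlp's default) -> 1500.
secs=10
echo "vm.dirty_expire_centisecs=$(( secs * 100 ))"
```

With secs=10 this prints `vm.dirty_expire_centisecs=1000`, matching the values the comment above observed tlp writing.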

Last response from the original reporter was in 2010; fixed by an upstream commit since 2012. -> Invalid

no longer affects: linux (Ubuntu)
affects: linux → linux (Ubuntu)
Changed in linux (Ubuntu):
importance: High → Undecided
status: Fix Released → New
status: New → Invalid
Brad Figg (brad-figg) on 2019-07-24
tags: added: ubuntu-certified