Kernel I/O scheduling writes starving reads, local DoS

Bug #1064521 reported by Alex Bligh on 2012-10-09
This bug affects 9 people
Affects                  Status  Importance  Assigned to  Milestone
linux (Ubuntu)                   Medium      Unassigned
linux (Ubuntu Precise)           Medium      Unassigned
linux (Ubuntu Quantal)           Medium      Unassigned

Bug Description

On the default Precise kernel, an unprivileged user can, simply by running zcat on a large file, disrupt I/O sufficiently to cause serious disruption.

Serious disruption means (e.g.) a single MySQL update hangs for over 120 seconds on the default scheduler (cfq), and between 1 and 11 seconds on the deadline scheduler.
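
Since the report compares cfq against deadline, a quick way to see which scheduler each disk is actually using is via sysfs; the name in square brackets is the active one. This is a general sysfs sketch, not a command taken from the report:

```shell
# List the active I/O scheduler for each block device; the bracketed
# name is the one in use (cfq was the Precise default per this report).
for q in /sys/block/*/queue/scheduler; do
    [ -r "$q" ] && printf '%s: %s\n' "$q" "$(cat "$q")"
done
# To switch one device to deadline (root only, not persistent across boots):
#   echo deadline > /sys/block/sda/queue/scheduler
true
```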

This is reproducible on two sets of hardware using:

root@extility-qa-test:~# uname -a
Linux extility-qa-test 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
   linux-image-3.2.0-29-generic 3.2.0-29.46
root@extility-qa-test:~# cat /proc/sys/vm/dirty_ratio
20
root@extility-qa-test:~# cat /proc/sys/vm/dirty_background_ratio
10

No such problems occur on a Lucid system running the Oneiric backports kernel.
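
The dirty_ratio/dirty_background_ratio values quoted above are the writeback thresholds involved when bulk writes pile up. Lowering them is a commonly suggested mitigation for this kind of stall (writeback starts earlier, so less dirty data can accumulate), though nothing in this report validates it as a fix; the values below are assumptions for illustration:

```shell
# Current writeback thresholds: percent of memory that may be dirty before
# background writeback starts, and before writers are throttled.
cat /proc/sys/vm/dirty_background_ratio /proc/sys/vm/dirty_ratio
# Speculative mitigation (an assumption, not validated in this bug); root:
#   sysctl -w vm.dirty_background_ratio=2 vm.dirty_ratio=5
```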

root@management-dev2:~# uname -a
Linux management-dev2 3.0.0-15-server #26~lucid1-Ubuntu SMP Wed Jan 25 15:55:45 UTC 2012 x86_64 GNU/Linux
linux-image-3.0.0-15-server 3.0.0-15.26~lucid1

In order to replicate, download (e.g.) this gzipped Lucid image (note this is not the OS we are running; it is just an example of a file that causes the problem):
 http://repo.flexiant.com/images/public/kvm/ubuntu10.04.img.gz
and, as an unprivileged user on a default, untuned Precise install, run
  zcat ubuntu10.04.img.gz > test

Now, in another window, execute any trivial MySQL update on any table. Note that this can take an extremely long time.

"show full processlist" in mysql console will show the time taken executing the command.

In kernel logs (with cfq) we see e.g.:

Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268048] INFO: task mysqld:1358 blocked for more than 120 seconds.
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268144] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268267] mysqld D ffffffff81806200 0 1358 1 0x00000000
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268272] ffff8801921fde48 0000000000000082 ffff8801921fde00 0000000300000001
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268278] ffff8801921fdfd8 ffff8801921fdfd8 ffff8801921fdfd8 0000000000013780
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268283] ffff880195169700 ffff880191f79700 ffff8801921fde58 ffff8801912b2800
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268288] Call Trace:
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268298] [<ffffffff816579cf>] schedule+0x3f/0x60
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268303] [<ffffffff812650d5>] jbd2_log_wait_commit+0xb5/0x130
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268308] [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268313] [<ffffffff81211248>] ext4_sync_file+0x208/0x2d0
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268317] [<ffffffff81177ba0>] ? vfs_write+0x110/0x180
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268321] [<ffffffff811a63a6>] do_fsync+0x56/0x80
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268325] [<ffffffff811a66d0>] sys_fsync+0x10/0x20
Oct 8 14:57:02 extility-qa-test kernel: [ 3840.268329] [<ffffffff81661ec2>] system_call_fastpath+0x16/0x1b
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268176] INFO: task mysqld:1358 blocked for more than 120 seconds.
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268282] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268393] mysqld D ffffffff81806200 0 1358 1 0x00000000
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268399] ffff8801921fde48 0000000000000082 ffff8801921fde00 0000000300000001
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268405] ffff8801921fdfd8 ffff8801921fdfd8 ffff8801921fdfd8 0000000000013780
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268410] ffff880195169700 ffff880191f79700 ffff8801921fde58 ffff8801912b2800
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268415] Call Trace:
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268426] [<ffffffff816579cf>] schedule+0x3f/0x60
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268431] [<ffffffff812650d5>] jbd2_log_wait_commit+0xb5/0x130
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268436] [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268441] [<ffffffff81211248>] ext4_sync_file+0x208/0x2d0
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268445] [<ffffffff81177ba0>] ? vfs_write+0x110/0x180
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268450] [<ffffffff811a63a6>] do_fsync+0x56/0x80
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268454] [<ffffffff811a66d0>] sys_fsync+0x10/0x20
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268458] [<ffffffff81661ec2>] system_call_fastpath+0x16/0x1b
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268473] INFO: task master:1525 blocked for more than 120 seconds.
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.268563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

This isn't just a problem with MySQL. Here it is killing Postgres:

Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269090] INFO: task postgres:2188 blocked for more than 120 seconds.
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269188] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269298] postgres D ffffffff81806200 0 2188 2186 0x00000000
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269302] ffff8801919cbe48 0000000000000082 ffff8801919cbe00 0000000300000001
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269307] ffff8801919cbfd8 ffff8801919cbfd8 ffff8801919cbfd8 0000000000013780
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269312] ffff880195169700 ffff8801916fae00 ffff8801919cbe58 ffff8801912b2800
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269317] Call Trace:
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269321] [<ffffffff816579cf>] schedule+0x3f/0x60
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269325] [<ffffffff812650d5>] jbd2_log_wait_commit+0xb5/0x130
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269328] [<ffffffff8108aa50>] ? add_wait_queue+0x60/0x60
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269332] [<ffffffff81211248>] ext4_sync_file+0x208/0x2d0
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269335] [<ffffffff81177ba0>] ? vfs_write+0x110/0x180
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269339] [<ffffffff811a63a6>] do_fsync+0x56/0x80
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269342] [<ffffffff811a66d0>] sys_fsync+0x10/0x20
Oct 8 14:59:02 extility-qa-test kernel: [ 3960.269346] [<ffffffff81661ec2>] system_call_fastpath+0x16/0x1b

Standard information follows:

root@extility-qa-test:~# lsb_release -rd
Description: Ubuntu 12.04.1 LTS
Release: 12.04

root@extility-qa-test:~# apt-cache policy linux-image-3.2.0-29-generic
linux-image-3.2.0-29-generic:
  Installed: 3.2.0-29.46
  Candidate: 3.2.0-29.46
  Version table:
 *** 3.2.0-29.46 0
        500 http://gb.archive.ubuntu.com/ubuntu/ precise-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu/ precise-security/main amd64 Packages
        100 /var/lib/dpkg/status

Expected behaviour: the same as on Lucid; perhaps MySQL slows down a bit, but it cannot be DoS'd.

What actually happened: local DoS of MySQL.
---
AlsaDevices:
 total 0
 crw-rw---T 1 root audio 116, 1 Oct 9 10:02 seq
 crw-rw---T 1 root audio 116, 33 Oct 9 10:02 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.0.1-0ubuntu13
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=5cd80c3d-3646-42ed-bf16-528a0151228d
InstallationMedia: Ubuntu-Server 12.04.1 LTS "Precise Pangolin" - Release amd64 (20120817.3)
IwConfig: Error: [Errno 2] No such file or directory
MachineType: Dell Inc. PowerEdge R210 II
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 LANGUAGE=en_GB:en
 TERM=xterm-color
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-29-generic root=UUID=9daecda1-1f0f-41a0-91b8-5981fdea5c9e ro quiet
ProcVersionSignature: Ubuntu 3.2.0-29.46-generic 3.2.24
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-29-generic N/A
 linux-backports-modules-3.2.0-29-generic N/A
 linux-firmware 1.79
RfKill: Error: [Errno 2] No such file or directory
Tags: precise
Uname: Linux 3.2.0-29-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 03/13/2012
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.0.5
dmi.board.name: 03X6X0
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.0.5:bd03/13/2012:svnDellInc.:pnPowerEdgeR210II:pvr:rvnDellInc.:rn03X6X0:rvrA00:cvnDellInc.:ct23:cvr:
dmi.product.name: PowerEdge R210 II
dmi.sys.vendor: Dell Inc.

Alex Bligh (ubuntu-alex-org) wrote :

Can't apport-collect for this bug (presumably because of security tag)

root@extility-qa-test:~# apport-collect 1064512
Traceback (most recent call last):
  File "/usr/bin/apport-cli", line 365, in <module>
    if not app.run_argv():
  File "/usr/lib/python2.7/dist-packages/apport/ui.py", line 551, in run_argv
    return self.run_update_report()
  File "/usr/lib/python2.7/dist-packages/apport/ui.py", line 401, in run_update_report
    if not self.crashdb.can_update(self.options.update_report):
  File "/usr/lib/python2.7/dist-packages/apport/crashdb_impl/launchpad.py", line 500, in can_update
    bug = self.launchpad.bugs[id]
  File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/resource.py", line 950, in __getitem__
    raise KeyError(key)
KeyError: 1064512

visibility: private → public

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1064521

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: precise

apport information

tags: added: apport-collected
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Alex Bligh (ubuntu-alex-org) wrote :

This is related to:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1008400

The outcome of bug 1008400 is (I think) 'this is fixed on Quantal by changing the default IO scheduler to Deadline'. Two problems with that:
1. We see the same problem (but to a lesser extent) on the deadline scheduler. It thus can't be just a scheduler issue.
2. Precise is the LTS release - hence we need to use Precise.

We are going to try the Quantal kernel with a deadline scheduler to see if it's any better.

I have asked our QA team to produce a script that regularly times MySQL updates to make testing this easier.

Ric Gamble (rgamble) wrote :

Please find attached sql-test.py.

This script requires mysql to be installed on the test platform and to be run with sudo if the user is not root.

It will create a database called test_db and a table called test in the local MySQL instance.

It will then update and select from the test table every 10 seconds.

It is currently set to loop 60 times giving 10 minutes of coverage.

It outputs current system time and iteration of the loop.

This script should be run at the same time as the zcat command.

The problem shows up when the timing of an iteration goes beyond 11 seconds.

The script was written using Python 2.6.5 and has been tested using Python 2.7.3
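
The attached sql-test.py is the authoritative version; its core loop amounts to something like the following sketch, where CMD is a stand-in for the real query (the mysql invocation in the comment is an assumption, not taken from the attachment; N and SLEEP default low here so the sketch finishes quickly, where the real script uses 60 iterations at 10-second intervals):

```shell
#!/bin/sh
# Sketch of the probe's core loop: time a command each iteration and print
# the iteration number and elapsed time, so stalls stand out in the log.
CMD=${CMD:-true}     # e.g. mysql test_db -e 'UPDATE test ...'  (assumption)
N=${N:-3}            # the real script loops 60 times
SLEEP=${SLEEP:-0}    # the real script sleeps 10 seconds per iteration
i=1
while [ "$i" -le "$N" ]; do
    start=$(date +%s%N)
    sh -c "$CMD"
    end=$(date +%s%N)
    printf '%s iteration %d took %d ms\n' "$(date '+%H:%M:%S')" "$i" \
        $(( (end - start) / 1000000 ))
    sleep "$SLEEP"
    i=$((i + 1))
done
```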

Alex Bligh (ubuntu-alex-org) wrote :

Easier way of replicating:

dd if=/dev/zero of=- bs=1M count=50000 | dd if=- of=myfile bs=1M count=50000

(two dd invocations to prevent dd optimising writes from /dev/zero), whilst running Ric's script (he's just about to upload a better version).
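
A note on that one-liner: GNU dd does not treat "-" specially, so "of=-" and "if=-" name a literal file called "-" rather than stdout/stdin. A pipe-friendly equivalent simply omits the operands on the pipe ends (count reduced here for illustration; the reproducer used count=50000, roughly 50 GB, to build up enough dirty data):

```shell
# Generate sequential write load through a pipe; dd writes to stdout when
# "of=" is omitted and reads stdin when "if=" is omitted.
# count=8 (8 MiB) for illustration; the reproducer used count=50000.
dd if=/dev/zero bs=1M count=8 2>/dev/null | dd of=myfile bs=1M 2>/dev/null
ls -l myfile
```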

Ric Gamble (rgamble) wrote :

Please find attached the updated sql-test2.py script

It should be run in the same way as the previous one; however, the output has been reformatted to show SQL query execution time.

Ric Gamble (rgamble) wrote :

file attached here

Ric Gamble (rgamble) wrote :

Please find attached log files generated by the sql-test2.py script attached above.

These were generated while running "dd if=/dev/zero of=- bs=1M count=50000 | dd if=- of=myfile bs=1M count=50000"

On the hardware described at the start of the bug with the deadline scheduler active.

You can see where the timing of the SQL update command goes from around 300-400 microseconds to taking over a minute on a few occasions.

Ric Gamble (rgamble) wrote :

log file 2

Ric Gamble (rgamble) wrote :

Output of ( while /bin/true ; do date ; cat /proc/meminfo ; vmstat 1 2 ; echo " "; sleep 1 ; done ) > mylog
files edited to show only data which matches previously attached log files

File for run starting at 10:21

Ric Gamble (rgamble) wrote :

Output of ( while /bin/true ; do date ; cat /proc/meminfo ; vmstat 1 2 ; echo " "; sleep 1 ; done ) > mylog
files edited to show only data which matches previously attached log files

File for run starting at 10:46

Ric Gamble (rgamble) wrote :

sql-test2.py log file output from run started 10:11 this morning

Ric Gamble (rgamble) wrote :

Output of ( while /bin/true ; do date ; cat /proc/meminfo ; vmstat 1 2 ; echo " "; sleep 1 ; done ) > mylog
files edited to show only data which matches previously attached log files

File for run starting at 10:11

Changed in linux (Ubuntu):
importance: Undecided → Medium
Ric Gamble (rgamble) wrote :

I have upgraded our kernel to 3.6.1
(cat /proc/version output)
Linux version 3.6.1-030601-generic (apw@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201210071322 SMP Sun Oct 7 17:23:28 UTC 2012

I ran the dd test with the sql-test2.py script also running; the Quantal kernel appears to have the same issues as the Precise kernel.

Please find attached log file

Ric Gamble (rgamble) wrote :

11:17 log file

Ric Gamble (rgamble) wrote :

11:35 log file

Ric Gamble (rgamble) wrote :

11:55 log file

tags: added: kernel-bug-exists-upstream kernel-da-key quantal
Changed in linux (Ubuntu Precise):
importance: Undecided → Medium
status: New → Triaged
Changed in linux (Ubuntu Quantal):
status: Confirmed → Triaged

Disabling AIO and changing the allocator to SLOB alleviated the problem. With other allocators such as SLAB and SLUB I see no improvement.

tags: added: bios-outdated-2.6.1 needs-upstream-testing regression-release
Ivan Baldo (ibaldo) wrote :

And now what happens with Ubuntu 14.04?
(yes, too lazy to test it myself, sorry)

Changed in linux (Ubuntu Quantal):
status: Triaged → Won't Fix