kcryptd using 100% IO load

Bug #529510 reported by Bodo Bellut
This bug affects 5 people
Affects: linux (Ubuntu)
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hi,

my setup is like this:

4 SATA disks are configured as a software RAID5. The complete RAID is encrypted using LUKS, and inside this LUKS container is one volume group with several LVs.
There's also one eSATA disk using LUKS but no RAID or LVM.

When writing to the eSATA disk everything works smoothly.

When reading from the RAID everything works smoothly.

But when writing larger amounts of data to the RAID (e.g. an rsync running over a 100 MBit/s network link), I see kcryptd going into uninterruptible sleep (D) state, with CPU load at 100% (IO wait). If I don't stop the incoming data, the system crashes soon after; the last overall system load displayed is about 8 (normal idle load on this system is 0.2).
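
For anyone trying to reproduce this, a minimal sketch for spotting tasks stuck in uninterruptible sleep while the write is running. It assumes a `ps` that accepts POSIX-style `-o` format options; the function names `filter_d_state` and `list_d_state` are hypothetical helpers, not part of the original report.

```shell
# Print "PID COMMAND" for every task whose state starts with D
# (uninterruptible sleep). Only the first word of the command name is kept.
# Input: lines of "PID STATE COMMAND", e.g. from `ps -e -o pid= -o stat= -o comm=`.
filter_d_state() {
  awk '$2 ~ /^D/ { print $1, $3 }'
}

# Hypothetical wrapper: take one snapshot of the live system.
list_d_state() {
  ps -e -o pid= -o stat= -o comm= | filter_d_state
}
```

Calling `list_d_state` repeatedly during the rsync (for example under `watch`) should show whether kcryptd stays in D state or only passes through it briefly.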

When I'm using GRML 2009.10 in the same system I can write with full speed to the RAID, tested from the eSATA disk.

I've tried renicing kblockd and kcryptd to 15, as suggested in https://lists.linux-foundation.org/pipermail/bugme-new/2007-June/016431.html . This seems to lessen the impact somewhat: at least it gives me the chance to stop the incoming data before the system crashes.
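
The renicing workaround can be sketched roughly as follows. This is an illustration under assumptions, not the exact commands used in the report: it assumes the kernel threads have names beginning with `kblockd` and `kcryptd`, that `renice` is the util-linux version, and that it runs as root. The function names are hypothetical.

```shell
# Print the PIDs of tasks whose command name starts with the given prefix.
# Input: lines of "PID COMMAND", e.g. from `ps -e -o pid= -o comm=`.
filter_pids_by_prefix() {
  awk -v prefix="$1" 'index($2, prefix) == 1 { print $1 }'
}

# Hypothetical wrapper: lower the priority of all kblockd and kcryptd threads.
# Needs root. With util-linux renice, -n 15 sets the nice value to 15;
# a strictly POSIX renice treats -n as an increment instead.
renice_crypto_threads() {
  for name in kblockd kcryptd; do
    for pid in $(ps -e -o pid= -o comm= | filter_pids_by_prefix "$name"); do
      renice -n 15 -p "$pid"
    done
  done
}
```

The function is only defined here, not called; run `renice_crypto_threads` as root to apply it. The change does not survive a reboot.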

There's nothing in dmesg whatsoever.

$ uname -a
Linux sakura 2.6.31-19-generic #56-Ubuntu SMP Thu Jan 28 02:39:34 UTC 2010 x86_64 GNU/Linux

$ cat /proc/version_signature
Ubuntu 2.6.31-19.56-generic

$ apt-cache policy linux
linux:
  Installed: (none)
  Candidate: 2.6.31.19.32
  Version table:
     2.6.31.19.32 0
        500 http://de.archive.ubuntu.com karmic-updates/main Packages
        500 http://security.ubuntu.com karmic-security/main Packages
     2.6.31.14.27 0
        500 http://de.archive.ubuntu.com karmic/main Packages

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 9.10
Release: 9.10
Codename: karmic

regards,
Bodo

Revision history for this message
Bodo Bellut (bodo-bellut) wrote :

The IO load occurs when the data is flushed from the page cache. From what I see, it looks like the md layer won't accept data as fast as the crypto layer tries to get rid of it.
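
One way to watch the page cache being flushed is to follow the `Dirty` and `Writeback` counters in `/proc/meminfo`. The following is a small sketch; the function name `dirty_writeback_kb` is a hypothetical helper, not something from the report.

```shell
# Print the Dirty and Writeback counters (values in kB) from
# /proc/meminfo-style input read on stdin.
dirty_writeback_kb() {
  awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2 }'
}

# Usage on a live system:
#   dirty_writeback_kb < /proc/meminfo
```

If `Dirty` grows steadily during the transfer and the stall begins when `Writeback` rises, that would be consistent with the flush being the trigger, although it would not by itself show which layer is the bottleneck.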

tags: added: kernel-series-unknown
Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Hi Bodo,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 529510

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Bodo Bellut (bodo-bellut) wrote : apport-collect data

Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: bodo 4888 F.... pulseaudio
 /dev/snd/controlC0: bodo 4888 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'AudioPCI'/'Ensoniq AudioPCI ENS1370 at 0xcc00, irq 23'
   Mixer name : 'Asahi Kasei AK4531'
   Components : 'AK4531'
   Controls : 43
   Simple ctrls : 15
Card1.Amixer.info:
 Card hw:1 'Intel'/'HDA Intel at 0xffaf8000 irq 22'
   Mixer name : 'Analog Devices AD1988B'
   Components : 'HDA:11d4198b,1043822d,00100400'
   Controls : 48
   Simple ctrls : 26
Card2.Amixer.info:
 Card hw:2 'HDMI'/'HDA ATI HDMI at 0xff6ec000 irq 17'
   Mixer name : 'ATI R6xx HDMI'
   Components : 'HDA:1002aa01,00aa0100,00100000'
   Controls : 4
   Simple ctrls : 1
Card2.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined
   Playback channels: Mono
   Mono: Playback [off]
CurrentDmesg:

DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=UUID=e70e0d86-0d03-4a24-87ea-4d059c8bc9ff
MachineType: System manufacturer System Product Name
NonfreeKernelModules: fglrx
Package: linux (not installed)
ProcCmdLine: root=/dev/mapper/sakura-root ro quiet
ProcVersionSignature: Ubuntu 2.6.31-20.58-generic
RelatedPackageVersions:
 linux-backports-modules-2.6.31-20-generic N/A
 linux-firmware 1.26
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
Uname: Linux 2.6.31-20-generic x86_64
UserGroups: adm admin audio cdrom dialout dip floppy fuse lpadmin plugdev sambashare scard video
WifiSyslog:

WpaSupplicantLog:

dmi.bios.date: 07/13/2007
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0705
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: P5B-Premium
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0705:bd07/13/2007:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerINC.:rnP5B-Premium:rvrRev1.xx:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Revision history for this message
Bodo Bellut (bodo-bellut) wrote (one attachment per comment): AlsaDevices.txt, AplayDevices.txt, ArecordDevices.txt, BootDmesg.txt, Card0.Amixer.values.txt, Card1.Amixer.values.txt, Card1.Codecs.codec.0.txt, Card2.Codecs.codec.0.txt, IwConfig.txt, Lspci.txt, Lsusb.txt, PciMultimedia.txt, ProcCpuinfo.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, UdevLog.txt, XsessionErrors.txt
Changed in linux (Ubuntu):
status: Incomplete → New
tags: added: apport-collected
Revision history for this message
Bodo Bellut (bodo-bellut) wrote :

Hi,

Yes, it's still an issue. Unfortunately, I can't test other distributions or kernels, as this is a production system that I can't easily interrupt for tests. I can, however, test various settings as long as I don't have to reboot the system.

I might be convinced to upgrade to 10.04 once it's at least 1 month old.

regards,
Bodo

Brad Figg (brad-figg)
tags: added: acpi-table-checksum
Revision history for this message
foobar (timo-bumke-deactivatedaccount) wrote :

I am experiencing the exact same problem. Is there any solution yet?

Revision history for this message
foobar (timo-bumke-deactivatedaccount) wrote :

Sorry, here is some more information on my system:

Ubuntu Server 10.04 LTS x86_64 with 5 SATA disks configured as a hardware RAID (3ware 9550) and several encrypted volumes. A simple "dd if=/dev/zero of=/some/volume/1gb.file bs=1G count=1 oflag=direct" causes enough IO for kcryptd to reach 100% load, which once in a while freezes the system.

Kernel: 2.6.32-27 amd64

Revision history for this message
foobar (timo-bumke-deactivatedaccount) wrote :

** Edit **

Of course, I am writing to the device directly, so the command is "dd if=/dev/zero of=/dev/mapper/encryptedvolume bs=1G count=1 oflag=direct". But the outcome is still the same.

Revision history for this message
Kai Jauch (kaijauch) wrote :

I'm experiencing the same problem on an (arguably fairly underpowered, but dual-core) Intel Atom D510MO running Ubuntu 10.10 server (amd64).
RAID1 on 2 x 2TB SATA drives, a single dm-crypt on top of that which contains a single LVM VG with two LVs for / and /home.

2 Clients (one with a 100MBit/s, the other with a 1GBit/s connection to the server, the server itself is connected with 1GBit/s) making backups on a SMB share on /home.

If only the 100MBit/s-client is pumping out data, kcryptd uses ~30% CPU, iptraf showing ~90MBit/s traffic incoming, system responds quite nicely.
If the 1Gbit/s-client starts to pump out data in parallel, kcryptd jumps to 100%, making the server virtually unusable as far as IO is concerned (takes ages to login or even start top, but it's fine if it doesn't need IO). Surprisingly, iptraf only jumps up to max. 120MBit/s incoming while this is happening, so there's not *that* much more traffic to handle, but now probably coming from 2 separate SMB threads.

I found this recent commit to Linus' tree which could solve this problem. The dm-crypt scalability patch from Andi Kleen apparently was in development for quite a while and according to the commit message seems to address this problem.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c029772125594e31eb1a5ad9e0913724ed9891f2

Revision history for this message
Kai Jauch (kaijauch) wrote :

Forgot to mention: I'm running linux-image-2.6.35-22-server (2.6.35-22.33).

Revision history for this message
foobar (timo-bumke-deactivatedaccount) wrote :

It seems to me that this is a 64-bit CPU issue? The patch does address this problem, but as far as I can see, using multiple cores more efficiently does not solve it; the problem will probably just occur less often, since the system has more resources to draw on. Something inside kcryptd still causes the system to freeze under heavy file IO.

Please excuse my English skills, but I am not a native speaker :-)

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: b73a1py79
Revision history for this message
Brad Figg (brad-figg) wrote : Unsupported series, setting status to "Won't Fix".

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Revision history for this message
Bodo Bellut (bodo-bellut) wrote :

This bug is still present in 10.04.2 LTS with linux 2.6.32.32.38.

Revision history for this message
Kip Warner (kip) wrote :

Why has this been closed? It is still a problem. I am using kernel 2.6.35-30 on amd64 hardware. There is no point in opening a new bug and having to repost all the same information and correspondence. Please re-open it, Brad; just file it against a new series if you wish.
