kswapd0 100% CPU usage

Bug #1518457 reported by Sam Lade on 2015-11-20
This bug affects 127 people
Affects         Status   Importance  Assigned to    Milestone
Linux           Unknown  Unknown
linux (Ubuntu)           High        Dan Streetman
Xenial                   High        Dan Streetman
Yakkety                  High        Dan Streetman

Bug Description

As per bug 721896 and various others:

I'm on an AWS t2.micro instance (Xeon E5-2670, 991MiB of memory). Occasionally (about once a day), kswapd0 falls into a busy loop and spins on 100% CPU usage indefinitely. This can be provoked by copying/writing large files (e.g. dding a 256MB file), but it happens occasionally otherwise. System memory usage (not including buffers/caches) currently sits at 36%, which is typical[1]. Initially I had no swap space configured; I've since tried enabling a 256MB swap file, but the problem continues to occur and no swap space is used. The system can be recovered with `echo 1 > /proc/sys/vm/drop_caches`.

Happy to provide further information/take further debugging actions.

[1] Full output from `free`:
             total       used       free     shared    buffers     cached
Mem:       1014936     483448     531488      28556       9756     112700
-/+ buffers/cache:     360992     653944
Swap:       262140          0     262140
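
The 36% figure follows from the `-/+ buffers/cache` line: 360992 KiB used out of 1014936 KiB total is about 35.6%. A minimal shell sketch of that arithmetic, assuming the older `free` layout shown above is piped in on stdin; `used_pct` is a hypothetical helper name, not a standard tool.

```shell
# used_pct: read old-style `free` output on stdin and print the share of
# memory in use, excluding buffers/caches, rounded to a whole percent.
# (Hypothetical helper for illustration.)
used_pct() {
    awk '
        $1 == "Mem:"             { total = $2 }                          # total memory in KiB
        $1 == "-/+" && total > 0 { printf "%.0f\n", 100 * $3 / total }   # used minus buffers/cache
    '
}
```

On a system whose `free` still prints that line, `free | used_pct` gives the figure directly. Newer versions of `free` no longer print the line, in which case the function prints nothing.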

ProblemType: Bug
DistroRelease: Ubuntu 15.10
Package: linux-image-4.2.0-18-generic 4.2.0-18.22
ProcVersionSignature: Ubuntu 4.2.0-18.22-generic 4.2.3
Uname: Linux 4.2.0-18-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Nov 19 19:40 seq
 crw-rw---- 1 root audio 116, 33 Nov 19 19:40 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.19.1-0ubuntu5
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
Date: Fri Nov 20 20:44:30 2015
Ec2AMI: ami-1c552a76
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1d
Ec2InstanceType: t2.micro
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
MachineType: Xen HVM domU
PciMultimedia:

ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 xen
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.2.0-18-generic root=UUID=35bc01f4-4602-4823-976e-508edef899df ro console=tty1 console=ttyS0 net.ifnames=0
RelatedPackageVersions:
 linux-restricted-modules-4.2.0-18-generic N/A
 linux-backports-modules-4.2.0-18-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 05/06/2015
dmi.bios.vendor: Xen
dmi.bios.version: 4.2.amazon
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:bvr4.2.amazon:bd05/06/2015:svnXen:pnHVMdomU:pvr4.2.amazon:cvnXen:ct1:cvr:
dmi.product.name: HVM domU
dmi.product.version: 4.2.amazon
dmi.sys.vendor: Xen

Sam Lade (sam-sentynel) wrote :

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
mecat (habdankm) wrote :

same issue here:
root@orangepi:/var/log# uname -a
Linux orangepi 3.4.39 #2 SMP PREEMPT Mon Oct 12 12:03:03 CEST 2015 armv7l armv7l armv7l GNU/Linux
root@orangepi:/var/log# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=15.10
DISTRIB_CODENAME=wily
DISTRIB_DESCRIPTION="Ubuntu 15.10"
also
echo 1 > /proc/sys/vm/drop_caches
temporarily solves the issue

Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.4 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4-rc2+cod1-wily/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: kernel-da-key
Sam Lade (sam-sentynel) wrote :

This was a clean build, so I don't have any information about previous versions unfortunately. (The previous server, which didn't have this issue, was different AWS hardware and the previous Ubuntu version.)

I've tested with the latest mainline kernel and this is still occurring.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status: Confirmed → Triaged
Sean Groarke (sgroarke) wrote :

Pretty much same description here. Started when I upgraded Amazon instance to 15.10.

Causing a lot of disruption - available to test also if it helps move us forward.

Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a bisect to figure out what commit caused this regression. We need to identify the earliest kernel where the issue started happening as well as the latest kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that exhibits this bug:

4.0 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.0-vivid/
4.1 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.1-wily/
4.2 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.2-wily/

You don't have to test every kernel, just up until the kernel that first has this bug. We can then narrow down further by testing some release candidates.

Thanks in advance!

Changed in linux (Ubuntu):
importance: Medium → High
Sam Lade (sam-sentynel) wrote :

Okay, I cloned my server and tried kernel versions. The latest version which does _not_ exhibit the issue is 3.12.51. The first which does is 3.13-rc1.

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing, Sam. Could you also test the 3.12 final version, since 3.13-rc1 is the next linear version after 3.12 final? The kernel can be downloaded from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.12-trusty/

Sam Lade (sam-sentynel) wrote :

3.12 final doesn't exhibit the issue either.

Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v3.12 final and v3.13-rc1. The kernel bisect will require testing of about 7-10 test kernels.

I built the first test kernel, up to the following commit:
5cbb3d216e2041700231bcfc383ee5f8b7fc8b74

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

No bug on that version.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
e1f56c89b040134add93f686931cc266541d239a

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

No bug on that version.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
9073e1a804c3096eda84ee7cbf11d1f174236c75

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

Bug is present in this version.

Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
status: Triaged → In Progress
Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
ab0169bb5cc4a5c86756dde662087f9d12302eb0

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

No bug in that version.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
f080480488028bcc25357f85e8ae54ccc3bb7173

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

Bug is present in this version.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
b746f9c7941f227ad582b4f0bc981f3adcbc46b2

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

No bug in that version.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
72c1253574a1854b0b6f196e24cd0dd08c1ad9b9

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

It's crashing on boot with this version. It's related to paging, so it might be relevant to the issue, so I've attached the full dmesg and here's the actual crash:

[ 3.716345] BUG: unable to handle kernel paging request at 000060ffc0002370
[ 3.720056] IP: [<ffffffff811a6ae1>] mem_cgroup_move_account+0xd1/0x250
[ 3.720056] PGD 0
[ 3.720056] Oops: 0000 [#1] SMP
[ 3.720056] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_recent xt_conntrack nf_conntrack iptable_filter ip_tables x_tables autofs4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy pata_acpi
[ 3.720056] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.12.0-031200rc2-generic #201512161751
[ 3.720056] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
[ 3.720056] Workqueue: events css_killed_work_fn
[ 3.720056] task: ffff88003da946b0 ti: ffff88003daac000 task.ti: ffff88003daac000
[ 3.720056] RIP: 0010:[<ffffffff811a6ae1>] [<ffffffff811a6ae1>] mem_cgroup_move_account+0xd1/0x250
[ 3.720056] RSP: 0000:ffff88003daadcc8 EFLAGS: 00010046
[ 3.720056] RAX: 0000000000000246 RBX: ffff88003d803a60 RCX: 000000000000053e
[ 3.720056] RDX: 000060ffc0002358 RSI: 0000000000000001 RDI: ffff88003c4e822c
[ 3.720056] RBP: ffff88003daadd20 R08: ffff88003cc55000 R09: 0000000000000004
[ 3.720056] R10: ffff88003c4e8000 R11: 0000000000000001 R12: 0000000000000000
[ 3.720056] R13: ffffea0000e0e980 R14: ffff88003c4e8000 R15: 0000000000000001
[ 3.720056] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 3.720056] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.720056] CR2: 000060ffc0002370 CR3: 0000000036754000 CR4: 00000000001406f0
[ 3.720056] Stack:
[ 3.720056] ffffffff811a7bc8 ffff88003fffb780 ffffea0000e0e980 ffff88003cc55000
[ 3.720056] ffff88003c4e8000 ffff88003c4e822c ffff880036c1da00 ffffea0000e0e980
[ 3.720056] ffff88003fffbcc0 ffff88003d803a60 ffffea0000e0e9a0 ffff88003daadda8
[ 3.720056] Call Trace:
[ 3.720056] [<ffffffff811a7bc8>] ? mem_cgroup_page_lruvec+0x28/0x90
[ 3.720056] [<ffffffff811a8427>] mem_cgroup_reparent_charges+0x257/0x460
[ 3.720056] [<ffffffff811a87df>] mem_cgroup_css_offline+0xaf/0x220
[ 3.720056] [<ffffffff810de897>] offline_css+0x27/0x50
[ 3.720056] [<ffffffff810e199d>] css_killed_work_fn+0x2d/0xa0
[ 3.720056] [<ffffffff81081032>] process_one_work+0x182/0x450
[ 3.720056] [<ffffffff81081dc1>] worker_thread+0x121/0x410
[ 3.720056] [<ffffffff81081ca0>] ? rescuer_thread+0x3d0/0x3d0
[ 3.720056] [<ffffffff81088ba0>] kthread+0xc0/0xd0
[ 3.720056] [<ffffffff81088ae0>] ? kthread_create_on_node+0x120/0x120
[ 3.720056] [<ffffffff816ff4fc>] ret_from_fork+0x7c/0xb0
[ 3.720056] [<ffffffff81088ae0>] ? kthread_create_on_node+0x120/0x120
[ 3.720056] Code: d6 00 55 00 4d 85 e4 4c 8b 55 c8 4c 8b 45 c0 0f 85 a5 00 00 00 41 8b 55 18 85 d2 0f 88 99 00 00 00 49 8b 96 30 02 00 00 45 89 fb <4c> 39 5a 18 0f 8c c2 00 00 00 44 89 f9 f7 d9 89 ce 65 48 01 72
[ 3.720056] RIP [<ffffffff811a6ae1>] mem_cg...

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing. I skipped that commit in the bisect, in case it's not related to the bug.

I built the next test kernel, up to the following commit:
cbbc58d4fdfab1a39a6ac1b41fcb17885952157a

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

Same crash on that version.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
3b7834743f9492e3509930feb4ca47135905e640

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

That version has also crashed.

mm (mtl-0) wrote :

This bug is also affecting me on 2 identical Xubuntu 15.10 systems:

uname -a: ### 4.2.0-22-generic #27-Ubuntu SMP Thu Dec 17 22:57:08 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
 cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=15.10
DISTRIB_CODENAME=wily
DISTRIB_DESCRIPTION="Ubuntu 15.10"

also
echo 1 or echo 3 > /proc/sys/vm/drop_caches
temporarily solves the issue

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
d7876f1be40a16223a44355740de625849504eb5

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

Crashed again.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
732e563373ffc57d38a8a3b6d55f2de865182117

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

Crashed again.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
56aba608257b451f663d25313d5ecae134d5557f

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

Crashed again.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
59ab5a8f4445699e238c4c46b3da63bb9dc02897

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

Crashed again.

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
98fda169290b3b28c0f2db2b8f02290c13da50ef

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1518457

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Sam Lade (sam-sentynel) wrote :

Crashed again.

Changed in linux (Ubuntu):
status: In Progress → Confirmed
assignee: Joseph Salisbury (jsalisbury) → nobody
(67 of 147 comments hidden)
Rasmus Larsen (rla-2) wrote :

I had this issue too on AWS.

In my case, it was the udev rule for vm-hotadd and the fix as mentioned previously basically came down to "touch /etc/udev/rules.d/40-vm-hotadd.rules" which effectively disables the /lib/udev/rules.d/40-vm-hotadd.rules file (after a reboot).

The udev rule basically seems to only be active for Xen or Hyper-V and while it seems the Hyper-V stuff was also present in previous versions, the Xen stuff seems to be introduced in 15.10 or newer.

So if you're seeing this issue on anything running on Xen, including AWS, try:

touch /etc/udev/rules.d/40-vm-hotadd.rules
reboot

This is probably a bug in Xen or a bug in the kernel.
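
Whether the override is in place can be checked before rebooting. A minimal sketch, assuming the override is either an empty file (as created by `touch` above) or a symlink to /dev/null; `rule_is_masked` is a hypothetical helper name.

```shell
# rule_is_masked NAME [ETC_DIR]
# Succeeds when ETC_DIR/NAME exists and has no content, i.e. it is an empty
# file or a symlink to /dev/null. (Hypothetical helper for illustration.)
rule_is_masked() {
    name=$1
    etc_dir=${2:-/etc/udev/rules.d}
    override="$etc_dir/$name"
    # -e follows symlinks, so a link to /dev/null counts as existing
    [ -e "$override" ] || return 1
    # -s is true only for files larger than zero bytes
    [ ! -s "$override" ]
}
```

For example, `rule_is_masked 40-vm-hotadd.rules && echo masked` prints `masked` once the empty file exists. This only inspects the file; it does not tell whether udev has already applied the change.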

Thank you so much Rasmus!

Your solution worked for me:
sudo touch /etc/udev/rules.d/40-vm-hotadd.rules
reboot

I could always trigger the CPU usage bug with this:
stress --cpu 8 --io 4 --vm 7 --vm-bytes 128M --vm-hang 3 --timeout 60s

If running it once didn't work, a second run would do it. Now I've run it
over and over and kswapd hits 1-2% during the stress test and drops back
down as soon as it ends.

Like Rasmus says, it's an AWS instance which runs on Xen.

On 17 June 2016 at 07:37, Rasmus Larsen <email address hidden> wrote:

> I had this issue too on AWS.
>
> In my case, it was the udev rule for vm-hotadd and the fix as mentioned
> previously basically came down to "touch /etc/udev/rules.d/40-vm-
> hotadd.rules" which effectively disables the /lib/udev/rules.d/40-vm-
> hotadd.rules file (after a reboot).
>
> The udev rule basically seems to only be active for Xen or Hyper-V and
> while it seems the Hyper-V stuff was also present in previous versions,
> the Xen stuff seems to be introduced in 15.10 or newer.
>
> So if you're seeing this issue on anything running on Xen, including
> AWS, try:
>
> touch /etc/udev/rules.d/40-vm-hotadd.rules
> reboot
>
> This is probably a bug in Xen or a bug in the kernel.

quinn (q-shanahan) wrote :

Is there a fix for this that doesn't require a reboot / something I could add to an ec2 instance's user_data?

Argat (argat) wrote :

The suggested workaround works for me also on AWS instances with 15.10:

sudo touch /etc/udev/rules.d/40-vm-hotadd.rules
reboot

Been stable now for over a week.

Can this workaround please be added to the official EC2 images? EC2 cannot hot swap CPU or memory from what I know.
Having this workaround built in would mean not having to reboot every newly launched instance.

Christopher Snowhill (kode54) wrote :

Is this known to affect paravirtualized instances, or is it restricted to hvm? Can anyone tell me what conditions I need to create this in a fresh instance? I'll spin up a PV t2.nano and see if I can reproduce it there.

I've tried this on t1.micro (PV) and t2.micro (HVM) instances in eu-west-1. To reproduce, I used the following two commands:
sudo apt install docker.io
sudo docker run -p 80:8080 cptactionhank/atlassian-jira

The startup should work, but navigate to http://instanceaddress/ and choose "I'll set it up myself" and "Built In" database. 10-20 seconds after you click "Next", you should see the memory being exhausted and kswapd0 use half of the CPU time.

Test results:
ubuntu/images/ebs-ssd/ubuntu-xenial-16.04-amd64-server-20160721 (PV): kswapd0 OK
ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-20160721 (HVM): kswapd0 high CPU usage

It's worth noting that the memory blocks vary between distros and between PV and HVM:
Amazon Linux HVM: /sys/devices/system/memory/memory[0-7]
RHEL 7.2 HVM: /sys/devices/system/memory/memory[0-7]
Ubuntu 16.04 HVM: /sys/devices/system/memory/memory[0-8]
Ubuntu 16.04 PV: /sys/devices/system/memory/memory[0-4]

Why Ubuntu on HVM has an extra memory block is a mystery. It seems to be offline by default, but enabled by the udev hotadd rule. And EC2 doesn't support hotadd.
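
To see which memory blocks a given instance exposes and whether each is online, the sysfs `state` files can be read directly. A minimal sketch; `list_memory_blocks` is a hypothetical helper name, and the optional directory argument exists only so it can be pointed at a copy of the tree.

```shell
# list_memory_blocks [BASE_DIR]
# Print "<block> <state>" for every memory block directory under BASE_DIR.
# (Hypothetical helper for illustration.)
list_memory_blocks() {
    base=${1:-/sys/devices/system/memory}
    for dir in "$base"/memory[0-9]*; do
        # Skip the unexpanded glob when no block directories exist
        [ -r "$dir/state" ] || continue
        printf '%s %s\n' "${dir##*/}" "$(cat "$dir/state")"
    done
}
```

Running `list_memory_blocks | grep offline` before and after disabling the hotadd rule would show whether the extra block is being brought online.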

As Joern Heissler suggested, why not remove the hotadd rule from the official images as a workaround? Although the underlying problem probably is related to why the additional memory block is there at all.

I have run into this issue when using the goofys s3 fuse filesystem (https://github.com/kahing/goofys) on t2.small instances when copying large files (which causes many memory buffers to be allocated). I think anything that stresses the memory subsystem will be able to trigger it.

Robin Miller (robincello) wrote :

We have been seeing this issue intermittently on a set of servers that were running Ubuntu 15.10 and then 16.04. After overriding that udev vm hotadd rule as suggested above a couple weeks ago, the issue has yet to return (not a conclusive result, but so far so good).

Robin Miller (robincello) wrote :

To add - these servers are all built on the official Ubuntu Amazon EC2 AMIs of the 'ebs-ssd' variety.

The original description says "kswapd0 falls into a busy loop and spins on 100% CPU usage indefinitely". But I think, while the effect may be similar, the actual behavior is a bit different. I think what is happening is that kswapd is accessing pages of memory that are causing the hypervisor (rather than the kernel) to do extra work. If you look at the overall CPU utilization of the instance, you'll see high "st" (steal) time. This can also be provoked manually, for example by trying to read via /proc/kcore from the extra memory region that has been identified in discussion above (for example, try to do a full memory dump with the Volatility getkcore tool).
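
The steal figure can be read from the aggregate `cpu` line of /proc/stat, where the eighth counter is steal time. A minimal sketch that reports steal as a share of all CPU time since boot (not the current rate, which would need two samples); `steal_pct` is a hypothetical helper name.

```shell
# steal_pct [STAT_FILE]
# Print steal time as a percentage of total CPU time since boot, taken from
# the aggregate "cpu" line. (Hypothetical helper for illustration.)
steal_pct() {
    awk '$1 == "cpu" {
        total = 0
        for (i = 2; i <= 9; i++) total += $i   # user, nice, system, idle, iowait, irq, softirq, steal
        if (total > 0) printf "%.1f\n", 100 * $9 / total
    }' "${1:-/proc/stat}"
}
```

Comparing this value with the `st` column of `top` while kswapd0 is busy would show whether the time is being spent in the guest or taken by the hypervisor.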

José Martínez (xosemp) wrote :

I wrote a tiny batch script to reliably reproduce the bug. It mounts a tmpfs filesystem and writes a file that fills 98% of the currently available memory.

You can also pass it a custom percentage, like: ./fillmem.sh 95

<95% is hit and miss on a newly launched instance. 98% (the default) has immediately spun kswapd to 100% in all of my tests.
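
The script itself is not reproduced in this report. As a rough sketch of the size calculation only, assuming "currently available memory" means the `MemAvailable` field of /proc/meminfo; `fill_size_kib` is a hypothetical helper name and is not taken from fillmem.sh.

```shell
# fill_size_kib [PERCENT] [MEMINFO_FILE]
# Print PERCENT (default 98) of MemAvailable, in KiB.
# (Hypothetical helper; an assumption about how such a script might size its file.)
fill_size_kib() {
    pct=${1:-98}
    awk -v pct="$pct" '
        $1 == "MemAvailable:" { printf "%d\n", $2 * pct / 100 }
    ' "${2:-/proc/meminfo}"
}
```

The result is in KiB, so it could serve as the `count` for a `dd bs=1024` write into a tmpfs mount.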

----------

@andrewtappert: steal time means the T2 instance has run out of CPU credits. They launch with just enough credits to burst the CPU to 100% for 30 minutes.

I'm always getting SYS time on kswapd, until I hit the T2 credits limit.

All of my t2.micro & nano instances are affected by this in AWS EC2 after upgrading to Ubuntu16.

doing "echo 1 > /proc/sys/vm/drop_caches" (also tried echo 3) works for a short period of time, but it comes back within a few minutes.

I moved a couple instances over to f1.micro on GCP / GCE (1 vCPU, 0.6 GB memory) and the problem seems to have gone away. I can't do this with all of my instances yet though so a fix in AWS would be nice.

Paul Csiki (paulcsiki) wrote :

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518457/comments/69

Seems to have resolved all these issues on AWS t2.micro instances.

Gregg King (greggking) wrote :

Did you try this from Rasmus up above?

sudo touch /etc/udev/rules.d/40-vm-hotadd.rules
reboot

That fixed it for me on EC2

On Saturday, 3 September 2016, Andy Robertson <email address hidden>
wrote:

> All of my t2.micro & nano instances are affected by this in AWS EC2
> after upgrading to Ubuntu16.
>
> doing "echo 1 > /proc/sys/vm/drop_caches" (also tried echo 3) works for
> a short period of time, but it comes back within a few minutes.
>
> I moved a couple instances over to f1.micro on GCP / GCE (1 vCPU, 0.6 GB
> memory) and the problem seems to have gone away. I can't do this with
> all of my instances yet though so a fix in AWS would be nice.

Paul Buonopane (zenexer) wrote :

From man 7 udev:

The udev rules are read from the files located in the system rules directory /lib/udev/rules.d, the volatile runtime directory /run/udev/rules.d and the local administration directory /etc/udev/rules.d. All rules files are collectively sorted and processed in lexical order, regardless of the directories in which they live. However, files with identical filenames replace each other. Files in /etc have the highest priority, files in /run take precedence over files with the same name in /lib. This can be used to override a system-supplied rules file with a local file if needed; a symlink in /etc with the same name as a rules file in /lib, pointing to /dev/null, disables the rules file entirely. Rule files must have the extension .rules; other extensions are ignored.

As such, the best way to work around this is:

sudo ln -s /dev/null /etc/udev/rules.d/40-vm-hotadd.rules

Unlike deleting or modifying the original file, this will persist across upgrades without requiring manual conflict resolution.
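
A minimal sketch of applying the symlink idempotently, for example from a provisioning script; `mask_udev_rule` is a hypothetical helper name. Note that the comments above only report the workaround taking effect after a reboot.

```shell
# mask_udev_rule NAME [ETC_DIR]
# Point ETC_DIR/NAME at /dev/null, replacing any existing file or link, so
# the call can be repeated safely. (Hypothetical helper for illustration.)
mask_udev_rule() {
    name=$1
    etc_dir=${2:-/etc/udev/rules.d}
    # -s: symbolic link, -f: replace an existing entry,
    # -n: treat an existing symlink as a file rather than following it
    ln -sfn /dev/null "$etc_dir/$name"
}
```

Run as root, `mask_udev_rule 40-vm-hotadd.rules` has the same effect as the `ln -s` command above, but does not fail if the empty file from the earlier `touch` workaround is already present.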

Richard Trout (richard-trout) wrote :

Thanks #123 (as easy as?) works for me as a workaround.

Poldi (poldi) wrote :

I have the same issue on 16.04

Dan Streetman (ddstreet) wrote :

The problem is a bit complex. The Xen hypervisor uses memory ballooning, to control how many memory pages the guest can use. The kernel enumerates its e820 memory at boot, and since it's only 1G in this case, it all gets placed into the DMA32 zone. Then later during boot when the Xen balloon driver is initialized, it dynamically adds the balloon memory. The kernel always places hot-added memory into the Normal zone however, so the system winds up with the balloon memory, and only the balloon memory, in the Normal zone. Since the balloon driver starts at, or very close to, its memory target, only a very small number of pages are made available, which results in a Normal memory zone that's tiny - only 9 managed pages on the instance I tested. You can read the /proc/zoneinfo file to find the number of managed pages in the Normal zone.
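
A minimal sketch for reading that number, assuming the usual /proc/zoneinfo layout of a `Node N, zone NAME` header followed by indented counters; `zone_managed` is a hypothetical helper name.

```shell
# zone_managed ZONE [ZONEINFO_FILE]
# Print the "managed" page count of the named zone, one line per node.
# (Hypothetical helper for illustration.)
zone_managed() {
    awk -v zone="$1" '
        $1 == "Node" && $3 == "zone"        { current = $4 }   # header: "Node 0, zone Normal"
        current == zone && $1 == "managed"  { print $2 }
    ' "${2:-/proc/zoneinfo}"
}
```

`zone_managed Normal` printing a value as small as the 9 pages mentioned above would indicate the tiny Normal zone described here; `zone_managed DMA32` gives the comparison figure.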

Then when the system encounters memory pressure (i.e. very little free memory left), it wakes up the kswapd daemon to start freeing memory. The kswapd daemon then tries to "balance" memory by "balancing" each zone - DMA, DMA32, and Normal zones. However, it's essentially impossible for it to free pages from the Normal zone, because there are so few pages that whenever one is freed, the next page allocation takes it (because pages are usually allocated from the Normal zone first), and kswapd winds up in a continuous cycle of trying to free pages from the Normal zone forever.

This is also why disabling the udev memory hotadd (see comment 69) works around the problem - it prevents the Xen balloon driver from adding/enabling any of the pages in the Normal zone, so kswapd never has to bother trying to balance it, and thus there's no problem.

This appears to be fixed by Mel Gorman's 34-commit patch series that changes kswapd memory balancing to "per node" instead of "per zone":
https://marc.info/?l=linux-mm&m=146797052519026

That's a rather large patchset to backport to the xenial kernel, but I'll give it a try.

Changed in linux (Ubuntu):
assignee: nobody → Dan Streetman (ddstreet)
Dan Streetman (ddstreet) wrote :

The patch series that fixes this is included in yakkety (if anyone reproduces this on a yakkety kernel, please let me know), so this only needs fixing in xenial.

Dan Streetman (ddstreet) wrote :

On review of the patch series, it's simply too large and complex to backport for this situation; it makes, and depends on, a rather large amount of change to the mm subsystem, and there are easier and smaller ways to work around this bug in the xenial kernel.

Specifically, a comparison of the Xen balloon driver vs. the virtio balloon driver shows an important difference; while the Xen balloon driver hot-adds memory as soon as it initializes, the virtio driver does not hot-add memory; it only adjusts its size to adjust the amount of free memory. Most importantly, the Xen balloon driver initially hot-adds memory but does not make any (except a very small amount) available for system use.

I'm looking at the Xen balloon driver to see how it can be changed to fix this bug.

Seth Forshee (sforshee) on 2016-10-12
Changed in linux (Ubuntu Xenial):
status: New → Fix Committed
importance: Undecided → High
assignee: nobody → Dan Streetman (ddstreet)
Seth Forshee (sforshee) on 2016-10-12
Changed in linux (Ubuntu Yakkety):
status: Confirmed → Fix Committed
Andy Whitcroft (apw) on 2016-10-13
Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Invalid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-43.63

---------------
linux (4.4.0-43.63) xenial; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1632375

  * kswapd0 100% CPU usage (LP: #1518457)
    - SAUCE: (no-up) If zone is so small that watermarks are the same, stop zone
      balance.

 -- Seth Forshee <email address hidden> Tue, 11 Oct 2016 07:54:56 -0500
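
As a rough illustration of the idea in the changelog title above (and not the actual kernel patch), here is a toy Python model. The function names, the choice of comparing the low and high watermarks, and the one-page-per-pass reclaim loop are all assumptions made for this sketch.

```python
def zone_balanced(free, low, high, skip_collapsed=True):
    """Toy check for whether kswapd may stop working on a zone."""
    # Guard modeled on the changelog title: a zone so small that its
    # watermarks coincide is treated as balanced instead of being retried.
    if skip_collapsed and low == high:
        return True
    return free >= high


def reclaim_passes(free, low, high, skip_collapsed=True, max_passes=1000):
    """Count reclaim passes until the zone is balanced, capped at max_passes."""
    passes = 0
    while passes < max_passes and not zone_balanced(free, low, high, skip_collapsed):
        free += 1  # kswapd frees one page ...
        free -= 1  # ... and the next allocation takes it straight back
        passes += 1
    return passes


# Tiny zone, invented numbers: without the guard the loop only ends at the cap.
print(reclaim_passes(free=2, low=4, high=4, skip_collapsed=False, max_passes=50))
print(reclaim_passes(free=2, low=4, high=4, skip_collapsed=True))
```

The cap exists only so the toy loop terminates. In the behaviour reported in this bug there is no such cap, which is why kswapd0 stayed at 100% CPU.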

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Seth Forshee (sforshee) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!


tags: added: verification-needed-yakkety
CaptSaltyJack (csjubuntu) wrote :

Will we see this fix make it to 16.04 LTS?

The fix is already released in 16.04; make sure you have updated to linux-image 4.4.0-43.63 or later.
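
A quick way to check this on a xenial machine is to compare the ABI number in the running kernel's release string against 43. The Python sketch below is only an illustration: the function names are made up, it assumes the usual Ubuntu release format such as `4.4.0-45-generic`, and it deliberately gives no answer for kernels outside the 4.4.0 series.

```python
import platform
import re


def parse_ubuntu_kernel(release):
    """Turn a release string like '4.4.0-45-generic' into (4, 4, 0, 45)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def has_xenial_fix(release):
    """True/False for 4.4.0 (xenial) kernels, None for any other series."""
    version = parse_ubuntu_kernel(release)
    if version[:3] != (4, 4, 0):
        return None  # this comparison only makes sense for the xenial 4.4 series
    # The fix was released in 4.4.0-43.63, so ABI 43 or later has it.
    return version[3] >= 43


try:
    print(platform.release(), has_xenial_fix(platform.release()))
except ValueError:
    print(platform.release(), "is not in the expected Ubuntu format")
```

`platform.release()` returns the same string as `uname -r`.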

PierreF (pierre-fersing) wrote :

If the verification also applies to 16.04, it does fix the issue.

We had a server that triggered the bug at least once a day (I suspect the unattended-upgrade run every morning triggered it). Since the upgrade, two and a half days ago, the server has had no issues.

I have verified that the bug is fixed on 4.8.0-26.28 (yakkety), 4.4.0-43.63 (xenial) and 4.4.0-45.66 (xenial). Doing the same on 4.4.0-42.62 (xenial) reproduced the bug. All tests done on EC2 t2.small.

tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Raniz (raniz-1) wrote :

After upgrading to 4.4.0-45 from 4.4.0-21 the issue seems to have gone away.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-27.29

---------------
linux (4.8.0-27.29) yakkety; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1635377

  * proc_keys_show crash when reading /proc/keys (LP: #1634496)
    - SAUCE: KEYS: ensure xbuf is large enough to fix buffer overflow in
      proc_keys_show (LP: #1634496)

  * Revert "If zone is so small that watermarks are the same, stop zone balance"
    in yakkety (LP: #1632894)
    - Revert "UBUNTU: SAUCE: (no-up) If zone is so small that watermarks are the
      same, stop zone balance."

  * lts-yakkety 4.8 cannot mount lvm raid1 (LP: #1631298)
    - SAUCE: (no-up) dm raid: fix compat_features validation

  * kswapd0 100% CPU usage (LP: #1518457)
    - SAUCE: (no-up) If zone is so small that watermarks are the same, stop zone
      balance.

  * [Trusty->Yakkety] powerpc/64: Fix incorrect return value from
    __copy_tofrom_user (LP: #1632462)
    - SAUCE: (no-up) powerpc/64: Fix incorrect return value from
      __copy_tofrom_user

  * Ubuntu 16.10: Oops panic in move_page_tables/page_remove_rmap after running
    memory_stress_ng. (LP: #1628976)
    - SAUCE: (no-up) powerpc/pseries: Fix stack corruption in htpe code

  * Paths not failed properly when unmapping virtual FC ports in VIOS (using
    ibmvfc) (LP: #1632116)
    - scsi: ibmvfc: Fix I/O hang when port is not mapped

  * [Ubuntu16.10]KV4.8: kernel livepatch config options are not set
    (LP: #1626983)
    - [Config] Enable live patching on powerpc/ppc64el

  * CONFIG_AUFS_XATTR is not set (LP: #1557776)
    - [Config] CONFIG_AUFS_XATTR=y

  * Yakkety update to 4.8.1 stable release (LP: #1632445)
    - arm64: debug: avoid resetting stepping state machine when TIF_SINGLESTEP
    - Using BUG_ON() as an assert() is _never_ acceptable
    - usb: misc: legousbtower: Fix NULL pointer deference
    - Staging: fbtft: Fix bug in fbtft-core
    - usb: usbip: vudc: fix left shift overflow
    - USB: serial: cp210x: Add ID for a Juniper console
    - Revert "usbtmc: convert to devm_kzalloc"
    - ALSA: hda - Adding one more ALC255 pin definition for headset problem
    - ALSA: hda - Fix headset mic detection problem for several Dell laptops
    - ALSA: hda - Add the top speaker pin config for HP Spectre x360
    - Linux 4.8.1

  * PSL data cache should be flushed before resetting CAPI adapter
    (LP: #1632049)
    - cxl: Flush PSL cache before resetting the adapter

  * thunder nic: avoid link delays due to RX_PACKET_DIS (LP: #1630038)
    - net: thunderx: Don't set RX_PACKET_DIS while initializing

  * crypto/vmx/p8_ghash memory corruption (LP: #1630970)
    - crypto: ghash-generic - move common definitions to a new header file
    - crypto: vmx - Fix memory corruption caused by p8_ghash
    - crypto: vmx - Ensure ghash-generic is enabled

  * arm64: SPCR console not autodetected (LP: #1630311)
    - of/serial: move earlycon early_param handling to serial
    - [Config] CONFIG_ACPI_SPCR_TABLE=y
    - ACPI: parse SPCR and enable matching console
    - ARM64: ACPI: enable ACPI_SPCR_TABLE
    - serial: pl011: add console matching function

  * include/linux/security.h header syntax error with !CONFIG_SECURITYFS
...

Changed in linux (Ubuntu Yakkety):
status: Invalid → Fix Released
Luc Pi (oluc) wrote :

> Dan Streetman (ddstreet) wrote on 2016-10-01: #127
>
> The patch series that fixes this is included in yakkety
> (if anyone reproduces this on a yakkety kernel, please let me know),

I can see it every now and then with Yakkety and linux 4.8.0-27.
Can you advise any action?

$ uname -a
Linux luc-MacBook 4.8.0-27-generic #29-Ubuntu SMP Thu Oct 20 21:01:44 UTC 2016 i686 i686 i686 GNU/Linux

$ lsb_release -a
Description: Ubuntu 16.10
Codename: yakkety

velis (jure-erznoznik-gmail) wrote :

Following this thread I had the same issue, running stock 16.04 (xenial) with kernel 4.4.0-38.
I have upgraded the kernel to 4.8.10-040810 and the end result is the same but symptoms are a bit different:

note: 1GB of RAM, no swap at all

(old kernel)
with 50% of RAM in buffers / cache, kswapd0 took all the CPU it could (~75%). iostat was pretty much at 0% utilisation. server ground to a halt immediately when free ram expired without ever "eating into" the buffers / cache.

(new kernel)
again, with 50% of RAM in buffers / cache, kswapd0 no longer takes 100% CPU, but it still emerges to the top of the list in top as it still manages to take more than the other processes. One difference is that ~75% of CPU time is now spent *WA*iting. A second difference is that now, after free RAM is consumed, buffers / cache are also reduced for a while in favour of the RAM-hungry app. However, even this only goes down to about 40 - 43% of RAM being used by buffers / cache.

Hopefully this is making at least some sense.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-30.32

---------------
linux (4.8.0-30.32) yakkety; urgency=low

  * CVE-2016-8655 (LP: #1646318)
    - packet: fix race condition in packet_set_ring

 -- Brad Figg <email address hidden> Thu, 01 Dec 2016 08:02:53 -0800

Changed in linux (Ubuntu):
status: Invalid → Fix Released
Dan Streetman (ddstreet) wrote :

> I can see it every now and then with Yakkety and linux 4.8.0-27.
> Can you advise any action?

what do you mean by "every now and then"? you mean kswapd runs at 100% for a short time, occasionally? that's normal.

> kswapd0 no longer takes 100% CPU, but it still emerges to the top of the list in top as it still
> manages to take more than the other processes

this sounds like kswapd is just doing its job now. if you stop all your applications, does kswapd still take up cpu time indefinitely?

Tuomo Sipola (tuomosipola) wrote :

Running Ubuntu 16.10 Yakkety. The old 4.4.0-45-generic kernel works nicely. All the new 4.8.0 kernels eventually go berserk with kswapd0, most recently the newest 4.8.0-32-generic just today. Normal usage: just a couple of terminals, Chromium, Nautilus windows and Evince PDF documents open.

Running 4.8.13-100.fc23.i686+PAE, no desktop, swapon and swapoff, swappiness 0, 60, and 100.
kswapd0 usage high while reading from /dev/sda (not mounted, internal SSD with 500+MB/s read).
After I stop reading, kswapd0 usage is gone.
No problem when reading from a USB HDD.

Problem with high-speed reading?

Dan Streetman (ddstreet) wrote :

This bug is already Fix Released; any new problems should be opened in a new bug.

As of 2017-01-26, I experience this bug on two 16.10 boxes. This kswapd0 behavior is 100% reproducible when I copy large files (more than about 500 MB) from or to an NTFS hard drive.

As a complement to my previous comment: both computers have a significant amount of RAM (4 and 8 GB respectively) and the swap partition is shown as unused by the system monitor.

Dan Streetman (ddstreet) wrote :

As a comment to those expecting that disabling swap will prevent kswapd from doing anything: that's incorrect. kswapd is also responsible for clearing out the page cache, so even with swap disabled you'll still see it doing work, especially during heavy IO that uses the page cache, for example when copying large files from fast drives. That's completely normal. What this bug is about is kswapd trying over and over to do its normal work but making no progress, so it continues to use 100% cpu while the rest of the system is doing nothing. That bug should be fixed now.

If anyone continues to see *this* bug - kswapd using 100% cpu while your system is doing nothing and kswapd never recovers - you can report it here, but you should also open a new bug and reference it in your comment (this bug is fixed and closed). However, if all you see is kswapd using cpu while you're doing things - especially file IO - that isn't this bug, and it probably isn't actually a bug at all.
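
To tell the two cases apart, it can help to measure how much CPU time kswapd0 actually accumulates while the system is otherwise idle. The Python sketch below is one hedged way to do the arithmetic from two readings of /proc/<pid>/stat; the function names are invented for this example, the sample line is made up, and the default of 100 clock ticks per second is an assumption (the real value comes from `os.sysconf("SC_CLK_TCK")`).

```python
def cpu_ticks_from_stat(stat_line):
    """Return utime + stime, in clock ticks, from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before splitting the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s=100):
    """CPU usage of one process over an interval, as a percentage of one CPU."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


# Made-up sample line in the /proc/<pid>/stat layout (truncated after nice).
SAMPLE_STAT = "38 (kswapd0) S 2 0 0 0 -1 2129984 0 0 0 0 12 3456 0 0 20 0"

before = cpu_ticks_from_stat(SAMPLE_STAT)
print(before)
print(cpu_percent(before, before + 500, interval_s=5))
```

On a real system, read the stat file for the kswapd0 pid twice, a few seconds apart, with all applications stopped. A value that stays near 100% indefinitely matches this bug; a burst during file IO that then drops away does not.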
