15.10 swapping heavily after process exceeds 50% physical RAM

Bug #1513673 reported by Timothy Miller
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Incomplete
Importance: Medium
Assigned to: Manpreet
Milestone: (not set)

Bug Description

I am experiencing a memory management problem with 15.10 that I did not experience with 15.04. I have a 24-core (48 thread) server with 64G of RAM. I am getting some strange behavior with respect to swapping and physical memory use.

You can see my problem in 'top':

top - 18:52:09 up 1 day, 2:25, 3 users, load average: 1.64, 1.30, 1.18
Tasks: 525 total, 2 running, 523 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 1.3 sy, 0.0 ni, 97.0 id, 1.4 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 65937528 total, 37526160 used, 28411368 free, 14396 buffers
KiB Swap: 67071996 total, 67071724 used, 272 free. 104304 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  363 root      20   0       0      0      0 R 100.0  0.0 324:02.98 kswapd0
 6725 theosib   20   0 99.398g 0.034t   8920 D  12.0 55.1  59:08.71 common_shell_ex

There's a single user logged in (me), and I have a single process using a large amount of virtual memory. However, something is limiting it to around half the physical memory, while the swap partition is basically full. I've watched these processes (Synopsys Design Compiler) run, and they don't break the 50% mark until swap fills. And another weird thing is that kswapd0 uses very high (usually 100%) CPU. AFAIK, kswapd0 should be I/O bound and therefore not use a lot of CPU time.
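
To put numbers on the claim above, here is a minimal sketch that turns the two summary lines from top into usage fractions. It assumes the "KiB Mem:" / "KiB Swap:" layout shown in this report (other top versions print these lines differently), and the function names are illustrative only.

```python
import re

# Summary lines copied from the top output in this report.
SAMPLE = """\
KiB Mem: 65937528 total, 37526160 used, 28411368 free, 14396 buffers
KiB Swap: 67071996 total, 67071724 used, 272 free. 104304 cached Mem
"""


def parse_top_memory(text):
    """Return {"Mem": (total_kib, used_kib), "Swap": (total_kib, used_kib)}.

    Assumes the procps layout seen above; lines that do not match are skipped.
    """
    usage = {}
    for line in text.splitlines():
        match = re.match(r"KiB (Mem|Swap):\s+(\d+) total,\s+(\d+) used", line)
        if match:
            kind, total, used = match.groups()
            usage[kind] = (int(total), int(used))
    return usage


def used_fraction(total_kib, used_kib):
    """Fraction of the total that is in use, between 0.0 and 1.0."""
    return used_kib / total_kib
```

Calling `used_fraction(*parse_top_memory(SAMPLE)["Mem"])` gives roughly 0.57, while the same call for "Swap" is within a few millionths of 1.0, which matches the picture of RAM a little over half used and swap essentially full.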

I've looked to see if there were any limits being imposed, but ulimit says otherwise:

$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 257447
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 257447
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
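
The same memory-related limits can be read programmatically. This is a rough sketch using Python's standard `resource` module on Linux; the selection of limits and the helper names are my own choice, and it reports the limits of the Python process itself rather than of an already running job (for that, /proc/<pid>/limits is the place to look). Note that `getrlimit` returns bytes, whereas `ulimit` prints several of these in kbytes.

```python
import resource

# Memory-related limits, labelled with the matching `ulimit` flag.
LIMITS = {
    "data seg size (-d)": resource.RLIMIT_DATA,
    "max locked memory (-l)": resource.RLIMIT_MEMLOCK,
    "max memory size (-m)": resource.RLIMIT_RSS,
    "stack size (-s)": resource.RLIMIT_STACK,
    "virtual memory (-v)": resource.RLIMIT_AS,
}


def format_limit(value):
    """Render a limit the way ulimit does for the unlimited case."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def memory_limits():
    """Return {label: (soft, hard)} for the current process, values in bytes."""
    report = {}
    for label, which in LIMITS.items():
        soft, hard = resource.getrlimit(which)
        report[label] = (format_limit(soft), format_limit(hard))
    return report
```

Printing the items of `memory_limits()` from the shell that launches the job should agree with the `ulimit -a` listing above, apart from the unit difference.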

I've also tried setting swappiness to 10, but that didn't help.
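
For anyone reproducing this, a small sketch for confirming that the swappiness change actually took effect. It assumes the value is exposed at /proc/sys/vm/swappiness (the file behind `sysctl vm.swappiness`); the helper names are hypothetical.

```python
from pathlib import Path

# Assumed location of the tunable on Linux.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_swappiness(text):
    """Convert the file contents (e.g. "10\\n") to an integer."""
    return int(text.strip())


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current swappiness, or None if it cannot be read."""
    try:
        return parse_swappiness(Path(path).read_text())
    except (OSError, ValueError):
        return None
```

If `read_swappiness()` does not return 10 after the change, the setting was not applied, for example because it was set before a reboot and not persisted.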

This is a pretty serious problem. What used to take a few hours on 15.04 now takes 5 to 10 times longer, because the process is forced to wait on swap, which is reading and writing at about 20M/sec each. So it's hammering the SSD that contains the swap partition and running really slowly.
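
One way the swap read and write rates could be measured is from the kernel's cumulative counters. The sketch below assumes /proc/vmstat provides `pswpin` and `pswpout` as page counts, and that the page size comes from `os.sysconf`; the function names are illustrative, not from any tool mentioned in this report.

```python
import os
import time


def parse_vmstat(text):
    """Parse "name value" lines from /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rates(before, after, seconds, page_size):
    """Return (swap-in, swap-out) in bytes per second between two samples."""
    rate_in = (after["pswpin"] - before["pswpin"]) * page_size / seconds
    rate_out = (after["pswpout"] - before["pswpout"]) * page_size / seconds
    return rate_in, rate_out


def measure(interval=5.0, path="/proc/vmstat"):
    """Sample the counters twice, `interval` seconds apart (Linux only)."""
    with open(path) as handle:
        first = parse_vmstat(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        second = parse_vmstat(handle.read())
    return swap_rates(first, second, interval, os.sysconf("SC_PAGE_SIZE"))
```

As a sanity check on the units: counters that each rise by 5120 pages in one second, with 4 KiB pages, correspond to 20 MiB/s in each direction.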

I've done some googling about kswapd0 using high CPU, and all I can work out is that some people think it's a kernel bug.
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Nov 6 13:28 seq
 crw-rw---- 1 root audio 116, 33 Nov 6 13:28 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.19.1-0ubuntu3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 15.10
HibernationDevice: RESUME=UUID=e577be79-e4a4-41d9-b946-cd0cd50d05ba
InstallationDate: Installed on 2013-11-27 (709 days ago)
InstallationMedia: Ubuntu-Server 12.04 LTS "Precise Pangolin" - Release amd64 (20120424.1)
MachineType: Supermicro X9DRW
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.2.0-16-generic root=UUID=08710eb9-367c-4be0-ba25-700476f690fc ro
ProcVersionSignature: Ubuntu 4.2.0-16.19-generic 4.2.3
RelatedPackageVersions:
 linux-restricted-modules-4.2.0-16-generic N/A
 linux-backports-modules-4.2.0-16-generic N/A
 linux-firmware 1.149
RfKill: Error: [Errno 2] No such file or directory
Tags: wily
Uname: Linux 4.2.0-16-generic x86_64
UpgradeStatus: Upgraded to wily on 2015-10-26 (11 days ago)
UserGroups: adm audio cdrom dip kvm lpadmin plugdev sambashare sudo video
_MarkForUpload: True
dmi.bios.date: 08/08/2013
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 3.0a
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: X9DRW
dmi.board.vendor: Supermicro
dmi.board.version: 0123456789
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 0123456789
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3.0a:bd08/08/2013:svnSupermicro:pnX9DRW:pvr0123456789:rvnSupermicro:rnX9DRW:rvr0123456789:cvnSupermicro:ct3:cvr0123456789:
dmi.product.name: X9DRW
dmi.product.version: 0123456789
dmi.sys.vendor: Supermicro

Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1513673/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
affects: ubuntu → linux (Ubuntu)
Timothy Miller (theosib)
affects: linux (Ubuntu) → linux-meta (Ubuntu)
Brad Figg (brad-figg)
affects: linux-meta (Ubuntu) → linux (Ubuntu)
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1513673

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.3 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.3-unstable/

Changed in linux (Ubuntu):
importance: Undecided → Medium
Timothy Miller (theosib) wrote : CRDA.txt

apport information

tags: added: apport-collected wily
description: updated
Timothy Miller (theosib) wrote : CurrentDmesg.txt

apport information

Timothy Miller (theosib) wrote : IwConfig.txt

apport information

Timothy Miller (theosib) wrote : JournalErrors.txt

apport information

Timothy Miller (theosib) wrote : Lspci.txt

apport information

Timothy Miller (theosib) wrote : Lsusb.txt

apport information

Timothy Miller (theosib) wrote : ProcCpuinfo.txt

apport information

Timothy Miller (theosib) wrote : ProcInterrupts.txt

apport information

Timothy Miller (theosib) wrote : ProcModules.txt

apport information

Timothy Miller (theosib) wrote : UdevDb.txt

apport information

Timothy Miller (theosib) wrote : UdevLog.txt

apport information

Timothy Miller (theosib) wrote : WifiSyslog.txt

apport information

Timothy Miller (theosib) wrote :

I'll see if I can try a newer kernel some time this week. I should note that as long as I'm running multiple processes, this problem doesn't happen. But if I'm running only ONE, then it resists using more than half of physical memory.

penalvch (penalvch)
tags: added: bios-outdated-3.2
Timothy Miller (theosib) wrote :

I have tried the 4.3 kernel. It definitely has the same bug. See this output from top:

top - 07:57:36 up 1 day, 18:26, 3 users, load average: 1.00, 1.01, 1.05
Tasks: 457 total, 2 running, 455 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.1 us, 0.0 sy, 0.0 ni, 97.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 65937648 total, 36575292 used, 29362356 free, 24208 buffers
KiB Swap: 67071996 total, 16111052 used, 50960944 free. 553420 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 5515 theosib   20   0 40.620g 0.025t  27352 R 100.0 41.2 583:03.59 common_shell_ex

There is only one process using much memory, but the OS started using swap after only about half the physical RAM was occupied.
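
To see how one process's footprint is split between RAM and swap, the per-process status file can be read directly. This is a sketch assuming the `VmRSS` and `VmSwap` lines of /proc/<pid>/status, reported in kB; the helper names are hypothetical, and any sample text passed in is whatever the caller supplies, not data from this machine.

```python
def parse_status_kib(text, fields=("VmRSS", "VmSwap")):
    """Extract the requested fields (in kB) from /proc/<pid>/status text."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields and rest.split():
            values[name] = int(rest.split()[0])
    return values


def read_process_memory(pid):
    """Return {"VmRSS": kB, "VmSwap": kB} for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_status_kib(handle.read())
```

For the run above that would be `read_process_memory(5515)`; watching VmSwap grow while VmRSS stays flat at about half of physical RAM would show the behavior described here more directly than the system-wide totals in top.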

penalvch (penalvch) wrote :

Timothy Miller, as per http://www.supermicro.com/support/resources/results.aspx an update to your computer's buggy and outdated BIOS is available (3.2). If you update to it by following https://help.ubuntu.com/community/BIOSUpdate , does that address the issue?

If it doesn't, could you please both specify what happened, and provide the output of the following terminal command:
sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date

For more on BIOS updates and linux, please see https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette .

Please note your current BIOS is already in the Bug Description, so posting this on the old BIOS would not be helpful.

Also, you don't have to create a new bug report.

Once the BIOS is updated, if the problem is still reproducible, and the information above is provided, then please mark this report Status Confirmed. Otherwise, please mark this as Invalid.

Thank you for your understanding.

Timothy Miller (theosib) wrote :

I'm willing to try the BIOS update, but I'll bet you dollars to donuts that this will make no difference. It makes no sense to me that BIOS bugs would have any effect on how the Linux kernel decides when to start swapping.

I can use all of physical memory as long as I'm running multiple processes. If I'm running just one, the kernel decides to start swapping when that one process exceeds half of physical memory, but it WILL use all of physical memory once swap fills up. This is all Linux kernel stuff and has nothing to do with hardware access or processor configuration or mapping of RAM to physical and virtual address spaces.

So what I'd like to know is why you think I should take the risk of applying an unnecessary BIOS update for something that should have nothing to do with the BIOS. I'm willing to be educated on this, but I don't get it. Thank you.

Changed in linux (Ubuntu):
assignee: nobody → Manpreet (manpreetkunnath)