4.15.0 memory allocation issue

Bug #1808412 reported by Marc Gariépy
This bug affects 1 person
Affects          Status      Importance   Assigned to   Milestone
linux (Ubuntu)   Confirmed   Undecided    Unassigned
qemu (Ubuntu)    New         Undecided    Unassigned

Bug Description

My server is:
PowerEdge T630
2x Intel(R) Xeon(R) CPU E5-2623 v4 @ 2.60GHz
128G of RAM
4x VGA compatible controller [0300]: NVIDIA Corporation GP102 [TITAN X] [10de:1b00] (rev a1)

When starting a VM with 116G of RAM, 16 vCPUs, and 4 PCI passthrough devices, memory allocation stalls after about half of the memory has been allocated.
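
For illustration, that corresponds roughly to a qemu invocation like this minimal sketch (the VFIO addresses and disk path are placeholders, not the exact command used):

# VFIO passthrough pins all guest RAM up front, which is what exposes the stall
qemu-system-x86_64 -enable-kvm -m 116G -smp 16 \
  -device vfio-pci,host=04:00.0 \
  -device vfio-pci,host=05:00.0 \
  -device vfio-pci,host=08:00.0 \
  -device vfio-pci,host=09:00.0 \
  -drive file=guest.qcow2,if=virtio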

After upgrading from kernel 4.13.0 to 4.15.0, starting a VM takes a very long time.

I tested these kernels:
linux-image-4.13.0-37 not affected
linux-image-4.13.0-45 not affected
linux-image-4.15.0-34 affected
linux-image-4.15.0-42 affected

After disabling transparent hugepages on 4.15, everything seems to work correctly.

cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-42-generic root=UUID=<some uuid> ro intel_iommu=on transparent_hugepage=never splash quiet vt.handoff=7
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Dec 18 20:02 seq
 crw-rw---- 1 root audio 116, 33 Dec 18 20:02 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.18
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 16.04
HibernationDevice: RESUME=UUID=d1a63627-71f0-4d12-90cb-025ed3aa8439
IwConfig: Error: [Errno 2] No such file or directory
MachineType: Supermicro X8DTH-i/6/iF/6F
Package: linux (not installed)
PciMultimedia:

ProcFB: 0 mgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-42-generic root=UUID=7d5fdcc6-37a6-4b95-bf01-4f5c9637720b ro biosdevname=0 net.ifnames=0 splash intel_iommu=on quiet audit=1 vt.handoff=7
ProcVersionSignature: Ubuntu 4.15.0-42.45~16.04.1-generic 4.15.18
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-42-generic N/A
 linux-backports-modules-4.15.0-42-generic N/A
 linux-firmware 1.157.21
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial
Uname: Linux 4.15.0-42-generic x86_64
UnreportableReason: The report belongs to a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: False
dmi.bios.date: 05/04/12
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2.1b
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: X8DTH
dmi.board.vendor: Supermicro
dmi.board.version: 1234567890
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 17
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 1234567890
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2.1b:bd05/04/12:svnSupermicro:pnX8DTH-i/6/iF/6F:pvr1234567890:rvnSupermicro:rnX8DTH:rvr1234567890:cvnSupermicro:ct17:cvr1234567890:
dmi.product.family: Server
dmi.product.name: X8DTH-i/6/iF/6F
dmi.product.version: 1234567890
dmi.sys.vendor: Supermicro

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote: Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel), please enter the following command in a terminal window:

apport-collect 1808412

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic
Marc Gariépy (mgariepy) wrote:

I cannot run the apport-collect script, but if there is something that is really needed, please ask me and I will do my best to give the information.

Thanks.

Changed in linux (Ubuntu):
status: Incomplete → Opinion
status: Opinion → Confirmed
Marc Gariépy (mgariepy) wrote:

When the memory allocation was stalled, here is what perf top was giving me:
Samples: 483K of event 'cycles:ppp', Event count (approx.): 52114089074
Overhead Shared Object Symbol
  28.53% [kernel] [k] total_mapcount
  25.34% [kernel] [k] kvm_age_rmapp
  13.54% [kernel] [k] slot_rmap_walk_next
  11.24% [kernel] [k] kvm_handle_hva_range
   6.35% [kernel] [k] rmap_get_first
   3.69% [kernel] [k] __x86_indirect_thunk_r13
   1.33% [kernel] [k] __isolate_lru_page
   0.63% [kernel] [k] isolate_lru_pages.isra.58
   0.48% [kernel] [k] page_vma_mapped_walk
   0.40% [kernel] [k] __mod_node_page_state
   0.35% [kernel] [k] clear_page_erms
   0.31% [kernel] [k] shrink_page_list
   0.28% [kernel] [k] _find_next_bit
   0.27% [kernel] [k] putback_inactive_pages
   0.27% [kernel] [k] move_active_pages_to_lru
   0.27% [kernel] [k] inactive_list_is_low
   0.22% [kernel] [k] __mod_zone_page_state
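
For reference, a profile like the one above can be captured while the allocation is stalled with the perf tool (from the linux-tools packages on Ubuntu), e.g.:

sudo perf top -e cycles:ppp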

numactl -H when the memory allocation stalled (successive snapshots):
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 55983 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63810 MB
node distances:
node 0 1
  0: 10 21
  1: 21 10
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 366 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63782 MB
node distances:
node 0 1
  0: 10 21
  1: 21 10
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 368 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63757 MB
node distances:
node 0 1
  0: 10 21
  1: 21 10
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 368 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63744 MB
node distances:
node 0 1
  0: 10 21
  1: 21 10
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 366 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63504 MB
node distances:
node 0 1
  0: 10 21
  1: 21 10

Then I killed the process.
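
The snapshots above were taken by re-running numactl -H by hand; assuming the watch utility is available, something like this gives the same live view of node 0 filling up:

watch -n 5 numactl -H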

Marc Gariépy (mgariepy) attached apport information: CRDA.txt, CurrentDmesg.txt, HookError_generic.txt, Lspci.txt, Lsusb.txt, ProcCpuinfoMinimal.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, WifiSyslog.txt

tags: added: apport-collected xenial
description: updated

Marc Gariépy (mgariepy) wrote:

I see the same behavior on both Bionic and Xenial on the 4.15.0-43 kernel.

4.4.0-141 doesn't have the same issue, but it still behaves oddly.

This happens with libvirt and qemu from both Ubuntu updates and the Ubuntu Cloud Archive.

The problem is more obvious when starting the VM on a server with multiple NUMA nodes.

I reproduced the issue on a dual AMD EPYC 7301 16-core system:

# numactl -H
available: 8 nodes (0-7)
node 0 cpus: 0 1 2 3 32 33 34 35
node 0 size: 32095 MB
node 0 free: 31947 MB
node 1 cpus: 4 5 6 7 36 37 38 39
node 1 size: 32252 MB
node 1 free: 32052 MB
node 2 cpus: 8 9 10 11 40 41 42 43
node 2 size: 32252 MB
node 2 free: 31729 MB
node 3 cpus: 12 13 14 15 44 45 46 47
node 3 size: 32252 MB
node 3 free: 31999 MB
node 4 cpus: 16 17 18 19 48 49 50 51
node 4 size: 32252 MB
node 4 free: 32166 MB
node 5 cpus: 20 21 22 23 52 53 54 55
node 5 size: 32252 MB
node 5 free: 32185 MB
node 6 cpus: 24 25 26 27 56 57 58 59
node 6 size: 32231 MB
node 6 free: 32161 MB
node 7 cpus: 28 29 30 31 60 61 62 63
node 7 size: 32250 MB
node 7 free: 32183 MB
node distances:
node 0 1 2 3 4 5 6 7
  0: 10 16 16 16 32 32 32 32
  1: 16 10 16 16 32 32 32 32
  2: 16 16 10 16 32 32 32 32
  3: 16 16 16 10 32 32 32 32
  4: 32 32 32 32 10 16 16 16
  5: 32 32 32 32 16 10 16 16
  6: 32 32 32 32 16 16 10 16
  7: 32 32 32 32 16 16 16 10

Steps to reproduce:
1- Install Ubuntu with libvirt.
2- Configure PCI passthrough.
3- Create a VM that uses more RAM than a single NUMA node has.
4- Add a PCI device to your VM; this step makes qemu pre-allocate the VM's RAM (see the hostdev sketch after this list).
5- One qemu-system-x86_64 thread takes one core to 100% CPU and RAM usage climbs until the NUMA node of the running core is full; it then stalls until the process moves to a core on another node.
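
For step 4, a hostdev entry along these lines in the libvirt domain XML is enough to trigger the preallocation (the PCI address here is a placeholder; substitute the device actually being passed through):

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
  </source>
</hostdev>

It can be attached with, for example: virsh attach-device <vm-name> hostdev.xml --config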

Setting /sys/kernel/mm/transparent_hugepage/enabled to never helps mitigate the issue.
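
At runtime that is, as root:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

Booting with transparent_hugepage=never on the kernel command line (as shown in the description above) makes it persistent.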
