CC13: Potential memory leak issue with 5.0.1 kmod vRouter
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | New | Undecided | Sivakumar Ganapathy |
Bug Description
Issue Description:
We are investigating an issue on our kernel compute nodes where running workloads (virtualised workloads, all userspace processes and the kmod vRouter) slowly consume more and more host memory. The kernel accounts this memory to the buff/cache pool, leaving little to no memory in the "free" pool.
# free -m
              total        used        free      shared  buff/cache   available
Mem:         257084       35239         699           3      221145         207
Swap:             0           0           0
This is to be expected under normal operation, but issues arise when the virtualised workloads require more memory: libvirt attempts to allocate more host memory but finds none available, which triggers an OOM error, and the VM then crashes.
The expectation is that the kernel should release memory from the buff/cache pool to libvirt, but this does not seem to be the case.
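The mismatch described above can be made concrete with a small sketch that parses the `free -m` output quoted earlier in this report and compares buff/cache against the "available" column. The function name `parse_free_mem` is hypothetical, and the figures are taken exactly as quoted. On a healthy host most of buff/cache would normally be counted as available; here almost none of it is.

```python
def parse_free_mem(text):
    """Parse the 'Mem:' row of `free -m` output into a dict keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The header row is the one naming the columns (total, used, ...).
    header = next(ln for ln in lines if "total" in ln and "used" in ln).split()
    # The Mem: row carries one integer (MiB) per header column.
    mem = next(ln for ln in lines if ln.lstrip().startswith("Mem:")).split()[1:]
    return dict(zip(header, (int(v) for v in mem)))


# Output as quoted in this report (values in MiB).
SAMPLE = """\
total used free shared buff/cache available
Mem: 257084 35239 699 3 221145 207
Swap: 0 0 0
"""

mem = parse_free_mem(SAMPLE)

# Memory counted under buff/cache that the kernel does NOT consider available.
not_available = mem["buff/cache"] - mem["available"]

# Sanity check: used + free + buff/cache should roughly equal total.
accounted = mem["used"] + mem["free"] + mem["buff/cache"]

print(f"buff/cache not counted as available: {not_available} MiB")
print(f"accounted for: {accounted} of {mem['total']} MiB")
```

Taking the quoted numbers at face value, nearly all of the buff/cache figure is excluded from "available", which is consistent with the memory not being reclaimable rather than being ordinary page cache.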
libvirt log:
Nov 16 04:53:16 overcloud63m-comp-3 kernel: [691744] 0 691744 876 40 6 0 0 sh
Nov 16 04:53:16 overcloud63m-comp-3 kernel: Out of memory: Kill process 258272 (qemu-kvm) score 128 or sacrifice child
Nov 16 04:53:16 overcloud63m-comp-3 kernel: Killed process 258272 (qemu-kvm) total-vm:
Nov 16 04:53:16 overcloud63m-comp-3 journal: 2018-11-16 09:53:16.368+0000: 240077: warning : qemuGetProcessI
Nov 16 04:53:16 overcloud63m-comp-3 journal: 2018-11-16 09:53:16.368+0000: 240077: error : virProcessGetAf
Nov 16 04:53:16 overcloud63m-comp-3 journal: 2018-11-16 09:53:16.380+0000: 240079: warning : qemuGetProcessI
Nov 16 04:53:16 overcloud63m-comp-3 journal: 2018-11-16 09:53:16.380+0000: 240079: error : virProcessGetAf
Nov 16 04:53:17 overcloud63m-comp-3 journal: 2018-11-16 09:53:17.568+0000: 240031: error : qemuMonitorIORe
Nov 16 04:53:17 overcloud63m-comp-3 kvm: 0 guests now active
Nov 16 04:53:17 overcloud63m-comp-3 systemd-machined: Machine qemu-1-
terminated.
Red Hat is also investigating this issue (link below) and has found that this memory is being marked as unreclaimable by the kernel. The likely cause is a slab memory leak, potentially in the vRouter kernel module, but this is as yet unconfirmed.
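One way to check the unreclaimable-slab hypothesis on an affected node is to watch the `Slab`, `SReclaimable` and `SUnreclaim` fields of `/proc/meminfo` over time. The following is a minimal sketch, not a tool used in this investigation; the helper names `parse_meminfo`, `slab_summary` and `read_slab_summary` are hypothetical.

```python
def parse_meminfo(text):
    """Return {field: integer value} from /proc/meminfo-style text (values mostly in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a 'Name: value' shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_summary(fields):
    """Summarise slab usage; SUnreclaim is slab memory the kernel cannot reclaim."""
    total = fields.get("Slab", 0)
    unreclaim = fields.get("SUnreclaim", 0)
    return {
        "slab_kb": total,
        "reclaimable_kb": fields.get("SReclaimable", 0),
        "unreclaimable_kb": unreclaim,
        "unreclaimable_fraction": unreclaim / total if total else 0.0,
    }


def read_slab_summary(path="/proc/meminfo"):
    """Read the live values on a Linux host."""
    with open(path) as fh:
        return slab_summary(parse_meminfo(fh.read()))
```

Calling `read_slab_summary()` at intervals and comparing `unreclaimable_kb` would show whether unreclaimable slab grows steadily while workloads run. If it does, the per-cache breakdown in `/proc/slabinfo` (or `slabtop`) may help narrow down which allocator cache is growing; whether that points at the vRouter module is still an open question.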
summary:
- Potential memory leak issue with 5.0.1 kmod vRouter
+ CC13: Potential memory leak issue with 5.0.1 kmod vRouter
information type: Proprietary → Public
Changed in juniperopenstack:
assignee: nobody → Sivakumar Ganapathy (hotlava51)