nova version: newton , libvirt realtime feature mlockall: Cannot allocate memory

Bug #1695056 reported by Wei Zhu
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
In Progress
Undecided
Sahid Orentino

Bug Description

Trying to use this feature on Newton:
https://specs.openstack.org/openstack/nova-specs/specs/mitaka/implemented/libvirt-real-time.html

Is a low-latency (realtime) kernel a requirement?
In any case, I'm using 4.4.0-78-lowlatency now, and the following fails.
(It worked when I manually copied the portion before <devices> into a virsh XML with a <memtune> element added:)

  <memtune>
    <hard_limit unit='KiB'>20971520</hard_limit>
  </memtune>
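For reference, the figures in this snippet are plain GiB-to-KiB conversions (my own arithmetic, not from nova): the guest has 12 GiB of RAM, and the manually added hard limit grants 20 GiB, leaving roughly 8 GiB of headroom for QEMU itself.

```python
# Sanity check of the KiB figures above (plain arithmetic, not nova code).
GIB_TO_KIB = 1024 * 1024

guest_memory_kib = 12 * GIB_TO_KIB  # 12582912, matches <memory> in the XML
hard_limit_kib = 20 * GIB_TO_KIB    # 20971520, matches <hard_limit>

print(guest_memory_kib, hard_limit_kib)
```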

warning: host doesn't support requested feature: CPUID.01H:EDX.ds [bit 21]
warning: host doesn't support requested feature: CPUID.01H:EDX.acpi [bit 22]
warning: host doesn't support requested feature: CPUID.01H:EDX.ht [bit 28]
warning: host doesn't support requested feature: CPUID.01H:EDX.tm [bit 29]
warning: host doesn't support requested feature: CPUID.01H:EDX.pbe [bit 31]
warning: host doesn't support requested feature: CPUID.01H:ECX.dtes64 [bit 2]
warning: host doesn't support requested feature: CPUID.01H:ECX.monitor [bit 3]
warning: host doesn't support requested feature: CPUID.01H:ECX.ds_cpl [bit 4]
warning: host doesn't support requested feature: CPUID.01H:ECX.smx [bit 6]
warning: host doesn't support requested feature: CPUID.01H:ECX.est [bit 7]
warning: host doesn't support requested feature: CPUID.01H:ECX.tm2 [bit 8]
warning: host doesn't support requested feature: CPUID.01H:ECX.xtpr [bit 14]
warning: host doesn't support requested feature: CPUID.01H:ECX.pdcm [bit 15]
warning: host doesn't support requested feature: CPUID.01H:ECX.dca [bit 18]
warning: host doesn't support requested feature: CPUID.01H:ECX.osxsave [bit 27]
warning: host doesn't support requested feature: CPUID.01H:EDX.ds [bit 21]
warning: host doesn't support requested feature: CPUID.01H:EDX.acpi [bit 22]
warning: host doesn't support requested feature: CPUID.01H:EDX.ht [bit 28]
warning: host doesn't support requested feature: CPUID.01H:EDX.tm [bit 29]
warning: host doesn't support requested feature: CPUID.01H:EDX.pbe [bit 31]
warning: host doesn't support requested feature: CPUID.01H:ECX.dtes64 [bit 2]
warning: host doesn't support requested feature: CPUID.01H:ECX.monitor [bit 3]
warning: host doesn't support requested feature: CPUID.01H:ECX.ds_cpl [bit 4]
warning: host doesn't support requested feature: CPUID.01H:ECX.smx [bit 6]
warning: host doesn't support requested feature: CPUID.01H:ECX.est [bit 7]
warning: host doesn't support requested feature: CPUID.01H:ECX.tm2 [bit 8]
warning: host doesn't support requested feature: CPUID.01H:ECX.xtpr [bit 14]
warning: host doesn't support requested feature: CPUID.01H:ECX.pdcm [bit 15]
warning: host doesn't support requested feature: CPUID.01H:ECX.dca [bit 18]
warning: host doesn't support requested feature: CPUID.01H:ECX.osxsave [bit 27]
mlockall: Cannot allocate memory
2017-06-01T14:47:43.122805Z qemu-system-x86_64: locking memory failed

>>>> this is the domain XML generated by OpenStack:
<domain type='kvm' id='13'>
  <name>instance-0000005f</name>
  <uuid>de8a2358-f5cb-426a-ad2b-840f5f6ddfcf</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="14.0.4"/>
      <nova:name>vlns1_pfe</nova:name>
      <nova:creationTime>2017-06-01 13:33:29</nova:creationTime>
      <nova:flavor name="vfpfl">
        <nova:memory>12288</nova:memory>
        <nova:disk>4</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>12</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="50f2aa60006047198e8181e5d1ff2173">admin</nova:user>
        <nova:project uuid="3f79af00c3bc455f99adcde0826ca1ce">admin</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="0d954553-8d39-402b-9190-8fc034b38351"/>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>12582912</memory>
  <currentMemory unit='KiB'>12582912</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>12</vcpu>
  <cputune>
    <shares>12288</shares>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='17'/>
    <vcpupin vcpu='2' cpuset='0'/>
    <vcpupin vcpu='3' cpuset='16'/>
    <vcpupin vcpu='4' cpuset='3'/>
    <vcpupin vcpu='5' cpuset='19'/>
    <vcpupin vcpu='6' cpuset='6'/>
    <vcpupin vcpu='7' cpuset='22'/>
    <vcpupin vcpu='8' cpuset='4'/>
    <vcpupin vcpu='9' cpuset='20'/>
    <vcpupin vcpu='10' cpuset='7'/>
    <vcpupin vcpu='11' cpuset='23'/>
    <emulatorpin cpuset='1'/>
    <vcpusched vcpus='1-11' scheduler='fifo' priority='1'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>OpenStack Foundation</entry>
      <entry name='product'>OpenStack Nova</entry>
      <entry name='version'>14.0.4</entry>
      <entry name='serial'>06b68615-3f83-4484-b195-0d68d5d67077</entry>
      <entry name='uuid'>de8a2358-f5cb-426a-ad2b-840f5f6ddfcf</entry>
      <entry name='family'>Virtual Machine</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-xenial'>hvm</type>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
    <topology sockets='6' cores='1' threads='2'/>
    <numa>
      <cell id='0' cpus='0-11' memory='12582912' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm-spice</emulator>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <auth username='nova'>
        <secret type='ceph' uuid='f8d2c231-1e4b-4fe3-8f55-eef652966ec5'/>
      </auth>
      <source protocol='rbd' name='vms/de8a2358-f5cb-426a-ad2b-840f5f6ddfcf_disk'>
        <host name='192.168.30.111' port='6789'/>
        <host name='192.168.30.112' port='6789'/>
        <host name='192.168.30.113' port='6789'/>
      </source>
      <backingStore/>
      <target dev='hda' bus='ide'/>
      <alias name='ide0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='network' device='cdrom'>
      <driver name='qemu' type='raw' cache='none'/>
      <auth username='nova'>
        <secret type='ceph' uuid='f8d2c231-1e4b-4fe3-8f55-eef652966ec5'/>
      </auth>
      <source protocol='rbd' name='vms/de8a2358-f5cb-426a-ad2b-840f5f6ddfcf_disk.config'>
        <host name='192.168.30.111' port='6789'/>
        <host name='192.168.30.112' port='6789'/>
        <host name='192.168.30.113' port='6789'/>
      </source>
      <backingStore/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <alias name='ide0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='bridge'>
      <mac address='fa:16:3e:bc:b1:33'/>
      <source bridge='br-int'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='24dbd5a2-84cd-48d9-b987-7f0def3bd2bf'/>
      </virtualport>
      <target dev='tap24dbd5a2-84'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='fa:16:3e:e9:b5:61'/>
      <source bridge='br-int'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='619b47e3-d60c-4bce-8404-843cb3c2867f'/>
      </virtualport>
      <target dev='tap619b47e3-d6'/>
      <model type='virtio'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
    <interface type='hostdev' managed='yes'>
      <mac address='fa:16:3e:31:30:b4'/>
      <driver name='vfio'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x02' slot='0x10' function='0x2'/>
      </source>
      <vlan>
        <tag id='31'/>
      </vlan>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </interface>
    <interface type='hostdev' managed='yes'>
      <mac address='fa:16:3e:6a:cd:d7'/>
      <driver name='vfio'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x02' slot='0x10' function='0x0'/>
      </source>
      <vlan>
        <tag id='32'/>
      </vlan>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </interface>
    <serial type='tcp'>
      <source mode='bind' host='127.0.0.1' service='10000'/>
      <protocol type='raw'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <serial type='pty'>
      <target port='1'/>
      <alias name='serial1'/>
    </serial>
    <console type='tcp'>
      <source mode='bind' host='127.0.0.1' service='10000'/>
      <protocol type='raw'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='5902' autoport='yes' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <stats period='10'/>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='apparmor' relabel='yes'>
    <label>libvirt-de8a2358-f5cb-426a-ad2b-840f5f6ddfcf</label>
    <imagelabel>libvirt-de8a2358-f5cb-426a-ad2b-840f5f6ddfcf</imagelabel>
  </seclabel>
</domain>

>>>>> this one tested good with <memtune> added:
<domain type='kvm' id='4'>
  <name>vlns-pfe</name>
  <uuid>9d5e4b00-ffac-4e9d-9d97-70ad3ac1de2c</uuid>
  <memory unit='KiB'>12582912</memory>
  <currentMemory unit='KiB'>12582912</currentMemory>
  <memtune>
    <hard_limit unit='KiB'>20971520</hard_limit>
  </memtune>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>12</vcpu>
  <cputune>
    <shares>12288</shares>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='17'/>
    <vcpupin vcpu='2' cpuset='0'/>
    <vcpupin vcpu='3' cpuset='16'/>
    <vcpupin vcpu='4' cpuset='3'/>
    <vcpupin vcpu='5' cpuset='19'/>
    <vcpupin vcpu='6' cpuset='6'/>
    <vcpupin vcpu='7' cpuset='22'/>
    <vcpupin vcpu='8' cpuset='4'/>
    <vcpupin vcpu='9' cpuset='20'/>
    <vcpupin vcpu='10' cpuset='7'/>
    <vcpupin vcpu='11' cpuset='23'/>
    <emulatorpin cpuset='1'/>
    <vcpusched vcpus='1-11' scheduler='fifo' priority='1'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-xenial'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
    <topology sockets='6' cores='1' threads='2'/>
    <numa>
      <cell id='0' cpus='0-11' memory='12582912' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>

Wei Zhu (slackwei)
summary: - libvirt realtime feature mlockall: Cannot allocate memory
+ libvirt realtime feature mlockall: Cannot allocate memory
Revision history for this message
Matt Riedemann (mriedem) wrote : Re: libvirt realtime feature mlockall: Cannot allocate memory

Which version of libvirt / qemu do you have installed?

tags: added: libvirt
Revision history for this message
Wei Zhu (slackwei) wrote :

root@TS01:~# libvirtd --version
libvirtd (libvirt) 1.3.1
root@TS01:~# kvm --version
QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.14), Copyright (c) 2003-2008 Fabrice Bellard

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

It's a limitation of your host: when using hugepages, the memory is locked and reserved. You can consider setting this limit in /etc/security/limits.conf.

Revision history for this message
Wei Zhu (slackwei) wrote :

My main intention for this feature is to secure the emulator pin, not the locked memory, but locking cannot be disabled.

I have memlock set to unlimited; it didn't help here:

libvirt-qemu - memlock unlimited
root - memlock unlimited

Also, according to the Red Hat doc:
"When setting locked, a hard_limit must be set in the <memtune> element to the maximum memory configured for the guest, plus any memory consumed by the process itself."

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/Virtualization_Tuning_and_Optimization_Guide/index.html

Revision history for this message
Wei Zhu (slackwei) wrote :

Went over a Red Hat bug with a similar symptom; it seems <memtune><hard_limit> needs to be exposed in order for <locked> to work well with this feature:

https://bugzilla.redhat.com/show_bug.cgi?id=1316774

Revision history for this message
Wei Zhu (slackwei) wrote :

The default hard_limit is unlimited, but the VM couldn't start, as it needs real numbers. Allocating 12G for a guest with two vfio NICs, I had to set hard_limit to 15G to make it work.
Since I'm using Heat to provision the VM, is there a method to preset these values?

Before,
root@TS01:~# virsh memtune vlns-pfe
hard_limit : unlimited
soft_limit : unlimited
swap_hard_limit: unlimited

After,
root@TS01:~# virsh memtune vlns-pfe
hard_limit : 15000000
soft_limit : unlimited
swap_hard_limit: unlimited

Revision history for this message
Wei Zhu (slackwei) wrote :

Hi Sahid,

Wondering if there is any concern with disabling these in driver.py as a temporary workaround? I have hugepages enabled for KVM, with KSM disabled.

        if wantsrealtime:
            if not membacking:
                membacking = vconfig.LibvirtConfigGuestMemoryBacking()
            #membacking.locked = True
            #membacking.sharedpages = False

Meanwhile, wondering if there can be an enhancement to not enforce the locked option if the OpenStack flavor already has hugepages enabled; otherwise this feature doesn't work with newer libvirt:

"hw:mem_page_size": "2048",
"hw:numa_nodes": "1",
"hw:cpu_realtime_mask": "^0",
"hw:cpu_realtime": "yes"

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

I don't think that's something we want; even if you have disabled KSM, that's not the general case. For realtime we want to reduce latency, not memory consumption.

What we need to do is to specifically set the hard limit when using locked memory.

Changed in nova:
assignee: nobody → sahid (sahid-ferdjaoui)
Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

Wei Zhu, can you test this and let me know the result?

diff --git a/nova/virt/libvirt/driver.py b/nova/virt/libvirt/driver.py
index 18005ec..cade2ee 100644
--- a/nova/virt/libvirt/driver.py
+++ b/nova/virt/libvirt/driver.py
@@ -4736,6 +4736,9 @@ class LibvirtDriver(driver.ComputeDriver):
             instance.numa_topology,
             guest_numa_config.numatune,
             flavor)
+        if guest.membacking and guest.membacking.locked:
+            guest.memtune = vconfig.LibvirtConfigGuestMemoryTune()
+            guest.memtune.hard_limit = guest.memory

         guest.metadata.append(self._get_guest_config_meta(instance))
         guest.idmaps = self._get_guest_idmaps()

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :
Revision history for this message
Wei Zhu (slackwei) wrote :

Thanks Sahid, so the hard_limit is set to the guest memory.
In my previous example, guest memory is 12288. I have two vfio host interfaces, and I read somewhere that they need 1G of memory each, so I had to make the hard limit 15000; even 14000 didn't work.
Can we expose hard_limit as a new flavor key?
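A rough sketch of what such a heuristic might look like (hypothetical code, not nova's implementation; the 1 GiB-per-VFIO-device figure and the fixed QEMU overhead are assumptions for illustration, taken from the numbers reported above):

```python
MIB_TO_KIB = 1024

def estimate_hard_limit_kib(guest_mem_mib, num_vfio_devices,
                            qemu_overhead_mib=512):
    """Guess a <hard_limit> from guest RAM plus per-device overhead.

    Assumes roughly 1 GiB of lockable memory per VFIO passthrough device
    plus a fixed QEMU overhead; both figures are illustrative guesses,
    not documented constants.
    """
    overhead_mib = num_vfio_devices * 1024 + qemu_overhead_mib
    return (guest_mem_mib + overhead_mib) * MIB_TO_KIB

# 12288 MiB guest with two VFIO NICs: in the same ballpark as the
# 15000000 KiB value that worked in practice.
print(estimate_hard_limit_kib(12288, 2))
```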

locked
When set and supported by the hypervisor, memory pages belonging to the domain will be locked in host's memory and the host will not be allowed to swap them out, which might be required for some workloads such as real-time. For QEMU/KVM guests, the memory used by the QEMU process itself will be locked too: unlike guest memory, this is an amount libvirt has no way of figuring out in advance, so it has to remove the limit on locked memory altogether. Thus, enabling this option opens up to a potential security risk: the host will be unable to reclaim the locked memory back from the guest when it's running out of memory, which means a malicious guest allocating large amounts of locked memory could cause a denial-of-service attack on the host. Because of this, using this option is discouraged unless your workload demands it; even then, it's highly recommended to set an hard_limit (see memory tuning) on memory allocation suitable for the specific environment at the same time to mitigate the risks described above. Since 1.0.6

Revision history for this message
Wei Zhu (slackwei) wrote :

I made the change per comment #9; the VM couldn't start, as hard_limit is not enough (comment #11):
  <currentMemory unit='KiB'>12582912</currentMemory>
  <memtune>
    <hard_limit unit='KiB'>12582912</hard_limit>
  </memtune>

So I hardcoded hard_limit in driver.py instead, and the VM starts well:
   guest.memtune.hard_limit = 20000000

Here is the full XML:

<domain type='kvm' id='43'>
  <name>instance-000000a7</name>
  <uuid>b4b9e187-3df3-4b0c-bbb2-ccb11a90c58e</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="14.0.4"/>
      <nova:name>vlns1_pfe</nova:name>
      <nova:creationTime>2017-06-08 12:32:47</nova:creationTime>
      <nova:flavor name="vfpfl">
        <nova:memory>12288</nova:memory>
        <nova:disk>4</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>12</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="50f2aa60006047198e8181e5d1ff2173">admin</nova:user>
        <nova:project uuid="3f79af00c3bc455f99adcde0826ca1ce">admin</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="0d954553-8d39-402b-9190-8fc034b38351"/>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>12582912</memory>
  <currentMemory unit='KiB'>12582912</currentMemory>
  <memtune>
    <hard_limit unit='KiB'>20000000</hard_limit>
  </memtune>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>12</vcpu>
  <cputune>
    <shares>12288</shares>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='17'/>
    <vcpupin vcpu='2' cpuset='0'/>
    <vcpupin vcpu='3' cpuset='16'/>
    <vcpupin vcpu='4' cpuset='3'/>
    <vcpupin vcpu='5' cpuset='19'/>
    <vcpupin vcpu='6' cpuset='6'/>
    <vcpupin vcpu='7' cpuset='22'/>
    <vcpupin vcpu='8' cpuset='4'/>
    <vcpupin vcpu='9' cpuset='20'/>
    <vcpupin vcpu='10' cpuset='7'/>
    <vcpupin vcpu='11' cpuset='23'/>
    <emulatorpin cpuset='1'/>
    <vcpusched vcpus='1-11' scheduler='fifo' priority='1'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>OpenStack Foundation</entry>
      <entry name='product'>OpenStack Nova</entry>
      <entry name='version'>14.0.4</entry>
      <entry name='serial'>06b68615-3f83-4484-b195-0d68d5d67077</entry>
      <entry name='uuid'>b4b9e187-3df3-4b0c-bbb2-ccb11a90c58e</entry>
      <entry name='family'>Virtual Machine</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-xenial'>hvm</type>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
    <topology sockets='6' cores='1' threads='2'/>
    <numa>
      <cell id='0' cpus='0-11' memory='12582912' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer na...


Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

So nothing is really clear on how to configure this element. Someone on #virt says we can set it to the host memory size, but there is a security concern: QEMU might be compromised or have a memory leak, and then the host itself could crash.

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

Wei Zhu, can you try with a recent libvirt version (3.2.0 at least)? It seems that libvirt now automatically sets the limit to unlimited when using locked.

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :
Revision history for this message
Wei Zhu (slackwei) wrote :

It's very hard to make libvirt 3.2.1 work on Ubuntu 16.04; the latest existing package is 1.3.1.
Is it possible to expose hard_limit as a flavor key, or to disable <locked> via an alternative flavor key? My understanding is that with hugepages the memory cannot be swapped, so there is no need for <locked>.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/472633

Changed in nova:
status: New → In Progress
Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote : Re: libvirt realtime feature mlockall: Cannot allocate memory

Wei Zhu, can you try that patch [0]? Basically it's what libvirt does starting with 3.1.0 [1].

[0] https://review.openstack.org/472633
[1] https://www.redhat.com/archives/libvir-list/2017-March/msg01092.html

Revision history for this message
Wei Zhu (slackwei) wrote :

Using Newton, my fakelibvirt.py file content is a bit different; driver.py is fine.

This doesn't work (I guess the value is too big), so I hardcoded HARD_LIMIT_UNLIMITED = 9007199254740988 instead.

>>> print sys.maxint >> 10
9007199254740991

The following is the default hard_limit when the VM starts:
root@TS01:/usr/lib/python2.7/dist-packages/nova/virt/libvirt# virsh memtune 1
hard_limit : 9007199254740988
soft_limit : 9007199254740988

Revision history for this message
Wei Zhu (slackwei) wrote :

With kernel 3.18 (since commit 3e32cb2e0a12b6915056ff04601cf1bb9b44f967) the
"unlimited" value for cgroup memory limits has changed once again as its byte
value is now computed from a page counter.
The new "unlimited" value reported by the cgroup fs is therefore 2**51-1 pages
which is (VIR_DOMAIN_MEMORY_PARAM_UNLIMITED - 3072). This results e.g. in virsh
memtune displaying 9007199254740988 instead of unlimited for the limits.

This patch deals with the rounding issue by scaling the byte values reported
by the kernel and the PARAM_UNLIMITED value to page size and comparing those.

See also libvirt commit 231656bbeb9e4d3bedc44362784c35eee21cf0f4 for the
history for kernel 3.12 and before.
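The arithmetic in the commit message above can be checked directly (assuming the usual 4 KiB x86 page size):

```python
PAGE_SIZE = 4096  # bytes; the usual x86 page size

# Kernel >= 3.18: cgroup memory limits are page counters capped at
# 2**51 - 1 pages, so the "unlimited" byte value the cgroup fs reports is:
cgroup_unlimited_bytes = (2**51 - 1) * PAGE_SIZE

# libvirt's VIR_DOMAIN_MEMORY_PARAM_UNLIMITED is LLONG_MAX >> 10, in KiB:
param_unlimited_kib = (2**63 - 1) >> 10

print(cgroup_unlimited_bytes // 1024)  # 9007199254740988, what virsh shows
print(param_unlimited_kib)             # 9007199254740991
# The 3072-byte (3 KiB) gap is the offset the commit message refers to:
print(param_unlimited_kib * 1024 - cgroup_unlimited_bytes)  # 3072
```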

Revision history for this message
Wei Zhu (slackwei) wrote :

Sahid, can we use "HARD_LIMIT_UNLIMITED = (sys.maxint >> 10) - 3" instead? It tested good for me.

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

What exactly is the error? Can you share the traceback, so at least I can explain the problem in the fix as a side note?

Thanks

Revision history for this message
Wei Zhu (slackwei) wrote :

It's basically the same mlock error, as the memtune option is not there; I think the value of sys.maxint >> 10 is too big. When I hardcoded a smaller value or used "(sys.maxint >> 10) - 3", the memtune option appeared.
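One Python detail worth flagging here (a language note, not anything nova-specific): the parentheses in that expression matter, because the shift operator binds more loosely than subtraction:

```python
# sys.maxint only exists on Python 2; on a 64-bit build it equals 2**63 - 1.
maxint = 2**63 - 1

# Without parentheses, Python evaluates the subtraction first:
assert maxint >> 10 - 3 == maxint >> 7            # shift by 7: far too large
# With parentheses, you get the intended limit value:
assert (maxint >> 10) - 3 == 9007199254740988     # 2**53 - 4
```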

Revision history for this message
Wei Zhu (slackwei) wrote :

Here, with HARD_LIMIT_UNLIMITED = sys.maxint >> 10, the <memtune> option is missing:

2017-06-13 11:49:38.907 12189 ERROR nova.virt.libvirt.guest [req-c831c761-f4d8-4582-a083-72e4f24fb95d 50f2aa60006047198e8181e5d1ff2173 3f79af00c3bc455f99adcde0826ca1ce - - -] Error launching a defined domain with XML: <domain type='kvm'>
  <name>instance-000000dd</name>
  <uuid>630614b5-f29b-4e15-9a73-f13b52743d5a</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="14.0.5"/>
      <nova:name>vlns1_pfe</nova:name>
      <nova:creationTime>2017-06-13 15:49:31</nova:creationTime>
      <nova:flavor name="vfpfl">
        <nova:memory>12288</nova:memory>
        <nova:disk>4</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>12</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="50f2aa60006047198e8181e5d1ff2173">admin</nova:user>
        <nova:project uuid="3f79af00c3bc455f99adcde0826ca1ce">admin</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="36d52e83-3d91-4368-955b-0259cdb5367e"/>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>12582912</memory>
  <currentMemory unit='KiB'>12582912</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>12</vcpu>
  <cputune>
    <shares>12288</shares>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='17'/>
    <vcpupin vcpu='2' cpuset='3'/>
    <vcpupin vcpu='3' cpuset='19'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='22'/>
    <vcpupin vcpu='6' cpuset='4'/>
    <vcpupin vcpu='7' cpuset='20'/>
    <vcpupin vcpu='8' cpuset='7'/>
    <vcpupin vcpu='9' cpuset='23'/>
    <vcpupin vcpu='10' cpuset='5'/>
    <vcpupin vcpu='11' cpuset='21'/>
    <emulatorpin cpuset='1'/>
    <vcpusched vcpus='1-11' scheduler='fifo' priority='1'/>
  </cputune>

Revision history for this message
Wei Zhu (slackwei) wrote :

With HARD_LIMIT_UNLIMITED = (sys.maxint >> 10) - 3, it works with the correct memtune and hard_limit:

  <memory unit='KiB'>12582912</memory>
  <currentMemory unit='KiB'>12582912</currentMemory>
  <memtune>
    <hard_limit unit='KiB'>9007199254740988</hard_limit>
  </memtune>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>12</vcpu>
  <cputune>
    <shares>12288</shares>
    <vcpupin vcpu='0' cpuset='1'/>
    <vcpupin vcpu='1' cpuset='17'/>
    <vcpupin vcpu='2' cpuset='3'/>
    <vcpupin vcpu='3' cpuset='19'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='22'/>
    <vcpupin vcpu='6' cpuset='4'/>
    <vcpupin vcpu='7' cpuset='20'/>
    <vcpupin vcpu='8' cpuset='7'/>
    <vcpupin vcpu='9' cpuset='23'/>
    <vcpupin vcpu='10' cpuset='5'/>
    <vcpupin vcpu='11' cpuset='21'/>
    <emulatorpin cpuset='1'/>
    <vcpusched vcpus='1-11' scheduler='fifo' priority='1'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>

Revision history for this message
Wei Zhu (slackwei) wrote :

Sahid, it seems the kmem max is 2**53 - 3 instead of 2**53 - 1?

Revision history for this message
Wei Zhu (slackwei) wrote :

Sorry, 2**53 - 4 instead.

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

OK, I think we should set it to -1 so libvirt will take care of using the unlimited value by reading the cgroup. I'm going to update the patch and run some tests. Can you do the same on your side, to ensure it works well?

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

OK, so that does not work with -1, which is strange, since it works with virsh:

 virsh memtune dom --hard-limit -1 --live

ute[9794]: INFO nova.virt.libvirt.driver [None req-389bfc12-d0fd-4039-94de-f44b8acaf0cf demo admin] [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] Deletion of /opt/stack/data/nova/instances/
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [None req-389bfc12-d0fd-4039-94de-f44b8acaf0cf demo admin] [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] Instance failed to spawn: libvirtError: XML er
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] Traceback (most recent call last):
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] File "/opt/stack/nova/nova/compute/manager.py", line 2154, in _build_resources
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] yield resources
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] File "/opt/stack/nova/nova/compute/manager.py", line 1960, in _build_and_run_instance
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] block_device_info=block_device_info)
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2781, in spawn
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] destroy_disks_on_failure=True)
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5226, in _create_domain_and_network
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] destroy_disks_on_failure)
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] self.force_reraise()
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] six.reraise(self.type_, self.value, self.tb)
Jun 16 05:16:45 localhost.localdomain nova-compute[9794]: ERROR nova.compute.manager [instance: 1bcc5575-60e4-4f91-951c-68e3b102cc94] File "...


Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

Actually, virsh handles that value itself when we pass -1, so it's not a libvirt bug.

Revision history for this message
Wei Zhu (slackwei) wrote :

HARD_LIMIT_UNLIMITED = -1

2017-06-16 07:47:16.572 13881 ERROR nova.virt.libvirt.guest [req-c4728dee-aa4d-4995-9f5c-7e466d10e74c 50f2aa60006047198e8181e5d1ff2173 3f79af00c3bc455f99adcde0826ca1ce - - -] Error defining a domain with XML: <domain type="kvm">
  <uuid>f83666bd-90d3-4c02-a5d5-a59b73a1bc14</uuid>
  <name>instance-000000e7</name>
  <memory>12582912</memory>
  <memoryBacking>
    <hugepages>
      <page size="2048" nodeset="0" unit="KiB"/>
    </hugepages>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <memtune>
    <hard_limit units="K">-1</hard_limit>
  </memtune>

Revision history for this message
Wei Zhu (slackwei) wrote :

When using memtune hard-limit = -1, libvirt removed the memtune portion entirely, and the VM later crashed:

2017-06-16T12:05:41.234284Z qemu-system-x86_64: terminating on signal 15 from pid 3407
2017-06-16 12:05:42.035+0000: shutting down

root@TS01:/usr/lib/python2.7/dist-packages/nova/virt/libvirt# virsh dumpxml 37
<domain type='kvm' id='37'>
  <name>instance-000000f3</name>
  <uuid>ee2bb7ec-fc23-453e-8104-09cbfad08370</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="14.0.5"/>
      <nova:name>vlns1_pfe</nova:name>
      <nova:creationTime>2017-06-16 11:55:12</nova:creationTime>
      <nova:flavor name="vfpfl">
        <nova:memory>12288</nova:memory>
        <nova:disk>4</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>12</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="50f2aa60006047198e8181e5d1ff2173">admin</nova:user>
        <nova:project uuid="3f79af00c3bc455f99adcde0826ca1ce">admin</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="36d52e83-3d91-4368-955b-0259cdb5367e"/>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>12582912</memory>
  <currentMemory unit='KiB'>12582912</currentMemory>
  <memtune>
    <hard_limit unit='KiB'>9007199254740988</hard_limit>
  </memtune>

root@TS01:/usr/lib/python2.7/dist-packages/nova/virt/libvirt# virsh memtune 37 --hard-limit=-1 --live

root@TS01:/usr/lib/python2.7/dist-packages/nova/virt/libvirt# virsh dumpxml 37
<domain type='kvm' id='37'>
  <name>instance-000000f3</name>
  <uuid>ee2bb7ec-fc23-453e-8104-09cbfad08370</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="14.0.5"/>
      <nova:name>vlns1_pfe</nova:name>
      <nova:creationTime>2017-06-16 11:55:12</nova:creationTime>
      <nova:flavor name="vfpfl">
        <nova:memory>12288</nova:memory>
        <nova:disk>4</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>12</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="50f2aa60006047198e8181e5d1ff2173">admin</nova:user>
        <nova:project uuid="3f79af00c3bc455f99adcde0826ca1ce">admin</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="36d52e83-3d91-4368-955b-0259cdb5367e"/>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>12582912</memory>
  <currentMemory unit='KiB'>12582912</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>12</vcpu>
  <cputune>

Revision history for this message
Wei Zhu (slackwei) wrote :

This works well, using the maximum available physical memory on the system:

HARD_LIMIT_UNLIMITED = os.sysconf('SC_PAGE_SIZE') * os.sysconf('SC_AVPHYS_PAGES') >> 10
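For reference, a minimal standalone sketch of that same computation (variable names are mine, not from nova): it converts the host's currently available physical memory from pages to KiB, which is the unit the <hard_limit> element expects.

```python
import os

# Workaround sketch (hypothetical, not the merged nova fix): derive a
# hard_limit in KiB from the memory currently available on the host,
# instead of passing -1, which older libvirt rejects for <memtune>.
page_size = os.sysconf('SC_PAGE_SIZE')       # bytes per page
avail_pages = os.sysconf('SC_AVPHYS_PAGES')  # pages currently available
hard_limit_kib = (page_size * avail_pages) >> 10  # bytes -> KiB

print(hard_limit_kib)
```

One caveat with this approach: it snapshots available memory at call time, so the limit depends on what else is running on the host when the guest is defined.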

Revision history for this message
Wei Zhu (slackwei) wrote :

Sahid, is it possible to use system physical memory instead?

Revision history for this message
Wei Zhu (slackwei) wrote :

Hi Sahid, just following up; much appreciated.

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

Wei, can you elaborate a little more on why you would do that? In the end it's going to be the same behavior, or did I miss something?

Revision history for this message
Wei Zhu (slackwei) wrote :

HARD_LIMIT_UNLIMITED = -1 didn't work, so I'm just wondering if there is another value to use to work around it.

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

The current patch upstream [0] proposes hardcoding a value that should be considered unlimited in most cases; libvirt, as of version 3.1.0, takes care of this automatically.

I think we are good, unless you have more concerns?

[0] https://review.openstack.org/#/c/472633/
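For context, some back-of-the-envelope arithmetic (mine, not from the patch): the hard_limit of 9007199254740988 KiB seen in the dumpxml above sits just under 2**53 KiB, i.e. about 2**63 bytes (8 EiB), so it is effectively unlimited on any real host.

```python
# Sanity-check the "effectively unlimited" hard_limit from the dumpxml above.
limit_kib = 9007199254740988
limit_bytes = limit_kib * 1024

assert limit_bytes < 2**63            # still fits in a signed 64-bit counter
print(round(limit_bytes / 2**60, 3))  # size in EiB -> 8.0
```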

Revision history for this message
Wei Zhu (slackwei) wrote :

Works great, thank you!

Revision history for this message
Sean Dague (sdague) wrote :

Automatically discovered version newton in description. If this is incorrect, please update the description to include 'nova version: ...'

tags: added: openstack-version.newton
Wei Zhu (slackwei)
summary: - libvirt realtime feature mlockall: Cannot allocate memory
+ nova version: newton , libvirt realtime feature mlockall: Cannot
+ allocate memory
Changed in nova:
assignee: sahid (sahid-ferdjaoui) → Stephen Finucane (stephenfinucane)
Changed in nova:
assignee: Stephen Finucane (stephenfinucane) → sahid (sahid-ferdjaoui)
Revision history for this message
Wei Zhu (slackwei) wrote :

Can this be merged into ocata?

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

I don't know; it seems to be blocked by a developer preference...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by sahid (<email address hidden>) on branch: master
Review: https://review.openstack.org/472633
Reason: Blocked for no real reason
