Description
===========
Since Ocata, the usage information in the numa_topology column of the compute_nodes table in the nova DB disappears roughly 2 minutes after a VM is spawned.
Steps to reproduce
==================
* Enable NUMATopologyFilter to use vCPU pinning
* Launch a VM with a flavor that carries NUMA constraints, e.g. hw:cpu_policy=dedicated or hw:mem_page_size=large
* Check numa_topology of compute_nodes in the nova DB to confirm that NUMA usage has been applied
* Wait about 2 minutes
* Check numa_topology of compute_nodes in the nova DB again to see whether the NUMA usage has been reset
Expected result
===============
There should be no change in the DB.
Actual result
=============
numa_topology of compute_nodes has been reset (the usage information is gone).
Environment
===========
1. RDO Ocata
2. CentOS
Logs & Configs
==============
The NUMA usage information is present right after the VM is spawned (note pinned_cpus and memory_usage):
$ mysql -s nova -e "select numa_topology from compute_nodes where host='ocata1';"
numa_topology
{"nova_object.version": "1.2", "nova_object.changes": ["cells"], "nova_object.name": "NUMATopology", "nova_object.data": {"cells": [{"nova_object.version": "1.2", "nova_object.changes": ["cpu_usage", "memory_usage", "cpuset", "pinned_cpus", "siblings", "memory", "mempages", "id"], "nova_object.name": "NUMACell", "nova_object.data": {"cpu_usage": 4, "memory_usage": 1024, "cpuset": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], "pinned_cpus": [16, 17, 10, 11], "siblings": [[16, 17], [10, 11], [4, 5], [8, 9], [12, 13], [2, 3], [14, 15], [6, 7], [18, 19]], "memory": 20479, "mempages": [{"nova_object.version": "1.1", "nova_object.changes": ["used", "total", "reserved", "size_kb"], "nova_object.name": "NUMAPagesTopology", "nova_object.data": {"used": 0, "total": 4456317, "reserved": 0, "size_kb": 4}, "nova_object.namespace": "nova"}, {"nova_object.version": "1.1", "nova_object.changes": ["total", "used", "reserved", "size_kb"], "nova_object.name": "NUMAPagesTopology", "nova_object.data": {"used": 1, "total": 3, "reserved": 0, "size_kb": 1048576}, "nova_object.namespace": "nova"}], "id": 0}, "nova_object.namespace": "nova"}, {"nova_object.version": "1.2", "nova_object.changes": ["cpu_usage", "memory_usage", "cpuset", "pinned_cpus", "siblings", "memory", "mempages", "id"], "nova_object.name": "NUMACell", "nova_object.data": {"cpu_usage": 0, "memory_usage": 0, "cpuset": [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39], "pinned_cpus": [], "siblings": [[32, 33], [36, 37], [22, 23], [24, 25], [28, 29], [30, 31], [38, 39], [26, 27], [34, 35]], "memory": 20480, "mempages": [{"nova_object.version": "1.1", "nova_object.changes": ["used", "total", "reserved", "size_kb"], "nova_object.name": "NUMAPagesTopology", "nova_object.data": {"used": 0, "total": 4718592, "reserved": 0, "size_kb": 4}, "nova_object.namespace": "nova"}, {"nova_object.version": "1.1", "nova_object.changes": ["used", "total", "reserved", "size_kb"], "nova_object.name": 
"NUMAPagesTopology", "nova_object.data": {"used": 0, "total": 2, "reserved": 0, "size_kb": 1048576}, "nova_object.namespace": "nova"}], "id": 1}, "nova_object.namespace": "nova"}]}, "nova_object.namespace": "nova"}
But after approximately 2 minutes, the usage information in numa_topology is gone:
# mysql -s nova -e "select numa_topology from compute_nodes where host='ocata1';"
numa_topology
{"nova_object.version": "1.2", "nova_object.changes": ["cells"], "nova_object.name": "NUMATopology", "nova_object.data": {"cells": [{"nova_object.version": "1.2", "nova_object.changes": ["cpu_usage", "memory_usage", "cpuset", "mempages", "pinned_cpus", "memory", "siblings", "id"], "nova_object.name": "NUMACell", "nova_object.data": {"cpu_usage": 0, "memory_usage": 0, "cpuset": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], "pinned_cpus": [], "siblings": [[16, 17], [10, 11], [4, 5], [8, 9], [12, 13], [2, 3], [14, 15], [6, 7], [18, 19]], "memory": 20479, "mempages": [{"nova_object.version": "1.1", "nova_object.changes": ["total", "used", "reserved", "size_kb"], "nova_object.name": "NUMAPagesTopology", "nova_object.data": {"used": 0, "total": 4456317, "reserved": 0, "size_kb": 4}, "nova_object.namespace": "nova"}, {"nova_object.version": "1.1", "nova_object.changes": ["total", "used", "reserved", "size_kb"], "nova_object.name": "NUMAPagesTopology", "nova_object.data": {"used": 0, "total": 3, "reserved": 0, "size_kb": 1048576}, "nova_object.namespace": "nova"}], "id": 0}, "nova_object.namespace": "nova"}, {"nova_object.version": "1.2", "nova_object.changes": ["cpu_usage", "memory_usage", "cpuset", "mempages", "pinned_cpus", "memory", "siblings", "id"], "nova_object.name": "NUMACell", "nova_object.data": {"cpu_usage": 0, "memory_usage": 0, "cpuset": [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39], "pinned_cpus": [], "siblings": [[32, 33], [36, 37], [22, 23], [24, 25], [28, 29], [30, 31], [38, 39], [26, 27], [34, 35]], "memory": 20480, "mempages": [{"nova_object.version": "1.1", "nova_object.changes": ["total", "used", "reserved", "size_kb"], "nova_object.name": "NUMAPagesTopology", "nova_object.data": {"used": 0, "total": 4718592, "reserved": 0, "size_kb": 4}, "nova_object.namespace": "nova"}, {"nova_object.version": "1.1", "nova_object.changes": ["total", "used", "reserved", "size_kb"], "nova_object.name": 
"NUMAPagesTopology", "nova_object.data": {"used": 0, "total": 2, "reserved": 0, "size_kb": 1048576}, "nova_object.namespace": "nova"}], "id": 1}, "nova_object.namespace": "nova"}]}, "nova_object.namespace": "nova"}
In the first round of the resource tracker's periodic task this is fine, since the usage information is recovered and updated by _update_usage_from_instances() followed by _update(). But in the second round the information is clobbered by _copy_resources(), which overwrites the ComputeNode with resources that carry empty NUMA usage information (https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L561).
That is why the issue appears about 2 minutes after the VM is spawned, i.e. on the second run of the periodic task.
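The clobbering can be sketched with a minimal, self-contained model (hypothetical names and simplified data, not the real Nova objects): the _copy_resources() analogue replaces the tracked node's numa_topology with the hypervisor-reported view, whose usage counters are all zero, so usage survives only if it is re-applied from the tracked instances before the row is written back.

```python
def host_reported_topology():
    # What the virt driver reports: a topology with no usage.
    return {"pinned_cpus": set(), "memory_usage": 0}

def copy_resources(cn):
    # Analogue of _copy_resources(): overwrite with the usage-free view.
    cn["numa_topology"] = host_reported_topology()

def update_usage_from_instances(cn, instances):
    # Analogue of _update_usage_from_instances(): re-apply usage.
    for inst in instances:
        cn["numa_topology"]["pinned_cpus"] |= inst["pinned_cpus"]
        cn["numa_topology"]["memory_usage"] += inst["memory_mb"]

cn = {"numa_topology": host_reported_topology()}
instances = [{"pinned_cpus": {10, 11, 16, 17}, "memory_mb": 1024}]

# Round 1: usage is re-applied after the copy, so the stored row looks correct.
copy_resources(cn)
update_usage_from_instances(cn, instances)
assert cn["numa_topology"]["pinned_cpus"] == {10, 11, 16, 17}

# Round 2: if the row is persisted right after copy_resources(), before the
# usage is re-applied, the stored usage is empty -- the symptom shown above.
copy_resources(cn)
print(cn["numa_topology"])  # {'pinned_cpus': set(), 'memory_usage': 0}
```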