2019-09-24 14:16:09.682 6 WARNING nova.compute.resource_tracker [req-a02e1e22-b8bf-418e-a135-85a7374d5b3a - - - - -] Instance 17bcf040-cf68-4ac3-b365-8a77f93af85b has been moved to another host t2(t2). There are allocations remaining against the source host that might need to be removed: {u'resources': {u'VCPU': 2, u'MEMORY_MB': 2048, u'DISK_GB': 1}}.
Looks like the periodic task kicked in before the migration updated the instance's host, and so freed that instance's resources. The question is: an incoming migration should be consuming those resources, so why did they get freed?
Log analysis notes:
The XML was updated to pin both instances to CPUs 0 and 15, at very different times:
2019-09-24 14:16:14.195 6 DEBUG nova.virt.libvirt.migration [-] _update_numa_xml output xml=<domain type="kvm">
<name>instance-00000012</name>
<uuid>17bcf040-cf68-4ac3-b365-8a77f93af85b</uuid>
[...]
<vcpupin vcpu="0" cpuset="0"/>
<vcpupin vcpu="1" cpuset="15"/>
2019-09-24 14:16:42.251 6 DEBUG nova.virt.libvirt.migration [-] _update_numa_xml output xml=<domain type="kvm">
<name>instance-00000011</name>
<uuid>f1929d75-d6ac-45af-b54b-0e10be75d155</uuid>
[...]
<vcpupin vcpu="0" cpuset="0"/>
<vcpupin vcpu="1" cpuset="15"/>
For the first live migration, we create the claim and the LibvirtLiveMigrateNUMAInfo:
2019-09-24 14:16:08.747 6 DEBUG nova.compute.manager [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: 17bcf040-cf68-4ac3-b365-8a77f93af85b] Created live migration claim. _live_migration_claim /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py:6659
2019-09-24 14:16:08.760 6 DEBUG nova.virt.libvirt.driver [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Built NUMA live migration info: LibvirtLiveMigrateNUMAInfo(cell_pins={0=set([0]),1=set([1])},cpu_pins={0=set([0]),1=set([15])},emulator_pins=set([0,15]),sched_priority=<?>,sched_vcpus=<?>) _get_live_migrate_numa_info /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:8059
Same for the second live migration:
2019-09-24 14:16:35.853 6 DEBUG nova.compute.manager [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: f1929d75-d6ac-45af-b54b-0e10be75d155] Created live migration claim. _live_migration_claim /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py:6659
2019-09-24 14:16:35.861 6 DEBUG nova.virt.libvirt.driver [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Built NUMA live migration info: LibvirtLiveMigrateNUMAInfo(cell_pins={0=set([0]),1=set([1])},cpu_pins={0=set([0]),1=set([15])},emulator_pins=set([0,15]),sched_priority=<?>,sched_vcpus=<?>) _get_live_migrate_numa_info /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:8059
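
To decode those LibvirtLiveMigrateNUMAInfo reprs, here's a plain-dict sketch of the field meanings (the real thing is an oslo versioned object):

# How to read the logged object (plain-dict sketch):
numa_info = {
    'cell_pins': {0: {0}, 1: {1}},   # guest NUMA cell -> host NUMA cell(s)
    'cpu_pins': {0: {0}, 1: {15}},   # guest vCPU -> host pCPU(s)
    'emulator_pins': {0, 15},        # host CPUs for the emulator threads
}
# Note that both migrations built identical cpu_pins: vCPU 0 -> pCPU 0 and
# vCPU 1 -> pCPU 15. That's the double-booking we're chasing.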
Both claimed host CPUs 0 and 15 - but how and why? What happened between those two claims? Going back in time, we see:
The second live migration's claim selects CPUs 0 and 15:
2019-09-24 14:16:34.290 6 DEBUG nova.virt.hardware [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(0, 0)], in cell 0 _pack_instance_onto_cores /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/hardware.py:979
[...]
2019-09-24 14:16:34.295 6 DEBUG nova.virt.hardware [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(1, 15)], in cell 1 _pack_instance_onto_cores /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/hardware.py:979
[...]
2019-09-24 14:16:34.296 6 INFO nova.compute.claims [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: f1929d75-d6ac-45af-b54b-0e10be75d155] Claim successful on node t1
Before that, the first live migration's claim selects CPUs 0 and 15 as well:
2019-09-24 14:16:07.217 6 DEBUG nova.virt.hardware [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(0, 0)], in cell 0 _pack_instance_onto_cores /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/hardware.py:979
[...]
2019-09-24 14:16:07.223 6 DEBUG nova.virt.hardware [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(1, 15)], in cell 1 _pack_instance_onto_cores /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/hardware.py:979
[...]
2019-09-24 14:16:07.224 6 INFO nova.compute.claims [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: 17bcf040-cf68-4ac3-b365-8a77f93af85b] Claim successful on node t1
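
For illustration, a much-simplified sketch of the decision _pack_instance_onto_cores is logging in those "Selected cores for pinning" lines - the real algorithm also handles thread siblings and pinning policies:

# Simplified sketch of per-cell pin selection - not Nova's actual
# _pack_instance_onto_cores.
def pick_pins(guest_vcpus, free_host_cpus):
    # Pair each guest vCPU with the lowest-numbered host CPU that the
    # claim believes is free in that cell.
    return list(zip(guest_vcpus, sorted(free_host_cpus)))

# If both claims see pCPU 0 free in cell 0 and pCPU 15 free in cell 1,
# both select the same pins, matching the logs above:
pick_pins([0], {0})    # -> [(0, 0)] in cell 0
pick_pins([1], {15})   # -> [(1, 15)] in cell 1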
There's one interesting bit between those two claims:
2019-09-24 14:16:09.418 6 DEBUG oslo_concurrency.lockutils [req-a02e1e22-b8bf-418e-a135-85a7374d5b3a - - - - -] Lock "compute_resources" acquired by "nova.compute.resource_tracker._update_available_resource" :: waited 0.000s inner /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:327
[...]
2019-09-24 14:16:09.682 6 WARNING nova.compute.resource_tracker [req-a02e1e22-b8bf-418e-a135-85a7374d5b3a - - - - -] Instance 17bcf040-cf68-4ac3-b365-8a77f93af85b has been moved to another host t2(t2). There are allocations remaining against the source host that might need to be removed: {u'resources': {u'VCPU': 2, u'MEMORY_MB': 2048, u'DISK_GB': 1}}.
This confirms the suspicion above: the _update_available_resource periodic task ran before the first migration updated the instance's host, and so freed that instance's resources - which is why the second claim saw CPUs 0 and 15 as available. But the incoming migration's claim should have been consuming those CPUs, so why did they get freed?
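
A sketch of the suspected race - not the actual _update_available_resource code; free_pins is a made-up stand-in for the usage-freeing path:

# Sketch of the suspected race, not Nova's real periodic task.
def periodic_update_available_resource(tracker, instances):
    # Runs under the "compute_resources" lock, per the lockutils line above.
    for instance in instances:
        if instance.host != tracker.host:
            # The instance has "moved to another host", so its pinned CPUs
            # get freed. If the incoming migration's claim isn't being
            # counted against those CPUs at this point...
            tracker.free_pins(instance)
    # ...then pCPUs 0 and 15 look free again, and the second live
    # migration's claim can select them.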