Bug #1845146 “NUMA aware live migration failed when vCPU pin set...” : Series train : Bugs : OpenStack Compute (nova)

Revision history for this message

ya.wang (ya.wang) wrote on 2019-09-24:

#1

log.tgz Edit (29.1 MiB, application/x-tar)

Matt Riedemann (mriedem) on 2019-09-24

tags:

added: numa

Revision history for this message

Matt Riedemann (mriedem) wrote on 2019-09-24:

#2

This may be a duplicate of bug 1829349.

Revision history for this message

Artom Lifshitz (notartom) wrote on 2019-09-24:

#3

Download full text (5.4 KiB)

Log analysis notes:

The XML was updated to pin both instances to CPUs 0 and 15, at very different times:

2019-09-24 14:16:14.195 6 DEBUG nova.virt.libvirt.migration [-] _update_numa_xml output xml=<domain type="kvm">
  <name>instance-00000012</name>
  <uuid>17bcf040-cf68-4ac3-b365-8a77f93af85b</uuid>
[...]
    <vcpupin vcpu="0" cpuset="0"/>
    <vcpupin vcpu="1" cpuset="15"/>

2019-09-24 14:16:42.251 6 DEBUG nova.virt.libvirt.migration [-] _update_numa_xml output xml=<domain type="kvm">
  <name>instance-00000011</name>
  <uuid>f1929d75-d6ac-45af-b54b-0e10be75d155</uuid>
[...]
    <vcpupin vcpu="0" cpuset="0"/>
    <vcpupin vcpu="1" cpuset="15"/>

For the first live migration we create the claims and the NUMAMigrateInfo:

2019-09-24 14:16:08.747 6 DEBUG nova.compute.manager [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: 17bcf040-cf68-4ac3-b365-8a77f93af85b] Created live migration claim. _live_migration_claim /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py:6659

2019-09-24 14:16:08.760 6 DEBUG nova.virt.libvirt.driver [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Built NUMA live migration info: LibvirtLiveMigrateNUMAInfo(cell_pins={0=set([0]),1=set([1])},cpu_pins={0=set([0]),1=set([15])},emulator_pins=set([0,15]),sched_priority=<?>,sched_vcpus=<?>) _get_live_migrate_numa_info /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:8059

Same for the second live migration:

2019-09-24 14:16:35.853 6 DEBUG nova.compute.manager [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: f1929d75-d6ac-45af-b54b-0e10be75d155] Created live migration claim. _live_migration_claim /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py:6659

2019-09-24 14:16:35.861 6 DEBUG nova.virt.libvirt.driver [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Built NUMA live migration info: LibvirtLiveMigrateNUMAInfo(cell_pins={0=set([0]),1=set([1])},cpu_pins={0=set([0]),1=set([15])},emulator_pins=set([0,15]),sched_priority=<?>,sched_vcpus=<?>) _get_live_migrate_numa_info /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:8059

Both claimed host CPUs 0 and 15 - but how/why? What happened between those 2 claims? Going back in time, we see:

The second live migration's claim claims CPUs 0 and 15:

2019-09-24 14:16:34.290 6 DEBUG nova.virt.hardware [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(0, 0)], in cell 0 _pack_instance_onto_cores /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/hardware.py:979

[...]

2019-09-24 14:16:34.295 6 DEBUG nova.virt.hardware [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(1,...

Log analysis notes:

The XML was updated to pin both instances to CPUs 0 and 15, at very different times:

2019-09-24 14:16:14.195 6 DEBUG nova.virt.libvirt.migration [-] _update_numa_xml output xml=<domain type="kvm">
  <name>instance-00000012</name>
  <uuid>17bcf040-cf68-4ac3-b365-8a77f93af85b</uuid>
[...]
    <vcpupin vcpu="0" cpuset="0"/>
    <vcpupin vcpu="1" cpuset="15"/>

2019-09-24 14:16:42.251 6 DEBUG nova.virt.libvirt.migration [-] _update_numa_xml output xml=<domain type="kvm">
  <name>instance-00000011</name>
  <uuid>f1929d75-d6ac-45af-b54b-0e10be75d155</uuid>
[...]
    <vcpupin vcpu="0" cpuset="0"/>
    <vcpupin vcpu="1" cpuset="15"/>

For the first live migration we create the claims and the NUMAMigrateInfo:

2019-09-24 14:16:08.747 6 DEBUG nova.compute.manager [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: 17bcf040-cf68-4ac3-b365-8a77f93af85b] Created live migration claim. _live_migration_claim /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py:6659

2019-09-24 14:16:08.760 6 DEBUG nova.virt.libvirt.driver [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Built NUMA live migration info: LibvirtLiveMigrateNUMAInfo(cell_pins={0=set([0]),1=set([1])},cpu_pins={0=set([0]),1=set([15])},emulator_pins=set([0,15]),sched_priority=<?>,sched_vcpus=<?>) _get_live_migrate_numa_info /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:8059

Same for the second live migration:

2019-09-24 14:16:35.853 6 DEBUG nova.compute.manager [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: f1929d75-d6ac-45af-b54b-0e10be75d155] Created live migration claim. _live_migration_claim /var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/manager.py:6659

2019-09-24 14:16:35.861 6 DEBUG nova.virt.libvirt.driver [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Built NUMA live migration info: LibvirtLiveMigrateNUMAInfo(cell_pins={0=set([0]),1=set([1])},cpu_pins={0=set([0]),1=set([15])},emulator_pins=set([0,15]),sched_priority=<?>,sched_vcpus=<?>) _get_live_migrate_numa_info /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:8059

Both claimed host CPUs 0 and 15 - but how/why? What happened between those 2 claims? Going back in time, we see:

The second live migration's claim claims CPUs 0 and 15:

2019-09-24 14:16:34.290 6 DEBUG nova.virt.hardware [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(0, 0)], in cell 0 _pack_instance_onto_cores /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/hardware.py:979

[...]

2019-09-24 14:16:34.295 6 DEBUG nova.virt.hardware [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(1, 15)], in cell 1 _pack_instance_onto_cores /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/hardware.py:979

[...]

2019-09-24 14:16:34.296 6 INFO nova.compute.claims [req-5aeb2f2d-c69f-473a-8da4-59664d87a214 f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: f1929d75-d6ac-45af-b54b-0e10be75d155] Claim successful on node t1

Before that, the first live migration's claim claims CPUs 0 and 15 as well:

2019-09-24 14:16:07.217 6 DEBUG nova.virt.hardware [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(0, 0)], in cell 0 _pack_instance_onto_cores /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/hardware.py:979

[...]

2019-09-24 14:16:07.223 6 DEBUG nova.virt.hardware [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] Selected cores for pinning: [(1, 15)], in cell 1 _pack_instance_onto_cores /var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/hardware.py:979

[...]

2019-09-24 14:16:07.224 6 INFO nova.compute.claims [req-9c210c32-614c-4040-abc8-e8a4138d885b f28f4213c3a14121b6f7fa15140e7aef 25696a57055b4d6bb5428f45f0473a8c - default default] [instance: 17bcf040-cf68-4ac3-b365-8a77f93af85b] Claim successful on node t1

There's one interesting bit between those two claims:

2019-09-24 14:16:09.418 6 DEBUG oslo_concurrency.lockutils [req-a02e1e22-b8bf-418e-a135-85a7374d5b3a - - - - -] Lock "compute_resources" acquired by "nova.compute.resource_tracker._update_available_resource" :: waited 0.000s inner /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:327

[...]

2019-09-24 14:16:09.682 6 WARNING nova.compute.resource_tracker [req-a02e1e22-b8bf-418e-a135-85a7374d5b3a - - - - -] Instance 17bcf040-cf68-4ac3-b365-8a77f93af85b has been moved to another host t2(t2). There are allocations remaining against the source host that might need to be removed: {u'resources': {u'VCPU': 2, u'MEMORY_MB': 2048, u'DISK_GB': 1}}.

Looks like the periodic task kicked in before the migration updated the instance's host, and so freed that instance's resources. The question is - and incoming migration should be consuming those, why did they get freed?

Revision history for this message

Artom Lifshitz (notartom) wrote on 2019-09-24:

#4

Figured it out:

When the update resources periodic task runs, it pulls migrations from the database using [1], which filters out migrations in 'accepted' status. Live migrations are created with an 'accepted' status by the conductor [2], and are only set to 'preparing' by the compute manager here [3], which happens after all the new NUMA-aware live migrations claims stuff. So there's a time window after the claim but before the migration has been set to 'preparing' during which, if the periodic resource update task kicks in, it will miss the migration, see that the instance is still on the source host according to the database, and free its resources from the destination.

[1] https://github.com/openstack/nova/blob/0ce66605e16aca85df97acdd8c459802fcdb9aa0/nova/db/sqlalchemy/api.py#L4422
[2] https://github.com/openstack/nova/blob/0ce66605e16aca85df97acdd8c459802fcdb9aa0/nova/conductor/manager.py#L422
[3] https://github.com/openstack/nova/blob/0ce66605e16aca85df97acdd8c459802fcdb9aa0/nova/compute/manager.py#L7020

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-09-24: Fix proposed to nova (master)

#5

Fix proposed to branch: master
Review: https://review.opendev.org/684409

Changed in nova:
assignee:	nobody → Artom Lifshitz (notartom)
status:	New → In Progress

Revision history for this message

Artom Lifshitz (notartom) wrote on 2019-09-24:

#6

Ya, could you retry your tests with [1] applied, to confirm whether it fixes the issue?

[1] https://review.opendev.org/684409

Matt Riedemann (mriedem) on 2019-09-24

tags:

added: train-rc-potential

Matt Riedemann (mriedem) on 2019-09-26

Changed in nova:
importance:	Undecided → High

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-09-27: Fix proposed to nova (stable/train)

#7

Fix proposed to branch: stable/train
Review: https://review.opendev.org/685387

Matt Riedemann (mriedem) on 2019-09-27

no longer affects:

nova/train

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-09-28: Fix merged to nova (master)

#8

Reviewed: https://review.opendev.org/684409
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6ec686c26b2c8b18bcff522633bfe9715e0feec3
Submitter: Zuul
Branch: master

commit 6ec686c26b2c8b18bcff522633bfe9715e0feec3
Author: Artom Lifshitz <email address hidden>
Date: Tue Sep 24 13:22:23 2019 -0400

Stop filtering out 'accepted' for in-progress migrations

    Live migrations are created with an 'accepted' status. Resource claims
    on the destination are done with the migration in 'accepted' status.
    The status is set to 'preparing' a bit later, right before running
    pre_live_migration(). Migrations with status 'accepted' are filtered
    out by the database layer when getting in-progress migrations. Thus,
    there's a time window after resource claims but before 'preparing'
    during which resources have been claimed but the migration is not
    considered in-progress by the database layer. During that window, the
    instance's host is the source - that's only updated once the live
    migration finishes. If the update available resources periodic task
    runs during that window, it'll free the instance's resource from the
    destination because neither the instance nor any of its in-progress
    migrations are associated with the destination. This means that other
    incoming instances are able to consume resources that should not be
    available. This patch stops filtering out the 'accepted' status in the
    database layer when retrieving in-progress migrations.

Change-Id: I4c56925ed35bc3275ca1ac6c30d7fd641ad84260
Closes-bug: 1845146

Changed in nova:
status:	In Progress → Fix Released

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-09-30: Fix merged to nova (stable/train)

#9

Reviewed: https://review.opendev.org/685387
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=45c2ba37bc21370d9814b81cd892abc3cb8a9f04
Submitter: Zuul
Branch: stable/train

commit 45c2ba37bc21370d9814b81cd892abc3cb8a9f04
Author: Artom Lifshitz <email address hidden>
Date: Tue Sep 24 13:22:23 2019 -0400

Stop filtering out 'accepted' for in-progress migrations

    Live migrations are created with an 'accepted' status. Resource claims
    on the destination are done with the migration in 'accepted' status.
    The status is set to 'preparing' a bit later, right before running
    pre_live_migration(). Migrations with status 'accepted' are filtered
    out by the database layer when getting in-progress migrations. Thus,
    there's a time window after resource claims but before 'preparing'
    during which resources have been claimed but the migration is not
    considered in-progress by the database layer. During that window, the
    instance's host is the source - that's only updated once the live
    migration finishes. If the update available resources periodic task
    runs during that window, it'll free the instance's resource from the
    destination because neither the instance nor any of its in-progress
    migrations are associated with the destination. This means that other
    incoming instances are able to consume resources that should not be
    available. This patch stops filtering out the 'accepted' status in the
    database layer when retrieving in-progress migrations.

    Change-Id: I4c56925ed35bc3275ca1ac6c30d7fd641ad84260
    Closes-bug: 1845146
    (cherry picked from commit 6ec686c26b2c8b18bcff522633bfe9715e0feec3)

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-10-08: Related fix proposed to nova (master)

#10

Related fix proposed to branch: master
Review: https://review.opendev.org/687404

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-10-09: Fix included in openstack/nova 20.0.0.0rc2

#11

This issue was fixed in the openstack/nova 20.0.0.0rc2 release candidate.

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2020-05-01: Related fix merged to nova (master)

#12

Reviewed: https://review.opendev.org/687404
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=32713a4fe885ee55ef0fefc8ce6c78877f2f03e7
Submitter: Zuul
Branch: master

commit 32713a4fe885ee55ef0fefc8ce6c78877f2f03e7
Author: Artom Lifshitz <email address hidden>
Date: Tue Oct 8 15:23:47 2019 -0400

NUMA LM: Add func test for bug 1845146

    Bug 1845146 was caused by the update available resources periodic task
    running during a small window in which the migration was in 'accepted'
    but resource claims had been done. 'accepted' migrations were not
    considered in progress before the fix for 1845146 merged as commit
    6ec686c26b, which caused the periodic task to incorrectly free the
    migration's resources from the destination. This patch adds a test
    that triggers this race by wrapping around the compute manager's
    live_migration() (which sets the 'queued' migration status - this was
    actually wrong in 6ec686c26b, as it talks about 'preparing') and
    running the update available resources periodic task while the
    migration is still in 'accepted'.

Related bug: 1845146

Change-Id: I78e79112a9c803fb45d828cfb4641456da66364a

Affects		Status	Importance	Assigned to	Milestone
	OpenStack Compute (nova)	Fix Released	High	Artom Lifshitz
	Train	Fix Committed	High	Dan Smith

OpenStack Compute (nova)

NUMA aware live migration failed when vCPU pin set

Bug Description

Other bug subscribers

Bug attachments

Remote bug watches