Live migration of realtime instances is broken

Bug #1889257 reported by Stephen Finucane
This bug affects 1 person
Affects                    Status        Importance  Assigned to
OpenStack Compute (nova)   Fix Released  Medium      Stephen Finucane
  Train                    Fix Released  Undecided   Lee Yarwood
  Ussuri                   Fix Released  Undecided   Unassigned

Bug Description

Attempting to live migrate an instance with realtime enabled fails on master (commit d4c857dfcb1). This appears to be a bug in the live migration of pinned instances feature introduced in Train.

# Steps to reproduce

Create a server using realtime attributes and then attempt to live migrate it. For example:

  $ openstack flavor create --ram 1024 --disk 0 --vcpu 4 \
    --property 'hw:cpu_policy=dedicated' \
    --property 'hw:cpu_realtime=yes' \
    --property 'hw:cpu_realtime_mask=^0-1' \
    realtime

  $ openstack server create --os-compute-api-version=2.latest \
    --flavor realtime --image cirros-0.5.1-x86_64-disk --nic none \
    --boot-from-volume 1 --wait \
    test.realtime

  $ openstack server migrate --live-migration test.realtime
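
For reference, 'hw:cpu_realtime_mask=^0-1' excludes vCPUs 0-1 from realtime, so with four vCPUs nova treats vCPUs 2-3 as realtime cores and emits a single '<vcpusched vcpus="2-3" scheduler="fifo" priority="1"/>' element in the guest XML. A rough sketch of that mask-to-vCPU translation (illustrative only; the helper below is made up, and nova's actual parsing in nova.virt.hardware handles more cases):

  # Illustrative only: how a '^'-prefixed exclusion mask roughly maps to the
  # set of realtime vCPUs.
  def realtime_vcpus(num_vcpus, mask):
      excluded = set()
      if mask.startswith('^'):
          for part in mask[1:].split(','):
              if '-' in part:
                  start, end = part.split('-')
                  excluded.update(range(int(start), int(end) + 1))
              else:
                  excluded.add(int(part))
      return sorted(set(range(num_vcpus)) - excluded)

  print(realtime_vcpus(4, '^0-1'))
  # [2, 3] -> rendered as <vcpusched vcpus="2-3" scheduler="fifo" priority="1"/>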

# Expected result

Instance should be live migrated.

# Actual result

The live migration never happens. Looking at the logs, we see the following error:

  Traceback (most recent call last):
    File "/usr/local/lib/python3.6/dist-packages/eventlet/hubs/hub.py", line 461, in fire_timers
      timer()
    File "/usr/local/lib/python3.6/dist-packages/eventlet/hubs/timer.py", line 59, in __call__
      cb(*args, **kw)
    File "/usr/local/lib/python3.6/dist-packages/eventlet/event.py", line 175, in _do_send
      waiter.switch(result)
    File "/usr/local/lib/python3.6/dist-packages/eventlet/greenthread.py", line 221, in main
      result = function(*args, **kwargs)
    File "/opt/stack/nova/nova/utils.py", line 670, in context_wrapper
      return func(*args, **kwargs)
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 8966, in _live_migration_operation
      # is still ongoing, or failed
    File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
      raise value
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 8959, in _live_migration_operation
      # 2. src==running, dst==paused
    File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 658, in migrate
      destination, params=params, flags=flags)
    File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 190, in doit
      result = proxy_call(self._autowrap, f, *args, **kwargs)
    File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 148, in proxy_call
      rv = execute(f, *args, **kwargs)
    File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 129, in execute
      six.reraise(c, e, tb)
    File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
      raise value
    File "/usr/local/lib/python3.6/dist-packages/eventlet/tpool.py", line 83, in tworker
      rv = meth(*args, **kwargs)
    File "/usr/local/lib/python3.6/dist-packages/libvirt.py", line 1745, in migrateToURI3
      if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
  libvirt.libvirtError: vcpussched attributes 'vcpus' must not overlap

Looking further, we see there are issues with the XML we generate for the destination. Compare what we have on the source before the XML is updated for the destination:

  DEBUG nova.virt.libvirt.migration [-] _update_numa_xml input xml=<domain type="kvm">
    ...
    <cputune>
      <shares>4096</shares>
      <vcpupin vcpu="0" cpuset="0"/>
      <vcpupin vcpu="1" cpuset="1"/>
      <vcpupin vcpu="2" cpuset="4"/>
      <vcpupin vcpu="3" cpuset="5"/>
      <emulatorpin cpuset="0-1"/>
      <vcpusched vcpus="2" scheduler="fifo" priority="1"/>
      <vcpusched vcpus="3" scheduler="fifo" priority="1"/>
    </cputune>
    ...
  </domain>
   {{(pid=12600) _update_numa_xml /opt/stack/nova/nova/virt/libvirt/migration.py:97}}

To what we have after the update:

  DEBUG nova.virt.libvirt.migration [-] _update_numa_xml output xml=<domain type="kvm">
    ...
    <cputune>
      <shares>4096</shares>
      <vcpupin vcpu="0" cpuset="0"/>
      <vcpupin vcpu="1" cpuset="1"/>
      <vcpupin vcpu="2" cpuset="4"/>
      <vcpupin vcpu="3" cpuset="5"/>
      <emulatorpin cpuset="0-1"/>
      <vcpusched vcpus="2-3" scheduler="fifo" priority="1"/>
      <vcpusched vcpus="3" scheduler="fifo" priority="1"/>
    </cputune>
    ...
  </domain>
   {{(pid=12600) _update_numa_xml /opt/stack/nova/nova/virt/libvirt/migration.py:131}}

The issue is the 'vcpusched' elements. We're assuming there is only one of these elements when updating the XML for the destination [1]. We have to figure out why there are multiple elements and how best to handle this (likely by deleting and recreating everything).

I suspect the reason we didn't spot this is that libvirt is rewriting the XML on us. This is what nova provides to libvirt on boot:

  DEBUG nova.virt.libvirt.driver [...] [instance: ...] End _get_guest_xml xml=<domain type="kvm">
    ...
    <cputune>
      <shares>4096</shares>
      <emulatorpin cpuset="0-1"/>
      <vcpupin vcpu="0" cpuset="0"/>
      <vcpupin vcpu="1" cpuset="1"/>
      <vcpupin vcpu="2" cpuset="4"/>
      <vcpupin vcpu="3" cpuset="5"/>
      <vcpusched vcpus="2-3" scheduler="fifo" priority="1"/>
    </cputune>
    ...
  </domain>
   {{(pid=12600) _get_guest_xml /opt/stack/nova/nova/virt/libvirt/driver.py:6331}}

but that has changed by the time we get to recalculating things.

The solution is probably to remove all 'vcpusched' elements and recreate them, rather than trying to update them in place.

[1] https://github.com/openstack/nova/blob/21.0.0/nova/virt/libvirt/migration.py#L152-L155
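
Something along these lines should work for the rebuild (a sketch only, not the actual fix; 'cputune' is the parsed <cputune> element and 'realtime_vcpus'/'priority' stand in for the recalculated values):

  # Sketch of "delete and recreate" for <vcpusched>; not the merged patch.
  from lxml import etree

  def rebuild_vcpusched(cputune, realtime_vcpus, priority=1):
      # Drop every existing <vcpusched> element, however libvirt split them.
      for elem in cputune.findall('vcpusched'):
          cputune.remove(elem)
      # Recreate a single element from the recalculated realtime vCPU set,
      # e.g. [2, 3] -> vcpus="2,3" (libvirt also accepts a range like "2-3").
      vcpusched = etree.SubElement(cputune, 'vcpusched')
      vcpusched.set('vcpus', ','.join(str(c) for c in realtime_vcpus))
      vcpusched.set('scheduler', 'fifo')
      vcpusched.set('priority', str(priority))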

Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Stephen Finucane (stephenfinucane)
tags: added: numa
tags: added: libvirt live-migration
tags: added: realtime
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/743568

Changed in nova:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/743588

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/743804

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/743805

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/743568
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b6aef1ec4f9848f85e1f367e560c2bdb703fa110
Submitter: Zuul
Branch: master

commit b6aef1ec4f9848f85e1f367e560c2bdb703fa110
Author: Stephen Finucane <email address hidden>
Date: Tue Jul 28 16:22:39 2020 +0100

    Handle multiple 'vcpusched' elements during live migrate

    When live migrating a pinned instance, we recalculate pinning
    information for the destination host and then update the instance's XML
    before spawning the instance there. As part of the pinning information
    recalculation, we must also recalculate information for realtime cores,
    which are configured using the '<vcpusched>' element. The
    'nova.virt.libvirt.migration._update_numa_xml' function, which handles
    this updating, was assuming there would only be one of these elements.
    This is a reasonably sane assumption since this is all we create in the
    'nova.virt.libvirt.LibvirtDriver._get_guest_numa_config' function used
    to generate the initial instance XML. However, a look at logs show that
    at least some (all?) versions of libvirt actually rewrite the XML we're
    providing them. Compare what is returned from '_get_guest_xml':

      DEBUG nova.virt.libvirt.driver [...] [instance: ...] End _get_guest_xml xml=<domain type="kvm">
        ...
        <cputune>
          <shares>4096</shares>
          ...
          <vcpusched vcpus="2-3" scheduler="fifo" priority="1"/>
        </cputune>
        ...
      </domain>
       {{(pid=12600) _get_guest_xml /opt/stack/nova/nova/virt/libvirt/driver.py:6331}}

    to what is seen when we enter '_update_numa_xml' (or via 'virsh dumpxml'
    at any point after instance creation):

      DEBUG nova.virt.libvirt.migration [-] _update_numa_xml input xml=<domain type="kvm">
        ...
        <cputune>
          <shares>4096</shares>
          ...
          <vcpusched vcpus="2" scheduler="fifo" priority="1"/>
          <vcpusched vcpus="3" scheduler="fifo" priority="1"/>
        </cputune
        ...
      </domain>
       {{(pid=12600) _update_numa_xml /opt/stack/nova/nova/virt/libvirt/migration.py:97}

    The solution is simple: rather than trying to modify the existing XML,
    simply scrap it and rebuild the elements from scratch. We should
    probably do this for all elements, but that can/should be tackled
    separately.

    Change-Id: Ic01603a91f6099f1068af0e955f3e1056021d673
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1889257

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ussuri)

Reviewed: https://review.opendev.org/743804
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ac05bc3b38684e47e35e06a1514866773d8f3609
Submitter: Zuul
Branch: stable/ussuri

commit ac05bc3b38684e47e35e06a1514866773d8f3609
Author: Stephen Finucane <email address hidden>
Date: Tue Jul 28 16:22:39 2020 +0100

    Handle multiple 'vcpusched' elements during live migrate

    When live migrating a pinned instance, we recalculate pinning
    information for the destination host and then update the instance's XML
    before spawning the instance there. As part of the pinning information
    recalculation, we must also recalculate information for realtime cores,
    which are configured using the '<vcpusched>' element. The
    'nova.virt.libvirt.migration._update_numa_xml' function, which handles
    this updating, was assuming there would only be one of these elements.
    This is a reasonably sane assumption since this is all we create in the
    'nova.virt.libvirt.LibvirtDriver._get_guest_numa_config' function used
    to generate the initial instance XML. However, a look at logs show that
    at least some (all?) versions of libvirt actually rewrite the XML we're
    providing them. Compare what is returned from '_get_guest_xml':

      DEBUG nova.virt.libvirt.driver [...] [instance: ...] End _get_guest_xml xml=<domain type="kvm">
        ...
        <cputune>
          <shares>4096</shares>
          ...
          <vcpusched vcpus="2-3" scheduler="fifo" priority="1"/>
        </cputune>
        ...
      </domain>
       {{(pid=12600) _get_guest_xml /opt/stack/nova/nova/virt/libvirt/driver.py:6331}}

    to what is seen when we enter '_update_numa_xml' (or via 'virsh dumpxml'
    at any point after instance creation):

      DEBUG nova.virt.libvirt.migration [-] _update_numa_xml input xml=<domain type="kvm">
        ...
        <cputune>
          <shares>4096</shares>
          ...
          <vcpusched vcpus="2" scheduler="fifo" priority="1"/>
          <vcpusched vcpus="3" scheduler="fifo" priority="1"/>
        </cputune
        ...
      </domain>
       {{(pid=12600) _update_numa_xml /opt/stack/nova/nova/virt/libvirt/migration.py:97}

    The solution is simple: rather than trying to modify the existing XML,
    simply scrap it and rebuild the elements from scratch. We should
    probably do this for all elements, but that can/should be tackled
    separately.

    Change-Id: Ic01603a91f6099f1068af0e955f3e1056021d673
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1889257
    (cherry picked from commit b6aef1ec4f9848f85e1f367e560c2bdb703fa110)

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/train)

Reviewed: https://review.opendev.org/743805
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=95bce095ed8489968e8dec2f0b6977f79bb8f98a
Submitter: Zuul
Branch: stable/train

commit 95bce095ed8489968e8dec2f0b6977f79bb8f98a
Author: Stephen Finucane <email address hidden>
Date: Tue Jul 28 16:22:39 2020 +0100

    Handle multiple 'vcpusched' elements during live migrate

    When live migrating a pinned instance, we recalculate pinning
    information for the destination host and then update the instance's XML
    before spawning the instance there. As part of the pinning information
    recalculation, we must also recalculate information for realtime cores,
    which are configured using the '<vcpusched>' element. The
    'nova.virt.libvirt.migration._update_numa_xml' function, which handles
    this updating, was assuming there would only be one of these elements.
    This is a reasonably sane assumption since this is all we create in the
    'nova.virt.libvirt.LibvirtDriver._get_guest_numa_config' function used
    to generate the initial instance XML. However, a look at logs show that
    at least some (all?) versions of libvirt actually rewrite the XML we're
    providing them. Compare what is returned from '_get_guest_xml':

      DEBUG nova.virt.libvirt.driver [...] [instance: ...] End _get_guest_xml xml=<domain type="kvm">
        ...
        <cputune>
          <shares>4096</shares>
          ...
          <vcpusched vcpus="2-3" scheduler="fifo" priority="1"/>
        </cputune>
        ...
      </domain>
       {{(pid=12600) _get_guest_xml /opt/stack/nova/nova/virt/libvirt/driver.py:6331}}

    to what is seen when we enter '_update_numa_xml' (or via 'virsh dumpxml'
    at any point after instance creation):

      DEBUG nova.virt.libvirt.migration [-] _update_numa_xml input xml=<domain type="kvm">
        ...
        <cputune>
          <shares>4096</shares>
          ...
          <vcpusched vcpus="2" scheduler="fifo" priority="1"/>
          <vcpusched vcpus="3" scheduler="fifo" priority="1"/>
        </cputune
        ...
      </domain>
       {{(pid=12600) _update_numa_xml /opt/stack/nova/nova/virt/libvirt/migration.py:97}

    The solution is simple: rather than trying to modify the existing XML,
    simply scrap it and rebuild the elements from scratch. We should
    probably do this for all elements, but that can/should be tackled
    separately.

    Change-Id: Ic01603a91f6099f1068af0e955f3e1056021d673
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1889257
    (cherry picked from commit b6aef1ec4f9848f85e1f367e560c2bdb703fa110)
    (cherry picked from commit ac05bc3b38684e47e35e06a1514866773d8f3609)

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Stephen Finucane (<email address hidden>) on branch: master
Review: https://review.opendev.org/743588
