libvirt: Abort live-migration job when monitoring fails
During the live migration process, a _live_migration_monitor thread
checks the progress of the migration on the source host. If for any
reason we hit an infrastructure issue involving a DB/RPC/libvirt-timeout
failure, an exception is raised to the nova-compute service and the
instance/migration is set to ERROR state.
The issue is that we may leave the live-migration job running outside
of nova's control. At the end of the job, the guest is resumed on the
target host while nova still reports it on the source host. This may
lead to a split-brain situation if the instance is restarted.
This change proposes to abort the live-migration job if an issue occurs
during _live_migration_monitor.
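The abort-on-failure idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the actual change: FakeGuest, monitor_with_abort and failing_monitor are hypothetical names, and the guest object is assumed to expose an abort_job() method (with libvirt-python, the underlying call would be virDomain.abortJob()).

```python
import logging

LOG = logging.getLogger(__name__)


class FakeGuest:
    """Hypothetical stand-in for a libvirt guest handle."""

    def __init__(self):
        self.job_aborted = False

    def abort_job(self):
        # Assumption: a real guest would ask libvirt to cancel the
        # running migration job here (virDomain.abortJob()).
        self.job_aborted = True


def monitor_with_abort(guest, monitor):
    """Run monitor(guest); abort the migration job if monitoring fails."""
    try:
        monitor(guest)
    except Exception:
        # Monitoring failed (DB/RPC/libvirt timeout, ...). Try to stop the
        # job so the guest is not resumed on the target behind our back.
        try:
            guest.abort_job()
        except Exception:
            # Best effort only: log and keep the original failure.
            LOG.warning("Failed to abort live-migration job", exc_info=True)
        # Re-raise the original monitoring error so the caller can still
        # put the instance/migration into ERROR state.
        raise


def failing_monitor(guest):
    """Simulates an infrastructure failure during monitoring."""
    raise TimeoutError("simulated DB/RPC/libvirt timeout")
```

In this sketch the original exception is still propagated after the abort attempt, so error handling in the caller is unchanged; only the orphaned job is cleaned up.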
Reviewed: https://review.opendev.org/c/openstack/nova/+/764435
Committed: https://opendev.org/openstack/nova/commit/39f0af5d18d6bea34fa15b8f7778115b25432749
Submitter: "Zuul (22348)"
Branch: master
commit 39f0af5d18d6bea34fa15b8f7778115b25432749
Author: Alexandre Arents <email address hidden>
Date: Thu Nov 26 15:24:19 2020 +0000
libvirt: Abort live-migration job when monitoring fails
During the live migration process, a _live_migration_monitor thread
checks the progress of the migration on the source host. If for any
reason we hit an infrastructure issue involving a DB/RPC/libvirt-timeout
failure, an exception is raised to the nova-compute service and the
instance/migration is set to ERROR state.
The issue is that we may leave the live-migration job running outside
of nova's control. At the end of the job, the guest is resumed on the
target host while nova still reports it on the source host. This may
lead to a split-brain situation if the instance is restarted.
This change proposes to abort the live-migration job if an issue occurs
during _live_migration_monitor.
Change-Id: Ia593b500425c81e54eb401e38264db5cc5fc1f93
Closes-Bug: #1905944