test_server_connectivity_cold_migration_revert randomly fails ssh check

Bug #1788403 reported by Matt Riedemann
Affects                    Status         Importance   Assigned to      Milestone
OpenStack Compute (nova)   Fix Released   Medium       Matt Riedemann
  Rocky                    Fix Released   Medium       Lee Yarwood

Bug Description

Seen here:

http://logs.openstack.org/98/591898/3/check/tempest-slow/c480e82/job-output.txt.gz#_2018-08-21_23_20_11_337095

2018-08-21 23:20:11.337095 | controller | {0} tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_cold_migration_revert [200.028926s] ... FAILED
2018-08-21 23:20:11.337187 | controller |
2018-08-21 23:20:11.337260 | controller | Captured traceback:
2018-08-21 23:20:11.337329 | controller | ~~~~~~~~~~~~~~~~~~~
2018-08-21 23:20:11.337435 | controller | Traceback (most recent call last):
2018-08-21 23:20:11.337591 | controller | File "tempest/common/utils/__init__.py", line 89, in wrapper
2018-08-21 23:20:11.337702 | controller | return f(*func_args, **func_kwargs)
2018-08-21 23:20:11.338012 | controller | File "tempest/scenario/test_network_advanced_server_ops.py", line 258, in test_server_connectivity_cold_migration_revert
2018-08-21 23:20:11.338175 | controller | server, keypair, floating_ip)
2018-08-21 23:20:11.338571 | controller | File "tempest/scenario/test_network_advanced_server_ops.py", line 103, in _wait_server_status_and_check_network_connectivity
2018-08-21 23:20:11.338766 | controller | self._check_network_connectivity(server, keypair, floating_ip)
2018-08-21 23:20:11.339004 | controller | File "tempest/scenario/test_network_advanced_server_ops.py", line 96, in _check_network_connectivity
2018-08-21 23:20:11.339069 | controller | server)
2018-08-21 23:20:11.339251 | controller | File "tempest/scenario/manager.py", line 622, in check_vm_connectivity
2018-08-21 23:20:11.339314 | controller | msg=msg)
2018-08-21 23:20:11.339572 | controller | File "/opt/stack/tempest/.tox/tempest/local/lib/python2.7/site-packages/unittest2/case.py", line 702, in assertTrue
2018-08-21 23:20:11.339683 | controller | raise self.failureException(msg)
2018-08-21 23:20:11.339862 | controller | AssertionError: False is not true : Public network connectivity check failed
2018-08-21 23:20:11.340000 | controller | Timed out waiting for 172.24.5.13 to become reachable

The test is pretty simple:

    @decorators.idempotent_id('25b188d7-0183-4b1e-a11d-15840c8e2fd6')
    @testtools.skipUnless(CONF.compute_feature_enabled.cold_migration,
                          'Cold migration is not available.')
    @testtools.skipUnless(CONF.compute.min_compute_nodes > 1,
                          'Less than 2 compute nodes, skipping multinode '
                          'tests.')
    @decorators.attr(type='slow')
    @utils.services('compute', 'network')
    def test_server_connectivity_cold_migration_revert(self):
        keypair = self.create_keypair()
        server = self._setup_server(keypair)
        floating_ip = self._setup_network(server, keypair)
        src_host = self._get_host_for_server(server['id'])
        self._wait_server_status_and_check_network_connectivity(
            server, keypair, floating_ip)

        self.admin_servers_client.migrate_server(server['id'])
        waiters.wait_for_server_status(self.servers_client, server['id'],
                                       'VERIFY_RESIZE')
        self.servers_client.revert_resize_server(server['id'])
        self._wait_server_status_and_check_network_connectivity(
            server, keypair, floating_ip)
        dst_host = self._get_host_for_server(server['id'])

        self.assertEqual(src_host, dst_host)

It creates a server, cold migrates it (which nova implements as a resize), reverts the resize, and then tries to ssh into the guest, which times out. I wonder if on the resize (or revert) we're losing the IP or failing to plug it properly.

Revision history for this message
Matt Riedemann (mriedem) wrote :

172.24.5.13 in this case is the floating IP.

Revision history for this message
Matt Riedemann (mriedem) wrote :

This is when we migrate the server:

http://logs.openstack.org/98/591898/3/check/tempest-slow/c480e82/job-output.txt.gz#_2018-08-21_23_20_11_616392

2018-08-21 23:20:11.616392 | controller | 2018-08-21 23:17:38,230 9451 INFO [tempest.lib.common.rest_client] Request (TestNetworkAdvancedServerOps:test_server_connectivity_cold_migration_revert): 202 POST https://10.209.64.15/compute/v2.1/servers/015f755f-2779-4613-9b28-9bc377056fe2/action 1.002s

Revision history for this message
Matt Riedemann (mriedem) wrote :

This is when we revert the resize:

http://logs.openstack.org/98/591898/3/check/tempest-slow/c480e82/job-output.txt.gz#_2018-08-21_23_20_11_686467

2018-08-21 23:20:11.686467 | controller | 2018-08-21 23:17:50,966 9451 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'X-Auth-Token': '<omitted>', 'Accept': 'application/json'}
2018-08-21 23:20:11.686571 | controller | Body: {"revertResize": {}}

Revision history for this message
Matt Riedemann (mriedem) wrote :

Before the ssh fails after the resize revert, the floating IP is down:

http://logs.openstack.org/98/591898/3/check/tempest-slow/c480e82/job-output.txt.gz#_2018-08-21_23_20_11_731057

2018-08-21 23:20:11.731057 | controller | 2018-08-21 23:17:58,661 9451 INFO [tempest.scenario.manager] FloatingIP: {u'router_id': u'188eb35a-1270-4536-a02a-92f9ec031c7a', u'description': u'', u'tags': [], u'status': u'DOWN', u'tenant_id': u'38bf4237efd54095959153ffc2227946', u'created_at': u'2018-08-21T23:17:10Z', u'updated_at': u'2018-08-21T23:17:10Z', u'floating_network_id': u'0fd1d892-0aa3-4c0c-818f-44f1c9a9b569', u'fixed_ip_address': u'10.1.0.9', u'port_details': {u'status': u'ACTIVE', u'name': u'', u'admin_state_up': True, u'network_id': u'6c9a79d8-b508-4a97-a15d-bf4acee8b4a8', u'device_owner': u'compute:nova', u'mac_address': u'fa:16:3e:2b:a9:39', u'device_id': u'015f755f-2779-4613-9b28-9bc377056fe2'}, u'floating_ip_address': u'172.24.5.13', u'revision_number': 0, u'project_id': u'38bf4237efd54095959153ffc2227946', u'id': u'ac394112-d245-461b-b660-00cab7c1b51a', u'port_id': u'b835ff51-0262-4237-ab52-7b08206bd099'} is at status: ACTIVE
2018-08-21 23:20:11.731352 | controller | 2018-08-21 23:17:58,661 9451 DEBUG [tempest.scenario.manager] checking network connections to IP 172.24.5.13 with user: cirros
2018-08-21 23:20:11.731802 | controller | 2018-08-21 23:17:58,662 9451 DEBUG [tempest.scenario.manager] TestNetworkAdvancedServerOps:test_server_connectivity_cold_migration_revert begins to ping 172.24.5.13 in 120 sec and the expected result is reachable
2018-08-21 23:20:11.732072 | controller | 2018-08-21 23:19:58,848 9451 DEBUG [tempest.lib.common.utils.test_utils] Call ping returns false in 120.000000 seconds
2018-08-21 23:20:11.732512 | controller | 2018-08-21 23:19:58,850 9451 DEBUG [tempest.scenario.manager] TestNetworkAdvancedServerOps:test_server_connectivity_cold_migration_revert finishes ping 172.24.5.13 in 120 sec and the ping result is unexpected

Formatted json:

{
   u'router_id': u'188eb35a-1270-4536-a02a-92f9ec031c7a',
   u'description': u'',
   u'tags': [],
   u'status': u'DOWN',
   u'tenant_id': u'38bf4237efd54095959153ffc2227946',
   u'created_at': u'2018-08-21T23:17:10Z',
   u'updated_at': u'2018-08-21T23:17:10Z',
   u'floating_network_id': u'0fd1d892-0aa3-4c0c-818f-44f1c9a9b569',
   u'fixed_ip_address': u'10.1.0.9',
   u'port_details': {
      u'status': u'ACTIVE',
      u'name': u'',
      u'admin_state_up': True,
      u'network_id': u'6c9a79d8-b508-4a97-a15d-bf4acee8b4a8',
      u'device_owner': u'compute:nova',
      u'mac_address': u'fa:16:3e:2b:a9:39',
      u'device_id': u'015f755f-2779-4613-9b28-9bc377056fe2'
   },
   u'floating_ip_address': u'172.24.5.13',
   u'revision_number': 0,
   u'project_id': u'38bf4237efd54095959153ffc2227946',
   u'id': u'ac394112-d245-461b-b660-00cab7c1b51a',
   u'port_id': u'b835ff51-0262-4237-ab52-7b08206bd099'
}

I'm not sure why Tempest says, "is at status: ACTIVE" when the status is clearly DOWN. Maybe it's checking the port_details?

Revision history for this message
Matt Riedemann (mriedem) wrote :

Ignore the above comment; that's a bug in the logging in the tempest check_floating_ip_status utility method. The real floating IP status is here:

http://logs.openstack.org/98/591898/3/check/tempest-slow/c480e82/job-output.txt.gz#_2018-08-21_23_20_11_728946

2018-08-21 23:20:11.728946 | controller | Body: {"floatingip": {"router_id": "188eb35a-1270-4536-a02a-92f9ec031c7a", "status": "ACTIVE", "description": "", "tags": [], "tenant_id": "38bf4237efd54095959153ffc2227946", "created_at": "2018-08-21T23:17:10Z", "updated_at": "2018-08-21T23:17:13Z", "floating_network_id": "0fd1d892-0aa3-4c0c-818f-44f1c9a9b569", "port_details": {"status": "ACTIVE", "name": "", "admin_state_up": true, "network_id": "6c9a79d8-b508-4a97-a15d-bf4acee8b4a8", "device_owner": "compute:nova", "mac_address": "fa:16:3e:2b:a9:39", "device_id": "015f755f-2779-4613-9b28-9bc377056fe2"}, "fixed_ip_address": "10.1.0.9", "floating_ip_address": "172.24.5.13", "revision_number": 1, "project_id": "38bf4237efd54095959153ffc2227946", "port_id": "b835ff51-0262-4237-ab52-7b08206bd099", "id": "ac394112-d245-461b-b660-00cab7c1b51a"}}
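
For illustration, here is a minimal sketch of how a wait helper can log that misleading message; this is a hypothetical example, not the actual tempest code. The helper polls the refreshed floating IP until it reaches the expected status, but the success message prints the stale dict that was passed in:

    import time

    def check_floating_ip_status(client, floating_ip, expected_status,
                                 timeout=120):
        # Hypothetical sketch of the logging bug pattern, not the real helper.
        deadline = time.time() + timeout
        while time.time() < deadline:
            refreshed = client.show_floatingip(floating_ip['id'])['floatingip']
            if refreshed['status'] == expected_status:
                # Bug: logs the stale input dict (which may still say DOWN)
                # next to the *expected* status, instead of the refreshed body.
                print('FloatingIP: %s is at status: %s'
                      % (floating_ip, expected_status))
                return refreshed
            time.sleep(1)
        raise AssertionError('Floating IP never reached status %s'
                             % expected_status)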

Revision history for this message
Matt Riedemann (mriedem) wrote :

When we plug_vifs again on the original source host as part of finish_revert_resize, the port is inactive when we plug the vif:

http://logs.openstack.org/98/591898/3/check/tempest-slow/c480e82/compute1/logs/screen-n-cpu.txt.gz#_Aug_21_23_17_56_510452

Aug 21 23:17:56.510452 ubuntu-xenial-rax-iad-0001454116 nova-compute[21579]: INFO os_vif [None req-07e0e23e-d0c4-41fb-8897-c6c5ddb4dcb2 tempest-TestNetworkAdvancedServerOps-1125563785 tempest-TestNetworkAdvancedServerOps-1125563785] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:2b:a9:39,bridge_name='br-int',has_traffic_filtering=True,id=b835ff51-0262-4237-ab52-7b08206bd099,network=Network(6c9a79d8-b508-4a97-a15d-bf4acee8b4a8),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tapb835ff51-02')

Maybe that's because we unplugged the vif from the dest host during revert_resize and then updated the port binding host_id to point back at the original source host?

About a second later I see a vif plugged event:

http://logs.openstack.org/98/591898/3/check/tempest-slow/c480e82/compute1/logs/screen-n-cpu.txt.gz#_Aug_21_23_17_57_152994

Aug 21 23:17:57.152994 ubuntu-xenial-rax-iad-0001454116 nova-compute[21579]: DEBUG nova.compute.manager [None req-ae5b1d99-ee4e-46e4-ba81-bc93deee2c46 service nova] [instance: 015f755f-2779-4613-9b28-9bc377056fe2] Received event network-vif-plugged-b835ff51-0262-4237-ab52-7b08206bd099 {{(pid=21579) external_instance_event /opt/stack/nova/nova/compute/manager.py:8077}}

But then tempest waits for the floating IP to be active before trying to ssh, so I'm not sure what is failing.

Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Attila Fazekas (afazekas) wrote :

http://logs.openstack.org/69/557169/20/gate/tempest-slow/b483789/job-output.txt
In this run the DHCP failed; it may be worth checking the path to the DHCP server.

Revision history for this message
Attila Fazekas (afazekas) wrote :

http://logs.openstack.org/69/557169/20/gate/tempest-slow/b483789/controller/logs/screen-q-agt.txt.gz

Sep 18 05:00:43.534624 ubuntu-xenial-ovh-gra1-0002079055 neutron-openvswitch-agent[23053]: INFO neutron.agent.common.ovs_lib [None req-4e14c6de-ef93-44ea-ab82-312d93726646 None None] Port dcabb610-ecaa-43c4-8c3d-e7f056024c3a not present in bridge br-int

It is usually nova's responsibility to put the port into br-int. Either neutron did not notice the plug event (via ovsdb-client monitor) or nova did not put it there with the proper attributes.

Revision history for this message
Attila Fazekas (afazekas) wrote :

The related monitor events:
Sep 18 04:58:26.258153 ubuntu-xenial-ovh-gra1-0002079055 neutron-openvswitch-agent[23053]: DEBUG neutron.agent.common.async_process [-] Output received from [ovsdb-client monitor tcp:127.0.0.1:6640 Interface name,ofport,external_ids --format=json]: {"data":[["15170696-5e84-42e4-9aa0-5edfdeb28b70","insert","tapdcabb610-ec",["set",[]],["map",[["attached-mac","fa:16:3e:02:83:aa"],["iface-id","dcabb610-ecaa-43c4-8c3d-e7f056024c3a"],["iface-status","active"],["vm-id","d91b159b-2457-49a1-b922-507f3cf78c7c"]]]]],"headings":["row","action","name","ofport","external_ids"]} {{(pid=23053) _read_stdout /opt/stack/neutron/neutron/agent/common/async_process.py:242}}
Sep 18 04:58:26.260293 ubuntu-xenial-ovh-gra1-0002079055 neutron-openvswitch-agent[23053]: DEBUG neutron.agent.common.async_process [-] Output received from [ovsdb-client monitor tcp:127.0.0.1:6640 Interface name,ofport,external_ids --format=json]: {"data":[["15170696-5e84-42e4-9aa0-5edfdeb28b70","old",null,["set",[]],null],["","new","tapdcabb610-ec",26,["map",[["attached-mac","fa:16:3e:02:83:aa"],["iface-id","dcabb610-ecaa-43c4-8c3d-e7f056024c3a"],["iface-status","active"],["vm-id","d91b159b-2457-49a1-b922-507f3cf78c7c"]]]]],"headings":["row","action","name","ofport","external_ids"]} {{(pid=23053) _read_stdout /opt/stack/neutron/neutron/agent/common/async_process.py:242}}
Sep 18 04:58:29.464005 ubuntu-xenial-ovh-gra1-0002079055 neutron-openvswitch-agent[23053]: DEBUG neutron.agent.common.async_process [-] Output received from [ovsdb-client monitor tcp:127.0.0.1:6640 Interface name,ofport,external_ids --format=json]: {"data":[["15170696-5e84-42e4-9aa0-5edfdeb28b70","old",null,26,null],["","new","tapdcabb610-ec",-1,["map",[["attached-mac","fa:16:3e:02:83:aa"],["iface-id","dcabb610-ecaa-43c4-8c3d-e7f056024c3a"],["iface-status","active"],["vm-id","d91b159b-2457-49a1-b922-507f3cf78c7c"]]]]],"headings":["row","action","name","ofport","external_ids"]} {{(pid=23053) _read_stdout /opt/stack/neutron/neutron/agent/common/async_process.py:242}}
Sep 18 04:58:29.663664 ubuntu-xenial-ovh-gra1-0002079055 neutron-openvswitch-agent[23053]: DEBUG neutron.agent.common.async_process [-] Output received from [ovsdb-client monitor tcp:127.0.0.1:6640 Interface name,ofport,external_ids --format=json]: {"data":[["15170696-5e84-42e4-9aa0-5edfdeb28b70","delete","tapdcabb610-ec",-1,["map",[["attached-mac","fa:16:3e:02:83:aa"],["iface-id","dcabb610-ecaa-43c4-8c3d-e7f056024c3a"],["iface-status","active"],["vm-id","d91b159b-2457-49a1-b922-507f3cf78c7c"]]]]],"headings":["row","action","name","ofport","external_ids"]} {{(pid=23053) _read_stdout /opt/stack/neutron/neutron/agent/common/async_process.py:242}}

Revision history for this message
Attila Fazekas (afazekas) wrote :

n-cpu:
Sep 18 04:58:30.344891 ubuntu-xenial-ovh-gra1-0002079055 nova-compute[30350]: WARNING nova.compute.manager [None req-b4d88e87-9cd6-48ea-ad43-2e33a2d5537a service nova] [instance: d91b159b-2457-49a1-b922-507f3cf78c7c] Received unexpected event network-vif-unplugged-dcabb610-ecaa-43c4-8c3d-e7f056024c3a for instance with vm_state resized and task_state resize_reverting.

Revision history for this message
Attila Fazekas (afazekas) wrote :

It looks like nova requested the unplug but never requested the replug.
Unplugging might lead to the port going DOWN, and if the port is DOWN it might never be brought back to ACTIVE.
Nova should try to remember the initial state of the port and replug it if it was active at the beginning (or activation was in progress), since there is no sign the user wanted it DOWN.
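
A rough sketch of that idea (hypothetical helper names, not nova code): snapshot which VIFs were active before the unplug, and on revert only replug the ones that were active or coming up.

    def snapshot_vif_states(network_info):
        # Remember which VIFs were up (or coming up) before we touch them.
        return {vif['id']: bool(vif.get('active')) for vif in network_info}

    def replug_previously_active_vifs(driver, instance, network_info, snapshot):
        # Replug only the VIFs that were active before the unplug; there is
        # no sign the user wanted those ports DOWN.
        to_replug = [vif for vif in network_info if snapshot.get(vif['id'])]
        if to_replug:
            driver.plug_vifs(instance, to_replug)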

Revision history for this message
Attila Fazekas (afazekas) wrote :

        guest = self._create_domain_and_network(context, xml, instance,
                                        network_info,
                                        block_device_info=block_device_info,
                                        power_on=power_on,
                                        vifs_already_plugged=True,
                                        post_xml_callback=gen_confdrive)

vifs_already_plugged=True means nova assumes the interfaces are already in the right state.
I wonder whether we need to extend the port info with a down_reason or something, so we can tell whether nova or some other user action led to the down state.
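
For context, a simplified sketch (assumed names, not the exact nova implementation) of what vifs_already_plugged typically controls: whether the driver builds a list of network-vif-plugged events to wait for before it boots the guest.

    def get_neutron_events(network_info, vifs_already_plugged):
        # With vifs_already_plugged=True the driver assumes the interfaces are
        # already wired and waits for nothing; otherwise it waits for a
        # network-vif-plugged event for every VIF that is not yet active.
        if vifs_already_plugged:
            return []
        return [('network-vif-plugged', vif['id'])
                for vif in network_info if vif.get('active') is False]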

Revision history for this message
Attila Fazekas (afazekas) wrote :

Searching for keywords in the neutron change log:
https://bugs.launchpad.net/neutron/+bug/1757089

Revision history for this message
Attila Fazekas (afazekas) wrote :

vifs_already_plugged=True was added by https://review.openstack.org/#/c/161934/ .

Revision history for this message
Attila Fazekas (afazekas) wrote :

https://docs.openstack.org/neutron/rocky/contributor/internals/live_migration.html has the latest flow chart for a similar action (a resize can be on the same host).

Revision history for this message
Matt Riedemann (mriedem) wrote :

Apparently this never showed up in the bug report, but I have a related patch for this bug:

https://review.openstack.org/#/c/595069/

Related to the point made in comment 15.

Revision history for this message
Attila Fazekas (afazekas) wrote :

'admin_state_up': True in all of my greps so far.

I wonder whether 'vifs_already_plugged' can be set to False now, assuming neutron simply does not rewire the port when admin_state_up is False.

https://review.openstack.org/#/c/161934/3/nova/virt/libvirt/driver.py may have worked around https://bugs.launchpad.net/neutron/+bug/1757089 , but that bug has since been fixed.

Revision history for this message
Attila Fazekas (afazekas) wrote :

Actually the migration happened; I'll reinterpret the logs.

Revision history for this message
Attila Fazekas (afazekas) wrote :

Sep 18 04:58:34.258675 ubuntu-xenial-ovh-gra1-0002079057 neutron-openvswitch-agent[17799]: DEBUG neutron.agent.common.async_process [-] Output received from [ovsdb-client monitor tcp:127.0.0.1:6640 Interface name,ofport,external_ids --format=json]: {"data":[["11a43cd8-0f06-4cb0-b2fd-08645caf07fa","old",null,["set",[]],null],["","new","tapdcabb610-ec",6,["map",[["attached-mac","fa:16:3e:02:83:aa"],["iface-id","dcabb610-ecaa-43c4-8c3d-e7f056024c3a"],["iface-status","active"],["vm-id","d91b159b-2457-49a1-b922-507f3cf78c7c"]]]]],"headings":["row","action","name","ofport","external_ids"]} {{(pid=17799) _read_stdout /opt/stack/neutron/neutron/agent/common/async_process.py:242}}

The port was wired on compute1 and remained wired until 05:00:41 (around the test timeout).

Revision history for this message
Attila Fazekas (afazekas) wrote :

http://logs.openstack.org/69/557169/20/gate/tempest-slow/b483789/controller/logs/screen-q-dhcp.txt.gz

It looks like it took a long time to get a DHCP response:
Sep 18 04:59:42.150904 ubuntu-xenial-ovh-gra1-0002079055 dnsmasq-dhcp[9696]: DHCPDISCOVER(tap69ea22e5-3e) fa:16:3e:02:83:aa
Sep 18 04:59:42.150929 ubuntu-xenial-ovh-gra1-0002079055 dnsmasq-dhcp[9696]: DHCPOFFER(tap69ea22e5-3e) 10.1.0.8 fa:16:3e:02:83:aa

2018-09-18 04:58:36,141 6822 DEBUG [tempest.scenario.manager] TestNetworkAdvancedServerOps:test_server_connectivity_cold_migration_revert begins to ping 172.24.5.2 in 120 sec and the expected result is reachable
2018-09-18 05:00:37,439 6822 DEBUG [tempest.lib.common.utils.test_utils] Call ping returns false in 120.000000 seconds

Revision history for this message
Attila Fazekas (afazekas) wrote :

So the DHCPREQUEST/DHCPACK never happened; I have no idea what broke the sequence or why the DHCPDISCOVER was delayed by ~1 minute.
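
As a side note, a small debugging sketch for runs like this: scan the dnsmasq-dhcp log for the DISCOVER/OFFER/REQUEST/ACK stages of one MAC address to see where the exchange stops (the log format is assumed to match the lines quoted above).

    import sys

    DHCP_STAGES = ('DHCPDISCOVER', 'DHCPOFFER', 'DHCPREQUEST', 'DHCPACK')

    def dhcp_stages_seen(log_path, mac):
        # Return the DHCP stages observed for one MAC, in the order first seen.
        seen = []
        with open(log_path) as fh:
            for line in fh:
                if mac not in line:
                    continue
                for stage in DHCP_STAGES:
                    if stage in line and stage not in seen:
                        seen.append(stage)
        return seen

    if __name__ == '__main__':
        print(dhcp_stages_seen(sys.argv[1], sys.argv[2]))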

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/595069
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=4817165fc5938a553fafa1a69c6086f9ebe311af
Submitter: Zuul
Branch: master

commit 4817165fc5938a553fafa1a69c6086f9ebe311af
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 22 10:43:40 2018 -0400

    Wait for network-vif-plugged on resize revert

    Change Id515137747a4b76e9b7057c95f80c8ae74017519 modified
    the libvirt driver to not wait for network-vif-plugged
    events on resize revert, saying that the vifs were never
    unplugged. That's true for the source host but not the dest
    host. The ComputeManager.revert_resize method calls driver.destroy
    and for the libvirt driver, driver.destroy calls cleanup which
    calls unplug_vifs *on the dest host*.

    As of change If00736ab36df4a5a3be4f02b0a550e4bcae77b1b, the API
    will route all events to both the source and dest host while
    the instance has a migration_context. When revert_resize is done
    on the dest host, it RPC casts to finish_revert_resize on the
    source host which calls driver.finish_revert_migration which
    restarts the guest on the source host and plugs vifs. Therefore,
    we can wait for the network-vif-plugged event before changing
    the instance status back to ACTIVE.

    Change-Id: I9e0cffb889c94713c7f28812918103a5d97cefeb
    Related-Bug: #1788403
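
In rough terms, this is the shape of the fix (assumed names, simplified from the actual nova change): during finish_revert_migration on the source host, wait for the network-vif-plugged events while plugging the VIFs and restarting the guest, before the instance goes back to ACTIVE.

    def finish_revert_migration(virtapi, instance, network_info, plug_and_boot):
        # The dest host unplugged the VIFs during revert_resize, so wait for
        # neutron to confirm they are plugged again on the source host.
        events = [('network-vif-plugged', vif['id'])
                  for vif in network_info if vif.get('active') is False]
        with virtapi.wait_for_instance_event(instance, events, deadline=300):
            plug_and_boot()  # plug VIFs and start the guest on the source host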

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.openstack.org/605041

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/rocky)

Reviewed: https://review.openstack.org/605041
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b35b812d468cc306a412bcaa4eaad19855a4a627
Submitter: Zuul
Branch: stable/rocky

commit b35b812d468cc306a412bcaa4eaad19855a4a627
Author: Matt Riedemann <email address hidden>
Date: Wed Aug 22 10:43:40 2018 -0400

    Wait for network-vif-plugged on resize revert

    Change Id515137747a4b76e9b7057c95f80c8ae74017519 modified
    the libvirt driver to not wait for network-vif-plugged
    events on resize revert, saying that the vifs were never
    unplugged. That's true for the source host but not the dest
    host. The ComputeManager.revert_resize method calls driver.destroy
    and for the libvirt driver, driver.destroy calls cleanup which
    calls unplug_vifs *on the dest host*.

    As of change If00736ab36df4a5a3be4f02b0a550e4bcae77b1b, the API
    will route all events to both the source and dest host while
    the instance has a migration_context. When revert_resize is done
    on the dest host, it RPC casts to finish_revert_resize on the
    source host which calls driver.finish_revert_migration which
    restarts the guest on the source host and plugs vifs. Therefore,
    we can wait for the network-vif-plugged event before changing
    the instance status back to ACTIVE.

    Change-Id: I9e0cffb889c94713c7f28812918103a5d97cefeb
    Related-Bug: #1788403
    (cherry picked from commit 4817165fc5938a553fafa1a69c6086f9ebe311af)

tags: added: in-stable-rocky
Revision history for this message
Matt Riedemann (mriedem) wrote :

This isn't showing up in the gate anymore, so presumably the nova change fixed the issue, or at least made it very rare.

no longer affects: neutron
Changed in nova:
status: Confirmed → Fix Released
assignee: nobody → Matt Riedemann (mriedem)