intermittent issue getting assigned MACs for SRIOV nics, causes nova timeout

Bug #1784484 reported by Chris Friesen
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Medium
Brian Haley

Bug Description

We're seeing an intermittent issue with the Pike release where a VM with SRIOV NIC gets stuck in the "rebuild_spawning" task state after a "nova evacuate" operation (which rebuilds the instance on a different compute
node).

The specific issue we're seeing is that occasionally neutron seems to get "stuck". Nova is down in nova.virt.libvirt.driver.LibvirtDriver._create_domain_and_network(), and five minutes later
it hits the eventlet.timeout.Timeout exception clause at https://github.com/openstack/nova/blob/stable/pike/nova/virt/libvirt/driver.py#L5451

We think that maybe there's a race condition between nova/libvirt setting the MAC for the VF and the VF actually being usable/visible to the system. Closest mention of this upstream is this:

https://www.spinics.net/lists/kvm/msg165297.html

To test this we modified get_assigned_macs() at https://github.com/openstack/neutron/blob/stable/pike/neutron/plugins/ml2/drivers/mech_sriov/agent/pci_lib.py#L75 to wait 2 seconds if vf_lines
is empty,then call _get_vf_link_show() and get vf_lines again (max 3 tries).
After a few instance evacuates, the retry code was exercised after message "Cannot find vfs" was produced, and 1 retry was executed, and this time vf_lines had data in it.

Details:

The initial nova log looks like this:
2018-03-17 10:19:41.344 58584 WARNING nova.compute.manager [req-5283f1b9-23d3-472c-80df-4b8d73f8523b 3893b5675ba84cb5b6610dc0370d820b 4a7e8af978b043c1a0a26eafd52e0301 - default default] [instance: 983e3130-cb8f-4715-9e22-4d373fe4bb77] Received unexpected event network-vif-unplugged-0ac89834-0220-48ef-8559-e48f65fe4efb for instance
2018-03-17 10:19:42.770 58584 INFO nova.network.neutronv2.api [req-f13b904e-3711-445d-8dcd-22d122d5791e ff8718b957f6454680b820ee0d3a8cb4 78e01109154c49aab9960640407d96af - default default] [instance: 983e3130-cb8f-4715-9e22-4d373fe4bb77] Updating port 0ac89834-0220-48ef-8559-e48f65fe4efb with attributes {'binding:profile': {}, 'binding:host_id': 'compute-1'}
2018-03-17 10:19:44.032 58584 WARNING nova.compute.manager [req-a1f9f8fb-cdec-40c1-be2c-0b052305e331 3893b5675ba84cb5b6610dc0370d820b 4a7e8af978b043c1a0a26eafd52e0301 - default default] [instance: 983e3130-cb8f-4715-9e22-4d373fe4bb77] Received unexpected event network-vif-unplugged-59ce6131-07ea-4b6c-ac62-9c27e99149b8 for instance
2018-03-17 10:19:44.048 58584 INFO nova.network.neutronv2.api [req-f13b904e-3711-445d-8dcd-22d122d5791e ff8718b957f6454680b820ee0d3a8cb4 78e01109154c49aab9960640407d96af - default default] [instance: 983e3130-cb8f-4715-9e22-4d373fe4bb77] Updating port 59ce6131-07ea-4b6c-ac62-9c27e99149b8 with attributes {'binding:profile': {u'physical_network': u'group0-data0b', u'pci_request_id': u'b6ccb9b3-1551-4be7-812f-6108786b2b0a', u'pci_slot': u'0000:81:0f.7', u'pci_vendor_info': u'8086:154c'}, 'binding:host_id': 'compute-1'}
2018-03-17 10:19:44.438 58584 WARNING nova.compute.manager [req-7478c956-46cd-4651-9ea7-e477f2a99747 3893b5675ba84cb5b6610dc0370d820b 4a7e8af978b043c1a0a26eafd52e0301 - default default] [instance: 983e3130-cb8f-4715-9e22-4d373fe4bb77] Received unexpected event network-vif-plugged-59ce6131-07ea-4b6c-ac62-9c27e99149b8 for instance
2018-03-17 10:19:45.475 58584 INFO nova.compute.manager [req-f13b904e-3711-445d-8dcd-22d122d5791e ff8718b957f6454680b820ee0d3a8cb4 78e01109154c49aab9960640407d96af - default default] [instance: 983e3130-cb8f-4715-9e22-4d373fe4bb77] Detaching volume d4cb0ea9-a689-4203-8e38-40f16a448915
2018-03-17 10:19:46.091 58584 WARNING nova.compute.manager [req-b1d1c529-d32f-4d99-8b45-c2bea4ada9b3 3893b5675ba84cb5b6610dc0370d820b 4a7e8af978b043c1a0a26eafd52e0301 - default default] [instance: 983e3130-cb8f-4715-9e22-4d373fe4bb77] Received unexpected event network-vif-unplugged-59ce6131-07ea-4b6c-ac62-9c27e99149b8 for instance

Then the neutron log shows "Trigger reload_allocations" a few times, and the last one looks like:

2018-03-17 10:19:45.032 56804 INFO neutron.agent.dhcp.agent [req-5240cf46-0dff-4344-9d57-9e8f133559b0 1188dcf2254945e493726a4245e1a676 4a7e8af978b043c1a0a26eafd52e0301 - - -] Trigger reload_allocations for port admin_state_up=True, allowed_address_pairs=[], binding:host_id=compute-1, binding:profile=pci_request_id=b6ccb9b3-1551-4be7-812f-6108786b2b0a, pci_slot=0000:81:0f.7, pci_vendor_info=8086:154c, physical_network=group0-data0b, binding:vif_details=port_filter=False, vlan=2389, binding:vif_type=hw_veb, binding:vnic_type=direct, created_at=2018-03-17T10:18:29Z, description=, device_id=983e3130-cb8f-4715-9e22-4d373fe4bb77, device_owner=compute:nova, extra_dhcp_opts=[], fixed_ips=[{u'subnet_id': u'a7062041-37ab-468d-a1eb-f33d3b3621ee', u'ip_address': u'10.1.1.24'}], id=59ce6131-07ea-4b6c-ac62-9c27e99149b8, mac_address=fa:16:3e:8a:c2:d2, name=, network_id=a201980f-83e9-4fdd-9933-2fdfd58b8089, project_id=37fad54aace64d13b8d03bfa83563d59, revision_number=11, security_groups=[], status=DOWN, tags=[], tenant_id=37fad54aace64d13b8d03bfa83563d59, updated_at=2018-03-17T10:19:44Z, wrs-binding:mac_filtering=False, wrs-binding:mtu=9216, wrs-binding:vif_model=pci-sriov
2018-03-17 10:19:53.860 57003 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-4b2db1f7-06a8-42c5-9dab-1bbf7a937da3 - - - - -] Port fa:16:3e:8a:c2:d2 updated. Details: {u'allowed_address_pairs': [], u'device_owner': u'compute:nova', u'port_security_enabled': False, u'mtu': 9216, u'port_id': u'59ce6131-07ea-4b6c-ac62-9c27e99149b8', u'network': {u'provider:physical_network': u'group0-data0b', u'updated_at': u'2018-03-16T20:58:50Z', u'revision_number': 2, u'mtu': 9216, u'id': u'a201980f-83e9-4fdd-9933-2fdfd58b8089', u'description': u'', u'router:external': False, u'availability_zone_hints': [], u'availability_zones': [u'nova'], u'ipv4_address_scope': None, u'shared': True, u'project_id': u'78e01109154c49aab9960640407d96af', u'status': u'ACTIVE', u'subnets': [u'a7062041-37ab-468d-a1eb-f33d3b3621ee'], u'wrs-tm:qos': u'b1ec2429-3158-4716-8c5e-1848aac9c5fb', u'tags': [], u'ipv6_address_scope': None, u'provider:segmentation_id': 2389, u'name': u'internal0-net0-1', u'admin_state_up': True, u'tenant_id': u'78e01109154c49aab9960640407d96af', u'created_at': u'2018-03-16T20:58:36Z', u'provider:network_type': u'vlan', u'vlan_transparent': True}, u'fixed_ips': [{u'subnet_id': u'a7062041-37ab-468d-a1eb-f33d3b3621ee', u'ip_address': u'10.1.1.24'}], u'physical_network': u'group0-data0b', u'trunk_port': False, u'mac_address': u'fa:16:3e:8a:c2:d2', u'profile': {u'pci_request_id': u'b6ccb9b3-1551-4be7-812f-6108786b2b0a', u'pci_slot': u'0000:81:0f.7', u'physical_network': u'group0-data0b', u'pci_vendor_info': u'8086:154c'}, u'subnets': [{u'subnet': {u'ipv6_ra_mode': None, u'allocation_pools': [{u'subnet_id': u'a7062041-37ab-468d-a1eb-f33d3b3621ee', u'last_ip': u'10.1.1.254', u'id': u'55b74d0a-0e02-4026-8964-5e254d8d2566', u'first_ip': u'10.1.1.17'}], u'standard_attr_id': 88, u'ipv6_address_mode': None, u'id': u'a7062041-37ab-468d-a1eb-f33d3b3621ee', u'vlan_id': 0, u'dns_nameservers': [], u'network_standard_attr': {u'description': u'', u'tags': [], u'created_at': u'2018-03-16T20:58:36.000000', u'updated_at': u'2018-03-16T20:58:50.000000', u'revision_number': 2, u'id': 49, u'resource_type': u'networks'}, u'standard_attr': {u'description': u'', u'tags': [], u'created_at': u'2018-03-16T20:58:50.000000', u'updated_at': u'2018-03-16T20:58:50.000000', u'revision_number': 0, u'id': 88, u'resource_type': u'subnets'}, u'gateway_ip': None, u'project_id': u'78e01109154c49aab9960640407d96af', u'subnetpool': None, u'routes': [], u'cidr': u'10.1.1.0/24', u'subnetpool_id': None, u'service_types': [], u'name': u'internal0-subnet0-1', u'enable_dhcp': True, u'segment_id': None, u'network_id': u'a201980f-83e9-4fdd-9933-2fdfd58b8089', u'ip_version': 4, u'rbac_entries': [{u'action': u'access_as_shared', u'project_id': u'78e01109154c49aab9960640407d96af', u'id': u'66bc32fc-99ca-468b-a1de-d55c5c79b3bc', u'target_tenant': u'*', u'object_id': u'a201980f-83e9-4fdd-9933-2fdfd58b8089'}]}, u'physical_network': u'group0-data0b', u'network_type': u'vlan', u'segmentation_id': 2389}], u'network_qos_policy_id': None, u'segmentation_id': 2389, u'port_qos_policy': None, u'device': u'fa:16:3e:8a:c2:d2', u'qos_policy_id': None, u'admin_state_up': True, u'network_id': u'a201980f-83e9-4fdd-9933-2fdfd58b8089', u'tenant_id': u'37fad54aace64d13b8d03bfa83563d59', u'network_qos_policy': {u'scheduler': {u'weight': u'4'}}, u'network_type': u'vlan'}

Then the nova log shows libvirt timing out after 5 minutes:

2018-03-17 10:24:54.058 58584 WARNING nova.virt.libvirt.driver [req-f13b904e-3711-445d-8dcd-22d122d5791e ff8718b957f6454680b820ee0d3a8cb4 78e01109154c49aab9960640407d96af - default default] [instance: 983e3130-cb8f-4715-9e22-4d373fe4bb77] Timeout waiting for vif plugging callback for instance with vm_state error and task_state rebuild_spawning.

Repeating the experiment with debug logs enabled for neutron gave the following:

2018-04-23 18:14:19.561 73750 DEBUG neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] Agent loop found changes! {'current': set([('00:00:00:00:00:00', '0000:81:0f.5'), ('00:00:00:00:00:00', '0000:81:0f.4'), ('00:00:00:00:00:00', '0000:81:0e.6'), ('00:00:00:00:00:00', '0000:81:0e.7'), ('00:00:00:00:00:00', '0000:81:0f.6'), ('00:00:00:00:00:00', '0000:81:0e.4'), ('00:00:00:00:00:00', '0000:81:0e.5'), ('00:00:00:00:00:00', '0000:81:0e.2'), ('00:00:00:00:00:00', '0000:81:0e.3'), ('00:00:00:00:00:00', '0000:81:0f.1'), ('fa:16:3e:d6:01:a7', '0000:81:0f.7'), ('00:00:00:00:00:00', '0000:81:0e.0'), ('00:00:00:00:00:00', '0000:81:0f.0'), ('00:00:00:00:00:00', '0000:81:0e.1'), ('00:00:00:00:00:00', '0000:81:0f.3'), ('00:00:00:00:00:00', '0000:81:0f.2')]), 'removed': set([('00:00:00:00:00:00', '0000:81:0f.7')]), 'added': set([('fa:16:3e:d6:01:a7', '0000:81:0f.7')]), 'updated': set([])} daemon_loop /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/mech_sriov/agent/sriov_nic_agent.py:388
2018-04-23 18:14:19.562 73750 INFO neutron.agent.securitygroups_rpc [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] Skipping method prepare_devices_filter as firewall is disabled or configured as NoopFirewallDriver.
2018-04-23 18:14:19.828 73750 DEBUG neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] Port with MAC address fa:16:3e:d6:01:a7 is added treat_devices_added_updated /usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/mech_sriov/agent/sriov_nic_agent.py:294
2018-04-23 18:14:19.829 73750 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] Port fa:16:3e:d6:01:a7 updated. Details: {u'allowed_address_pairs': [], u'device_owner': u'compute:nova', u'port_security_enabled': False, u'mtu': 9216, u'port_id': u'f90e58b4-bbc7-4eb2-991b-46406d3d6470', u'network': {u'provider:physical_network': u'group0-data0b', u'updated_at': u'2018-04-23T16:48:51Z', u'revision_number': 2, u'mtu': 9216, u'id': u'a156b9b9-167c-447c-9ee0-f209f7ebb3f8', u'description': u'', u'router:external': False, u'availability_zone_hints': [], u'availability_zones': [u'nova'], u'ipv4_address_scope': None, u'shared': True, u'project_id': u'198ed4d2d0ec4440a98729ffc7bc0635', u'status': u'ACTIVE', u'subnets': [u'f452f5c3-a686-45cd-8098-0d092bd5e21e'], u'wrs-tm:qos': u'7d2f03f3-b9c6-4183-9859-878bbb8141e7', u'tags': [], u'ipv6_address_scope': None, u'provider:segmentation_id': 2285, u'name': u'internal0-net0', u'admin_state_up': True, u'tenant_id': u'198ed4d2d0ec4440a98729ffc7bc0635', u'created_at': u'2018-04-23T16:48:46Z', u'provider:network_type': u'vlan', u'vlan_transparent': True}, u'fixed_ips': [{u'subnet_id': u'f452f5c3-a686-45cd-8098-0d092bd5e21e', u'ip_address': u'10.0.0.132'}], u'physical_network': u'group0-data0b', u'trunk_port': False, u'mac_address': u'fa:16:3e:d6:01:a7', u'profile': {u'pci_request_id': u'89135f77-a4df-4744-b228-d396b7c0e876', u'pci_slot': u'0000:81:0f.7', u'physical_network': u'group0-data0b', u'pci_vendor_info': u'8086:154c'}, u'subnets': [{u'subnet': {u'ipv6_ra_mode': None, u'allocation_pools': [{u'subnet_id': u'f452f5c3-a686-45cd-8098-0d092bd5e21e', u'last_ip': u'10.0.0.254', u'id': u'86242517-d9dc-4c59-a3d6-17881ee0d42c', u'first_ip': u'10.0.0.128'}], u'standard_attr_id': 11, u'ipv6_address_mode': None, u'id': u'f452f5c3-a686-45cd-8098-0d092bd5e21e', u'vlan_id': 0, u'dns_nameservers': [], u'network_standard_attr': {u'description': u'', u'tags': [], u'created_at': u'2018-04-23T16:48:46.000000', u'updated_at': u'2018-04-23T16:48:51.000000', u'revision_number': 2, u'id': 9, u'resource_type': u'networks'}, u'standard_attr': {u'description': u'', u'tags': [], u'created_at': u'2018-04-23T16:48:51.000000', u'updated_at': u'2018-04-23T16:48:51.000000', u'revision_number': 0, u'id': 11, u'resource_type': u'subnets'}, u'gateway_ip': None, u'project_id': u'198ed4d2d0ec4440a98729ffc7bc0635', u'subnetpool': None, u'routes': [], u'cidr': u'10.0.0.0/24', u'subnetpool_id': None, u'service_types': [], u'name': u'internal0-subnet0', u'enable_dhcp': True, u'segment_id': None, u'network_id': u'a156b9b9-167c-447c-9ee0-f209f7ebb3f8', u'ip_version': 4, u'rbac_entries': [{u'action': u'access_as_shared', u'project_id': u'198ed4d2d0ec4440a98729ffc7bc0635', u'id': u'9c699b99-a98c-48ce-bdee-19e6e8483429', u'target_tenant': u'*', u'object_id': u'a156b9b9-167c-447c-9ee0-f209f7ebb3f8'}]}, u'physical_network': u'group0-data0b', u'network_type': u'vlan', u'segmentation_id': 2285}], u'network_qos_policy_id': None, u'segmentation_id': 2285, u'port_qos_policy': None, u'device': u'fa:16:3e:d6:01:a7', u'qos_policy_id': None, u'admin_state_up': True, u'network_id': u'a156b9b9-167c-447c-9ee0-f209f7ebb3f8', u'tenant_id': u'34a91443256b4db1aa2ba9d305079d41', u'network_qos_policy': {u'scheduler': {u'weight': u'4'}}, u'network_type': u'vlan'}
2018-04-23 18:14:19.830 73750 DEBUG neutron.agent.linux.utils [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] Running command: ['sudo', 'ip', 'link', 'show'] create_process /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:92
2018-04-23 18:14:19.850 73750 DEBUG neutron.agent.linux.utils [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] Running command: ['sudo', 'ip', 'link', 'show', 'ens787f3'] create_process /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:92
2018-04-23 18:14:19.872 73750 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.pci_lib [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] Cannot find vfs [15] in device ens787f3
2018-04-23 18:14:19.873 73750 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.eswitch_manager [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] device pci mismatch: fa:16:3e:d6:01:a7 - 0000:81:0f.7
2018-04-23 18:14:19.874 73750 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] No device with MAC fa:16:3e:d6:01:a7 defined on agent.
2018-04-23 18:14:19.880 73750 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] Removing device with MAC address 00:00:00:00:00:00 and PCI slot 0000:81:0f.7
2018-04-23 18:14:19.881 73750 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] port_id to device with MAC 00:00:00:00:00:00 not found
2018-04-23 18:14:19.964 73750 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-293f2f36-0257-4056-9898-5938f71e1e9e - - - - -] Port with MAC 00:00:00:00:00:00 and PCI slot 0000:81:0f.7 updated.

It seems like the VF's MAC address was programmed okay, but we don't see the VF being removed even though there's a "Cannot find vfs" log.

There's a host kernel log that makes us wonder if there's some sort of delay on this hardware:
2018-04-23T18:14:19.377 compute-1 kernel: info [ 418.459004] i40e 0000:81:00.3: Setting MAC fa:16:3e:d6:01:a7 on VF 15
2018-04-23T18:14:19.472 compute-1 kernel: info [ 418.553081] i40e 0000:81:00.3: Reload the VF driver to make this change effective.
2018-04-23T18:14:19.472 compute-1 kernel: info [ 418.553138] i40e 0000:81:00.3: Setting VLAN 2285, QOS 0x0 on VF 15
2018-04-23T18:14:19.851 compute-1 kernel: info [ 418.930504] vfio-pci 0000:81:0f.7: enabling device (0000 -> 0002)
2018-04-23T18:14:19.869 compute-1 kernel: err [ 418.944122] i40e 0000:81:00.3: VF 15 still in reset. Try again.

Chris Friesen (cbf123)
summary: - intermittent issue getting assigned MACs for SRIOV nics
+ intermittent issue getting assigned MACs for SRIOV nics, causes nova
+ timeout
Revision history for this message
Miguel Lavalle (minsel) wrote :

Hi Chris,

IIUC, this is induced by the hardware not setting itself up fast enough, so the sriov neutron agent doesn't find the MAC and decides "Removing device with MAC address 00:00:00:00:00:00 and PCI slot 0000:81:0f.7". Should the retry you tested be the approach to handle this hw slowness?

Revision history for this message
Chris Friesen (cbf123) wrote :

I'm not an expert in this area, but if the described hardware behaviour is "normal" then it seems like we need to handle it, either in nova or in neutron.

Changed in neutron:
importance: Undecided → Medium
assignee: nobody → Brian Haley (brian-haley)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/604179

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/604179
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=eea5aaac4f5c62ed0c74bd7e922680e20b87bb6b
Submitter: Zuul
Branch: master

commit eea5aaac4f5c62ed0c74bd7e922680e20b87bb6b
Author: Brian Haley <email address hidden>
Date: Thu Sep 20 14:15:49 2018 -0400

    Fix corner case in failure assigning MAC to SR-IOV NIC

    Sometimes due to NIC driver incorrect behavior, VFs might be
    missing in 'ip link show' output. This may lead to a VM boot
    failure as agent will just skip such missing devices.

    Make the agent do a resync in case a newly added device
    'disappears' during processing, which should cause a MAC to
    get assigned.

    Co-authored-by: Oleg Bondarev <email address hidden>

    Change-Id: I148b5a025fc388821fd1269d02908cc8ce1882fe
    Closes-bug: #1784484

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/608052

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/608053

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/608054

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky)

Reviewed: https://review.openstack.org/608052
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9574da1a3d3ceed5963238d6ccda36413e29299e
Submitter: Zuul
Branch: stable/rocky

commit 9574da1a3d3ceed5963238d6ccda36413e29299e
Author: Brian Haley <email address hidden>
Date: Thu Sep 20 14:15:49 2018 -0400

    Fix corner case in failure assigning MAC to SR-IOV NIC

    Sometimes due to NIC driver incorrect behavior, VFs might be
    missing in 'ip link show' output. This may lead to a VM boot
    failure as agent will just skip such missing devices.

    Make the agent do a resync in case a newly added device
    'disappears' during processing, which should cause a MAC to
    get assigned.

    Co-authored-by: Oleg Bondarev <email address hidden>

    Change-Id: I148b5a025fc388821fd1269d02908cc8ce1882fe
    Closes-bug: #1784484
    (cherry picked from commit eea5aaac4f5c62ed0c74bd7e922680e20b87bb6b)

tags: added: in-stable-rocky
tags: added: in-stable-queens
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/queens)

Reviewed: https://review.openstack.org/608053
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9664acb73e2345e8d387c7c433b704368697c6b5
Submitter: Zuul
Branch: stable/queens

commit 9664acb73e2345e8d387c7c433b704368697c6b5
Author: Brian Haley <email address hidden>
Date: Thu Sep 20 14:15:49 2018 -0400

    Fix corner case in failure assigning MAC to SR-IOV NIC

    Sometimes due to NIC driver incorrect behavior, VFs might be
    missing in 'ip link show' output. This may lead to a VM boot
    failure as agent will just skip such missing devices.

    Make the agent do a resync in case a newly added device
    'disappears' during processing, which should cause a MAC to
    get assigned.

    Co-authored-by: Oleg Bondarev <email address hidden>

    Change-Id: I148b5a025fc388821fd1269d02908cc8ce1882fe
    Closes-bug: #1784484
    (cherry picked from commit eea5aaac4f5c62ed0c74bd7e922680e20b87bb6b)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/pike)

Reviewed: https://review.openstack.org/608054
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8b57a6d3fbcb0df17ae00de1e76637bc3894ca1c
Submitter: Zuul
Branch: stable/pike

commit 8b57a6d3fbcb0df17ae00de1e76637bc3894ca1c
Author: Brian Haley <email address hidden>
Date: Thu Sep 20 14:15:49 2018 -0400

    Fix corner case in failure assigning MAC to SR-IOV NIC

    Sometimes due to NIC driver incorrect behavior, VFs might be
    missing in 'ip link show' output. This may lead to a VM boot
    failure as agent will just skip such missing devices.

    Make the agent do a resync in case a newly added device
    'disappears' during processing, which should cause a MAC to
    get assigned.

    Co-authored-by: Oleg Bondarev <email address hidden>

    Change-Id: I148b5a025fc388821fd1269d02908cc8ce1882fe
    Closes-bug: #1784484
    (cherry picked from commit eea5aaac4f5c62ed0c74bd7e922680e20b87bb6b)

tags: added: in-stable-pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 13.0.2

This issue was fixed in the openstack/neutron 13.0.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.6

This issue was fixed in the openstack/neutron 11.0.6 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 12.0.5

This issue was fixed in the openstack/neutron 12.0.5 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 14.0.0.0b1

This issue was fixed in the openstack/neutron 14.0.0.0b1 development milestone.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.