The created VFs are not announced by nova-compute

Bug #1817085 reported by Nicolas Pochet
Affects                               Status       Importance  Assigned to     Milestone
OpenStack Neutron Open vSwitch Charm  In Progress  Medium      Sahid Orentino
OpenStack Nova Compute Charm          In Progress  Medium      Sahid Orentino

Bug Description

With the following extract from the bundle, the VFs are not announced by nova-compute after the virtual functions have been created.

neutron-openvswitch:
    charm: cs:neutron-openvswitch
    num_units: 0
    bindings:
      data: *overlay-space
    options:
      worker-multiplier: *worker-multiplier
      bridge-mappings: *bridge-mappings
      prevent-arp-spoofing: True
      firewall-driver: openvswitch
      dns-servers: "8.8.8.8"
      enable-local-dhcp-and-metadata: true
      data-port: 'br-data:eno4'
      enable-sriov: True
      sriov-device-mappings: 'physnet_sriov:eno5'
      sriov-numvfs: 'eno5:8'

Only one interface is included in the extract of the bundle for brevity.
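
For context, the 'sriov-numvfs' option ultimately creates the virtual functions through the kernel's sysfs interface. Below is a minimal Python sketch of that step, assuming a 'device:count' spec such as 'eno5:8'; the helper name is hypothetical and the actual charm code may differ.

    # Hypothetical sketch: apply an 'sriov-numvfs' spec like 'eno5:8' by
    # writing to the device's sysfs attribute (requires root).
    import os

    def set_numvfs(spec):
        device, count = spec.split(':')
        sysfs = '/sys/class/net/{}/device/sriov_numvfs'.format(device)
        if not os.path.exists(sysfs):
            raise RuntimeError('{} does not support SR-IOV'.format(device))
        # The kernel requires resetting the count to 0 before changing it.
        with open(sysfs, 'w') as f:
            f.write('0')
        with open(sysfs, 'w') as f:
            f.write(count)

    set_numvfs('eno5:8')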

On one of the hosts:
2019-02-21 14:00:32.570 340576 INFO nova.compute.resource_tracker [req-eff0e938-fced-45a4-8e02-0a7410145050 - - - - -] Final resource view: name=xxx.maas phys_ram=386699MB used_ram=1024MB phys_disk=878GB used_disk=1GB total_vcpus=72 used_vcpus=1 pci_stats=[PciDevicePool(count=2,numa_node=0,product_id='1015',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='15b3'), PciDevicePool(count=4,numa_node=0,product_id='1572',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='8086'), PciDevicePool(count=2,numa_node=1,product_id='1015',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='15b3'), PciDevicePool(count=4,numa_node=2,product_id='1015',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='15b3'), PciDevicePool(count=4,numa_node=2,product_id='1572',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='8086')]

After restarting nova-compute on the host, they are correctly announced:

2019-02-21 14:00:45.546 642821 INFO nova.compute.resource_tracker [req-f53b43f9-9b58-4ca9-a9ea-02562279bf53 - - - - -] Final resource view: name=xxx.maas phys_ram=386699MB used_ram=1024MB phys_disk=878GB used_disk=1GB total_vcpus=72 used_vcpus=1 pci_stats=[PciDevicePool(count=2,numa_node=0,product_id='1015',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='15b3'), PciDevicePool(count=4,numa_node=0,product_id='1572',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='8086'), PciDevicePool(count=2,numa_node=1,product_id='1015',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='15b3'), PciDevicePool(count=4,numa_node=2,product_id='1015',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='15b3'), PciDevicePool(count=4,numa_node=2,product_id='1572',tags={dev_type='type-PF',physical_network='physnet_sriov'},vendor_id='8086'), PciDevicePool(count=16,numa_node=0,product_id='1016',tags={dev_type='type-VF',physical_network='physnet_sriov'},vendor_id='15b3'), PciDevicePool(count=16,numa_node=2,product_id='1016',tags={dev_type='type-VF',physical_network='physnet_sriov'},vendor_id='15b3'), PciDevicePool(count=32,numa_node=0,product_id='154c',tags={dev_type='type-VF',physical_network='physnet_sriov'},vendor_id='8086'), PciDevicePool(count=16,numa_node=1,product_id='1016',tags={dev_type='type-VF',physical_network='physnet_sriov'},vendor_id='15b3'), PciDevicePool(count=32,numa_node=2,product_id='154c',tags={dev_type='type-VF',physical_network='physnet_sriov'},vendor_id='8086')]

Revision history for this message
James Page (james-page) wrote :

Any change to SR-IOV configuration should trigger a remote restart in nova-compute:

        # Trigger remote restart in parent application
        remote_restart('neutron-plugin', 'nova-compute')
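
For readers unfamiliar with this mechanism, the sketch below shows the usual subordinate-to-parent restart pattern in Juju charms: the subordinate writes a fresh value on the relation so the parent charm notices the change and restarts its services. The relation key and exact implementation are assumptions for illustration, not the charm's actual code.

    # Assumed sketch of the remote-restart pattern between a subordinate
    # charm and its parent application.
    import uuid

    from charmhelpers.core.hookenv import relation_ids, relation_set

    def remote_restart(rel_name, remote_service=None):
        trigger = {'restart-trigger': str(uuid.uuid4())}
        if remote_service:
            trigger['remote-service'] = remote_service
        for rid in relation_ids(rel_name):
            relation_set(relation_id=rid, relation_settings=trigger)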

Changed in charm-neutron-openvswitch:
assignee: nobody → Sahid Orentino (sahid-ferdjaoui)
Revision history for this message
Nicolas Pochet (npochet) wrote :

It does indeed trigger a remote restart, but the fact that nova does not take the VFs into account might be due to several factors:
- Timing issue: the restart happens too soon after the creation of the VFs
- nova-compute is restarted before neutron-sriov-agent. See https://github.com/openstack/charm-neutron-openvswitch/blob/ee81e0eaf53a9b64319947565b378cee3e5d5f88/hooks/neutron_ovs_utils.py#L663-L672

Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

Nova does not depend on neutron-sriov-agent to read the VFs. Nova refers to libvirt, which relies on udev events to update its device cache.
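
As an illustration of that path (an assumption about the mechanism, not Nova's actual code), the snippet below uses the libvirt Python bindings to enumerate PCI node devices, which is the inventory nova-compute's PCI tracker draws from.

    # Illustrative only: list PCI node devices through libvirt, the same
    # inventory (kept current via udev events) that nova-compute reads.
    import libvirt

    conn = libvirt.open('qemu:///system')
    for dev in conn.listAllDevices(libvirt.VIR_CONNECT_LIST_NODE_DEVICES_CAP_PCI_DEV):
        print(dev.name())
    conn.close()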

Based on the code, my thinking is that we restart the Nova service before nova.conf has been updated to whitelist the PCI devices. I don't think it makes sense to ask Nova to be restarted from here [0]. Basically, we should request the restart after pci/passthrough_whitelist has been updated. Even if the VFs are not yet configured, Nova will discover them during its next call to grab resources from the host.

[0] https://github.com/openstack/charm-neutron-openvswitch/blob/ee81e0eaf53a9b64319947565b378cee3e5d5f88/hooks/neutron_ovs_utils.py#L663-L672
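
For reference, a [pci]/passthrough_whitelist entry of the kind described above looks roughly like the following; the snippet just prints an illustrative entry for eno5 and physnet_sriov, and the exact form the charm renders may differ.

    # Illustrative nova.conf whitelist entry; the charm's rendered value
    # may differ.
    import json

    entry = json.dumps({"devname": "eno5", "physical_network": "physnet_sriov"})
    print("[pci]")
    print("passthrough_whitelist = {}".format(entry))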

Changed in charm-nova-compute:
assignee: nobody → Sahid Orentino (sahid-ferdjaoui)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-compute (master)

Fix proposed to branch: master
Review: https://review.openstack.org/639285

Changed in charm-nova-compute:
status: New → In Progress
Revision history for this message
Sahid Orentino (sahid-ferdjaoui) wrote :

Can you provide the full nova-compute log in debug, along with nova.conf?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on charm-nova-compute (master)

Change abandoned by sahid (<email address hidden>) on branch: master
Review: https://review.openstack.org/639285
Reason: nova-compute is already restarted after any config change.

James Page (james-page)
Changed in charm-nova-compute:
importance: Undecided → Medium
Changed in charm-neutron-openvswitch:
importance: Undecided → Medium
status: New → In Progress