SRIOV VM MAC address changes after suspend/resume

Bug #1822366 reported by Yang Liu
This bug affects 1 person
| Affects   | Status       | Importance | Assigned to  | Milestone |
| StarlingX | Fix Released | Medium     | Ghada Khalil |           |

Bug Description

Brief Description
-----------------
The MAC address of the VM's SRIOV interface changes after suspend and resume, leaving that vif DOWN.

Severity
--------
Major

Steps to Reproduce
------------------
- Install and configure a system with an sriov interface on a compute host, e.g.:
[wrsroot@controller-0 ~(keystone_admin)]$ system host-if-list compute-0
+--------------------------------------+----------+-----------------+----------+---------+---------------+----------+-------------+---------------------------+-------------------+
| uuid | name | class | type | vlan id | ports | uses i/f | used by i/f | attributes | data networks |
+--------------------------------------+----------+-----------------+----------+---------+---------------+----------+-------------+---------------------------+-------------------+
| 0ffaa1dd-1558-47a9-abab-9cbcdf874f3d | cluster0 | platform | ethernet | None | [u'enp5s0f1'] | [] | [] | MTU=1500 | [] |
| 38e28b07-bc74-4951-9ec6-02a73de665a4 | sriov0 | pci-sriov | ethernet | None | [u'enp6s0f1'] | [] | [] | MTU=1500 | [u'group0-data0'] |
| a9c998c5-cdb2-4c22-805a-7c984fc373e3 | mgmt0 | platform | ethernet | None | [u'eno2'] | [] | [] | MTU=1500 | [] |
| ba14cbca-740e-4e61-8e74-19834830b246 | data0 | data | ethernet | None | [u'enp5s0f0'] | [] | [] | MTU=1600,accelerated=True | [u'group0-data0'] |
| e32c9599-f0a0-42cb-83d5-c4d5dfe5317f | pthru0 | pci-passthrough | ethernet | None | [u'enp6s0f0'] | [] | [] | MTU=1500 | [u'group0-data0'] |
+--------------------------------------+----------+-----------------+----------+---------+---------------+----------+-------------+---------------------------+-------------------+

- Create a neutron port with --vnic-type=direct
- Launch a vm with a mgmt nic and another nic using the sriov port created
- Suspend and resume the vm

Expected Behavior
------------------
- VM sriov interface is still up and reachable

Actual Behavior
----------------
- VM sriov interface is DOWN. Its MAC address changed while the MAC for the neutron port stays the same.

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Any system with SRIOV configured on compute host

Branch/Pull Time/Commit
-----------------------
master as of 2019-03-18

Last Pass
--------------
non-containerized load

Timestamp/Logs
--------------
# suspend/resume
[2019-03-29 13:04:46,428] 262 DEBUG MainThread ssh.send :: Send 'nova --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://keystone.openstack.svc.cluster.local/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne suspend 2c3df224-453f-4a52-8b82-eabe468cf624'
[2019-03-29 13:04:50,680] 262 DEBUG MainThread ssh.send :: Send 'nova --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://keystone.openstack.svc.cluster.local/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne resume 2c3df224-453f-4a52-8b82-eabe468cf624'

# MAC on sriov interface changed on VM.
[wrsroot@controller-0 ~(keystone_admin)]$ openstack port list --server 2c3df224-453f-4a52-8b82-eabe468cf624
+--------------------------------------+------------------+-------------------+------------------------------------------------------------------------------+--------+
| ID | Name | MAC Address | Fixed IP Addresses | Status |
+--------------------------------------+------------------+-------------------+------------------------------------------------------------------------------+--------+
| 63a45cb8-4f86-47e1-ac7b-d25d3dfceba9 | | fa:16:3e:cb:5c:de | ip_address='172.18.2.157', subnet_id='715f01d4-c4bb-40c7-b13f-f7e07997d02e' | ACTIVE |
| 7ab55bf4-b07f-4785-a5c4-ed9eadfc0c13 | port_pci-sriov-1 | fa:16:3e:a6:5b:33 | ip_address='10.0.0.217', subnet_id='f37b673c-efcf-4aa7-b06a-b5c83400e01a' | ACTIVE |
| 8c8d35cf-242f-49f2-8c3a-379668c8a883 | | fa:16:3e:3d:2f:90 | ip_address='192.168.213.6', subnet_id='375a4cbe-111b-428f-be49-0247bf294a3f' | ACTIVE |
+--------------------------------------+------------------+-------------------+------------------------------------------------------------------------------+--------+

compute-2:~$ sudo virsh domiflist instance-000002c3
Interface Type Source Model MAC
-------------------------------------------------------
vhu8c8d35cf-24 vhostuser - virtio fa:16:3e:3d:2f:90
vhu63a45cb8-4f vhostuser - virtio fa:16:3e:cb:5c:de
- hostdev - - fa:16:3e:a6:5b:33

tenant2-multiports-pci-2:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:3d:2f:90 brd ff:ff:ff:ff:ff:ff
    inet 192.168.213.6/27 brd 192.168.213.31 scope global dynamic eth0
       valid_lft 85732sec preferred_lft 85732sec
    inet6 fe80::f816:3eff:fe3d:2f90/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:cb:5c:de brd ff:ff:ff:ff:ff:ff
    inet 172.18.2.157/24 brd 172.18.2.255 scope global dynamic eth1
       valid_lft 85734sec preferred_lft 85734sec
    inet6 fe80::f816:3eff:fecb:5cde/64 scope link
       valid_lft forever preferred_lft forever
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether d6:88:0a:b5:3c:ef brd ff:ff:ff:ff:ff:ff
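
The mismatch in the output above can be checked mechanically. The following is a minimal Python sketch, not part of the original report: it pulls the link/ether MACs out of the guest's ip addr output and lists the Neutron port MACs that no guest interface carries. The function names are hypothetical; the MAC values and interface lines are copied from the output above.

```python
import re

# Matches the MAC that follows "link/ether" in `ip addr` output.
MAC_RE = re.compile(r"link/ether\s+((?:[0-9a-f]{2}:){5}[0-9a-f]{2})", re.IGNORECASE)


def guest_macs(ip_addr_output):
    """Return the set of MAC addresses seen inside the guest (lowercased)."""
    return {mac.lower() for mac in MAC_RE.findall(ip_addr_output)}


def missing_port_macs(port_macs, ip_addr_output):
    """Return Neutron port MACs that no guest interface currently carries."""
    present = guest_macs(ip_addr_output)
    return sorted(mac.lower() for mac in port_macs if mac.lower() not in present)


# MACs from `openstack port list --server ...` in the log above.
PORT_MACS = ["fa:16:3e:cb:5c:de", "fa:16:3e:a6:5b:33", "fa:16:3e:3d:2f:90"]

# Interface lines from the guest's `ip addr` output in the log above.
IP_ADDR = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:3d:2f:90 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:cb:5c:de brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether d6:88:0a:b5:3c:ef brd ff:ff:ff:ff:ff:ff
"""

if __name__ == "__main__":
    print(missing_port_macs(PORT_MACS, IP_ADDR))
```

With the values from this report, the only port MAC missing from the guest is the one belonging to port_pci-sriov-1, which is consistent with eth3 showing an unrelated MAC.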

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as release gating. This requires further investigation to determine whether it is an upstream nova/neutron issue (which would then require an upstream bug to be opened) or something in the StarlingX environment.

Assigning to Forrest's team to reproduce and investigate.

Changed in starlingx:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Forrest Zhao (forrest.zhao)
tags: added: stx.2019.05 stx.networking
Revision history for this message
cheng li (chengli3) wrote :

Yang, what guest image were you using?

Revision history for this message
Yang Liu (yliu12) wrote :

CentOS guest.

Changed in starlingx:
assignee: Forrest Zhao (forrest.zhao) → cheng li (chengli3)
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Adding stx.distro.other to the labels as this may be an openstack bug

tags: added: stx.distro.openstack
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Adding stx.distro.openstack to the tags as this may be an openstack bug

Revision history for this message
cheng li (chengli3) wrote :

Yang, where did you get your CentOS guest image? Did you download it from the official site [1], or did you build it yourself? If you downloaded it, could you please provide the link? Thanks.

[1] http://cloud.centos.org/centos/7/images/

Revision history for this message
cheng li (chengli3) wrote :

Yang, could you please provide the dumpxml of guest VM as well? (virsh dumpxml <instance>)

Revision history for this message
Yang Liu (yliu12) wrote :

The guest image was split into 5 files via the split command; they can be combined using:
cat tis-centos-guest-* > tis-centos-guest.qcow2

Revision history for this message
Yang Liu (yliu12) wrote :

1. My previous environment is gone. I tried it with the latest load (20190404T013000Z) on a different system with CX-4 for the sriov interface, and did not reproduce the issue. I noticed the dev name changed after suspend/resume, but the MAC stayed the same. The dev name reverted after a reboot of the guest.

2. The previous system, where the issue was seen, was using Niantic (82599ES 10-Gigabit SFI/SFP+ Network Connection) for the sriov interface. I will try setting up this system again with the latest load.

Revision history for this message
Yang Liu (yliu12) wrote :

Following up on item 2 in my previous comment: the issue is reproduced on a system with a Niantic sriov interface.

Here's the virsh dump.

compute-2:~$ sudo virsh dumpxml instance-00000008
Password:
<domain type='kvm' id='3'>
  <name>instance-00000008</name>
  <uuid>1e0d75bc-8499-4e58-86f7-5bda232943d4</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="19.1.0"/>
      <nova:name>tenant2-multiports_pci-2</nova:name>
      <nova:creationTime>2019-04-04 20:49:50</nova:creationTime>
      <nova:flavor name="dedicated-1">
        <nova:memory>2048</nova:memory>
        <nova:disk>2</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>2</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="f4d5ec50ae8c43a2b519dc22e6c4e4d1">tenant2</nova:user>
        <nova:project uuid="a886e45924234b24a9fbc442053d2666">tenant2</nova:project>
      </nova:owner>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <shares>2048</shares>
    <vcpupin vcpu='0' cpuset='34'/>
    <vcpupin vcpu='1' cpuset='14'/>
    <emulatorpin cpuset='14,34'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>OpenStack Foundation</entry>
      <entry name='product'>OpenStack Nova</entry>
      <entry name='version'>19.1.0</entry>
      <entry name='serial'>1e0d75bc-8499-4e58-86f7-5bda232943d4</entry>
      <entry name='uuid'>1e0d75bc-8499-4e58-86f7-5bda232943d4</entry>
      <entry name='family'>Virtual Machine</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.4.0'>hvm</type>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu>
    <topology sockets='1' cores='1' threads='2'/>
    <numa>
      <cell id='0' cpus='0-1' memory='2097152' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
      <source protocol='rbd' name='cinder-volumes/a8ef4ae2-94c2-4e1a-9032-223f84efab0a'>
        <host name='192.168.204.3' port='6789'/>
        <host name='192.168.204.4' port='6789'/>
        <host name='192.168.204.217' port='6789'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <serial>a8ef4ae2-94c2-4e1a-9032-223f84efab0a</serial>
      <alias name='virtio-disk0'/>
     ...

Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05
Ghada Khalil (gkhalil)
tags: added: stx.retestneeded
ChenjieXu (midone)
Changed in starlingx:
assignee: cheng li (chengli3) → ChenjieXu (midone)
Revision history for this message
ChenjieXu (midone) wrote :

Hi Yang,

Are you using OVS or OVS-DPDK?

Could you please share your steps to enable SR-IOV on StarlingX? I think I missed some steps. In my environment, pci_passthrough_whitelist is not configured and thus the VM can't be scheduled. Also, the configuration "physical_device_mappings = physnet2:enp3s0f1" is not correct. It should be "physnet0:enp1s0f1".

Revision history for this message
Yang Liu (yliu12) wrote :

The only step is to add sriov interfaces. e.g.,

system host-if-modify -m 1500 -n sriov0 -N 16 -p group0-data0 -c pci-sriov compute-0 68544dbc-244c-4d24-a629-ca8e4543c6f8

Revision history for this message
ChenjieXu (midone) wrote :

Hi Yang,

Is group0-data0 a provider network? As far as I know, provider networks have been removed. Should we now use a datanetwork instead of a provider network?

Revision history for this message
Yang Liu (yliu12) wrote :

It is a data network.
system datanetwork-add group0-data0 vlan -m 1500

Revision history for this message
ChenjieXu (midone) wrote :

Hi Yang,

Could you please help me review my steps? I still can't create a VM with an SR-IOV port.
1. Deploy StarlingX 0411 AIO simplex.
2. export COMPUTE=controller-0
   PHYSNET0='physnet0'
   system host-lock controller-0
   system datanetwork-add ${PHYSNET0} vlan
   system host-if-modify -m 1500 -n sriov1 -p $DATANETWORK_UUID -c pci-sriov -N 6 controller-0 $SRIOV_INTERFACE_UUID
   system host-unlock controller-0
3. Create networks following installation guide:
   https://wiki.openstack.org/w/index.php?title=StarlingX/Containers/Installation&oldid=169189
4. net_id=`neutron net-show public-net0 | grep "\ id\ " | awk '{ print $4 }'`
   port_id=`neutron port-create $net_id --name sriov_port --binding:vnic_type direct | grep "\ id\ " | awk '{ print $4 }'`
   openstack server create --image centos --flavor my.tiny --nic port-id=$port_id test-sriov

The VM test-sriov can't be launched because "No valid host was found. There are not enough hosts available". Checking the logs of the nova scheduler, I found the following lines:
9adb3d18879c4eedb1ef69fb454995f3 444668d30d39473fa706fcaba804f708 - default default] Filtering removed all hosts for the request with instance ID '56d91045-9f47-4190-b928-ef0287f7996b'. Filter results: ['RetryFilter: (start: 1, end: 1)', 'ComputeFilter: (start: 1, end: 1)', 'AvailabilityZoneFilter: (start: 1, end: 1)', 'AggregateInstanceExtraSpecsFilter: (start: 1, end: 1)', 'ComputeCapabilitiesFilter: (start: 1, end: 1)', 'ImagePropertiesFilter: (start: 1, end: 1)', 'NUMATopologyFilter: (start: 1, end: 1)', 'ServerGroupAffinityFilter: (start: 1, end: 1)', 'ServerGroupAntiAffinityFilter: (start: 1, end: 1)', 'PciPassthroughFilter: (start: 1, end: 0)']

controller-0 is not valid because it fails to pass "PciPassthroughFilter: (start: 1, end: 0)".
According to the openstack guide for SR-IOV, pci_passthrough_whitelist should be configured:
https://docs.openstack.org/ocata/networking-guide/config-sriov.html
However, checking the configuration in the nova compute container shows that pci_passthrough_whitelist is not configured. Could you please help check "pci_passthrough_whitelist" in your nova compute container?
   kubectl get pod -n openstack | grep nova
   kubectl exec -it -n openstack nova-compute-controller-0-a762cb46-6knkl bash
   cd /etc/nova
   grep -rn "pci_passthrough_whitelist"
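
The reasoning above (finding which scheduler filter dropped the host count to zero) can be sketched in Python. This is an illustrative helper, not part of nova: the function name is hypothetical, and the regular expression assumes the "Filter results" format shown in the log line quoted above.

```python
import re

# One entry of the "Filter results" list, e.g. 'ComputeFilter: (start: 1, end: 1)'
FILTER_RE = re.compile(r"'(\w+): \(start: (\d+), end: (\d+)\)'")


def eliminating_filter(log_line):
    """Return the first filter that received hosts and passed none, or None."""
    for name, start, end in FILTER_RE.findall(log_line):
        if int(start) > 0 and int(end) == 0:
            return name
    return None


# Shortened copy of the scheduler log line quoted above.
LOG = (
    "Filter results: ['RetryFilter: (start: 1, end: 1)', "
    "'ComputeFilter: (start: 1, end: 1)', "
    "'NUMATopologyFilter: (start: 1, end: 1)', "
    "'PciPassthroughFilter: (start: 1, end: 0)']"
)

if __name__ == "__main__":
    print(eliminating_filter(LOG))
```

For the log line in this comment the helper points at PciPassthroughFilter, which is why the PCI whitelist configuration is the next thing to check.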

Revision history for this message
Yang Liu (yliu12) wrote :

Hi Chenjie,

What NIC did you use to configure the SRIOV interface? You need to use one of the supported NICs for an sriov/pcipt interface, and to reproduce this issue, an Intel 82599 (Niantic) 10 G is needed.

After the sriov interface is configured for the worker node and the stx-openstack application is applied, the passthrough_whitelist is automatically included in the system overrides of the helm chart overrides.

You should be seeing something like this:

$ lspci | grep 06
06:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
06:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
06:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
06:10.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
06:10.5 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
06:10.7 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

$ system helm-override-show stx-openstack nova openstack

| | pci: |
| | passthrough_whitelist: |
| | type: multistring |
| | values: ['{"physical_network": "group0-data0", "address": "0000:06:00.0"}', |
| | '{"class_id": "030000", "address": "0000:0c:00.0"}', '{"physical_network": ...
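
Each whitelist value in the override shown above is a JSON string, so the entries can be inspected programmatically. The following Python sketch is not from the original comment: the helper name is hypothetical, and only the two complete entries visible above are used (the third is truncated in the output and is left out).

```python
import json


def addresses_for_network(values, physical_network):
    """Return PCI addresses whitelisted for the given physical network."""
    addresses = []
    for raw in values:
        entry = json.loads(raw)  # each value is a JSON object encoded as a string
        if entry.get("physical_network") == physical_network:
            addresses.append(entry.get("address"))
    return addresses


# The two complete entries visible in the helm override output above.
VALUES = [
    '{"physical_network": "group0-data0", "address": "0000:06:00.0"}',
    '{"class_id": "030000", "address": "0000:0c:00.0"}',
]

if __name__ == "__main__":
    print(addresses_for_network(VALUES, "group0-data0"))
```

An empty result for the data network attached to the sriov interface would suggest that the whitelist override was not generated for that host.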

Revision history for this message
ChenjieXu (midone) wrote :

Hi Yang,

I'm using "Intel Corporation Ethernet Connection X722 for 10GBASE-T (rev 09)" and I don't have the NIC "Intel 82599 (Niantic) 10 G".

Revision history for this message
Yang Liu (yliu12) wrote :

It probably will not work then. Here's a list of supported NICs for SRIOV:

Intel 82599 (Niantic) 10 G >> Issue is seen with this NIC.

Intel X710/XL710 (Fortville) 10G >> Not sure if the issue can be reproduced with Fortville.

Mellanox Technologies >> This issue is not reproducible with Mellanox CX-4 though.
- MT27710 Family (ConnectX-4) 10G/25G
- MT27700 Family (ConnectX-4) 40G

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Needs to be re-assigned since the current developer doesn't have the NIC in question

tags: added: stx.helpwanted
Changed in starlingx:
assignee: ChenjieXu (midone) → nobody
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Cindy Xie (xxie1)
Revision history for this message
Cindy Xie (xxie1) wrote :

We also do not have the NIC specified in this LP.

Changed in starlingx:
assignee: Cindy Xie (xxie1) → Ghada Khalil (gkhalil)
Revision history for this message
Ghada Khalil (gkhalil) wrote :

@Yang, can you please re-test this scenario? There was a driver upgrade for the Niantic which was merged on 2019-06-11:
https://review.opendev.org/#/c/664363/

Revision history for this message
Yang Liu (yliu12) wrote :

This issue is no longer seen after the driver upgrade for Niantic.
Load: master "20190713T013000Z"
SRIOV NIC: Intel 82599 (Niantic)

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Thanks Yang.
I am going to mark this bug as Fix Released; addressed by the driver upversion: https://review.opendev.org/#/c/664363/

Changed in starlingx:
status: Triaged → Incomplete
Ghada Khalil (gkhalil)
Changed in starlingx:
status: Incomplete → Fix Released
tags: removed: stx.distro.openstack stx.helpwanted stx.retestneeded