ovs cannot use unaccelerated interfaces

Bug #1834556 reported by Litao Gao
This bug affects 1 person
Affects   | Status       | Importance | Assigned to | Milestone
StarlingX | Fix Released | Medium     | marvin Yu   |

Bug Description

Brief Description
-----------------
This was on AIO-SX (stx.2.0 milestone-3).

# Create a data network and assign it to an unaccelerated interface. Here DATA0IFUUID is the UUID of the unaccelerated interface; in my case it is 4507463c-3c71-44be-9d57-0bae63d38b5e (port name eno4).

PHYSNET0='physnet0'
system datanetwork-add ${PHYSNET0} flat
system host-if-modify -m 1500 -n data0 -c data ${COMPUTE} ${DATA0IFUUID}
system interface-datanetwork-assign ${COMPUTE} ${DATA0IFUUID} ${PHYSNET0}

[root@controller-0 log(keystone_admin)]# system host-if-list controller-0
+--------------------------------------+-------+----------+----------+------+-------------+------+------+----------------------------+---------------+
| uuid                                 | name  | class    | type     | vlan | ports       | uses | used | attributes                 | data networks |
|                                      |       |          |          | id   |             | i/f  | by   |                            |               |
|                                      |       |          |          |      |             |      | i/f  |                            |               |
+--------------------------------------+-------+----------+----------+------+-------------+------+------+----------------------------+---------------+
| 2783d422-10bb-4ccf-bd80-7fa2d84ed7cc | eno1  | platform | ethernet | None | [u'eno1']   | []   | []   | MTU=1500                   | []            |
| 4507463c-3c71-44be-9d57-0bae63d38b5e | data0 | data     | ethernet | None | [u'eno4']   | []   | []   | MTU=1500,accelerated=False | [u'physnet0'] |
| b07fcce7-3e9a-4cb5-bc58-8a285a54869f | lo    | platform | virtual  | None | []          | []   | []   | MTU=1500                   | []            |
| d45e77dc-7727-4b17-9060-d6bcf1a30c4a | data1 | data     | ethernet | None | [u'ens1f0'] | []   | []   | MTU=1500,accelerated=True  | [u'physnet1'] |
+--------------------------------------+-------+----------+----------+------+-------------+------+------+----------------------------+---------------+

# After host unlock, port eno4 reports the error 'Device or resource busy'.

[root@controller-0 log(keystone_admin)]# ovs-vsctl show

    Bridge "br-phy0"
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "br-phy0"
            Interface "br-phy0"
                type: internal
        Port "phy-br-phy0"
            Interface "phy-br-phy0"
                type: patch
                options: {peer="int-br-phy0"}
        Port "eno4"
            Interface "eno4"
                error: "could not add network device eno4 to ofproto (Device or resource busy)"

# It seems that eno4 is already in use by the Linux bridge br-eno4.
[root@controller-0 log(keystone_admin)]# brctl show
bridge name     bridge id               STP enabled     interfaces
br-eno4         8000.ac162d72cd3f       no              eno4
docker0         8000.024203ebe05c       no

As we know, with OVS-DPDK we cannot use unaccelerated interfaces, but plain OVS should be able to handle them.
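
The 'Device or resource busy' error is consistent with eno4 already being enslaved to br-eno4. A minimal Python sketch for checking this on a host follows. It assumes a Linux system whose sysfs exposes a `master` symlink under /sys/class/net/<ifname> for enslaved devices; the function name bridge_master and the sysfs_root parameter are illustrative and not part of StarlingX.

```python
import os


def bridge_master(ifname, sysfs_root="/sys/class/net"):
    """Return the name of the device that has enslaved ifname, or None.

    Assumption: an enslaved interface has a 'master' symlink in its
    sysfs directory that points at the master (bridge) device.
    """
    master_link = os.path.join(sysfs_root, ifname, "master")
    if not os.path.islink(master_link):
        return None
    # The last path component of the symlink target is the master's name.
    return os.path.basename(os.readlink(master_link))
```

On the system described here, bridge_master("eno4") would be expected to report br-eno4 while the Linux bridge holds the port, which is the point at which OVS can no longer claim the device.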

[sysadmin@controller-0 ~(keystone_admin)]$ system show
+----------------------+--------------------------------------+
| Property             | Value                                |
+----------------------+--------------------------------------+
| contact              | None                                 |
| created_at           | 2019-06-25T07:27:12.548784+00:00     |
| description          | None                                 |
| https_enabled        | False                                |
| location             | None                                 |
| name                 | c5dc54f9-65b3-40b4-b459-f765c4d25031 |
| region_name          | RegionOne                            |
| sdn_enabled          | False                                |
| security_feature     | spectre_meltdown_v1                  |
| service_project_name | services                             |
| software_version     | 19.01                                |
| system_mode          | simplex                              |
| system_type          | All-in-one                           |
| timezone             | UTC                                  |
| updated_at           | 2019-06-25T07:28:58.556992+00:00     |
| uuid                 | 769fa94f-db56-44d8-bf03-49c75fd2f3e2 |
| vswitch_type         | none                                 |
+----------------------+--------------------------------------+

Severity
--------
Major

Steps to Reproduce
------------------
See above

Expected Behavior
------------------
Unaccelerated interfaces should be usable by OVS and configured correctly.

Actual Behavior
----------------
See above

Reproducibility
---------------
Yes

System Configuration
--------------------
One node
Should be applicable to other configs

Branch/Pull Time/Commit
-----------------------
OS="centos"
SW_VERSION="19.01"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20190621T013000Z"

JOB="STX_build_master_master"
<email address hidden>"
BUILD_NUMBER="154"
BUILD_HOST="starlingx_mirror"
BUILD_DATE="2019-06-21 01:30:00 +0000"

Last Pass
---------
Don't know

Timestamp/Logs
--------------
puppet attached

Test Activity
-------------
Evaluation

Litao Gao (gaolitao)
tags: added: stx.2.0
tags: added: stx.config
Revision history for this message
Ghada Khalil (gkhalil) wrote :

@Litao, in the future, please do not add tags. This is done during screening.

tags: added: stx.networking
removed: stx.config
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Please attach the collect logs from your system.

Changed in starlingx:
status: New → Incomplete
assignee: nobody → Litao Gao (gaolitao)
Revision history for this message
Litao Gao (gaolitao) wrote :

/var/log/puppet/2019-07-01-00-37-51_worker

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Logs provided. Assigning to Forrest's team to investigate.

Changed in starlingx:
assignee: Litao Gao (gaolitao) → Forrest Zhao (forrest.zhao)
status: Incomplete → Triaged
importance: Undecided → Medium
Changed in starlingx:
assignee: Forrest Zhao (forrest.zhao) → marvin Yu (marvin-yu)
Revision history for this message
marvin Yu (marvin-yu) wrote :

Hi Litao,
I tried it in a VM and could not reproduce the issue. Did I miss any steps?

Revision history for this message
marvin Yu (marvin-yu) wrote :

Hi Litao,
I created a data network and assigned it to an unaccelerated interface. The interface is named "bugdata" and its UUID is 464dca96-c18b-4850-8999-ba77e0bd6c42.

[sysadmin@controller-0 ~(keystone_admin)]$ system host-if-list controller-0
+--------------------------------------+---------+----------+----------+------+--------------+------+------+-----------------------+----------------+
| uuid                                 | name    | class    | type     | vlan | ports        | uses | used | attributes            | data networks  |
|                                      |         |          |          | id   |              | i/f  | by   |                       |                |
|                                      |         |          |          |      |              |      | i/f  |                       |                |
+--------------------------------------+---------+----------+----------+------+--------------+------+------+-----------------------+----------------+
| 1e3dbf08-e36a-45fd-a5f6-8abb55053a64 | data1   | data     | ethernet | None | [u'eth1001'] | []   | []   | MTU=1500,accelerated= | [u'physnet1']  |
|                                      |         |          |          |      |              |      |      | True                  |                |
|                                      |         |          |          |      |              |      |      |                       |                |
| 464dca96-c18b-4850-8999-ba77e0bd6c42 | bugdata | data     | ethernet | None | [u'enp0s10'] | []   | []   | MTU=1500,accelerated= | [u'bugdevice'] |
|                                      |         |          |          |      |              |      |      | False                 |                |
|                                      |         |          |          |      |              |      |      |                       |                |
| 7165f351-daa7-4d1d-9479-bdbe8a6ca53a | enp0s3  | platform | ethernet | None | [u'enp0s3']  | []   | []   | MTU=1500              | []             |
| 8401ce1a-ea1c-413d-9e36-ceb774f7fe54 | data0   | data     | ethernet | None | [u'eth1000'] | []   | []   | MTU=1500,accelerated= | [u'physnet0']  |
|                                      |         |          |          |      |              |      |      | True                  |                |
|                                      |         |          |          |      |              |      |      |                       |                |
| f4115eee-99b5-474d-9452-488c233d6975 | lo      | platform | virtual  | None | []           | []   | []   | MTU=1500              | []             |
+--------------------------------------+---------+----------+----------+------+--------------+------+------+-----------------------+----------------+

After host unlock, enp0s10 seems to work well.

[sysadmin@controller-0 ~(keystone_admin)]$ sudo ovs-vsctl show
Password:
7d554c33-3cc2-4c29-b0dd-f305c6ab48f8
    Manager "ptcp:6640:127.0.0.1"
    Bridge "br-phy0"
        Controller "tcp:127.0.0.1:6633"
        fail_mode: secure
        Port "eth1000"
            Interface "eth1000"
        Port "enp0s10"
            Interface "enp0s10"
        Port "p...


Revision history for this message
Litao Gao (gaolitao) wrote :

Hi Marvin,

The interface-related configuration is pasted in this comment.

I am using the STX R2 milestone-3 ISO image for this testing.

Per the related code (sysinv/puppet/interface.py), it seems that a bridge is created if
the interface is a data ethernet interface and is not DPDK compatible.

        bridge = None
        if (iface['iftype'] == constants.INTERFACE_TYPE_ETHERNET and
                is_data_interface(context, iface) and
                not is_dpdk_compatible(context, iface)):
            bridge = 'br-' + get_interface_os_ifname(context, iface)
        iface['_bridge'] = bridge  # cache the result

[root@controller-0 ~(keystone_admin)]# system host-port-list controller-0
+--------------------------------------+--------+----------+--------------+--------+-----------+-------------+-----------------------------------------+
| uuid                                 | name   | type     | pci address  | device | processor | accelerated | device type                             |
+--------------------------------------+--------+----------+--------------+--------+-----------+-------------+-----------------------------------------+
| 1716a76f-cf19-46eb-bdde-6564357ac385 | eno1   | ethernet | 0000:03:00.0 | 0      | 0         | False       | NetXtreme BCM5719 Gigabit Ethernet PCIe |
| 44b2628a-2280-47c3-b437-ccaf37257233 | eno2   | ethernet | 0000:03:00.1 | 0      | 0         | False       | NetXtreme BCM5719 Gigabit Ethernet PCIe |
| 907fb9c0-b930-4f91-8753-de3022ef1368 | eno3   | ethernet | 0000:03:00.2 | 0      | 0         | False       | NetXtreme BCM5719 Gigabit Ethernet PCIe |
| bc002f2f-f606-4f4e-acac-ef25432e20d6 | eno4   | ethernet | 0000:03:00.3 | 0      | 0         | False       | NetXtreme BCM5719 Gigabit Ethernet PCIe |
| e83801d1-96a5-4fdd-9a90-4933213c77ed | ens1f0 | ethernet | 0000:04:00.0 | 0      | 0         | True        | 82599ES 10-Gigabit SFI/SFP+ Network     |
|                                      |        |          |              |        |           |             | Connection                              |
|                                      |        |          |              |        |           |             |                                         |
| 29e6496f-dd93-4d77-8554-06154ac7ebe1 | ens1f1 | ethernet | 0000:04:00.1 | 0      | 0         | True        | 82599ES 10-Gigabit SFI/SFP+ Network     |
|                                      |        |          |              |        |           |             | Connection                              |
|                                      |        |          |              |        |           |             |                                         |
+--------------------------------------+--------+----------+--------------+--------+-----------+-------------+-----------------------------------------+
[root@controller-0 ~(keystone_admin)]# system host-port-show controller-0 eno4
+-----------------------+-----------------------------------------+
| Property | Value |
+-----------------------+-----------------------------------------+
| name | eno4 ...


Revision history for this message
marvin Yu (marvin-yu) wrote :

Hi Litao,

I tried it in a VM and could not reproduce the issue. In my test environment, a bridge named 'br-<ifname>'
was never created.

Did you create the bridge br-eno4 manually?

What configuration did you specify for the eno4 interface?

Can you provide the file at /opt/platform/puppet/XXXX/hieradata/<hostaddr>.yaml?

Revision history for this message
Litao Gao (gaolitao) wrote :

Hi Marvin,

Are you also using milestone3?

The br-eno4 is not created manually.

/opt/platform/puppet/19.01/hieradata/192.168.204.3.yaml has been attached.

Revision history for this message
marvin Yu (marvin-yu) wrote :

I was using an ISO image with the same BUILD_ID, but I'm not sure it is exactly the same.

Could you provide the ISO download link?

By the way, what is your test environment (VM or bare metal)?

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Matt Peters believes this is a duplicate of https://bugs.launchpad.net/starlingx/+bug/1832697
Fixed by https://review.opendev.org/#/c/670286/
Merged on 2019-07-23

Marvin, do you agree?

Revision history for this message
marvin Yu (marvin-yu) wrote :

I think this bug is different from that one.

After investigating, the cause is as follows:

When a data network is created and assigned to an interface that is not DPDK compatible (is_dpdk_compatible is false), a Linux bridge configuration is generated for that interface. Details below:

----------------192.168.204.3.yaml-----------------------
 !!python/unicode 'enp0s8':
    ensure: present
    family: inet
    hotplug: 'false'
    method: manual
    mtu: '1500'
    onboot: 'true'
    options:
      BRIDGE: !!python/unicode 'br-enp0s8'
      LINKDELAY: '20'
----------------------------------------------------------

After host unlock, the interface 'enp0s8' is in use by 'br-enp0s8', but the OVS agent still automatically adds the interface to the OVS bridge 'br-phy1'.

------------openstack-neutron.yaml-------------------------
    Bridge "br-phy1"
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "phy-br-phy1"
            Interface "phy-br-phy1"
                type: patch
                options: {peer="int-br-phy1"}
        Port "enp0s8"
            Interface "enp0s8"
                error: "could not add network device enp0s8 to ofproto (Device or resource busy)"
----------------------------------------------------------
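
To spot this condition across many ports, the error lines can be pulled out of the `ovs-vsctl show` text. The following is a rough sketch only: find_interface_errors is a hypothetical helper, and it assumes the indented Port/Interface/error layout seen in the output above.

```python
import re


def find_interface_errors(show_output):
    """Map interface name -> error string found in `ovs-vsctl show` text."""
    errors = {}
    current = None  # name of the most recent Interface line seen
    for line in show_output.splitlines():
        stripped = line.strip()
        match = re.match(r'Interface "?([^"]+)"?$', stripped)
        if match:
            current = match.group(1)
            continue
        match = re.match(r'error: "(.*)"$', stripped)
        if match and current is not None:
            errors[current] = match.group(1)
    return errors
```

Fed the output above, this would return a single entry for enp0s8 carrying the 'Device or resource busy' message, while healthy interfaces such as phy-br-phy1 are omitted.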

@Matt, I do not know the details of the intended design. What should I do?

The related code flow:
----------------------------------------------------------------------------------------------------------
def get_interface_network_config(...):
    ......
    # Add type specific options
    if iface['iftype'] == constants.INTERFACE_TYPE_VLAN:
        config = get_vlan_network_config(context, iface, config)
    elif iface['iftype'] == constants.INTERFACE_TYPE_AE:
        config = get_bond_network_config(context, iface, config)
    else:
        config = get_ethernet_network_config(context, iface, config)
    ......
----------------------------------------------------------------------------------------------------------
def get_ethernet_network_config(...):
    if is_bridged_interface(context, iface):
        options['BRIDGE'] = get_bridge_interface_name(context, iface)
    elif ......:
        ......
----------------------------------------------------------------------------------------------------------
def is_bridged_interface(context, iface):

    if '_bridged' in iface: # check the cached result
        return iface['_bridged']
    else:
        bridge = get_bridge_interface_name(context, iface)
        iface['_bridged'] = bool(bridge) # cache the result
        return iface['_bridged']
----------------------------------------------------------------------------------------------------------
def get_bridge_interface_name(context, iface):
    """
    If the given interface is a bridge member then retrieve the bridge
    interface name otherwise return None.
    """
    if '_bridge' in iface: # check the cached result
        return iface['_bridge']
    else:
        bridge = None
        if (iface['iftype'] == constants.INTERFACE_TYPE_ETHERNET and
                is_data_interface(context, iface) and
                not is_dpdk_compatible(context, iface)):
            bridge = 'br-' + get_interface_os_ifname(context, iface)
        iface['...


Revision history for this message
Matt Peters (mpeters-wrs) wrote :

get_bridge_interface_name will need an additional check (not is_vswitch_type_unaccelerated) to determine if the vswitch type supports accelerated interfaces. If it doesn't support accelerated interfaces, a bridge should not be created.
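
The suggested check could look roughly like the sketch below. It is not the merged patch: the dict keys ('ifname', 'is_data', 'dpdk_compatible') and the vswitch_accelerated argument are stand-ins for the sysinv helpers (get_interface_os_ifname, is_data_interface, is_dpdk_compatible) and for whatever vswitch-type lookup the real code uses.

```python
INTERFACE_TYPE_ETHERNET = 'ethernet'  # stand-in for constants.INTERFACE_TYPE_ETHERNET


def get_bridge_interface_name(iface, vswitch_accelerated):
    """Return the Linux bridge name for iface, or None if no bridge is needed.

    iface is a plain dict with hypothetical keys 'ifname', 'iftype',
    'is_data' and 'dpdk_compatible'. vswitch_accelerated states whether
    the configured vswitch type is an accelerated one (e.g. OVS-DPDK).
    """
    if '_bridge' in iface:  # check the cached result
        return iface['_bridge']
    bridge = None
    if (iface['iftype'] == INTERFACE_TYPE_ETHERNET and
            iface['is_data'] and
            not iface['dpdk_compatible'] and
            vswitch_accelerated):  # the additional check
        bridge = 'br-' + iface['ifname']
    iface['_bridge'] = bridge  # cache the result
    return bridge
```

With vswitch_accelerated set to False, an unaccelerated data interface such as eno4 gets no 'br-eno4' bridge, so the OVS agent is free to add the port to its own bridge.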

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/675811

Changed in starlingx:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/675811
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=8a4fd5dc79565a29d359bc2d99841e7ebea7e1ef
Submitter: Zuul
Branch: master

commit 8a4fd5dc79565a29d359bc2d99841e7ebea7e1ef
Author: marvin <email address hidden>
Date: Mon Aug 12 02:26:15 2019 +0000

    Add an additional check for data interface configuration

    If an interface is data interface and the vswitch type is not
    accelerated, the data interface is a member of a linux bridge
    and then the linux bridge will occupy the interface after host
    unlock. if so, it will cause conflict with ovs bridge when ovs
    agent add the interface to ovs bridge.
    this patch add an additional check to determine if the vswitch
    type supports accelerated interfaces. If it doesn`t support
    accelerated interfaces, a bridge should not be created.

    Change-Id: I0fbae866c364fe7b787aa850db79bca1bf597389
    Closes-bug: #1834556
    Signed-off-by: marvin <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Revision history for this message
Ghada Khalil (gkhalil) wrote :

@Marvin, please cherrypick this to the stx.2.0 release branch.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (r/stx.2.0)

Fix proposed to branch: r/stx.2.0
Review: https://review.opendev.org/676299

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (r/stx.2.0)

Reviewed: https://review.opendev.org/676299
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=5b65f38599d4c1286875770562ba9101a2f24c1f
Submitter: Zuul
Branch: r/stx.2.0

commit 5b65f38599d4c1286875770562ba9101a2f24c1f
Author: marvin <email address hidden>
Date: Mon Aug 12 02:26:15 2019 +0000

    Add an additional check for data interface configuration

    If an interface is data interface and the vswitch type is not
    accelerated, the data interface is a member of a linux bridge
    and then the linux bridge will occupy the interface after host
    unlock. if so, it will cause conflict with ovs bridge when ovs
    agent add the interface to ovs bridge.
    this patch add an additional check to determine if the vswitch
    type supports accelerated interfaces. If it doesn`t support
    accelerated interfaces, a bridge should not be created.

    Change-Id: I0fbae866c364fe7b787aa850db79bca1bf597389
    Closes-bug: #1834556
    Signed-off-by: marvin <email address hidden>
    (cherry picked from commit 8a4fd5dc79565a29d359bc2d99841e7ebea7e1ef)

Ghada Khalil (gkhalil)
tags: added: in-r-stx20