Could not open one of the dpdk network devices (No such device) after compute node reboot when balance-tcp bonding is configured

Bug #1676329 reported by Sergii
Affects             Status         Importance  Assigned to  Milestone
Fuel for OpenStack  Fix Committed  High        Ivan Suzdal

Bug Description

Detailed bug description:
 After rebooting a compute node, one of the dpdk interfaces belonging to the balance-tcp bond is lost.

 ovs-vsctl show:

    Bridge br-prv
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "bond0"
            Interface "dpdk1"
                type: dpdk
                error: "could not open network device dpdk1 (No such device)"
            Interface "dpdk0"
                type: dpdk
        Port br-prv
            Interface br-prv
                type: internal
        Port phy-br-prv
            Interface phy-br-prv
                type: patch
                options: {peer=int-br-prv}
    ovs_version: "2.6.1"

Steps to reproduce:
 1. Deploy a cluster with dpdk and bond mode balance-tcp (in my case, with 2 compute nodes).
 2. Configure LACP in ovs and on the hardware switch

        ovs-vsctl set port bond0 lacp=active
        ovs-vsctl set port bond0 other_config:lacp-time=slow

 3. Check that LACP is negotiated

        root@node-3:~# ovs-appctl bond/show bond0
        ---- bond0 ----
        bond_mode: balance-tcp
        bond may use recirculation: yes, Recirc-ID : 1
        bond-hash-basis: 0
        updelay: 3000 ms
        downdelay: 1000 ms
        next rebalance: 7774 ms
        lacp_status: negotiated
        active slave mac: 00:25:90:0a:4b:dc(dpdk0)

 4. Reboot any compute node

 5. Look at port status in ovs

        ovs-vsctl show

Expected results:
 LACP works correctly.
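
 The expected state can be checked from the same `ovs-appctl bond/show bond0` output used in step 3. The following is a minimal sketch, not part of the original report: the helper names `lacp_status` and `lacp_negotiated` are hypothetical, and the functions only parse text supplied on stdin.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Open vSwitch).

# Reads `ovs-appctl bond/show <bond>` output on stdin and prints the value
# of the lacp_status field, for example "negotiated".
lacp_status() {
    awk -F': *' '
        { sub(/^[ \t]+/, "") }                 # tolerate indented output
        $1 == "lacp_status" { print $2; exit }
    '
}

# Succeeds only when the bond output on stdin reports negotiated LACP.
lacp_negotiated() {
    [ "$(lacp_status)" = "negotiated" ]
}
```

 On a compute node this could be used as `ovs-appctl bond/show bond0 | lacp_negotiated && echo OK`.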

Actual result:
 Could not open network device dpdk0 or dpdk1 (No such device). Only one interface of the bond is available.
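
 To spot the affected interface without reading the whole `ovs-vsctl show` listing, the output can be filtered for interfaces that carry an `error:` line. This is a rough sketch that assumes the output layout shown in the bug description; `failed_ifaces` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name, not part of Open vSwitch).
# Reads `ovs-vsctl show` output on stdin and prints the name of every
# interface that is followed by an "error:" line.
failed_ifaces() {
    awk '
        $1 == "Interface" { name = $2; gsub(/"/, "", name) }  # remember current interface
        $1 == "error:"    { print name }                      # error belongs to that interface
    '
}
```

 On a compute node this could be run as `ovs-vsctl show | failed_ifaces`. It prints nothing when no interface reports an error.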

Reproducibility:
 Checked on builds 1499, 1507, and 1513. Reproduces every time a compute node is rebooted.

Description of the environment:
 Fuel 10 build 1513
 1 controller, 2 computes, 1 BASE-OS node. Bonding 2x10G on both compute nodes with DPDK and balance-tcp mode.

Additional information:

 root@node-1:~# dpdk-devbind -s

Network devices using DPDK-compatible driver
============================================
0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv=igb_uio unused=ixgbe
0000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection' drv=igb_uio unused=ixgbe

Network devices using kernel driver
===================================
0000:0a:00.0 'I350 Gigabit Network Connection' if=enp10s0f0 drv=igb unused=igb_uio
0000:0a:00.1 'I350 Gigabit Network Connection' if=enp10s0f1 drv=igb unused=igb_uio
0000:81:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' if=ens11f0,ens11f0d1 drv=i40e unused=igb_uio
0000:81:00.1 'Ethernet Controller XL710 for 40GbE QSFP+' if=ens11f1,ens11f1d1 drv=i40e unused=igb_uio
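
The listing above shows both 82599ES ports bound to igb_uio at the time the command was run. A small sketch for extracting that information from `dpdk-devbind -s` output follows; it assumes the `drv=<driver>` field format shown above, and `bound_to_driver` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name, not part of DPDK).
# Reads `dpdk-devbind -s` output on stdin and prints the PCI address of
# every device whose active driver matches $1 (default: igb_uio).
bound_to_driver() {
    awk -v drv="drv=${1:-igb_uio}" '
        {
            # Only the active "drv=" field counts; "unused=" is ignored.
            for (i = 2; i <= NF; i++)
                if ($i == drv) { print $1; break }
        }
    '
}
```

On a compute node this could be run as `dpdk-devbind -s | bound_to_driver igb_uio`. If both bond members are listed after boot while ovs still reports "No such device", the problem is more likely one of timing than of driver binding.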

Fuel diagnostic snapshot:
 http://mos-scale-share.mirantis.com/sgudz/fuel-snapshot-2017-03-27_08-56-43.tar

Sergii (sgudz)
description: updated
Oleksiy Molchanov (omolchanov) wrote :

Please attach diagnostic snapshot.

Changed in fuel:
status: New → Incomplete
assignee: nobody → Sergii (sgudz)
Sergii (sgudz) wrote :

Added snapshot

description: updated
Changed in fuel:
status: Incomplete → Confirmed
assignee: Sergii (sgudz) → Fuel Sustaining (fuel-sustaining-team)
importance: Undecided → High
Pavel (p.petrov)
tags: added: blocker-for-qa
Pavel (p.petrov)
tags: removed: blocker-for-qa
Atsuko Ito (yottatsa) wrote :

--- dpdk.service.bak 2017-03-30 17:23:58.113055992 +0000
+++ /lib/systemd/system/dpdk.service 2017-03-30 17:19:46.886305041 +0000
@@ -2,6 +2,7 @@
 Description=DPDK runtime environment
 DefaultDependencies=false
 After=network-pre.target local-fs.target
+Before=openvswitch-nonetwork.service

 [Service]
 Type=oneshot
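
The `Before=` line in the diff above asks systemd to start dpdk.service ahead of openvswitch-nonetwork.service, presumably so that the DPDK environment is ready when ovs opens the dpdk ports. Whether the ordering is in effect can be checked from `systemctl show -p Before dpdk.service`. The sketch below assumes both services are managed by systemd on the node; `ordered_before` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name, not part of systemd).
# Reads `systemctl show -p Before <unit>` output on stdin, which looks like
# "Before=a.service b.target ...", and succeeds if the unit named in $1
# appears in that list.
ordered_before() {
    # Put each unit name on its own line, then look for an exact match.
    tr ' =' '\n\n' | grep -qxF -- "$1"
}
```

After reloading unit files with `systemctl daemon-reload`, this could be used as `systemctl show -p Before dpdk.service | ordered_before openvswitch-nonetwork.service && echo ordered`.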

Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Ivan Suzdal (isuzdal)
Atsuko Ito (yottatsa) wrote :

Could you please commit it tomorrow?

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to packages/xenial/dpdk (master)

Fix proposed to branch: master
Change author: Ivan Suzdal <email address hidden>
Review: https://review.fuel-infra.org/32688

Changed in fuel:
status: Confirmed → In Progress
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to packages/xenial/dpdk (10.0/newton)

Fix proposed to branch: 10.0/newton
Change author: Ivan Suzdal <email address hidden>
Review: https://review.fuel-infra.org/32689

Michael Polenchuk (mpolenchuk) wrote :

Vladimir, your comment here https://review.openstack.org/418821
<quote>
right now we dont use systemctl neither for ovs nor dpdk
</quote>

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to packages/xenial/dpdk (master)

Reviewed: https://review.fuel-infra.org/32688
Submitter: Pkgs Jenkins <email address hidden>
Branch: master

Commit: 6218dfb1cd8b5f46af18090080916134646f40fe
Author: Ivan Suzdal <email address hidden>
Date: Thu Mar 30 17:29:29 2017

Add "before" condition for dpdk service

Change-Id: Id7d479b9bf1790ca68e80212e553b8fef848dbde
Closes-Bug: #1676329

Changed in fuel:
status: In Progress → Fix Committed
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to packages/xenial/dpdk (10.0/newton)

Reviewed: https://review.fuel-infra.org/32689
Submitter: Pkgs Jenkins <email address hidden>
Branch: 10.0/newton

Commit: d0764a0c6bbae4d250f73e08c28a766b953409a9
Author: Ivan Suzdal <email address hidden>
Date: Thu Mar 30 17:29:51 2017

Add "before" condition for dpdk service

Change-Id: Id7d479b9bf1790ca68e80212e553b8fef848dbde
Closes-Bug: #1676329
