Cluster deploy fails if one port of Intel XL710 40G dual-port NIC is allocated for DPDK and the other for SR-IOV

Bug #1583077 reported by Mikhail Chernik
Affects             Status         Importance  Assigned to            Milestone
Fuel for OpenStack  Fix Committed  High        Anastasia Balobashina
Mitaka              Fix Committed  High        Anastasia Balobashina
Newton              Fix Committed  High        Anastasia Balobashina

Bug Description

Environment: Hardware Lab, MOS 9.0 ISO 362, Intel XL710 dual-port NIC, PCI ID 8086:1583

Detailed description:
If DPDK and SR-IOV are configured on different ports of the same XL710 NIC, DPDK fails to initialize its port and ovs-vswitchd is terminated during deployment.

ovs-vswitchd output: http://paste.openstack.org/show/497462/

Steps to reproduce:

1. Create a cluster with 1 controller and 1 compute node that has an Intel XL710 dual-port NIC
2. Configure hugepages for the host and for DPDK on the compute node
3. Turn on DPDK on one port of the 40G NIC and SR-IOV on the other port of the same NIC
4. Start deployment
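
The failing layout from the steps above (DPDK on one port, SR-IOV on the other port of the same NIC) can be detected before deployment. The following is a minimal sketch, not part of Fuel. It assumes that both ports of a dual-port NIC share the same PCI domain:bus:device and differ only in the function number, as in the bus-info value 0000:81:00.1. The function names, the role strings "dpdk" and "sriov", and the example addresses are hypothetical.

```python
from collections import defaultdict


def nic_of(bus_info):
    """Drop the PCI function number: '0000:81:00.1' -> '0000:81:00'."""
    return bus_info.rsplit(".", 1)[0]


def find_mixed_nics(ports):
    """Return NICs that have both a DPDK port and an SR-IOV port.

    ports maps a PCI bus-info string to a role string
    ('dpdk', 'sriov', or anything else); both are illustrative.
    """
    roles = defaultdict(set)
    for bus_info, role in ports.items():
        roles[nic_of(bus_info)].add(role)
    return sorted(nic for nic, seen in roles.items()
                  if {"dpdk", "sriov"} <= seen)


# Hypothetical port assignment for one compute node.
ports = {
    "0000:81:00.0": "dpdk",
    "0000:81:00.1": "sriov",
    "0000:03:00.0": "dpdk",
}
print(find_mixed_nics(ports))
```

Any NIC reported by find_mixed_nics matches the configuration described in this bug.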

Expected result:
Cluster is deployed

Actual result:
Deployment failed with message 'Failed tasks: Task[netconfig/5] Stopping the deployment process!'

Lab is available for investigation.

Revision history for this message
Mikhail Chernik (mchernik) wrote :

The issue is not reproduced on a 10G 82599-based dual-port NIC (AOC-STGN-i2S, 8086:10fb) on the same host.

Deploy completed successfully.

Revision history for this message
Alexander Duyck (alexander-duyck) wrote :

More information would be useful. Can you provide the "ethtool -i" output for the interface being used for SR-IOV by the kernel?

Revision history for this message
Alexander Duyck (alexander-duyck) wrote :

It seems like everything points to a firmware bug on this part. First, we should verify the installed firmware version via "ethtool -i", as in the example below from my system with 5.0 firmware. Then we could update the firmware to see if that resolves the issue.

[root@ahduyck-xeon-server ~]# ethtool -i ens7f0
driver: i40e
version: 1.3.21-k
firmware-version: f5.0.40043 a1.5 n5.02 e2284
bus-info: 0000:81:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

The firmware update is available from:
https://downloadcenter.intel.com/download/24769
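
When the same check has to be repeated on several nodes, the "ethtool -i" output can be parsed with a few lines of Python. This is only a sketch: the function name parse_ethtool_info is hypothetical, and the sample text is a shortened copy of output in this report. It relies only on the "key: value" layout visible in the output above and splits on the first colon, so that bus-info values containing colons stay intact.

```python
def parse_ethtool_info(text):
    """Parse `ethtool -i` output into a dict keyed by field name."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as the
        # bus-info PCI address contain further colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Shortened sample in the format shown in this report.
sample = """driver: i40e
version: 1.3.47
firmware-version: 5.02 0x80002285 0.0.0
bus-info: 0000:81:00.1
"""

info = parse_ethtool_info(sample)
print(info["driver"], "|", info["firmware-version"], "|", info["bus-info"])
```

Note that the firmware-version string differs in format between driver versions (compare "f5.0.40043 a1.5 n5.02 e2284" with "5.02 0x80002285 0.0.0"), so the sketch returns it unparsed.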

Revision history for this message
Mikhail Chernik (mchernik) wrote :

root@node-5:~# ethtool -i ens11f1
driver: i40e
version: 1.3.47
firmware-version: 4.53 0x80001dca 0.0.0
bus-info: 0000:81:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

This is currently the latest available version, since Intel removed update v5.02.

Revision history for this message
Mikhail Chernik (mchernik) wrote :

It turns out that we have a NIC with 5.02 firmware:
root@node-2:~# ethtool -i ens11f1
driver: i40e
version: 1.3.47
firmware-version: 5.02 0x80002285 0.0.0
bus-info: 0000:81:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

However, nothing has changed.

Dmitry Klenov (dklenov)
tags: added: area-mos
tags: added: area-linux
removed: area-mos
Revision history for this message
Albert Syriy (asyriy) wrote :

A new version of the i40e driver, 1.5.18, has just been merged for MOS 10.0.
Could you try running the new version of the driver?
http://perestroika-repo-tst.infra.mirantis.net/mos-repos/ubuntu/master

Revision history for this message
Mikhail Chernik (mchernik) wrote :

As far as I understand, the root cause of this problem is not connected with the driver version; however, I will try. Can you please debuild openvswitch-switch-dpdk with DPDK debug logging?

Revision history for this message
Albert Syriy (asyriy) wrote :

Yes, you are right!
I found a very similar issue, and fortunately it has been fixed.
The root cause of the issue is the firmware.

Please see the link for details:
http://dpdk.org/dev/patchwork/patch/9631/

The workaround patches the i40e_ethdev.c file, but it is better to update the firmware (rev 2.5) FVL5.

I also checked the latest version of the driver. It doesn't contain the workaround in the driver code.
So the fix is in the firmware.

Revision history for this message
Mikhail Chernik (mchernik) wrote :

Request for adding this issue to release notes: https://bugs.launchpad.net/fuel/+bug/1587867
Closed for 9.0.

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Related fix proposed to mos/mos-docs (master)

Related fix proposed to branch: master
Change author: Olena Logvinova <email address hidden>
Review: https://review.fuel-infra.org/22403

tags: added: release-notes
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Related fix merged to mos/mos-docs (master)

Reviewed: https://review.fuel-infra.org/22403
Submitter: Evgeny Konstantinov <email address hidden>
Branch: master

Commit: 9bfce94c4aec46c05e744432fd307a020d4c62c7
Author: Olena Logvinova <email address hidden>
Date: Wed Jun 22 13:20:42 2016

[RN 9.0] [NFV] [Intel XL710 40] known issues

This patch adds the following bugs and their workarounds
to the RN 9.0 Known issues section:

https://bugs.launchpad.net/fuel/+bug/1583077
https://bugs.launchpad.net/fuel/+bug/1587310

Change-Id: Ie15d0888afd46599a0ed421bd757bf1d471504a5
Related-Bug: #1583077
Related-Bug: #1587310
Closes-Bug: #1587867

tags: added: release-notes-done
removed: release-notes
Changed in fuel:
assignee: MOS Scale (mos-scale) → Anastasia Balobashina (atolochkova)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/411316
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=53f2fe91c9468e03694b8453d5c03fa1dbb8800d
Submitter: Jenkins
Branch: master

commit 53f2fe91c9468e03694b8453d5c03fa1dbb8800d
Author: Anastasiya <email address hidden>
Date: Thu Dec 15 17:22:18 2016 +0400

    Replace dpdk driver to vfio-pci if sriov is enabled

    Change-Id: Ic87b926b4f547f91b2f130830b35fafc195ada92
    Partial-Bug: #1583077
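
The rule named in the commit title can be illustrated as follows. This is a hypothetical sketch of the selection logic only, not the actual fuel-web code. The function name, the sriov_enabled parameter, and the default driver name igb_uio are assumptions made for the example; only vfio-pci comes from the commit message.

```python
def select_dpdk_driver(sriov_enabled, default_driver="igb_uio"):
    """Pick the kernel driver used to bind a DPDK port.

    sriov_enabled: whether SR-IOV is in use alongside DPDK
    (the exact condition checked by fuel-web is not shown in this report).
    default_driver: assumed default, used when SR-IOV is off.
    """
    if sriov_enabled:
        return "vfio-pci"
    return default_driver


print(select_dpdk_driver(sriov_enabled=True))
print(select_dpdk_driver(sriov_enabled=False))
```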

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/414536

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/415196

Changed in fuel:
milestone: 10.0 → 11.0
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (stable/mitaka)

Reviewed: https://review.openstack.org/414536
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=7ea2d025255946325ed9e107c90a821cae52625d
Submitter: Jenkins
Branch: stable/mitaka

commit 7ea2d025255946325ed9e107c90a821cae52625d
Author: Anastasiya <email address hidden>
Date: Thu Dec 15 17:22:18 2016 +0400

    Replace dpdk driver to vfio-pci if sriov is enabled

    Change-Id: Ic87b926b4f547f91b2f130830b35fafc195ada92
    Partial-Bug: #1583077
    (cherry picked from commit 53f2fe91c9468e03694b8453d5c03fa1dbb8800d)

tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (stable/newton)

Reviewed: https://review.openstack.org/415196
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=9c984a1ad3c32c6fc9e43da99eba39f6d9c22b29
Submitter: Jenkins
Branch: stable/newton

commit 9c984a1ad3c32c6fc9e43da99eba39f6d9c22b29
Author: Anastasiya <email address hidden>
Date: Thu Dec 15 17:22:18 2016 +0400

    Replace dpdk driver to vfio-pci if sriov is enabled

    Change-Id: Ic87b926b4f547f91b2f130830b35fafc195ada92
    Partial-Bug: #1583077
    (cherry picked from commit 53f2fe91c9468e03694b8453d5c03fa1dbb8800d)

tags: added: in-stable-newton
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Related fix proposed to mos/mos-docs (master)

Related fix proposed to branch: master
Change author: Evgeny Konstantinov <email address hidden>
Review: https://review.fuel-infra.org/30384

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on mos/mos-docs (master)

Change abandoned by Evgeny Konstantinov <email address hidden> on branch: master
Review: https://review.fuel-infra.org/30384
Reason: will do a new commit

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Related fix proposed to mos/mos-docs (master)

Related fix proposed to branch: master
Change author: Evgeny Konstantinov <email address hidden>
Review: https://review.fuel-infra.org/30429

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Related fix merged to mos/mos-docs (master)

Reviewed: https://review.fuel-infra.org/30429
Submitter: Mariia Zlatkova <email address hidden>
Branch: master

Commit: fcd84a46e56a2f316ff2cc1f55fa116775fa148b
Author: Evgeny Konstantinov <email address hidden>
Date: Thu Feb 2 14:57:32 2017

Add DPDK SR-IOV resolved issue to relnotes 9.2

Change-Id: I39b36f7a50b97f06bb909cbdbb7ed63703bab72a
Related-Bug: #1583077

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Related fix proposed to mos/mos-docs (stable/9.2)

Related fix proposed to branch: stable/9.2
Change author: Evgeny Konstantinov <email address hidden>
Review: https://review.fuel-infra.org/30444

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Related fix merged to mos/mos-docs (stable/9.2)

Reviewed: https://review.fuel-infra.org/30444
Submitter: Mariia Zlatkova <email address hidden>
Branch: stable/9.2

Commit: 223d9cd4197fddead56e4a066281632a010e4db6
Author: Evgeny Konstantinov <email address hidden>
Date: Thu Feb 2 16:08:03 2017

Add DPDK SR-IOV resolved issue to relnotes 9.2

Change-Id: I39b36f7a50b97f06bb909cbdbb7ed63703bab72a
Related-Bug: #1583077
(cherry picked from commit fcd84a46e56a2f316ff2cc1f55fa116775fa148b)
