Many PMD driver options are disabled, including "BNX2X"!

Bug #1559408 reported by Thiago Martins
This bug affects 2 people
Affects: dpdk (Ubuntu)
Status: Won't Fix
Importance: Low
Assigned to: Unassigned

Bug Description

Guys,

 After deep research on the following problem:

 Xenial - OpenvSwitch with DPDK binding to 10G NIC, not working:
 https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2016-March/016287.html

 I realized that, going against the DPDK documentation:

 http://dpdk.org/doc/guides/nics/bnx2x.html

 The option "CONFIG_RTE_LIBRTE_BNX2X_PMD" is DISABLED by default! Not enabled, as the doc says (CONFIG_RTE_LIBRTE_BNX2X_PMD (default y))...

 Take a look:

---
cd ~/sources/dpdk

apt source dpdk

cd dpdk-2.2.0

$ grep PMD config/common_linuxapp | grep ^CONFIG | grep n
CONFIG_RTE_LIBRTE_MLX4_PMD=n
CONFIG_RTE_LIBRTE_MLX5_PMD=n
CONFIG_RTE_LIBRTE_BNX2X_PMD=n
CONFIG_RTE_LIBRTE_NFP_PMD=n
CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
CONFIG_RTE_LIBRTE_PMD_PCAP=n
CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
CONFIG_RTE_LIBRTE_PMD_QAT=n
CONFIG_RTE_LIBRTE_PMD_QAT_DEBUG_INIT=n
CONFIG_RTE_LIBRTE_PMD_QAT_DEBUG_TX=n
CONFIG_RTE_LIBRTE_PMD_QAT_DEBUG_RX=n
CONFIG_RTE_LIBRTE_PMD_QAT_DEBUG_DRIVER=n
CONFIG_RTE_LIBRTE_PMD_AESNI_MB=n
CONFIG_RTE_LIBRTE_PMD_AESNI_MB_DEBUG=n
CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=n
CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
---
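
A slightly stricter variant of the grep above, as a minimal sketch: a bare `grep n` matches any line containing a lowercase "n" anywhere, so anchoring on `=n` at the end of the line is safer. The function name `list_disabled_pmds` is made up for illustration; only the config file path comes from this report.

```shell
# list_disabled_pmds is a hypothetical helper name, not part of DPDK.
# Print every CONFIG_* option whose name contains "PMD" and whose
# value is exactly "n".
list_disabled_pmds() {
    grep -E '^CONFIG_[A-Za-z0-9_]*PMD[A-Za-z0-9_]*=n$' "$1"
}

# Usage, from inside the unpacked dpdk-2.2.0 source tree:
#   list_disabled_pmds config/common_linuxapp
```

Note that this also lists test and debug switches such as the TEST_PMD and *_DEBUG_* entries, since their names contain "PMD" too.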

 I strongly believe that this is the main problem with OpenvSwitch + DPDK on Ubuntu, the one that causes OVS to not find any "dpdkX" interfaces, even when a DPDK-compatible driver is bound and in use!

 So, I added the following line:

---
-e 's,(CONFIG_RTE_LIBRTE_BNX2X_PMD=).*,\1y,' \
---

 To DPDK's debian/rules file, which now looks like this:

---
build-config:
        dh_testdir
        $(MAKE) O=$(DPDK_STATIC_DIR) T=$(DPDK_CONFIG) config
        sed -ri -e 's,(RTE_MACHINE=).*,\1"default",' \
                -e 's,(RTE_NEXT_ABI=).*,\1n,' \
                -e 's,(RTE_APP_TEST=).*,\1n,' \
                -e 's,(CONFIG_RTE_EAL_IGB_UIO=).*,\1n,' \
                -e 's,(CONFIG_RTE_KNI_KMOD=).*,\1n,' \
                -e 's,(CONFIG_RTE_BUILD_COMBINE_LIBS=).*,\1y,' \
                -e 's,(CONFIG_RTE_LIBRTE_BNX2X_PMD=).*,\1y,' \
                -e 's,(LIBRTE_PMD_PCAP=).*,\1y,' \
                -e 's,(LIBRTE_PMD_XENVIRT=).*,\1y,' \
                $(DPDK_STATIC_DIR)/.config
---
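
To see what the added sed expression does in isolation, here is a small sketch that applies it to a scratch file instead of the real $(DPDK_STATIC_DIR)/.config. `enable_option` is a hypothetical helper used only for illustration; in debian/rules the expression runs inline as shown above. It assumes GNU sed, as the original `sed -ri` call does.

```shell
# enable_option is a hypothetical helper, only for illustration.
# Usage: enable_option OPTION FILE
# Rewrites "OPTION=<anything>" to "OPTION=y" in FILE (GNU sed, in place).
enable_option() {
    sed -ri -e "s,($1=).*,\1y," "$2"
}

# Try it on a throwaway file rather than the real build config.
cfg=$(mktemp)
printf '%s\n' 'CONFIG_RTE_LIBRTE_BNX2X_PMD=n' \
              'CONFIG_RTE_LIBRTE_IXGBE_PMD=y' > "$cfg"
enable_option CONFIG_RTE_LIBRTE_BNX2X_PMD "$cfg"
cat "$cfg"
```

The pattern is not anchored to the start of the line, which is why the shorter names in the rules file (for example LIBRTE_PMD_PCAP=) still match the full CONFIG_RTE_... entries.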

 Note: zlib1g-dev needs to be installed for this build:

sudo apt install zlib1g-dev

 After that, now, I can see a very different behavior! Still not working... lol

 But some progress, look, the error on OVS log is very different now:

---
ovs-ctl[2170]: EAL: PCI device 0000:01:00.0 on NUMA socket 0
ovs-ctl[2170]: EAL: probe driver: 14e4:168a rte_bnx2x_pmd
ovs-vswitchd[2551]: EAL: PCI device 0000:01:00.0 on NUMA socket 0
ovs-ctl[2170]: EAL: PCI memory mapped at 0x7f6180000000
ovs-ctl[2170]: EAL: PCI memory mapped at 0x7f6180800000
ovs-ctl[2170]: EAL: PCI memory mapped at 0x7f6181000000
ovs-vswitchd[2551]: EAL: probe driver: 14e4:168a rte_bnx2x_pmd
ovs-ctl[2170]: ovs-vswitchd: /home/ubuntu/sources/dpdk/dpdk-2.2.0/drivers/net/bnx2x/bnx2x_ethdev.c:453: bnx2x_common_dev_init: Assertion `sc->firmware' failed.
ovs-vswitchd[2551]: EAL: PCI memory mapped at 0x7f6180000000
ovs-vswitchd[2551]: EAL: PCI memory mapped at 0x7f6180800000
ovs-vswitchd[2551]: EAL: PCI memory mapped at 0x7f6181000000
ovs-ctl[2170]: Aborted (core dumped)
---

 So I tried something different and changed /etc/dpdk/interfaces from this:

---
pci 0000:01:00.0 uio_pci_generic
pci 0000:01:00.1 uio_pci_generic
---

 To this:

---
pci 0000:01:00.0 vfio-pci
pci 0000:01:00.1 vfio-pci
---

 And now, a completely different error message (still not working):

---
ovs-vswitchd[2950]: EAL: PCI device 0000:01:00.0 on NUMA socket 0
ovs-vswitchd[2950]: EAL: probe driver: 14e4:168a rte_bnx2x_pmd
ovs-vswitchd[2950]: EAL: 0000:01:00.0 VFIO group is not viable!
ovs-vswitchd[2950]: EAL: Error - exiting with code: 1
ovs-vswitchd[2950]: Requested device 0000:01:00.0 cannot be used
---

 It looks much better with VFIO! There is no more "core dumped"!

 But still not working.

 So, I'm opening this bug report so you guys can enable the PMDs for all the remaining drivers, like BNX2X, MLX4 & 5, XENVIRT, QAT, PCAP, etc. (maybe I am missing other DPDK options that we must enable! Let's take a deep look into those disabled options)...

 However, I'm curious about the usage of BNX2X with "uio_pci_generic": what is that firmware error followed by a core dump? Very intriguing...

 BTW, for VFIO, I added the options "iommu=pt intel_iommu=on" to /etc/default/grub; otherwise, VFIO doesn't work.
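
Changes to /etc/default/grub only take effect after running update-grub and rebooting, so it can help to confirm that the running kernel actually received both options. A minimal sketch, where `has_iommu_args` is a hypothetical helper name and the optional argument exists only so the check can be tried against a sample file:

```shell
# has_iommu_args is a hypothetical helper, not an existing tool.
# Checks that both kernel arguments from this report are on the
# command line of the running kernel.
# $1 optionally overrides the file to read (default: /proc/cmdline).
has_iommu_args() {
    cmdline=$(cat "${1:-/proc/cmdline}")
    for arg in iommu=pt intel_iommu=on; do
        case " $cmdline " in
            *" $arg "*) ;;   # found as a whole word
            *) echo "missing kernel argument: $arg"; return 1 ;;
        esac
    done
    echo "IOMMU kernel arguments present"
}

# Usage:
#   has_iommu_args
```

Matching on whole, space-delimited words avoids counting "intel_iommu=pt" as a hit for "iommu=pt".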

 And, after reading the VFIO OVS error message, "VFIO group is not viable", I googled it and found the following blog post about this:

[IOMMU] The error of "VFIO group is not viable":
http://danny270degree.blogspot.com.br/2015/12/iommu-error-of-vfio-group-is-not-viable.html

 So, I think that I am almost there! It is just a matter of configuring this VFIO group thing, but I have no idea how to do that... :-P
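
For context, a VFIO group is only considered viable when every device in the same IOMMU group is bound to vfio-pci (or to no host driver at all), so the first step is to see which devices share the group. The sketch below walks the standard sysfs layout; `list_group_drivers` is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a test directory.

```shell
# list_group_drivers is a hypothetical helper, not part of DPDK.
# For one PCI address, print every device in the same IOMMU group
# together with the driver it is currently bound to.
# $1: PCI address, e.g. 0000:01:00.0
# $2: optional override of the sysfs PCI device directory
list_group_drivers() {
    devdir=${2:-/sys/bus/pci/devices}
    for dev in "$devdir/$1"/iommu_group/devices/*; do
        [ -e "$dev" ] || continue
        if [ -e "$dev/driver" ]; then
            # "driver" is a symlink to the bound driver's directory.
            drv=$(basename "$(readlink -f "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}

# Usage:
#   list_group_drivers 0000:01:00.0
```

Any device listed with a driver other than vfio-pci is a candidate cause for the "not viable" message.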

 Christian is using the ixgbe driver, which has its PMD enabled by default! That's why his setup works (ixgbe), and mine doesn't (bnx2x), am I right? Look:

---
.../dpdk-2.2.0$ grep -i ixgbe_pmd config/common_linuxapp
CONFIG_RTE_LIBRTE_IXGBE_PMD=y
---

 I can't think of any other reason for this problem... PMD is disabled for many drivers!

 It is Friday night, long week... I'll play with this during the weekend, for sure! ^_^

Cheers!
Thiago

Thiago Martins (martinx) wrote :

So,

 I managed to include all NICs under the same IOMMU Group, like this:

---
dpdk_nic_bind --status

Network devices using DPDK-compatible driver
============================================
0000:01:00.0 'NetXtreme II BCM57800 1/10 Gigabit Ethernet' drv=vfio-pci unused=bnx2x
0000:01:00.1 'NetXtreme II BCM57800 1/10 Gigabit Ethernet' drv=vfio-pci unused=bnx2x
0000:01:00.2 'NetXtreme II BCM57800 1/10 Gigabit Ethernet' drv=vfio-pci unused=bnx2x
0000:01:00.3 'NetXtreme II BCM57800 1/10 Gigabit Ethernet' drv=vfio-pci unused=bnx2x

Network devices using kernel driver
===================================
<none>

Other network devices
=====================
<none>
---

 Now OVS+DPDK no longer complains that the "VFIO group is not viable", but I'm seeing the very same error as with "uio_pci_generic", the failed firmware assertion:

---
ovs-vswitchd[3007]: EAL: TSC frequency is ~2299998 KHz
ovs-ctl[2975]: EAL: Master lcore 0 is ready (tid=f8604b00;cpuset=[0])
ovs-ctl[2975]: EAL: PCI device 0000:01:00.0 on NUMA socket 0
ovs-ctl[2975]: EAL: probe driver: 14e4:168a rte_bnx2x_pmd
ovs-vswitchd[3007]: EAL: Master lcore 0 is ready (tid=f8604b00;cpuset=[0])
ovs-vswitchd[3007]: EAL: PCI device 0000:01:00.0 on NUMA socket 0
ovs-vswitchd[3007]: EAL: probe driver: 14e4:168a rte_bnx2x_pmd
ovs-ctl[2975]: EAL: PCI memory mapped at 0x7f3840000000
ovs-ctl[2975]: EAL: PCI memory mapped at 0x7f3840800000
ovs-ctl[2975]: EAL: Trying to map BAR 4 that contains the MSI-X table. Trying offsets: 0x40000000000:0x0000, 0x1000:0xf000
ovs-ctl[2975]: EAL: PCI memory mapped at 0x7f3841001000
ovs-ctl[2975]: ovs-vswitchd: /home/ubuntu/sources/dpdk/dpdk-2.2.0/drivers/net/bnx2x/bnx2x_ethdev.c:453: bnx2x_common_dev_init: Assertion `sc->firmware' failed.
ovs-vswitchd[3007]: EAL: PCI memory mapped at 0x7f3840000000
ovs-vswitchd[3007]: EAL: PCI memory mapped at 0x7f3840800000
ovs-vswitchd[3007]: EAL: Trying to map BAR 4 that contains the MSI-X table. Trying offsets: 0x40000000000:0x0000, 0x1000:0xf000
ovs-vswitchd[3007]: EAL: PCI memory mapped at 0x7f3841001000
ovs-ctl[2975]: Aborted (core dumped)
ovs-ctl[2975]: * Starting ovs-vswitchd
ovs-ctl[2975]: * Enabling remote OVSDB managers
---

 So the problem now looks like OVS with DPDK being unable to find the BNX2X firmware, even though I have the linux-firmware package installed, which contains the bnx firmware files...

Cheers!
Thiago

Thiago Martins (martinx) wrote :

After digging into DPDK source code, I found this:

--
cd ~/sources/dpdk/dpdk-2.2.0/drivers/net/

grep -ri \/firmware *

bnx2x/bnx2x.c:#define FW_NAME_57711 "/lib/firmware/bnx2x/bnx2x-e1h-7.2.51.0.fw"
bnx2x/bnx2x.c:#define FW_NAME_57810 "/lib/firmware/bnx2x/bnx2x-e2-7.2.51.0.fw"
--

However, those files don't exist on Ubuntu 16.04! The linux-firmware package doesn't have any of them. So, I downloaded them:

--
cd /lib/firmware/bnx2x/

wget https://github.com/cernekee/linux-firmware/raw/master/bnx2x/bnx2x-e2-7.2.51.0.fw
wget https://github.com/cernekee/linux-firmware/raw/master/bnx2x/bnx2x-e1h-7.2.51.0.fw
--
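
A quick way to confirm that both files are where the PMD expects them, as a minimal sketch: `check_bnx2x_fw` is a hypothetical helper, and the two file names are the ones hard-coded in drivers/net/bnx2x/bnx2x.c of DPDK 2.2.0 according to the grep output above. Other DPDK versions may expect different firmware versions.

```shell
# check_bnx2x_fw is a hypothetical helper, not part of DPDK.
# Reports whether the two firmware files named in bnx2x.c (DPDK 2.2.0)
# are readable. Returns non-zero if any of them is missing.
# $1 optionally overrides the firmware directory.
check_bnx2x_fw() {
    fwdir=${1:-/lib/firmware/bnx2x}
    missing=0
    for fw in bnx2x-e1h-7.2.51.0.fw bnx2x-e2-7.2.51.0.fw; do
        if [ -r "$fwdir/$fw" ]; then
            echo "found:   $fwdir/$fw"
        else
            echo "missing: $fwdir/$fw"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_bnx2x_fw || echo "bnx2x PMD will likely hit the sc->firmware assertion"
```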

And then, no more firmware errors!

--
journalctl | grep -i ovs

https://paste.ubuntu.com/15429791/
--

 However, it still doesn't work! Look:

--
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev

log:

ovs-vsctl[3804]: ovs|00001|vsctl|INFO|Called as ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
kernel: device ovs-netdev entered promiscuous mode
kernel: device br0 entered promiscuous mode
-

ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl: Error detected while setting up 'dpdk0'. See ovs-vswitchd log for details.

log:

ovs-vsctl[3890]: ovs|00001|vsctl|INFO|Called as ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
kernel: ovs-vswitchd[3793]: segfault at c80086e430 ip 00007f0a4f222b26 sp 00007ffce3539190 error 4 in libdpdk.so.0[7f0a4f1ff000+22f000]
ovs-vswitchd[3792]: ovs|00003|daemon_unix(monitor)|ERR|1 crashes: pid 3793 died, killed (Segmentation fault), core dumped, restarting
kernel: device ovs-netdev entered promiscuous mode
ovs-vswitchd[3897]: EAL: memzone_reserve_aligned_thread_unsafe(): memzone <RG_MP_ovs_mp_1500_0_262144> already exists
ovs-vswitchd[3897]: RING: Cannot reserve memory
kernel: device br0 entered promiscuous mode
ovs-vswitchd[3897]: EAL: memzone_reserve_aligned_thread_unsafe(): memzone <RG_MP_ovs_mp_1500_0_262144> already exists
ovs-vswitchd[3897]: RING: Cannot reserve memory
--

Still investigating, but running out of ideas...

Best,
Thiago

Thiago Martins (martinx) wrote :

Right,

 Just for the record, for the first time ever, I'm seeing a different message here:

---
# ovs-vsctl show
b70c1e0a-20d7-4bdb-98db-467330b72d07
    Bridge "br0"
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                error: "could not open network device dpdk0 (Cannot allocate memory)"
        Port "br0"
            Interface "br0"
                type: internal
    ovs_version: "2.5.0"
---

 It is not "could not open network device dpdk0 (No such device)" anymore! I'm smelling progress!

 I'll take a look at my hugepage settings now...
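
One way to look at the current hugepage state is to read the relevant counters from /proc/meminfo. This is only a sketch: `hugepage_summary` is a hypothetical helper name, and whether the "Cannot allocate memory" error is related to hugepages at all is an open question at this point in the report.

```shell
# hugepage_summary is a hypothetical helper, not an existing tool.
# Prints the total/free hugepage counters and the hugepage size.
# $1 optionally overrides the file to read (default: /proc/meminfo).
hugepage_summary() {
    awk '/^HugePages_(Total|Free):/ || /^Hugepagesize:/ { print }' \
        "${1:-/proc/meminfo}"
}

# Usage:
#   hugepage_summary
```

A HugePages_Free value of 0 while ovs-vswitchd is starting would point at hugepage exhaustion; non-zero free pages suggest looking elsewhere.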

Thiago Martins (martinx) wrote :

Mmmm... It might not be a hugepage memory issue; I double-checked everything.

I'm thinking about the following error:

-
kernel: ovs-vswitchd[3793]: segfault at c80086e430 ip 00007f0a4f222b26 sp 00007ffce3539190 error 4 in libdpdk.so.0[7f0a4f1ff000+22f000]
-

Since I repackaged DPDK to enable the BNX2X PMD, I think I'll also need to rebuild OpenvSwitch against the new libdpdk-dev...

I'll try this tomorrow...

Thiago Martins (martinx) wrote :

Yeah, I am unable to rebuild OpenvSwitch against my libdpdk-dev... :-(

I can easily rebuild it using Ubuntu's libdpdk-dev, but after enabling BNX2X_PMD, OpenvSwitch doesn't build. I'm seeing the following error:

-
apt source openvswitch
cd openvswitch-2.5.0
dpkg-buildpackage -rfakeroot -uc -us

......
configure: error: cannot link with dpdk
......
-

 Now I'm really stuck, I'll wait for Christian's help next week... =)

Best,
Thiago

Christian Ehrhardt  (paelzer) wrote :

Hi Thiago,
you were busy weren't you :-)

Thanks for your further experiments and reports - I'll try to address all the open points.
Trying to summarize:

1. The vfio grouping issue is not a bug, just the vfio setup being more complex in general. But you already handled that for your setup - great.
But as you found, a working vfio and uio_pci_generic eventually get you to the same point: the PMD driver (trying to) work on the card.

2. Firmware issue - that is very specific to the BNX2X PMD and the cards you use.
I'd consider adding something to the Readme if we end up enabling that driver, but I'm not really considering pushing FW packages with dpdk. Still, that should help the next person who runs into it.

3. Enabling this or other PMDs.
You already did so for your experiments, which is great, but so far we have only enabled those that were enabled by default, plus a few that different parties have asked for. There are some reasons not to "just" enable all the others.
- One is testability - I don't have all the HW, and time is short while there are still more issues open against dpdk.
- The other one is a lesson learned: everything that is not actively used/tested is broken, or at least cannot be considered very stable.
dpdk historically came from a "build the solution with the dpdk source" approach, where the specific solution setup can apply a multitude of tweaks/fixes without caring too much about "others". That is not a good option for a generically provided package.
On the good side though, until the specific card's PMD driver is used, the code is not really active.
Unfortunately I don't see a good way of saying something like "and these 5 more PMDs are experimental" - I need to discuss that with more experienced packagers.
Maybe an extra package named dpdk-experimental that brings "more" PMD drivers, but due to its dependencies that would still have to be in main, which I don't like. Also, this does not work well with the combined library approach being used, and the linker-script-based solution is post dpdk 2.2.

4. segfault/mem alloc issue
That is an interesting one which I was working on last week as well.
In fact this is not specific to your PMD driver - I was hitting it with a virtio-pci based setup, which should be "more" supported (enabled by default in upstream dpdk). It is an issue that, as far as I can tell, only occurs in combination with OVS-DPDK (testpmd and such are running fine).
I was just about to file a bug to track that effort - please see bug 1559912

Final summary:
- I'll discuss with some co-devs here whether it is doable/reasonable to enable some more low-tested extra drivers for this bug.
- We will track the issues regarding ovs-dpdk+dpdk+some-PMDs in bug 1559912

Changed in dpdk (Ubuntu):
status: New → Triaged
Christian Ehrhardt  (paelzer) wrote :

Since I consider the extra PMDs kind of "unsupported" upstream, I have to rate it low for now.

Changed in dpdk (Ubuntu):
importance: Undecided → Low
Christian Ehrhardt  (paelzer) wrote :

As an example of what I meant, here is the reason why it is disabled by default as of now:

commit ce9b8bb8b99877026fcca00fdb253fa3ec3a7e06
Author: Thomas Monjalon <thomas.monjalon@6wind.com>
Date: Tue Jul 28 18:22:39 2015 +0200

    config: disable bnx2x driver

    This driver has too many issues:
        - too big
        - bad coding style
        - no git history (dropped in 2 patches)
        - no documentation
        - no BSD support
        - no maintainer
    And the biggest one, constraining this disabling:
        - many build issues

    If the last 4 issues are not fixed in the next release 2.2,
    the driver must be removed.

Christian Ehrhardt  (paelzer) wrote :

I checked the other drivers that are disabled by default.
Some kind of incompleteness, lack of support, or being only a stub applies to all of them.

It could be said, that the same is true for the virtual ones we enable like PCAP and XEN.
But there is a major difference in:
- nobody is "buying" the HW for pcap or XEN to then realize it is not supported
- those virtual environments are good for experiments and proof of concepts but not suitable for most production cases anyway

So (keeping on) enabling those virtual ones is kind of OK, while enabling the PMDs for more of the HW cards is not, in my view.

To get those enabled, in my opinion, one has to work with the upstream project to get them properly supported and enabled by default.

Christian Ehrhardt  (paelzer) wrote :

Martin, since I'm happy you work with that code base, I'd recommend you create a test setup based on virtio.
You will still be blocked on bug 1559912 for now, but I'm working on that this week and still hope to get some upstream support.

I'll reject this bug for the given reasons now, but please catch me on IRC for a more interactive discussion if you like.

Changed in dpdk (Ubuntu):
status: Triaged → Won't Fix