Failing to start up ovs-dpdk with 17.11

Bug #1741244 reported by Christian Ehrhardt 
This bug affects 1 person
Affects: dpdk (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

DPDK status lists the device as assigned to a compatible driver:
 0000:04:00.0 'Ethernet Controller 10-Gigabit X540-AT2 1528' drv=uio_pci_generic unused=

Nothing shows up in dmesg, but OVS fails on an assertion:
ovs-vswitchd: dpdk|EMER|PANIC in rte_bus_register():
ovs-vswitchd: dpdk|EMER|line 53 assert "bus->find_device" failed
[... Stack trace]

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Note: this worked on PPAs with essentially the same code.

Maybe my system setup itself is broken. I am considering a re-install to get rid of old test/debug leftovers I might have created; the automation will set it up again as needed.

Christian Ehrhardt  (paelzer) wrote :

The check came in with 17.08:
commit dd288f0dfbfe8159ca80719de0a0c477698d0226
Author: Jan Blunck <email address hidden>
Date: Fri Jun 30 20:19:36 2017 +0200

    bus: require to implement device finding

    Signed-off-by: Jan Blunck <email address hidden>

diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 3094daa..e04ab4f 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -50,6 +50,7 @@ rte_bus_register(struct rte_bus *bus)
        /* A bus should mandatorily have the scan implemented */
        RTE_VERIFY(bus->scan);
        RTE_VERIFY(bus->probe);
+       RTE_VERIFY(bus->find_device);

        TAILQ_INSERT_TAIL(&rte_bus_list, bus, next);
        RTE_LOG(DEBUG, EAL, "Registered [%s] bus.\n", bus->name);

I need to find out which bus it is trying to add, and whether some odd .so lookup might be involved (we had those in the past).
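
To make the failure mode concrete, here is a toy Python model of the registration check from the diff above. It is not DPDK code: the `Bus` class and `register_bus` function are hypothetical stand-ins for `struct rte_bus` and `rte_bus_register()`, and the idea that the offending bus was written before `find_device` became mandatory is only an assumption at this point in the investigation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Bus:
    """Hypothetical stand-in for struct rte_bus; only the checked callbacks."""
    name: str
    scan: Optional[Callable[[], int]] = None
    probe: Optional[Callable[[], int]] = None
    find_device: Optional[Callable[[str], Optional[str]]] = None


bus_list: List[Bus] = []


def register_bus(bus: Bus) -> None:
    """Mimic the RTE_VERIFY checks: refuse a bus with a missing callback."""
    for callback in ("scan", "probe", "find_device"):
        if getattr(bus, callback) is None:
            raise RuntimeError(
                f'assert "bus->{callback}" failed for bus {bus.name!r}'
            )
    bus_list.append(bus)


# A bus that implements all three callbacks registers fine.
register_bus(
    Bus("new", scan=lambda: 0, probe=lambda: 0, find_device=lambda name: None)
)

# A bus written before find_device was mandatory (assumed scenario) is rejected.
try:
    register_bus(Bus("old", scan=lambda: 0, probe=lambda: 0))
except RuntimeError as err:
    print(err)
```

In the real library the check aborts the whole process instead of raising an exception, which is why ovs-vswitchd does not start at all.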

Christian Ehrhardt  (paelzer) wrote :

The issue is that I still had the 17.05 libs installed => http://paste.ubuntu.com/26319020/

That worked on past upgrades.
But some of the drivers are no longer linked in the "normal" way.
Instead, DPDK now searches a directory and loads all drivers it finds there.

The feature has existed for quite some time:
http://dpdk.org/ml/archives/dev/2015-November/027962.html

But now that we are finally perfectly co-installable, we can hit this issue :-/

Christian Ehrhardt  (paelzer) wrote :

Removing the libs solves the issue.

Christian Ehrhardt  (paelzer) wrote :

Reinstalling anything that drops something into CONFIG_RTE_EAL_PMD_PATH triggers the issue.

For example, having a DPDK 17.11 system plus librte-pmd-ixgbe17.05, which drops:
/usr/lib/x86_64-linux-gnu/dpdk-pmds/librte_pmd_ixgbe.so.17.05

will make it break.
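
A quick way to check a system for such leftovers is to look for files in the driver directory whose version suffix does not match the installed DPDK. The following is a minimal sketch, not part of the packaging: `find_stale_pmds` is a hypothetical helper, and it assumes the drivers follow the `<name>.so.<major>.<minor>` naming seen in the example above.

```python
import re
from pathlib import Path
from typing import List

# Matches names such as librte_pmd_ixgbe.so.17.05 and captures the version.
_VERSIONED_SO = re.compile(r"^(?P<lib>.+\.so)\.(?P<version>\d+\.\d+)$")


def find_stale_pmds(pmd_dir: Path, expected_version: str) -> List[Path]:
    """Return versioned .so files in pmd_dir that do not match expected_version."""
    stale: List[Path] = []
    if not pmd_dir.is_dir():
        return stale
    for entry in sorted(pmd_dir.iterdir()):
        match = _VERSIONED_SO.match(entry.name)
        if match and match.group("version") != expected_version:
            stale.append(entry)
    return stale


if __name__ == "__main__":
    # Path taken from the example above; it differs on other architectures.
    pmd_dir = Path("/usr/lib/x86_64-linux-gnu/dpdk-pmds")
    for path in find_stale_pmds(pmd_dir, "17.11"):
        print(f"stale driver: {path}")
```

Any file it reports belongs to an older DPDK release and would be picked up by the automatic driver loading described above.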

Christian Ehrhardt  (paelzer) wrote :

That said, while they can exist in the archive together, we have to make the packages conflict with prior versions to avoid this issue for now.

There might be a more complex solution, something like modversions in the kernel.
But this was meant to be a drop-in mechanism for 3rd party drivers, and that would lose much of its ease of use.

After a short discussion we agreed on a middle ground:
we make the directory versioned.

So installs in Debian/Ubuntu work as they should by default.
And 3rd party driver providers can still drop their drivers in, but have to either
a) link into all driver directories (not very careful, but it works if the driver is ok), or
b) link their driver into the versioned directories they really support/qualified
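
For 3rd party providers, the two options could look roughly like the sketch below. Everything in it is illustrative: `link_driver` is a hypothetical helper, and the directory prefix is a parameter because the actual name of the versioned directories is not stated here.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def link_driver(driver: Path, base_dir: Path, prefix: str,
                versions: Optional[Iterable[str]] = None) -> List[Path]:
    """Symlink driver into versioned driver directories below base_dir.

    versions=None  -> option a): every existing directory <prefix><version>
    versions=[...] -> option b): only the listed, qualified versions
    """
    if versions is None:
        targets = sorted(d for d in base_dir.glob(prefix + "*") if d.is_dir())
    else:
        targets = [base_dir / f"{prefix}{version}" for version in versions]

    links: List[Path] = []
    for target_dir in targets:
        # For option b) the directory may not exist yet if that DPDK
        # version is not installed; create it so the link is ready.
        target_dir.mkdir(parents=True, exist_ok=True)
        link = target_dir / driver.name
        if link.is_symlink() or link.exists():
            link.unlink()  # replace a link left over from an earlier run
        link.symlink_to(driver.resolve())
        links.append(link)
    return links
```

Calling it without `versions` corresponds to option a), while passing an explicit list such as `["17.11"]` corresponds to option b) and keeps the driver out of directories for releases it was never qualified against.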

Changed in dpdk (Ubuntu):
status: New → Triaged
importance: Undecided → High
Christian Ehrhardt  (paelzer) wrote :

This has actually been fixed for a while, but it stalled on its way due to the LP build farm maintenance.
We are essentially waiting for [1] to fully build on all architectures and will then check proposed migration.

[1]: https://launchpad.net/ubuntu/+source/dpdk/17.11-4

Christian Ehrhardt  (paelzer) wrote :

Still waiting on tests to complete, due to their reduced capacity and the huge queues after the meltdown avoidance maintenance.
But it is still on its way ...

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dpdk - 17.11-4

---------------
dpdk (17.11-4) unstable; urgency=low

  [ Luca Boccassi ]
  * Fix librte-gro17.11 short description to mention -gro instead of
    -eal. (Closes: #885832)

  [ Christian Ehrhardt ]
  * d/rules: make auto-loaded drivers dir versioned (LP: #1741244).
    3rd party drivers should drop into the versioned directories now to show
    their support for that version and to be autoloaded by librte_eal due
    to that.

 -- Luca Boccassi <email address hidden> Thu, 04 Jan 2018 13:52:07 +0000

Changed in dpdk (Ubuntu):
status: Triaged → Fix Released