dpdk app is reporting: net_mlx5: probe of PCI device xxxx aborted after encountering an error: Unknown error -95

Bug #1823836 reported by bugproxy on 2019-04-09
Affects / Importance / Assigned to:
 * The Ubuntu-power-systems project: High, Canonical Server Team
 * Ubuntu on IBM z Systems: Undecided, Unassigned
 * dpdk (Ubuntu): Undecided, Ubuntu on IBM Power Systems Bug Triage
   * Disco: Undecided, Unassigned
 * rdma-core (Ubuntu): Undecided, Unassigned
   * Disco: Undecided, Unassigned

Bug Description

[Impact]

 * a missing memset can make rdma (users) use uninitialized memory.
   In the reported case this was a failure to initialize DPDK devices on
   ppc64, but it could affect almost anything else using the cmd buffers

 * The patch is already at the v22 stable branch (backported and
   intended to be in v22.2 once released)

[Test Case]

 * So far the only way to trigger this that was found is to run a
   Connect-X5 card on ppc64 (power9) and try to initialize it, e.g.
     $ /usr/bin/dpdk-testpmd -i -a

   This requires special HW, but since the patch is a simple one-liner
   I hope that should not be concerning for the SRU.

[Regression Potential]

 * Without the memset the buffer would contain random memory. I could
   imagine a lucky case that ran despite this issue, but I cannot imagine
   a use case "relying" on the memory not being set to zero (unless
   stealing data was your use case).

[Other Info]

 * n/a

---- original bug report ----

== Comment: #0 - David J. Wilder - 2019-04-05 12:44:56 ==
---Problem Description---
dpdk-testpmd is failing in net_mlx5.

/usr/bin/dpdk-testpmd \
-w 0000:01:00.0 \
-l 0-3 \
-n 4 -- \
-i -a
EAL: Detected 128 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 15b3:1019 net_mlx5
net_mlx5: probe of PCI device 0000:01:00.0 aborted after encountering an error: Unknown error -95
testpmd: No probed ethernet devices
Interactive-mode selected
Auto-start selected
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Done
Start automatic packet forwarding
io packet forwarding - ports=0 - cores=0 - streams=0 - NUMA support enabled, MP allocation mode: native

  io packet forwarding packets/burst=32
  nb forwarding cores=1 - nb forwarding ports=0

Contact Information = David <email address hidden>

---uname output---
Linux ltc17u31 5.0.0-8-generic #9-Ubuntu SMP Tue Mar 12 21:59:39 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = 9006-22P Boston

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 Installed 19.04 (ppc64le)
Installed dpdk and dpdk-dev
----
run dpdk-testpmd

/usr/bin/dpdk-testpmd \
-w 0000:01:00.0 \
-l 0-3 \
-n 4 -- \
-i -a
EAL: Detected 128 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 15b3:1019 net_mlx5
net_mlx5: probe of PCI device 0000:01:00.0 aborted after encountering an error: Unknown error -95
testpmd: No probed ethernet devices
Interactive-mode selected
Auto-start selected
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Done
Start automatic packet forwarding
io packet forwarding - ports=0 - cores=0 - streams=0 - NUMA support enabled, MP allocation mode: native

  io packet forwarding packets/burst=32
  nb forwarding cores=1 - nb forwarding ports=0

Userspace tool common name: testpmd

The userspace tool has the following bit modes: 64-bit

Userspace rpm: dpdk-dev/disco,now 18.11-6 ppc64el

== Comment: #1 - David J. Wilder - 2019-04-05 12:45:35 ==
# lspci -vvv -s 0000:01:00.0
0000:01:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
 Subsystem: IBM MT28800 Family [ConnectX-5 Ex]
 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0
 Interrupt: pin A routed to IRQ 24
 NUMA node: 0
 Region 0: Memory at 6000800000000 (64-bit, prefetchable) [size=512M]
 [virtual] Expansion ROM at 600c000000000 [disabled] [size=1M]
 Capabilities: [60] Express (v2) Endpoint, MSI 00
  DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
   ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
   RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
   MaxPayload 512 bytes, MaxReadReq 512 bytes
  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
  LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
   ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
   ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
  LnkSta: Speed 16GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
  DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+, LTR-, OBFF Disabled
  LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
    Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
    Compliance De-emphasis: -6dB
  LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
    EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
 Capabilities: [48] Vital Product Data
  Product Name: PCIe4 2-port 100Gb EDR Adapter x16
  Read-only fields:
   [PN] Part number: 00WT174
   [EC] Engineering changes: P40094
   [VF] Vendor specific: 00WT176
   [SN] Serial number: YA50YF7CE0V3
   [Z0] Unknown: 49 42 4d 30 30 30 30 30 30 30 30 30 32
   [VC] Vendor specific: EC64
   [MN] Manufacture ID: 37 35 30 58 30 39 31 37 32 35 33 30 38 37 20
   [VH] Vendor specific: 2CF2
   [VK] Vendor specific: ipzSeries
   [RV] Reserved: checksum good, 0 byte(s) reserved
  End
 Capabilities: [9c] MSI-X: Enable+ Count=256 Masked-
  Vector table: BAR=0 offset=00002000
  PBA: BAR=0 offset=00003000
 Capabilities: [c0] Vendor Specific Information: Len=18 <?>
 Capabilities: [40] Power Management version 3
  Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
  Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
 Capabilities: [100 v1] Advanced Error Reporting
  UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
  UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
  UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
  CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
  CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
  AERCap: First Error Pointer: 04, GenCap+ CGenEn+ ChkCap+ ChkEn+
 Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
  ARICap: MFVC- ACS-, Next Function: 1
  ARICtl: MFVC- ACS-, Function Group: 0
 Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
  IOVCap: Migration-, Interrupt Message Number: 000
  IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
  IOVSta: Migration-
  Initial VFs: 8, Total VFs: 8, Number of VFs: 0, Function Dependency Link: 00
  VF offset: 2, stride: 1, Device ID: 101a
  Supported Page Size: 000007ff, System Page Size: 00000010
  Region 0: Memory at 0006000000000000 (64-bit, prefetchable)
  VF Migration: offset: 00000000, BIR: 0
 Capabilities: [1c0 v1] #19
 Capabilities: [230 v1] Access Control Services
  ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
  ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
 Capabilities: [320 v1] #27
 Capabilities: [370 v1] #26
 Capabilities: [420 v1] #25
 Kernel driver in use: mlx5_core
 Kernel modules: mlx5_core

== Comment: #2 - David J. Wilder - 2019-04-05 12:54:17 ==
Building from git://dpdk.org/dpdk tag=v18.11 in the same environment also shows the same error.

== Comment: #4 - David J. Wilder - 2019-04-05 12:56:25 ==
Testing dpdk on beta 19.04 is showing an error with Mellanox Technologies MT28800 Family [ConnectX-5 Ex] ethernet controller.

== Comment: #6 - David J. Wilder - 2019-04-05 13:35:12 ==
Chasing the source of the error.

gdb dpdk/ppc_64-power8-linuxapp-gcc/app/testpmd
<....>
(gdb) break mlx5_ind_table_ibv_drop_new
Breakpoint 1 at 0x4998e8: file /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c, line 2067.
(gdb) run -w 0000:01:00.0 -l 0-3 -n 4 -- -i -a
Starting program: /home/wilder/ubuntu-19.04-debug/dpdk/ppc_64-power8-linuxapp-gcc/app/testpmd -w 0000:01:00.0 -l 0-3 -n 4 -- -i -a
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/powerpc64le-linux-gnu/libthread_db.so.1".
EAL: Detected 128 lcore(s)
EAL: Detected 2 NUMA nodes
[New Thread 0x7ffff795dc90 (LWP 117018)]
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
[New Thread 0x7ffff714dc90 (LWP 117019)]
EAL: No free hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
[New Thread 0x7ffff693dc90 (LWP 117020)]
[New Thread 0x7ffff612dc90 (LWP 117021)]
[New Thread 0x7ffff591dc90 (LWP 117022)]
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 15b3:1019 net_mlx5

Thread 1 "testpmd" hit Breakpoint 1, 0x00000001004998e8 in mlx5_ind_table_ibv_drop_new (dev=0x100d97580 <rte_eth_devices>)
    at /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c:2067
2067 {
(gdb) list
2062 * @return
2063 * The Verbs object initialised, NULL otherwise and rte_errno is set.
2064 */
2065 struct mlx5_ind_table_ibv *
2066 mlx5_ind_table_ibv_drop_new(struct rte_eth_dev *dev)
2067 {
2068 struct priv *priv = dev->data->dev_private;
2069 struct mlx5_ind_table_ibv *ind_tbl;
2070 struct mlx5_rxq_ibv *rxq;
2071 struct mlx5_ind_table_ibv tmpl;
(gdb)
2072
2073 rxq = mlx5_rxq_ibv_drop_new(dev);
2074 if (!rxq)
2075 return NULL;
2076 tmpl.ind_table = mlx5_glue->create_rwq_ind_table
2077 (priv->ctx,
2078 &(struct ibv_rwq_ind_table_init_attr){
2079 .log_ind_tbl_size = 0,
2080 .ind_tbl = &rxq->wq,
2081 .comp_mask = 0,
(gdb)
2082 });
2083 if (!tmpl.ind_table) {
2084 DEBUG("port %u cannot allocate indirection table for drop"
2085 " queue",
2086 dev->data->port_id);
2087 rte_errno = errno;
2088 goto error;
2089 }
2090 ind_tbl = rte_calloc(__func__, 1, sizeof(*ind_tbl), 0);
2091 if (!ind_tbl) {

(gdb) break 2084
Breakpoint 2 at 0x1004999d0: file /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c, line 2084.
(gdb) cont
Continuing.

Thread 1 "testpmd" hit Breakpoint 2, mlx5_ind_table_ibv_drop_new (dev=0x100d97580 <rte_eth_devices>)
    at /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c:2087
2087 rte_errno = errno;
(gdb) print errno
$1 = 95
(gdb)
------

== Comment: #7 - David J. Wilder - 2019-04-05 18:53:33 ==
Interesting excerpt from strace:
write(1, "mlx5_glue_create_rwq_ind_table: "..., 65) = 65
ioctl(23, RDMA_VERBS_IOCTL, 0x7fffe3966c70) = -1 EOPNOTSUPP (Operation not supported)

== Comment: #8 - David J. Wilder <email address hidden> - 2019-04-05 21:05:21 ==
ConnectX-5 Firmware version:
# mstflint -d 0000:01:00.0 q
Image type: FS4
FW Version: 16.23.1020
FW Release Date: 10.7.2018
Product Version: 16.23.1020
Description: UID GuidsNumber
Base GUID: ec0d9a0300cab17c 4
Base MAC: ec0d9acab17c 4
Image VSD: N/A
Device VSD: N/A
PSID: IBM0000000020
Security Attributes: N/A


tags: added: architecture-ppc64le bugnameltc-176743 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → dpdk (Ubuntu)
Changed in ubuntu-power-systems:
importance: Undecided → High
assignee: nobody → Canonical Server Team (canonical-server)

------- Comment From <email address hidden> 2019-04-09 20:09 EDT-------
-EOPNOTSUPP is returned from ib_uverbs kernel module.

Call stack:
ib_uverbs_ex_create_rwq_ind_table() >> uverbs_request_finish()

I found that iter.cur may not be correct at the time uverbs_request_finish() is called. iter.cur was computed in uverbs_request_next() prior to calling uverbs_request_finish().

Here is my kernel tracing showing the problem.

uverbs_request_next.isra.7: Befor iter->cur=00000000e6061050 len=4
uverbs_request_next.isra.7: After iter->cur=00000000ced0d1bc len=4

The second value of iter->cur is not correct; this is not making much sense. Is a cast needed?

static int uverbs_request_next(struct uverbs_req_iter *iter, void *val,
                               size_t len)
{
        if (iter->cur + len > iter->end)
                return -ENOSPC;

        if (copy_from_user(val, iter->cur, len))
                return -EFAULT;

        trace_printk("Befor iter->cur=%p len=%lu\n", iter->cur, len);

        iter->cur += len;

        trace_printk("After iter->cur=%p len=%lu\n", iter->cur, len);

        return 0;
}

Hi David,
I added Thomas here as he knows Mellanox HW well and might know the case and/or someone who knows.

I realize that this is occurring on ppc64 - which recently was not very well maintained upstream
 - gowrishankar became unavailable, Chao Zhu has had quite some other work - so I'm wondering if this might be ppc64 specific.
Do you - by any chance - know if this can be reproduced on x86 or if it might be ppc64 specific?

Changed in ubuntu-power-systems:
status: New → Triaged

10/04/2019 14:24, Christian Ehrhardt :
> Hi David,
> I added Thomas here as he knows Mellanox HW well and might know the case and/or someone who knows.
>
>
> I realize that this is occurring on ppc64 - which recently was not very well maintained upstream
> - gowrishankar became unavailable, Chao Zhu has had quite some other work - so I'm wondering if this might be ppc64 specific.
> Do you - by any chance - know if this can be reproduced on x86 or if it might be ppc64 specific?

It seems to be mlx5_glue_create_rwq_ind_table which is unsupported
with the version of libibverbs installed on the machine.
Which version of rdma-core is it?

------- Comment From <email address hidden> 2019-04-10 12:47 EDT-------
(In reply to comment #15)
> 10/04/2019 14:24, Christian Ehrhardt ?:
> > Hi David,
> > I added Thomas here as he knows Mellanox HW well and might know the case and/or someone who knows.
> >
> >
> > I realize that this is occurring on ppc64 - which recently was not very well maintained upstream
> > - gowrishankar became unavailable, Chao Zhu has had quite some other work - so I'm wondering if this might be ppc64 specific.
> > Do you - by any chance - know if this can be reproduced on x86 or if it might be ppc64 specific?
>
> It seems to be mlx5_glue_create_rwq_ind_table which is unsupported
> with the version of libibverbs installed on the machine.
> Which version of rdma-core is it?

Hi Thomas

I have not tried this on x86. Agreed it could be a ppc64le issue :)

# apt list libibverbs*
Listing... Done
libibverbs-dev/disco,now 22.1-1 ppc64el [installed]
libibverbs1-dbgsym/disco,now 22.1-1 ppc64el [installed]
libibverbs1/disco,now 22.1-1 ppc64el [installed,automatic]

I am not sure what you mean by mlx5_glue_create_rwq_ind_table which is unsupported. Are you saying that dpdk 18.11 is not supported on this version of libibverbs?

Any thoughts about my last comment re: ib_uverbs: function uverbs_request_next() ?

10/04/2019 18:49, bugproxy:
> ------- Comment From <email address hidden> 2019-04-10 12:47 EDT-------
> (In reply to comment #15)
> > 10/04/2019 14:24, Christian Ehrhardt ?:
> > > Hi David,
> > > I added Thomas here as he knows Mellanox HW well and might know the case and/or someone who knows.
> > >
> > >
> > > I realize that this is occurring on ppc64 - which recently was not very well maintained upstream
> > > - gowrishankar became unavailable, Chao Zhu has had quite some other work - so I'm wondering if this might be ppc64 specific.
> > > Do you - by any chance - know if this can be reproduced on x86 or if it might be ppc64 specific?
> >
> > It seems to be mlx5_glue_create_rwq_ind_table which is unsupported
> > with the version of libibverbs installed on the machine.
> > Which version of rdma-core is it?
>
> Hi Thomas
>
> I have not tried this on x86. Agreed it could be a ppc64le issue :)
>
> # apt list libibverbs*
> Listing... Done
> libibverbs-dev/disco,now 22.1-1 ppc64el [installed]
> libibverbs1-dbgsym/disco,now 22.1-1 ppc64el [installed]
> libibverbs1/disco,now 22.1-1 ppc64el [installed,automatic]
>
> I am not sure what you mean by mlx5_glue_create_rwq_ind_table which is
> unsupported. Are you saying that dpdk 18.11 is not supported on this
> version of libibverbs?

No, DPDK 18.11 should be supported with rdma-core >= 16.

> Any thoughts about my last comment re: ib_uverbs: function
> uverbs_request_next() ?

I don't see this comment.

I'm not an expert of libibverbs.
Please forward to Mellanox experts.

------- Comment From <email address hidden> 2019-04-10 15:31 EDT-------
> I'm not an expert of libibverbs.
> Please forward to Mellanox experts.

Thanks Thomas!

To document the versions used: he is on Ubuntu 19.04, which means we should be talking about:
DPDK: 18.11-6 (latest LTS)
rdma-core (all ib things): 22.1-1
This matches the versions mentioned in the comments before.

I checked the very recently released 18.11.1 but there was nothing fixed that seems to match this issue.

I also checked if I'd have a P9 to test the same, but I don't have one with a MLX5 Card :-/

I agree with Thomas that this should go to the Mellanox experts. You might forward the request to the rdma-core [1] or dpdk bug tracker [2] and let us know a link to the ML archive or bug to follow. Maybe open a bug in DPDK and then post to rdma referencing it - so far we just don't know yet on which side to look at it.
To get the extra Mellanox experienced eyes on it you could CC (from Maintainers files in rdma and dpdk projects):
For RDMA MLX:
 Jason Gunthorpe <email address hidden>
 Yishai Hadas <email address hidden>
 Doug Ledford <email address hidden>
For DPDK MLX:
 Matan Azrad <email address hidden>
 Shahaf Shuler <email address hidden>
 Yongseok Koh <email address hidden>
I'm not sure if you have taken over from Chao or talk to him anyway, you might add him as well?
 Chao Zhu <email address hidden>
Myself to keep track
 <email address hidden>

Is this expected to work?
Is this in any way Ubuntu (or the combination of software in Ubuntu) specific in your testing?
After all you did change the support to Alpha level in [3] right :-)

Don't get my feedback above wrong David, I'm happy that somebody picks up and works on DPDK on ppc64 at all again!

[1]: https://github.com/linux-rdma/rdma-core#reporting-bugs
[2]: https://bugs.dpdk.org/buglist.cgi?bug_status=__open__&product=DPDK
[3]: https://git.dpdk.org/dpdk/commit/?id=6b062d56bcff9a393809f46e0aae19a18bc0fe57

TBH - I was even considering dropping ppc64 builds from the next Ubuntu release due to the unreliable support over the last year. It seems the overall community feels that way as well, reading [1]; if you work on getting it into proper support (add ppc CI to the project, run tests regularly, ensure all supported HW runs, ...) I'd highly appreciate that.

[1]: http://mails.dpdk.org/archives/dev/2019-April/129894.html

Sorry for the off topic post, lets keep the bug to the error you reported.
I'll probably start a discussion for the general PPC64-DPDK status via mail and keep you and others I know being involved in the loop.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-04-11 12:22 EDT-------
----- Comment From paelzer 2019-04-11 12:22:31 UTC-------
> Sorry for the off topic post, lets keep the bug to the error you reported.
> I'll probably start a discussion for the general PPC64-DPDK status via mail
> and keep you and others I know being involved in the loop.

The fact that we started to try out Ubuntu 19.04 should be illustrative of our commitment. There have been enough discussions about ppc64 support on the mailing list. Let us not allow that to cloud this discussion. Can we agree to limit the discussion to this bug?

Based on Dave's debug, there doesn't appear to be anything that is architecture specific. We have run a number of tests with DPDK-18.11 with and without MLNX_OFED on ppc64le without any issues.

The current situation is that we have access to P9, but not x86. On the other hand, you are likely to have access to x86 with CX-5. To narrow down this problem, can you see if this can be recreated on x86?
The underlying assumption is that all the relevant package versions between x86 and ppc64le are identical. Is that assumption correct?


> ------- Comment From <email address hidden> 2019-04-11 12:22 EDT-------
>. Can we agree to limit the discussion to this bug?

As I said the same already, yes.
Thanks for focusing on it again - this calms my former concerns!

> Based on Dave's debug, there doesn't appear to be anything that is
> architecture specific.

That is good - the more people that are affected, the more will join in thinking about it.
We still might end up needing to report it to upstream(s)

> The current situation is that we have access to P9, but not x86. On the other hand, you are likely to have access to x86 with CX-5. To narrow down this problem, can you see if this can be recreated on x86?

Of the HW I can tap at the moment my P9s have no Mellanox network, I
have an x86 box with CX-4, but none with CX-5 right now.
Trying on CX-4 just in case ...

$ lspci
08:00.0 Ethernet controller: Mellanox Technologies MT27710 Family
[ConnectX-4 Lx]
       Subsystem: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
       Physical Slot: 1
       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR+ FastB2B- DisINTx+
       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
       Latency: 0, Cache Line Size: 64 bytes
       Interrupt: pin A routed to IRQ 16
       NUMA node: 0
       Region 0: Memory at 94000000 (64-bit, prefetchable) [size=32M]
       [virtual] Expansion ROM at 92b00000 [disabled] [size=1M]
       Capabilities: <access denied>
       Kernel driver in use: mlx5_core
       Kernel modules: mlx5_core

But the CX-4 probes with the mlx5 driver in DPDK as well.
Superficially one might think I hit the same error (-95 is ENOTSUP).
[...]
EAL: probe driver: 15b3:1015 net_mlx5
net_mlx5: port 0 verbs maximum priority: 8 expected 8/16
net_mlx5: probe of PCI device 0000:08:00.0 aborted after encountering
an error: Unknown error -95
[...]

I have never used this card for DPDK, just found the box with it.
So setup might be incomplete - and analysis shows that I'm stuck with
a different issue than you.

I installed the debug symbols from the archive and checked in GDB
which path it takes through mlx5_pci_probe
I reach detection of case #2, then mlx5_dev_spawn throws the message
  net_mlx5: port 0 verbs maximum priority: 8 expected 8/16
And then it takes the path of failing to initialize from there.

In the initialization function it seems to take a normal route first
(I still would expect it to use the mlx4 pmd)
Single stepping through mlx5_dev_spawn seems normal (it reaches far)

Close to the end this call:
err = mlx5_flow_discover_priorities(eth_dev);
Then yields the message we saw in my case:
  net_mlx5: port 0 verbs maximum priority: 8 expected 8/16
This is setting the err to -95 which fails us eventually.

This is interesting as it does
  priority = vprio[i];
[...]
  switch (priority) {
  8: ...
  16: ...
   default: ...
It reaches default, but then reports vprio[i] as being 8 which means
it should not have reached the default: path.
It is the mlx5_glue->create_flow(drop->qp, &flow_attr.attr); that
returns a null pointer due to "ibv_create_flow" failing.

As I said this card was never used/set up - so t...


bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-04-12 13:01 EDT-------
(In reply to comment #22)
> Ok, once my setup was fixed it worked fine.
> By any chance are your FW levels up to date?
> For your card that should be:
> ConnectX-5: 16.21.1000 and above.
> ConnectX-5 Ex: 16.21.1000 and above.
>
> You can run ibv_devinfo to get what you currently have.
> $ sudo ibv_devinfo
>
> If it is not your FW, we are back to my suggestion to file this with
> the infiniband/dpdk/mellanox folks to get their experience as well.
>

Our firmware level looks ok, higher than you suggested.

ConnectX-5 Firmware version:
# mstflint -d 0000:01:00.0 q
Image type: FS4
FW Version: 16.23.1020 <<<<<<<<<
FW Release Date: 10.7.2018
Product Version: 16.23.1020
Description: UID GuidsNumber
Base GUID: ec0d9a0300cab17c 4
Base MAC: ec0d9acab17c 4
Image VSD: N/A
Device VSD: N/A
PSID: IBM0000000020
Security Attributes: N/A

I have chased the -95 into the ib_uverbs modules, the error is returned from uverbs_request_finish(), called from ib_uverbs_ex_create_rwq_ind_table().

If I understand correctly, the purpose of uverbs_request_finish() is to verify that the unused portion of a buffer passed in from userspace is filled with zeros. The error indicates it's not. If I remove the call to uverbs_request_finish(), testpmd starts up fine, but I can't say it's functioning beyond that.

I am thinking there is some incompatibility between libibverbs and ib_uverbs. There have been recent changes to this kernel interface that may be new in the 5.0 kernel. I am trying to correlate the userspace structures with the kernel's, but I am struggling at the moment.

Thank you for testing on x86, I agree it's time for me to reach out to Mellanox.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-04-12 18:13 EDT-------
I upgraded and built the following packages; I am continuing to see the same error with the new versions.

kernel 5.0.7-stable
dpdk-19.02
rdma-core (master branch, commit 4bb3583c228566958e2c6f34392c9e39e799bacf)

Changed in ubuntu-power-systems:
status: Triaged → In Progress
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-04-17 18:20 EDT-------
(In reply to comment #24)
> I upgraded a built the following packages, I am continuing to see the same
> error with the new versions.
>
> kernel 5.0.7-stable
> dpdk-19.02
> rdma-core (master branch, commit 4bb3583c228566958e2c6f34392c9e39e799bacf)

Correction:

When building libibverbs from the rdma-core from the master branch I do NOT see the problem.

When using the libibverbs supplied with ubuntu 19.04 (libibverbs1/disco,now 22.1-1 ppc64el) the problem occurs.

Using libibverbs built from the debian/master branch of rdma-core, the problem occurs.

(the results are the same with both the 5.0.7 kernel and 5.0.0-8-generic kernel)

------- Comment From <email address hidden> 2019-04-17 18:25 EDT-------
Mellanox case 601585.

Thanks for that check David, that is good news to finally get a hold on this.

In git, the current branch "master" (4bb3583c) is at 24.0-1, which is in neither Debian nor Ubuntu.
Both distros, due to the freezes, are on 22.1-1 from branch "debian/master" (e5651d01) for now.

v22.1 is just a minimal set of fixes on top of v22:
 * Backport fixes:
   * ibacm: Print correct pkey
   * libhns: Bugfix for allocating and freeing sq db buffer
   * verbs: Fix pingpong buffer validation
   * ABI Files

Thanks to your test we know that the fix for your case is most likely among the changes in v22..head on the master branch.

I further checked if the v24 from 4bb3583c would have any massive changes in the packaging itself that could explain it, but no - it is only minor changes for pyverbs. That is good as that means one of the actual code changes most likely is the fix you are looking for.

There are fixes queued in the "stable-v22" branch already (but not yet released):
d05900db libhns: Bugfix for filtering zero length sge
f3bb8968 buildlib: Ensure stanza is properly sorted
e02238ea debian: Create empty pyverbs package for builds without pyverbs
90886054 verbs: Fix attribute returning
c7c842a3 build: Fix pyverbs build issues on Debian
8043035f travis: Change SuSE package target due to Travis CI failures
a4bbfc33 verbs: Avoid inline send when using device memory in rc_pingpong
4b38d3cd mlx5: Use copy loop to read from device memory
d38817ea verbs: clear cmd buffer when creating indirection table
9dcfa6cd libhns: Bugfix for using buffer length
23e3a5da mlx5: Fix incorrect error handling when SQ wqe count is 0

With some (still unlikely) luck 23e3a5da might already be your fix and is backported.
OTOH none of the other changes in v22..head are obviously the fix; none of the ib/mlx fixes describes your problem exactly, and no change is made to uverbs_request_finish or ib_uverbs_ex_create_rwq_ind_table.

Since you fortunately have a setup that can rebuild and retest, could you test with a:
- rdma-core build on 23e3a5da (probably fix on stable-v22)
- rdma-core build on d05900db (current head on stable-v22)
- if none of the above quick checks helps, we'd need a bisect a la:
  $ git bisect start --term-new fixed --term-old broken
  $ git bisect broken v22
  $ git bisect fixed 4bb3583c
  Bisecting: 86 revisions left to test after this (roughly 7 steps)
  [85cf1829e94585ceb38c5c221b49305866fb4344] Merge pull request #472 from oulijun/lijun-rdma-core

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-04-18 17:51 EDT-------
(In reply to comment #27)

Thanks for the tips Christian, I was able to isolate the fix to commit d38817ea

---
From d38817ea554a9fd140e56414a7c74e7eef444c50 Mon Sep 17 00:00:00 2001
From: Mark Bloch <email address hidden>
Date: Wed, 20 Feb 2019 20:00:45 +0000
Subject: [PATCH] verbs: clear cmd buffer when creating indirection table

[ Upstream commit 34225464c1d165f63ffd462364810cc9f374109d ]

Make sure we clear the cmd buffer, not doing so will make the kernel to
fail the command.

Fixes: 75c65bbcadcd ("verbs: Consolidate duplicate code in create_rwq_ind_table")
Signed-off-by: Mark Bloch <email address hidden>
Signed-off-by: Yishai Hadas <email address hidden>
Signed-off-by: Nicolas Morey-Chaisemartin <email address hidden>
---
libibverbs/cmd.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/libibverbs/cmd.c b/libibverbs/cmd.c
index 34c71e56..5817c598 100644
--- a/libibverbs/cmd.c
+++ b/libibverbs/cmd.c
@@ -1854,6 +1854,7 @@ int ibv_cmd_create_rwq_ind_table(struct ibv_context *context,
cmd_size = sizeof(*cmd) + num_tbl_entries * sizeof(cmd->wq_handles[0]);
cmd_size = (cmd_size + 7) / 8 * 8;
cmd = alloca(cmd_size);
+ memset(cmd, 0, cmd_size);

for (i = 0; i < num_tbl_entries; i++)
cmd->wq_handles[i] = init_attr->ind_tbl[i]->handle;
--
2.20.1

This change makes sense based on my debug.

I also applied the change to the head of debian/master branch and verified it works as well.

This should not be a Power-specific bug; we just got lucky on x86_64, where cmd must have already been zeroed.

tags: added: architecture-all
removed: architecture-ppc64le
Changed in dpdk (Ubuntu):
status: New → Triaged
Changed in dpdk (Ubuntu Disco):
status: New → Triaged

Thanks for the Checks David!

This is tagged as being in v23, and Debian/experimental is already at v24.
So mid-term this will be resolved by a re-sync to a later version.
But for now we will need an Ubuntu delta to fix it.

Eoan isn't fully open yet, but we can prep SRU template, PPAs and MPs to review to get that already out of the way.

An MP is opened at [1] and the respective MR is at [2].

Until 19.10 is fully ready for development this might take a bit, but you should be able to use the PPA as needed until then.

[1]: https://launchpad.net/~paelzer/+archive/ubuntu/bug-1823836-rdma-core-mem-clear/+packages
[2]: https://code.launchpad.net/~paelzer/ubuntu/+source/rdma-core/+git/rdma-core/+merge/366380

description: updated
bugproxy (bugproxy) on 2019-04-23
tags: added: targetmilestone-inin1910
removed: targetmilestone-inin---
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-04-23 14:31 EDT-------
(In reply to comment #29)

Thanks Christian

BTW: I am not finding ppc64el debs in the above link, should I?

Argh, I mis-clicked powerpc instead of ppc64el; I will rebuild it.
If only I could change my personal default to always have ppc64el and s390x enabled ...

Since the fix will land in rdma-core, adding a task for that as well.

Changed in rdma-core (Ubuntu Disco):
status: New → Triaged
Changed in rdma-core (Ubuntu):
status: New → Triaged
Changed in dpdk (Ubuntu Disco):
status: Triaged → Invalid
Changed in dpdk (Ubuntu):
status: Triaged → Invalid

FYI: Currently a lot of packages are building and testing against Eoan, which makes this process slower than usual. Once it has migrated there I can upload it for the SRU team's review.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package rdma-core - 22.1-1ubuntu1

---------------
rdma-core (22.1-1ubuntu1) eoan; urgency=medium

  * d/p/lp-1823836-clear-cmd-buffer.patch: verbs: clear cmd buffer when
    creating indirection table; fixing MLX device initialization (LP: #1823836)

 -- Christian Ehrhardt <email address hidden> Tue, 23 Apr 2019 09:29:45 +0200

Changed in rdma-core (Ubuntu):
status: Triaged → Fix Released
Changed in rdma-core (Ubuntu Disco):
status: Triaged → In Progress

The update to the latest dev release, Eoan, is complete; let's fix the affected Disco release as well.
I uploaded the same change, but for Disco, for the SRU team to consider.

An upload of rdma-core to disco-proposed has been rejected from the upload queue for the following reason: "Your source.changes file contains changes for more than the version you intended, please reupload it. https://launchpadlibrarian.net/421428873/rdma-core_22.1-1ubuntu0.1_source.changes".

Interesting, thanks Brian for the hint.
Let me check what went wrong ...

Checking:
- Disco (which the upload is for) is at rdma-core 22.1-1
- The upload was a proper 22.1-1ubuntu0.1 with just the fix on top
- The MP and git branch only carry the fix (d/p/lp-1823836-clear-cmd-buffer.patch)
- Here is the issue: the .changes file listed the full version history back to 15-1 (the first version)
- If I run dpkg-buildpackage again, the .changes is generated correctly (just the last fix, as intended)
- TBH I don't even know how I'd produce the broken .changes file: with -v15-1 that version would be missing, and with an empty -v it refuses to build
- Only something like -v0 would do it; maybe I fetched a command with a lower version from bash history by accident
- :-/ I might never find out what happened, but the fix is easy: rebuild and re-upload
- The content was already right; only the .changes history was too long

Uploaded for re-review

Hello bugproxy, or anyone else affected,

Accepted rdma-core into disco-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/rdma-core/22.1-1ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-disco to verification-done-disco. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-disco. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in rdma-core (Ubuntu Disco):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-disco
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed

@David - could you verify the version in Disco proposed with your setup please?
It should be the same result that we had with the PPA before.

------- Comment From <email address hidden> 2019-05-10 13:03 EDT-------
(In reply to comment #40)
> @David - could you verify the version in Disco proposed with your setup
> please?
> It should be the same result that we had with the PPA before.

Verification complete.

I tested libibverbs1 version 22.1-1ubuntu0.1 from Disco Proposed and found that the issue was successfully resolved with this update.

Description of test:
Ran testpmd on a ppc64le system with a Mellanox Technologies MT28800 Family [ConnectX-5 Ex] adapter; the PMD correctly initialized the adapter with no errors. This was failing prior to upgrading libibverbs.

Thanks

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-05-10 13:11 EDT-------
(In reply to comment #41)

PS:
Please change the tag of the bug to "verification-done-disco"; I can't figure out how to change the tag.

bugproxy (bugproxy) on 2019-05-10
tags: added: verification-done verification-done-disco
removed: verification-needed verification-needed-disco
Frank Heimes (frank-heimes) wrote :

tags adjusted - thx David

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package rdma-core - 22.1-1ubuntu0.1

---------------
rdma-core (22.1-1ubuntu0.1) disco; urgency=medium

  * d/p/lp-1823836-clear-cmd-buffer.patch: verbs: clear cmd buffer when
    creating indirection table; fixing MLX device initialization (LP: #1823836)

 -- Christian Ehrhardt <email address hidden> Tue, 23 Apr 2019 09:29:45 +0200

Changed in rdma-core (Ubuntu Disco):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for rdma-core has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released

------- Comment From <email address hidden> 2019-06-11 06:38 EDT-------
IBM Bugzilla Status -> Fix Released for requested distro

Changed in ubuntu-z-systems:
status: New → Fix Released