Activity log for bug #1823836

Date Who What changed Old value New value Message
2019-04-09 05:09:24 bugproxy bug added bug
2019-04-09 05:09:26 bugproxy tags architecture-ppc64le bugnameltc-176743 severity-high targetmilestone-inin---
2019-04-09 05:09:29 bugproxy attachment added strace output of testpmd showing error. https://bugs.launchpad.net/bugs/1823836/+attachment/5254393/+files/testpmdstrace.out
2019-04-09 05:09:31 bugproxy ubuntu: assignee Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
2019-04-09 05:09:35 bugproxy affects ubuntu dpdk (Ubuntu)
2019-04-09 05:20:58 Frank Heimes bug task added ubuntu-power-systems
2019-04-09 05:21:12 Frank Heimes ubuntu-power-systems: importance Undecided High
2019-04-09 05:21:33 Frank Heimes ubuntu-power-systems: assignee Canonical Server Team (canonical-server)
2019-04-10 12:20:42 Christian Ehrhardt bug added subscriber Thomas Monjalon
2019-04-10 12:44:38 Frank Heimes ubuntu-power-systems: status New Triaged
2019-04-12 18:24:51 Terry Rudd bug added subscriber Terry Rudd
2019-04-15 14:18:15 Andrew Cloke ubuntu-power-systems: status Triaged In Progress
2019-04-18 21:59:41 bugproxy tags architecture-ppc64le bugnameltc-176743 severity-high targetmilestone-inin--- architecture-all bugnameltc-176743 severity-high targetmilestone-inin---
2019-04-23 07:24:55 Christian Ehrhardt nominated for series Ubuntu Disco
2019-04-23 07:24:55 Christian Ehrhardt bug task added dpdk (Ubuntu Disco)
2019-04-23 07:25:00 Christian Ehrhardt dpdk (Ubuntu): status New Triaged
2019-04-23 07:25:02 Christian Ehrhardt dpdk (Ubuntu Disco): status New Triaged
2019-04-23 07:35:21 Christian Ehrhardt merge proposal linked https://code.launchpad.net/~paelzer/ubuntu/+source/rdma-core/+git/rdma-core/+merge/366380
2019-04-23 07:43:46 Christian Ehrhardt  description == Comment: #0 - David J. Wilder - 2019-04-05 12:44:56 == ---Problem Description--- dpdk-testpmd is failing in net_mlx5. /usr/bin/dpdk-testpmd \ -w 0000:01:00.0 \ -l 0-3 \ -n 4 -- \ -i -a EAL: Detected 128 lcore(s) EAL: Detected 2 NUMA nodes EAL: Multi-process socket /var/run/dpdk/rte/mp_socket EAL: No free hugepages reported in hugepages-2048kB EAL: Probing VFIO support... EAL: VFIO support initialized EAL: PCI device 0000:01:00.0 on NUMA socket 0 EAL: probe driver: 15b3:1019 net_mlx5 net_mlx5: probe of PCI device 0000:01:00.0 aborted after encountering an error: Unknown error -95 testpmd: No probed ethernet devices Interactive-mode selected Auto-start selected testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0 testpmd: preferred mempool ops selected: ring_mp_mc Done Start automatic packet forwarding io packet forwarding - ports=0 - cores=0 - streams=0 - NUMA support enabled, MP allocation mode: native io packet forwarding packets/burst=32 nb forwarding cores=1 - nb forwarding ports=0 Contact Information = David Wilder/wilder@us.ibm.com ---uname output--- Linux ltc17u31 5.0.0-8-generic #9-Ubuntu SMP Tue Mar 12 21:59:39 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux Machine Type = 9006-22P Boston ---Debugger--- A debugger is not configured ---Steps to Reproduce--- Installed 19.04 (ppc64le) Installed dpdk and dpdk-dev ---- run dpdk-testpmd /usr/bin/dpdk-testpmd \ -w 0000:01:00.0 \ -l 0-3 \ -n 4 -- \ -i -a EAL: Detected 128 lcore(s) EAL: Detected 2 NUMA nodes EAL: Multi-process socket /var/run/dpdk/rte/mp_socket EAL: No free hugepages reported in hugepages-2048kB EAL: Probing VFIO support... 
EAL: VFIO support initialized EAL: PCI device 0000:01:00.0 on NUMA socket 0 EAL: probe driver: 15b3:1019 net_mlx5 net_mlx5: probe of PCI device 0000:01:00.0 aborted after encountering an error: Unknown error -95 testpmd: No probed ethernet devices Interactive-mode selected Auto-start selected testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0 testpmd: preferred mempool ops selected: ring_mp_mc Done Start automatic packet forwarding io packet forwarding - ports=0 - cores=0 - streams=0 - NUMA support enabled, MP allocation mode: native io packet forwarding packets/burst=32 nb forwarding cores=1 - nb forwarding ports=0 Userspace tool common name: testpmd The userspace tool has the following bit modes: 64-bit Userspace rpm: dpdk-dev/disco,now 18.11-6 ppc64el == Comment: #1 - David J. Wilder - 2019-04-05 12:45:35 == # lspci -vvv -s 0000:01:00.0 0000:01:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] Subsystem: IBM MT28800 Family [ConnectX-5 Ex] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0 Interrupt: pin A routed to IRQ 24 NUMA node: 0 Region 0: Memory at 6000800000000 (64-bit, prefetchable) [size=512M] [virtual] Expansion ROM at 600c000000000 [disabled] [size=1M] Capabilities: [60] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset- MaxPayload 512 bytes, MaxReadReq 512 bytes DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend- LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+ 
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk- ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 16GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+, LTR-, OBFF Disabled LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+ EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest- Capabilities: [48] Vital Product Data Product Name: PCIe4 2-port 100Gb EDR Adapter x16 Read-only fields: [PN] Part number: 00WT174 [EC] Engineering changes: P40094 [VF] Vendor specific: 00WT176 [SN] Serial number: YA50YF7CE0V3 [Z0] Unknown: 49 42 4d 30 30 30 30 30 30 30 30 30 32 [VC] Vendor specific: EC64 [MN] Manufacture ID: 37 35 30 58 30 39 31 37 32 35 33 30 38 37 20 [VH] Vendor specific: 2CF2 [VK] Vendor specific: ipzSeries [RV] Reserved: checksum good, 0 byte(s) reserved End Capabilities: [9c] MSI-X: Enable+ Count=256 Masked- Vector table: BAR=0 offset=00002000 PBA: BAR=0 offset=00003000 Capabilities: [c0] Vendor Specific Information: Len=18 <?> Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold+) Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME- Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 
04, GenCap+ CGenEn+ ChkCap+ ChkEn+ Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI) ARICap: MFVC- ACS-, Next Function: 1 ARICtl: MFVC- ACS-, Function Group: 0 Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV) IOVCap: Migration-, Interrupt Message Number: 000 IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+ IOVSta: Migration- Initial VFs: 8, Total VFs: 8, Number of VFs: 0, Function Dependency Link: 00 VF offset: 2, stride: 1, Device ID: 101a Supported Page Size: 000007ff, System Page Size: 00000010 Region 0: Memory at 0006000000000000 (64-bit, prefetchable) VF Migration: offset: 00000000, BIR: 0 Capabilities: [1c0 v1] #19 Capabilities: [230 v1] Access Control Services ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans- ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans- Capabilities: [320 v1] #27 Capabilities: [370 v1] #26 Capabilities: [420 v1] #25 Kernel driver in use: mlx5_core Kernel modules: mlx5_core == Comment: #2 - David J. Wilder - 2019-04-05 12:54:17 == Building from git://dpdk.org/dpdk tag=v18.11 in the same environment also shows the same error. == Comment: #4 - David J. Wilder - 2019-04-05 12:56:25 == Testing dpdk on beta 19.04 is showing an error with Mellanox Technologies MT28800 Family [ConnectX-5 Ex] ethernet controller. == Comment: #6 - David J. Wilder - 2019-04-05 13:35:12 == Chasing the source of the error. gdb dpdk/ppc_64-power8-linuxapp-gcc/app/testpmd <....> (gdb) break mlx5_ind_table_ibv_drop_new Breakpoint 1 at 0x4998e8: file /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c, line 2067. (gdb) run -w 0000:01:00.0 -l 0-3 -n 4 -- -i -a Starting program: /home/wilder/ubuntu-19.04-debug/dpdk/ppc_64-power8-linuxapp-gcc/app/testpmd -w 0000:01:00.0 -l 0-3 -n 4 -- -i -a [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/powerpc64le-linux-gnu/libthread_db.so.1". 
EAL: Detected 128 lcore(s) EAL: Detected 2 NUMA nodes [New Thread 0x7ffff795dc90 (LWP 117018)] EAL: Multi-process socket /var/run/dpdk/rte/mp_socket [New Thread 0x7ffff714dc90 (LWP 117019)] EAL: No free hugepages reported in hugepages-2048kB EAL: Probing VFIO support... EAL: VFIO support initialized [New Thread 0x7ffff693dc90 (LWP 117020)] [New Thread 0x7ffff612dc90 (LWP 117021)] [New Thread 0x7ffff591dc90 (LWP 117022)] EAL: PCI device 0000:01:00.0 on NUMA socket 0 EAL: probe driver: 15b3:1019 net_mlx5 Thread 1 "testpmd" hit Breakpoint 1, 0x00000001004998e8 in mlx5_ind_table_ibv_drop_new (dev=0x100d97580 <rte_eth_devices>) at /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c:2067 2067 { (gdb) list 2062 * @return 2063 * The Verbs object initialised, NULL otherwise and rte_errno is set. 2064 */ 2065 struct mlx5_ind_table_ibv * 2066 mlx5_ind_table_ibv_drop_new(struct rte_eth_dev *dev) 2067 { 2068 struct priv *priv = dev->data->dev_private; 2069 struct mlx5_ind_table_ibv *ind_tbl; 2070 struct mlx5_rxq_ibv *rxq; 2071 struct mlx5_ind_table_ibv tmpl; (gdb) 2072 2073 rxq = mlx5_rxq_ibv_drop_new(dev); 2074 if (!rxq) 2075 return NULL; 2076 tmpl.ind_table = mlx5_glue->create_rwq_ind_table 2077 (priv->ctx, 2078 &(struct ibv_rwq_ind_table_init_attr){ 2079 .log_ind_tbl_size = 0, 2080 .ind_tbl = &rxq->wq, 2081 .comp_mask = 0, (gdb) 2082 }); 2083 if (!tmpl.ind_table) { 2084 DEBUG("port %u cannot allocate indirection table for drop" 2085 " queue", 2086 dev->data->port_id); 2087 rte_errno = errno; 2088 goto error; 2089 } 2090 ind_tbl = rte_calloc(__func__, 1, sizeof(*ind_tbl), 0); 2091 if (!ind_tbl) { (gdb) break 2084 Breakpoint 2 at 0x1004999d0: file /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c, line 2084. (gdb) cont Continuing. 
Thread 1 "testpmd" hit Breakpoint 2, mlx5_ind_table_ibv_drop_new (dev=0x100d97580 <rte_eth_devices>) at /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c:2087 2087 rte_errno = errno; (gdb) print errno $1 = 95 (gdb) ------ == Comment: #7 - David J. Wilder - 2019-04-05 18:53:33 == Interesting excerpt from strace: write(1, "mlx5_glue_create_rwq_ind_table: "..., 65) = 65 ioctl(23, RDMA_VERBS_IOCTL, 0x7fffe3966c70) = -1 EOPNOTSUPP (Operation not supported) == Comment: #8 - David J. Wilder <wilder@us.ibm.com> - 2019-04-05 21:05:21 == ConnectX-5 Firmware version: # mstflint -d 0000:01:00.0 q Image type: FS4 FW Version: 16.23.1020 FW Release Date: 10.7.2018 Product Version: 16.23.1020 Description: UID GuidsNumber Base GUID: ec0d9a0300cab17c 4 Base MAC: ec0d9acab17c 4 Image VSD: N/A Device VSD: N/A PSID: IBM0000000020 Security Attributes: N/A
[Impact]
* A missing memset can make rdma (users) use uninitialized memory. In the reported case this caused a failure to initialize DPDK devices on ppc64, but it could affect almost anything else using the cmd buffers.
* The patch is already in the v22 stable branch (backported and intended to be in v22.2 once released).
[Test Case]
* So far the only known way to trigger this is to run a ConnectX-5 card on ppc64 (POWER9) and try to initialize it, e.g. $ /usr/bin/dpdk-testpmd -i -a
  This requires special hardware, but since the patch is a simple one-liner, that should not be a concern for the SRU.
[Regression Potential]
* Without the memset the buffer would hold random memory. I could imagine a lucky case that ran despite this issue, but I cannot imagine anything "relying" on the memory not being set to zero (unless stealing data was your use case).
[Other Info]
* n/a
---- original bug report ----
== Comment: #0 - David J. Wilder - 2019-04-05 12:44:56 == ---Problem Description--- dpdk-testpmd is failing in net_mlx5.
/usr/bin/dpdk-testpmd \ -w 0000:01:00.0 \ -l 0-3 \ -n 4 -- \ -i -a EAL: Detected 128 lcore(s) EAL: Detected 2 NUMA nodes EAL: Multi-process socket /var/run/dpdk/rte/mp_socket EAL: No free hugepages reported in hugepages-2048kB EAL: Probing VFIO support... EAL: VFIO support initialized EAL: PCI device 0000:01:00.0 on NUMA socket 0 EAL: probe driver: 15b3:1019 net_mlx5 net_mlx5: probe of PCI device 0000:01:00.0 aborted after encountering an error: Unknown error -95 testpmd: No probed ethernet devices Interactive-mode selected Auto-start selected testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0 testpmd: preferred mempool ops selected: ring_mp_mc Done Start automatic packet forwarding io packet forwarding - ports=0 - cores=0 - streams=0 - NUMA support enabled, MP allocation mode: native   io packet forwarding packets/burst=32   nb forwarding cores=1 - nb forwarding ports=0 Contact Information = David Wilder/wilder@us.ibm.com ---uname output--- Linux ltc17u31 5.0.0-8-generic #9-Ubuntu SMP Tue Mar 12 21:59:39 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux Machine Type = 9006-22P Boston ---Debugger--- A debugger is not configured ---Steps to Reproduce---  Installed 19.04 (ppc64le) Installed dpdk and dpdk-dev ---- run dpdk-testpmd /usr/bin/dpdk-testpmd \ -w 0000:01:00.0 \ -l 0-3 \ -n 4 -- \ -i -a EAL: Detected 128 lcore(s) EAL: Detected 2 NUMA nodes EAL: Multi-process socket /var/run/dpdk/rte/mp_socket EAL: No free hugepages reported in hugepages-2048kB EAL: Probing VFIO support... 
EAL: VFIO support initialized EAL: PCI device 0000:01:00.0 on NUMA socket 0 EAL: probe driver: 15b3:1019 net_mlx5 net_mlx5: probe of PCI device 0000:01:00.0 aborted after encountering an error: Unknown error -95 testpmd: No probed ethernet devices Interactive-mode selected Auto-start selected testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0 testpmd: preferred mempool ops selected: ring_mp_mc Done Start automatic packet forwarding io packet forwarding - ports=0 - cores=0 - streams=0 - NUMA support enabled, MP allocation mode: native   io packet forwarding packets/burst=32   nb forwarding cores=1 - nb forwarding ports=0 Userspace tool common name: testpmd The userspace tool has the following bit modes: 64-bit Userspace rpm: dpdk-dev/disco,now 18.11-6 ppc64el == Comment: #1 - David J. Wilder - 2019-04-05 12:45:35 == # lspci -vvv -s 0000:01:00.0 0000:01:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]  Subsystem: IBM MT28800 Family [ConnectX-5 Ex]  Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+  Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-  Latency: 0  Interrupt: pin A routed to IRQ 24  NUMA node: 0  Region 0: Memory at 6000800000000 (64-bit, prefetchable) [size=512M]  [virtual] Expansion ROM at 600c000000000 [disabled] [size=1M]  Capabilities: [60] Express (v2) Endpoint, MSI 00   DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited    ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W   DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-    RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-    MaxPayload 512 bytes, MaxReadReq 512 bytes   DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-   LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited    ClockPM- Surprise- 
LLActRep- BwNot- ASPMOptComp+   LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-    ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-   LnkSta: Speed 16GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-   DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported   DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+, LTR-, OBFF Disabled   LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-     Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-     Compliance De-emphasis: -6dB   LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+     EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-  Capabilities: [48] Vital Product Data   Product Name: PCIe4 2-port 100Gb EDR Adapter x16   Read-only fields:    [PN] Part number: 00WT174    [EC] Engineering changes: P40094    [VF] Vendor specific: 00WT176    [SN] Serial number: YA50YF7CE0V3    [Z0] Unknown: 49 42 4d 30 30 30 30 30 30 30 30 30 32    [VC] Vendor specific: EC64    [MN] Manufacture ID: 37 35 30 58 30 39 31 37 32 35 33 30 38 37 20    [VH] Vendor specific: 2CF2    [VK] Vendor specific: ipzSeries    [RV] Reserved: checksum good, 0 byte(s) reserved   End  Capabilities: [9c] MSI-X: Enable+ Count=256 Masked-   Vector table: BAR=0 offset=00002000   PBA: BAR=0 offset=00003000  Capabilities: [c0] Vendor Specific Information: Len=18 <?>  Capabilities: [40] Power Management version 3   Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold+)   Status: D0 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-  Capabilities: [100 v1] Advanced Error Reporting   UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-   UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-   UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-   CESta: RxErr- BadTLP- BadDLLP- Rollover- 
Timeout- NonFatalErr-   CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+   AERCap: First Error Pointer: 04, GenCap+ CGenEn+ ChkCap+ ChkEn+  Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)   ARICap: MFVC- ACS-, Next Function: 1   ARICtl: MFVC- ACS-, Function Group: 0  Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)   IOVCap: Migration-, Interrupt Message Number: 000   IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+   IOVSta: Migration-   Initial VFs: 8, Total VFs: 8, Number of VFs: 0, Function Dependency Link: 00   VF offset: 2, stride: 1, Device ID: 101a   Supported Page Size: 000007ff, System Page Size: 00000010   Region 0: Memory at 0006000000000000 (64-bit, prefetchable)   VF Migration: offset: 00000000, BIR: 0  Capabilities: [1c0 v1] #19  Capabilities: [230 v1] Access Control Services   ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-   ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-  Capabilities: [320 v1] #27  Capabilities: [370 v1] #26  Capabilities: [420 v1] #25  Kernel driver in use: mlx5_core  Kernel modules: mlx5_core == Comment: #2 - David J. Wilder - 2019-04-05 12:54:17 == Building from git://dpdk.org/dpdk tag=v18.11 in the same environment also shows the same error. == Comment: #4 - David J. Wilder - 2019-04-05 12:56:25 == Testing dpdk on beta 19.04 is showing an error with Mellanox Technologies MT28800 Family [ConnectX-5 Ex] ethernet controller. == Comment: #6 - David J. Wilder - 2019-04-05 13:35:12 == Chasing the source of the error. gdb dpdk/ppc_64-power8-linuxapp-gcc/app/testpmd <....> (gdb) break mlx5_ind_table_ibv_drop_new Breakpoint 1 at 0x4998e8: file /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c, line 2067. 
(gdb) run -w 0000:01:00.0 -l 0-3 -n 4 -- -i -a Starting program: /home/wilder/ubuntu-19.04-debug/dpdk/ppc_64-power8-linuxapp-gcc/app/testpmd -w 0000:01:00.0 -l 0-3 -n 4 -- -i -a [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/powerpc64le-linux-gnu/libthread_db.so.1". EAL: Detected 128 lcore(s) EAL: Detected 2 NUMA nodes [New Thread 0x7ffff795dc90 (LWP 117018)] EAL: Multi-process socket /var/run/dpdk/rte/mp_socket [New Thread 0x7ffff714dc90 (LWP 117019)] EAL: No free hugepages reported in hugepages-2048kB EAL: Probing VFIO support... EAL: VFIO support initialized [New Thread 0x7ffff693dc90 (LWP 117020)] [New Thread 0x7ffff612dc90 (LWP 117021)] [New Thread 0x7ffff591dc90 (LWP 117022)] EAL: PCI device 0000:01:00.0 on NUMA socket 0 EAL: probe driver: 15b3:1019 net_mlx5 Thread 1 "testpmd" hit Breakpoint 1, 0x00000001004998e8 in mlx5_ind_table_ibv_drop_new (dev=0x100d97580 <rte_eth_devices>)     at /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c:2067 2067 { (gdb) list 2062 * @return 2063 * The Verbs object initialised, NULL otherwise and rte_errno is set. 
2064 */ 2065 struct mlx5_ind_table_ibv * 2066 mlx5_ind_table_ibv_drop_new(struct rte_eth_dev *dev) 2067 { 2068 struct priv *priv = dev->data->dev_private; 2069 struct mlx5_ind_table_ibv *ind_tbl; 2070 struct mlx5_rxq_ibv *rxq; 2071 struct mlx5_ind_table_ibv tmpl; (gdb) 2072 2073 rxq = mlx5_rxq_ibv_drop_new(dev); 2074 if (!rxq) 2075 return NULL; 2076 tmpl.ind_table = mlx5_glue->create_rwq_ind_table 2077 (priv->ctx, 2078 &(struct ibv_rwq_ind_table_init_attr){ 2079 .log_ind_tbl_size = 0, 2080 .ind_tbl = &rxq->wq, 2081 .comp_mask = 0, (gdb) 2082 }); 2083 if (!tmpl.ind_table) { 2084 DEBUG("port %u cannot allocate indirection table for drop" 2085 " queue", 2086 dev->data->port_id); 2087 rte_errno = errno; 2088 goto error; 2089 } 2090 ind_tbl = rte_calloc(__func__, 1, sizeof(*ind_tbl), 0); 2091 if (!ind_tbl) { (gdb) break 2084 Breakpoint 2 at 0x1004999d0: file /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c, line 2084. (gdb) cont Continuing. Thread 1 "testpmd" hit Breakpoint 2, mlx5_ind_table_ibv_drop_new (dev=0x100d97580 <rte_eth_devices>)     at /home/wilder/ubuntu-19.04-debug/dpdk/drivers/net/mlx5/mlx5_rxq.c:2087 2087 rte_errno = errno; (gdb) print errno $1 = 95 (gdb) ------ == Comment: #7 - David J. Wilder - 2019-04-05 18:53:33 == Interesting excerpt from strace: write(1, "mlx5_glue_create_rwq_ind_table: "..., 65) = 65 ioctl(23, RDMA_VERBS_IOCTL, 0x7fffe3966c70) = -1 EOPNOTSUPP (Operation not supported) == Comment: #8 - David J. Wilder <wilder@us.ibm.com> - 2019-04-05 21:05:21 == ConnectX-5 Firmware version: # mstflint -d 0000:01:00.0 q Image type: FS4 FW Version: 16.23.1020 FW Release Date: 10.7.2018 Product Version: 16.23.1020 Description: UID GuidsNumber Base GUID: ec0d9a0300cab17c 4 Base MAC: ec0d9acab17c 4 Image VSD: N/A Device VSD: N/A PSID: IBM0000000020 Security Attributes: N/A
2019-04-23 18:09:41 bugproxy tags architecture-all bugnameltc-176743 severity-high targetmilestone-inin--- architecture-all bugnameltc-176743 severity-high targetmilestone-inin1910
2019-04-25 06:50:55 Christian Ehrhardt bug task added rdma-core (Ubuntu)
2019-04-25 06:51:01 Christian Ehrhardt rdma-core (Ubuntu Disco): status New Triaged
2019-04-25 06:51:03 Christian Ehrhardt rdma-core (Ubuntu): status New Triaged
2019-04-25 06:51:06 Christian Ehrhardt dpdk (Ubuntu Disco): status Triaged Invalid
2019-04-25 06:51:09 Christian Ehrhardt dpdk (Ubuntu): status Triaged Invalid
2019-04-26 14:20:45 Launchpad Janitor rdma-core (Ubuntu): status Triaged Fix Released
2019-04-29 04:51:14 Christian Ehrhardt rdma-core (Ubuntu Disco): status Triaged In Progress
2019-05-09 11:34:23 Łukasz Zemczak rdma-core (Ubuntu Disco): status In Progress Fix Committed
2019-05-09 11:34:24 Łukasz Zemczak bug added subscriber Ubuntu Stable Release Updates Team
2019-05-09 11:34:26 Łukasz Zemczak bug added subscriber SRU Verification
2019-05-09 11:34:29 Łukasz Zemczak tags architecture-all bugnameltc-176743 severity-high targetmilestone-inin1910 architecture-all bugnameltc-176743 severity-high targetmilestone-inin1910 verification-needed verification-needed-disco
2019-05-09 12:00:48 Frank Heimes ubuntu-power-systems: status In Progress Fix Committed
2019-05-10 17:30:04 bugproxy tags architecture-all bugnameltc-176743 severity-high targetmilestone-inin1910 verification-needed verification-needed-disco architecture-all bugnameltc-176743 severity-high targetmilestone-inin1910 verification-done verification-done-disco
2019-05-20 09:15:43 Launchpad Janitor rdma-core (Ubuntu Disco): status Fix Committed Fix Released
2019-05-20 09:15:48 Łukasz Zemczak removed subscriber Ubuntu Stable Release Updates Team
2019-05-20 09:26:51 Frank Heimes ubuntu-power-systems: status Fix Committed Fix Released
2019-06-11 13:19:31 Frank Heimes bug task added ubuntu-z-systems
2019-06-11 13:19:41 Frank Heimes ubuntu-z-systems: status New Fix Released