[dpdk]rte_memcpy() moves data incorrectly on Ubuntu 18.04 on Intel Skylake.

Bug #1799397 reported by Talat Batheesh on 2018-10-23
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
DPDK
Undecided
Unassigned
dpdk (Ubuntu)
Low
Unassigned
Bionic
Undecided
Unassigned
Cosmic
Undecided
Unassigned
gcc-7 (Ubuntu)
Undecided
Unassigned

Bug Description

[Impact]

 * Crashing on certain SkyLake Chips

 * Follow upstream disabling one of the gcc options

[Test Case]

 * Part of the MRE bug 1817675 following the MRE verficiation process as
   defined there.

[Regression Potential]

 * Rebuilds with the new code using DPDK headers will be slightly slower
   (not using the feature) but avoiding the crash. The slowdown should
   be negligible for most cases and the crash avoidance outweigh this.

[Other Info]

 * n/a

---

Hi, Christian

We've recently encountered a weird issue with Ubuntu 18.04 on the Skylake
server. I can always reproduce this crash and I could narrowed it down. I guess
it could be a GCC issue.

[1] How to reproduce
- ConnectX-4Lx/ConnectX-5 with mlx5 PMD in DPDK 18.02.1
- Ubuntu 18.04 on Intel Skylake server
- gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
- Testpmd crashes when it starts to forward traffic. Easy to reproduce.
- Only happens on the Skylake server.
- DPDK 18.05 and later don't have such issue. git-bisect gives no clue.

This is because I enabled MEMPOOL_DEBUG and MLX5_DEBUG. As mempool/rte_memcpy is
inlined function, it should be affected. Now I can see the crash regardlessly -
18.02, 18.05 and 18.08.

[2] Failure point

The attached patch gives an insight of why it crashes. The following is the
result of the patch and the GDB commands.

In summary, rte_memcpy() doesn't work as expected. In __mempool_generic_put(),
there's rte_memcpy() to move the array of objects to the lcore cache. If I run
memcmp() right after rte_memcpy(dst, src, n), data in dst differs from data in
src. And it looks like some of data got shifted by a few bytes as you can see
below.

 [GDB command]
 $dst = 0x7ffff4e09ea8
 $src = 0x7fffce3fb970
 $n = 256
 x/32gx 0x7ffff4e09ea8
 x/32gx 0x7fffce3fb970
 testpmd: /home/mlnxtest/dpdk/build/include/rte_mempool.h:1140: __mempool_generic_put: Assertion `0' failed.

 Thread 4 "lcore-slave-1" received signal SIGABRT, Aborted.
 [Switching to Thread 0x7fffce3ff700 (LWP 69913)]
 (gdb) x/32gx 0x7ffff4e09ea8
 0x7ffff4e09ea8: 0x00007fffaac38ec0 0x00007fffaac38500
 0x7ffff4e09eb8: 0x00007fffaac37b40 0x00007fffaac37180
 0x7ffff4e09ec8: 0x850000007fffaac3 0x7b4000007fffaac3
 0x7ffff4e09ed8: 0x00007fffaac35440 0x00007fffaac34a80
 0x7ffff4e09ee8: 0xaac3850000007fff 0xaac37b4000007fff
 0x7ffff4e09ef8: 0x00007fffaac32d40 0x00007fffaac32380
 0x7ffff4e09f08: 0x7fffaac385000000 0x7fffaac37b400000
 0x7ffff4e09f18: 0x00007fffaac30640 0x00007fffaac2fc80
 0x7ffff4e09f28: 0x00007fffaac2f2c0 0x00007fffaac2e900
 0x7ffff4e09f38: 0x00007fffaac2df40 0x00007fffaac2d580
 0x7ffff4e09f48: 0x00007fffaac2cbc0 0x00007fffaac2c200
 0x7ffff4e09f58: 0x00007fffaac2b840 0x00007fffaac2ae80
 0x7ffff4e09f68: 0x00007fffaac2a4c0 0x00007fffaac29b00
 0x7ffff4e09f78: 0x00007fffaac29140 0x00007fffaac28780
 0x7ffff4e09f88: 0x00007fffaac27dc0 0x00007fffaac27400
 0x7ffff4e09f98: 0x00007fffaac26a40 0x00007fffaac26080
 (gdb) x/32gx 0x7fffce3fb970
 0x7fffce3fb970: 0x00007fffaac38ec0 0x00007fffaac38500
 0x7fffce3fb980: 0x00007fffaac37b40 0x00007fffaac37180
 0x7fffce3fb990: 0x00007fffaac367c0 0x00007fffaac35e00
 0x7fffce3fb9a0: 0x00007fffaac35440 0x00007fffaac34a80
 0x7fffce3fb9b0: 0x00007fffaac340c0 0x00007fffaac33700
 0x7fffce3fb9c0: 0x00007fffaac32d40 0x00007fffaac32380
 0x7fffce3fb9d0: 0x00007fffaac319c0 0x00007fffaac31000
 0x7fffce3fb9e0: 0x00007fffaac30640 0x00007fffaac2fc80
 0x7fffce3fb9f0: 0x00007fffaac2f2c0 0x00007fffaac2e900
 0x7fffce3fba00: 0x00007fffaac2df40 0x00007fffaac2d580
 0x7fffce3fba10: 0x00007fffaac2cbc0 0x00007fffaac2c200
 0x7fffce3fba20: 0x00007fffaac2b840 0x00007fffaac2ae80
 0x7fffce3fba30: 0x00007fffaac2a4c0 0x00007fffaac29b00
 0x7fffce3fba40: 0x00007fffaac29140 0x00007fffaac28780
 0x7fffce3fba50: 0x00007fffaac27dc0 0x00007fffaac27400
 0x7fffce3fba60: 0x00007fffaac26a40 0x00007fffaac26080

AFAIK, AVX512F support is disabled by default in DPDK as it is still
experimental (CONFIG_RTE_ENABLE_AVX512=n). But with gcc optimization, AVX2
version of rte_memcpy() seems to be optimized with 512b instructions. If I
disable it by adding EXTRA_CFLAGS="-mno-avx512f", then it works fine and doesn't
crash.

Do you have any idea regarding this issue or are you already aware of it?

Thanks,
Yongseok

$ git diff
diff --git a/config/common_base b/config/common_base
index ad03cf433..f512b5a88 100644
--- a/config/common_base
+++ b/config/common_base
@@ -275,8 +275,8 @@ CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
 #
 # Compile burst-oriented Mellanox ConnectX-4 & ConnectX-5 (MLX5) PMD
 #
-CONFIG_RTE_LIBRTE_MLX5_PMD=n
-CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
+CONFIG_RTE_LIBRTE_MLX5_PMD=y
+CONFIG_RTE_LIBRTE_MLX5_DEBUG=y
 CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS=n
 CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8

@@ -597,7 +597,7 @@ CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
 #
 CONFIG_RTE_LIBRTE_MEMPOOL=y
 CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE=512
-CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=n
+CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=y

 #
 # Compile Mempool drivers
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 8b1b7f7ed..9f48028d9 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -39,6 +39,7 @@
 #include <errno.h>
 #include <inttypes.h>
 #include <sys/queue.h>
+#include <assert.h>

 #include <rte_config.h>
 #include <rte_spinlock.h>
@@ -1123,6 +1124,22 @@ __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
        /* Add elements back into the cache */
        rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);

+ if(memcmp(&cache_objs[0], obj_table, sizeof(void *) * n)) {
+ printf("[GDB command] \n"
+ "$dst = %p\n"
+ "$src = %p\n"
+ "$n = %ld\n"
+ "x/%ldgx %p\n"
+ "x/%ldgx %p\n",
+ (void *)&cache_objs[0],
+ (const void *)obj_table,
+ sizeof(void *) * n,
+ sizeof(void *) * n / 8, (void *)&cache_objs[0],
+ sizeof(void *) * n / 8, (const void *)obj_table
+ );
+ assert(0);
+ }
+
        cache->len += n;

        if (cache->len >= cache->flushthresh) {

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in dpdk (Ubuntu):
status: New → Confirmed

In the upstream discussion my confusion was resolved, this is in fact no issue of the binaries we generate.
It only happens if you rebuild it yourself with AVX512f enabled.
I'll participate and track the upstream discussion in case we can help fixing the code anyway - but by the provided binaries being ok this is much lower severity than I initially thought.

Changed in dpdk (Ubuntu):
importance: Undecided → Low

copy&paste interim summary from upstream thread

Summary:
        - CPU: Intel Skylake
        - Linux environment: Ubuntu 18.04
        - Compiler: GCC 7 or 8
        - Scenario: testpmd crashes when it starts forwarding
        - Behaviour: AVX2 version of rte_memcpy() fails if optimized for AVX512
        - Context: inline rte_memcpy() is called from
                        inline rte_mempool_put_bulk(), called from
                        mlx5_tx_complete() (inline or not)
        - Analysis: AVX512 optimization changes vmovdqu to vmovdqu8

Latest status can be found in Bugzilla:
        https://bugs.dpdk.org/show_bug.cgi?id=97#c35

It seems offsets are compiled wrong with our compiler

Being a potential gcc bug I'll subscribe doko and add a bug task.

--- bad-avx512-enabled
+++ good-avx512-disabled
- vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x0]
+ vmovdqu xmm0,XMMWORD PTR [rax*8+0x0]
     vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x10],0x1
     vmovups XMMWORD PTR [rsi],xmm0
     vextracti128 XMMWORD PTR [rsi+0x10],ymm0,0x1
- vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x2]
+ vmovdqu xmm0,XMMWORD PTR [rax*8+0x20]
     vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x30],0x1
     vmovups XMMWORD PTR [rsi+0x20],xmm0
     vextracti128 XMMWORD PTR [rsi+0x30],ymm0,0x1
- vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x4]
+ vmovdqu xmm0,XMMWORD PTR [rax*8+0x40]
     vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x50],0x1
     vmovups XMMWORD PTR [rsi+0x40],xmm0
     vextracti128 XMMWORD PTR [rsi+0x50],ymm0,0x1
- vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x6]
+ vmovdqu xmm0,XMMWORD PTR [rax*8+0x60]
     vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x70],0x1
     vmovups XMMWORD PTR [rsi+0x60],xmm0
     vextracti128 XMMWORD PTR [rsi+0x70],ymm0,0x1

I know I'll need to ask for or recreate on my own a repro for Doko to take a look ... time arr

Matthias Klose (doko) wrote :

is that seen as well with gcc-7 in disco, or gcc-8 in disco?

Upstream bug is at https://bugs.dpdk.org/show_bug.cgi?id=97#c18
(Can't link it here :-/ )

They say:
- No crash seen with code generated by clang-6 or gcc-6, probably because they do not generate AVX512 instructions.
- Crash is confirmed with gcc-7 and gcc-8 when using AVX2 version of rte_memcpy.

Nobody tested the one in Disco yet, and I have no repro yet for my own.

Note: Really worth to read through all the info in that bug.

I checked the default build, as expected no breakage.
But this no more is about the default dpdk build, but a potential gcc bug lets ignore the default gcc build and use DPDK source as a test.

To enable a non portable build you would have to use the rte_machine DBE_BUILD_OPTION in dpdk.

$ DEB_BUILD_OPTIONS="parallel=8 rte_machine=native" sbuild --purge=never -Adbionic-amd64 dpdk_18.08-1~ubuntu0.18.04.5.dsc

Further is you don't have the very latest skylake you'd also need to modify a build file to set this march still.
  => mk/machine/native/rte.vars.mk

Replace -march=native with -march=skylake-avx512

Then check the built testpmd program from the static build tree (to have all in one object):
$ objdump -dM intel /build/dpdk-9ZbA0X/dpdk-18.08/debian/build/static-root/build/app/test-pmd/testpmd > /tmp/testpmd.objdump
grep -e 'vmovdqu' /tmp/testpmd.objdump | grep -e '\[rax.*0x[2468]\]' | pastebinit

-march=skylake-avx512 -mno-avx512f
=> http://paste.ubuntu.com/p/nGrfJgffbk/
-march=skylake-avx512
=> http://paste.ubuntu.com/p/zhbhFRqVjF/

I do not see the same error sequence as reported in https://bugs.dpdk.org/show_bug.cgi?id=97#c39

Could it be that:
gdb -batch -ex 'file /build/dpdk-9ZbA0X/dpdk-18.08/debian/build/static-root/build/app/test-pmd/testpmd' -ex 'set disassembly-flavor intel' -ex 'disassemble/rs mlx5_tx_descriptor_status' | less
/build/dpdk-9ZbA0X/dpdk-18.08/debian/build/static-root/include/rte_memcpy.h:
427 dst = (uint8_t *)dst + 128;
   0x000000000045a1d7 <+2311>: 48 83 ea 80 sub rdx,0xffffffffffffff80

/usr/lib/gcc/x86_64-linux-gnu/7/include/avxintrin.h:
921 return *__P;
   0x000000000045a1db <+2315>: 62 f1 fe 28 6f 04 c5 01 00 00 00 vmovdqu64 ymm0,YMMWORD PTR [rax*8+0x1]

922 }
923
924 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
925 _mm256_storeu_si256 (__m256i_u *__P, __m256i __A)
926 {
927 *__P = __A;
   0x000000000045a1e6 <+2326>: 62 f1 fe 28 7f 42 fd vmovdqu64 YMMWORD PTR [rdx-0x60],ymm0

921 return *__P;
   0x000000000045a1ed <+2333>: 62 f1 fe 28 6f 04 c5 02 00 00 00 vmovdqu64 ymm0,YMMWORD PTR [rax*8+0x2]

922 }
923
924 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
925 _mm256_storeu_si256 (__m256i_u *__P, __m256i __A)
926 {
927 *__P = __A;
   0x000000000045a1f8 <+2344>: 62 f1 fe 28 7f 42 fe vmovdqu64 YMMWORD PTR [rdx-0x40],ymm0

921 return *__P;
   0x000000000045a1ff <+2351>: 62 f1 fe 28 6f 04 c5 03 00 00 00 vmovdqu64 ymm0,YMMWORD PTR [rax*8+0x3]

Suspicious sequence is in most, functions reported to be broken - but it is not identical to what is reported upstream:

(bionic-amd64)root@Keschdeichel:/build/dpdk-9ZbA0X/dpdk-18.08# gdb -batch -ex 'file /tmp/testpmd-skylakeavx512' -ex 'set disassembly-flavor intel' -ex 'disassemble/rs mlx5_tx_burst' | grep -e grep -e 'vmovdqu.*\[rax.*0x[0-9]\]'
   0x000000000045c60b <+7243>: 62 f1 fe 28 6f 04 c5 01 00 00 00 vmovdqu64 ymm0,YMMWORD PTR [rax*8+0x1]
   0x000000000045c61d <+7261>: 62 f1 fe 28 6f 04 c5 02 00 00 00 vmovdqu64 ymm0,YMMWORD PTR [rax*8+0x2]
   0x000000000045c62f <+7279>: 62 f1 fe 28 6f 04 c5 03 00 00 00 vmovdqu64 ymm0,YMMWORD PTR [rax*8+0x3]

here with avx512f disabled

(bionic-amd64)root@Keschdeichel:/build/dpdk-9ZbA0X/dpdk-18.08# gdb -batch -ex 'file /tmp/testpmd-skylakeavx512-mnoavx512f' -ex 'set disassembly-flavor intel' -ex 'disassemble/rs mlx5_tx_burst' | grep -e grep -e 'vmovdqu.*\[rax.*0x[0-9]\]'

But since it isn't exactly the same as reported it is too unreliable and I don't have skylake+mlx5 card I can't test for it by compiling alone :-/

PPA to test newer gcc versions in Bionic
=> https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3514/+packages

But debugging turned out I seem to really need an affected machine to get any further :-/
I'll ask around if one could hand me a login.

The compilers didn't make a difference.
The versions in Disco for gcc7/8 generate the same results.

But then for me it never generated the wrong code:

Steps:
git clone git://dpdk.org/dpdk
apt build-dep dpdk
apt install gdb build-essential libmnl-dev
apt update && apt upgrade
make defconfig
vim build/.config
# switch on MLX PMDs
make -j
# safe the static linked testpmd to analyze it
cp ./build/build/app/test-pmd/testpmd /root/testpmd.native
objdump -d -M intel -gS /root/testpmd.native > /root/testpmd.native.objdump.intel

This is the upstream discussed repro, by default it builds -march=native which will select -mavx512f

$ gcc -march=native -Q --help=target | grep avx512f
  -mavx512f [enabled]

But in my case no broken vmovdqu[0-9] were generated (with none of the versions).
Upstream seems to settle on disabling -mavx512f, that is recommended to everybody building that manually for now.

An upcoming stable release might add the same in the code, but -mno-avx512f seems to do the trick for anyone else until then.

As long as this isn't reproducible I'm not spending more time for now.

17.11.5 carries http://git.dpdk.org/dpdk-stable/commit/?id=1bc8541000ff27f2ff4da3349b90cc2fd650523f
So this shall be fixed in the upcoming MRE update.

Changed in dpdk:
status: New → Fix Released
Changed in gcc-7 (Ubuntu):
status: New → Invalid
Changed in dpdk (Ubuntu):
status: Confirmed → Fix Released
Changed in dpdk (Ubuntu Bionic):
status: New → Triaged
Changed in dpdk (Ubuntu Cosmic):
status: New → Triaged
no longer affects: gcc-7 (Ubuntu Bionic)
no longer affects: gcc-7 (Ubuntu Cosmic)
description: updated

FYI - all prechecks complete - uploaded to -unapproved

Hello Talat, or anyone else affected,

Accepted dpdk into cosmic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/dpdk/17.11.5-0~ubuntu18.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-cosmic to verification-done-cosmic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-cosmic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in dpdk (Ubuntu Cosmic):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-cosmic
Brian Murray (brian-murray) wrote :

Hello Talat, or anyone else affected,

Accepted dpdk into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/dpdk/17.11.5-0~ubuntu18.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in dpdk (Ubuntu Bionic):
status: Triaged → Fix Committed
tags: added: verification-needed-bionic

As outlined in the SRU template this is covered by the MRE verification - setting this bug to verified (gating will be done by the MRE checks)

tags: added: verification-done verification-done-bionic verification-done-cosmic
removed: verification-needed verification-needed-bionic verification-needed-cosmic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dpdk - 17.11.5-0~ubuntu18.10.1

---------------
dpdk (17.11.5-0~ubuntu18.10.1) cosmic; urgency=medium

  * New upstream release 17.11.5; for a full list of changes see:
    https://doc.dpdk.org/guides-17.11/rel_notes/release_17_11.html#id4
    https://doc.dpdk.org/guides-17.11/rel_notes/release_17_11.html#id5
    Among many other fixes this closes the following bugs:
    - request to merge 17.11.5 (LP: #1817675)
    - issues with -mavx512f on recent Skylake chips (LP: #1799397)
    - Drop d/p/net-mlx5-fix-build-with-rdma-core-v19.patch which is part of
      17.11.4
  * d/p/*kni-fix-build*: fix build with kernel 5.0 (LP: #1814919)
    as preparation for a HWE kernel based on the 5.0 version of 19.04

 -- Christian Ehrhardt <email address hidden> Tue, 26 Feb 2019 12:34:12 +0100

Changed in dpdk (Ubuntu Cosmic):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for dpdk has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dpdk - 17.11.5-0~ubuntu18.04.1

---------------
dpdk (17.11.5-0~ubuntu18.04.1) bionic; urgency=medium

  * New upstream release 17.11.5; for a full list of changes see:
    https://doc.dpdk.org/guides-17.11/rel_notes/release_17_11.html#id4
    https://doc.dpdk.org/guides-17.11/rel_notes/release_17_11.html#id5
    Among many other fixes this closes the following bugs:
    - request to merge 17.11.5 (LP: #1817675)
    - issues with -mavx512f on recent Skylake chips (LP: #1799397)
    - Drop d/p/net-mlx5-fix-build-with-rdma-core-v19.patch which is part of
      17.11.4
  * d/p/*kni-fix-build*: fix build with kernel 5.0 (LP: #1814919)
    as preparation for a HWE kernel based on the 5.0 version of 19.04

 -- Christian Ehrhardt <email address hidden> Tue, 26 Feb 2019 12:34:12 +0100

Changed in dpdk (Ubuntu Bionic):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.