Spurious valgrind errors due to memcpy replacement getting autovectorised

Bug #1434222 reported by bugproxy
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
valgrind (Ubuntu)
Fix Released
Medium
Unassigned
Trusty
Fix Released
Medium
Unassigned

Bug Description

== Comment: #0 - Anton Blanchard <email address hidden> - 2015-03-18 00:05:56 ==
I'm seeing enormous numbers of these type of errors when using valgrind:

==95540== Invalid read of size 8
==95540== at 0x408D038: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-ppc64le-linux.so)
==95540== by 0x10414F5B: mem_file_write (in /usr/bin/gdb)
==95540== by 0x10414CF3: null_file_fputs (in /usr/bin/gdb)
==95540== by 0x104160E3: fputs_unfiltered (in /usr/bin/gdb)
==95540== by 0x1040E20B: fprintf_unfiltered (in /usr/bin/gdb)

In this case I ran "valgrind gdb". The issue here is the valgrind memcpy replacement code is getting autovectorised (since we are building the package with -O3 on Ubuntu):

   0x000000000408d034: rldicr r9,r5,0,59
=> 0x000000000408d038: lxvd2x vs33,0,r9
   0x000000000408d03c: xxswapd vs33,vs33
   0x000000000408d040: vperm v13,v1,v0,v12
   0x000000000408d044: xxlor vs32,vs33,vs33
   0x000000000408d048: xxswapd vs0,vs45
   0x000000000408d04c: stxvd2x vs0,r10,r5
   0x000000000408d050: addi r5,r5,16
   0x000000000408d054: bdnz 0x408d034

In this case the source and destination are not 16B aligned, and gcc has decided to realign things via a permute. The problem is this code will always read too much data (which it just throws away). A safe optimisation, but one which confuses valgrind.

The simple fix is to override any optimise flags and build shared/vg_replace_strmem.c with -O2.

Some of the commit messages on shared/vg_replace_strmem.c, suggest we would like these loops to be autovectorised for performance, but I'm not sure if we can do that and avoid gcc tricks that read in too much data.

== Comment: #1 - William J. Schmidt <email address hidden> - 2015-03-18 09:23:23 ==
Hi Anton,

Note there is a pending fix to GCC that will avoid the realignment code for POWER8, where unaligned load cost is much lower than previously. See https://bugzilla.linux.ibm.com/show_bug.cgi?id=122395.

The current status is that the GCC trunk is closed until GCC 5 releases. Once that occurs, I will be backporting the fix to 5, 4.9, and 4.8 where it can get picked up at the next opportunity by each of the distros. We will also provide it in the next releases of the Advance Toolchain (AT7, AT8, AT9).

== Comment: #2 - David Heller <email address hidden> - 2015-03-19 01:28:25 ==
So is the short term fix to build valgrind (or at least the one module) with -O2, and is that what we want to ask Canonical to do?

== Comment: #3 - William J. Schmidt <email address hidden> - 2015-03-19 09:37:57 ==
For 15.04, yes, that would be best. The GCC schedules make it impossible for us to fix the compiler in time for 15.04.

Note that a less impactful change to the compile would be to replace -O3 with -O3 -fno-tree-vectorize. I'd predict this will still solve the problem.

We will be fixing this properly in time for 15.10, so Canonical should treat this as a one-time change.

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-122870 severity-medium targetmilestone-inin1504
Revision history for this message
Dave Heller (hellerda) wrote :

The request is to build valgrind with "-O3 -fno-tree-vectorize" as a workaround to the above problem. This should be necessary for 15.04 only. Thx.

Luciano Chavez (lnx1138)
affects: ubuntu → valgrind (Ubuntu)
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2015-04-26 16:02 EDT-------
This problem has now been fixed upstream; see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65456.

There is an open feature request for Canonical to implement this in 15.10: https://bugzilla.linux.ibm.com/show_bug.cgi?id=124329.

All required development work for this item is complete.

Revision history for this message
Matthias Klose (doko) wrote :

gcc-4.8 from trusty-updates now has the required PR target/65456 change.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package valgrind - 1:3.10.1-1ubuntu3

---------------
valgrind (1:3.10.1-1ubuntu3) wily; urgency=medium

  * Build with GCC fixing PR target/65456. LP: #1434222.
  * Fix DCACHEBSIZE and HWCAP2 issues. LP: #1428002.
  * Fix bash command completion to respect end of options argument (--).
    LP: #840467.

 -- Matthias Klose <email address hidden> Wed, 15 Jul 2015 09:48:13 +0200

Changed in valgrind (Ubuntu):
status: New → Fix Released
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-09-30 23:29 EDT-------
This bug has reappeared in Wily. Either the gcc patch didn't make it in, or the patch is not sufficient.

Will: could you please take a look?

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-10-01 01:46 EDT-------
Assuming the upstream GCC 5 code was used, as I'm sure it was, the patches were definitely present.

I think the problem is that Ubuntu is still being built with -mcpu=power7 -mtune=power8. That will cause the alignment optimization to be missed; since unaligned loads perform poorly on Power7, gcc favors the realignment code that confuses valgrind.

My understanding is that Ubuntu 16.04 will move to using -mcpu=power8, at which point this should no longer be an issue.

So unfortunately I believe we need to carry over the -O3 -fno-tree-vectorize workaround for shared/vg_replace_strmem.c in 15.10. This workaround can then be removed for 16.04.

Matthias, could you please change the build rules to implement the circumvention?

Thanks!

Bill

Revision history for this message
Chris J Arges (arges) wrote : Please test proposed package

Hello bugproxy, or anyone else affected,

Accepted valgrind into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/valgrind/1:3.10.1-1ubuntu3~14.04 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in valgrind (Ubuntu Trusty):
status: New → Fix Committed
tags: added: verification-needed
bugproxy (bugproxy)
tags: added: verification-done
removed: verification-needed
Mathew Hodson (mhodson)
Changed in valgrind (Ubuntu):
importance: Undecided → Medium
Changed in valgrind (Ubuntu Trusty):
importance: Undecided → Medium
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package valgrind - 1:3.10.1-1ubuntu3~14.04

---------------
valgrind (1:3.10.1-1ubuntu3~14.04) trusty-proposed; urgency=medium

  * SRU. Backport the 1.10 series to 14.04 for ppc64el support. LP: #1386524.
  * Also fixes segfault on ppc64el, LP: #1377796.

valgrind (1:3.10.1-1ubuntu3) wily; urgency=medium

  * Build with GCC fixing PR target/65456. LP: #1434222.
  * Fix DCACHEBSIZE and HWCAP2 issues. LP: #1428002.
  * Fix bash command completion to respect end of options argument (--).
    LP: #840467.

valgrind (1:3.10.1-1ubuntu2) vivid; urgency=medium

  * Build for glibc-2.21. LP: #1435261.

valgrind (1:3.10.1-1ubuntu1) vivid; urgency=medium

  * Merge with Debian; remaining changes:
    - Remove valgrind-mpi package from Ubuntu and any references to it, as
      mpi-default-dev is in universe for the time being.
    - Lower over-inflated valgrind-dbg Recommends to Suggests instead.
    - Don't strip vgpreload* on ARM; this results in unusable stack traces
      without valgrind-dbg.
    - Configure with --enable-only64bit on AArch64.

 -- Matthias Klose <email address hidden> Wed, 15 Jul 2015 09:37:42 +0200

Changed in valgrind (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Chris J Arges (arges) wrote : Update Released

The verification of the Stable Release Update for valgrind has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.