gcc tree optimizer generates incorrect vector load instructions for x86_64

Bug #953617 reported by Matthias Klose
This bug affects 1 person
Affects             Status        Importance  Assigned to  Milestone
Linaro GCC          Invalid       Undecided   Unassigned
gcc                 Invalid       Medium
gcc-4.6 (Debian)    Fix Released  Unknown
gcc-4.6 (Ubuntu)    Invalid       High        Unassigned
  Precise           Invalid       High        Unassigned

Bug Description

see the Debian report

In , Doko-v (doko-v) wrote :

[forwarded from http://bugs.debian.org/663654]

The following versions of gcc:
 Debian gcc-4.6.3-1,
 Debian gcc-4.4.6-14,
 Debian gcc-4.6.2-14,
 Debian gcc-4.4.6-15,
 Ubuntu 4.4.3-4ubuntu5
generate *wrong* code: aligned vector loads instead of unaligned vector loads
for the x86_64 arch. This causes the compiled code to crash with
SIGSEGV (General Protection Fault).

Bug *not* present on trunk and gcc-4.5.3-12.

Consider the following program:

        void foo(int* __restrict ia, int n){
          int i;
          for(i=0;i<n;i++){
            ia[i]=ia[i]*ia[i];
          }
        }

        int main(){
          int a[9];
          int sum=0,i;
          for(i=0;i<9;i++){
            a[i]=(i*i)%128;
          }

          foo((int*)((char*)a+2), 8);

          for(i=0;i<9;i++){
            sum+=a[i];
          }
          return sum;
        }

On x86 and x86_64, unaligned word accesses are valid
  - *((int*)<unaligned memory address>)
But x86_64 SSE has two kinds of vector move instructions
  - aligned vector move (movdqa)
  - unaligned vector move (movdqu)
Using an aligned vector move with an unaligned address
causes the application to crash.
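
To make the movdqa/movdqu distinction concrete, here is a minimal sketch using the SSE2 intrinsics from <emmintrin.h>. The helper names is_aligned16 and double4 are made up for illustration, and the sketch assumes an x86 or x86_64 target with SSE2 enabled. Compilers typically lower _mm_load_si128 to movdqa and _mm_loadu_si128 to movdqu, but the exact instruction chosen is up to the compiler.

```c
#include <emmintrin.h>   /* SSE2 intrinsics, x86 and x86_64 only */
#include <stdint.h>

/* Hypothetical helper: nonzero if p is 16-byte aligned. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: doubles four 32-bit ints at p, in place.
   Works for any alignment of p by picking the matching load/store. */
void double4(void *p) {
    __m128i v;
    if (is_aligned16(p)) {
        /* Aligned path: faults if p is not 16-byte aligned. */
        v = _mm_load_si128((const __m128i *)p);    /* typically movdqa */
        v = _mm_add_epi32(v, v);
        _mm_store_si128((__m128i *)p, v);
    } else {
        /* Unaligned path: valid for any address. */
        v = _mm_loadu_si128((const __m128i *)p);   /* typically movdqu */
        v = _mm_add_epi32(v, v);
        _mm_storeu_si128((__m128i *)p, v);
    }
}
```

Calling double4 on an address such as (char*)a+2 takes the unaligned path; forcing the aligned path on that address is what produces the General Protection Fault described here.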

When compiled with any of the following command lines:
  gcc -O3 foo.c
  g++ -O3 foo.c
  gcc -m64 -O2 -ftree-vectorize gcc_bug.c
  g++ -m64 -O2 -ftree-vectorize gcc_bug.c
gcc generates an aligned vector load
  movdqa -54(%rsp,%rax), %xmm0
instead of unaligned vector load - movdqu.

This causes the above application to crash with
SIGSEGV (General Protection Fault).

gcc-4.7 correctly generates
    movdqu -54(%rsp), %xmm0

Matthias Klose (doko)
Changed in gcc-4.6 (Ubuntu):
importance: Undecided → High
milestone: none → ubuntu-12.04-beta-2
status: New → Confirmed
Changed in gcc-4.6 (Debian):
status: Unknown → New
Changed in gcc:
importance: Unknown → Medium
status: Unknown → New
In , Jakub-gcc (jakub-gcc) wrote :

The testcase is invalid C, while x86_64/i?86 will do the expected thing of doing unaligned loads/stores silently, it won't do that in vectorized code or for atomic accesses. You need to tell the compiler that ia isn't aligned through the aligned attribute, e.g. by declaring typedef int T __attribute__((aligned (2)));
and using T *__restrict ia instead of int *__restrict ia.
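
A minimal sketch of that suggestion applied to foo() from the testcase, assuming GCC or a compatible compiler (the aligned attribute on a typedef is a GNU extension and is allowed to lower alignment there). The type name unaligned_int is made up here; the comment above calls it T.

```c
/* GNU extension: an int type that is only guaranteed 2-byte alignment.
   The name unaligned_int is hypothetical. */
typedef int unaligned_int __attribute__((aligned(2)));

/* Same loop as in the testcase. Because the pointee type now promises only
   2-byte alignment, the compiler may not assume 4-byte or 16-byte alignment
   when it vectorizes the loop. */
void foo(unaligned_int *__restrict ia, int n) {
    int i;
    for (i = 0; i < n; i++) {
        ia[i] = ia[i] * ia[i];
    }
}
```

The call site then becomes foo((unaligned_int*)((char*)a+2), 8), so the reduced alignment is visible in the pointer type instead of being hidden behind an int* cast.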

In , Deepak-ravi (deepak-ravi) wrote :

(In reply to comment #1)
> The testcase is invalid C, while x86_64/i?86 will do the expected thing of
> doing unaligned loads/stores silently, it won't do that in vectorized code or
> for atomic accesses.

Shouldn't the compiler vectorize the code _conservatively_, either by generating code to check whether the address is aligned or by generating unaligned vector load instructions? Otherwise, existing x86_64 code that passes unaligned pointers will break at -O3 with newer gcc.

Also note that this bug is triggered only when __restrict is used. If you remove __restrict, gcc generates proper code. It also works properly with gcc 4.7 (even with __restrict).

Michael Hope (michaelh1) wrote :

Is this code valid? foo() takes a pointer to int and correctly assumes that the memory is int aligned. The test code passes in an unaligned pointer.

gcc-4.7 might be wrong, or might be inlining foo() and recognising the loss of alignment.
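
For illustration, a small sketch of how a caller could check whether a pointer meets the alignment an int* is assumed to have. The helper name is_int_aligned is made up, and the sketch assumes a C11 compiler (for _Alignof) and a flat address space where converting a pointer to uintptr_t preserves the low address bits, as on x86_64.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if p satisfies the alignment requirement of int. */
int is_int_aligned(const void *p) {
    return ((uintptr_t)p % _Alignof(int)) == 0;
}
```

With int a[9] as in the test code, is_int_aligned(a) holds, while is_int_aligned((char*)a+2) does not on x86_64, where int is 4-byte aligned.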

Changed in gcc-4.6 (Debian):
status: New → Fix Released
Changed in gcc:
status: New → Invalid
Martin Pitt (pitti)
Changed in gcc-4.6 (Ubuntu Precise):
milestone: ubuntu-12.04-beta-2 → ubuntu-12.04
Michael Hope (michaelh1)
Changed in gcc-linaro:
status: New → Invalid
Matthias Klose (doko)
Changed in gcc-4.6 (Ubuntu Precise):
status: Confirmed → Invalid