Bug #852760 “valgrind false positives on gcc-generated string ro...” : Bugs : valgrind package : Ubuntu

In KDE Bug Tracking System #264936, Joost-vandevondele (joost-vandevondele) wrote on 2011-01-31:

#2

This bug report relates to two (closed invalid) bug reports in gcc bugzilla.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47522
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44183

PR47522 includes a runnable example in the first comment.

The issue appears to be that vectorization can result in code that loads elements beyond the last element of an allocated array. However, these loads only happen for unaligned data, where access to the last+1 element can't trigger a page fault or other side effects (according to my interpretation of comments by the gcc developers), and the loaded values are never used. As such, this is considered valid.

Since gcc will increasingly produce this kind of code, especially for numerical codes (essentially whenever vectorization triggers), it would be great to have valgrind deal with it somehow.
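
One way to read the page-fault argument in the comment above is as plain address arithmetic: a load whose address is a multiple of its own size cannot straddle a page boundary, so if any byte of it is mapped, all of it is. The sketch below only illustrates that arithmetic. The 4 KiB page size and the function names are assumptions made for the example, not anything taken from gcc or valgrind.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size: 4 KiB is typical on x86-64 Linux, but it is an
 * assumption of this sketch, not something stated in the thread. */
#define PAGE_SIZE 4096u

/* True if the access [addr, addr + size) touches more than one page.
 * size must be greater than zero. */
bool crosses_page(uintptr_t addr, uintptr_t size)
{
    return (addr / PAGE_SIZE) != ((addr + size - 1) / PAGE_SIZE);
}

/* True if addr is a multiple of size; size must be a power of two. */
bool is_naturally_aligned(uintptr_t addr, uintptr_t size)
{
    return (addr & (size - 1)) == 0;
}
```

For any power-of-two size that divides the page size, `is_naturally_aligned(addr, size)` implies `!crosses_page(addr, size)`. A load that is not aligned this way carries no such guarantee.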

In KDE Bug Tracking System #264936, Jseward (jseward) wrote on 2011-01-31:

#3

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47522#c4
>
> I think valgrind should simply special-case these kind of out of bounds
> checks based on the instruction that was used.

Great. Why don't you tell me then how I am supposed to differentiate
between a vector load that is deliberately out of bounds vs one that is
out of bounds by accident, so I can emit an error for the latter but
not for the former?

In KDE Bug Tracking System #264936, Joost-vandevondele (joost-vandevondele) wrote on 2011-01-31:

#4

(In reply to comment #1)

> Great. Why don't you tell me then how I am supposed to differentiate
> between a vector load that is deliberately out of bounds vs one that is
> out of bounds by accident, so I can emit an error for the latter but
> not for the former?

Hey.... I'm a user, you're the developer ;-)

I'm really not the right person to ask. I guess there are some signatures: it is a vector load with at least one element that is still part of an allocated array. Additionally, based on alignment, the 'offending load(s)' cannot cross a page boundary. Finally, the loaded byte(s) propagate as uninitialized data, but never trigger the 'used uninitialized' error. I suppose that you might get more details in the gcc bugzilla.
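
The first signature (a load with at least one element still inside the allocated block) could be sketched roughly as below. This is only an illustration of the idea. The names `load_class` and `classify_load` are hypothetical, and nothing here reflects how valgrind actually tracks heap blocks.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    LOAD_IN_BOUNDS,       /* every byte lies inside the block */
    LOAD_PARTIAL_OVERRUN, /* starts inside the block, runs past its end */
    LOAD_OUTSIDE          /* anything else; loads that start before the
                             block are lumped in here for simplicity */
} load_class;

/* Classify a load of `size` bytes at `addr` against the heap block
 * [block, block + block_len). */
load_class classify_load(uintptr_t block, size_t block_len,
                         uintptr_t addr, size_t size)
{
    uintptr_t block_end = block + block_len; /* one past the last valid byte */

    if (size == 0 || addr < block || addr >= block_end)
        return LOAD_OUTSIDE;
    if (size <= block_end - addr)
        return LOAD_IN_BOUNDS;
    return LOAD_PARTIAL_OVERRUN;
}
```

Under this sketch only `LOAD_PARTIAL_OVERRUN` would be a candidate for special treatment. Note that the invalid read reported later in this thread is an 8-byte read starting 0 bytes after a 272-byte block, so a per-instruction check like this one would classify it as `LOAD_OUTSIDE`; it is the movsd/movhpd pair taken together that straddles the end of the block.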

In KDE Bug Tracking System #264936, Jseward (jseward) wrote on 2011-01-31:

#5

Can you objdump -d the loop containing the complained-about load,
and post the results?

In KDE Bug Tracking System #264936, Joost-vandevondele (joost-vandevondele) wrote on 2011-01-31:

#6

So the valgrind message I have is:

==12860== Invalid read of size 8
==12860==    at 0x400A38: integrate_gf_npbc_ (in /data03/vondele/bugs/valgrind/a.out)
==12860==    by 0x40245B: main (in /data03/vondele/bugs/valgrind/a.out)
==12860==  Address 0x58e9e40 is 0 bytes after a block of size 272 alloc'd
==12860==    at 0x4C26C3A: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==12860==    by 0x402209: main (in /data03/vondele/bugs/valgrind/a.out)

The corresponding asm from objdump is:

0000000000400720 <integrate_gf_npbc_>:
  400720:       41 57                   push   %r15
  400722:       41 56                   push   %r14
  400724:       41 55                   push   %r13
  400726:       41 54                   push   %r12
  400728:       49 89 fc                mov    %rdi,%r12
  40072b:       31 ff                   xor    %edi,%edi
  40072d:       55                      push   %rbp
  40072e:       53                      push   %rbx
  40072f:       48 83 ec 50             sub    $0x50,%rsp
  400733:       49 63 18                movslq (%r8),%rbx
  400736:       45 8b 09                mov    (%r9),%r9d
  400739:       48 89 54 24 20          mov    %rdx,0x20(%rsp)
  40073e:       49 63 50 04             movslq 0x4(%r8),%rdx
  400742:       48 89 74 24 a0          mov    %rsi,-0x60(%rsp)
  400747:       49 63 70 08             movslq 0x8(%r8),%rsi
  40074b:       48 8b 84 24 b0 00 00    mov    0xb0(%rsp),%rax
  400752:       00
  400753:       48 83 c2 01             add    $0x1,%rdx
  400757:       48 29 da                sub    %rbx,%rdx
  40075a:       48 0f 48 d7             cmovs  %rdi,%rdx
  40075e:       48 89 54 24 f0          mov    %rdx,-0x10(%rsp)
  400763:       49 63 50 0c             movslq 0xc(%r8),%rdx
  400767:       48 8b 6c 24 f0          mov    -0x10(%rsp),%rbp
  40076c:       48 83 c2 01             add    $0x1,%rdx
  400770:       48 29 f2                sub    %rsi,%rdx
  400773:       48 0f af 54 24 f0       imul   -0x10(%rsp),%rdx
  400779:       48 85 d2                test   %rdx,%rdx
  40077c:       48 0f 49 fa             cmovns %rdx,%rdi
  400780:       48 89 da                mov    %rbx,%rdx
  400783:       48 01 db                add    %rbx,%rbx
  400786:       48 0f af ee             imul   %rsi,%rbp
  40078a:       48 89 7c 24 c0          mov    %rdi,-0x40(%rsp)
  40078f:       48 f7 da                neg    %rdx
  400792:       49 63 78 10             movslq 0x10(%r8),%rdi
  400796:       48 01 f6                add    %rsi,%rsi
  400799:       48 f7 d3                not    %rbx
  40079c:       48 f7 d6                not    %rsi
  40079f:       48 89 5c 24 b0          mov    %rbx,-0x50(%rsp)
  4007a4:       44 89 4c 24 cc          mov    %r9d,-0x34(%rsp)
  4007a9:       48 89 74 24 10          mov    %rsi,0x10(%rsp)
  4007ae:       48 8b b4 24 88 00 00    mov    0x88(%rsp),%rsi
  4007b5:       00
  4007b6:       48 29 ea                sub    %rbp,%rdx
  4007b9:       48 8b 6c 24 c0          mov    -0x40(%rsp),%rbp
  4007be:       48 8d 1c 3f             lea    (%rdi,%rdi,1),%rbx
  4007c2:       8b 36                   mov    (%rsi),%esi
  4007c4:       48 0f af ef             imul   %rdi,%rbp
  4007c8:       48 f7 d3                not    %rbx
  4007cb:       89 74 24 08             mov    %esi,0x8(%rsp)
  4007cf:       48 29 ea                sub    %rbp,%rdx
  4007d2:       41 39 f1                cmp    %esi,%r9d
  4007d5:       0f 8f 4d 05 00 00       jg     400d28 <integrate_gf_npbc_+0x608>
  4007db:       48 8b b4 24 90 00 00    mov    0x90(%rsp),%rsi
  4007e2:       00
  4007e3:       48 8b 7c 24 10          mov    0x10(%rsp),%rdi
  4007e8:       4c 8b 74 24 20          mov    0x20(%rsp),%r14
  4007ed:       8b 36                   mov    (%rsi),%esi
  4007ef:       89 74 24 04             mov    %esi,0x4(%rsp)
  4007f3:       48 8b b4 24 98 00 00    mov    0x98(%rsp),%rsi
  4007fa:       00
  4007fb:       8b 36                   mov    (%rsi),%esi
  4007fd:       89 74 24 0c             mov    %esi,0xc(%rsp)
  400801:       83 ee 01                sub    $0x1,%esi
  400804:       89 74 24 1c             mov    %esi,0x1c(%rsp)
  400808:       2b 74 24 04             sub    0x4(%rsp),%esi
  40080c:       d1 ee                   shr    %esi
  40080e:       89 74 24 2c             mov    %esi,0x2c(%rsp)
  400812:       48 63 74 24 0c          movslq 0xc(%rsp),%rsi
  400817:       44 8b 7c 24 2c          mov    0x2c(%rsp),%r15d
  40081c:       48 89 74 24 30          mov    %rsi,0x30(%rsp)
  400821:       49 63 f1                movslq %r9d,%rsi
  400824:       48 8d 5c 73 01          lea    0x1(%rbx,%rsi,2),%rbx
  400829:       48 0f af 74 24 c0       imul   -0x40(%rsp),%rsi
  40082f:       4c 8d 2c d9             lea    (%rcx,%rbx,8),%r13
  400833:       48 8b 4c 24 f0          mov    -0x10(%rsp),%rcx
  400838:       48 01 d6                add    %rdx,%rsi
  40083b:       48 8b 54 24 30          mov    0x30(%rsp),%rdx
  400840:       48 0f af 54 24 f0       imul   -0x10(%rsp),%rdx
  400846:       48 8d 14 16             lea    (%rsi,%rdx,1),%rdx
  40084a:       48 89 54 24 f8          mov    %rdx,-0x8(%rsp)
  40084f:       48 63 54 24 04          movslq 0x4(%rsp),%rdx
  400854:       48 0f af ca             imul   %rdx,%rcx
  400858:       48 8d 14 57             lea    (%rdi,%rdx,2),%rdx
  40085c:       49 8d 14 d6             lea    (%r14,%rdx,8),%rdx
  400860:       48 8d 0c 0e             lea    (%rsi,%rcx,1),%rcx
  400864:       48 89 54 24 38          mov    %rdx,0x38(%rsp)
  400869:       8b 54 24 04             mov    0x4(%rsp),%edx
  40086d:       48 89 4c 24 e0          mov    %rcx,-0x20(%rsp)
  400872:       48 89 4c 24 e8          mov    %rcx,-0x18(%rsp)
  400877:       48 8b 4c 24 30          mov    0x30(%rsp),%rcx
  40087c:       46 8d 7c 7a 01          lea    0x1(%rdx,%r15,2),%r15d
  400881:       44 89 7c 24 44          mov    %r15d,0x44(%rsp)
  400886:       48 8d 4c 4f 01          lea    0x1(%rdi,%rcx,2),%rcx
  40088b:       48 89 4c 24 48          mov    %rcx,0x48(%rsp)
  400890:       8b 5c 24 1c             mov    0x1c(%rsp),%ebx
  400894:       39 5c 24 04             cmp    %ebx,0x4(%rsp)
  400898:       ba ff ff ff 7f          mov    $0x7fffffff,%edx
  40089d:       0f 8f 51 03 00 00       jg     400bf4 <integrate_gf_npbc_+0x4d4>
  4008a3:       48 8b b4 24 a0 00 00    mov    0xa0(%rsp),%rsi
  4008aa:       00
  4008ab:       48 8b bc 24 a8 00 00    mov    0xa8(%rsp),%rdi
  4008b2:       00
  4008b3:       48 8b 4c 24 e8          mov    -0x18(%rsp),%rcx
  4008b8:       4c 8b 74 24 f0          mov    -0x10(%rsp),%r14
  4008bd:       4c 8b 5c 24 f0          mov    -0x10(%rsp),%r11
  4008c2:       4c 03 5c 24 e8          add    -0x18(%rsp),%r11
  4008c7:       8b 36                   mov    (%rsi),%esi
  4008c9:       8b 3f                   mov    (%rdi),%edi
  4008cb:       48 8b 5c 24 b0          mov    -0x50(%rsp),%rbx
  4008d0:       f2 44 0f 10 10          movsd  (%rax),%xmm10
  4008d5:       f2 44 0f 10 48 08       movsd  0x8(%rax),%xmm9
  4008db:       49 c1 e6 04             shl    $0x4,%r14
  4008df:       48 63 d6                movslq %esi,%rdx
  4008e2:       89 74 24 9c             mov    %esi,-0x64(%rsp)
  4008e6:       89 7c 24 98             mov    %edi,-0x68(%rsp)
  4008ea:       48 8d 0c 0a             lea    (%rdx,%rcx,1),%rcx
  4008ee:       f2 44 0f 10 40 10       movsd  0x10(%rax),%xmm8
  4008f4:       4c 89 74 24 88          mov    %r14,-0x78(%rsp)
  4008f9:       4c 8b 74 24 a0          mov    -0x60(%rsp),%r14
  4008fe:       4e 8d 1c 1a             lea    (%rdx,%r11,1),%r11
  400902:       49 8d 34 cc             lea    (%r12,%rcx,8),%rsi
  400906:       8b 4c 24 98             mov    -0x68(%rsp),%ecx
  40090a:       2b 4c 24 9c             sub    -0x64(%rsp),%ecx
  40090e:       48 8d 14 53             lea    (%rbx,%rdx,2),%rdx
  400912:       4c 8b 7c 24 f0          mov    -0x10(%rsp),%r15
  400917:       4c 8b 54 24 f0          mov    -0x10(%rsp),%r10
  40091c:       4c 03 54 24 e0          add    -0x20(%rsp),%r10
  400921:       48 8b 7c 24 38          mov    0x38(%rsp),%rdi
  400926:       49 c1 e3 03             shl    $0x3,%r11
  40092a:       4d 8d 74 d6 10          lea    0x10(%r14,%rdx,8),%r14
  40092f:       4c 8b 4c 24 e0          mov    -0x20(%rsp),%r9
  400934:       44 8b 44 24 2c          mov    0x2c(%rsp),%r8d
  400939:       83 c1 01                add    $0x1,%ecx
  40093c:       4d 01 ff                add    %r15,%r15
  40093f:       48 89 54 24 b8          mov    %rdx,-0x48(%rsp)
  400944:       89 cd                   mov    %ecx,%ebp
  400946:       89 4c 24 a8             mov    %ecx,-0x58(%rsp)
  40094a:       4c 89 7c 24 90          mov    %r15,-0x70(%rsp)
  40094f:       d1 ed                   shr    %ebp
  400951:       4c 89 74 24 d0          mov    %r14,-0x30(%rsp)
  400956:       8d 4c 2d 00             lea    0x0(%rbp,%rbp,1),%ecx
  40095a:       89 4c 24 ac             mov    %ecx,-0x54(%rsp)
  40095e:       03 4c 24 9c             add    -0x64(%rsp),%ecx
  400962:       89 4c 24 dc             mov    %ecx,-0x24(%rsp)
  400966:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  40096d:       00 00 00
  400970:       44 8b 7c 24 98          mov    -0x68(%rsp),%r15d
  400975:       44 39 7c 24 9c          cmp    %r15d,-0x64(%rsp)
  40097a:       66 0f 57 ed             xorpd  %xmm5,%xmm5
  40097e:       66 0f 28 c5             movapd %xmm5,%xmm0
  400982:       66 0f 28 fd             movapd %xmm5,%xmm7
  400986:       0f 8f 1e 02 00 00       jg     400baa <integrate_gf_npbc_+0x48a>
  40098c:       8b 54 24 ac             mov    -0x54(%rsp),%edx
  400990:       85 d2                   test   %edx,%edx
  400992:       0f 84 9f 03 00 00       je     400d37 <integrate_gf_npbc_+0x617>
  400998:       83 7c 24 a8 09          cmpl   $0x9,-0x58(%rsp)
  40099d:       0f 86 94 03 00 00       jbe    400d37 <integrate_gf_npbc_+0x617>
  4009a3:       66 0f 57 e4             xorpd  %xmm4,%xmm4
  4009a7:       48 8b 54 24 b8          mov    -0x48(%rsp),%rdx
  4009ac:       4c 8b 74 24 a0          mov    -0x60(%rsp),%r14
  4009b1:       48 8b 5c 24 d0          mov    -0x30(%rsp),%rbx
  4009b6:       4f 8d 3c 1c             lea    (%r12,%r11,1),%r15
  4009ba:       66 0f 28 fc             movapd %xmm4,%xmm7
  4009be:       66 0f 28 ec             movapd %xmm4,%xmm5
  4009c2:       66 44 0f 28 dc          movapd %xmm4,%xmm11
  4009c7:       49 8d 4c d6 08          lea    0x8(%r14,%rdx,8),%rcx
  4009cc:       31 d2                   xor    %edx,%edx
  4009ce:       45 31 f6                xor    %r14d,%r14d
  4009d1:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  4009d8:       f2 44 0f 10 24 16       movsd  (%rsi,%rdx,1),%xmm12
  4009de:       41 83 c6 01             add    $0x1,%r14d
  4009e2:       f2 0f 10 31             movsd  (%rcx),%xmm6
  4009e6:       66 44 0f 16 64 16 08    movhpd 0x8(%rsi,%rdx,1),%xmm12
  4009ed:       f2 41 0f 10 04 17       movsd  (%r15,%rdx,1),%xmm0
  4009f3:       66 0f 16 71 08          movhpd 0x8(%rcx),%xmm6
  4009f8:       66 41 0f 28 dc          movapd %xmm12,%xmm3
  4009fd:       f2 44 0f 10 61 10       movsd  0x10(%rcx),%xmm12
  400a03:       66 0f 28 ce             movapd %xmm6,%xmm1
  400a07:       66 41 0f 16 44 17 08    movhpd 0x8(%r15,%rdx,1),%xmm0
  400a0e:       66 44 0f 16 61 18       movhpd 0x18(%rcx),%xmm12
  400a14:       f2 0f 10 33             movsd  (%rbx),%xmm6
  400a18:       66 0f 28 d0             movapd %xmm0,%xmm2
  400a1c:       48 83 c2 10             add    $0x10,%rdx
  400a20:       66 41 0f 14 cc          unpcklpd %xmm12,%xmm1
  400a25:       66 0f 16 73 08          movhpd 0x8(%rbx),%xmm6
  400a2a:       f2 44 0f 10 63 10       movsd  0x10(%rbx),%xmm12
  400a30:       48 83 c1 20             add    $0x20,%rcx
  400a34:       66 0f 28 c6             movapd %xmm6,%xmm0
  400a38:       66 44 0f 16 63 18       movhpd 0x18(%rbx),%xmm12
  400a3e:       66 0f 28 f1             movapd %xmm1,%xmm6
  400a42:       66 0f 59 ca             mulpd  %xmm2,%xmm1
  400a46:       48 83 c3 20             add    $0x20,%rbx
  400a4a:       41 39 ee                cmp    %ebp,%r14d
  400a4d:       66 41 0f 14 c4          unpcklpd %xmm12,%xmm0
  400a52:       66 0f 59 f3             mulpd  %xmm3,%xmm6
  400a56:       66 0f 59 d8             mulpd  %xmm0,%xmm3
  400a5a:       66 0f 58 f9             addpd  %xmm1,%xmm7
  400a5e:       66 0f 59 c2             mulpd  %xmm2,%xmm0
  400a62:       66 44 0f 58 de          addpd  %xmm6,%xmm11
  400a67:       66 0f 58 eb             addpd  %xmm3,%xmm5
  400a6b:       66 0f 58 e0             addpd  %xmm0,%xmm4
  400a6f:       0f 82 63 ff ff ff       jb     4009d8 <integrate_gf_npbc_+0x2b8>
  400a75:       66 0f 28 c4             movapd %xmm4,%xmm0
  400a79:       8b 54 24 a8             mov    -0x58(%rsp),%edx
  400a7d:       66 44 0f 28 e7          movapd %xmm7,%xmm12
  400a82:       39 54 24 ac             cmp    %edx,-0x54(%rsp)
  400a86:       66 0f 15 c0             unpckhpd %xmm0,%xmm0
  400a8a:       8b 4c 24 dc             mov    -0x24(%rsp),%ecx
  400a8e:       66 45 0f 15 e4          unpckhpd %xmm12,%xmm12
  400a93:       66 0f 28 f0             movapd %xmm0,%xmm6
  400a97:       66 0f 28 c5             movapd %xmm5,%xmm0
  400a9b:       f2 0f 58 f4             addsd  %xmm4,%xmm6
  400a9f:       66 41 0f 28 e4          movapd %xmm12,%xmm4
  400aa4:       66 0f 15 c0             unpckhpd %xmm0,%xmm0
  400aa8:       66 45 0f 28 e3          movapd %xmm11,%xmm12
  400aad:       f2 0f 58 e7             addsd  %xmm7,%xmm4
  400ab1:       66 45 0f 15 e4          unpckhpd %xmm12,%xmm12
  400ab6:       66 0f 28 f8             movapd %xmm0,%xmm7
  400aba:       f2 0f 58 fd             addsd  %xmm5,%xmm7
  400abe:       66 41 0f 28 ec          movapd %xmm12,%xmm5
  400ac3:       f2 41 0f 58 eb          addsd  %xmm11,%xmm5
  400ac8:       0f 84 90 00 00 00       je     400b5e <integrate_gf_npbc_+0x43e>
  400ace:       48 63 d1                movslq %ecx,%rdx
  400ad1:       4c 8b 7c 24 b0          mov    -0x50(%rsp),%r15
  400ad6:       4a 8d 1c 0a             lea    (%rdx,%r9,1),%rbx
  400ada:       4d 8d 34 dc             lea    (%r12,%rbx,8),%r14
  400ade:       4a 8d 1c 12             lea    (%rdx,%r10,1),%rbx
  400ae2:       49 8d 54 57 01          lea    0x1(%r15,%rdx,2),%rdx
  400ae7:       4c 8b 7c 24 a0          mov    -0x60(%rsp),%r15
  400aec:       49 8d 1c dc             lea    (%r12,%rbx,8),%rbx
  400af0:       49 8d 14 d7             lea    (%r15,%rdx,8),%rdx
  400af4:       44 8b 7c 24 98          mov    -0x68(%rsp),%r15d
  400af9:       41 29 cf                sub    %ecx,%r15d
  400afc:       31 c9                   xor    %ecx,%ecx
  400afe:       4e 8d 3c fd 08 00 00    lea    0x8(,%r15,8),%r15
  400b05:       00
  400b06:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  400b0d:       00 00 00
  400b10:       f2 41 0f 10 1e          movsd  (%r14),%xmm3
  400b15:       48 83 c1 08             add    $0x8,%rcx
  400b19:       f2 0f 10 0b             movsd  (%rbx),%xmm1
  400b1d:       49 83 c6 08             add    $0x8,%r14
  400b21:       f2 0f 10 12             movsd  (%rdx),%xmm2
  400b25:       48 83 c3 08             add    $0x8,%rbx
  400b29:       f2 0f 10 42 08          movsd  0x8(%rdx),%xmm0
  400b2e:       48 83 c2 10             add    $0x10,%rdx
  400b32:       66 44 0f 28 db          movapd %xmm3,%xmm11
  400b37:       4c 39 f9                cmp    %r15,%rcx
  400b3a:       f2 0f 59 d8             mulsd  %xmm0,%xmm3
  400b3e:       f2 44 0f 59 da          mulsd  %xmm2,%xmm11
  400b43:       f2 0f 59 c1             mulsd  %xmm1,%xmm0
  400b47:       f2 0f 59 d1             mulsd  %xmm1,%xmm2
  400b4b:       f2 0f 58 fb             addsd  %xmm3,%xmm7
  400b4f:       f2 41 0f 58 eb          addsd  %xmm11,%xmm5
  400b54:       f2 0f 58 f0             addsd  %xmm0,%xmm6
  400b58:       f2 0f 58 e2             addsd  %xmm2,%xmm4
  400b5c:       75 b2                   jne    400b10 <integrate_gf_npbc_+0x3f0>
  400b5e:       f2 0f 10 4f 08          movsd  0x8(%rdi),%xmm1
  400b63:       f2 0f 10 57 18          movsd  0x18(%rdi),%xmm2
  400b68:       66 0f 28 dc             movapd %xmm4,%xmm3
  400b6c:       f2 0f 10 47 10          movsd  0x10(%rdi),%xmm0
  400b71:       f2 0f 59 da             mulsd  %xmm2,%xmm3
  400b75:       f2 0f 59 c5             mulsd  %xmm5,%xmm0
  400b79:       f2 0f 59 67 20          mulsd  0x20(%rdi),%xmm4
  400b7e:       f2 0f 59 e9             mulsd  %xmm1,%xmm5
  400b82:       f2 0f 59 f9             mulsd  %xmm1,%xmm7
  400b86:       f2 0f 59 f2             mulsd  %xmm2,%xmm6
  400b8a:       f2 41 0f 10 4d 00       movsd  0x0(%r13),%xmm1
  400b90:       f2 0f 58 eb             addsd  %xmm3,%xmm5
  400b94:       f2 0f 58 c4             addsd  %xmm4,%xmm0
  400b98:       f2 0f 58 fe             addsd  %xmm6,%xmm7
  400b9c:       f2 41 0f 59 6d 08       mulsd  0x8(%r13),%xmm5
  400ba2:       f2 0f 59 c1             mulsd  %xmm1,%xmm0
  400ba6:       f2 0f 59 f9             mulsd  %xmm1,%xmm7
  400baa:       f2 44 0f 58 d7          addsd  %xmm7,%xmm10
  400baf:       48 83 c7 20             add    $0x20,%rdi
  400bb3:       48 03 74 24 88          add    -0x78(%rsp),%rsi
  400bb8:       f2 44 0f 58 c8          addsd  %xmm0,%xmm9
  400bbd:       4c 03 5c 24 88          add    -0x78(%rsp),%r11
  400bc2:       4c 03 4c 24 90          add    -0x70(%rsp),%r9
  400bc7:       f2 44 0f 58 c5          addsd  %xmm5,%xmm8
  400bcc:       4c 03 54 24 90          add    -0x70(%rsp),%r10
  400bd1:       45 85 c0                test   %r8d,%r8d
  400bd4:       f2 44 0f 11 10          movsd  %xmm10,(%rax)
  400bd9:       f2 44 0f 11 48 08       movsd  %xmm9,0x8(%rax)
  400bdf:       f2 44 0f 11 40 10       movsd  %xmm8,0x10(%rax)
  400be5:       74 09                   je     400bf0 <integrate_gf_npbc_+0x4d0>
  400be7:       41 83 e8 01             sub    $0x1,%r8d
  400beb:       e9 80 fd ff ff          jmpq   400970 <integrate_gf_npbc_+0x250>
  400bf0:       8b 54 24 44             mov    0x44(%rsp),%edx
  400bf4:       3b 54 24 0c             cmp    0xc(%rsp),%edx
  400bf8:       0f 84 f5 00 00 00       je     400cf3 <integrate_gf_npbc_+0x5d3>
  400bfe:       48 8b 94 24 a0 00 00    mov    0xa0(%rsp),%rdx
  400c05:       00
  400c06:       48 8b 9c 24 a8 00 00    mov    0xa8(%rsp),%rbx
  400c0d:       00
  400c0e:       66 0f 57 c0             xorpd  %xmm0,%xmm0
  400c12:       8b 0a                   mov    (%rdx),%ecx
  400c14:       8b 33                   mov    (%rbx),%esi
  400c16:       66 0f 28 d0             movapd %xmm0,%xmm2
  400c1a:       66 0f 28 d8             movapd %xmm0,%xmm3
  400c1e:       39 f1                   cmp    %esi,%ecx
  400c20:       0f 8f b1 00 00 00       jg     400cd7 <integrate_gf_npbc_+0x5b7>
  400c26:       48 8b 5c 24 f8          mov    -0x8(%rsp),%rbx
  400c2b:       48 8b 7c 24 b0          mov    -0x50(%rsp),%rdi
  400c30:       48 63 d1                movslq %ecx,%rdx
  400c33:       4c 8b 74 24 a0          mov    -0x60(%rsp),%r14
  400c38:       66 0f 57 c9             xorpd  %xmm1,%xmm1
  400c3c:       29 ce                   sub    %ecx,%esi
  400c3e:       31 c9                   xor    %ecx,%ecx
  400c40:       48 8d 1c 1a             lea    (%rdx,%rbx,1),%rbx
  400c44:       48 8d 54 57 01          lea    0x1(%rdi,%rdx,2),%rdx
  400c49:       48 8d 34 f5 08 00 00    lea    0x8(,%rsi,8),%rsi
  400c50:       00
  400c51:       66 0f 28 d1             movapd %xmm1,%xmm2
  400c55:       49 8d 1c dc             lea    (%r12,%rbx,8),%rbx
  400c59:       49 8d 14 d6             lea    (%r14,%rdx,8),%rdx
  400c5d:       0f 1f 00                nopl   (%rax)
  400c60:       f2 0f 10 03             movsd  (%rbx),%xmm0
  400c64:       48 83 c1 08             add    $0x8,%rcx
  400c68:       f2 0f 10 1a             movsd  (%rdx),%xmm3
  400c6c:       48 83 c3 08             add    $0x8,%rbx
  400c70:       f2 0f 59 d8             mulsd  %xmm0,%xmm3
  400c74:       f2 0f 59 42 08          mulsd  0x8(%rdx),%xmm0
  400c79:       48 83 c2 10             add    $0x10,%rdx
  400c7d:       48 39 f1                cmp    %rsi,%rcx
  400c80:       f2 0f 58 cb             addsd  %xmm3,%xmm1
  400c84:       f2 0f 58 d0             addsd  %xmm0,%xmm2
  400c88:       75 d6                   jne    400c60 <integrate_gf_npbc_+0x540>
  400c8a:       48 8b 54 24 20          mov    0x20(%rsp),%rdx
  400c8f:       4c 8b 7c 24 48          mov    0x48(%rsp),%r15
  400c94:       f2 41 0f 10 65 00       movsd  0x0(%r13),%xmm4
  400c9a:       48 8b 4c 24 30          mov    0x30(%rsp),%rcx
  400c9f:       48 8b 5c 24 10          mov    0x10(%rsp),%rbx
  400ca4:       48 8b 74 24 20          mov    0x20(%rsp),%rsi
  400ca9:       f2 42 0f 10 04 fa       movsd  (%rdx,%r15,8),%xmm0
  400caf:       66 0f 28 d8             movapd %xmm0,%xmm3
  400cb3:       48 8d 54 4b 02          lea    0x2(%rbx,%rcx,2),%rdx
  400cb8:       f2 41 0f 59 45 08       mulsd  0x8(%r13),%xmm0
  400cbe:       f2 0f 59 dc             mulsd  %xmm4,%xmm3
  400cc2:       f2 0f 59 da             mulsd  %xmm2,%xmm3
  400cc6:       f2 0f 10 14 d6          movsd  (%rsi,%rdx,8),%xmm2
  400ccb:       f2 0f 59 c1             mulsd  %xmm1,%xmm0
  400ccf:       f2 0f 59 d4             mulsd  %xmm4,%xmm2
  400cd3:       f2 0f 59 d1             mulsd  %xmm1,%xmm2
  400cd7:       f2 0f 58 18             addsd  (%rax),%xmm3
  400cdb:       f2 0f 58 50 08          addsd  0x8(%rax),%xmm2
  400ce0:       f2 0f 58 40 10          addsd  0x10(%rax),%xmm0
  400ce5:       f2 0f 11 18             movsd  %xmm3,(%rax)
  400ce9:       f2 0f 11 50 08          movsd  %xmm2,0x8(%rax)
  400cee:       f2 0f 11 40 10          movsd  %xmm0,0x10(%rax)
  400cf3:       48 8b 7c 24 c0          mov    -0x40(%rsp),%rdi
  400cf8:       49 83 c5 10             add    $0x10,%r13
  400cfc:       48 01 7c 24 f8          add    %rdi,-0x8(%rsp)
  400d01:       48 01 7c 24 e0          add    %rdi,-0x20(%rsp)
  400d06:       44 8b 74 24 08          mov    0x8(%rsp),%r14d
  400d0b:       48 01 7c 24 e8          add    %rdi,-0x18(%rsp)
  400d10:       44 39 74 24 cc          cmp    %r14d,-0x34(%rsp)
  400d15:       74 11                   je     400d28 <integrate_gf_npbc_+0x608>
  400d17:       83 44 24 cc 01          addl   $0x1,-0x34(%rsp)
  400d1c:       e9 6f fb ff ff          jmpq   400890 <integrate_gf_npbc_+0x170>
  400d21:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  400d28:       48 83 c4 50             add    $0x50,%rsp
  400d2c:       5b                      pop    %rbx
  400d2d:       5d                      pop    %rbp
  400d2e:       41 5c                   pop    %r12
  400d30:       41 5d                   pop    %r13
  400d32:       41 5e                   pop    %r14
  400d34:       41 5f                   pop    %r15
  400d36:       c3                      retq
  400d37:       66 0f 57 e4             xorpd  %xmm4,%xmm4
  400d3b:       8b 4c 24 9c             mov    -0x64(%rsp),%ecx
  400d3f:       66 0f 28 ec             movapd %xmm4,%xmm5
  400d43:       66 0f 28 f4             movapd %xmm4,%xmm6
  400d47:       66 0f 28 fc             movapd %xmm4,%xmm7
  400d4b:       e9 7e fd ff ff          jmpq   400ace <integrate_gf_npbc_+0x3ae>

In KDE Bug Tracking System #264936, Jseward (jseward) wrote on 2011-01-31:

#7

This seems to me like a bug in gcc.  From the following analysis
(start reading at 0x400a38), the value loaded from memory is never
used -- xmm12 is completely overwritten by subsequent instructions,
either in the post-loop block, or in the first instruction of the
next iteration.

==12860== Invalid read of size 8
==12860==    at 0x400A38: integrate_gf_npbc_

# def xmm12 (low half loaded, high half zeroed)
  4009d8:       f2 44 0f 10 24 16       movsd  (%rsi,%rdx,1),%xmm12
  4009de:       41 83 c6 01             add    $0x1,%r14d
  4009e2:       f2 0f 10 31             movsd  (%rcx),%xmm6
  4009e6:       66 44 0f 16 64 16 08    movhpd 0x8(%rsi,%rdx,1),%xmm12
  4009ed:       f2 41 0f 10 04 17       movsd  (%r15,%rdx,1),%xmm0
  4009f3:       66 0f 16 71 08          movhpd 0x8(%rcx),%xmm6
  4009f8:       66 41 0f 28 dc          movapd %xmm12,%xmm3
  4009fd:       f2 44 0f 10 61 10       movsd  0x10(%rcx),%xmm12
  400a03:       66 0f 28 ce             movapd %xmm6,%xmm1
  400a07:       66 41 0f 16 44 17 08    movhpd 0x8(%r15,%rdx,1),%xmm0
  400a0e:       66 44 0f 16 61 18       movhpd 0x18(%rcx),%xmm12
  400a14:       f2 0f 10 33             movsd  (%rbx),%xmm6
  400a18:       66 0f 28 d0             movapd %xmm0,%xmm2
  400a1c:       48 83 c2 10             add    $0x10,%rdx
  400a20:       66 41 0f 14 cc          unpcklpd %xmm12,%xmm1
  400a25:       66 0f 16 73 08          movhpd 0x8(%rbx),%xmm6
  400a2a:       f2 44 0f 10 63 10       movsd  0x10(%rbx),%xmm12
  400a30:       48 83 c1 20             add    $0x20,%rcx
  400a34:       66 0f 28 c6             movapd %xmm6,%xmm0

# load high half xmm12 (error reported here).  low half unchanged.
  400a38:       66 44 0f 16 63 18       movhpd 0x18(%rbx),%xmm12
  400a3e:       66 0f 28 f1             movapd %xmm1,%xmm6
  400a42:       66 0f 59 ca             mulpd  %xmm2,%xmm1
  400a46:       48 83 c3 20             add    $0x20,%rbx
  400a4a:       41 39 ee                cmp    %ebp,%r14d

# reads low half xmm12 only
  400a4d:       66 41 0f 14 c4          unpcklpd %xmm12,%xmm0
  400a52:       66 0f 59 f3             mulpd  %xmm3,%xmm6
  400a56:       66 0f 59 d8             mulpd  %xmm0,%xmm3
  400a5a:       66 0f 58 f9             addpd  %xmm1,%xmm7
  400a5e:       66 0f 59 c2             mulpd  %xmm2,%xmm0
  400a62:       66 44 0f 58 de          addpd  %xmm6,%xmm11
  400a67:       66 0f 58 eb             addpd  %xmm3,%xmm5
  400a6b:       66 0f 58 e0             addpd  %xmm0,%xmm4
  400a6f:       0f 82 63 ff ff ff       jb     4009d8 # (loop head)

  400a75:       66 0f 28 c4             movapd %xmm4,%xmm0
  400a79:       8b 54 24 a8             mov    -0x58(%rsp),%edx

# def xmm12 (overwrite both halves)
  400a7d:       66 44 0f 28 e7          movapd %xmm7,%xmm12

In KDE Bug Tracking System #264936, Jakub Jelinek (jakub-redhat) wrote on 2011-01-31:

#8

Similar testcase is gcc's own libcpp/lex.c optimization, which also can access a few bytes after malloced area, as long as at least one byte in the value read is from within the malloced area. See search_line_* routines in lex.c, not just SSE4.2/SSE2, but also even the generic C version actually does this.
I guess valgrind could mark somehow the extra bytes as undefined content and propagate it through following arithmetic instructions, complain only if some conditional jump was made solely on the undefined bits or if the undefined bits were stored somewhere (or similar heuristics).

In KDE Bug Tracking System #264936, Joost-vandevondele (joost-vandevondele) wrote on 2011-01-31:

#9

(In reply to comment #5)
> This seems to me like a bug in gcc.

Unfortunately, I'm an asm novice, so I can't tell. I see Jakub is on the CC as well, so maybe he can judge?

Alternatively, I can reopen
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47522
and refer here?

In KDE Bug Tracking System #264936, Jseward (jseward) wrote on 2011-01-31:

#10

(In reply to comment #6)
> Similar testcase is gcc's own libcpp/lex.c optimization, which also can access
> a few bytes after malloced area, as long as at least one byte in the value read
> is from within the malloced area.

Those loops are (effectively) vectorised while loops, in which you use
standard carry-chain propagation tricks to ensure that the stopping
condition for the loop does not rely on the data from beyond the malloced
area. It is not possible to vectorise them without such over-reading.

By contrast, Joost's loop (and anything gcc can vectorise) are countable
loops: the trip count is known (at run time) before the loop begins. It
is always possible to vectorise such a loop without generating memory
over reads, by having a vector loop to do (trip_count / vector_width)
iterations, and a scalar fixup loop to do the final (trip_count % vector_width)
iterations.
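
To make the scheme concrete, here is a minimal sketch in C of a countable loop split into a main part and a scalar fixup part. It is an illustration only, not gcc output: the function name sum_no_overread and the width of 2 (two doubles per 16-byte SSE2 register) are assumptions, and each "vector" step is written as two scalar adds.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n doubles without ever reading a[n] or beyond.
 * Hypothetical helper: the name and the width of 2 are assumptions. */
double sum_no_overread(const double *a, size_t n)
{
    const size_t width = 2;                 /* lanes per "vector" step */
    const size_t vec_iters = n / width;     /* trip_count / vector_width */
    double lane0 = 0.0, lane1 = 0.0;
    size_t i = 0;

    assert(a != NULL || n == 0);

    /* Main loop: only whole groups of `width` elements are touched. */
    for (size_t k = 0; k < vec_iters; k++, i += width) {
        lane0 += a[i];
        lane1 += a[i + 1];
    }

    double sum = lane0 + lane1;

    /* Scalar fixup loop: the remaining (n % width) elements. */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

With n = 5 the main loop runs twice (elements 0 to 3) and the fixup loop handles element 4, so no load ever reaches a[5].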

> I guess valgrind could mark somehow the extra bytes as undefined content and
> propagate it through following arithmetic instructions, complain only if some
> conditional jump was made solely on the undefined bits or if the undefined bits
> were stored somewhere (or similar heuristics).

Well, maybe .. but Memcheck is too slow already. I don't want to junk it up
with expensive and complicated heuristics that are irrelevant for 99.9% of
the loads it will encounter.

If you can show me some way to identify just the loads that need special
treatment, then maybe. I don't see how to identify them, though.

In Red Hat Bugzilla #678518, Kamil (kamil-redhat-bugs) wrote on 2011-02-18:

#20

Version-Release number of selected component (if applicable):
libidn-1.19-2.fc15.x86_64

How reproducible:
100 %

Steps to Reproduce:
1. run the attached reproducer

Additional info:
The bug breaks builds of rawhide curl:

http://koji.fedoraproject.org/koji/getfile?taskID=2846952&name=build.log

In Red Hat Bugzilla #678518, Kamil (kamil-redhat-bugs) wrote on 2011-02-18:

#21

Created attachment 479480
a reproducer

In Red Hat Bugzilla #678518, Kamil (kamil-redhat-bugs) wrote on 2011-02-18:

#22

$ curl -JO 'https://bugzilla.redhat.com/attachment.cgi?id=479480'
$ sh bz678518.c

Invalid read of size 4
   at 0x4E33791: idna_to_ascii_4z (idna.c:514)
   by 0x4E33A34: idna_to_ascii_8z (idna.c:565)
   by 0x4E33A98: idna_to_ascii_lz (idna.c:597)
   by 0x4005A3: main (bz678518.c:15)
Address 0x5405fcc is 12 bytes inside a block of size 15 alloc'd
   at 0x4C284F2: realloc (vg_replace_malloc.c:525)
   by 0x4E3380A: idna_to_ascii_4z (idna.c:514)
   by 0x4E33A34: idna_to_ascii_8z (idna.c:565)
   by 0x4E33A98: idna_to_ascii_lz (idna.c:597)
   by 0x4005A3: main (bz678518.c:15)

In Red Hat Bugzilla #678518, Kamil (kamil-redhat-bugs) wrote on 2011-02-18:

#23

*** Bug 678521 has been marked as a duplicate of this bug. ***

In Red Hat Bugzilla #678518, Miroslav (miroslav-redhat-bugs) wrote on 2011-02-18:

#24

Looks like a gcc bug, perhaps a wrong assumption about alignment of the buffer returned by realloc?

Reproducer:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main() {
  char *x;

  x = malloc(4);
  x = realloc(x, 15);
  sprintf(x, "www.xn--4cab6c");
  return strlen(x);
}

Can this cause problems outside valgrind?

In Red Hat Bugzilla #678518, Jakub (jakub-redhat-bugs) wrote on 2011-02-18:

#25

No, this is yet another case where valgrind just doesn't understand common programming techniques.
Here, gcc knows it can assume that x from the realloc is 2 * sizeof (void *) bytes aligned, therefore it implements strlen with this tuning as an inlined loop which reads a word at a time and looks for zero bytes in it.
It knows it won't crash if there is a zero byte before end of cliff, so it doesn't matter if it reads 1-7 extra bytes after malloced block when the access is aligned.
See e.g. https://bugs.kde.org/show_bug.cgi?id=264936
but right now it is not anywhere near a fix upstream.
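
For readers unfamiliar with the technique, the following is a rough sketch in C of a word-at-a-time strlen of the kind described above. It is not the code gcc emits. The name wordwise_strlen is made up, and the sketch assumes the caller passes an 8-byte-aligned buffer that is readable up to the next 8-byte boundary after the terminating NUL (in ISO C, reading past the end of an allocation is undefined, so the sketch is only well defined under that assumption).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: scan 8 bytes per step, looking for a zero byte.
 * Assumes s is 8-byte aligned and readable in whole 8-byte words up to
 * and including the word that holds the terminating NUL. */
size_t wordwise_strlen(const char *s)
{
    const uint64_t ones  = 0x0101010101010101ULL;
    const uint64_t highs = 0x8080808080808080ULL;
    size_t i = 0;

    for (;;) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);      /* one aligned 8-byte load */

        /* Non-zero exactly when at least one byte of w is zero. */
        if ((w - ones) & ~w & highs) {
            while (s[i] != '\0')          /* locate the NUL inside this word */
                i++;
            return i;
        }
        i += sizeof w;
    }
}
```

For the 15-byte buffer in the reproducer, the second 8-byte load covers offsets 8 to 15, so it touches one byte past the allocation. That is the kind of partially out-of-bounds aligned read that Memcheck flags.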

In Red Hat Bugzilla #678518, Kamil (kamil-redhat-bugs) wrote on 2011-02-18:

#26

Is there any gcc flag to disable the optimization that causes problems to valgrind while keeping all the other -O2 optimization passes?

In Red Hat Bugzilla #678518, Jakub (jakub-redhat-bugs) wrote on 2011-02-18:

#27

-fno-builtin-strlen, but you really don't want to use that, it will penalize code way too much (then even strlen ("abc") isn't folded etc.).

Just live with valgrind false positives, any time it reports a read access
with size N as M bytes inside a block of size O alloc'd
where (M % N) == 0 && M == (O & ~(N - 1)) && O > M it is suspect you might be looking at a valgrind false positive.
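
The rule of thumb above can be written as a small predicate. This is only a sketch: the function name is hypothetical, and the check that N is a power of two is an added assumption (the mask O & ~(N - 1) only rounds down to a multiple of N in that case).

```c
#include <stdbool.h>
#include <stddef.h>

/* n: size of the reported read, m: offset inside the block,
 * o: size of the allocated block (all taken from the valgrind report).
 * Returns true when the report matches the suspect pattern above. */
bool maybe_partial_load_false_positive(size_t n, size_t m, size_t o)
{
    /* Assumption: only power-of-two read sizes are meaningful here. */
    if (n == 0 || (n & (n - 1)) != 0)
        return false;

    return (m % n) == 0 && m == (o & ~(n - 1)) && o > m;
}
```

Applied to the libidn report in this bug (a read of size 4 at 12 bytes inside a block of size 15), the predicate returns true: 12 is a multiple of 4, 15 rounded down to a multiple of 4 is 12, and 15 > 12.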

In Red Hat Bugzilla #678518, Kamil (kamil-redhat-bugs) wrote on 2011-02-18:

#28

(In reply to comment #7)
> -fno-builtin-strlen, but you really don't want to use that, it will penalize
> code way too much (then even strlen ("abc") isn't folded etc.).

Thanks. Any chance to get back the 'repnz scasb'-based implementation that -O1 produces?

> Just live with valgrind false positives, any time it reports a read access
> with size N as M bytes inside a block of size O alloc'd
> where (M % N) == 0 && M == (O & ~(N - 1)) && O > M it is suspect you might be
> looking at a valgrind false positive.

That's something hard to distinguish from the regular off-by-one errors occurring in C code as long as the unrolled strlen() does not appear on the backtrace.

In Red Hat Bugzilla #678518, Jakub (jakub-redhat-bugs) wrote on 2011-02-18:

#29

Off by one errors typically would not be accessing in one read some bytes that are part of the allocation and some bytes after it, it would be a whole access to bytes after the allocation.

In Red Hat Bugzilla #678518, Kamil (kamil-redhat-bugs) wrote on 2011-02-18:

#30

That's true. I've disabled use of valgrind for the particular failing test-case for now, will look for a more generic solution if the problem spreads over more test-cases.

http://lists.fedoraproject.org/pipermail/scm-commits/2011-February/567665.html

In KDE Bug Tracking System #264936, Jakub Jelinek (jakub-redhat) wrote on 2011-02-18:

#11

Another simple testcase: https://bugzilla.redhat.com/show_bug.cgi?id=678518
I don't think 99% above is the right figure, at least with recent gcc generated code these false positives are just way too common. We disable a bunch of them in glibc through a suppression file or overloading the strops implementations,
but when gcc inlines those there is no way to get rid of the false positives.

Can't valgrind just start tracking in more details whether the bytes are actually used or not when memcheck sees a suspect read (in most cases just an aligned read where at least the first byte is still in the allocated region and perhaps some further ones aren't)? Force then retranslation of the bb it was used in or something similar?

In KDE Bug Tracking System #264936, Jseward (jseward) wrote on 2011-02-18:

#12

(In reply to comment #9)
> I don't think 99% above is the right figure, at least with recent
> gcc generated

What version of gcc?

In KDE Bug Tracking System #264936, Jakub Jelinek (jakub-redhat) wrote on 2011-02-18:

#13

In the testcase below, strlen is expanded that way by GCC 4.6 (currently used e.g. in Fedora 15) with default options; 4.5 and even earlier versions expand it the same way with -O2 -minline-all-stringops.

#include <stdlib.h>
#include <string.h>

__attribute__((noinline)) void
foo (void *p)
{
  memcpy (p, "0123456789abcd", 15);
}

int
main (void)
{
  void *p = malloc (15);
  foo (p);
  return strlen (p) - 14;
}

In KDE Bug Tracking System #264936, Jseward (jseward) wrote on 2011-02-18:

#14

I can see this problem isn't going to go away (alas); and we are
seeing similar things on icc generated code. I'll look into it,
but that won't happen for at least a couple of weeks.

Chris Bainbridge (chris-bainbridge) on 2011-09-17

tags:

added: oneiric

Bug Watch Updater (bug-watch-updater) on 2011-09-17

Changed in valgrind (ALT Linux):
importance:	Unknown → Medium
status:	Unknown → New

In KDE Bug Tracking System #264936, Patrick J. LoPresti (lopresti) wrote on 2012-02-17:

#15

Isn't this exactly the problem that "--partial-loads-ok" is meant to address? (cf. bug 294285)

http://valgrind.org/docs/manual/mc-manual.html#opt.partial-loads-ok

Alessandro Ghedini (ghedo) wrote on 2012-05-12:

#1

I can reproduce this with GCC 4.6, but not with GCC 4.7 or Clang.

In KDE Bug Tracking System #264936, Kevyn-Alexandre Pare (kapare) wrote on 2012-06-14:

#16

Could this bug be the same issue?: bug 301922

Bug Watch Updater (bug-watch-updater) on 2012-06-15

Changed in valgrind:
importance:	Unknown → Medium
status:	Unknown → New

In KDE Bug Tracking System #264936, Offringa (offringa) wrote on 2012-08-05:

#17

Hi all,

I think I'm also seeing false positives because of vectorization, which unfortunately decreases the usefulness of valgrind. Below is a minimal working example that reproduces problems with std::string. The code is basically extracted from a library I was using (casacore 1.5), and in my software it generates a lot of incorrect "invalid read"s, although the library seems to be valid (even if inheriting from string would not be my preferred solution). I hope this example is of use for evaluating the problem further.

#include <string>
#include <iostream>
#include <cstring>

#include <malloc.h>

class StringAlt : public std::string
{
public:
  StringAlt(const char *c) : std::string(c) { }
  void operator=(const char *c) { std::string::operator=(c); }
};

typedef StringAlt StringImp;
//typedef std::string StringImp; //<-- replacing prev with this also solves issue

int main(int argc, char *argv[])
{
  const char *a1 = "blaaa";
  char *a2 = strdup(a1);
  a2[2] = 0;
  StringImp s(a1);
  std::cout << "Assign A2\n";
  s = a2;
  std::cout << s << '\n';

  std::cout << "Assign A1\n";
  s = a1;
  std::cout << s << '\n';

  char *a3 = strdup(s.c_str());
  std::cout << "Assign A3\n";
  s = a3;
  std::cout << s << '\n';

  free(a2);
  free(a3);
}

Compiled with g++ Debian 4.7.1-2, with "-O2" or "-O3" results in the error below. With "-O0", it works fine. Changing the order of statements can also cause the error to disappear, which makes it very hard to debug. Output:

Assign A2
bl
Assign A1
blaaa
Assign A3
==20872== Invalid read of size 4
==20872==    at 0x400C5C: main (in /home/anoko/projects/test/test)
==20872==  Address 0x59550f4 is 4 bytes inside a block of size 6 alloc'd
==20872==    at 0x4C28BED: malloc (vg_replace_malloc.c:263)
==20872==    by 0x564D911: strdup (strdup.c:43)
==20872==    by 0x400C46: main (in /home/anoko/projects/test/test)
==20872==
blaaa

In KDE Bug Tracking System #264936, Kamil Dudka (kdudka) wrote on 2012-08-05:

#18

(In reply to comment #15)
Try rebuilding the library with -fno-builtin-strdup; chances are it will make valgrind work again.

In KDE Bug Tracking System #264936, Offringa (offringa) wrote on 2012-08-05:

#19

-fno-builtin-strdup does indeed get rid of the valgrind message.

In Red Hat Bugzilla #678518, Fedora (fedora-redhat-bugs) wrote on 2013-04-03:

#31

This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could affect also pre-Fedora 19 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

In Red Hat Bugzilla #678518, Stef (stef-redhat-bugs) wrote on 2014-01-31:

#32

This also affects p11-kit on both Fedora 19 and Fedora 20.

Running 'make memcheck' target for valgrind on p11-kit exposes this bug with issues like this:

FAIL: test-token
==4047== Invalid read of size 4
==4047== at 0x40EA07: on_pem_block (parser.c:616)
==4047== by 0x41079D: p11_pem_parse (pem.c:228)
==4047== by 0x40DE38: parse_pem_certificates (parser.c:665)
==4047== by 0x40F120: p11_parse_memory (parser.c:753)
==4047== Address 0x54afc20 is 16 bytes inside a block of size 18 alloc'd
==4047== at 0x4A081D4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4047== by 0x40E9D8: on_pem_block (parser.c:612)
==4047== by 0x41079D: p11_pem_parse (pem.c:228)
==4047== by 0x40DE38: parse_pem_certificates (parser.c:665)

In Red Hat Bugzilla #678518, Fedora (fedora-redhat-bugs) wrote on 2015-01-09:

#33

This message is a notice that Fedora 19 is now at end of life. Fedora
has stopped maintaining and issuing updates for Fedora 19. It is
Fedora's policy to close all bug reports from releases that are no
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 19 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

In Red Hat Bugzilla #678518, Fedora (fedora-redhat-bugs) wrote on 2015-02-17:

#34

Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

In Red Hat Bugzilla #678518, Kamil (kamil-redhat-bugs) wrote on 2015-02-17:

#35

The bug still exists in Fedora 21:

$ rpm -q valgrind
valgrind-3.10.1-1.fc21.x86_64

$ curl -JO 'https://bugzilla.redhat.com/attachment.cgi?id=479480'
curl: Saved to filename 'bz678518.c'

$ sh bz678518.c
==543== Memcheck, a memory error detector
==543== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==543== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==543== Command: ./a.out
==543==
==543== Invalid read of size 4
==543== at 0x3EB8A056EB: idna_to_ascii_4z (idna.c:529)
==543== by 0x3EB8A05947: idna_to_ascii_8z (idna.c:582)
==543== by 0x3EB8A059A9: idna_to_ascii_lz (idna.c:614)
==543== by 0x400685: main (bz678518.c:15)
==543== Address 0x4c45fcc is 12 bytes inside a block of size 15 alloc'd
==543== at 0x4A08B1C: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==543== by 0x3EB8A05764: idna_to_ascii_4z (idna.c:530)
==543== by 0x3EB8A05947: idna_to_ascii_8z (idna.c:582)
==543== by 0x3EB8A059A9: idna_to_ascii_lz (idna.c:614)
==543== by 0x400685: main (bz678518.c:15)
==543==
www.xn--4cab6c.se
==543==
==543== HEAP SUMMARY:
==543== in use at exit: 18 bytes in 1 blocks
==543== total heap usage: 27 allocs, 26 frees, 35,129 bytes allocated
==543==
==543== LEAK SUMMARY:
==543== definitely lost: 18 bytes in 1 blocks
==543== indirectly lost: 0 bytes in 0 blocks
==543== possibly lost: 0 bytes in 0 blocks
==543== still reachable: 0 bytes in 0 blocks
==543== suppressed: 0 bytes in 0 blocks
==543== Rerun with --leak-check=full to see details of leaked memory
==543==
==543== For counts of detected and suppressed errors, rerun with: -v
==543== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

In Red Hat Bugzilla #678518, Mark (mark-redhat-bugs) wrote on 2015-02-17:

#36

Note the issue is really in libidn and the reproducer needs libidn-devel (and libidn-debuginfo) installed.

The workaround is to run with --partial-loads-ok=yes

       --partial-loads-ok=<yes|no> [default: no]
           Controls how Memcheck handles 32-, 64-, 128- and 256-bit naturally
           aligned loads from addresses for which some bytes are addressable
           and others are not. When yes, such loads do not produce an address
           error. Instead, loaded bytes originating from illegal addresses are
           marked as uninitialised, and those corresponding to legal addresses
           are handled in the normal way.

           When no, loads from partially invalid addresses are treated the
           same as loads from completely invalid addresses: an illegal-address
           error is issued, and the resulting bytes are marked as initialised.

           Note that code that behaves in this way is in violation of the ISO
           C/C++ standards, and should be considered broken. If at all
           possible, such code should be fixed. This option should be used
           only as a last resort.
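
As a reading aid, the behaviour described in the manual excerpt can be captured in a toy model. This is a sketch based only on the text quoted above, not Memcheck's implementation: the names load_verdict and model_load are hypothetical, the model covers only 4- and 8-byte loads, and it ignores whether addressable bytes were themselves initialised.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool address_error;      /* would an "Invalid read" be reported? */
    uint8_t undefined_mask;  /* bit i set: loaded byte i marked uninitialised */
} load_verdict;

/* addressable_mask: bit i set means byte (addr + i) is addressable. */
load_verdict model_load(uintptr_t addr, unsigned size,
                        uint8_t addressable_mask, bool partial_loads_ok)
{
    load_verdict v = { false, 0 };
    uint8_t all = (size >= 8) ? 0xFF : (uint8_t)((1u << size) - 1u);
    uint8_t ok = addressable_mask & all;

    if (ok == all)
        return v;                        /* fully addressable: nothing to report */

    /* Toy restriction: naturally aligned 32- or 64-bit loads only. */
    bool aligned = (size == 4 || size == 8) && (addr % size) == 0;

    if (partial_loads_ok && aligned && ok != 0) {
        /* "yes": no address error; the illegal bytes become uninitialised. */
        v.undefined_mask = (uint8_t)(all & ~ok);
        return v;
    }

    /* "no", or not a partial aligned load: address error, bytes treated
     * as initialised. */
    v.address_error = true;
    return v;
}
```

For the libidn report (a 4-byte read at 0x5405fcc, of which the first three bytes lie inside the 15-byte block), the model yields an address error with partial_loads_ok set to false, and no error with only the fourth byte marked uninitialised when it is set to true.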

In Red Hat Bugzilla #678518, Kamil (kamil-redhat-bugs) wrote on 2015-02-17:

#37

(In reply to Mark Wielaard from comment #16)
> Note the issue is really in libidn and the reproducer needs libidn-devel
> (and libidn-debuginfo) installed.

Thanks for the reply! I am moving the component back to libidn...

$ rpm -aq libidn\* | sort -V
libidn2-0.10-1.fc21.x86_64
libidn2-devel-0.10-1.fc21.x86_64
libidn-1.28-5.fc21.x86_64
libidn-debuginfo-1.28-5.fc21.x86_64
libidn-devel-1.28-5.fc21.x86_64

In Red Hat Bugzilla #678518, Miroslav (miroslav-redhat-bugs) wrote on 2015-02-17:

#38

I don't see how is this a libidn problem, the code just calls strlen() on a properly allocated buffer. Either it's a bug in valgrind that it reports false positives or gcc generates bad code.

In Red Hat Bugzilla #678518, Mark (mark-redhat-bugs) wrote on 2015-02-17:

#39

(In reply to Miroslav Lichvar from comment #18)
> I don't see how is this a libidn problem, the code just calls strlen() on a
> properly allocated buffer. Either it's a bug in valgrind that it reports
> false positives or gcc generates bad code.

As stated in comment #16, gcc generates code that valgrind can only "prove" correct when using --partial-loads-ok=yes. The code does multi-byte loads from addresses that are partially valid and partially invalid. See that comment or the valgrind manual for more explanation.

In Red Hat Bugzilla #678518, Mark (mark-redhat-bugs) wrote on 2015-09-23:

#40

Note that valgrind 3.11.0, which was just uploaded for f23 https://bodhi.fedoraproject.org/updates/FEDORA-2015-16525, changed the default for partial-loads-ok to yes. From the release notes:

  - The default value for --partial-loads-ok has been changed from "no" to
    "yes", so as to avoid false positive errors resulting from some kinds
    of vectorised loops.

In Red Hat Bugzilla #678518, Jan (jan-redhat-bugs) wrote on 2015-09-23:

#41

re [comment 20]:
This sounds reasonable, thanks for the update, Mark.

Btw. I hit this issue with F22 + strlen standard function in the code:

> valgrind-3.10.1-9.fc22.x86_64
> gcc-5.1.1-4.fc22.x86_64

In Red Hat Bugzilla #678518, Jan (jan-redhat-bugs) wrote on 2015-09-23:

#42

Actually not sure whether libidn is affected there as well, please adjust
appropriately if needed (incl. the component?).

In Red Hat Bugzilla #678518, Fedora (fedora-redhat-bugs) wrote on 2016-07-19:

#43

Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Bug Watch Updater (bug-watch-updater) on 2017-10-27

Changed in valgrind (Fedora):
importance:	Unknown → Medium
status:	Unknown → Won't Fix

Affects	Status	Importance	Assigned to
Valgrind	New	Medium	kde-bugs #264936
valgrind (ALT Linux)	New	Medium	kde-bugs #264936
valgrind (Fedora)	Won't Fix	Medium	redhat-bugs #678518
valgrind (Ubuntu)	New	Undecided	Unassigned
