[armel] telepathy-glib segfaults in selftests during build

Bug #736081 reported by Oliver Grawert
This bug affects 3 people
Affects                    Status         Importance  Assigned to               Milestone
Linaro GCC                 Fix Released   High        Unassigned
Release Notes for Ubuntu   Invalid        Undecided   Canonical ARM Developers
gcc-4.5 (Ubuntu)           Fix Released   High        Unassigned
  Natty                    Fix Released   High        Unassigned
  Oneiric                  Fix Released   High        Unassigned
telepathy-glib (Ubuntu)    Fix Released   Low         Unassigned
  Natty                    Won't Fix      High        Unassigned
  Oneiric                  Fix Released   Low         Unassigned

Bug Description

As already seen in bug #623979, telepathy-glib segfaults in a selftest during package build.
The workaround from that bug was dropped during a merge from Debian, which is the likely cause of the FTBFS.

That said, if dropping optimization fixes it, this looks like a toolchain error.

Oliver Grawert (ogra) wrote :

As a side note, this makes empathy uninstallable and breaks armel netbook image builds.

Changed in telepathy-glib (Ubuntu):
importance: Undecided → High
Changed in telepathy-glib (Ubuntu Natty):
milestone: none → ubuntu-11.04-beta-1
tags: added: arm-porting-queue armel
Jani Monoses (jani) wrote :

This was worked around by building with -O0 but it is still possibly a toolchain issue.

Michael Hope (michaelh1) wrote :

Can you provide some more detail and do an initial investigation please? Building at a different optimisation level can hide application bugs such as aliasing problems or uninitialised variables.

Changed in gcc-linaro:
status: New → Incomplete
Martin Pitt (pitti)
summary: - telepathy-glib segfaults in selftests during build
+ [armel] telepathy-glib segfaults in selftests during build
Jani Monoses (jani) wrote :

The function that appears to be miscompiled and which is called by the test is in telepathy-glib/debug.c

Building this file alone with -O0 does not lead to a crash. The same holds if debug_flag_to_domain(), which is called here and is static to debug.c, is either made non-static or flagged with __attribute__((noinline)). In both cases the function is not inlined and the generated code is correct.

void _tp_log (GLogLevelFlags level,
              TpDebugFlags flag,
              const gchar *format,
              ...)
{
  if ((flag & flags) || level > G_LOG_LEVEL_DEBUG)
    {
      va_list args;
      va_start (args, format);
      g_logv (debug_flag_to_domain (flag), level, format, args);
      va_end (args);
    }

}

The function has three named arguments (level in r0, flag in r1, format in r2), but r2 is overwritten with r0 before it has been saved.
As a result, format takes the value of level at the second instruction of the disassembled code:

000000d8 <_tp_log>:
  d8: f240 0300 movw r3, #0
  dc: 4602 mov r2, r0 // THIS APPEARS TO OVERWRITE THE FORMAT ARGUMENT WITH THE LEVEL ARGUMENT
  de: f2c0 0300 movt r3, #0
  e2: 681b ldr r3, [r3, #0]
  e4: 4219 tst r1, r3
  e6: bf0c ite eq
  e8: 2300 moveq r3, #0
  ea: 2301 movne r3, #1
  ec: 2880 cmp r0, #128 ; 0x80
  ee: bfc8 it gt
  f0: f043 0301 orrgt.w r3, r3, #1

The gdb backtrace of the crash confirms that format is passed down to the callees with an invalid value (0x80 = G_LOG_LEVEL_DEBUG):

Program received signal SIGSEGV, Segmentation fault.
__strchrnul (s=<value optimized out>, c_in=<value optimized out>) at strchrnul.c:122
122 strchrnul.c: No such file or directory.
        in strchrnul.c
(gdb) bt
#0 __strchrnul (s=<value optimized out>, c_in=<value optimized out>) at strchrnul.c:122
#1 0x40366dd2 in __find_specmb (s=0xbec1d098, format=0x80 <Address 0x80 out of bounds>, ap=...) at printf-parse.h:99
#2 _IO_vfprintf_internal (s=0xbec1d098, format=0x80 <Address 0x80 out of bounds>, ap=...) at vfprintf.c:1325
#3 0x403d5b56 in __vasprintf_chk (result_ptr=0xbec1d174, flags=1, format=0x80 <Address 0x80 out of bounds>, args=...) at vasprintf_chk.c:68
#4 0x402e3a1a in vasprintf (string=0xbec1d174, format=0x80 <Address 0x80 out of bounds>, args=<value optimized out>) at /usr/include/bits/stdio2.h:199
#5 g_vasprintf (string=0xbec1d174, format=0x80 <Address 0x80 out of bounds>, args=<value optimized out>) at /build/buildd/glib2.0-2.28.4/./glib/gprintf.c:318
#6 0x402c9eca in g_strdup_vprintf (format=<value optimized out>, args=<value optimized out>) at /build/buildd/glib2.0-2.28.4/./glib/gstrfuncs.c:255
#7 0x402ba1f8 in g_logv (log_domain=0x8da0 "tp-glib/accounts", log_level=G_LOG_LEVEL_DEBUG, format=0x80 <Address 0x80 out of bounds>, args1=...) at /build/buildd/glib2.0-2.28.4/./glib/gmessages.c:524
#8 0x00008a2c in _tp_log (level=-1094593012, flag=37, format=0x80 <Address 0x80 out of bounds>) at debug.c:311
#9 0x000088fc in main (argc=<value optimized out>, argv=<value optimized out>) at debug-domain.c:17

Michael Hope (michaelh1) wrote :

Confirmed in gcc-linaro-4.5-2011.03:
  /tools/toolchains/gcc-linaro-4.5-2011.03-0-armv7l-maverick-cbuild71-carina7-cortexa8r1/bin/gcc -S -O2 debug.i

...gives the same output as above. Adding -fno-shrink-wrap works around the problem.

Changed in gcc-linaro:
importance: Undecided → High
tags: added: bad-code shrinkwrap
Changed in gcc-linaro:
status: Incomplete → Triaged
Michael Hope (michaelh1) wrote :

Also fails in gcc-linaro-4.5+bzr99489.

Kate Stewart (kate.stewart) wrote :

Marked confirmed based on comments from Michael Hope.

Changed in telepathy-glib (Ubuntu Natty):
milestone: ubuntu-11.04-beta-1 → ubuntu-11.04-beta-2
status: New → Confirmed
Oliver Grawert (ogra) wrote :

We have a workaround in natty; nominated for oneiric.

Changed in telepathy-glib (Ubuntu Natty):
milestone: ubuntu-11.04-beta-2 → none
Steve Langasek (vorlon)
Changed in telepathy-glib (Ubuntu Oneiric):
status: New → Triaged
Steve Langasek (vorlon) wrote :

This compiler bug also affects dpkg (bug #762082, marked as a dupe). Worked around there as well with -fno-shrink-wrap.

The GCC Linaro task is marked 'high' but has no assignee - Michael, is this on your roadmap somewhere? I fear there may be quite a few packages hit by this at runtime that we don't know about.

Changed in telepathy-glib (Ubuntu Natty):
status: Confirmed → Won't Fix
Changed in gcc-4.5 (Ubuntu Oneiric):
status: New → Triaged
importance: Undecided → High
Changed in gcc-4.5 (Ubuntu Natty):
importance: Undecided → High
status: New → Triaged
Changed in telepathy-glib (Ubuntu Oneiric):
importance: Undecided → Low
Steve Langasek (vorlon)
Changed in telepathy-glib (Ubuntu):
status: Confirmed → Triaged
importance: High → Low
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gcc-4.5 - 4.5.2-8ubuntu4

---------------
gcc-4.5 (4.5.2-8ubuntu4) natty; urgency=low

  * For Linaro based builds, do not turn on -fshrink-wrap by default on armel
    (was already disabled for other architectures). LP: #736081.
 -- Matthias Klose <email address hidden> Mon, 18 Apr 2011 13:25:46 +0200

Changed in gcc-4.5 (Ubuntu Natty):
status: Triaged → Fix Released
Matthias Klose (doko)
Changed in gcc-4.5 (Ubuntu Oneiric):
status: Triaged → Fix Released
Michael Hope (michaelh1)
Changed in gcc-linaro:
milestone: none → 4.5-2011.04-0
Michael Hope (michaelh1)
Changed in gcc-linaro:
status: Triaged → Fix Committed
Michael Hope (michaelh1)
Changed in gcc-linaro:
status: Fix Committed → Fix Released
Oliver Grawert (ogra)
Changed in ubuntu-release-notes:
assignee: nobody → Canonical ARM Developers (canonical-arm-dev)
Michael Hope (michaelh1) wrote :

@Oliver: I'm not sure this needs to go in the release notes. The package works around the earlier fault by turning off optimisations.

Note that gcc-linaro-2011.04 disables shrink wrap by default. The underlying fault still exists if you add -fshrink-wrap and will be fixed before shrink wrap is turned on by default again.

Oliver Grawert (ogra) wrote :

Indeed this needed to go into the release notes: all packages in the archive are built with -fshrink-wrap, given that the gcc fix only went into the archive after beta2. We would have done an archive rebuild for that change if it had been possible at that point, but it was too late ...

Calling "apt-get build-dep <package> && apt-get source -b <package>" in a natty installation might now result in a different binary than the one in the archive. Such an inconsistency needs to be release noted ...

Adam Conrad (adconrad)
Changed in telepathy-glib (Ubuntu Oneiric):
status: Triaged → Fix Released
Pete Graner (pgraner)
Changed in ubuntu-release-notes:
status: New → Invalid