Linaro GCC 4.6-2011.06-0 gets ICE when compiling bionic's libm

Bug #809768 reported by Ken Werner
22
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Linaro GCC
Fix Released
High
Richard Sandiford
4.6-2011.07-stable
Fix Released
High
Michael Hope
Linaro GCC Tracking
Fix Committed
Undecided
Unassigned

Bug Description

Attached is the .i file that causes the ICE. It occurs when using a Linaro GCC 4.6-2011.06-0 based Android toolchain to compile the bionic libm. A prebuilt toolchain can be found at:
  https://android-build.linaro.org/builds/~linaro-android/toolchain-4.6-2011.06-0/

I think it has been built like this (at least I was able to reproduce the ICE using the following steps):
  mkdir linaro-4.6-toolchain && cd linaro-4.6-toolchain
  mkdir bin
  wget -O bin/repo http://android.git.kernel.org/repo
  chmod a+x bin/repo
  export PATH="`pwd`/bin:$PATH"
  repo init -u git://git.linaro.org/android/toolchain/manifest.git
  repo sync
  LINARO_BUILD_EXTRA_CONFIGURE_FLAGS=--disable-libquadmath build/linaro-build.sh --with-gcc="http://launchpad.net/gcc-linaro/4.6/4.6-2011.06-0/+download/gcc-linaro-4.6-2011.06-0.tar.bz2" --prefix=`readlink -f $(pwd)/../linaro-4.6-toolchain-install` &> build_$(date +%Y-%m-%d_%H-%M).log &

These are the flags to get the ICE:
   ../linaro-4.6-toolchain-install/bin/arm-eabi-gcc -c -march=armv7-a -mfloat-abi=softfp -mfpu=neon -g -O1 e_cosh.i

You get something like this:
  bionic/libm/src/e_cosh.c: In function 'cosh':
  bionic/libm/src/e_cosh.c:86:1: internal compiler error: Segmentation fault

One interesting detail is that it works when using the Linaro GCC 4.6-2011.05-0 (http://launchpad.net/gcc-linaro/4.6/4.6-2011.05-0/+download/gcc-linaro-4.6-2011.05-0.tar.bz2).

Related branches

Revision history for this message
Ken Werner (kwerner) wrote :
Revision history for this message
Ken Werner (kwerner) wrote :

Michael Hope wasn't able to reproduce this with a plain gcc-linaro-4.6-2011.06.

Revision history for this message
Ken Werner (kwerner) wrote :

The "--disable-libquadmath" switch that the Linaro Android folks are using does not seem to be related. If it's omitted a configure test of the target libiberty would fail as described here: https://bugs.launchpad.net/gcc-linaro/+bug/809435 . With HP's patch applied and libquadmath enabled the ICE remains.

Revision history for this message
Michael K. Edwards (m-k-edwards) wrote :

Can you capture a core and get a backtrace with gdb? And test with -O0 to see if the SEGV goes away? I have encountered a segfaulting null-pointer-reference optimizer bug in fold_stmt_1(), which I haven't filed yet (the testcase is almost 80K lines and probably not something I can post publicly as-is) but have worked around. (Patch at https://github.com/mkedwards/crosstool-ng/blob/master/patches/gcc/linaro-4.6-bzr/1008-handle-statement-removal.patch)

Revision history for this message
Ken Werner (kwerner) wrote :

@Michael - thanks for the hint! It won't segfault with -Os and -O0 but with -O1, -O2 and -O3. I think it's something different. I haven't get a chance to take a closer look. I'll attach a core file and a backtrace. But I haven't figured you how to build the Android GCC with -g yet.

Revision history for this message
Ken Werner (kwerner) wrote :

Program received signal SIGSEGV, Segmentation fault.
0x0824ffd6 in multiple_reg_loc_descriptor ()
(gdb) bt
#0 0x0824ffd6 in multiple_reg_loc_descriptor ()
#1 0x0824fdc8 in reg_loc_descriptor ()
#2 0x08252a31 in loc_descriptor ()
#3 0x08252b41 in loc_descriptor ()
#4 0x0825355d in dw_loc_list_1 ()
#5 0x08253d55 in dw_loc_list ()
#6 0x082548cd in loc_list_from_tree ()
#7 0x08257b3a in add_location_or_const_value_attribute ()
#8 0x0825cab2 in gen_variable_die ()
#9 0x0825f391 in gen_decl_die ()
#10 0x0825e7d6 in process_scope_var ()
#11 0x0825e814 in decls_for_scope ()
#12 0x0825c0af in gen_subprogram_die ()
#13 0x0825f1bf in gen_decl_die ()
#14 0x0825fac5 in dwarf2out_decl ()
#15 0x0825faf6 in dwarf2out_function_decl ()
#16 0x082b06a9 in rest_of_handle_final ()
#17 0x083b5643 in execute_one_pass ()
#18 0x083b57ff in execute_pass_list ()
#19 0x083b581b in execute_pass_list ()
#20 0x083b581b in execute_pass_list ()
#21 0x084c9e20 in tree_rest_of_compilation ()
#22 0x086507e4 in cgraph_expand_function ()
#23 0x08650993 in cgraph_expand_all_functions ()
#24 0x08650fc4 in cgraph_optimize ()
#25 0x0864f729 in cgraph_finalize_compilation_unit ()
#26 0x0812d5c5 in c_write_global_declarations ()
#27 0x084751d8 in compile_file ()
#28 0x08476e10 in do_compile ()
#29 0x08476f62 in toplev_main ()
#30 0x081a5c13 in main ()

Revision history for this message
Ken Werner (kwerner) wrote :
Revision history for this message
Ken Werner (kwerner) wrote :
Revision history for this message
Ken Werner (kwerner) wrote :

It might be a ARM specific bug in the ARM backend that implements TARGET_DWARF_REGISTER_SPAN target hook http://gcc.gnu.org/onlinedocs/gccint/Exception-Region-Output.html#index-TARGET_005fDWARF_005fREGISTER_005fSPAN-4599 . See: arm.c:arm_dwarf_register_span() (starts at line 22533)

The gcc/dwarf2out.c:reg_loc_descriptor() calls the target hook like this (line 13355):
regs = targetm.dwarf_register_span (rtl);

The input is:
  (gdb) p debug_rtx (rtl)
  (reg:SI 95 d16 [orig:149 lx ] [149])

The output is an empty array:
  (gdb) p debug_rtx (regs)
  (parallel [])

I'll attach a small patch that solves the ICE but it would be great if one of the more experienced GCC developers could have a look at this ICE too. There could be a better way to fix this.

Revision history for this message
Ken Werner (kwerner) wrote :

The above patch prevents the ICE but also has the effect of not having the correct DWARF info. So, this is not an upstreamable fix for this bug.

GCC calls the ARM backend (arm_dwarf_register_span) with:
  (gdb) p debug_rtx (rtl)
  (reg:SI 95 d16 [orig:149 lx ] [149])

Wich mean we have SI mode in register d16. So, the mode size is 4 that gets devided by 8 which is zero and leads to the ICE. That code isn't really prepared to handle this case. It works if the mode size is a multiple of eight which is likely to happen for the operands of some NEON instruction. However, according to arm_hard_regno_mode_ok() SI more and d16 seems to be an illegal combination. Now, the question is how did we get there.

Revision history for this message
Ken Werner (kwerner) wrote :
Revision history for this message
Ken Werner (kwerner) wrote :

I've uploaded a debuggable cc1 executable (-O0 -ggdb) to http://people.linaro.org/~kwerner/809768-cc1.tar.bz2 . The .i file, the cmd line to reproduce the ICE and the backtrace is also included.

Revision history for this message
Ken Werner (kwerner) wrote :

A binary toolchain with the 809768-workaround.patch applied can be found at http://people.linaro.org/~kwerner/linaro-4.6-toolchain-install.tar.bz2 . This is mainly for testing purposes for the Linaro Android folks.

Revision history for this message
Ulrich Weigand (uweigand) wrote :

The (reg:SI 95 d16 [orig:149 lx ] [149]) is generated by the cprop_hardreg pass substituting d16 for r2 inside a var_location debug insn.

Of course, the regcprop.c code does perform the usual back-end checks:

static bool
mode_change_ok (enum machine_mode orig_mode, enum machine_mode new_mode,
                unsigned int regno ATTRIBUTE_UNUSED)
{
  if (GET_MODE_SIZE (orig_mode) < GET_MODE_SIZE (new_mode))
    return false;

#ifdef CANNOT_CHANGE_MODE_CLASS
  return !REG_CANNOT_CHANGE_MODE_P (regno, orig_mode, new_mode);
#endif

  return true;
}

However, in Linaro GCC 4.6, the REG_CANNOT_CHANGE_MODE_P check does not prohibit the change from DFmode to SImode in register 95 (d16), even though HARD_REGNO_MODE_OK would reject SImode in d16.

This seems an inconsistency in the back-end. This looks to have been introduced via a backport from mainline of the following patch:

http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01631.html

2011-05-05 Richard Sandiford <email address hidden>

       gcc/
       Backport from mainline:

       2011-03-25 Richard Sandiford <email address hidden>

       * config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Restrict FPA_REGS
       case to VFPv1.

Richard, any thoughts? Should CANNOT_CHANGE_MODE_CLASS accept modes in registers that are not accepted by HARD_REGNO_MODE_OK? If not, should regcprop.c check both macros?

Revision history for this message
Ken Werner (kwerner) wrote :

Uli, Thanks for your thorough analysis!
I can confirm that the ICE does not occur with the old CANNOT_CHANGE_MODE_CLASS macro.

Revision history for this message
Richard Sandiford (rsandifo) wrote : Re: [Bug 809768] Re: Linaro GCC 4.6-2011.06-0 gets ICE when compiling bionic's libm

Ulrich Weigand <email address hidden> writes:
> The (reg:SI 95 d16 [orig:149 lx ] [149]) is generated by the
> cprop_hardreg pass substituting d16 for r2 inside a var_location debug
> insn.
>
> Of course, the regcprop.c code does perform the usual back-end checks:
>
> static bool
> mode_change_ok (enum machine_mode orig_mode, enum machine_mode new_mode,
> unsigned int regno ATTRIBUTE_UNUSED)
> {
> if (GET_MODE_SIZE (orig_mode) < GET_MODE_SIZE (new_mode))
> return false;
>
> #ifdef CANNOT_CHANGE_MODE_CLASS
> return !REG_CANNOT_CHANGE_MODE_P (regno, orig_mode, new_mode);
> #endif
>
> return true;
> }

We ought to be checking HARD_REGNO_MODE_OK as well. That's required
for correctness even on targets for which CANNOT_CHANGE_MODE_CLASS
is not defined.

CANNOT_CHANGE_MODE_CLASS is designed to be used in cases where mode
changes aren't allowed even though the old and new registers are
valid as far as HARD_REGNO_MODE_OK is concerned.

Richard

Changed in gcc-linaro:
assignee: nobody → Richard Sandiford (rsandifo)
Revision history for this message
Michael Hope (michaelh1) wrote :

The fault doesn't seem to occur on a 64 bit host with a 64 bit GCC. It does occur on a 64 bit host with the default 32 bit GCC that linaro-build.sh tries to build.

Revision history for this message
Michael Hope (michaelh1) wrote :

Note that there are two work arounds:
 * Remove the -g
 * Build at -Os or -O0

Revision history for this message
Richard Sandiford (rsandifo) wrote :

FWIW -- and it took me depressingly long to figure this out --
the easiest way of making sure that the bug triggers is to
recompile dwarf2out.c with -O0. The segfault comes from
the rhs of the size assignment here:

  gcc_assert (GET_CODE (regs) == PARALLEL);

  size = GET_MODE_SIZE (GET_MODE (XVECEXP (regs, 0, 0)));
  loc_result = NULL;

The assignment is dead, so a sufficiently clever compiler
will optimise it (and the segfault) away.

Revision history for this message
Richard Sandiford (rsandifo) wrote :

This patch seems to fix it. I'll run a full test and submit upstream.

Alexander Sack (asac)
Changed in gcc-linaro:
status: New → In Progress
Loïc Minier (lool)
Changed in gcc-linaro:
milestone: none → 2011.07
Revision history for this message
Alexander Sack (asac) wrote :

blocks android binary toolchain build from building android platform to a working state. Should justify a -1 tarball respin. doing a testbuild; wait for confirm.

Loïc Minier (lool)
Changed in gcc-linaro:
importance: Undecided → High
Revision history for this message
Ken Werner (kwerner) wrote :

Thanks Richard! After applying your patch onto gcc-linaro-4.6-2011.07 the ICE is gone and I was able to complete the build of Linaro-Android for my PandaBoard.

Michael Hope (michaelh1)
Changed in gcc-linaro:
milestone: 2011.07 → 2011.08
status: In Progress → Fix Committed
Michael Hope (michaelh1)
Changed in gcc-linaro:
milestone: 2011.08 → 4.6-2011.08
Michael Hope (michaelh1)
Changed in gcc-linaro:
status: Fix Committed → Fix Released
Revision history for this message
Michael Hope (michaelh1) wrote :

Need to confirm that this exists in the -stable branch.

Revision history for this message
Andrew Stubbs (ams-codesourcery) wrote :

This has been committed upstream.

Related: lp:gcc-linaro/4.6,revno=106782

Changed in gcc-linaro-tracking:
milestone: none → 4.7.0
status: New → Fix Committed
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers