Comment 12 for bug 1304754

Hi Anton,

I've been looking at this from another angle, via a different crash. I
see a crash when a child process receives a signal, and it seems to
reflect back on the parent.

Are there any alignment requirements for signal handling on kernels
built with 64k pages?
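
For reference, a minimal sketch of the kind of check I have in mind,
assuming Linux and glibc. The function names (page_size,
probe_signal_frame, frame_misalignment) are made up for illustration and
do not come from libgo or the kernel. It installs a handler on an
alternate signal stack, raises a signal, and records the address of the
ucontext the kernel passes in, which sits inside the signal frame. It
only reports the page size and that address modulo 16; it does not
reproduce the crash.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Set by the handler so the caller can see that it actually ran. */
volatile sig_atomic_t handler_ran = 0;
/* Address of the ucontext the kernel handed to the handler. */
volatile uintptr_t frame_addr = 0;

/* Page size the kernel reports to userland (65536 on a 64k-page kernel). */
long page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

void on_signal(int sig, siginfo_t *info, void *uctx)
{
    (void)sig;
    (void)info;
    frame_addr = (uintptr_t)uctx;
    handler_ran = 1;
}

/* Deliver SIGUSR1 on an alternate stack. Returns 0 on success, -1 on error. */
int probe_signal_frame(void)
{
    /* Hypothetical size, chosen to be comfortably above MINSIGSTKSZ. */
    size_t size = 256 * 1024;
    void *stack = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    stack_t ss;
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = stack;
    ss.ss_size = size;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* raise() returns only after the handler has run in this thread. */
    if (raise(SIGUSR1) != 0)
        return -1;

    return handler_ran ? 0 : -1;
}

/* Offset of the recorded context address from a 16-byte boundary. */
unsigned frame_misalignment(void)
{
    return (unsigned)(frame_addr % 16);
}
```

Calling probe_signal_frame() from a small main() and printing
page_size() and frame_misalignment() on both the 4k and 64k kernels
would at least show whether the frame placement differs between them.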

Dave

On Wed, Apr 16, 2014 at 4:28 PM, Anton Blanchard <email address hidden> wrote:
> This doesn't explain why we failed in the first place however. Using
> gdb, I have seen a couple of SEGVs in:
>
> * 1 Thread 0x3fffa8c447e0 (LWP 5562) "jujud" timerproc
> (dummy=<optimized out>) at ../../../gcc/libgo/runtime/time.goc:217
>
> ie:
>
> f = (void*)t->fv->fn;
>
> Perhaps a stale timer that we aren't cancelling?
>
> I've also seen a fail here:
>
> fatal error: runtime_lock: lock count
>
> goroutine 2 [running]:
> runtime_dopanic
> ../../../gcc/libgo/runtime/panic.c:78
> runtime_throw
> ../../../gcc/libgo/runtime/panic.c:116
> runtime_lock
> ../../../gcc/libgo/runtime/lock_futex.c:41
> runtime_allocmcache
> ../../../gcc/libgo/runtime/malloc.goc:337
> runtime_startpanic
> ../../../gcc/libgo/runtime/panic.c:46
> runtime_throw
> ../../../gcc/libgo/runtime/panic.c:114
> runtime_unlock
> ../../../gcc/libgo/runtime/lock_futex.c:101
> runtime_MHeap_Scavenger
> ../../../gcc/libgo/runtime/mheap.c:482
> kickoff
> ../../../gcc/libgo/runtime/proc.c:237
>
> :0
>
> :0
> created by runtime_main
> ../../../gcc/libgo/runtime/proc.c:565
>
> https://bugs.launchpad.net/bugs/1304754
>
> Title:
> gccgo on ppc64el using split stacks when not supported
>
> Status in “gccgo-4.9” package in Ubuntu:
> Confirmed
>
> Bug description:
> On kernels 3.13-18 and 3.13-23 (there may be others) the kernel is
> killing gccgo compiled binaries
>
> [18519.444748] jujud[19277]: bad frame in setup_rt_frame:
> 0000000000000000 nip 0000000000000000 lr 0000000000000000
> [18519.673632] init: juju-agent-ubuntu-local main process (19220)
> killed by SEGV signal
> [18519.673651] init: juju-agent-ubuntu-local main process ended, respawning
>
> In powerpc/kernel/signal_64.c:
>
> sys_rt_sigreturn is jumping to the badframe: label and executing an
> unconditional force_sigsegv which is delivered to the userland
> process. Like C++, gccgo tries to decode SIGSEGV as a nil pointer
> access and blame some random function that happened to be the top
> stack frame.
>
> Reverting to the 3.13-08 kernel appears to resolve the issue which
> (weakly) points the finger at the recent switch to 64k pages.