Comment 14 for bug 1304754

Dave Cheney (dave-cheney) wrote : Re: [Bug 1304754] Re: gccgo on ppc64el using split stacks when not supported

An excellent point. Timers are managed by a single goroutine, which
keeps a priority queue of pending events and sends each timer event on
a channel when it fires. It should be doable to write some code that
stresses timers.
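
As a rough idea of what such a stress test could look like, here is a
minimal sketch. The function name stressTimers and the worker and
iteration counts are made up for illustration; it is not taken from
juju or the Go test suite. Each goroutine creates many short timers,
waits for them to fire, and also creates and stops a long timer so that
both the insert and the remove paths of the timer queue are exercised.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// stressTimers starts `workers` goroutines. Each one creates `perWorker`
// short timers and waits for every one of them to fire, and also creates
// and immediately stops a long timer on each iteration. It returns the
// total number of timer events received.
// (Hypothetical helper, written only to illustrate the idea.)
func stressTimers(workers, perWorker int) int64 {
	var fired int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(seed int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				// Vary the duration a little so the queue gets reordered.
				d := time.Duration((seed+i)%5+1) * 100 * time.Microsecond
				t := time.NewTimer(d)
				<-t.C
				atomic.AddInt64(&fired, 1)

				// Exercise the removal path as well.
				s := time.NewTimer(time.Hour)
				s.Stop()
			}
		}(w)
	}
	wg.Wait()
	return atomic.LoadInt64(&fired)
}

func main() {
	fmt.Println("timer events received:", stressTimers(8, 200))
}
```

If the timer code is at fault, running something like this in a loop on
an affected kernel might reproduce the crash without involving juju or
mongodb at all, though that is only a guess.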

However, I don't believe that SIGALRM is used, at least not in gc, from
which most of the gccgo standard library is derived; gccgo might be
slightly different.

The event that crashes the Go process is related to a watchdog timer
that expires and tries to kill the subprocess.

On Wed, Apr 16, 2014 at 6:04 PM, Anton Blanchard <email address hidden> wrote:
> There shouldn't be any difference in terms of signal handling.
>
> I've now seen a couple of failures in mongodb/TLS networking code:
>
> panic: runtime error: invalid memory address or nil pointer dereference
> [signal 0xb code=0x1 addr=0x38]
>
> goroutine 16 [running]:
> crypto_tls.SetWriteDeadline.pN15_crypto_tls.Conn
> ../../../gcc/libgo/go/crypto/tls/conn.go:111
> labix.org_v2_mgo.updateDeadline.pN28_labix.org_v2_mgo.mongoSocket
> /home/anton/juju-core-1.18.1/src/labix.org/v2/mgo/socket.go:273
> labix.org_v2_mgo.Query.pN28_labix.org_v2_mgo.mongoSocket
> /home/anton/juju-core-1.18.1/src/labix.org/v2/mgo/socket.go:474
> labix.org_v2_mgo.SimpleQuery.pN28_labix.org_v2_mgo.mongoSocket
> /home/anton/juju-core-1.18.1/src/labix.org/v2/mgo/socket.go:320
> labix.org_v2_mgo.pinger.pN28_labix.org_v2_mgo.mongoServer
> /home/anton/juju-core-1.18.1/src/labix.org/v2/mgo/server.go:278
> created by mgo.newServer
> /home/anton/juju-core-1.18.1/src/labix.org/v2/mgo/server.go:80
>
> which is:
>
> func (c *Conn) SetWriteDeadline(t time.Time) error {
> 	return c.conn.SetWriteDeadline(t)
> }
>
> SetWriteDeadline will end up in timer code, and I've previously seen
> failures in the timer code.
>
> https://bugs.launchpad.net/bugs/1304754
>
> Title:
> gccgo on ppc64el using split stacks when not supported
>
> Status in “gccgo-4.9” package in Ubuntu:
> Confirmed
>
> Bug description:
> On kernels 3.13-18 and 3.13-23 (there may be others) the kernel is
> killing gccgo compiled binaries
>
> [18519.444748] jujud[19277]: bad frame in setup_rt_frame:
> 0000000000000000 nip 0000000000000000 lr 0000000000000000
> [18519.673632] init: juju-agent-ubuntu-local main process (19220)
> killed by SEGV signal
> [18519.673651] init: juju-agent-ubuntu-local main process ended, respawning
>
> In powerpc/kernel/signal_64.c:
>
> sys_rt_sigreturn is jumping to the badframe: label and executing an
> unconditional force_sigsegv which is delivered to the userland
> process. Like C++, gccgo tries to decode SIGSEGV as a nil pointer
> access and blame some random function that happened to be the top
> stack frame.
>
> Reverting to the 3.13-08 kernel appears to resolve the issue which
> (weakly) points the finger at the recent switch to 64k pages.