Comment 2 for bug 1248181

Joshua M. Clulow (jclulow) wrote :

Hello!

I work on the illumos project, the open source continuation of OpenSolaris.
I've had a look at your test program, and read through your description of the
signal handling behaviour that you're seeing.

I think there are a few things going on here, so I'll try to lay out a few
suggestions and ask a few questions. I've put some links to our online manual
pages at the end.

1. I'm not sure under what conditions you'll receive a NULL siginfo, but
   the sigaction(2) manual page does suggest that it may be NULL in some
   cases. Do you recall which specific signals (e.g., SIGCHLD) you were
   handling when siginfo was NULL?

2. As you've noted, when multiple signals arrive at around the same time,
   their delivery may overlap. When signals overlap, we do not always
   completely unwind the signal handling machinery in libc before
   delivering subsequent signals. In these cases, the context object
   may refer to the state of the signal delivery parts of libc which
   were interrupted by the nested signal, rather than to the part of
   your main program that was interrupted.

   In ucontext.h(3HEAD), the "uc_link" member from the context object is
   described as follows:

       The uc_link member is a pointer to the context that [is] to be resumed
       when this context returns. If uc_link is equal to 0, this context
       is the main context and the process exits when this context returns.

   It sounds like when handling SIGPROF, you're interested in that "main
   context", rather than in any of the signal handling code. This context
   represents the state we preserved when taking the first of the coincident
   signals. To be sure of finding it, you always need to walk up the context
   chain until "uc_link" is NULL.

3. As noted in siginfo.h(3HEAD), "si_addr" is populated with the address
   of the faulting instruction for SIGILL signals. That's not true of other
   signals, though; e.g., it isn't true for SIGPROF and setitimer(2).
   For SPARC systems where you are using an undefined instruction to
   generate a trap for allocation, using "si_addr" when handling SIGILL is
   definitely the right way to find the instruction in question. If nested
   signal delivery has occurred, however, the context object you get in
   your SIGILL handler will not match up with the "si_addr" value.

   The context in the nested delivery case will restore execution to the
   previously running signal handler, rather than to your main program.
   If you only need the program counter value, use "si_addr". If you
   need the rest of the context to correctly handle the trap, you'll have
   to walk the "uc_link" chain out to find it. There are two options:

      - Walk up to a context where the program counter matches "si_addr".
      - Walk up to the main context, where "uc_link" is NULL.

   The option that is most correct will depend on the structure of your
   program; e.g., do you expect to generate a SIGILL in any of the
   signal handlers, or just in the main program?

4. To reiterate, the context that signal handlers receive is only
   guaranteed to be right for one purpose: the restoration of execution
   state from before we started handling the signal. Any nested
   execution state is stored in the context chain (via "uc_link").
   If your program needs to reason about the state in the context object,
   it also needs to handle the case where the relevant context is not
   at the head of the chain.

   I think this is true of any vaguely POSIX system, not just illumos.
   Depending on the design of the signal machinery on a particular
   operating system, software may experience the edge cases more or
   less frequently, but portable software probably needs to handle
   them all.
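
To make the two walks from points 2 and 3 concrete, here is a minimal sketch
in C. The function names and the "get_pc" accessor are hypothetical and not
part of any illumos interface. Reading the saved program counter out of
"uc_mcontext" is platform-specific, so the sketch assumes the caller supplies
a function that does it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <ucontext.h>

/*
 * Hypothetical accessor type: returns the saved program counter of a
 * context. How to read it from "uc_mcontext" depends on the platform,
 * so the caller provides the implementation.
 */
typedef uintptr_t (*context_pc_fn)(const ucontext_t *);

/*
 * Follow "uc_link" until it is NULL. The context reached is the main
 * context, i.e. the state saved when the first of the overlapping
 * signals was taken.
 */
ucontext_t *
find_main_context(ucontext_t *uc)
{
    assert(uc != NULL);
    while (uc->uc_link != NULL)
        uc = uc->uc_link;
    return (uc);
}

/*
 * Follow "uc_link" looking for the first context whose saved program
 * counter equals pc (for example, "si_addr" in a SIGILL handler).
 * Returns NULL if no context in the chain matches.
 */
ucontext_t *
find_context_for_pc(ucontext_t *uc, uintptr_t pc, context_pc_fn get_pc)
{
    for (; uc != NULL; uc = uc->uc_link) {
        if (get_pc(uc) == pc)
            return (uc);
    }
    return (NULL);
}
```

In a handler installed with SA_SIGINFO, the third argument would be cast to
"ucontext_t *" and passed as "uc". A SIGILL handler taking the first option
would pass "(uintptr_t)info->si_addr" as "pc", and should be prepared for a
NULL result if nothing in the chain matches.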

If this is unclear, or if I can help in some way, please let me know!

Manual page references:

    https://illumos.org/man/2/sigaction
    https://illumos.org/man/2/setitimer
    https://illumos.org/man/3HEAD/siginfo.h
    https://illumos.org/man/3HEAD/ucontext.h