Comment 10 for bug 1429756

Andy Whitcroft (apw) wrote : Re: FTBFS: test_job_process fails in majority of cases

OK, after a lot of testing and reading of the traces, we are seeing a spurious EIO returned by the read at the end of the file. Combined with the way nih handles IO, this means we sometimes lose the last buffer of the file. From the strace log:

    1396 read(17, 0x7f78214c3b30, 4096) = -1 EIO (Input/output error)
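
To illustrate why an early EIO costs data, here is a minimal sketch of a reader that treats EIO as end of input, which is what a pty master normally reports once the other end has been closed. The helper name drain_fd is hypothetical and this is not nih's actual code, only an assumption about the general shape of such a loop. A reader of this shape stops at the first EIO, so any data the kernel has not yet made available by then is never read.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of libnih: read from fd until end of
 * input, treating EIO the same as a zero-length read. Returns the
 * number of bytes stored in buf, or -1 on any other error. */
static ssize_t drain_fd(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);

        if (n > 0) {
            total += (size_t)n;
            continue;
        }
        if (n == 0)
            break;              /* ordinary end of file */
        if (errno == EINTR)
            continue;           /* interrupted by a signal, retry */
        if (errno == EIO)
            break;              /* pty: other end closed */
        return -1;              /* any other error is reported */
    }
    return (ssize_t)total;
}
```

The loop is only correct if the kernel delivers all pending input before it reports EIO; it does not work around a kernel that reports EIO too early.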

Looking through the v3.19 changelogs, the following commit looks suspicious:

  commit 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
  Author: Peter Hurley <email address hidden>
  Date: Wed Nov 5 12:13:05 2014 -0500

    pty, n_tty: Simplify input processing on final close

    When releasing one end of a pty pair, that end may just have written
    to the other, which the input processing worker, flush_to_ldisc(), is
    still working on but has not completed the copy to the other end's
    read buffer. So input may not appear to be available to a waiting
    reader but yet TTY_OTHER_CLOSED is now observed. The n_tty line
    discipline has worked around this by waiting for input processing
    to complete and then re-checking if input is available before
    exiting with -EIO.

    Since the tty/ldisc lock reordering, the wait for input processing
    to complete can now occur during final close before setting
    TTY_OTHER_CLOSED. In this way, a waiting reader is guaranteed to
    see input available (if any) before observing TTY_OTHER_CLOSED.

    Reviewed-by: Alan Cox <email address hidden>
    Signed-off-by: Peter Hurley <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

Indeed, reverting this change seems to eliminate these symptoms.