Ok, after a lot of testing and reading of the traces, we are seeing a spurious EIO returned to the read at the end of the file. Combined with the way nih handles IO, this means we sometimes lose the last buffer of the file. From the strace log:
1396 read(17, 0x7f78214c3b30, 4096) = -1 EIO (Input/output error)
Looking through the v3.19 changelogs the following commit looks suspicious:
commit 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
Author: Peter Hurley <email address hidden>
Date: Wed Nov 5 12:13:05 2014 -0500
pty, n_tty: Simplify input processing on final close
When releasing one end of a pty pair, that end may just have written
to the other, which the input processing worker, flush_to_ldisc(), is
still working on but has not completed the copy to the other end's
read buffer. So input may not appear to be available to a waiting
reader but yet TTY_OTHER_CLOSED is now observed. The n_tty line
discipline has worked around this by waiting for input processing
to complete and then re-checking if input is available before
exiting with -EIO.
Since the tty/ldisc lock reordering, the wait for input processing
to complete can now occur during final close before setting
TTY_OTHER_CLOSED. In this way, a waiting reader is guaranteed to
see input available (if any) before observing TTY_OTHER_CLOSED.
Reviewed-by: Alan Cox <email address hidden>
Signed-off-by: Peter Hurley <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>
Indeed, reverting this change seems to eliminate these symptoms.
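For reference, here is a minimal sketch of the reader-side behaviour the commit message describes: a reader on one end of a pty should receive all pending input before it sees EIO for the closed peer. The helper name drain_until_eio and its read parameter are hypothetical, chosen only for illustration, and this is not how nih implements its IO. It also does not work around the regression, since a kernel that reports EIO early would still cut the data short.

```python
import errno
import os


def drain_until_eio(fd, chunk_size=4096, read=os.read):
    """Read fd until EOF or EIO and return everything that was read.

    Hypothetical helper: `read` is injectable so the loop can be
    exercised without a real pty.
    """
    chunks = []
    while True:
        try:
            data = read(fd, chunk_size)
        except OSError as exc:
            if exc.errno == errno.EIO:
                # On Linux a closed pty peer is reported as EIO, not EOF.
                break
            raise  # any other error is a genuine failure
        if not data:
            break  # plain end-of-file
        chunks.append(data)
    return b"".join(chunks)
```

Called on the master fd of a pair from os.openpty() after the slave has been written to and closed, this would be expected to return the written bytes on a kernel without the regression, and to come back short when the spurious EIO occurs.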