Well, let me back that off a little. We're going to look into the TLE code a bit more. The various __lll_*_elision routines are handed a pointer to a short that they update, which looks suspicious. So it's possible that something in the pthreads implementation that calls this code is passing TLE a bad pointer.
However, if that were the case, we would expect to see the same problem on x86 or s390, unless there is some POWER-specific code in the pthreads implementation. So again, a Skylake experiment would be helpful.
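
To make the "bad pointer" hypothesis concrete, here is a minimal sketch of the failure mode. It is not the glibc code: the struct layout, the field names, and the demo_* functions are all hypothetical stand-ins, assumed only for illustration. It shows a routine that updates a short through a caller-supplied pointer, and how a caller passing the wrong address would silently clobber neighboring state.

```c
/* Hypothetical stand-in for a lock that carries an elision adaptation
   counter. The layout and names are illustrative, not glibc's. */
struct demo_lock {
    short neighbor;     /* unrelated state that happens to sit next to the counter */
    short adapt_count;  /* the counter the elision path is supposed to update */
};

/* Hypothetical stand-in for an elision routine: it trusts the pointer it is
   given and stores a new value through it. */
void demo_elision_update(short *adapt_count, short skip)
{
    *adapt_count = skip;
}

/* Correct caller: hands over the address of the counter. */
void demo_good_caller(struct demo_lock *l)
{
    demo_elision_update(&l->adapt_count, 3);
}

/* Buggy caller: hands over the wrong address, so the update lands on the
   neighboring field and the counter itself is left untouched. */
void demo_bad_caller(struct demo_lock *l)
{
    demo_elision_update(&l->neighbor, 3);
}
```

If the real problem had this shape in architecture-independent pthreads code, the same corruption would be expected on every platform with elision enabled, which is why the x86/s390 comparison matters.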