Comment 1 for bug 1641241

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-11-11 18:14 EDT-------
A few notes on the debug effort since the previous update:

- We opened an issue with the TensorFlow project on GitHub:
https://github.com/tensorflow/tensorflow/issues/5482

- Crash occurs with TensorFlow 0.9, 0.10, and recent master
(e.g. commit 2cbb9b5 of Oct 26th)

- Crash occurs while training the Inception model on the ILSVRC 2012 dataset.
The model is from the inception/ directory of
https://github.com/tensorflow/models.git

- Crash is independent of CUDA / GPU; occurs even if we compile without CUDA
and GPU support

- Crash occurs after varying run times. It appears to occur sooner with higher
thread counts, which suggests a race of some sort.
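
To put numbers on the thread-count observation, something like the following could be used. This is only a rough sketch: time_to_failure, sweep, and the make_cmd callback are hypothetical names, and how the thread count is actually passed to the training job is an assumption left to the caller.

```python
import subprocess
import sys
import time

def time_to_failure(cmd, timeout_s):
    """Run cmd once; return (elapsed seconds, return code or None on timeout)."""
    start = time.monotonic()
    try:
        rc = subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        rc = None  # still running at the deadline: no crash observed
    return time.monotonic() - start, rc

def sweep(make_cmd, thread_counts, timeout_s):
    """Run the reproducer once per thread count and collect the results.

    make_cmd is a caller-supplied function (hypothetical) that turns a
    thread count into the argv list for the reproducer.
    """
    results = {}
    for n in thread_counts:
        results[n] = time_to_failure(make_cmd(n), timeout_s)
    return results
```

On POSIX systems a negative return code -N means the child was killed by signal N (for example -11 for SIGSEGV), so crashes can be told apart from normal exits. Repeating each thread count several times would be needed before reading anything into the timings.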

Our team continues to try to isolate and simplify the problem. We'll try
building on a 14.04 system with AT 9. That setup uses GLIBC 2.22, and
so may allow us to narrow the problem to either the 2.21 -> 2.22 or the
2.22 -> 2.23 GLIBC update.
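
When comparing runs across these setups, it may help to log which GLIBC the Python process is actually running against, since a toolchain build can differ from the system library. A minimal sketch, assuming a Linux system; glibc_version is a hypothetical helper name:

```python
import os
import platform

def glibc_version():
    """Return a string such as 'glibc 2.23' for the running process."""
    try:
        # CS_GNU_LIBC_VERSION is a glibc-specific confstr name.
        value = os.confstr("CS_GNU_LIBC_VERSION")
        if value:
            return value
    except (ValueError, OSError, AttributeError):
        pass  # name unknown or confstr unavailable on this platform
    # Fall back to the heuristic in the platform module.
    lib, ver = platform.libc_ver()
    return ("%s %s" % (lib, ver)).strip() or "unknown"

print(glibc_version())
```

Because TensorFlow is loaded into the Python process, this should reflect the library the crashing code runs against, provided it is run with the same interpreter and environment as the training job.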

Any debug advice or suggestions are appreciated.