------- Comment From <email address hidden> 2016-11-15 18:14 EDT-------
Just got the following hit with the "distinct value in adapt_count" version of
the library:
Thread 4 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x3fff9b7cf1a0 (LWP 26828)]
...
(gdb) x/i $pc
=> 0x3fffb5898df0 <_ZN10tensorflow12_GLOBAL__N_113ExecutorState8NodeDoneERKNS_6StatusEPKNS_4NodeERKNS_3gtl13InlinedVectorINS1_10TaggedNodeELi8EEEPNS_13NodeExecStatsEPNS1_20TaggedNodeReadyQueueE.constprop.432+208>: ld r9,104(r30)
(gdb) info registers r30
r30 0x1111000000000000
Earlier we would often see that a NULL pointer on the stack was damaged
to become 0x0001000000000000. Here we have that same scenario, but with the
damage including the distinct value that the test code uses in adapt_count.
So I'm confident it's at least an adapt_count store into an on-stack
mutex that is causing our crashes. Looks like a match.
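
For anyone following along, here is a rough sketch of why the two damaged values line up with a stray adapt_count store. It is only an illustration, not code from the library or the test. It assumes a little-endian layout, a 16-bit adapt_count field, and that the store lands on the top two bytes (offset 6) of the 8-byte stack slot; the value 0x1111 is inferred from the r30 contents above, and store_u16 is a made-up helper name.

```python
import struct

def store_u16(slot_value, byte_offset, value16):
    """Model a 16-bit little-endian store into an 8-byte stack slot.

    Hypothetical helper for illustration only: returns the 64-bit value
    the slot would hold after the store.
    """
    buf = bytearray(struct.pack("<Q", slot_value))      # slot as 8 raw bytes
    struct.pack_into("<H", buf, byte_offset, value16)   # overwrite 2 bytes
    return struct.unpack("<Q", bytes(buf))[0]

NULL_PTR = 0  # the on-stack pointer before the damage

# Assumed: the stale mutex's adapt_count overlaps bytes 6..7 of the slot.
for count in (0x0001, 0x1111):
    damaged = store_u16(NULL_PTR, 6, count)
    print(f"adapt_count={count:#06x} -> slot={damaged:#018x}")
```

Under those assumptions a store of 1 turns NULL into 0x0001000000000000, and a store of the distinct value turns it into 0x1111000000000000, which is what r30 holds when the ld faults.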