Update:
Only one machine reproduces the bug now, and even there it is not reliable. The test results per revision are (+ pass, X fail):
979 +++++
978
977 ++++++++
976 XXX+++X
975 +XX+X+X
974 +
973
972 X
971
970 X++
So it looks like the issue was resolved in r977. The only problem is that, comparing the valgrind output with the diff of r977, I cannot tell whether the bug was actually fixed or whether we just changed the test cases enough to accidentally stop triggering a side effect.
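
As a rough sanity check that the run of passes from r977 onward is not just luck, here is a small Python sketch. It assumes the runs are independent and that the pass rate before r977 was constant, neither of which I have verified, and the names (`results`, `FIX_REV`, `count`) are only for illustration. It estimates how likely an unbroken run of passes would be if nothing had changed. It says nothing about whether the bug was fixed or merely masked.

```python
from fractions import Fraction

# Results copied from the table above (+ pass, X fail).
# Revisions with no recorded runs (978, 973, 971) are omitted.
results = {
    979: "+++++",
    977: "++++++++",
    976: "XXX+++X",
    975: "+XX+X+X",
    974: "+",
    972: "X",
    970: "X++",
}

FIX_REV = 977  # assumption: behaviour changed at this revision


def count(revs):
    """Return (passes, fails) summed over the given revisions."""
    runs = "".join(results[r] for r in revs)
    return runs.count("+"), runs.count("X")


before_pass, before_fail = count(r for r in results if r < FIX_REV)
after_pass, after_fail = count(r for r in results if r >= FIX_REV)

# Assumption: independent runs with a constant pass rate before FIX_REV.
pass_rate = Fraction(before_pass, before_pass + before_fail)

# Chance of a run of after_pass consecutive passes if nothing had changed.
# Only meaningful here because no later run failed (after_fail == 0).
p_all_pass = pass_rate ** after_pass

print(f"pass rate before r{FIX_REV}: {float(pass_rate):.2f}")
print(f"runs since r{FIX_REV}: {after_pass} pass, {after_fail} fail")
print(f"chance of that by luck alone: {float(p_all_pass):.1e}")
```

With so few runs this is only indicative, but it suggests the change in behaviour at r977 is real. Whether r977 is a genuine fix still has to come from the diff and the valgrind output.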