Yes, you are right. The proposed approach breaks simulations (--check does not pass with -j > 1).
I added locks to the non-parallel implementation, and performance degrades, as you said:
-j   Parallel   Non-Parallel   %
------------------------------------
1    997        1002           -0.5
2    821        842            -2.6
3    764        825            -8
4    752        908            -20.7
5    745        1132           -51.9
6    742        1180           -59
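
For reference, the % column appears to be the difference between the two timings relative to the parallel one. That reading is inferred from the numbers and is not stated in the post. A minimal sketch that recomputes the column under this assumption (the function name `relative_diff_percent` is made up for illustration):

```python
# Timings copied from the table: (-j value, parallel, non-parallel).
timings = [
    (1, 997, 1002),
    (2, 821, 842),
    (3, 764, 825),
    (4, 752, 908),
    (5, 745, 1132),
    (6, 742, 1180),
]


def relative_diff_percent(parallel, non_parallel):
    """Difference relative to the parallel timing, in percent (assumed formula)."""
    return (parallel - non_parallel) / parallel * 100


# One (j, percent) pair per table row, rounded to one decimal place.
rows = [(j, round(relative_diff_percent(p, n), 1)) for j, p, n in timings]

for j, pct in rows:
    print(f"-j {j}: {pct:.1f}%")
```

Under this formula the recomputed values match the table to one decimal place, and the gap widens steadily as -j increases.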