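Okay, this turns out to be an utterly trivial problem. This is the disassembly of SB-VM::ALLOC-TLS-INDEX-IN-RAX. See the instructions at 0x20000f42 and 0x20000f47: the first loads 1 into %eax as the new lock value, and the second immediately zeroes %eax to set up the compare-to value. You can't use %rax as both the new value and the compare-to value for cmpxchg! The cmpxchg at 0x20000f49 therefore compares the lock word against 0 and, on a match, writes 0 back, so the lock is never actually taken. The variants that allocate in every other register work fine; only the RAX one is broken.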
0x20000f38: mov %rbp,0xb8(%r12)
0x20000f40: push %rcx
0x20000f41: push %rax
0x20000f42: mov $0x1,%eax
0x20000f47: xor %eax,%eax
0x20000f49: lock cmpxchg %rax,0x20100b88
0x20000f53: jne 0x20000f42
0x20000f55: pop %rcx
0x20000f56: mov 0x21(%rcx),%rax
0x20000f5a: or %rax,%rax
0x20000f5d: jne 0x20000f8d
0x20000f5f: mov 0x20100b48,%rax
0x20000f67: cmp $0x8000,%rax
0x20000f6d: jl 0x20000f80
0x20000f6f: movq $0x0,0xb8(%r12)
0x20000f7b: jmpq 0x20001874
0x20000f80: addq $0x8,0x20100b48
0x20000f89: mov %rax,0x21(%rcx)
0x20000f8d: xor %ecx,%ecx
0x20000f8f: xchg %rcx,0x20100b88
0x20000f97: pop %rcx
0x20000f98: xor %rbp,0xb8(%r12)
0x20000fa0: je 0x20000fa4
0x20000fa2: int3
0x20000fa3: 0x09
0x20000fa4: ret
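
To make the failure concrete, here is a small single-threaded model of the cmpxchg semantics in Python. It is only a sketch: the register dictionary, the function names, and the use of %rcx as a scratch register in the "fixed" variant are my own assumptions for illustration, not the actual SBCL fix.

```python
LOCK = 0x20100b88  # address of the lock word, taken from the disassembly


def cmpxchg(mem, addr, regs, src):
    """Model of x86 `cmpxchg src, [addr]`.

    Compares RAX with the memory operand. If they are equal, the source
    register is stored to memory and ZF is set (returns True). Otherwise
    memory is loaded into RAX and ZF is cleared (returns False).
    """
    if regs["rax"] == mem[addr]:
        mem[addr] = regs[src]
        return True
    regs["rax"] = mem[addr]
    return False


def acquire_broken(mem):
    """Mirrors 0x20000f42..0x20000f53: RAX is both new value and comparand."""
    regs = {"rax": 0, "rcx": 0}
    while True:
        regs["rax"] = 1  # mov $0x1,%eax  (intended new value)
        regs["rax"] = 0  # xor %eax,%eax  (comparand; clobbers the 1)
        if cmpxchg(mem, LOCK, regs, "rax"):
            return mem[LOCK]


def acquire_fixed(mem):
    """Hypothetical repair: keep the new value in a separate register."""
    regs = {"rax": 0, "rcx": 0}
    while True:
        regs["rcx"] = 1  # new value lives in a scratch register (assumed)
        regs["rax"] = 0  # comparand: expect the lock to be free
        if cmpxchg(mem, LOCK, regs, "rcx"):
            return mem[LOCK]


# Starting from a free lock (0), show what each variant leaves in memory.
print(acquire_broken({LOCK: 0}))
print(acquire_fixed({LOCK: 0}))
```

With a free lock, the broken variant "succeeds" on the first try but leaves the lock word at 0, so any number of threads can pass through at once. The model does not simulate contention; calling either function with the lock already held would spin forever, as the real loop does until another thread releases it.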