Frank, can you confirm that you got in touch with the upstream author about it?
I could probably "fix" it by sprinkling casts all over the place, which could be wildly unsafe depending on the exact properties of __int128, or by using a local __int128 variable and memcpy'ing the data into it or something like that, which might destroy performance. (I recall something like that being a problem in some glibc routines on ARM, where moving data out of SIMD registers was *really* slow on the Pis.)
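To make the second option concrete, here is a minimal sketch of what I mean by the memcpy approach. The helper names load_u128 and store_u128 are made up for illustration and do not come from the patch, and this assumes GCC or Clang on a target that actually provides __int128. Compilers usually turn a fixed-size memcpy like this into plain loads and stores, but whether that holds on the hardware in question is exactly what I cannot check myself.

```c
#include <string.h>

/* Hypothetical helper (not from the patch): read 16 bytes from a buffer
 * that may be unaligned, without casting the pointer to __int128 *. */
static inline unsigned __int128 load_u128(const void *src)
{
    unsigned __int128 v;
    memcpy(&v, src, sizeof v);   /* copies exactly 16 bytes */
    return v;
}

/* Hypothetical counterpart: write the 16 bytes back out. */
static inline void store_u128(void *dst, unsigned __int128 v)
{
    memcpy(dst, &v, sizeof v);
}
```

The cast variant would instead dereference something like (unsigned __int128 *)ptr directly, which is where my alignment and aliasing worries come from.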
As you can see, I'm quite unsure about this, so input from the people who wrote the patch (and own the hardware) would be greatly appreciated!