Attached are all the generated plots for the various benchmarks.
My matplotlib skills being rather poor, here's the legend: the y axis is the relative change in %, where negative values are what we want (an improvement); the x axis is the size of the buffers being processed.
My conclusions are that the memcmp patch for focal should be rolled back, but that the Jammy results are fairly OK. There is a huge (150%!) performance regression for very small memmoves (i.e. < 16 bytes), but I think it only shows that the fixed cost of the function has increased. I also suspect the "fixed input" part of the benchmarks is hitting a worst-case scenario: we see no improvement in the non-random benchmarks (including the large ones), whereas the runs on random inputs are much more satisfying, with impressive results on graviton3 and no significant regression on non-SVE machines.
Thus, I'm marking the Jammy upload as verified.
I'll have to redo the focal upload without the memcmp patch, though.