While proceeding to the next steps of Rafael's analysis, I realized that the Ubuntu package and the self-compiled version of PHP call different functions to process the same input file (the test file uploaded by the bug reporter). The Ubuntu PHP package calls the function reported above (php_base64_decode_ex), while the locally compiled binary calls php_base64_decode_ex_avx2. The locally compiled binary processes the test script 10x faster than the packaged one. The Debian and Fedora packages are slower as well, but the upstream PHP Docker image shipped on Docker Hub under docker.io/php is not.

After some investigation, I realized that PHP will use different x86_64 instructions for some tasks when such instructions are available. That is done either by compiling the binaries to target specific x86_64 micro-architectures or through the "target" function attribute provided by gcc (see https://gcc.gnu.org/onlinedocs/gcc/x86-Function-Attributes.html). When specific (newer) x86_64 instructions are available, PHP will use them to speed up some specific tasks (such as base64 encoding/decoding). If they are not available, it falls back to other (older) available instructions. For instance, in the base64 decoding case, it will use avx2, introduced in x86_64-v3, when available; otherwise it falls back to sse3, introduced in x86_64-v2, and finally uses the common v1 instructions when neither is available.

Whether such additional functions are compiled is decided at configuration time, through the "ax_cv_have_func_attribute_target" configuration variable. This variable is set by a modified copy of the autoconf-archive macro AX_GCC_FUNC_ATTRIBUTE, embedded in the PHP sources at $PHP_SRC_ROOT/build/ax_gcc_func_attribute.m4. The original macro is described at https://www.gnu.org/software/autoconf-archive/ax_gcc_func_attribute.html. This macro relies on compiler warnings to decide whether the system supports a specific function attribute.
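To illustrate the fallback chain described above (avx2, then sse3, then the baseline code), here is a minimal sketch of how the gcc "target" function attribute can be combined with a runtime CPU check. This is not PHP's code: all function names (count_pad, count_pad_avx2, etc.) are hypothetical, the loop is a toy stand-in for the real base64 work, and the use of __builtin_cpu_supports for the runtime check is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical example, not PHP's implementation. */
typedef size_t (*count_fn)(const char *, size_t);

/* Baseline variant: built with the default flags of the translation unit. */
static size_t count_pad_baseline(const char *s, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (s[i] == '=');
    return hits;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_TARGET_VARIANTS 1

/* Same source, but the compiler may emit SSE3 instructions for this body. */
__attribute__((target("sse3")))
static size_t count_pad_sse3(const char *s, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (s[i] == '=');
    return hits;
}

/* Same source, but the compiler may emit AVX2 instructions for this body. */
__attribute__((target("avx2")))
static size_t count_pad_avx2(const char *s, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (s[i] == '=');
    return hits;
}
#endif

/* Pick the best variant the running CPU supports, newest first. */
static count_fn select_count_pad(void)
{
#ifdef HAVE_TARGET_VARIANTS
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return count_pad_avx2;
    if (__builtin_cpu_supports("sse3"))
        return count_pad_sse3;
#endif
    return count_pad_baseline;
}

/* Public entry point: resolves the variant once, then reuses it. */
size_t count_pad(const char *s, size_t n)
{
    static count_fn impl = NULL;
    if (impl == NULL)
        impl = select_count_pad();
    return impl(s, n);
}
```

If the configure step concludes that the "target" attribute is unsupported, the equivalent of HAVE_TARGET_VARIANTS is never defined and only the baseline variant is built, which matches the slow path seen in the distro packages.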
The (old) version shipped by PHP treats any warning as a negative result (i.e., the system does not support the function attribute in question). With "-Wall" enabled it therefore always produces a false negative, since the test declares a static function without defining it, resulting in: "warning: ‘bar’ declared ‘static’ but never defined [-Wunused-function]". As a consequence, the configuration step always sets "ax_cv_have_func_attribute_target=no" when "-Wall" is present in the CFLAGS (which is true for Ubuntu, Debian, Fedora, etc).

The issue with the macro has been fixed in autoconf-archive upstream at http://git.savannah.gnu.org/gitweb/?p=autoconf-archive.git;a=commitdiff;h=df0894ad1a8195df67a52108b931e07d708cec9a but the fix has not been backported into the macro embedded in PHP yet. Updating the macro and rebuilding the package generates PHP binaries with the improved performance perceived by the bug reporter. A PPA with the proposed change is available at https://launchpad.net/~athos-ribeiro/+archive/ubuntu/lp1882279-php-perf/+packages. I proposed updating the macro upstream at https://github.com/php/php-src/pull/8483.

I tested the patched package while emulating different x86_64 CPUs, and it uses the avx2 code path, the sse3 code path, or falls back to the regular (slower) code path, depending on what the emulated CPU supports.

The test performed in the configuration step verifies whether the "sse2" target is available. While this is true for x86_64 v1 and later, on i386 it only became available with the Pentium 4. LP i386 builds are setting "ax_cv_have_func_attribute_target=yes", so we still need to test the patched binaries on an i386 CPU older than a Pentium 4. No performance gains are expected there, but we should also make sure no regressions are introduced (i.e., the patch should not break the binaries for such arches).
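The false negative in the configure check can be modelled with a small sketch. The first function mirrors the behaviour described above for the old macro (any compiler output means "unsupported"). The second is only a hypothetical stricter rule that reacts to attribute-related warnings alone; I am not claiming it is what the upstream fix does. Both function names are made up for this example.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Old behaviour as described above: the attribute is reported as supported
 * only if the compiler printed nothing at all for the test program.
 */
bool supported_if_no_output(const char *compiler_stderr)
{
    return compiler_stderr[0] == '\0';
}

/*
 * Hypothetical stricter rule (an assumption for illustration): report the
 * attribute as unsupported only when the compiler complains about
 * attributes, ignoring unrelated warnings such as -Wunused-function.
 */
bool supported_unless_attribute_warning(const char *compiler_stderr)
{
    return strstr(compiler_stderr, "-Wattributes") == NULL;
}
```

Fed the -Wunused-function warning quoted above, the first function answers "unsupported" even though the attribute itself was accepted, while the second still answers "supported".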
Note that the explanation above also covers the reason why Rafael could not perceive the performance difference in his tests:

> "2 seconds in my old AMD (4.4 GHz) CPUs.. before and after local compilation"...

Next, I will propose the patch to Debian with a salsa MR and perform the i386 test before filing an MP with this patch as a delta for kinetic. Then we can proceed to filing SRUs here.