Activity log for bug #2030515

Date Who What changed Old value New value Message
2023-08-07 14:23:05 Bruce Merry bug added bug
2023-08-07 14:23:32 Bruce Merry summary Terribly memcpy performance on Zen 3 when using rep movsb Terrible memcpy performance on Zen 3 when using rep movsb
2023-08-07 15:03:35 Bruce Merry attachment added Output of /lib64/ld-linux-x86-64.so.2 --list-diagnostics https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/2030515/+attachment/5690910/+files/glibc-diagnostics.txt
2023-10-24 06:22:22 Bruce Merry bug watch added https://sourceware.org/bugzilla/show_bug.cgi?id=30994
2023-11-29 15:57:45 Jeff bug added subscriber Jeff
2023-11-29 21:28:30 Sebastian bug added subscriber Sebastian
2023-11-29 23:15:09 Benjamin Drung bug task added glibc
2023-11-29 23:15:50 Benjamin Drung description

Old value:

On CPUs that advertise FSRM (fast short rep movsb), glibc 2.35 uses REP MOVSB for memcpy for sizes above 2112 (up to some threshold that depends on the cache size). Unfortunately, it seems that Zen 3 (at least in the microcode we're running) is extremely slow at REP MOVSB when the data are not well-aligned.

I've found this using a memcpy benchmark at https://github.com/ska-sa/katgpucbf/blob/69752be58fb8ab0668ada806e0fd809e782cc58b/scratch/memcpy_loop.cpp (compiled with the adjacent Makefile). To demonstrate the issue, run

./memcpy_loop -b 2113 -p 1000000 -t mmap -S 0 -D 1 0

This runs:
- 2113-byte memory copies
- 1,000,000 times per timing measurement
- in memory allocated with mmap
- with the source 0 bytes from the start of the page
- with the destination 1 byte from the start of the page
- on core 0.

It reports about 3.2 GB/s. Change the -b argument to 2111 and it reports over 100 GB/s. So the REP MOVSB case is about 30× slower!

This will most likely need to be reported and fixed upstream, but I'm reporting it to Ubuntu first since I don't know if Ubuntu has modified glibc in any way that would be significant.

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: libc6 2.35-0ubuntu3.1
ProcVersionSignature: Ubuntu 5.19.0-46.47~22.04.1-generic 5.19.17
Uname: Linux 5.19.0-46-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
CasperMD5CheckResult: unknown
Date: Mon Aug 7 14:02:28 2023
RebootRequiredPkgs: Error: path contained symlinks.
SourcePackage: glibc
UpgradeStatus: No upgrade log present (probably fresh install)

New value:

On CPUs that advertise FSRM (fast short rep movsb), glibc 2.35 uses REP MOVSB for memcpy for sizes above 2112 (up to some threshold that depends on the cache size). Unfortunately, it seems that Zen 3 (at least in the microcode we're running) is extremely slow at REP MOVSB when the data are not well-aligned.

I've found this using a memcpy benchmark at https://github.com/ska-sa/katgpucbf/blob/69752be58fb8ab0668ada806e0fd809e782cc58b/scratch/memcpy_loop.cpp (compiled with the adjacent Makefile). To demonstrate the issue, run

./memcpy_loop -b 2113 -p 1000000 -t mmap -S 0 -D 1 0

This runs:
- 2113-byte memory copies
- 1,000,000 times per timing measurement
- in memory allocated with mmap
- with the source 0 bytes from the start of the page
- with the destination 1 byte from the start of the page
- on core 0.

It reports about 3.2 GB/s. Change the -b argument to 2111 and it reports over 100 GB/s. So the REP MOVSB case is about 30× slower!

This will most likely need to be reported and fixed upstream, but I'm reporting it to Ubuntu first since I don't know if Ubuntu has modified glibc in any way that would be significant.

See also: https://xuanwo.io/2023/04-rust-std-fs-slower-than-python/

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: libc6 2.35-0ubuntu3.1
ProcVersionSignature: Ubuntu 5.19.0-46.47~22.04.1-generic 5.19.17
Uname: Linux 5.19.0-46-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
CasperMD5CheckResult: unknown
Date: Mon Aug 7 14:02:28 2023
RebootRequiredPkgs: Error: path contained symlinks.
SourcePackage: glibc
UpgradeStatus: No upgrade log present (probably fresh install)
2023-11-29 23:27:57 Bug Watch Updater glibc: status Unknown New
2023-11-29 23:27:57 Bug Watch Updater glibc: importance Unknown Low
2023-11-30 10:15:17 Xuanwo bug added subscriber Xuanwo
2023-12-04 15:22:54 Taihsiang Ho bug added subscriber Taihsiang Ho
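
The measurement described in the bug description (fixed-size copies between mmap'd buffers at chosen page offsets) can be sketched roughly as follows. This is not the code from memcpy_loop.cpp: the function name copy_rate_gbps, its parameters, and the 4096-byte page size are assumptions made for illustration, and unlike the original benchmark it does not pin the thread to a core.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical helper (not taken from memcpy_loop.cpp): copies `bytes` bytes
// `passes` times between two anonymous mmap'd buffers and returns GB/s.
// `src_offset` and `dst_offset` are byte offsets from the start of a page.
double copy_rate_gbps(std::size_t bytes, std::size_t passes,
                      std::size_t src_offset, std::size_t dst_offset)
{
    const std::size_t page = 4096;  // assumed page size
    // Round each mapping up to a whole number of pages.
    const std::size_t src_len = (src_offset + bytes + page - 1) / page * page;
    const std::size_t dst_len = (dst_offset + bytes + page - 1) / page * page;

    void *src_map = mmap(nullptr, src_len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (src_map == MAP_FAILED)
        throw std::runtime_error("mmap of source buffer failed");
    void *dst_map = mmap(nullptr, dst_len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dst_map == MAP_FAILED) {
        munmap(src_map, src_len);
        throw std::runtime_error("mmap of destination buffer failed");
    }

    // Touch every page up front so page faults are not part of the timing.
    std::memset(src_map, 1, src_len);
    std::memset(dst_map, 0, dst_len);

    char *src = static_cast<char *>(src_map) + src_offset;
    char *dst = static_cast<char *>(dst_map) + dst_offset;

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < passes; i++) {
        std::memcpy(dst, src, bytes);
        // Compiler barrier (GCC/Clang syntax) so the copies are not elided.
        asm volatile("" : : "r"(dst) : "memory");
    }
    const auto stop = std::chrono::steady_clock::now();
    const double seconds = std::chrono::duration<double>(stop - start).count();

    munmap(src_map, src_len);
    munmap(dst_map, dst_len);
    return static_cast<double>(bytes) * static_cast<double>(passes) / seconds / 1e9;
}
```

Calling copy_rate_gbps(2113, 1000000, 0, 1) and copy_rate_gbps(2111, 1000000, 0, 1) would correspond to the two cases compared in the description. The rates obtained depend on the CPU, microcode and glibc version, and the compiler may inline memcpy when the size is a compile-time constant, so passing the size at run time is the safer way to exercise the glibc implementation.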