Optimized memcmp for arm64

Bug #1720832 reported by dann frazier on 2017-10-02
This bug affects 1 person
Affects: glibc (Ubuntu)
Assigned to: Adam Conrad

Bug Description

A patch has recently landed upstream to optimize memcmp for AArch64:

commit 922369032c604b4dcfd535e1bcddd4687e7126a5
Author: Wilco Dijkstra <email address hidden>
Date: Thu Aug 10 17:00:38 2017 +0100

    [AArch64] Optimized memcmp.

    This is an optimized memcmp for AArch64. This is a complete rewrite
    using a different algorithm. The previous version split into cases
    where both inputs were aligned, the inputs were mutually aligned and
    unaligned using a byte loop. The new version combines all these cases,
    while small inputs of less than 8 bytes are handled separately.

    This allows the main code to be sped up using unaligned loads since
    there are now at least 8 bytes to be compared. After the first 8 bytes,
    align the first input. This ensures each iteration does at most one
    unaligned access and mutually aligned inputs behave as aligned.
    After the main loop, process the last 8 bytes using unaligned accesses.

    This improves performance of (mutually) aligned cases by 25% and
    unaligned by >500% (yes >6 times faster) on large inputs.

            * sysdeps/aarch64/memcmp.S (memcmp):
            Rewrite of optimized memcmp.
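
The structure described in the commit message can be sketched in portable C. This is only an illustration of the control flow (small-input path, unaligned first block, aligning the first input, main loop, overlapping unaligned tail). It is not the glibc code, which is hand-written AArch64 assembly in sysdeps/aarch64/memcmp.S. The names sketch_memcmp, load64 and diff8 are hypothetical, and the memcpy-based load is an assumption standing in for an unaligned load instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 8 bytes from a possibly unaligned address. Going through memcpy
   keeps this well-defined C; compilers usually emit a single load. */
static uint64_t load64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Find the first differing byte in two 8-byte blocks known to differ. */
static int diff8(const unsigned char *a, const unsigned char *b)
{
    for (size_t i = 0; i < 8; i++)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

/* Hypothetical illustration of the algorithm outlined in the commit message. */
int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1, *b = s2;

    /* Small inputs of less than 8 bytes are handled separately. */
    if (n < 8) {
        for (size_t i = 0; i < n; i++)
            if (a[i] != b[i])
                return a[i] < b[i] ? -1 : 1;
        return 0;
    }

    /* At least 8 bytes remain, so compare the first 8 with unaligned loads. */
    if (load64(a) != load64(b))
        return diff8(a, b);

    /* Step forward by 1..8 bytes so that a + i is 8-byte aligned. Bytes
       before i were already compared above; from here on only the load
       from b can be unaligned. */
    size_t i = 8 - ((uintptr_t)a & 7);

    /* Main loop: one 8-byte block per iteration. */
    while (i + 8 <= n) {
        if (load64(a + i) != load64(b + i))
            return diff8(a + i, b + i);
        i += 8;
    }

    /* Tail: compare the last 8 bytes with unaligned loads. The block may
       overlap bytes already known to be equal, which is harmless. */
    if (i < n && load64(a + n - 8) != load64(b + n - 8))
        return diff8(a + n - 8, b + n - 8);

    return 0;
}
```

Because every block after the first starts at an aligned address in the first input, mutually aligned inputs take the same path as fully aligned ones, which matches the behaviour the commit message describes.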

dann frazier (dannf) on 2017-10-04
Changed in glibc (Ubuntu):
assignee: nobody → Adam Conrad (adconrad)
status: New → Confirmed
Launchpad Janitor (janitor) wrote:

This bug was fixed in the package glibc - 2.26-0ubuntu2

glibc (2.26-0ubuntu2) artful; urgency=medium

  * Cherry-pick some changes from Debian git for a few pending Ubuntu bugfixes:
    - Update to master and drop redundant submitted-tst-tlsopt-powerpc.diff.
    - debian/patches/any/local-cudacc-float128.diff: Local patch to prevent
      defining __HAVE_FLOAT128 on NVIDIA's CUDA compilers (LP: #1717257)
    - debian/patches/arm/git-arm64-memcmp.diff: Backport optimized memcmp
      for AArch64, improving performance from 25% to 500% (LP: #1720832)
    - debian/patches/amd64/git-x86_64-search.diff: Backport upstream commit
      to put x86_64 back in the search path, like in 2.25 (LP: #1718928)
    - debian/rules.d/debhelper.mk: Filter python hooks in stage1 (LP: #1715366)

 -- Adam Conrad <email address hidden> Wed, 11 Oct 2017 14:21:40 -0600

Changed in glibc (Ubuntu):
status: Confirmed → Fix Released