Optimized memcmp for arm64

Bug #1720832 reported by dann frazier
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
glibc (Ubuntu)   Fix Released   Undecided    Adam Conrad

Bug Description

A patch has recently landed upstream to optimize memcmp for AArch64:

commit 922369032c604b4dcfd535e1bcddd4687e7126a5
Author: Wilco Dijkstra <email address hidden>
Date: Thu Aug 10 17:00:38 2017 +0100

    [AArch64] Optimized memcmp.

    This is an optimized memcmp for AArch64. This is a complete rewrite
    using a different algorithm. The previous version split into cases
    where both inputs were aligned, the inputs were mutually aligned and
    unaligned using a byte loop. The new version combines all these cases,
    while small inputs of less than 8 bytes are handled separately.

    This allows the main code to be sped up using unaligned loads since
    there are now at least 8 bytes to be compared. After the first 8 bytes,
    align the first input. This ensures each iteration does at most one
    unaligned access and mutually aligned inputs behave as aligned.
    After the main loop, process the last 8 bytes using unaligned accesses.

    This improves performance of (mutually) aligned cases by 25% and
    unaligned by >500% (yes >6 times faster) on large inputs.

            * sysdeps/aarch64/memcmp.S (memcmp):
            Rewrite of optimized memcmp.
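
For readers who do not read AArch64 assembly, the strategy described in the commit message can be sketched in portable C. This is only an illustration under stated assumptions, not the code in sysdeps/aarch64/memcmp.S: the names sketch_memcmp, load64 and cmp_bytes are hypothetical, unaligned loads are modelled with memcpy, and a mismatching 8-byte block is resolved with a byte loop rather than the byte-reversal tricks an assembly version would likely use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 8 bytes from a possibly unaligned address (hypothetical helper). */
static uint64_t load64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Byte-wise compare; used for short inputs and to find the first
   differing byte inside an 8-byte block that is known to differ. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

/* Illustrative sketch only; not the glibc implementation. */
int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1, *b = s2;

    /* Small inputs of less than 8 bytes are handled separately. */
    if (n < 8)
        return cmp_bytes(a, b, n);

    /* Compare the first 8 bytes with unaligned loads. */
    if (load64(a) != load64(b))
        return cmp_bytes(a, b, 8);

    /* Step to the next 8-byte boundary of the first input. The step is
       between 1 and 8, so every skipped byte was already compared. */
    size_t i = 8 - ((uintptr_t)a & 7);

    /* Main loop: the first input is aligned, so at most one of the two
       loads per iteration is unaligned. */
    for (; i + 8 <= n; i += 8)
        if (load64(a + i) != load64(b + i))
            return cmp_bytes(a + i, b + i, 8);

    /* Process the last 8 bytes with unaligned loads; this block may
       overlap bytes that were already found equal. */
    if (load64(a + n - 8) != load64(b + n - 8))
        return cmp_bytes(a + n - 8, b + n - 8, 8);
    return 0;
}
```

Because the tail is handled by one overlapping 8-byte compare, the sketch needs no byte loop for the remainder, which is the simplification the commit message attributes to guaranteeing at least 8 bytes of input on the main path.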

dann frazier (dannf)
Changed in glibc (Ubuntu):
assignee: nobody → Adam Conrad (adconrad)
status: New → Confirmed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.26-0ubuntu2

---------------
glibc (2.26-0ubuntu2) artful; urgency=medium

  * Cherry-pick some changes from Debian git for a few pending Ubuntu bugfixes:
    - Update to master and drop redundant submitted-tst-tlsopt-powerpc.diff.
    - debian/patches/any/local-cudacc-float128.diff: Local patch to prevent
      defining __HAVE_FLOAT128 on NVIDIA's CUDA compilers (LP: #1717257)
    - debian/patches/arm/git-arm64-memcmp.diff: Backport optimized memcmp
      for AArch64, improving performance from 25% to 500% (LP: #1720832)
    - debian/patches/amd64/git-x86_64-search.diff: Backport upstream commit
      to put x86_64 back in the search path, like in 2.25 (LP: #1718928)
    - debian/rules.d/debhelper.mk: Filter python hooks in stage1 (LP: #1715366)

 -- Adam Conrad <email address hidden> Wed, 11 Oct 2017 14:21:40 -0600

Changed in glibc (Ubuntu):
status: Confirmed → Fix Released