Performance regression related to inlining
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
SBCL | Fix Released | Undecided | Unassigned | | |
Bug Description
The following code (without any inline/notinline declarations) takes 0.5s on SBCL 2.1.7 but 16.5s on 2.1.8 - 2.2.1 (current):
(defun rad (x)
  (declare (double-float x))
  (* (* 2d0 pi) (* x (/ 1 360d0))))
(time
 (dotimes (k 100000000)
   (cis (rad 90d0))))
Declaiming RAD inline achieves the performance of 2.1.7 without inline declarations.
Surprisingly, declaiming RAD notinline is even faster.
Repeatable test case:
Save to test-inlining.cl and execute on 2.1.7 and 2.1.8 with
sbcl --no-userinit --eval '(progn (compile-file "test-inlining.cl") (load "test-inlining"))'
Add inline declarations and repeat.
;;;;;;;
;;; Run with
;;; /opt/sbcl-
;;; OS: Fedora 34, Fedora 35
(defpackage :regression
  (:use :common-lisp))
(in-package :regression)
;; -- Performance regression when not using any inline declarations
;; -- Inlining RAD eliminates the performance regression
;; (declaim (inline rad))
;; -- Declaim notinline is faster than declaim inline ?!
;; (declaim (notinline rad))
(defun rad (x)
  (declare (double-float x))
  (* (* 2d0 pi) (* x (/ 1 360d0))))
;; SBCL 2.1.4, 2.1.6, 2.1.7:
;; no inline/notinline 1.9s
;; SBCL 2.2.0, 2.2.1, 2.1.8:
;; no inline/notinline: 16.5s
;; declaim inline rad: 1.5s
;; declaim notinline rad: 0.5s
(time
 (dotimes (k 100000000)
   (cis (rad 90d0))))
;;; EOF
;;;;;;;
CL-USER> *features*
(:SWANK :CL-FAD :BORDEAUX-THREADS :THREAD-SUPPORT CFFI-FEATURES:
CFFI-FEATURES:
:SPLIT-SEQUENCE :SBCL-USES-
:ASDF3.1 :ASDF3 :ASDF2 :ASDF :OS-UNIX :NON-BASE-
:X86-64 :GENCGC :64-BIT :ANSI-CL :COMMON-LISP :ELF :IEEE-FLOATING-
:LITTLE-ENDIAN :PACKAGE-
:SB-UNICODE :SBCL :UNIX)
-------
uname -a
Linux aguas-13 5.15.18-
-------
Changed in sbcl: status: In Progress → Fix Committed
Changed in sbcl: status: Fix Committed → Fix Released
Is the better performance of 2.1.7 because it throws away the body of dotimes?
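One way to check that hypothesis is to time a variant whose loop body cannot simply be discarded. The following is only a sketch, not the fix that was committed: BENCH and ACC are hypothetical names that do not appear in the report, the argument to RAD is made to depend on the loop counter, and the DOUBLE-FLOAT declarations assume SBCL, where PI is a double-float.

```lisp
;; RAD is copied unchanged from the test case in the bug description.
(defun rad (x)
  (declare (double-float x))
  (* (* 2d0 pi) (* x (/ 1 360d0))))

;; Hypothetical helper (not part of the report): accumulate the real part
;; of every CIS result and return the sum, so the work done in the loop
;; body feeds an observable value.
(defun bench (n)
  (declare (fixnum n))
  (let ((acc 0d0))
    (declare (double-float acc))
    (dotimes (k n acc)
      ;; The argument varies with K, so the call is not on a constant.
      (incf acc (realpart (cis (rad (float k 1d0))))))))
```

Running (time (print (bench 100000000))) on 2.1.7 and on 2.1.8 with and without the inline/notinline declaims should then show whether the gap remains once the result is actually used. If the timings converge, that would be consistent with the older compiler having dropped the unused loop body in the original test.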