Improve /= on large numbers of arguments

Bug #1745196 reported by Paul F. Dietz on 2018-01-24
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SBCL
Undecided
Unassigned

Bug Description

The current algorithm for /= does O(N^2) comparisons on N arguments. For large N, it would be more efficient to use an EQUALP hash table (or a hash table specialized for numbers). The crossover is somewhere around N=60, by experiment using a standard EQUALP hash table created with :size N.

(Use of SXHASH and a Bloom filter might be even faster.)

description: updated
Stas Boukarev (stassats) wrote :

Who calls /= with 60 arguments? That's even more than call-arguments-limit.

Paul F. Dietz (paul-f-dietz) wrote :

It's more than the mandated minimum value of call-arguments-limit, but the actual value of that constant in my 64 bit SBCL is 4611686018427387903.

You might as well ask who creates vectors with more than 1024 elements (the mandated minimum value of array-dimension-limit).

Stas Boukarev (stassats) wrote :

Are 60 argument calls to /= as frequent and useful as large arrays?

Paul F. Dietz (paul-f-dietz) wrote :

Specifically for /=? Probably not (although (apply #'/= list) is a nice way to ask if there are any duplicates in list.) The general (apply fn list) idiom occurs with many n-ary builtins, and it would be nice to be able to apply it to arbitrary lists. I know this is not completely portable.

Stas Boukarev (stassats) wrote :

(apply #'/= list) is a nice way to blow the stack.

Paul F. Dietz (paul-f-dietz) wrote :

Yes, it fails somewhere above 200000 parameters. That makes me wonder why call-arguments-limit and lambda-parameters-limit are so much higher.

If lambda-parameters-limit were something much smaller, I imagine the ABI could be designed so the extra parameters beyond the limit are always in a list, not allocated on the stack (they'd only be accessed by &rest and &key parameters.) The &rest parameter is allowed to share structure with the last argument to APPLY.

John Cowan (johnwcowan) wrote :

Granted that >2 arguments is not a typical case, the argument list can be sorted and each consecutive pair compared for inequality. This is O(n log n) rather than O(n^2), though it does require n additional cons cells; it is slower but less space overhead than a hash table.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers