Bug #756926 “missing REPLACE transforms for strings” : Bugs : SBCL

MrOrdinaire (mrordinaire) wrote on 2013-04-16:

#1

Hi, I'd like to take on this bug. Could you give me some pointers on how to get started?

Stas Boukarev (stassats) wrote on 2013-04-16:

#2

In SLIME, press M-. on REPLACE. You'll see lots of :DEFTRANSFORM entries; select one of them, and you'll see how they're generated.

So, to add transforms for simple strings, try adding (define-one-transform simple-string simple-string) to the macrolet body (and other combinations, like simple-string <=> simple-base-string).
For a non-inline version, a function that does the copying needs to be defined, and then a transform to that function, something like:

(deftransform replace ((string1 string2) (simple-string simple-string) simple-string)
  `(replace-simple-strings string1 string2))

A macro for defining for all the combinations of simple-string with simple-base-string and (simple-array character (*)) could be useful.
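
One way such a macro could look, as a minimal sketch: DEFINE-FOR-TYPE-PAIRS is a hypothetical name, not something in the SBCL sources, and all it does is generate one (definer type1 type2) form per ordered pair of types. When the call is placed inside the macrolet body, the generated forms would then be expanded by the local DEFINE-ONE-TRANSFORM.

```lisp
;; Hypothetical helper (not part of SBCL): expand into one
;; (DEFINER TYPE1 TYPE2) form for every ordered pair drawn from TYPES.
(defmacro define-for-type-pairs (definer types)
  `(progn
     ,@(loop for type1 in types
             append (loop for type2 in types
                          collect `(,definer ,type1 ,type2)))))
```

For example, (define-for-type-pairs define-one-transform (simple-string simple-base-string (simple-array character (*)))) would expand into nine DEFINE-ONE-TRANSFORM forms.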

Paul Khuong (pvk) wrote on 2013-04-16:

#3

Not quite. The problem is that simple-string is a union type: it could be a simple-base-string or a (simple-array character 1) (or a simple-array nil, but that's not that important). Without additional runtime type dispatch, DEFINE-ONE-TRANSFORM will have to call generic array accessors.
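
To make the union concrete, here is a small portable sketch. STRING-REPRESENTATION is a hypothetical helper written only for illustration; it reports which of the two concrete representations a given simple-string uses. On implementations where BASE-CHAR and CHARACTER coincide, the two branches name the same type and the first one always wins.

```lisp
;; Hypothetical helper: name the concrete representation of a SIMPLE-STRING.
(defun string-representation (string)
  (declare (type simple-string string))
  (etypecase string
    (simple-base-string :base)
    ((simple-array character (*)) :character)))

;; Both concrete types are subtypes of SIMPLE-STRING, so a declaration of
;; SIMPLE-STRING alone does not tell the compiler which accessor to use.
(defun concrete-string-types-p ()
  (and (subtypep 'simple-base-string 'simple-string)
       (subtypep '(simple-array character (*)) 'simple-string)))
```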

The best way is probably to have code that does something like

(typecase string1
  (simple-base-string
   (typecase string2
     (simple-base-string
      (replace ...))
     ((simple-array character 1)
      (replace ...))
     (t ; simple-array nil
      (error ...))))
  ((simple-array character 1)
   (typecase string2
     (simple-base-string
      (replace ...))
     ((simple-array character 1)
      (replace ...))
     (t
      (error ...))))
  (t
   (error ...)))

I'm pretty sure I would rather have this in an out-of-line function (let's call it %replace-simple-strings).
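
A minimal portable sketch of what such a function might look like. It assumes a keyword-free, two-argument signature (no :START1/:END1 handling), uses ETYPECASE so that the simple-array nil case signals a type error rather than calling ERROR explicitly, and relies on the compiler narrowing the type of each variable inside its branch; the local DISPATCH macro is a hypothetical helper that just writes out the nested typecase above.

```lisp
;; Sketch only: the two-argument signature is an assumption made for
;; illustration, not the signature a real patch would need.
(defun %replace-simple-strings (string1 string2)
  (declare (type simple-string string1 string2))
  (macrolet ((dispatch (var &body body)
               ;; Repeat BODY once per concrete string representation, so
               ;; that VAR has a specialised array type in each branch.
               `(etypecase ,var
                  (simple-base-string ,@body)
                  ((simple-array character (*)) ,@body))))
    (dispatch string1
      (dispatch string2
        ;; Four copies of this call are generated, one per type pair.
        (replace string1 string2)))))
```

Called as (%replace-simple-strings target source), it copies like a plain two-argument REPLACE and returns the target string.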

The problem is that a transform that substitutes a call to %replace-simple-strings for the regular REPLACE might trigger in cases where a more specific transform would otherwise apply. I see two and a half complementary measures:

1. transforms that are defined last are executed first, so we want to define the one that calls %replace-simple-strings before the define-replace-transforms block;
2 a) more specific transforms might only trigger after constraint propagation has tightened derived types; DELAY-IR1-TRANSFORM/:CONSTRAINT can help there;
2 b) add some transforms to specialise calls to %replace-simple-strings like we currently do for calls to REPLACE.

Defining new out-of-line functions is a bit of work: you have to find a good spot to put the definition (in this case, src/code/seq.lisp would make sense), edit package definitions to export the symbol (see package-data-list.lisp-expr), and then create the transform. However, that's only needed for a fresh build. Nearly all the development can be done on a live image, including (re)defining new transforms [*] that expand into calls to CL-USER::%REPLACE-SIMPLE-STRINGS or whatever you want.

[*] For that to work well, you want to write a docstring. If the types and docstring match, the previous definition will be overwritten.

Now, particularly if one of the strings is more specifically typed (e.g. one is a simple-string but the other a simple-base-string), it might be worth it to inline the typecase. If you do that, the transform might trigger recursively. Simply adding a call to DELAY-IR1-TRANSFORM would work. A wordier and slightly less modular approach is to directly emit the code that would be inlined instead of the REPLACE forms. Factoring a chunk out of make-replace-transform will probably be necessary then. The advantage is that the transform won't slow down compilation as much and will work even if constraint propagation is eventually disabled.

I also noticed that these optimisations only happen for simple-arrays. There's probably a decent win to be had by handling potentially non-simple arrays via with-array-data, but that's a completely different project.