Comment 177 for bug 727064

Revision history for this message
In , torvalds (torvalds-redhat-bugs) wrote :

(In reply to comment #130)
>
> Good software _will_ notice, if it's using memcpy() deliberately, for better
> performance, and doesn't want it aliased.

Bullshit. You clearly don't know what you're talking about.

There are exactly two reasons for memmove() existing in the first place:

 (a) memcpy() hasn't necessarily handled overlapping areas, so people had to make up a new function just because you couldn't _rely_ on memcpy working right for that case.

   IOW, it's not an issue of "memcpy shouldn't handle overlapping areas automatically", but a "you can't rely on it even if it does, so for portability reasons you shouldn't".

 (b) you can make a _simpler_ memcpy() if you assume there are no overlapping areas (which is obviously the reason why memcpy originally didn't). But the "simpler" is really trivial - the traditional difference between memcpy and memmove is literally that memcpy always copied upwards, and memmove() simply just checks whether the destination address is above the source address and decides to copy backwards if so.

And quite frankly, neither excuse is valid these days for not just aliasing the two to the same thing (and make both do memmove).

There is no performance difference. The "is it overlapping" is about two small and fast instructions (no cache issues etc), depending on just how exact you want to do it (ie again, traditionally it's just that single compare and conditional jump - but you can be more precise if you want to by adding just a few more instructions).

In fact, I can pretty much guarantee that if you have a program that notices the one or two cycles of the overlapping check, then you have a program that actually prefers the _old_ glibc memcpy(), the simpler one that worked fine with the flash player too. The traditional one that always moved things upwards.

Because the whole bug was introduced by the change (did you take a look at glibc sources? I did) that made memcpy() _much_ more complicated, and now it does computed indirect jumps based on size etc. So the new-and-improved one is the one that takes a lot more cycles, exactly because it tries to handle the special cases. But then it doesn't handle the _simple_ special case of "is it overlapping?".

In short: there is simply no excuse for the current memcpy() behavior. It doesn't match historical behavior, so it breaks existing applications. And there is certainly no "simplicity" argument: if you want a simpler and more maintainable glibc code base, you just say "let's do memcpy and memmove with the same code, and not have those ugly #ifdef's and duplicate compilations".

And there is no performance argument. None. If the argument is that "memcpy()" should be as simple as possible and have minimal overhead and perform best for the case where a single cycle matters (ie everything cached, nicely aligned arguments etc), then the glibc changes should just be reverted entirely.

So either re-instate the old traditional behavior ("memcpy is simple") that at least has some _historical_ reason behind it, and that makes flash work again.

Or alternatively, just make memcpy DTRT automatically. The check for overlap is not going to be noticed. And _not_ checking for overlap obviously was noticed.