As I understand it, once the trampoline has touched the upper halves of the YMM registers, ALL subsequent switches between AVX and SSE code require a time-consuming store/restore operation (i.e. all future calls to pow will suffer). Touching the upper halves sets something like a dirty flag, which forces the CPU to do the store/restore, and this flag is never cleared for the rest of the program's execution. That is why the impact is so serious.
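To illustrate the "dirty flag" idea, here is a minimal sketch in C. It is not the trampoline code from this report. The function names `sum8` and `sum8_avx` are hypothetical, and it assumes GCC or Clang on x86-64 (for the `target` attribute and `__builtin_cpu_supports`). The 256-bit load dirties the upper YMM halves, and `_mm256_zeroupper()` (the VZEROUPPER instruction) marks them clean again. If nothing executes VZEROUPPER or VZEROALL after the upper halves are touched, the dirty state persists, which matches the behaviour described above.

```c
#include <immintrin.h>

/* Hypothetical helper: sums 8 floats with 256-bit AVX.
 * The 256-bit load writes the upper YMM halves, i.e. makes them "dirty". */
__attribute__((target("avx")))
static float sum8_avx(const float *v)
{
    __m256 x  = _mm256_loadu_ps(v);            /* touches the upper 128 bits */
    __m128 lo = _mm256_castps256_ps128(x);     /* elements 0..3 */
    __m128 hi = _mm256_extractf128_ps(x, 1);   /* elements 4..7 */
    __m128 s  = _mm_add_ps(lo, hi);            /* four partial sums */
    s = _mm_hadd_ps(s, s);
    s = _mm_hadd_ps(s, s);                     /* total in every lane */
    float r = _mm_cvtss_f32(s);

    /* Clear the upper halves before returning to code that may use
     * legacy (non-VEX) SSE, so no AVX/SSE transition penalty is paid. */
    _mm256_zeroupper();
    return r;
}

/* Hypothetical wrapper: uses the AVX path only if the CPU supports it,
 * otherwise falls back to a plain scalar loop. */
float sum8(const float *v)
{
    if (__builtin_cpu_supports("avx"))
        return sum8_avx(v);

    float r = 0.0f;
    for (int i = 0; i < 8; i++)
        r += v[i];
    return r;
}
```

Compilers usually emit VZEROUPPER themselves at the end of functions that use 256-bit registers, so the explicit call here is only to make the step visible. Hand-written assembly, such as a dynamic-linker trampoline, gets no such help.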
I was able to reproduce the issue using the F25 live CD. So it looks like a CPU-model-dependent issue. We were able to reproduce it on an E5-1630 (Haswell), though.