QEMU

Bug #1841592
Comment #4

Richard Henderson (rth) wrote on 2019-09-12:

It should be a fused multiply add; you may need to use -ffast-math or
something to get the compiler to generate the proper instruction.

However, one can see from target/ppc/translate/fp-impl.inc.c:

/* fmadd - fmadds */
GEN_FLOAT_ACB(madd, 0x1D, 1, PPC_FLOAT);

through to _GEN_FLOAT_ACB:

    gen_helper_f##op(t3, cpu_env, t0, t1, t2); \
    if (isfloat) { \
        gen_helper_frsp(t3, cpu_env, t3); \
    } \

That right there is a double-precision fma followed by a round
to single precision, so the exact result is rounded twice rather
than once. This pattern is replicated for all single-precision
operations, and is of course wrong.

I believe that correct results may be obtained by having
single-precision helpers that first convert the double-precision
input into a single-precision input using helper_tosingle(),
perform the required operation, then convert the result back to
double-precision using helper_todouble().

The manual says:

# For single-precision arithmetic instructions, all input values
# must be representable in single format; if they are not, the
# result placed into the target FPR, and the setting of
# status bits in the FPSCR and in the Condition Register
# (if Rc=1), are undefined.

The tosingle/todouble conversions are exact and bit-preserving for
values representable in single format. They are used by load-single
and store-single, which convert between a single-precision in-memory
value and the double-precision register value. Therefore the input
given to float32_add using this conversion would be exactly the same
as if we had given the value unmolested from a memory input.
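
The bit-preserving property can be checked with a small sketch. It uses C casts as a stand-in for the todouble/tosingle helpers, which is an assumption: the real helpers work on raw bits. The function names are hypothetical. NaN inputs are left out because a cast may quieten a signalling NaN.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret raw IEEE 754 single-precision bits as a float. */
static float float_of_bits(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Raw IEEE 754 bits of a float. */
static uint32_t bits_of_float(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Widen a single to double (register format) and narrow it back.
 * Every non-NaN single is exactly representable as a double, so
 * neither cast rounds and the original bits come back. */
static uint32_t round_trip_bits(uint32_t single_bits)
{
    double wide = (double)float_of_bits(single_bits);
    return bits_of_float((float)wide);
}
```

For example, round_trip_bits(0x3f800001) returns 0x3f800001, and the
same holds for zeros, denormals and infinities.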

I don't know what real ppc hw does -- whether it takes all of
the double-precision input bits and rounds to 23 bits, like the
old 80387 hardware does, or truncates the input as I propose.
But for architectural results we don't have to care, because
of the UNDEFINED escape clause.
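
To make the two candidate behaviours concrete, here is a minimal sketch for an input that is not representable in single format. The truncating function is a hypothetical illustration, not QEMU's helper_tosingle, and it only handles inputs whose result is a normal single-precision number.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical truncating conversion: keep the sign, rebias the
 * exponent, keep the top 23 fraction bits and drop the low 29. */
static uint32_t truncate_double_to_single_bits(double d)
{
    uint64_t in;
    memcpy(&in, &d, sizeof in);

    uint32_t sign = (uint32_t)(in >> 63) << 31;
    int exp = (int)((in >> 52) & 0x7ff) - 1023 + 127;  /* rebias */
    uint32_t frac = (uint32_t)(in >> 29) & 0x7fffff;   /* truncate */

    assert(exp > 0 && exp < 255);  /* sketch covers normal results only */
    return sign | ((uint32_t)exp << 23) | frac;
}

/* Rounding conversion via the C cast (round-to-nearest-even
 * in the default rounding mode). */
static uint32_t round_double_to_single_bits(double d)
{
    float f = (float)d;
    uint32_t out;
    memcpy(&out, &f, sizeof out);
    return out;
}
```

For d = 0x1.0000018p0 (that is, 1 + 2^-24 + 2^-25) truncation gives
0x3f800000 while rounding gives 0x3f800001. For a representable input
such as 1.5 both give 0x3fc00000, which is the only case the manual
defines.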