Bad code generated for 64-bit compare
This bug report was converted into a question: question #708566: Bad code generated for 64-bit compare.
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
GNU Arm Embedded Toolchain | Invalid | Undecided | Unassigned | |
Bug Description
ARM GCC 13.2.0 compiles 64-bit integer compares such as the following:
int32_t Compare64(int64_t x64, int64_t y64)
{
return (x64 <= y64) ;
}
into a sequence of ARM Thumb instructions similar to:
Compare64:
CMP R0,R2
SBCS R1,R1,R3
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR
However, this sequence produces an incorrect result for x64 = 6 and y64 = 5: the function returns 1 although 6 <= 5 is false. The problem is that although the CMP, SBCS sequence leaves the correct values in the N, V, and C flags, the Z flag depends only on bits 63..32 of the result and ignores bits 31..00. Here the upper words are equal, so SBCS sets Z and the LE condition passes. The following is a functionally correct solution using straight-line code.
Compare64:
CMP R0,R2
MRS R0,APSR // preserves Z flag from bits 31..00
SBCS R1,R1,R3
// N, V and C are correct at this point, but Z is not. Any 64-bit
// compare that depends on the Z flag (EQ,NE,GT,LE,HI,LS) must
// correct the Z flag before making a decision. 64-bit compares
// that do not depend on the Z flag (GE,LT)
// do not require this correction or the MRS instruction above.
// The following code corrects the Z flag:
MRS R1,APSR // get flags from bits 63..32
ORN R0,R0,1 << 30 // isolate Z flag from bits 31..00
AND R0,R0,R1 // combine with Z from bits 63..32
MSR APSR_nzcvq,R0 // Z is now 1 iff bits 63..00 are all zeroes
// Now all flags are correct and any condition code can be used
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR
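The flag behaviour described above can be checked with a small C model. This is only a sketch: the `Flags` struct and the helper names (`sub_with_carry`, `cmp64_flags`, `le_uncorrected`, `le_corrected`) are hypothetical and come from neither the toolchain nor the report. It models CMP followed by SBCS as two 32-bit add-with-carry steps, then evaluates LE with and without the combined Z flag.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of the APSR condition flags. */
typedef struct { bool n, z, c, v; } Flags;

/* Models a flag-setting subtract: a - b - !carry_in, computed as
 * a + ~b + carry_in, which is how ARM defines SUBS/CMP/SBCS. */
static Flags sub_with_carry(uint32_t a, uint32_t b, bool carry_in)
{
    uint32_t nb = ~b;
    uint64_t wide = (uint64_t)a + (uint64_t)nb + (carry_in ? 1u : 0u);
    uint32_t r = (uint32_t)wide;
    Flags f;
    f.n = (r >> 31) & 1u;                      /* sign bit of the 32-bit result */
    f.z = (r == 0);                            /* only THIS word is tested      */
    f.c = (wide >> 32) & 1u;                   /* carry out (no borrow)         */
    f.v = (((a ^ r) & (nb ^ r)) >> 31) & 1u;   /* signed overflow               */
    return f;
}

/* CMP lo(x),lo(y) then SBCS hi(x),hi(x),hi(y); returns the flags left by
 * SBCS and reports the Z flag produced by the low-word CMP separately. */
static Flags cmp64_flags(int64_t x, int64_t y, bool *z_low)
{
    uint64_t ux = (uint64_t)x, uy = (uint64_t)y;
    Flags lo = sub_with_carry((uint32_t)ux, (uint32_t)uy, true);
    Flags hi = sub_with_carry((uint32_t)(ux >> 32), (uint32_t)(uy >> 32), lo.c);
    *z_low = lo.z;
    return hi;
}

/* ARM condition LE: Z set, or N != V. */
static bool cond_le(Flags f) { return f.z || (f.n != f.v); }

/* LE evaluated directly on the flags left by SBCS (Z covers bits 63..32 only). */
bool le_uncorrected(int64_t x, int64_t y)
{
    bool z_low;
    Flags f = cmp64_flags(x, y, &z_low);
    return cond_le(f);
}

/* LE evaluated after Z is replaced by (Z of bits 63..32) AND (Z of bits 31..00). */
bool le_corrected(int64_t x, int64_t y)
{
    bool z_low;
    Flags f = cmp64_flags(x, y, &z_low);
    f.z = f.z && z_low;
    return cond_le(f);
}
```

In this model `le_uncorrected(6, 5)` evaluates to true while `le_corrected(6, 5)` evaluates to false, matching the behaviour described for the two listings.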
Note: If the comparison is simply for zero (EQ) or non-zero (NE), then a simpler solution would be something like the following:
...
SUBS R0,R0,R2 // Need to compute the 64-bit difference
SBC R1,R1,R3
ORRS R0,R0,R1 // Z=1 iff bits 63..00 are all zeroes.
ITE EQ
...
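As a rough C equivalent of the EQ/NE fragment above, the sketch below forms the 64-bit difference and ORs its two halves together. The function name `equal64_by_halves` is hypothetical and only serves to illustrate the idea.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch of the SUBS / SBC / ORRS idea: the operands are equal exactly
 * when both 32-bit halves of their difference are zero. */
bool equal64_by_halves(int64_t x, int64_t y)
{
    uint64_t diff = (uint64_t)x - (uint64_t)y;  /* SUBS + SBC: 64-bit difference */
    uint32_t lo = (uint32_t)diff;               /* bits 31..00 */
    uint32_t hi = (uint32_t)(diff >> 32);       /* bits 63..32 */
    return (lo | hi) == 0;                      /* ORRS: Z=1 iff all 64 bits are zero */
}
```

Unsigned arithmetic is used for the subtraction so that wraparound is well defined in C.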
tags: added: 64-bit, removed: 65-bit
description: updated
Sorry, the code generated by the compiler is actually correct. I had missed an important point:
For >= and <, the compiler uses the condition codes GE and LT (which do NOT depend on the Z flag).
For <= and >, the compiler does NOT use LE or GT (which require testing the invalid Z flag). Instead, it reverses the operands and uses GE and LT.
I.e., testing for x <= y is equivalent to testing for y >= x, and x > y is equivalent to y < x.
Dan
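The operand swap described in the comment above can be illustrated with a short C sketch. It assumes that N and V after a CMP, SBCS pair equal the sign and signed-overflow bits of the full 64-bit subtraction. The helper names (`ge_from_n_and_v`, `le_via_swapped_ge`, `gt_via_swapped_lt`) are hypothetical, and the sketch models only the flag logic, not the exact instructions the compiler emits.

```c
#include <stdint.h>
#include <stdbool.h>

/* GE as the hardware evaluates it after subtracting b from a:
 * the condition passes when N == V. The Z flag is never consulted. */
static bool ge_from_n_and_v(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t diff = ua - ub;                             /* a - b, wrapping        */
    bool n = (diff >> 63) & 1u;                          /* sign of the result     */
    bool v = (((ua ^ ub) & (ua ^ diff)) >> 63) & 1u;     /* signed overflow of a-b */
    return n == v;
}

/* x <= y tested as y >= x (operands reversed, condition GE). */
bool le_via_swapped_ge(int64_t x, int64_t y)
{
    return ge_from_n_and_v(y, x);
}

/* x > y tested as y < x (operands reversed, condition LT, i.e. N != V). */
bool gt_via_swapped_lt(int64_t x, int64_t y)
{
    return !ge_from_n_and_v(y, x);
}
```

Because only N and V are consulted, the case x64 = 6, y64 = 5 from the original description is handled without any correction of the Z flag.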