2023-11-09 00:48:00 |
Dan Lewis |
bug |
|
|
added bug |
2023-11-09 00:54:35 |
Dan Lewis |
tags |
65-bit compare |
64-bit compare |
|
2023-11-09 15:51:21 |
Dan Lewis |
description |
ARM GCC 13.2.0 compiles 64-bit integer compares such as the following:
int32_t Compare64(int64_t x64, int64_t y64)
{
return (x64 <= y64) ;
}
into a sequence of ARM Thumb instructions similar to:
Compare64:
CMP R0,R2
SBCS R1,R1,R3
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR
However, this produces an incorrect result for x64 = 6 and y64 = 5. The problem is that although the CMP, SBCS sequence leaves the correct values in the N, V, and C flags, the Z flag only depends on bits 63..32 of the result and ignores bits 31..00. Any 64-bit compare must correct the value of the Z flag before making a decision; the following is a functionally correct solution using straight-line code:
Compare64:
SUBS R12,R0,R2 // R12 used as a scratch register
MRS R0,APSR // preserves Z flag from bits 31..00
SBCS R12,R1,R3 // R12 used as a scratch register
// N, V and C are correct at this point, but Z is not.
// The following code corrects the Z flag:
MRS R1,APSR // get flags from bits 63..32
ORN R0,R0,#(1 << 30) // isolate Z flag from bits 31..00
AND R0,R0,R1 // combine with Z from bits 63..32
MSR APSR_nzcvq,R0 // Z is now 1 iff bits 63..00 are all zeroes
// Now all flags are correct and any condition code can be used
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR |
ARM GCC 13.2.0 compiles 64-bit integer compares such as the following:
int32_t Compare64(int64_t x64, int64_t y64)
{
return (x64 <= y64) ;
}
into a sequence of ARM Thumb instructions similar to:
Compare64:
CMP R0,R2
SBCS R1,R1,R3
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR
However, this produces an incorrect result for x64 = 6 and y64 = 5. The problem is that although the CMP, SBCS sequence leaves the correct values in the N, V, and C flags, the Z flag only depends on bits 63..32 of the result and ignores bits 31..00. Any 64-bit compare that depends on the Z flag (EQ, NE, GT, LE, HI, and LS) must correct the Z flag before making a decision; 64-bit compares that do not depend on the Z flag (GE, LT, HS, LO, MI, PL, VS, VC, AL) do not need this change. The following is a functionally correct solution using straight-line code:
Compare64:
SUBS R12,R0,R2 // R12 used as a scratch register
MRS R0,APSR // preserves Z flag from bits 31..00
SBCS R12,R1,R3 // R12 used as a scratch register
// N, V and C are correct at this point, but Z is not.
// The following code corrects the Z flag:
MRS R1,APSR // get flags from bits 63..32
ORN R0,R0,#(1 << 30) // isolate Z flag from bits 31..00
AND R0,R0,R1 // combine with Z from bits 63..32
MSR APSR_nzcvq,R0 // Z is now 1 iff bits 63..00 are all zeroes
// Now all flags are correct and any condition code can be used
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR |
|
2023-11-09 16:00:09 |
Dan Lewis |
description |
ARM GCC 13.2.0 compiles 64-bit integer compares such as the following:
int32_t Compare64(int64_t x64, int64_t y64)
{
return (x64 <= y64) ;
}
into a sequence of ARM Thumb instructions similar to:
Compare64:
CMP R0,R2
SBCS R1,R1,R3
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR
However, this produces an incorrect result for x64 = 6 and y64 = 5. The problem is that although the CMP, SBCS sequence leaves the correct values in the N, V, and C flags, the Z flag only depends on bits 63..32 of the result and ignores bits 31..00. Any 64-bit compare that depends on the Z flag (EQ, NE, GT, LE, HI, and LS) must correct the Z flag before making a decision; 64-bit compares that do not depend on the Z flag (GE, LT, HS, LO, MI, PL, VS, VC, AL) do not need this change. The following is a functionally correct solution using straight-line code:
Compare64:
SUBS R12,R0,R2 // R12 used as a scratch register
MRS R0,APSR // preserves Z flag from bits 31..00
SBCS R12,R1,R3 // R12 used as a scratch register
// N, V and C are correct at this point, but Z is not.
// The following code corrects the Z flag:
MRS R1,APSR // get flags from bits 63..32
ORN R0,R0,#(1 << 30) // isolate Z flag from bits 31..00
AND R0,R0,R1 // combine with Z from bits 63..32
MSR APSR_nzcvq,R0 // Z is now 1 iff bits 63..00 are all zeroes
// Now all flags are correct and any condition code can be used
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR |
ARM GCC 13.2.0 compiles 64-bit integer compares such as the following:
int32_t Compare64(int64_t x64, int64_t y64)
{
return (x64 <= y64) ;
}
into a sequence of ARM Thumb instructions similar to:
Compare64:
CMP R0,R2
SBCS R1,R1,R3
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR
However, this produces an incorrect result for x64 = 6 and y64 = 5. The problem is that although the CMP, SBCS sequence leaves the correct values in the N, V, and C flags, the Z flag only depends on bits 63..32 of the result and ignores bits 31..00. The following is a functionally correct solution using straight-line code.
Compare64:
CMP R0,R2
MRS R0,APSR // preserves Z flag from bits 31..00
SBCS R1,R1,R3
// N, V and C are correct at this point, but Z is not. Any 64-bit
// compare that depends on the Z flag (EQ,NE,GT,LE,HI,LS) must
// correct the Z flag before making a decision. 64-bit compares
// that do not depend on the Z flag (GE,LT,HS,LO,MI,PL,VS,VC,AL)
// do not require this correction or the MRS instruction above.
// The following code corrects the Z flag:
MRS R1,APSR // get flags from bits 63..32
ORN R0,R0,#(1 << 30) // isolate Z flag from bits 31..00
AND R0,R0,R1 // combine with Z from bits 63..32
MSR APSR_nzcvq,R0 // Z is now 1 iff bits 63..00 are all zeroes
// Now all flags are correct and any condition code can be used
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR |
|
2023-11-09 20:03:27 |
Dan Lewis |
description |
ARM GCC 13.2.0 compiles 64-bit integer compares such as the following:
int32_t Compare64(int64_t x64, int64_t y64)
{
return (x64 <= y64) ;
}
into a sequence of ARM Thumb instructions similar to:
Compare64:
CMP R0,R2
SBCS R1,R1,R3
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR
However, this produces an incorrect result for x64 = 6 and y64 = 5. The problem is that although the CMP, SBCS sequence leaves the correct values in the N, V, and C flags, the Z flag only depends on bits 63..32 of the result and ignores bits 31..00. The following is a functionally correct solution using straight-line code.
Compare64:
CMP R0,R2
MRS R0,APSR // preserves Z flag from bits 31..00
SBCS R1,R1,R3
// N, V and C are correct at this point, but Z is not. Any 64-bit
// compare that depends on the Z flag (EQ,NE,GT,LE,HI,LS) must
// correct the Z flag before making a decision. 64-bit compares
// that do not depend on the Z flag (GE,LT,HS,LO,MI,PL,VS,VC,AL)
// do not require this correction or the MRS instruction above.
// The following code corrects the Z flag:
MRS R1,APSR // get flags from bits 63..32
ORN R0,R0,#(1 << 30) // isolate Z flag from bits 31..00
AND R0,R0,R1 // combine with Z from bits 63..32
MSR APSR_nzcvq,R0 // Z is now 1 iff bits 63..00 are all zeroes
// Now all flags are correct and any condition code can be used
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR |
ARM GCC 13.2.0 compiles 64-bit integer compares such as the following:
int32_t Compare64(int64_t x64, int64_t y64)
{
return (x64 <= y64) ;
}
into a sequence of ARM Thumb instructions similar to:
Compare64:
CMP R0,R2
SBCS R1,R1,R3
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR
However, this produces an incorrect result for x64 = 6 and y64 = 5. The problem is that although the CMP, SBCS sequence leaves the correct values in the N, V, and C flags, the Z flag only depends on bits 63..32 of the result and ignores bits 31..00. The following is a functionally correct solution using straight-line code.
Compare64:
CMP R0,R2
MRS R0,APSR // preserves Z flag from bits 31..00
SBCS R1,R1,R3
// N, V and C are correct at this point, but Z is not. Any 64-bit
// compare that depends on the Z flag (EQ,NE,GT,LE,HI,LS) must
// correct the Z flag before making a decision. 64-bit compares
// that do not depend on the Z flag (GE,LT,HS,LO,MI,PL,VS,VC,AL)
// do not require this correction or the MRS instruction above.
// The following code corrects the Z flag:
MRS R1,APSR // get flags from bits 63..32
ORN R0,R0,#(1 << 30) // isolate Z flag from bits 31..00
AND R0,R0,R1 // combine with Z from bits 63..32
MSR APSR_nzcvq,R0 // Z is now 1 iff bits 63..00 are all zeroes
// Now all flags are correct and any condition code can be used
ITE LE
MOVLE R0,#1
MOVGT R0,#0
BX LR
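The flag behaviour described above can be checked on a host machine with a small C model. This is only a sketch under stated assumptions: the names Flags, sub_flags, le64_raw, le64_corrected and check_pair are hypothetical helpers written for this report, and the model assumes the usual ARM definition of SBCS as x + ~y + C and of LE as (Z == 1 or N != V). It is not taken from GCC or from any ARM library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four APSR condition flags. */
typedef struct { bool n, z, c, v; } Flags;

/* Models a flag-setting 32-bit subtract with carry-in: a + ~b + carry_in.
   carry_in = true models CMP/SUBS; carry_in = previous C models SBCS. */
static Flags sub_flags(uint32_t a, uint32_t b, bool carry_in)
{
    uint64_t wide = (uint64_t)a + (uint64_t)(uint32_t)~b + (carry_in ? 1u : 0u);
    uint32_t r = (uint32_t)wide;
    Flags f;
    f.n = (r >> 31) != 0;                        /* sign bit of the result  */
    f.z = (r == 0);                              /* this 32-bit result only */
    f.c = (wide >> 32) != 0;                     /* carry out = no borrow   */
    f.v = (((a ^ b) & (a ^ r)) >> 31) != 0;      /* signed overflow         */
    return f;
}

/* LE condition: Z == 1 or N != V. */
static bool cond_le(Flags f) { return f.z || (f.n != f.v); }

/* CMP lo / SBCS hi followed directly by LE, with Z taken from bits 63..32 only. */
static bool le64_raw(int64_t x, int64_t y)
{
    Flags lo = sub_flags((uint32_t)x, (uint32_t)y, true);
    Flags hi = sub_flags((uint32_t)((uint64_t)x >> 32),
                         (uint32_t)((uint64_t)y >> 32), lo.c);
    return cond_le(hi);
}

/* Same sequence, but Z is ANDed with the Z flag from bits 31..00 first. */
static bool le64_corrected(int64_t x, int64_t y)
{
    Flags lo = sub_flags((uint32_t)x, (uint32_t)y, true);
    Flags hi = sub_flags((uint32_t)((uint64_t)x >> 32),
                         (uint32_t)((uint64_t)y >> 32), lo.c);
    hi.z = hi.z && lo.z;
    return cond_le(hi);
}

/* Checks one pair against C's own 64-bit compare. */
static void check_pair(int64_t x, int64_t y)
{
    assert(le64_corrected(x, y) == (x <= y));
}
```

With x64 = 6 and y64 = 5, le64_raw reports "less than or equal" because the upper words subtract to zero, while le64_corrected agrees with the C expression (x64 <= y64).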
Note: If the comparison is simply for equality (EQ) or inequality (NE), a simpler solution is to compute the 64-bit difference and test whether it is zero, something like the following:
...
SUBS R0,R0,R2 // Need to compute the 64-bit difference
SBC R1,R1,R3
ORRS R0,R0,R1 // Z=1 iff bits 63..00 are all zeroes.
ITE EQ
... |
|
2023-11-28 20:14:07 |
Dan Lewis |
gcc-arm-embedded: status |
New |
Invalid |
|
2023-11-28 20:14:07 |
Dan Lewis |
converted to question |
|
708566 |
|