(From update of attachment 376695)
Oh, perfect -- I actually forgot about this patch. The previous BL impl for a long offset was horrible for perf; note that nothing uses BL() any more, so can you just nuke it?
I'm not sure if the BLX impl with a 24-bit offset is correct though; BLX with an immediate offset will always exchange instruction modes, as best I can tell. I think we need to check whether we're branching to Thumb or ARM code, and emit BLX or BL as appropriate, right? Also, for the case where we need to do a LD32, use underrunProtect(4+LD32_size) instead of hardcoding 12.
Here's the code that I had:
    intptr_t offs = PC_OFFSET_FROM((addr&~3),_nIns-1);
    if (isS24(offs>>2)) {
        // we already did an underrunProtect(4) above
        if (AvmCore::config.thumb &&
            ((int32_t)addr) & 0x1 == 1)
        {
            // we need to branch to thumb, so emit BLX here
            int32_t h = (((int32_t) addr) & 0x2) >> 1;
            // BLX addr (via offs & H)
            *(--_nIns) = (NIns)( (0xFA | h)<<24 | ((offs>>2) & 0xFFFFFF) );
            asm_output("blx %p", addr);
        } else {
            // just a normal BL
            // BL addr (via offs)
            *(--_nIns) = (NIns)( COND_AL | (0xB<<24) | ((offs>>2) & 0xFFFFFF) );
            asm_output("bl %p", addr);
        }
    } else {
        if (AvmCore::config.thumb) {
            underrunProtect(4+LD32_size);
            // BLX IP
            *(--_nIns) = (NIns)( COND_AL | (0x12FFF3<<4) | IP );
            asm_output("blx ip");
        } else {
            underrunProtect(8+LD32_size);
            MOV(PC, IP);
            MOV(LR, PC);
        }
    }
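
To sanity-check the BL vs. BLX selection outside the assembler, here is a minimal standalone sketch of the short-offset case only. It is not nanojit code: encodeCall is a hypothetical helper, and the local isS24 and COND_AL are stand-ins I defined for illustration rather than the tree's definitions. It assumes the call is emitted from ARM state at a 4-byte-aligned address, that a Thumb target is marked by bit 0 of the address, and that PC reads as the instruction address plus 8.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the "always" condition field (bits 31..28 = 0b1110).
// Assumption: defined here for illustration, not taken from nanojit.
constexpr uint32_t COND_AL = 0xEu << 28;

// True if v fits in a signed 24-bit immediate field.
static bool isS24(int32_t v) {
    return v >= -(1 << 23) && v < (1 << 23);
}

// Hypothetical helper: encode a call from the ARM instruction at insAddr
// to target. Returns 0 if the target is out of range for a 24-bit offset,
// in which case the caller would need the load-and-branch-via-IP path.
static uint32_t encodeCall(uint32_t insAddr, uint32_t target) {
    assert((insAddr & 3u) == 0);  // ARM instructions are word aligned

    // In ARM state PC reads as the current instruction address + 8.
    // Both operands are word aligned, so the byte offset divides by 4 exactly.
    int32_t offs = static_cast<int32_t>((target & ~3u) - (insAddr + 8u));
    int32_t words = offs / 4;
    if (!isS24(words))
        return 0;
    uint32_t imm24 = static_cast<uint32_t>(words) & 0xFFFFFFu;

    if (target & 1u) {
        // Thumb target: BLX <imm> always switches instruction set.
        // H (bit 24) carries bit 1 of the target address.
        uint32_t h = (target & 2u) >> 1;
        return ((0xFAu | h) << 24) | imm24;
    }
    // ARM target: plain BL, no mode change.
    return COND_AL | (0xBu << 24) | imm24;
}
```

With insAddr = 0x1000, a target of 0x2000 encodes as 0xEB0003FE (BL), while 0x2001 encodes as 0xFA0003FE (BLX) and 0x2003 as 0xFB0003FE (BLX with H set).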