ternary optimization improves with cast to prvalue
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
GNU Arm Embedded Toolchain | New | Undecided | Unassigned |
Bug Description
As described here:
https:/
The cast converts the operands to prvalues, which improves the generated code. The same code using uint32_t instead of mytype doesn't require the cast. This appears to be a missed optimization.
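To make the value-category point concrete, here is a minimal sketch that checks it at compile time. `mytype_sketch` is a hypothetical stand-in for the report's mytype, and the names `plain_is_lvalue` and `cast_is_prvalue` are introduced only for illustration. It shows what the cast changes at the language level; it says nothing about why GCC generates different code.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the report's mytype; the comparison operator
// is not needed for this check and is omitted.
struct mytype_sketch {
    std::uint32_t value;
};

// Two named objects, so that `a` and `b` are lvalue expressions.
mytype_sketch a{1};
mytype_sketch b{2};

// Without the cast: both operands are lvalues of the same type, so the
// conditional expression is itself an lvalue (decltype yields a reference).
constexpr bool plain_is_lvalue =
    std::is_same<decltype((a.value > b.value) ? a : b),
                 mytype_sketch&>::value;

// With the cast: decltype(a)(a) creates a temporary copy, so both operands
// are prvalues and the conditional expression is a prvalue as well.
constexpr bool cast_is_prvalue =
    std::is_same<decltype((a.value > b.value) ? decltype(a)(a)
                                              : decltype(b)(b)),
                 mytype_sketch>::value;

static_assert(plain_is_lvalue, "uncast ternary should be an lvalue");
static_assert(cast_is_prvalue, "cast ternary should be a prvalue");
```

Both assertions are expected to hold with -std=c++14, since the operand of decltype is never evaluated and only its type and value category matter.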
Here's the original text of the post:
In the code below, when WITH_CAST is defined, the compiled code is significantly better (with identical results in my larger codebase). The cast appears to be superfluous. I am running this in Keil 5.25pre2 (as a simulator only). I used the Keil simulator to measure speed by reading how many microseconds have elapsed on the t1 timer.
Snippet from code:
#if defined (WITH_CAST)
#define MAX(a,b) (((a) > (b)) ? (decltype(a)(a)) : (decltype(b)(b)))
#else
#define MAX(a,b) (((a) > (b)) ? ((a)) : ((b)))
#endif
GNU Arm Embedded Toolchain, version 7-2017-q4-major.
Compiler options:
-c -mcpu=cortex-m4 -mthumb -gdwarf-2 -MD -Wall -O -mapcs-frame -mthumb-interwork -std=c++14 -Ofast -I./RTE/_Target_1 -IC:/Keil_
Assembler options:
-mcpu=cortex-m4 -mthumb --gdwarf-2 -mthumb-interwork --MD *.d -I./RTE/_Target_1 -IC:/Keil_
Linker options:
-T ./RTE/Device/
-o Optimization.elf
*.o -lm
#include <cstdlib>
#include <cstring>
#include <cstdint>
#define WITH_CAST
struct mytype {
    uint32_t value;
    __attribute_
        return t.value > a.value;
    }
};
static mytype output_buf [32];
static mytype * output_memory_ptr = output_buf;
static mytype * volatile * output_memory_tmpp = &output_memory_ptr;
static mytype input_buf [32];
static mytype * input_memory_ptr = input_buf;
static mytype * volatile * input_memory_tmpp = &input_memory_ptr;
#if defined (WITH_CAST)
#define MAX(a,b) (((a) > (b)) ? (decltype(a)(a)) : (decltype(b)(b)))
#else
#define MAX(a,b) (((a) > (b)) ? ((a)) : ((b)))
#endif
int main (void) {
    const mytype * input = *input_memory_tmpp;
    mytype * output = *output_
    mytype p = input[0];
    mytype c = input[1];
    mytype pc = MAX(p, c);
    output[0] = pc;
    for (int i = 1; i < 31; i ++) {
        mytype n = input[i + 1];
        mytype cn = MAX(c, n);
        output[i] = MAX(pc, cn);
        p = c;
        c = n;
        pc = cn;
    }
    output[31] = pc;
}
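
For anyone reproducing this, it may help to first confirm on a host machine that the two macro variants compute the same output, so that any difference seen on the target is purely a code-generation matter. The sketch below is one way to do that under stated assumptions. `mytype_sketch`, `MAX_PLAIN`, `MAX_CAST`, `window_max_plain`, `window_max_cast` and `variants_agree` are hypothetical names. The `operator>` signature is assumed because the declaration line in the report is truncated. The input pattern is arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the report's mytype.
struct mytype_sketch {
    std::uint32_t value;
    // Assumed signature: the report's declaration line is truncated.
    friend bool operator>(const mytype_sketch& t, const mytype_sketch& a) {
        return t.value > a.value;
    }
};

// The two macro variants from the report, under distinct names.
#define MAX_PLAIN(a, b) (((a) > (b)) ? ((a)) : ((b)))
#define MAX_CAST(a, b) (((a) > (b)) ? (decltype(a)(a)) : (decltype(b)(b)))

// Same loop as the report's main(), using the uncast macro.
void window_max_plain(const mytype_sketch* input, mytype_sketch* output) {
    assert(input != nullptr && output != nullptr);
    mytype_sketch p = input[0];
    mytype_sketch c = input[1];
    mytype_sketch pc = MAX_PLAIN(p, c);
    output[0] = pc;
    for (int i = 1; i < 31; i++) {
        mytype_sketch n = input[i + 1];
        mytype_sketch cn = MAX_PLAIN(c, n);
        output[i] = MAX_PLAIN(pc, cn);
        p = c;
        c = n;
        pc = cn;
    }
    output[31] = pc;
}

// Same loop again, using the cast macro.
void window_max_cast(const mytype_sketch* input, mytype_sketch* output) {
    assert(input != nullptr && output != nullptr);
    mytype_sketch p = input[0];
    mytype_sketch c = input[1];
    mytype_sketch pc = MAX_CAST(p, c);
    output[0] = pc;
    for (int i = 1; i < 31; i++) {
        mytype_sketch n = input[i + 1];
        mytype_sketch cn = MAX_CAST(c, n);
        output[i] = MAX_CAST(pc, cn);
        p = c;
        c = n;
        pc = cn;
    }
    output[31] = pc;
}

// Runs both variants on an arbitrary input pattern and compares outputs.
bool variants_agree() {
    mytype_sketch input[32];
    for (std::uint32_t i = 0; i < 32; ++i) {
        input[i].value = (i * 37u) % 101u;  // arbitrary test pattern
    }
    mytype_sketch out_plain[32] = {};
    mytype_sketch out_cast[32] = {};
    window_max_plain(input, out_plain);
    window_max_cast(input, out_cast);
    for (int i = 0; i < 32; ++i) {
        if (out_plain[i].value != out_cast[i].value) {
            return false;
        }
    }
    return true;
}
```

This only checks functional equivalence. It does not measure timing and does not reproduce the Cortex-M4 code generation described in the report.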