Looking at the benchmark results you posted, I think the sentence "Digging into various flags found that adding ‘-fschedule-insns’, ‘-fschedule-insns2’ are causing improvement in performance." should actually refer to -fno-schedule-insns and -fno-schedule-insns2?
Do the results change if you add -mcpu=cortex-a15 -mtune=cortex-a15? There are a couple of differences between A15 and generic v7-a that might affect instruction scheduling.
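
To make the comparison concrete, here is a rough sketch of the four builds I have in mind: scheduling on and off, each for generic v7-a and for A15. It only prints the command lines and does not run anything. The compiler name, the source file `bench.c`, the `-O2` base level and the use of `-march=armv7-a` as the generic baseline are all my assumptions, so substitute whatever you actually used.

```python
from itertools import product

# Hypothetical names: adjust the compiler, source file and base flags to your setup.
COMPILER = "arm-linux-gnueabihf-gcc"
SOURCE = "bench.c"
BASE_FLAGS = ["-O2"]

# Target selection: assumed generic ARMv7-A baseline vs. explicit Cortex-A15 tuning.
TARGETS = {
    "generic-v7a": ["-march=armv7-a"],
    "cortex-a15": ["-mcpu=cortex-a15", "-mtune=cortex-a15"],
}

# Both scheduling passes explicitly enabled vs. explicitly disabled.
SCHEDULING = {
    "sched": ["-fschedule-insns", "-fschedule-insns2"],
    "nosched": ["-fno-schedule-insns", "-fno-schedule-insns2"],
}


def build_commands():
    """Return {variant_name: command_string} for every target/scheduling pair."""
    commands = {}
    for (target, tflags), (sched, sflags) in product(TARGETS.items(), SCHEDULING.items()):
        name = f"{target}-{sched}"
        args = [COMPILER, *BASE_FLAGS, *tflags, *sflags, SOURCE, "-o", f"bench-{name}"]
        commands[name] = " ".join(args)
    return commands


if __name__ == "__main__":
    for cmd in build_commands().values():
        print(cmd)
```

Running each of the four binaries on the same board would show whether the scheduling effect you saw is specific to the generic v7-a tuning or also appears with the A15 pipeline description.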