I see. llc seems to take different command-line options from clang, and it has an interface that appears to let users add and remove options.
I can see that from the help output.
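To make the add/remove idea concrete, here is a minimal sketch of how a comma-separated feature string with "+" and "-" prefixes could be split up. This is only an illustration: parse_mattr is a hypothetical helper, not anything from LLVM, the feature names are simply the ones used in this thread, and how llc itself treats a name without a prefix is not something this sketch claims to know.

```python
def parse_mattr(value):
    """Split an -mattr style string into (enabled, disabled, unprefixed) names.

    Hypothetical helper for illustration only; it does not validate the
    names against any real backend.
    """
    enabled, disabled, unprefixed = [], [], []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty entries such as a trailing comma
        if item.startswith("+"):
            enabled.append(item[1:])
        elif item.startswith("-"):
            disabled.append(item[1:])
        else:
            # No prefix: kept separately, since the tool's behaviour here
            # is an open question in this sketch.
            unprefixed.append(item)
    return enabled, disabled, unprefixed


if __name__ == "__main__":
    print(parse_mattr("+neon,vfpv4"))
```

Note that in the string as written in this thread, only neon carries an explicit "+"; vfpv4 has no prefix at all.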
1. vfpv4 implies adding the VFPv4 instructions, which to me means the scalar FMA instructions.
2. neon implies adding the Neon instructions, which to me means the original Advanced SIMD / Neon instructions as per the original Armv7-A instruction set.
3. neon-vfpv4 implies adding the Neon instructions that arrived with the Neon unit introduced alongside VFPv4, which for all practical purposes means the vectorized FMA instructions.
Thus, given my knowledge of the ISA, -mattr=+neon,vfpv4 is confusing because it's not obvious whether it means 1+2 above (I'm not aware of any actual implementation like this) or #3. Trying it out with clang suggests it is actually #3.
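The ambiguity can be spelled out with a small sketch. The mapping below just encodes my reading of the three options in the list above as sets of labels; the labels and the READING table are assumptions made for illustration, not data taken from LLVM or from the architecture manual.

```python
# Assumed encoding of the three readings described in the list above.
READING = {
    "vfpv4": {"scalar FMA"},
    "neon": {"original Advanced SIMD"},
    "neon-vfpv4": {"original Advanced SIMD", "vector FMA"},
}


def capabilities(names):
    """Union of the instruction groups implied by each named option."""
    result = set()
    for name in names:
        result |= READING[name]
    return result


one_plus_two = capabilities(["vfpv4", "neon"])  # interpretation "1+2"
three = capabilities(["neon-vfpv4"])            # interpretation "#3"

if __name__ == "__main__":
    # Under this reading, only #3 brings in the vectorized FMA instructions.
    print("vector FMA" in one_plus_two, "vector FMA" in three)
```

Under these assumptions the two interpretations differ exactly in whether vectorized FMA is available, which is why the combined spelling is hard to read.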
Further, combining this with -mfloat-abi=soft is even more confusing, because in other places -mfloat-abi=soft means using software floating-point emulation, which essentially means don't emit any actual FP or SIMD instructions.
If you are targeting the Cortex-A53, or really any Armv8-A CPU that supports AArch32 mode, I suspect what you need is -march=armv8-a -mattr=+neon,fp-armv8,thumb or some such, as you'd then get the additional rounding instructions that came in with Armv8.
regards
Ramana