I have observed a precision mismatch between TensorFlow CPU output and TVM output for batchnorm ops with certain input distributions (the variance seems to matter significantly).
The TensorFlow compile options that look relevant are -march=native and -msse3.
Any advice here? How do we enable these for our LLVM target?
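To make the kind of mismatch concrete, here is a minimal sketch of how two algebraically equivalent float32 batchnorm formulations can drift apart as the variance shrinks. It is only an illustration under stated assumptions: the helper names (f32, batchnorm_direct, batchnorm_folded, compare) are hypothetical, the input values are made up, and I am not claiming that either formulation is what TensorFlow or TVM actually executes.

```python
import math
import struct


def f32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def batchnorm_ref(x, mean, var, gamma, beta, eps):
    """Float64 reference: gamma * (x - mean) / sqrt(var + eps) + beta."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta


def batchnorm_direct(x, mean, var, gamma, beta, eps):
    """Float32 evaluation in the textbook order, rounding after every step."""
    inv_std = f32(1.0 / f32(math.sqrt(f32(var + eps))))
    centered = f32(x - mean)
    return f32(f32(f32(centered * inv_std) * gamma) + beta)


def batchnorm_folded(x, mean, var, gamma, beta, eps):
    """Float32 evaluation with batchnorm folded into y = x * scale + shift."""
    scale = f32(gamma * f32(1.0 / f32(math.sqrt(f32(var + eps)))))
    shift = f32(beta - f32(mean * scale))
    return f32(f32(x * scale) + shift)


def compare(x, mean, var, gamma=1.0, beta=0.0, eps=1e-3):
    """Return (direct_error, folded_error) against the float64 reference."""
    # All inputs are first rounded to float32 so the three versions see the same data.
    args = tuple(f32(v) for v in (x, mean, var, gamma, beta, eps))
    ref = batchnorm_ref(*args)
    return (abs(batchnorm_direct(*args) - ref),
            abs(batchnorm_folded(*args) - ref))


if __name__ == "__main__":
    # Hypothetical inputs: a large mean with progressively smaller variance.
    for var in (1.0, 1e-2, 1e-4):
        print(var, compare(100.5, 100.0, var))
```

Running it prints the absolute error of each formulation for a few variances, which gives a rough idea of what tolerance is realistic when comparing the two frameworks, independent of the compile flags.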
Thanks, Siva