How does TVM implement x86 SIMD instructions?


#1

TVM is an optimizing compiler. When the target is x86, does it need to use SIMD instructions to accelerate the generated code? If so, how does that work? If not, how does it optimize instead?


#2

SIMD instructions such as the AVX2/AVX-512 extensions can be targeted using schedule primitives such as vectorize. The GEMM CPU tutorial is a good example: https://docs.tvm.ai/tutorials/optimize/opt_gemm.html#sphx-glr-tutorials-optimize-opt-gemm-py
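
To make the idea concrete, here is a plain-Python sketch of what splitting a loop and vectorizing the inner part means conceptually. It does not use TVM at all; `LANES`, `add_scalar`, and `add_split_vectorized` are names made up for illustration, and the lane count of 8 simply assumes 32-bit floats in a 256-bit AVX2 register.

```python
LANES = 8  # assumption: 8 x float32 fit in one 256-bit AVX2 register


def add_scalar(a, b):
    """Baseline: one element per loop iteration."""
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


def add_split_vectorized(a, b, lanes=LANES):
    """Same computation after a conceptual split + vectorize.

    The loop is split into an outer loop over chunks and an inner loop
    of fixed width `lanes`. The inner loop is written as one slice
    operation to stand in for a single SIMD instruction.
    """
    n = len(a)
    assert n % lanes == 0, "sketch assumes the length is a multiple of lanes"
    out = [0.0] * n
    for outer in range(n // lanes):
        lo = outer * lanes
        hi = lo + lanes
        # One "vector add" over `lanes` contiguous elements.
        out[lo:hi] = [x + y for x, y in zip(a[lo:hi], b[lo:hi])]
    return out
```

In a TVM schedule the same two steps would be a split of the axis by a fixed factor followed by vectorize on the inner axis, which is what the tutorial linked above walks through.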


#3

OK, thank you very much.
I want to find out how vectorize is implemented.
When I looked for the definition of vectorize, I found that it calls the function _api_internal.StageVectorize, but when I open the file _api_internal.py, there are only comments in it.


#4

It's registered on the C++ side: https://github.com/dmlc/tvm/blob/181dbd8e94222b1dad5da4c3f15b8c63facc3582/src/api/api_lang.cc#L390
Vectorization is implemented in https://github.com/dmlc/tvm/blob/master/src/pass/vectorize_loop.cc
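
As a rough picture of what a loop-vectorization pass does, the toy sketch below rewrites a fixed-width loop into a single vector statement by substituting the loop variable with a ramp of lane indices. It is a simplified illustration in plain Python, not the code in vectorize_loop.cc; the helper names `ramp`, `vector_load`, `vector_store`, `scalar_loop`, and `vectorized_stmt` are invented here.

```python
def ramp(base, stride, lanes):
    """Lane indices: base, base + stride, ..., one per lane."""
    return [base + stride * k for k in range(lanes)]


def vector_load(buf, indices):
    """Read one value per lane."""
    return [buf[i] for i in indices]


def vector_store(buf, indices, values):
    """Write one value per lane."""
    for i, v in zip(indices, values):
        buf[i] = v


def scalar_loop(a, b, c, base, lanes):
    # Before the rewrite: an ordinary loop over i in [0, lanes).
    for i in range(lanes):
        c[base + i] = a[base + i] + b[base + i]


def vectorized_stmt(a, b, c, base, lanes):
    # After the rewrite: the loop is gone. The index expression
    # `base + i` has become `ramp(base, 1, lanes)`, and the add acts
    # on all lanes at once.
    idx = ramp(base, 1, lanes)
    va = vector_load(a, idx)
    vb = vector_load(b, idx)
    vector_store(c, idx, [x + y for x, y in zip(va, vb)])
```

Both functions write the same values into `c`; the second form is the shape a code generator can map onto SIMD loads, adds, and stores.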