When I run the example below with a large batch size, it throws this error:
codegen/llvm/codegen_llvm.cc:692: unknown intrinsic signed_integer_overflow
The overflow occurs during simplification.
Is this expected behavior or do we need to handle overflow here?
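
To make the question concrete, here is a minimal Python sketch of what a signed 32-bit overflow looks like when two constants are multiplied. It is an illustration only: the helper names and the per-sample extent of `2**24` are hypothetical, and it assumes the constants are folded as 32-bit signed integers, which I have not confirmed. None of this is taken from the TVM source or from the tutorial.

```python
# Illustration only: models 32-bit signed arithmetic with plain Python ints.
# All names and values here are hypothetical, not TVM APIs or tutorial shapes.
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def fits_int32(value):
    """Return True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def checked_mul_int32(a, b):
    """Return (exact_product, overflowed) for a 32-bit signed multiply."""
    product = a * b  # Python ints are arbitrary precision, so this is exact
    return product, not fits_int32(product)


def wrap_int32(value):
    """Return the value a two's-complement 32-bit register would hold."""
    return (value + 2 ** 31) % 2 ** 32 - 2 ** 31


if __name__ == "__main__":
    per_sample_extent = 2 ** 24  # hypothetical extent, chosen to hit the limit
    for batch_size in (1, 128):
        product, overflowed = checked_mul_int32(batch_size, per_sample_extent)
        print(batch_size, product, overflowed, wrap_int32(product))
```

With these hypothetical numbers, a batch size of 1 stays in range, while a batch size of 128 produces `2**31`, one past the largest signed 32-bit value.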
How to reproduce:
Set the batch size to 128 at https://github.com/dmlc/tvm/blob/be77cf1963292b018cdf241c595955ab4b3b5f44/tutorials/autotvm/tune_nnvm_cuda.py#L209
and then run the tutorial.
cc @tqchen