Segmentation Fault in relay.quantize.quantize with target skylake-avx512/cascadelake

Hi,

I’m using TVM to quantize ResNet-50.

After importing the model from a TensorFlow SavedModel, I simply use

mod = relay.quantize.quantize(mod, params=params) to quantize the model.

I was able to build and run with the target “llvm”, but got a segmentation fault during relay.build with the targets “llvm -mcpu=skylake-avx512” and “llvm -mcpu=cascadelake”.

Any help is greatly appreciated.

Thanks!

You can build TVM with the debug flag to locate the root cause.

I’m testing on a machine with an Intel® Xeon® Platinum 8269 (Cascade Lake) CPU, and I’m trying to evaluate the performance of TVM quantization with VNNI instruction acceleration.

I also tried modifying https://github.com/vinx13/tvm-cuda-int8-benchmark/tree/latest. I changed the target to CPU, without an AutoTVM configuration, to evaluate ResNet-50. It failed in the same way as the model imported from TensorFlow.

How can I debug the relay.build process? I could only find a document about the graph_runtime debugger: https://docs.tvm.ai/dev/debugger.html.

Since your segmentation fault most likely occurs in the intrinsics, I think it’s better to build TVM with CMAKE_BUILD_TYPE=Debug and start your Python script under gdb:

$ gdb --fullname python3
(gdb) run your_script.py
<your segmentation fault takes place here>
(gdb) bt

The backtrace command (bt) will help you locate the error.
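As a lighter-weight complement to gdb, Python's standard faulthandler module can print the Python-level stack at the moment of the crash, which at least shows which Python call (for example relay.build) was active. It does not show native C++ frames, so gdb is still needed for those. The sketch below is only an illustration and assumes a POSIX system: run_with_faulthandler and CRASHING_SNIPPET are hypothetical names, and the snippet sends SIGSEGV to itself as a stand-in for the real crashing build.

```python
import subprocess
import sys


def run_with_faulthandler(script_source):
    """Run Python source in a child process with faulthandler enabled.

    Returns (returncode, stderr). On POSIX, a child killed by a signal
    reports a negative return code equal to minus the signal number.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", script_source],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


# Hypothetical stand-in for the crashing script: the process sends
# SIGSEGV to itself instead of calling relay.build.
CRASHING_SNIPPET = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

if __name__ == "__main__":
    code, err = run_with_faulthandler(CRASHING_SNIPPET)
    print("child exit code:", code)
    print(err)
```

For a real script the same effect comes from running python3 -X faulthandler your_script.py, or from calling faulthandler.enable() at the top of the script.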


You might be interested in looking at PR #4196.

This looks like a Legalize or AlterOpLayout problem. Try opt_level=0 and see if it passes.

I am travelling right now, so I will not be able to reproduce the problem on my end quickly. You can also try 'return None' in Legalize and AlterOpLayout for conv2d and see if it passes. If it does, then the problem is definitely in one of those two passes.
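A rough sketch of how these experiments might be scripted is shown below. Note that it swaps in a different technique for the second suggestion: instead of editing the conv2d Legalize and AlterOpLayout functions to return None, it disables the passes by name through the pass context. This assumes the pass names are "Legalize" and "AlterOpLayout" and that the installed TVM honours disabled_pass for them, which may depend on the version. candidate_configs and build_with_config are hypothetical helper names, and mod, params and target are whatever your own script already produces.

```python
def candidate_configs():
    """Pass-context settings to try, one per run."""
    return [
        {"opt_level": 0},
        {"opt_level": 3, "disabled_pass": ["AlterOpLayout"]},
        {"opt_level": 3, "disabled_pass": ["Legalize"]},
        {"opt_level": 3, "disabled_pass": ["AlterOpLayout", "Legalize"]},
    ]


def build_with_config(mod, params, target, config):
    """Run relay.build under the given pass-context settings."""
    # Imported lazily so candidate_configs() works without TVM installed.
    import tvm
    from tvm import relay

    if hasattr(tvm, "transform") and hasattr(tvm.transform, "PassContext"):
        # Newer TVM releases.
        ctx = tvm.transform.PassContext(**config)
    else:
        # Older releases expose the same settings via relay.build_config.
        ctx = relay.build_config(**config)
    with ctx:
        return relay.build(mod, target=target, params=params)
```

Because a segmentation fault kills the whole interpreter, it is probably easiest to try one configuration per run, for example build_with_config(mod, params, "llvm -mcpu=cascadelake", candidate_configs()[0]), and note which configurations get through the build.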