After auto-tuning, the TVM model is still not as fast as the original libtorch model (CenterNet DLA-34); it is about twice as slow. Why?

The tuning parameters are as follows:

tuning_option = {
    'log_filename': log_file,
    'tuner': 'xgb',
    'n_trial': 500,
    'early_stopping': 200,
    'measure_option': autotvm.measure_option(
        builder=autotvm.LocalBuilder(timeout=10),
        runner=autotvm.LocalRunner(number=20, repeat=3, timeout=4, min_repeat_ms=150),
    ),
}
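
For reference, a sketch of the loop that typically consumes this dictionary, following the pattern of the TVM AutoTVM tutorials (it assumes `tasks` was obtained from autotvm.task.extract_from_program, which is not shown in this thread):

from tvm import autotvm
from tvm.autotvm.tuner import XGBTuner

for i, task in enumerate(reversed(tasks)):
    prefix = '[Task %2d/%2d] ' % (i + 1, len(tasks))
    tuner = XGBTuner(task, loss_type='rank')
    # Never request more trials than the config space actually contains.
    n_trial = min(tuning_option['n_trial'], len(task.config_space))
    tuner.tune(
        n_trial=n_trial,
        early_stopping=tuning_option['early_stopping'],
        measure_option=tuning_option['measure_option'],
        callbacks=[
            autotvm.callback.progress_bar(n_trial, prefix=prefix),
            autotvm.callback.log_to_file(tuning_option['log_filename']),
        ],
    )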

Can you try a larger n_trial, such as 2000? Also increase min_repeat_ms to 1000.
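
Concretely, that change would look like this (a sketch: the early_stopping value and the raised runner timeout are assumptions, added because with min_repeat_ms=1000 each repeat runs for at least about 1 s, which would exceed the original 4 s runner timeout):

tuning_option = {
    'log_filename': log_file,
    'tuner': 'xgb',
    'n_trial': 2000,
    'early_stopping': 600,  # assumption: scaled up along with n_trial
    'measure_option': autotvm.measure_option(
        builder=autotvm.LocalBuilder(timeout=10),
        # min_repeat_ms=1000 forces each of the 3 repeats to run for at least
        # ~1 s, so the runner timeout must sit well above 3 s.
        runner=autotvm.LocalRunner(number=20, repeat=3, timeout=20, min_repeat_ms=1000),
    ),
}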

There is one layer whose GFLOPS never goes up during tuning. Before I modified the timeout it stayed at 0.0 GFLOPS; after setting timeout=100 the maximum is only 23.6 GFLOPS. All the other layers tune normally. This layer is the first convolution layer: the input image is 1x3x608x608 and the convolution kernel is 16x3x7x7. I wonder whether the spatial size is too large to be supported, or whether there is some other cause. I have been at this for three days and have no clue.
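
One way to narrow this down is to extract the conv2d tasks, locate this workload, and tune it in isolation (a sketch using the older AutoTVM task API; mod, params, and target are assumed to come from the Relay model import, which is not shown in this thread):

from tvm import autotvm, relay

tasks = autotvm.task.extract_from_program(
    mod['main'], target=target, params=params, ops=(relay.op.nn.conv2d,)
)

# Print every workload; the problem layer should be the one with the
# (1, 3, 608, 608) input and the (16, 3, 7, 7) kernel.
for i, task in enumerate(tasks):
    print(i, task.workload)

# Hypothetical index 0 for the first conv layer; tune only that task.
tuner = autotvm.tuner.XGBTuner(tasks[0])
tuner.tune(
    n_trial=2000,
    measure_option=tuning_option['measure_option'],
    callbacks=[autotvm.callback.log_to_file('first_conv.log')],
)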

What is the execution time of the best schedule for this conv2d workload?
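
The best measured time can be read straight from the tuning log, for example like this (a sketch; log_file is the 'log_filename' used in tuning_option above):

import numpy as np
from tvm import autotvm

best_cost, best_config = float('inf'), None
for inp, res in autotvm.record.load_from_file(log_file):
    if res.error_no != 0:
        continue  # skip failed or timed-out trials
    # If the log holds several workloads, also filter on inp.task.name / args.
    cost = float(np.mean(res.costs))
    if cost < best_cost:
        best_cost, best_config = cost, inp.config

print('best measured time: %.4f ms' % (best_cost * 1e3))
print(best_config)

The GFLOPS figure AutoTVM prints during tuning is just the task's FLOP count divided by this measured time.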

I changed this layer's 7x7 convolution to 3x3, and it is still very slow. I have given up; it may be that the DLA network's connectivity is too complex, with all kinds of deconvolutions.

I replaced the backbone network with ResNet-50, which is much better: after some tuning, the speed goes up.