Which is better, TVM or TensorRT?

Which is better, TVM or TensorRT? Some people are now pushing TensorRT. What advantages does TVM have over TensorRT?



Of course, TensorRT is better.
The operators implemented by TensorRT are generally faster than TVM's.
TensorRT also supports new GPU architectures better: NVIDIA carefully tunes performance on each new architecture, and they know the details of the low-level hardware.

It might be better to benchmark both options, especially on your own model. Too many factors can affect end-to-end performance.
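For a concrete data point on the TVM side, a minimal end-to-end latency measurement might look like the sketch below (the model path, input name, and shapes are placeholders, and some APIs, e.g. `graph_executor` vs. the older `graph_runtime`, differ across TVM versions):

```python
import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Hypothetical model and input shape; replace with your own.
onnx_model = onnx.load("model.onnx")
shape_dict = {"input": (1, 3, 224, 224)}
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

# Compile for CUDA and load the module onto the GPU.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="cuda", params=params)
dev = tvm.cuda(0)
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input("input", np.random.rand(1, 3, 224, 224).astype("float32"))

# time_evaluator runs the graph repeatedly and reports averaged latency.
timer = module.module.time_evaluator("run", dev, number=10, repeat=3)
print(timer())
```

A comparable TensorRT measurement on the same model (e.g., via trtexec) then gives you the end-to-end comparison.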

The main advantage of TVM, though, is that it offers flexibility in automatic performance optimization via AutoTVM, especially for emerging workloads. For example, as TVM gains support for 3D models, all of its backends, including the NVIDIA one, will benefit from that. See some examples here: https://tvm.apache.org/2019/04/29/opt-cuda-quantized
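As a rough sketch of what that AutoTVM tuning loop can look like (the tuning budget and log-file name are placeholder assumptions, `mod`/`params` are assumed to come from a Relay frontend as above, and API details can differ between TVM releases):

```python
import tvm
from tvm import autotvm, relay

# Extract tunable tasks (e.g., conv2d kernels) from a Relay module.
# `mod` and `params` are assumed to come from a Relay frontend.
tasks = autotvm.task.extract_from_program(mod["main"], target="cuda", params=params)

# Build and run candidate schedules locally to measure their performance.
measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.LocalRunner(number=10),
)

for task in tasks:
    # XGBTuner searches each task's schedule space guided by a cost model.
    tuner = autotvm.tuner.XGBTuner(task)
    tuner.tune(
        n_trial=1000,  # placeholder budget; real tuning often uses more trials
        measure_option=measure_option,
        callbacks=[autotvm.callback.log_to_file("tuning.log")],
    )

# Compile with the best schedules found during tuning applied.
with autotvm.apply_history_best("tuning.log"):
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target="cuda", params=params)
```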


TVM is far more general. And, you know, it's "secure and controllable" (安可, i.e., not locked to a single vendor).

TensorRT and TVM have many similar features. TensorRT does have automatic performance optimization, and it is very easy to use.

Though things may evolve, I think the major difference is that TensorRT is dedicated to NVIDIA platforms. If you are using an NVIDIA GPU or DLA, TensorRT is mostly production-ready. Otherwise, try TVM or another software stack.

There is no "better", I think, unless "better" is defined specifically.


TensorRT == NVIDIA GPUs

TVM == all hardware platforms


I am happy to see that some 3D ops have been added to Relay. But I did not find conv3d_transpose, which is often used for volumetric semantic segmentation. I hope this gap can be filled soon.

Where can we get further information? For example, a comparison between TVM 0.7 (or a subsequent version) and TensorRT 7. We need to decide whether our future GPU performance-optimization work should be based mainly on TVM, mainly on TensorRT, or on a combination of both. Part of our current work is based on TensorRT, and we are considering switching to TVM, so this information is urgently needed. Thank you very much!

You may be interested in the TensorRT-TVM integration, introduced recently in this PR: https://github.com/apache/incubator-tvm/pull/6395. @trevor-m will be able to give more details on the capabilities, but generally speaking it allows parts of a graph to be run using TensorRT, hopefully giving the best of both worlds.

I agree with @mbaret. TVM now provides a nice frontend to TensorRT: subgraphs that TensorRT supports are executed by TensorRT, and the unsupported parts are optimized by TVM. This can deliver even more competitive performance than either one alone.
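For reference, a minimal sketch of that flow, following the Relay TensorRT integration docs (`mod`/`params` are assumed to come from a Relay frontend, and the exact signatures may differ between TVM versions):

```python
import tvm
from tvm import relay
from tvm.relay.op.contrib.tensorrt import partition_for_tensorrt

# Annotate and partition out the subgraphs TensorRT supports;
# everything else stays in regular Relay.
mod, config = partition_for_tensorrt(mod, params)

# Pass the TensorRT options through the PassContext; unsupported
# operators fall back to TVM's own CUDA codegen.
with tvm.transform.PassContext(
    opt_level=3, config={"relay.ext.tensorrt.options": config}
):
    lib = relay.build(mod, target="cuda", params=params)
```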

@mbaret @junrushao Thank you! I have read PR 6395 and the "Relay TensorRT Integration" article in the docs (https://tvm.apache.org/docs/deploy/tensorrt.html). I agree with you: TVM will be powerful with TensorRT. But I need more information to prove the value of TVM to our leadership, since I want to promote the adoption of TVM in our R&D center. I think I need to run some experiments myself.

The latest performance comparison with TensorRT can be found in the auto-scheduler paper: https://www.usenix.org/conference/osdi20/presentation/zheng

Of course, you can start to reproduce it once https://github.com/apache/incubator-tvm/pull/6877 and https://github.com/apache/incubator-tvm/pull/6882 are merged.
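Once they are in, a rough sketch of the auto-scheduler (Ansor) tuning flow looks like this (budgets and file names are placeholders, `mod`/`params` come from a Relay frontend, and API details may vary by version):

```python
import tvm
from tvm import auto_scheduler, relay

target = tvm.target.Target("cuda")

# Extract tuning tasks (and their weights) from the Relay program.
tasks, task_weights = auto_scheduler.extract_tasks(mod["main"], params, target)

# The task scheduler allocates measurement trials across tasks.
tuner = auto_scheduler.TaskScheduler(tasks, task_weights)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=2000,  # placeholder budget
    measure_callbacks=[auto_scheduler.RecordToFile("ansor.log")],
)
tuner.tune(tune_option)

# Compile using the best schedules recorded during the search.
with auto_scheduler.ApplyHistoryBest("ansor.log"):
    with tvm.transform.PassContext(
        opt_level=3, config={"relay.backend.use_auto_scheduler": True}
    ):
        lib = relay.build(mod, target=target, params=params)
```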

Thank you very much! That's very useful for me! From the paper I can infer that TVM 0.6 is comparable with TRT 6, and that TVM 0.7 (or 0.8?) is faster than TRT 6. I'll run some experiments to get a deeper comparison.

To reproduce the performance, you should use a build with those two PRs merged in, not an official TVM release (0.6, 0.7, or similar) for now. Of course, the two PRs will be included in TVM 0.8 (the current dev version).