Compiling Tensorflow model is slow

I am working on an NVIDIA Drive PX2 with CUDA 9.0 and TensorFlow 1.9.0.

To compile TensorFlow models I run the from_tensorflow.py script (see https://docs.tvm.ai/tutorials/nnvm/from_tensorflow.html).

Target settings:
target = 'cuda'
target_host = 'llvm -target=aarch64-linux-gnu'
layout = "NCHW"
ctx = tvm.gpu(0)

I get a few warnings like this:
WARNING:autotvm:Cannot find config for target=cuda, workload=('conv2d', (1, 448, 10, 10, 'float32'), (384, 448, 3, 3, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
I am not sure how they affect the performance.

The script prints the top-5 predictions for both TVM and TensorFlow. The predictions look correct, but inference is slow.
To benchmark, I measured how long TVM and TensorFlow each take to classify a single image.
TVM takes about four times as long as TensorFlow.
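For context, a timing comparison like this can be set up roughly as in the sketch below. It is only a minimal harness, not the script actually used here: `time_inference` and `dummy_inference` are hypothetical names, and the stand-in workload has to be replaced by a callable that runs one full inference and copies the result back to the host, since GPU calls may return before the work has finished.

```python
import time


def time_inference(run_once, warmup=3, repeat=10):
    """Return the mean wall-clock time in seconds of one run_once() call."""
    # Warm-up calls absorb one-time costs such as lazy initialization.
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(repeat):
        run_once()
    return (time.perf_counter() - start) / repeat


# Stand-in workload (an assumption for illustration); replace it with a
# callable that runs one inference and fetches the output.
def dummy_inference():
    return sum(i * i for i in range(10000))


mean_seconds = time_inference(dummy_inference)
print("mean time per run: %.6f s" % mean_seconds)
```

Using the same warm-up and repeat counts for both frameworks keeps the two numbers comparable.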

How can that be? Is it because of the warnings I get?

Yes, typically if you have a custom model we try to do autotuning to find the best implementation for each layer in your model. If no autotuning is done, we rely on a fallback implementation which may be far from ideal for your given workload.

If you want to try tuning, see this example.
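Once tuning has produced a log file, compiling with it looks roughly like the sketch below. This assumes the NNVM-era API used in this thread (`autotvm.apply_history_best`, `nnvm.compiler.build`); the helper name `build_with_tuning_log`, the log file name, and the input name "data" are placeholders, not values from your setup.

```python
try:
    import nnvm.compiler
    from tvm import autotvm
except ImportError:
    # TVM/NNVM is not installed here; the function below is then only
    # illustrative and raises if called.
    nnvm = autotvm = None


def build_with_tuning_log(net, params, target, input_shape,
                          log_file="tuning.log", input_name="data",
                          dtype="float32"):
    """Compile net, using tuned configs from log_file where they exist."""
    if autotvm is None:
        raise RuntimeError("TVM with NNVM is required to build the model")
    # Workloads found in the log use their best recorded config; anything
    # missing still falls back to the default configuration.
    with autotvm.apply_history_best(log_file):
        with nnvm.compiler.build_config(opt_level=3):
            return nnvm.compiler.build(
                net, target=target, shape={input_name: input_shape},
                params=params, dtype=dtype)
```

If the "Cannot find config" warnings still appear after building this way, the corresponding workloads were probably not covered by the tuning log.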


Ok, thank you.

I want to do tuning for the Tensorflow model used in the “Compile Tensorflow Models” tutorial. (see https://github.com/dmlc/web-data/tree/master/tensorflow/models/InceptionV1)

However, I am not sure how to set the parameters for the following function in “tune_nnvm_cuda.py”:

tasks = autotvm.task.extract_from_graph(net, target=target,
                                        shape={'data': input_shape}, dtype=dtype,
                                        symbols=(nnvm.sym.conv2d,))

  • “net” is the graph to tune that I get when importing the graph to NNVM
  • “target” is the setting of my target

But I am not sure how to set “shape”, “dtype” and “symbols”. I think “shape” and “dtype” depend on the model I use.
Can anyone help me on how to set these parameters?

Maybe this post is related.

Try just setting shape to a dictionary that maps the string "data" to the input shape of your model (just a tuple should be fine). Typically you can set dtype to just "float32" to be generic. Finally symbols is usually set to either (nnvm.sym.conv2d,) or (nnvm.sym.conv2d, nnvm.sym.conv2d_transpose) depending on whether your model contains conv2d transpose operators in addition to 2d convolution. This last parameter just controls which types of operators we want to tune.
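Put together, the three arguments could be assembled as in the sketch below. The helper names `make_extract_args` and `extract_tasks` are made up for illustration, and the input name "data" and shape (1, 3, 224, 224) are assumptions; the key has to match the input name your imported graph actually expects, which may differ for a TensorFlow model.

```python
def make_extract_args(input_name, input_shape, dtype="float32",
                      tune_transpose=False):
    """Collect the shape, dtype and operator names described above."""
    op_names = ["conv2d"]
    if tune_transpose:
        # Only needed when the model also contains conv2d transpose layers.
        op_names.append("conv2d_transpose")
    return {
        "shape": {input_name: tuple(input_shape)},
        "dtype": dtype,
        "op_names": tuple(op_names),
    }


def extract_tasks(net, target, args):
    """Call extract_from_graph; requires a TVM build that still ships NNVM."""
    import nnvm
    from tvm import autotvm
    # Turn the operator names into the nnvm.sym objects the API expects.
    symbols = tuple(getattr(nnvm.sym, name) for name in args["op_names"])
    return autotvm.task.extract_from_graph(
        net, target=target, shape=args["shape"], dtype=args["dtype"],
        symbols=symbols)


# Hypothetical input name and shape; use the ones your graph expects.
args = make_extract_args("data", (1, 3, 224, 224))
print(args)
```

With your graph and target in hand, `tasks = extract_tasks(net, target, args)` would then give the list of tuning tasks.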