Converted TensorFlow model not in good shape. Help?

Hi,

I converted my first model to TVM. However, I have some issues and questions.

  1. I still don't get target and target_host; even after looking at the docs I don't understand their full meaning.
    https://docs.tvm.ai/api/python/build.html

  2. My model has dynamic inputs, but I had to make them fixed because TVM was complaining. How can I build graphs with dynamic shapes in TVM?

tf.placeholder(tf.float32, [None, None, 3], name='input')

  3. After giving the input a fixed shape I was able to convert the model using graph.as_graph_def(add_shapes=True) (I had to add the shapes), but I get some warnings:

cannot evaluate set type Cast
cannot find config for target llvm, workload=('conv2d'

Further investigation led me to the conclusion that I have to tune it myself. Is that right?

The model runs in 35 s, which is 3.6x slower than TensorFlow.

Here is my code:

import time

import nnvm
import nnvm.compiler
import nnvm.testing.tf
import tvm
from scipy.misc import imsave  # assuming imsave comes from scipy.misc; imageio.imwrite works too

x0                  = img  # img: the preprocessed input image (numpy float array), loaded earlier

target              = 'llvm -mcpu=core-avx2'
target_host         = 'llvm'
layout              = None
ctx                 = tvm.cpu(0)

## nnvm graph (graph is the TensorFlow tf.Graph loaded earlier)
graph_def           = nnvm.testing.tf.ProcessGraphDefParam( graph.as_graph_def(add_shapes=True) )
sym, params         = nnvm.frontend.from_tensorflow(graph_def, layout=layout)

shape_dict          = {'input': x0.shape}
dtype_dict          = {'input': 'float32'}
graph, lib, params  = nnvm.compiler.build(sym, shape=shape_dict, target=target, target_host=target_host, dtype=dtype_dict, params=params)

from tvm.contrib import graph_runtime
m           = graph_runtime.create(graph, lib, ctx)

ts          = time.time()
m.set_input('input', tvm.nd.array( x0.astype('float32') ) )
m.set_input(**params)
m.run()
tvm_output  = m.get_output(0).asnumpy()
print('tvm predict took:' + str(time.time() - ts) )
    
imsave('test_tvm.jpg', tvm_output)

1: target_host is generally llvm or stackvm; it compiles the host-side code that constructs the params and interacts with the driver (OpenCL, CUDA, etc.) before calling the device-specific code. target, on the other hand, can be anything from the complete list, including llvm and stackvm. Generally, when target is not llvm or stackvm, we need to specify target_host.
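For example (a minimal sketch reusing sym, shape_dict, dtype_dict and params from the code above), when targeting a GPU the host side is still compiled with llvm:

import nnvm.compiler

# device code goes to CUDA; host code (param setup, driver calls) to llvm
graph, lib, params = nnvm.compiler.build(
    sym, shape=shape_dict, dtype=dtype_dict,
    target='cuda', target_host='llvm', params=params)

# for a pure CPU build, target='llvm' alone is enough
graph, lib, params = nnvm.compiler.build(
    sym, shape=shape_dict, dtype=dtype_dict,
    target='llvm', params=params)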

2: Using Relay (NNVM v2) we should be able to compile models with dynamic shapes. The TensorFlow frontend for Relay is yet to be merged, and dynamic shape support is untested for now. We may need to wait a while here.

3: Yes, tuning with AutoTVM further improves the performance.
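Roughly (a sketch, assuming the tuning results were logged to a file such as tune.log via autotvm.callback.log_to_file), the tuned configs are then applied at build time:

from tvm import autotvm
import nnvm.compiler

# pick the best config found for each workload and build with it
with autotvm.apply_history_best('tune.log'):
    graph, lib, params = nnvm.compiler.build(
        sym, shape=shape_dict, dtype=dtype_dict,
        target=target, target_host=target_host, params=params)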

@srkreddy1238 thanks for the reply!

I have tried to change the conv2d following the docs; however, my net is NHWC, which is pretty common in TF.

I get an error that says:
assert layout == 'NCHW', only support NCHW currently

So… game over for me, no? Have I already done all the possible optimisations?

If I can't do any further optimisation, the model is 3.6x slower than TensorFlow without even using AVX2. Can that happen?

Here is the code:

import nnvm
import nnvm.compiler
import nnvm.testing.tf
import tvm
from tvm import autotvm
from tvm.autotvm.tuner import XGBTuner, GATuner, RandomTuner, GridSearchTuner
from tvm.contrib import util

def tune_kernels(tasks):

    for i, tsk in enumerate(tasks):
        prefix = "[Task %2d/%2d] " % (i+1, len(tasks))

        # convert conv2d tasks to the x86 conv2d_NCHWc tuning templates
        # (template names as in the x86 tuning tutorial)
        op_name = tsk.workload[0]
        if op_name == 'conv2d':
            func_create = 'topi_x86_conv2d_NCHWc'
        elif op_name == 'depthwise_conv2d_nchw':
            func_create = 'topi_x86_depthwise_conv2d_NCHWc_from_nchw'
        else:
            raise ValueError("Tuning {} is not supported on x86".format(op_name))

        task = autotvm.task.create(func_create, args=tsk.args,
                                   target=target, template_key='direct')
        task.workload = tsk.workload

        tuner_obj = GridSearchTuner(task)
        # do tuning; measure_option is required so tune() can build and time candidates
        n_trial = len(task.config_space)
        tuner_obj.tune(n_trial=n_trial,
                       measure_option=autotvm.measure_option(
                           builder=autotvm.LocalBuilder(),
                           runner=autotvm.LocalRunner(number=10)),
                       callbacks=[autotvm.callback.log_to_file('tune.log')])

x0                  = img

target              = 'llvm -mcpu=core-avx2'
target_host         = 'llvm'
layout              = None
ctx                 = tvm.cpu(0)

## nnvm graph
graph_def           = nnvm.testing.tf.ProcessGraphDefParam( graph.as_graph_def(add_shapes=True) )
sym, params         = nnvm.frontend.from_tensorflow(graph_def, layout=layout)
tasks               = autotvm.task.extract_from_graph(sym, target=target,
                                                      shape={'input': x0.shape}, dtype='float32',
                                                      symbols=(nnvm.sym.conv2d,))
tune_kernels(tasks)

Try passing layout='NCHW' to nnvm.frontend.from_tensorflow.

This will make all convolutions use the NCHW layout. I added this option to make TF models run on CUDA, but it can be used with llvm as well. I am not sure whether the inserted transposes will neutralise the tuning effect, but it is worth a try.
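For example, it is a one-line change to the conversion step in the code above:

sym, params = nnvm.frontend.from_tensorflow(graph_def, layout='NCHW')

The frontend then inserts the required transposes so all conv2d ops end up in NCHW, which is what the NCHW-only assert above expects.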

Hi, did you fix your problem? I am hitting the same problem and tried AutoTVM, but it does not work for me either. If you got AutoTVM working, could you share your code? Thanks a lot.