How to use AutoTVM with manually created TOPI computations?

Hi all,

I’m confused about how to use AutoTVM on manually created TOPI computations, such as:

import tvm
import topi
from tvm import autotvm

data = tvm.placeholder((128, 64, 224, 224))
kernel = tvm.placeholder((32, 64, 5, 5))

conv = topi.nn.conv2d(data, kernel, strides=1, padding=2, dilation=1)
out = topi.nn.relu(conv)

task = autotvm.task.create(topi.generic.nn.schedule_conv2d_nchw,
                           args=(out,),
                           target='cuda')
print(task.config_space)

measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.LocalRunner(repeat=3, min_repeat_ms=100, timeout=4)
)

tuner = autotvm.tuner.XGBTuner(task)
tuner.tune(n_trial=20,
           measure_option=measure_option,
           callbacks=[autotvm.callback.log_to_file('conv2d.log')])

The code throws this error:

Traceback (most recent call last):

  File "test.py", line 34, in <module>
    target='cuda')

  File "/opt/conda/lib/python3.6/site-packages/tvm-0.6.dev0-py3.6-linux-x86_64.egg/tvm/autotvm/task/task.py", line 191, in create
    sch, _ = func(*args)

  File "</opt/conda/lib/python3.6/site-packages/decorator.py:decorator-gen-73>", line 2, in schedule_conv2d_nchw

  File "/opt/conda/lib/python3.6/site-packages/tvm-0.6.dev0-py3.6-linux-x86_64.egg/tvm/target.py", line 382, in dispatch_func
    return dispatch_dict[k](*args, **kwargs)

  File "</opt/conda/lib/python3.6/site-packages/decorator.py:decorator-gen-175>", line 2, in config_dispatcher

  File "/opt/conda/lib/python3.6/site-packages/tvm-0.6.dev0-py3.6-linux-x86_64.egg/tvm/autotvm/task/dispatcher.py", line 211, in dispatch_func
    workload = func(*args, **kwargs)

  File "/opt/conda/lib/python3.6/site-packages/tvm-0.6.dev0-py3.6-linux-x86_64.egg/tvm/autotvm/task/topi_integration.py", line 462, in config_dispatcher
    raise RuntimeError("Cannot find workload in attribute of this schedule")

RuntimeError: Cannot find workload in attribute of this schedule

Any idea on this? Your help is appreciated!


I have the same question: how do I use AutoTVM with TOPI operations and schedules? Here’s the code I want to optimize:

    from tvm import te
    import topi
    bsz = 24
    d1 = 16384
    d2 = 512
    d3 = 64
    A = te.placeholder((bsz, d1, d3), name='A', dtype='float32')
    B = te.placeholder((bsz, d2, d3), name='B', dtype='float32')
    R = topi.nn.batch_matmul(A, B)
    s = topi.cuda.batch_matmul.schedule_batch_matmul(R)

Thanks

First, let’s clarify the terminology: this question is not about “manually created TOPI computations”, but about how to use AutoTVM to tune “your workload”. A workload is a particular set of input shapes and attributes for a particular TOPI compute function.

Based on that, we have the following statement:

task = autotvm.task.create("task name", args=(...), target='cuda')
  • “task name” is the registered AutoTVM task name. You can usually find it in the decorator of the TOPI schedule function (example). Make sure you import topi (even if you don’t use anything from it directly) so that this registration happens.
  • args: The arguments of the task. These are basically just the arguments to that particular schedule function.

Please note that AutoTVM is not tuning a Relay/TVM program you wrote. Instead, you only provide workloads to AutoTVM, and it creates a program by itself for tuning. The tuning result is a mapping like (task, args, target, config) -> latency. As a result, when you build your program with the tuning logs, the TVM compile engine applies the tuning result to the schedule function.
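To make the (task, args, target, config) -> latency idea concrete, here is a TVM-free sketch that picks the fastest record out of a JSON-lines tuning log. The key names ("input", "config", "result") and the result layout only loosely mirror the AutoTVM log format and should be treated as assumptions; check tvm.autotvm.record for the authoritative readers.

```python
import json

# Two hypothetical log lines: each record pairs (target, task name, args)
# plus a chosen config with measured costs. Key names are assumptions
# made for illustration, not the exact AutoTVM schema.
log_lines = [
    '{"input": ["cuda", "conv2d_nchw.cuda", [], {}],'
    ' "config": {"index": 7}, "result": [[0.0021], 0, 1.2, 1600000000]}',
    '{"input": ["cuda", "conv2d_nchw.cuda", [], {}],'
    ' "config": {"index": 42}, "result": [[0.0013], 0, 1.1, 1600000001]}',
]

def best_record(lines):
    """Return the record with the lowest mean measured cost."""
    records = [json.loads(line) for line in lines]
    # result = [costs, error_no, all_cost, timestamp]; error_no == 0 means OK
    ok = [r for r in records if r["result"][1] == 0]
    return min(ok, key=lambda r: sum(r["result"][0]) / len(r["result"][0]))

best = best_record(log_lines)
print(best["config"]["index"])  # the fastest config's index: 42
```

At build time, this lookup is exactly what apply_history_best does for you, keyed by the workload.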

In summary, the entire tuning process for a network is like follows:

  1. Create an AutoTVM task with the workload you want.
  2. Tune the task with AutoTVM and get a tuning log.
  3. Create a TVM program as you always do.
  4. Wrap your build in with apply_history_best(log_file).

You can refer to this tutorial for the complete process. The difference between that tutorial and the flow above is that the tutorial extracts workloads from a Relay program: extract_from_program is an API that walks a Relay program and collects the tunable workloads in it, so steps 1 and 2 can be simplified.
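The four numbered steps can be mimicked with a TVM-free toy, which may make the division of labor clearer: tuning produces a log of (config, cost) records, and the build step later picks the best one. Everything here (the fake cost model, the file name, the record schema) is invented for illustration.

```python
import json

# A toy version of the four steps: a fake "task" whose config space is a
# list of tile sizes, a fake measurement, a log file on disk, and a final
# "build" step that applies the best logged config.
def measure(tile):
    # stand-in for a real on-device measurement; pretend tile=13 is optimal
    return abs(tile - 13) + 1.0

config_space = [4, 8, 13, 16, 32]

# steps 1-2: "tune" the task, appending one record per trial to a log
with open("toy_tuning.log", "w") as f:
    for tile in config_space:
        record = {"config": {"tile": tile}, "cost": measure(tile)}
        f.write(json.dumps(record) + "\n")

# steps 3-4: at build time, look up the best config from the log
with open("toy_tuning.log") as f:
    records = [json.loads(line) for line in f]
best = min(records, key=lambda r: r["cost"])
print(best["config"])  # {'tile': 13}
```

In real AutoTVM, the tuner (e.g. XGBTuner) explores the config space far more cleverly than this exhaustive loop, but the log-then-apply shape of the workflow is the same.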


Hi, I tried the following code:

# logging config (for printing tuning log to screen)
logging.getLogger('autotvm').setLevel(logging.DEBUG)
logging.getLogger('autotvm').addHandler(logging.StreamHandler(sys.stdout))

# the last layer in resnet
N, H, W, CO, CI, KH, KW, strides, padding = 1, 7, 7, 512, 512, 3, 3, (1, 1), (1, 1)
data = te.placeholder((N, CI, H, W), name='data')
kernel = te.placeholder((CO, CI, KH, KW), name='kernel')
conv = topi.nn.conv2d_nchw(data, kernel, strides, padding, dilation=1, out_dtype='float32')
cfg = autotvm.get_config()
task = autotvm.task.create("conv2d_nchw.cuda",
                           args=(cfg, [conv]),
                           target='cuda')
print(task.config_space)

But the output is as follows:

Cannot find config for target=None, workload=None. A fallback configuration is used, which may bring great performance regression.
Traceback (most recent call last):

  File "tune_conv2d_cuda_builtin_tmp.py", line 194, in <module>
    target='cuda')

  File "/usr/tvm/python/tvm/autotvm/task/task.py", line 406, in create
    args = serialize_args(args)

  File "/usr/tvm/python/tvm/autotvm/task/task.py", line 64, in serialize_args
    ret.append(_encode(t))

  File "/usr/tvm/python/tvm/autotvm/task/task.py", line 61, in _encode
    'primitive types or tvm.tir.Var only' % type(x))

RuntimeError: Do not support type "<class 'tvm.autotvm.task.space.FallbackConfigEntity'>" in argument. Consider to useprimitive types or tvm.tir.Var only

Could you give me some advice on this situation? Thanks in advance.

You don’t need to provide cfg in arguments. The first cfg argument in the template is handled by the template registration decorator and will be provided by the current context.
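To see why, here is a toy, TVM-free analogy of that mechanism (the real one lives in tvm.autotvm.task.topi_integration; the decorator and config names below are invented): the registration decorator injects the current config as the first argument, so call sites never pass cfg themselves.

```python
import functools

# Toy analogy, not real TVM code: the template decorator injects the
# context's current config as the first argument of the template.
_current_cfg = {"tile_x": 8}  # stands in for the config of the active context

def register_template(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(_current_cfg, *args, **kwargs)  # cfg injected here
    return wrapper

@register_template
def conv2d_template(cfg, data, kernel):
    # the template body sees cfg as its first parameter...
    return ("conv", cfg["tile_x"], data, kernel)

# ...but the call site supplies only the real arguments:
print(conv2d_template("data", "kernel"))  # ('conv', 8, 'data', 'kernel')
```

This is why passing a FallbackConfigEntity in args fails serialization: cfg was never meant to be part of the task's arguments.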

Thank you for your reply. I removed the cfg, and now I get the following error.

Traceback (most recent call last):

  File "tune_conv2d_cuda_builtin_tmp_2.py", line 23, in <module>
    target='cuda')

  File "/usr/tvm/python/tvm/autotvm/task/task.py", line 418, in create
    sch, _ = ret.func(*args)

  File "/usr/tvm/python/tvm/autotvm/task/task.py", line 209, in __call__
    return self._default_func(*args, **kwargs)

  File "/usr/tvm/python/tvm/autotvm/task/task.py", line 215, in _default_func
    out = self.fcompute(*args, **kwargs)

  File "/usr/tvm/python/tvm/autotvm/task/topi_integration.py", line 155, in wrapper
    node = topi_compute(cfg, *args)

TypeError: conv2d_nchw() missing 4 required positional arguments: 'kernel', 'strides', 'padding', and 'dilation'

The code now is:

import logging
import sys
import numpy as np

import tvm
from tvm import te
import topi
from topi.testing import conv2d_nchw_python

from tvm import autotvm

logging.getLogger('autotvm').setLevel(logging.DEBUG)
logging.getLogger('autotvm').addHandler(logging.StreamHandler(sys.stdout))

# the last layer in resnet
N, H, W, CO, CI, KH, KW, strides, padding = 1, 7, 7, 512, 512, 3, 3, (1, 1), (1, 1)
data = te.placeholder((N, CI, H, W), name='data')
kernel = te.placeholder((CO, CI, KH, KW), name='kernel')
conv = topi.nn.conv2d_nchw(data, kernel, strides, padding, dilation=1, out_dtype='float32')
#cfg = autotvm.get_config()
task = autotvm.task.create("conv2d_nchw.cuda",
                           args=([conv],),
                           target='cuda')
print(task.config_space)

As the error message suggested, you need to specify all required arguments. You are still missing 4 of them.

But according to the schedule definition:

we only need one parameter, outs, here.

The arguments you pass fill the compute, so the compute function at line 30 is what you should look at.

You mean that for this line

in my code, the task_name "conv2d_nchw.cuda" is actually the name of the compute rather than the schedule?

Not exactly. The task name corresponds to a pair of compute and schedule functions. When creating a task (or even a TVM program), we always create a compute first and then create a schedule accordingly, and so does AutoTVM.
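A toy registry, again not the real TVM code, may help picture this pairing: one task name keys both functions, and task creation runs the compute first, then derives the schedule from its output. All names below are invented for illustration.

```python
# Toy analogy of a task registry: one task name maps to a
# (compute, schedule) pair.
TASK_TABLE = {}

def register_task(name, compute, schedule):
    TASK_TABLE[name] = (compute, schedule)

def create_task(name, args):
    compute, schedule = TASK_TABLE[name]
    tensors = compute(*args)   # the compute is always created first...
    return schedule(tensors)   # ...and the schedule follows from its output

register_task("conv2d_nchw.cuda",
              compute=lambda *a: ("tensors", a),
              schedule=lambda t: ("schedule-for", t))

print(create_task("conv2d_nchw.cuda", ("data", "kernel")))
```

This also explains why the task's args are the compute function's arguments: the schedule only ever sees what the compute produces.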

I changed my code as follows:

import logging
import sys
import numpy as np

import tvm
from tvm import te
import topi
from topi.testing import conv2d_nchw_python

from tvm import autotvm

logging.getLogger('autotvm').setLevel(logging.DEBUG)
logging.getLogger('autotvm').addHandler(logging.StreamHandler(sys.stdout))

# the last layer in resnet
N, H, W, CO, CI, KH, KW, strides, padding = 1, 7, 7, 512, 512, 3, 3, (1, 1), (1, 1)
data = te.placeholder((N, CI, H, W), name='data')
kernel = te.placeholder((CO, CI, KH, KW), name='kernel')
#conv = topi.nn.conv2d_nchw(data, kernel, strides, padding, dilation=1, out_dtype='float32')
#cfg = autotvm.get_config()
task = autotvm.task.create("conv2d_nchw.cuda",
                           args=(data, kernel, strides, padding, 1, 'float32'),
                           target='cuda')
print(task.config_space)

Now it runs well. I have another question: in the Tuning High Performance Convolution on NVIDIA GPUs tutorial, the tuned operator is built by:

# apply history best from log file
with autotvm.apply_history_best('conv2d.log'):
    with tvm.target.create("cuda"):
        s, arg_bufs = conv2d_no_batching(N, H, W, CO, CI, KH, KW, strides, padding)
        func = tvm.build(s, arg_bufs)

How can I build the operator after tuning it with AutoTVM like this? Thanks a lot.

Also, could you please help me with this question? I am also trying to build a module with only one operator using TVM and to tune it with AutoTVM. Your help will be greatly appreciated.