How do you create a task for auto-tuning (CUDA) from relay.nn.conv2d?

To create a task using relay.nn.conv2d for CPU (x86) you can do:

func_create = 'topi_x86_conv2d_NCHWc'
args = (('TENSOR', data_shape, 'float32'), ('TENSOR', kernel_shape, 'float32'),
        strides, padding, dilation, 'NCHW', 'float32')
# Workload for the task
workload = ('conv2d', data_shape_type, kernel_shape_type, strides, padding,
            dilation, 'NCHW', 'float32')
task = autotvm.task.create(func_create, args=args, target=target,
                           template_key='direct')

This workflow was followed in the previous version of the x86 auto-tuning tutorial for CNN.

However, for my application, I would like to create a “topi_nn_conv2d” task instead, like the ones in the tutorial at:

https://docs.tvm.ai/tutorials/autotvm/tune_relay_cuda.html#sphx-glr-tutorials-autotvm-tune-relay-cuda-py

Something of the form

[Task(func_name=topi_nn_conv2d, args=(('TENSOR', (1, 3, 227, 227), 'float32'), ('TENSOR', (96, 3, 11, 11), 'float32'), (4, 4), (0, 0, 0, 0), (1, 1), 'NCHW', 'float32'), kwargs={}, workload=('conv2d', (1, 3, 227, 227, 'float32'), (96, 3, 11, 11, 'float32'), (4, 4), (0, 0, 0, 0), (1, 1), 'NCHW', 'float32'))]

If I use “func_create = topi_nn_conv2d” in the code above, it throws an error (likely because “topi_nn_conv2d” has not been registered as a task template, unlike ‘topi_x86_conv2d_NCHWc’).

I was wondering if somebody knows of a simple way to create a task of type ‘topi_nn_conv2d’ to do auto-tuning on the GPU.

The simplest solution is to create a module with only one conv2d op and use task extraction to get the corresponding tuning task. Otherwise, you have to import topi (whether you use it directly or not) so that all the decorators run and register the TOPI schedules.

Hi @comaniac,

Thank you for your help, I really appreciate it. I have taken what I believe is a similar approach to what you describe above. Here is my code:

ctx = tvm.gpu()

out = relay.nn.conv2d(data, kernel, strides=strides, padding=padding,
                      dilation=dilation, channels=kernel_shape[0],
                      kernel_size=kernel_size, data_layout='NCHW',
                      out_dtype=dtype)
mod = relay.Module.from_expr(out)

kernel_weights = tvm.nd.array(np.ones(kernel_shape, dtype=dtype), ctx)
dict_params = {'kernel': kernel_weights}

task = autotvm.task.extract_from_program(mod["main"], target=target,
                                         params=dict_params,
                                         ops=(relay.op.nn.conv2d,))

This approach gives me a task of type topi_nn_conv2d. However, the variable task above does not have the attributes “target” and “workload”. Do you know of a way to create them? Doing task.target = 'cuda' returns

AttributeError: 'list' object has no attribute 'target'

The attribute “target” is required to send the task to the auto-tuner, which is why it is important in my application.

I really appreciate any help you can provide regarding this issue.

  1. The task target is determined when extracting from the program, because the same conv2d will result in different tuning tasks on different targets. Thus, the target of your tasks is the same as the one you passed to extract_from_program(..., target=target, ...).

  2. task is a list, so you should use task[0] to access a single task, but it’s invalid to use task[0].target = ... to change the task target, for the same reason as the point above.

Thanks a lot for your reply @comaniac. I used task[0] instead of task as:

XGBTuner(task[0])

and the auto-tuner worked. In addition, the values in task[0].target and task[0].workload are set correctly 🙂

One follow-up question, @comaniac, if you would be so kind: I am trying to call a module (created from a relay conv2d as above) from C++, following the example in:

https://docs.tvm.ai/deploy/cpp_deploy.html

I have exported the library as:

graph, lib, params = relay.build_module.build(mod, params=dict_params, target=target)
lib.export_library("conv2d_cpu_opt.so")

and when I do print(mod) I get the following

def @main(%data: Tensor[(1, 3, 227, 227), float32], %kernel: Tensor[(96, 3, 11, 11), float32]) -> Tensor[(1, 96, 55, 55), float32] {
  nn.conv2d(%data, %kernel, strides=[4, 4], padding=[0, 0, 0, 0], channels=96, kernel_size=[11, 11], out_dtype="float32") /* ty=Tensor[(1, 96, 55, 55), float32] */
}
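As a quick sanity check (plain Python, no TVM needed), the output type Tensor[(1, 96, 55, 55)] printed above follows from the usual convolution shape arithmetic:

```python
def conv2d_out_shape(data_shape, kernel_shape, strides, padding):
    # NCHW data, OIHW kernel; padding is (top, left, bottom, right).
    n, _, h, w = data_shape
    out_c, _, kh, kw = kernel_shape
    oh = (h + padding[0] + padding[2] - kh) // strides[0] + 1
    ow = (w + padding[1] + padding[3] - kw) // strides[1] + 1
    return (n, out_c, oh, ow)

# (227 + 0 + 0 - 11) // 4 + 1 = 55 in both spatial dimensions:
print(conv2d_out_shape((1, 3, 227, 227), (96, 3, 11, 11), (4, 4), (0, 0, 0, 0)))
# -> (1, 96, 55, 55)
```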

I am looking for the function signature, since I need it to pass the right arguments in C++.

From the print(mod) statement, can I infer that my function name is “main”, to import it like:

my_func = mod.GetFunction("main")

and that the function signature would be something like my_func(data, kernel, output) or maybe output = my_func(data, kernel)?

How would you go about finding the signature and function name in this case?

I don’t have much experience running modules in C++, but as the example shows, your function name should be main, and the function signature should be like my_func(data, kernel, output). Note that all arguments are DLTensors.

Thanks a lot for your response. Right now when using the library in C++ I am facing a problem since the function

f = mod.GetFunction("main")

is returning a null pointer, so the CHECK throws an error saying

terminate called after throwing an instance of 'dmlc::Error'
  what():  [11:38:05] conv2d_deploy_cpu.cc:43: Check failed: f != nullptr: 

Gonna go ahead and post this issue as a question in case somebody has done something similar in the past. Thanks again for all your help!

Hi, I am trying to build a module with a single op as you do, but I am not familiar with TVM. Could you share your code for the above example with me? Thanks a lot.