[TOPI] Question about inference

Hi!!

I have a question about the TOPI.

When we run inference on a network, we build its layers and flow the parameters/input data through them, and we can optimize those layers with several scheduling techniques.

for example if we have a network like below

%0 = conv2d
%1 = relu(%0)
%2 = pool(%1)

we can fuse conv2d + relu into a single kernel. So I do this with TOPI:

with tvm.target.create('cuda'):
    L1 = topi.nn.conv2d( ... )
    L2 = topi.nn.relu(L1)
    sch = topi.generic.schedule_conv2d_nchw([L2])
conv_relu = tvm.build(sch, args, 'cuda', 'llvm')

with tvm.target.create('cuda'):
    L1 = topi.nn.pool( ... )
    sch = topi.generic.schedule_pool([L1])
pool = tvm.build(sch, args, 'cuda', 'llvm')

## Inference: run the two modules back to back
conv_relu(Args1, output)
pool(Args2, result)
## Get the data back from the device
print(result.asnumpy())

If I run that code it works fine, but I'm wondering whether it's right to run two modules sequentially like this. In Relay, conv_relu and pool are executed inside a single module, whereas the TOPI code above executes two separate modules.

What I want to ask is: is there a performance difference between running multiple modules (like with TOPI) and running a single module (like with Relay)?