Yolov3-tiny batch input test failed


I'm trying to run inference on the "yolov3-tiny" model with an input batch size of 4.

The input shape was (4, 3, 416, 416).

However, the shape of the output is as follows:

module.get_output(0) --> (1, 255, 26, 26)

module.get_output(1) --> (1, 255, 13, 13)

I believe the problem occurs when the following code is executed:

input_shape = (4, 3, 416, 416)
mod, params = relay.frontend.from_darknet(net, dtype=dtype, shape=input_shape)

When I print out mod["main"], it seems that the reshape op does not handle batched input.

%49 = nn.leaky_relu(%48, alpha=0.1f) /* ty=Tensor[(4, 256, 26, 26), float32] */;
%50 = nn.conv2d(%49, %LAYERTYPE.CONVOLUTIONAL22_weight, padding=[0, 0, 0, 0], channels=255, kernel_size=[1, 1]) /* ty=Tensor[(4, 255, 26, 26), float32] */;
%51 = nn.bias_add(%50, %LAYERTYPE.CONVOLUTIONAL22_bias) /* ty=Tensor[(4, 255, 26, 26), float32] */;
%52 = reshape(%51, newshape=[1, 3, 85, 26, 26]) /* ty=Tensor[(1, 3, 85, 26, 26), float32] */;
%53 = split(%52, indices_or_sections=[2, 4], axis=2) /* ty=(Tensor[(1, 3, 2, 26, 26), float32], Tensor[(1, 3, 2, 26, 26), float32], Tensor[(1, 3, 81, 26, 26), float32]) */;
%54 = %53.0;
%55 = sigmoid(%54) /* ty=Tensor[(1, 3, 2, 26, 26), float32] */;
%56 = %53.1;
%57 = %53.2;
%58 = sigmoid(%57) /* ty=Tensor[(1, 3, 81, 26, 26), float32] */;
%59 = (%55, %56, %58);
%60 = concatenate(%59, axis=2) /* ty=Tensor[(1, 3, 85, 26, 26), float32] */;
%61 = reshape(%60, newshape=[1, 255, 26, 26]) /* ty=Tensor[(1, 255, 26, 26), float32] */;

As shown above, the output shape of the reshape in %52 is (1, 3, 85, 26, 26) rather than (4, 3, 85, 26, 26).

Does anyone have an idea how to resolve this problem?

Best wishes,

R. Kim

diff --git a/python/tvm/relay/frontend/darknet.py b/python/tvm/relay/frontend/darknet.py
index 936d7c0dc..62a320780 100644
--- a/python/tvm/relay/frontend/darknet.py
+++ b/python/tvm/relay/frontend/darknet.py
@@ -637,12 +637,12 @@ class GraphProto(object):
             attr.update({'coords' : layer.coords})
             attr.update({'background' : layer.background})
             attr.update({'softmax' : layer.softmax})
-            attr.update({'shape' : (1, layer.c, layer.h, layer.w)})
+            attr.update({'shape' : (-1, layer.c, layer.h, layer.w)})
         elif LAYERTYPE.YOLO == layer_type:
             attr.update({'n' : layer.n})
             attr.update({'classes' : layer.classes})
-            attr.update({'shape' : (1, layer.c, layer.h, layer.w)})
+            attr.update({'shape' : (-1, layer.c, layer.h, layer.w)})
         elif LAYERTYPE.UPSAMPLE == layer_type:
             attr.update({'scale' : layer.stride})

You can apply this diff to python/tvm/relay/frontend/darknet.py to quickly solve this issue. It will change your outputs to:

module.get_output(0) --> (4, 255, 26, 26)

module.get_output(1) --> (4, 255, 13, 13)

However, the post-processing also needs to take the batch dimension into account:

layer_out['output'] = layer_out['output'].reshape(out_shape)

Here out_shape must include the batch size as well; currently it is written to handle only a batch size of 1.
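As a rough illustration of the batch-aware post-processing, here is a minimal NumPy sketch. The helper name split_batched_yolo_output and its num_anchors argument are hypothetical (they are not part of TVM), and the sketch assumes the raw output has the layout (batch, anchors * (5 + classes), H, W) as in the shapes above. It reshapes the batched output and hands back one array per image, so that the existing single-image post-processing could be applied to each in turn.

```python
import numpy as np


def split_batched_yolo_output(raw, num_anchors):
    """Hypothetical helper: split a (batch, C, H, W) YOLO head output
    into a list of per-image arrays of shape (num_anchors, C // num_anchors, H, W)."""
    batch, channels, height, width = raw.shape
    if channels % num_anchors != 0:
        raise ValueError("channel count must be divisible by num_anchors")
    per_anchor = channels // num_anchors
    # Keep the batch dimension instead of assuming it is 1.
    batched = raw.reshape(batch, num_anchors, per_anchor, height, width)
    return [batched[i] for i in range(batch)]


# Stand-in for module.get_output(0) converted to a NumPy array.
raw = np.zeros((4, 255, 26, 26), dtype="float32")
per_image = split_batched_yolo_output(raw, num_anchors=3)
print(len(per_image), per_image[0].shape)
```

Each element of per_image then has the same shape that the batch_size = 1 code path expects, so the rest of the detection pipeline can simply loop over the list.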


The problem is solved!!

When I run Yolov3-tiny on a Jetson Nano, single-image inference takes about 35 ms.

Now it takes about 120 ms to run inference on a batch of four images.
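To put those two numbers side by side, here is a small sketch of the per-image arithmetic. The variable names are only illustrative, and the timings are the approximate figures quoted above.

```python
# Approximate timings quoted in this thread (milliseconds).
single_ms = 35.0   # one image, batch size 1
batch_ms = 120.0   # four images, batch size 4
batch_size = 4

# Amortized cost per image when batching.
per_image_ms = batch_ms / batch_size

# Throughput in images per second for each mode.
throughput_single = 1000.0 / single_ms
throughput_batch = 1000.0 * batch_size / batch_ms

print(f"per-image latency with batching: {per_image_ms:.1f} ms")
print(f"throughput: {throughput_single:.1f} -> {throughput_batch:.1f} images/s")
```

So batching brings the amortized cost down to about 30 ms per image, compared with about 35 ms for a single image.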

I greatly appreciate your response.


What was the target backend when you got these timings?


The target device is a Jetson Nano.

AutoTVM was used to tune the conv2d layers and produce the best logs for the CUDA backend, with the sm_52 compute capability option.
