How to get multiple outputs from module.get_output()?


#1

Currently, all the examples/tutorials I can find use only one output. However, some models have multiple output layers, for example SSD MobileNet.

In this situation, how do we get multiple outputs?

Additionally, I find that the output shape is sometimes (1000,) and sometimes (1, 1000). Are both OK?


#2
    # get nth output with out_shape as output shape
    out = module.get_output(n, out=tvm.nd.empty(out_shape, ctx=ctx))
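
For a model with several outputs, get_output can simply be called once per output index. Below is a minimal sketch, assuming a graph runtime module that has already been built and run; get_num_outputs reports how many outputs the graph produces, and calling get_output without an out buffer returns a new NDArray:

    # assumes `module` is a graph runtime module on which
    # set_input(...) and run() have already been called
    num_outputs = module.get_num_outputs()

    # fetch each output; without an explicit `out` buffer,
    # get_output returns a freshly allocated tvm.nd.NDArray
    outputs = [module.get_output(i) for i in range(num_outputs)]

    for i, out in enumerate(outputs):
        print(i, out.shape)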

#3

Thanks, it works. But what if the output shapes are different? For example, SSD MobileNet has two outputs with different shapes. Should we call module.get_output twice, i.e. module.get_output(0, out=tvm.nd.empty(out_shape_0, ctx=ctx)) and module.get_output(1, out=tvm.nd.empty(out_shape_1, ctx=ctx)), or should we set out_shape = (out_shape_0, out_shape_1) and then call module.get_output(1, out=tvm.nd.empty(out_shape, ctx=ctx))?


#4

tvm.nd.empty(out_shape, ctx=ctx) allocates an array, so you can allocate a separate array for each output. It is recommended not to allocate arrays repeatedly; instead, allocate each one once and pass it to the out parameter.


#5

So, you mean I can do:

    module.get_output(0, out=tvm.nd.empty(out_shape_0, ctx=ctx))
    module.get_output(1, out=tvm.nd.empty(out_shape_1, ctx=ctx))


#6

I think @tqchen means

    # allocate once
    out_0 = tvm.nd.empty(out_shape_0, ctx=ctx)
    out_1 = tvm.nd.empty(out_shape_1, ctx=ctx)

    # run inference many times
    for i in range(...):
        # set input and call module.run()
        ...
        # pass the pre-allocated buffers as arguments
        module.get_output(0, out_0)
        module.get_output(1, out_1)
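
After each run, the pre-allocated buffers can be read back as NumPy arrays for post-processing (a small usage note, assuming the standard NDArray API):

    # convert the reused output buffers to NumPy
    out_0_np = out_0.asnumpy()
    out_1_np = out_1.asnumpy()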

#7

similar question,

I’m writing a C++ test application to test inference using MXNet + ResNet.
This network requires two outputs (1, 1000), but the example code below handles only one output.
For the build, I modified it to out_ndim = 2; int64_t out_shape[2] = {1, 1000, };
The build works, but it seems to produce the wrong result:

    The maximum position in output vector is: 0

Could you give me advice on how to correct it?

Thanks,
Inki Dae


    DLTensor* y;
    int out_ndim = 1;
    int64_t out_shape[1] = {1000, };
    TVMArrayAlloc(out_shape, out_ndim, dtype_code, dtype_bits, dtype_lanes, device_type, device_id, &y);

    // get the function from the module (get output data)
    tvm::runtime::PackedFunc get_output = mod.GetFunction("get_output");
    get_output(0, y);

    // get the maximum position in output vector
    auto y_iter = static_cast<float*>(y->data);
    auto max_iter = std::max_element(y_iter, y_iter + 1000);
    auto max_index = std::distance(y_iter, max_iter);
    std::cout << "The maximum position in output vector is: " << max_index << std::endl;

    TVMArrayFree(x);
    TVMArrayFree(y);


#8

Excuse me, did you solve this problem?


#9

Hi,

I am testing a model with 3 outputs. However, when I use m.get_output(0), for example for index 0, I always get different outputs. It seems the outputs are randomly mapped to the output indices.

Is this a bug, or is there a way to get the output indices in a deterministic way?

Thanks


#10

I would like to share the solution to my problem described in the previous comment.

The issue is that I defined the outputs of the model as a set, as follows:

    outputs = {'output1', 'output2', 'output3'}
    mod, params = relay.frontend.from_tensorflow(graph_def, layout=layout, outputs=outputs, shape=shape_dict)

This results in a random ordering of the outputs when the set is iterated. I used this code because I found it in an example that defines the outputs; however, the right way is to use a list, as follows:

    outputs = ['output1', 'output2', 'output3']
    mod, params = relay.frontend.from_tensorflow(graph_def, layout=layout, outputs=outputs, shape=shape_dict)

It is a small difference, but it is important to be aware of it.
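
As a side note, plain Python illustrates the pitfall: the iteration order of a set is arbitrary (and can change between runs, since string hashing is randomized), while a list preserves the order in which the names were written:

    names_set = {'output1', 'output2', 'output3'}    # iteration order is arbitrary
    names_list = ['output1', 'output2', 'output3']   # iteration order is as written
    print(list(names_set))
    print(names_list)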


#11

I guess copying the output (which might be large) from the GPU to the CPU asynchronously could be more time-efficient.
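
As a rough illustration of that idea (a sketch only, using the older ctx-based Python API from earlier in this thread; the shape and names are illustrative, and whether the copy overlaps with other work depends on the device and runtime), the device output can be copied into a pre-allocated host buffer:

    # assumes `module` was created on a GPU context, e.g. ctx = tvm.gpu(0),
    # and that module.run() has been called
    out_shape_0 = (1, 1000)                                # illustrative shape
    host_out = tvm.nd.empty(out_shape_0, ctx=tvm.cpu(0))   # allocate once on the host

    module.get_output(0, host_out)   # copies the device output into the host buffer
    result = host_out.asnumpy()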


#12

I’ve done so successfully.