Run multiple keras models in one program

I took a look at the tutorial for running keras models with tvm, and I can get that running with a single model.

I’ve slightly adapted this code so I can choose a keras model to run, and compile and execute that instead. Individually, I can get resnet50 and xception running. However, if I try to run them sequentially I get the following error:

nnvm._base.NNVMError: [16:21:19] /home/btaylor/projects/autotvm/nnvm/src/compiler/ Check failed: dshape.ndim() != 0 (0 vs. 0)

What is causing this error?

Can you share your script?

Here’s the code:

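# Imports this excerpt relies on (not shown in the original post)
import numpy as np
import nnvm
import nnvm.compiler
import tvm
from PIL import Image
from tvm.contrib import graph_runtime

def get_available_models_names():
    available_models = ['xception',
                        # ... remaining model names elided in the original post
                        ]

    return available_models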

def run_model(model_name, input_shape, out_shape, classes):
    model_fn, preprocess_fn = get_model_functions(model_name)

    # Load pretrained keras model
    model = model_fn(include_top=True, weights='imagenet',
                     input_shape=input_shape, classes=classes)

    # Load and preprocess image
    img = Image.open(img_path).resize((224, 224))  # img_path: image path elided in the original post
    data = np.array(img)[np.newaxis, :].astype('float32')
    data = preprocess_fn(data).transpose([0, 3, 1, 2]) # Need NCHW format for GPU

    target = 'llvm'
    ctx = tvm.cpu()

    # Compile with nnvm
    # convert the keras model(NHWC layout) to NNVM format(NCHW layout), then compile
    sym, params = nnvm.frontend.from_keras(model)
    shape_dict = {'input_1': data.shape}
    with nnvm.compiler.build_config(opt_level=2):
        graph, lib, params = nnvm.compiler.build(sym, target, shape_dict, params=params)

    # Execute with tvm
    m = graph_runtime.create(graph, lib, ctx)

    # set inputs
    m.set_input('input_1', tvm.nd.array(data.astype('float32')))

    # execute
    m.run()

    # get outputs
    tvm_out = m.get_output(0, tvm.nd.empty(out_shape, 'float32')).asnumpy()
    top1_tvm = np.argmax(tvm_out)

    return top1_tvm

def main():
    input_shape = (224, 224, 3)
    classes = 1000
    out_shape = (classes,)

    # Run each model once
    results = list()
    for model_name in get_available_models_names():
        print("Running model:", model_name)
        top_1 = run_model(model_name, input_shape, out_shape, classes)
        break

if __name__ == "__main__":
    main()

Like this, the code runs fine and returns a result. If I remove the break in main, I get the error from my first post. I know that each of the first 4 models in the list works individually, but I get this error when I try to execute any of them after some other working model.

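The input name of a keras model varies when you build several models in one process: keras numbers its auto-generated input layers, so the second model’s input is typically input_2 rather than input_1. I think replacing 'input_1' with model.input_names[0] would solve the error.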
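For illustration, here is a minimal sketch of that change. The helper name input_binding and the SimpleNamespace stand-ins are my own, used only so the snippet runs without keras installed; with a real model you would pass the keras model object itself, which exposes the same input_names attribute.

```python
from types import SimpleNamespace


def input_binding(model, data_shape):
    """Return (input_name, shape_dict) based on the model's own first input name."""
    input_name = model.input_names[0]
    return input_name, {input_name: data_shape}


# Stand-ins for two keras models built one after another in the same process.
# They are hypothetical objects that only mimic the input_names attribute.
first_model = SimpleNamespace(input_names=['input_1'])
second_model = SimpleNamespace(input_names=['input_2'])

for model in (first_model, second_model):
    name, shape_dict = input_binding(model, (1, 3, 224, 224))
    print(name, shape_dict)
```

In run_model, the returned name would take the place of the hard-coded 'input_1' in both shape_dict and m.set_input.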

Thanks, that seems to have solved the problem! Where is the documentation for this? I can’t seem to find it.

I’m getting another weird problem now. The only models I can get to run are: mobilenet_v1, mobilenet_v2, xception, and resnet50.

I’m not getting consistent results when I measure the prediction time of each of these models. Basically my approach is:

start_time = current_time()
m.run()
end_time = current_time()
prediction_time = end_time - start_time

Sometimes I get a value of around 1200 ms, which seems a bit long for mobilenet, but most of the time I get a value between 4 and 10 ms. Each time, though, the models seem to output the correct prediction.
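A single timed call like this is sensitive to one-off costs, so one option is to do a few untimed warm-up runs and then look at statistics over repeated runs. The sketch below shows that approach; the helpers time_prediction and summarize are hypothetical names, not part of TVM, and whether first-run overhead explains the 1200 ms value is only a guess.

```python
import statistics
import time


def time_prediction(run_fn, warmup=3, repeats=20):
    """Call run_fn repeatedly and return the timed latencies in milliseconds."""
    for _ in range(warmup):
        run_fn()  # warm-up runs are executed but not timed
    samples_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return samples_ms


def summarize(samples_ms):
    """Reduce the samples to min, median and max, which makes outliers visible."""
    return {
        'min_ms': min(samples_ms),
        'median_ms': statistics.median(samples_ms),
        'max_ms': max(samples_ms),
    }


if __name__ == '__main__':
    # Stand-in workload; with the graph runtime above, run_fn would be m.run.
    samples = time_prediction(lambda: sum(range(10000)), warmup=2, repeats=5)
    print(summarize(samples))
```

On a GPU target the call may return before the device has finished its work, so a synchronization step before reading the clock would probably be needed as well; that depends on the setup.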

Thanks, that seems to have solved the problem! Where is the documentation for this? I can’t seem to find it.

It isn’t covered in the TVM docs, but some examples are available in the test script:

I’m getting another weird problem now. The only models I can get to run are: mobilenet_v1, mobilenet_v2, xception, and resnet50.

Okay, I’ll look into it.

I’m not getting consistent results when I measure the prediction time of each of these models.

I tried the same test, but I couldn’t reproduce the problem. The results were stable between 20 and 30 ms in my environment.

The timing problem seems to have stabilised now.

I’m also trying to run the following keras models, which fail with these errors:

inceptionv3 - PTX Error
inceptionresnetv2 - Lambda Layer Not supported (I’m assuming this isn’t currently available with tvm)

I also tried to run vgg16 and vgg19, but got an out-of-memory error. I understand vgg contains a large matrix, which is probably causing this error.

Thanks for all the help!

Inceptionv3 consumes too much shared memory in CUDA. Inceptionresnetv2 needs Lambda layer support, as you said. Neither looks easy to solve.

I’ve sent two PRs to support NASNet:

The other models in your code appear to work correctly in my environment.

Thanks! I’ll take a look now