Run multiple Keras models in one program


#1

I took a look at the tutorial for running Keras models with TVM, and I can get it working with a single model.

I’ve slightly adapted the tutorial code so I can choose a Keras model to run, then compile and execute that instead. Individually, I can get resnet50 and xception running. However, if I try to run them sequentially I get the following error:

nnvm._base.NNVMError: [16:21:19] /home/btaylor/projects/autotvm/nnvm/src/compiler/simplify_inference.cc:27: Check failed: dshape.ndim() != 0 (0 vs. 0)

What is causing this error?


#2

Can you share your script?


#3

Here’s the code:

import numpy as np
from PIL import Image

import nnvm
import nnvm.compiler
import tvm
from tvm.contrib import graph_runtime

# Note: IMAGE_FILE and get_model_functions() are defined elsewhere in the full script.


def get_available_models_names():
    available_models = ['xception',
                        'resnet50',
                        'inceptionv3',
                        'inceptionresnetv2',
                        'mobilenet',
                        'densenet',
                        'nasnet',
                        'mobilenetv2'
                        ]

    return available_models


def run_model(model_name, input_shape, out_shape, classes):
    model_fn, preprocess_fn = get_model_functions(model_name)

    # Load pretrained keras model
    model = model_fn(include_top=True, weights='imagenet',
                     input_shape=input_shape, classes=classes)


    # Load and preprocess image
    img = Image.open(IMAGE_FILE).resize((224, 224))
    data = np.array(img)[np.newaxis, :].astype('float32')
    data = preprocess_fn(data).transpose([0, 3, 1, 2])  # NHWC -> NCHW: the NNVM-converted graph expects NCHW input

    target = 'llvm'
    ctx = tvm.cpu()

    # Compile with NNVM:
    # convert the Keras model (NHWC layout) to NNVM format (NCHW layout), then compile
    sym, params = nnvm.frontend.from_keras(model)
    shape_dict = {'input_1': data.shape}
    with nnvm.compiler.build_config(opt_level=2):
        graph, lib, params = nnvm.compiler.build(sym, target, shape_dict, params=params)

    # Execute with tvm
    m = graph_runtime.create(graph, lib, ctx)

    # set inputs
    m.set_input('input_1', tvm.nd.array(data.astype('float32')))
    m.set_input(**params)

    # execute
    m.run()

    # get outputs
    tvm_out = m.get_output(0, tvm.nd.empty(out_shape, 'float32')).asnumpy()
    top1_tvm = np.argmax(tvm_out)

    return top1_tvm


def main():
    input_shape = (224, 224, 3)
    classes = 1000
    out_shape = (classes,)

    # Run each model once
    results = list()
    for model_name in get_available_models_names():
        print("Running model:", model_name)
        top_1 = run_model(model_name, input_shape, out_shape, classes)
        results.append(top_1)
        print('\n')
        break  # removing this break triggers the error described above

if __name__ == "__main__":
    main()

As written, the code runs fine and returns a result. If I remove the break in main, I get the error from my first post. Each of the first four models in the list works individually, but I get this error whenever I run a model after another working model.


#4

The input name of a Keras model can vary when you load multiple models in one process (Keras numbers input layers globally, so the second model’s input becomes 'input_2'). I think replacing 'input_1' with model.input_names[0] would solve the error.
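For reference, a minimal sketch of that fix against the script above (the names follow that script; input_name is a new variable introduced here):

# Keras auto-names input layers with a global counter, so the second model
# loaded in the same process gets 'input_2'; the hardcoded 'input_1' entry
# then leaves the real input with an unknown (0-dim) shape. Ask the model
# for its actual input name instead:
input_name = model.input_names[0]

shape_dict = {input_name: data.shape}
with nnvm.compiler.build_config(opt_level=2):
    graph, lib, params = nnvm.compiler.build(sym, target, shape_dict, params=params)

m = graph_runtime.create(graph, lib, ctx)
m.set_input(input_name, tvm.nd.array(data.astype('float32')))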


#5

Thanks, that seems to have solved the problem! Where is this documented? I can’t seem to find it.

I’m getting another weird problem now. The only models I can get to run are: mobilenet_v1, mobilenet_v2, xception, and resnet50.

I’m not getting consistent results when I measure the prediction time of each of these models. Basically my approach is:

import time

start_time = time.time()
module.run()
end_time = time.time()
prediction_time = (end_time - start_time) * 1000  # milliseconds

Sometimes I get a value of around 1200 ms, which seems a bit long for mobilenet, but most of the time I get a value between 4 and 10 ms. Either way, the models output the correct prediction each time.


#6

Thanks, that seems to have solved the problem! Where is this documented? I can’t seem to find it.

It isn’t written up in the TVM docs, but there are some examples in the test scripts.

I’m getting another weird problem now. The only models I can get to run are: mobilenet_v1, mobilenet_v2, xception, and resnet50.

Okay, I’ll look into it.

I’m not getting consistent results when i measure the prediction time of each of these models.

I tried the same test, but I couldn’t reproduce the problem. The results were stable, between 20 and 30 ms, in my environment.
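If it helps, a more stable way to time inference is TVM’s built-in time evaluator, which averages over several runs; a minimal sketch using the graph runtime module m and context ctx from the script above:

# Average over 10 runs to smooth out one-off spikes such as first-run warm-up.
ftimer = m.module.time_evaluator("run", ctx, number=10)
prof_res = ftimer()
print("Mean inference time: %.2f ms" % (prof_res.mean * 1000))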


#7

The timing problem seems to have stabilised now.

I’m also trying to run the following Keras models, and I get these errors:

inceptionv3 - PTX error
inceptionresnetv2 - Lambda layer not supported (I’m assuming this isn’t currently available in TVM)

I also tried to run vgg16 and vgg19, but get an out-of-memory error. I understand VGG contains very large dense layers, which are probably causing this.

Thanks for all the help!


#8

Inceptionv3 consumes too much shared memory in CUDA. Inceptionresnetv2 needs Lambda layer support, as you said. Neither looks easy to solve.

I’ve sent two PRs to support NASNet:
https://github.com/dmlc/tvm/pull/1635
https://github.com/dmlc/tvm/pull/1636

The other models in your code appear to work correctly in my environment.


#9

Thanks! I’ll take a look now.