Teach NNVM to recognize TensorFlow models

Is it possible to import a TF model into MXNet and then convert it to NNVM? Are there any problems with this kind of conversion? Thanks.

https://github.com/dmlc/tvm/pull/1188

PR under review for tensorflow frontend.

Works well for InceptionV1 and V3.
MobileNet also works for me with a few changes that I have yet to submit as a PR.

Hi,

I found your discussion of ResNet-50 performance on TensorFlow and TVM at https://github.com/dmlc/nnvm/issues/440.

Could you share how you converted the TensorFlow ResNet-50 model to NNVM? I tried but failed, as described in the thread "Anyone successful converting tensorflow resnet50 model to nnvm?". If you would not mind sharing your advice, I would appreciate it very much.

Thanks!

We have a fork of TVM in which we have done a lot of work on the CoreML frontend. We convert TF ResNet-50 to CoreML, and then convert CoreML to NNVM.

Did you encounter problems converting the ResNet-50 TF model to CoreML? I got the error messages described in https://github.com/tf-coreml/tf-coreml/issues/210, but I saw that you reported the error on a different model. Any advice on how to solve the ResNet-50 conversion issue? Thanks a lot.

I didn't hit this error when converting ResNet-50. You could try the approach I mentioned in that issue.

@FrozenGene

I followed your advice converting tf resnet50 to coreml, and had the above problems.

My ResNet-50 model is the pretrained ResNet-50 v1 or ResNet-50 v2 model from https://github.com/tensorflow/models/tree/master/official/resnet. The model is in saved_model format. I froze the model with

python freeze_graph.py --input_saved_model_dir=saved_model_dir --output_graph=frozen_model.pb --output_node_names=ArgMax --clear_devices

Freezing was successful. But both models raise
ValueError: Length of the 'dim' parameter must be equal to 4
when converted to coreml.

Should I use a different TF ResNet-50 model? Could you share which TF ResNet-50 model you used and how you froze it, if this is public information?

Thank you very much.

Just the official ResNet-50 model provided by TensorFlow, nothing special: https://github.com/tensorflow/models/tree/master/research/slim

@FrozenGene

Did you use v1 or v2 on that page? How did you freeze it?

I downloaded v1, and used
python freeze_graph.py --input_graph=resnet_v1_50_inf_graph.pb --input_checkpoint=resnet_v1_50.ckpt --input_binary=true --output_graph=frozen_resnet_v1_50_slim.pb --output_node_names=resnet_v1_50/predictions/Reshape_1
to freeze it. But I got an error:

File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1759, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [1,1,2048,1001] rhs shape= [1,1,2048,1000]

I use the ResNet v2 provided there. As for your error, I remember that export_inference_graph.py has a parameter that controls this. You could look into it.

@jjiang2cal

You may try this initial version of changes where I could compile Resnet_v2 via tensorflow frontend.

I am planning to PR it soon.

Yes, setting the --labels_offset=1 flag when exporting the inference graph solves this problem. Thanks.
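
The shape mismatch in the error above ([1,1,2048,1001] in the graph versus [1,1,2048,1000] in the checkpoint) is consistent with the exported graph having one more class than the checkpoint. The sketch below only illustrates that arithmetic. It assumes the exporter sizes the final layer as the dataset's class count minus labels_offset, and that the dataset reports 1001 classes (1000 labels plus a background class). The function name logits_weight_shape is hypothetical and is not part of TensorFlow or slim.

```python
def logits_weight_shape(dataset_num_classes, labels_offset, feature_depth=2048):
    """Shape of the final 1x1 conv weights for a given class count (illustrative only)."""
    # Assumption: the exported graph uses (dataset classes - labels_offset) outputs.
    num_classes = dataset_num_classes - labels_offset
    return [1, 1, feature_depth, num_classes]


# rhs shape taken from the error message, i.e. what the checkpoint holds.
CHECKPOINT_SHAPE = [1, 1, 2048, 1000]

default_export = logits_weight_shape(1001, labels_offset=0)
offset_export = logits_weight_shape(1001, labels_offset=1)

print("labels_offset=0 matches checkpoint:", default_export == CHECKPOINT_SHAPE)
print("labels_offset=1 matches checkpoint:", offset_export == CHECKPOINT_SHAPE)
```

Under these assumptions only the labels_offset=1 export lines up with the 1000-class checkpoint, which matches what was reported in this thread.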

@srkreddy1238

Thanks for the quick commit!

When I tried tf slim models of resnet 50 v1 and v2 (https://github.com/tensorflow/models/tree/master/research/slim), I got NotImplementedError: Please freeze the graph with add_shapes=True. I use the freeze script from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py, and it does not have an add_shapes option. Is there any other freeze_graph I should use?

Sorry for bothering you so much :frowning:

freeze_graph.py --input_saved_model_dir=20180601_resnet_v2_imagenet_savedmodel/1527888387/ --output_graph=frozen_model-v2-fp16.pb --output_node_names=ArgMax --clear_devices

I use this command to freeze the model.

Ref.

The helper function below can be used to add shapes:

graph_def = nnvm.testing.tf.AddShapesToGraphDef('softmax')

@srkreddy1238

This is the official model I first used. The input node and output node of this model (inspected by https://github.com/tf-coreml/tf-coreml/blob/master/utils/inspect_pb.py) are:

0: op name = import/input_tensor, op type = ( Placeholder ), inputs = , outputs = import/input_tensor:0
@input shapes:
@output shapes:
name = import/input_tensor:0 : (128, 224, 224, 3)
......
 702: op name = import/ArgMax, op type = ( ArgMax ), inputs = import/resnet_model/final_dense:0, import/ArgMax/dimension:0, outputs = import/ArgMax:0
@input shapes:
name = import/resnet_model/final_dense:0 : (128, 1001)
name = import/ArgMax/dimension:0 : ()
@output shapes:
name = import/ArgMax:0 : (128,)

Since NNVM/TVM does not support batch_size > 1, I set batch_size to 1.
With your patch, it compiled to NNVM successfully :grinning:.
But I have a question about inference. The output of the graph is ArgMax, so it is the class index of the classification, and with batch_size 1 the output shape should be (1,). I used the code below for inference:

......
out = module.get_output(0)
tvm_out = out.asnumpy()
print(tvm_out)

It printed out [766 774 457 766 729 701 824]
while I expected 230. And I don’t understand why it is a 7-element vector.

Below is one picture I used for inference. It is from the ImageNet dataset, classified as 230: ‘Shetland sheepdog, Shetland sheep dog, Shetland’.

[Image: ILSVRC2012_val_00000003]

Do you have any insight into how I should run the inference? Thanks a lot.
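
As a reference for the expectation stated in this post: with batch size 1 the final_dense output has shape (1, 1001), and ArgMax along the class axis yields one index per image, i.e. shape (1,). The plain-Python sketch below uses made-up logits. The helper argmax_per_row and the values are hypothetical and are chosen only to show the shapes involved; this is not TVM or NNVM code.

```python
def argmax_per_row(batch_logits):
    """Index of the largest value in each row, i.e. ArgMax over the class axis."""
    return [max(range(len(row)), key=row.__getitem__) for row in batch_logits]


NUM_CLASSES = 1001
logits = [[0.0] * NUM_CLASSES]  # batch of one image
logits[0][230] = 5.0            # made-up peak at the class index the post expects

predictions = argmax_per_row(logits)
print(len(predictions), predictions)
```

A result with more than one element at batch size 1 suggests that the tensor reaching ArgMax does not have the (1, 1001) shape, so comparing intermediate shapes against the TensorFlow node listing above may be a reasonable first check.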

@srkreddy1238

For the research slim model,
graph_def = nnvm.testing.tf.AddShapesToGraphDef('resnet_v2_50/predictions/Reshape_1')
does solve the add_shapes error. But conversion then fails with:

File "from_tensorflow_slim_v2.py", line 124, in <module>
    graph, lib, params = nnvm.compiler.build(sym, target, shape_dict, params=params)
  File "/tvm/nnvm/python/nnvm/compiler/build_module.py", line 270, in build
    ishape, _ = graph_util.infer_shape(graph, **shape)
  File "/tvm/nnvm/python/nnvm/compiler/graph_util.py", line 31, in infer_shape
    graph = graph.apply("InferShape")
  File "/tvm/nnvm/python/nnvm/graph.py", line 234, in apply
    check_call(_LIB.NNGraphApplyPasses(self.handle, npass, cpass, ctypes.byref(ghandle)))
  File "/tvm/nnvm/python/nnvm/_base.py", line 75, in check_call
    raise NNVMError(py_str(_LIB.NNGetLastError()))
nnvm._base.NNVMError: Error in operator resnet_v2_50/SpatialSqueeze: [17:58:16] /tvm/nnvm/src/top/tensor/transform.cc:693: Check failed: shp[i] == 1 (7 vs. 1) The squeezed axis must have shape 1!Want to squeeze 2, which has shape7

The input, output and the resnet_v2_50/SpatialSqueeze nodes are as below:

0: op name = import/input, op type = ( Placeholder ), inputs = , outputs = import/input:0
@input shapes:
@output shapes:
name = import/input:0 : (?, 224, 224, 3)
......
1762: op name = import/resnet_v2_50/SpatialSqueeze, op type = ( Squeeze ), inputs = import/resnet_v2_50/logits/BiasAdd:0, outputs = import/resnet_v2_50/SpatialSqueeze:0
@input shapes:
name = import/resnet_v2_50/logits/BiasAdd:0 : (?, 1, 1, 1001)
@output shapes:
name = import/resnet_v2_50/SpatialSqueeze:0 : (?, 1001)
......
1767: op name = import/resnet_v2_50/predictions/Reshape_1, op type = ( Reshape ), inputs = import/resnet_v2_50/predictions/Softmax:0, import/resnet_v2_50/predictions/Shape:0, outputs = import/resnet_v2_50/predictions/Reshape_1:0
@input shapes:
name = import/resnet_v2_50/predictions/Softmax:0 : (?, 1001)
name = import/resnet_v2_50/predictions/Shape:0 : (2,)
@output shapes:
name = import/resnet_v2_50/predictions/Reshape_1:0 : (?, 1001)
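
The failing check in the traceback above (shp[i] == 1 (7 vs. 1)) says that a squeezed axis must have extent 1. A minimal pure-Python sketch of that rule follows. It is an illustration, not NNVM's implementation, and the helper name squeeze_shape is hypothetical. The (1, 7, 7, 1001) input is an assumed shape chosen only to reproduce a "7 vs. 1" situation; the thread does not state which shape NNVM actually inferred for the input of SpatialSqueeze.

```python
def squeeze_shape(shape, axes):
    """Return `shape` with the given axes removed; each removed axis must have extent 1."""
    for axis in axes:
        if shape[axis] != 1:
            raise ValueError(
                "cannot squeeze axis %d: extent is %d, expected 1" % (axis, shape[axis])
            )
    return tuple(dim for i, dim in enumerate(shape) if i not in axes)


# Shapes TensorFlow reports for SpatialSqueeze, with the batch dimension fixed to 1.
print(squeeze_shape((1, 1, 1, 1001), axes=(1, 2)))

# Assumed shape where the spatial dimensions were not reduced to 1x1 beforehand.
try:
    squeeze_shape((1, 7, 7, 1001), axes=(1, 2))
except ValueError as err:
    print("squeeze rejected:", err)
```

The TensorFlow listing shows (?, 1, 1, 1001) going into the squeeze, so an extent of 7 on a squeezed axis points at a shape that was inferred differently somewhere upstream of that node.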

@FrozenGene

Did you get this error

File "/tvm/nnvm/python/nnvm/frontend/coreml.py", line 182, in PoolingLayerParams
    raise NotImplementedError("Other convolution padding not implemented")
NotImplementedError: Other convolution padding not implemented

when converting coreml to nnvm? (The coreml model is converted from research slim resnet50 v2 tf model.)

I have done a lot of work on CoreML. For convolution, I support SAME / VALID using 4-D padding (not yet contributed back to the community, but I will do so soon), and pooling padding is fully supported too. So I really cannot figure out the error in detail from this information alone. I suggest converting the .mlmodel to text format (you can Google how to do it) and then checking the detailed information for this layer.

@jjiang2cal

I know about the shape operator issue above in resnet_v2_50; I will try to share the fix for it soon.

With this, the TensorFlow frontend could support all models (Inception, ResNet, MobileNet V1/V2, VGG) from research/slim.

Since not all models can be integrated into the TVM test cases, I have added some utilities for validation at https://github.com/srkreddy1238/dmlc_data/tree/master/work/tf/samples for reference.