intel_graphics scheduler: incorrect inference accuracy

I downloaded the ResNet-50 ONNX model (this one: https://github.com/onnx/models/tree/master/vision/classification/resnet/resnet50, release 1.3) and ran inference according to this tutorial, changing only the model and the frontend.

When the target is ‘llvm’ (for any opt_level) or tvm.target.intel_graphics() (only for opt_level = 3), the model correctly prints the output:
TVM prediction top-1: 283 Persian cat
TVM prediction top-2: 287 lynx, catamount
TVM prediction top-3: 285 Egyptian cat
TVM prediction top-4: 17 jay
TVM prediction top-5: 282 tiger cat

But for intel_graphics without the AlterOpLayout optimization (i.e. opt_level < 3), the output is incorrect and random. For example:
TVM prediction top-1: 506 coil, spiral, volute, whorl, helix
TVM prediction top-2: 999 toilet tissue, toilet paper, bathroom tissue
TVM prediction top-3: 327 starfish, sea star
TVM prediction top-4: 340 zebra
TVM prediction top-5: 339 sorrel

My hardware is Intel HD Graphics 530.
Could you advise me on how to fix the accuracy? @Laurawly
Thanks.

Hi @Ajja, my current test environment is an Intel Core i5, which comes with Intel HD Graphics 530. I compared the results with MXNet output, and there was no difference for GluonCV models. I can try to download the model you linked above and see if I can reproduce your error.

Hi @Laurawly, Thanks for answering.
I also tried to run an MXNet model using this script: https://github.com/dmlc/tvm/blob/master/tutorials/frontend/from_mxnet.py
with the following changes:

```diff
diff --git a/tutorials/frontend/from_mxnet.py b/tutorials/frontend/from_mxnet.py
index d0e4c4ab..a1bb73ea 100644
--- a/tutorials/frontend/from_mxnet.py
+++ b/tutorials/frontend/from_mxnet.py
@@ -62,8 +62,8 @@ synset_path = download_testdata(synset_url, synset_name, module='data')
 with open(synset_path) as f:
     synset = eval(f.read())
 image = Image.open(img_path).resize((224, 224))
-plt.imshow(image)
-plt.show()
+#plt.imshow(image)
+#plt.show()

 def transform_image(image):
     image = np.array(image) - np.array([123., 117., 104.])
@@ -89,8 +89,8 @@ func = relay.Function(func.params, relay.nn.softmax(func.body), None, func.type_

 ######################################################################
 # now compile the graph
-target = 'cuda'
-with relay.build_config(opt_level=3):
+target = tvm.target.intel_graphics()#'cuda'
+with relay.build_config(opt_level=2):
     graph, lib, params = relay.build(func, target, params=params)

 ######################################################################
@@ -98,7 +98,7 @@ with relay.build_config(opt_level=3):
 # ---------------------------------
 # Now, we would like to reproduce the same forward computation using TVM.
 from tvm.contrib import graph_runtime
-ctx = tvm.gpu(0)
+ctx = tvm.context(str(target), 0)#tvm.gpu(0)
 dtype = 'float32'
 m = graph_runtime.create(graph, lib, ctx)
 # set inputs
@@ -108,8 +108,12 @@ m.set_input(**params)
 m.run()
 # get outputs
 tvm_output = m.get_output(0)
-top1 = np.argmax(tvm_output.asnumpy()[0])
-print('TVM prediction top-1:', top1, synset[top1])
+topk = np.argsort(tvm_output.asnumpy()[0])
+print('TVM prediction top-1:', topk[-1], synset[topk[-1]])
+print('TVM prediction top-2:', topk[-2], synset[topk[-2]])
+print('TVM prediction top-3:', topk[-3], synset[topk[-3]])
+print('TVM prediction top-4:', topk[-4], synset[topk[-4]])
+print('TVM prediction top-5:', topk[-5], synset[topk[-5]])
```

And the output is the same (282 tiger cat for opt_level=3, random output for opt_level=2). I'm using the current master branch.
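To make the comparison less eyeball-driven, a small self-contained helper (assuming both targets produce a flat class-score vector of the same length) can quantify how far one target's output drifts from the other, instead of only inspecting top-5 labels:

```python
import numpy as np

def compare_outputs(ref, test, k=5):
    """Return (top-k index overlap count, max absolute element-wise difference)."""
    ref = np.asarray(ref).ravel()
    test = np.asarray(test).ravel()
    ref_topk = set(np.argsort(ref)[-k:])
    test_topk = set(np.argsort(test)[-k:])
    overlap = len(ref_topk & test_topk)
    max_diff = float(np.max(np.abs(ref - test)))
    return overlap, max_diff

# Demo with synthetic data: identical vectors agree fully.
rng = np.random.RandomState(0)
ref = rng.rand(1000).astype('float32')
overlap, max_diff = compare_outputs(ref, ref)
print(overlap, max_diff)  # 5 0.0
```

In practice, `ref` would be the llvm output and `test` the intel_graphics output for the same input; a large `max_diff` distinguishes a numerically broken schedule from a mere tie-breaking difference in the top-k ordering.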

If you could reproduce this and take a look, that would be awesome.

Yes, I could reproduce the error on my side. I tested a few convolutions individually with opt_level = 2 and there was no problem. I guess all convolution workloads need to be tested to see which one fails and leads to the incorrect results.
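That per-workload bisection could be scripted along these lines. This is only a sketch: the runner callables here are identity stand-ins so the harness itself runs; in a real test each runner would be a graph_runtime module built from a single-conv2d Relay function for llvm and intel_graphics respectively, and the workload list below is a hypothetical subset of ResNet-50's conv2d shapes:

```python
import numpy as np

# Illustrative conv2d workloads as (in_channels, out_channels, kernel, stride, spatial);
# the real sweep would enumerate every distinct shape ResNet-50 uses.
WORKLOADS = [
    (3, 64, 7, 2, 224),
    (64, 64, 1, 1, 56),
    (64, 64, 3, 1, 56),
    (64, 256, 1, 1, 56),
]

def check_workload(run_ref, run_test, workload, rtol=1e-4, atol=1e-4):
    """Feed the same random input to both runners and report whether they agree."""
    in_ch, out_ch, kernel, stride, size = workload
    data = np.random.rand(1, in_ch, size, size).astype('float32')
    return bool(np.allclose(run_ref(data), run_test(data), rtol=rtol, atol=atol))

# Demo with identity stand-ins; real runners compute the conv on each target.
identity = lambda x: x
results = {w: check_workload(identity, identity, w) for w in WORKLOADS}
print(results)
```

Any workload where the two targets disagree then points directly at the intel_graphics schedule entry to debug.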