How to call DNNL under TVM version 0.7

Hello, I'm using TVM version 0.7 and need to call the DNNL library, but I tried the following statements and found that none of them are supported:

  mod = relay.build_extern(mod, "dnnl")
  mod = relay.build_extern_compiler(mod, "dnnl")
  mod = relay.transform.AnnotateExternalCompiler("xx")(mod)

What should I use instead? Looking forward to anyone's help. Thank you!

TVM currently only supports DNNL for dense. To use it, all you have to do is first enable USE_MKLDNN in config.cmake, and then specify your target with llvm -libs=cblas.

Thanks comaniac. Is it possible for me to set the target like this: target = 'llvm -mcpu=skylake-avx512 -libs=cblas'? I plan to use AutoTVM to tune the conv2d but keep the dense on DNNL.

Yes, you can definitely do that. Specifically, suppose your model contains one conv2d and one dense; with llvm -mcpu=skylake-avx512 -libs=cblas you will likely get the following AutoTVM tasks:

  1. conv2d_NCHWc.x86
  2. dense_nopack.x86
  3. dense_pack.x86
  4. dense_cblas.x86

Then you only need to tune conv2d_NCHWc.x86. After tuning, you compile the model with ApplyHistoryBest and the tuning log file. Relay op strategy will select conv2d_NCHWc.x86 for the conv2d, since its tuning records are in the log. It will select dense_cblas.x86 for the dense, because there are no dense records in the log and dense_cblas.x86 has the highest priority by default.
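The selection behavior described above can be illustrated with a small plain-Python sketch. This is a toy model of the priority rule, not TVM's actual op-strategy code, and the priority numbers are invented:

```python
# Toy model of Relay op strategy selection: prefer an implementation that
# has tuning records in the log; otherwise fall back to the implementation
# with the highest priority. Priorities are illustrative, not TVM's values.
STRATEGIES = {
    "conv2d": [("conv2d_NCHWc.x86", 10)],
    "dense": [
        ("dense_nopack.x86", 5),
        ("dense_pack.x86", 10),
        ("dense_cblas.x86", 15),  # -libs=cblas registers this with higher priority
    ],
}

def select_impl(op, tuned_records):
    """Pick a tuned implementation if one exists, else the highest-priority one."""
    impls = STRATEGIES[op]
    tuned = [(name, prio) for name, prio in impls if name in tuned_records]
    candidates = tuned if tuned else impls
    return max(candidates, key=lambda item: item[1])[0]

# Only conv2d was tuned, as in the scenario above.
log = {"conv2d_NCHWc.x86"}
print(select_impl("conv2d", log))  # conv2d_NCHWc.x86 (has a tuning record)
print(select_impl("dense", log))   # dense_cblas.x86 (highest priority, no records)
```

With this model, tuning dense later and adding its records to the log would flip the dense selection to the tuned implementation, which mirrors how ApplyHistoryBest behaves.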

Thanks @comaniac. From the discussion in "TVM and BLAS libraries" with @haichen, it seems we still need set(USE_BLAS mkl) to use MKLDNN. But I only have MKLDNN installed on the system, without MKL. How should I set this option?

You should be able to use MKLDNN (a.k.a. DNNL, oneDNN) by setting USE_MKLDNN to ON when building TVM, as the following PR did:

Thanks comaniac. I changed config.cmake to point to the MKLDNN install path: set(USE_MKLDNN /home/lesliefang/tvm/mkldnn_install). Then I rebuilt TVM and changed the target to llvm -mcpu=skylake-avx512 -libs=cblas in my model-building script. Unfortunately, when running the workload, it core-dumps with the error:

TVMError: Check failed: ret == 0 (-1 vs. 0) : Check failed: f != nullptr: Cannot find function tvm.contrib.cblas.matmul in the imported modules or global registry

I will dig into it deeper, but do you have any suggestions?

By looking at cmake/modules/contrib/BLAS.cmake, it seems like you have to enable both USE_BLAS and USE_MKLDNN.

cc @haichen could you confirm on this?

I tried installing MKL and setting the MKL path, and it works with MKLDNN now :grinning: However, I still don't see why we have to enable MKL to use the MKLDNN GEMM.
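For reference, the configuration that ended up working can be sketched as follows. The MKLDNN path is the one from the post above; adjust both entries to match your own installation:

```cmake
# build/config.cmake -- both options are needed for the cblas dense path,
# since cmake/modules/contrib/BLAS.cmake checks USE_BLAS and USE_MKLDNN together.
set(USE_BLAS mkl)                                    # use MKL as the BLAS backend
set(USE_MKLDNN /home/lesliefang/tvm/mkldnn_install)  # MKLDNN install prefix
```

After changing these options, re-run cmake and rebuild TVM so the contrib modules are picked up.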

Thanks comaniac. Sorry, I'm a newcomer to TVM; I realize I wasn't describing my problem accurately enough.

  1. I want to use the DNNL library defined in src/relay/backend/contrib/dnnl. What is the relationship between USE_MKLDNN and USE_DNNL_CODEGEN? Do I need to enable USE_DNNL_CODEGEN?

  2. If the DNNL library compiles successfully, how do I call it? I saw TVM_REGISTER_GLOBAL("relay.ext.dnnl").set_body_typed(DNNLCompiler); in the source code, so I assumed the interface would be mod = relay.ext.dnnl(mod). But the relay.ext statement is not supported in the current TVM library, is that right?

Sorry for the confusion. The file you mentioned (src/relay/backend/contrib/dnnl) is for BYOC infra demonstration purposes rather than official library support, although it should be able to run most models after this PR (https://github.com/apache/incubator-tvm/pull/5919) is merged; we will have a tutorial for it later on. It is configured by USE_DNNL_CODEGEN, as you mentioned.

On the other hand, the DNNL support we are talking about in this thread only supports dense ops. It is configured by USE_BLAS and USE_MKLDNN.
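As background on the relay.ext.dnnl question above: TVM_REGISTER_GLOBAL exposes the C++ DNNLCompiler through a name-based global registry, which the compilation pipeline looks up when it hits a partitioned DNNL subgraph, rather than as a Python attribute you call directly. The mechanism can be sketched with a toy registry in plain Python (this is an illustration, not TVM's actual implementation):

```python
# Toy version of a name-based global function registry, mimicking how
# TVM_REGISTER_GLOBAL("relay.ext.dnnl") exposes DNNLCompiler: the codegen
# is looked up by name at compile time, not called as relay.ext.dnnl(mod)
# from user code.
_GLOBAL_REGISTRY = {}

def register_global(name):
    """Decorator that records a function under a registry name."""
    def _register(func):
        _GLOBAL_REGISTRY[name] = func
        return func
    return _register

def get_global_func(name):
    """Look up a registered function; mirrors the error seen when one is missing."""
    if name not in _GLOBAL_REGISTRY:
        raise RuntimeError(f"Cannot find function {name} in the global registry")
    return _GLOBAL_REGISTRY[name]

@register_global("relay.ext.dnnl")
def dnnl_compiler(subgraph):
    # Stand-in for the real DNNL codegen.
    return f"compiled({subgraph})"

# The compiler pipeline would do something like:
compiler = get_global_func("relay.ext.dnnl")
print(compiler("dnnl_subgraph_0"))  # compiled(dnnl_subgraph_0)
```

Note that the core-dump earlier in this thread ("Cannot find function tvm.contrib.cblas.matmul in the ... global registry") is exactly this kind of lookup failing, because the cblas function was never compiled in.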

Thank you for your reply, comaniac. I still have a few questions at the moment.

  1. Based on the current BYOC DNNL source code, are we not able to run a complete model yet?

  2. The current BYOC DNNL codegen turns each subgraph into a C source code file, so for a model with multiple subgraphs we will get multiple C source code files, is that right?

  3. How to call the generated C source code file?

With PR #5919 (https://github.com/apache/incubator-tvm/pull/5919) merged today, we now serialize the graph to a JSON format instead of C source code. You can refer to the following unit test for how to run MobileNet V2 with DNNL. However, as noted in the PR description, there are still some TODOs that hurt end-to-end performance, so you may not get the best results for now.
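The idea of serializing a partitioned subgraph to JSON, rather than emitting C source, can be illustrated with a toy sketch in plain Python. The node layout below is invented for illustration and is not TVM's actual JSON graph runtime format:

```python
import json

# Toy subgraph: a couple of ops that a DNNL partition might contain.
subgraph = {
    "symbol": "dnnl_0",
    "nodes": [
        {"op": "input", "name": "data"},
        {"op": "nn.conv2d", "inputs": [0]},
        {"op": "nn.relu", "inputs": [1]},
    ],
}

# At compile time the subgraph is serialized once...
blob = json.dumps(subgraph)

# ...and a JSON-based runtime parses it back and dispatches each node to
# the corresponding DNNL primitive, instead of compiling generated C code.
loaded = json.loads(blob)
print(loaded["symbol"], len(loaded["nodes"]))  # dnnl_0 3
```

The practical upshot, as the PR description explains, is that the DNNL path no longer produces one C source file per subgraph.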

Hi, I updated the TVM source code as you suggested, but when running tvm/tests/python/relay/test_pass_partition_graph.py I get the error "ModuleNotFoundError: No module named 'tvm.relay.op.contrib.register'". What should I do?

That should not happen. The module tvm.relay.op.contrib.register was added in April and has not changed since. Please make sure that 1) you have set(USE_DNNL_CODEGEN ON) in build/config.cmake, 2) you have re-compiled the latest TVM, and 3) your PYTHONPATH points to the TVM you just compiled.

With all above verified, the following code should work:

ubuntu> python
Python 3.6.9 (default, Apr 18 2020, 01:56:04)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tvm
>>> from tvm.relay.op.contrib.register import get_pattern_table
>>>

Thank you very much! I've found my mistake.