[RFC] Relay to ONNX

Hi @tqchen and @maheshambule,

Refer to the TVM tutorial Bring Your Own Codegen To TVM, which details how to create a self-defined C source module codegen.

However, ONNX is not a C source module, so we need to define an additional ONNX module node for codegen. The following are the steps I took to create the ONNX module codegen.

  1. Create ONNXModuleCodegen to traverse the Relay IRModule and convert each Relay op to an ONNX op. (see codegen.cc)
class ONNXModuleCodegen {
 public:
  ONNXModuleCodegen() {}

  runtime::Module CreateONNXModule(const ObjectRef& ref) {
    auto mod = Downcast<IRModule>(ref);
    String codes = (*to_onnx_)(mod);
    /* Here, use String instead of std::string, because some byte info
     * would be lost when calling a PackedFunc object.
     */
    const auto* pf = runtime::Registry::Get("runtime.ONNXModuleCreate");
    CHECK(pf != nullptr) << "Cannot find onnx module to create the external runtime module";
    return (*pf)(codes, "onnx");
  }

 private:
  /*!
   * \brief The Python function that converts a Relay module to an ONNX module.
   * \return byte array -> String
   */
  const PackedFunc* to_onnx_ = runtime::Registry::Get("tvm.relay.converter.to_onnx");
};

Register a global function, “relay.ext.onnx”, whose body is a wrapper function that creates the ONNX module.

runtime::Module ONNXCompiler(const ObjectRef& ref) {
  ONNXModuleCodegen onnx;
  return onnx.CreateONNXModule(ref);
}
TVM_REGISTER_GLOBAL("relay.ext.onnx").set_body_typed(ONNXCompiler);
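The cross-language dispatch above relies on TVM's name-based global registry: the C++ side looks up “tvm.relay.converter.to_onnx” (registered from Python), and the Python-registered “relay.ext.onnx” is found by the compile engine. A minimal, self-contained sketch of that registry pattern (illustrative only, not TVM's actual implementation; all names here are hypothetical):

```python
# Sketch of a name-based function registry, mimicking the pattern of
# tvm.register_func / runtime::Registry::Get. Not TVM's real API.

_REGISTRY = {}

def register_func(name):
    """Decorator that stores a function under a global string key."""
    def wrap(fn):
        _REGISTRY[name] = fn
        return fn
    return wrap

def get_func(name):
    """Look up a function by name, mirroring runtime::Registry::Get."""
    return _REGISTRY.get(name)

@register_func("relay.ext.onnx")
def onnx_compiler(ref):
    # A real implementation would build an ONNX runtime module from `ref`.
    return f"onnx module for {ref}"

# The compile pipeline can now dispatch purely by the registered name,
# regardless of which language registered the function.
compiler = get_func("relay.ext.onnx")
print(compiler("my_subgraph"))  # → onnx module for my_subgraph
```

This is why the C++ codegen can call into a Python converter: both sides agree only on the string key, not on any linkage.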

Instead of writing the op conversions in C++, use register_func to register a global function, “tvm.relay.converter.to_onnx”, and write the op conversions in Python to convert the Relay module to an ONNX module. (see converter/onnx.py)

@tvm.register_func("tvm.relay.converter.to_onnx")
def convert_to_onnx(model):
    ...
    opset = onnx.defs.onnx_opset_version()  # get the supported opset version
    data = ""
    global_vars = model.get_global_vars()
    for global_var in global_vars:
        func = model[global_var]
        sub_model = tvm.IRModule().from_expr(func.body)
        sub_model = fuse_ops(sub_model)
        func = sub_model["main"]
        graph_name = global_var.name_hint
        # Traverse the Relay function and record the nodes.
        sub_onnx_model = ONNXGenerator({}, opset, graph_name, "").to_onnx(func)
        bytes_data = get_onnx_bytes(sub_onnx_model)
        data += graph_name + "<" + str(bytes_data)[2:-1] + ">"
    return data
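The converter packs every subgraph as `graph_name<escaped_bytes>` into a single string, which the C++ side later splits on `>` and `<` (see SaveToFile below). A small pure-Python sketch of that round trip (helper names are my own, not part of the PR):

```python
# Sketch of the ad-hoc "name<payload>" packing produced by convert_to_onnx
# and the corresponding unpacking done on the C++ side. Hypothetical helpers.

def pack_graphs(graphs):
    """graphs: dict of graph_name -> payload string (escaped ONNX bytes)."""
    return "".join(name + "<" + payload + ">" for name, payload in graphs.items())

def unpack_graphs(code):
    """Split the packed string back into {name: payload} pairs."""
    result = {}
    for chunk in code.split(">"):
        if not chunk:
            continue
        name, payload = chunk.split("<", 1)
        result[name] = payload
    return result

packed = pack_graphs({"main": "abc", "sub_0": "def"})
assert packed == "main<abc>sub_0<def>"
assert unpack_graphs(packed) == {"main": "abc", "sub_0": "def"}
```

Note that this format only works because the payload is escaped first: a raw `<` or `>` inside the ONNX bytes would corrupt the split, which is why the converter stores the repr-escaped text rather than the raw bytes.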
  2. Create ONNXModuleNode, a subclass of ModuleNode, to create a specific runtime module. (see source_module.cc)
class ONNXModuleNode : public runtime::ModuleNode {
 public:
  ONNXModuleNode(std::string code,
                 std::string fmt)
      : code_(code), fmt_(fmt) {}

  const char* type_key() const {
    return "onnx";
  }

  PackedFunc GetFunction(
      const std::string& name,
      const ObjectPtr<Object>& sptr_to_self) final {
    LOG(FATAL) << "ONNX Source module cannot execute, to get executable module"
               << " build TVM with \'" << fmt_ << "\' runtime support";
    return PackedFunc();
  }

  std::string GetSource(const std::string& format) final {
    return code_;
  }
  ...
  void SaveToFile(const std::string& file_name,
                  const std::string& format) final {
    std::string fmt = GetFileFormat(file_name, format);
    std::string folder;
    size_t pos = file_name.find_last_of("\\/");
    if (pos != std::string::npos) {
      folder = file_name.substr(0, pos + 1);
    } else {
      folder = file_name + "/";
    }
    auto datas = Split(code_, '>');
    if (fmt == "onnx") {
      CHECK_NE(code_.size(), 0);
      std::stringstream ss;
      for (auto data : datas) {
        auto split_data = Split(data, '<');
        ss << folder << split_data[0].c_str() << "." << fmt;
        SaveBinaryToFile(ss.str(), ConvertEscape(split_data[1]));
        ss.str("");
        ss.clear();
      }
    } else {
      CHECK_EQ(fmt, fmt_)
          << "Can only save to format=" << fmt_;
    }
  }
...
runtime::Module ONNXModuleCreate(String code, std::string fmt) {
  /* Here, use String instead of std::string, because some byte info
   * would be lost when calling a PackedFunc object.
   */
  auto n = make_object<ONNXModuleNode>(code, fmt);
  return runtime::Module(n);
}
TVM_REGISTER_GLOBAL("runtime.ONNXModuleCreate")
.set_body_typed(ONNXModuleCreate);
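The reason String is needed here, and the reason SaveToFile calls ConvertEscape, is that the Python converter serializes the ONNX bytes with `str(bytes_data)[2:-1]`, i.e. Python repr escapes, and the C++ side must turn those escape sequences back into raw bytes. A pure-Python sketch of that round trip (unescape() is a stand-in for ConvertEscape, not the actual implementation):

```python
# Sketch of the byte-escaping round trip between convert_to_onnx (Python)
# and ONNXModuleNode::SaveToFile (C++). unescape() approximates what
# ConvertEscape does; it is not TVM's actual code.

def escape(raw: bytes) -> str:
    # str(b'\x00ONNX')[2:-1] strips the leading b' and trailing ',
    # leaving repr-style escapes such as \x00 as plain text.
    return str(raw)[2:-1]

def unescape(text: str) -> bytes:
    # Reverse the repr-style escapes back into raw bytes.
    return text.encode("latin-1").decode("unicode_escape").encode("latin-1")

raw = b"\x08\x07ONNX\x00"
assert unescape(escape(raw)) == raw
```

Because the escaped text contains no raw `<` or `>` bytes, it can safely travel inside the `name<payload>` packing and through the PackedFunc String argument.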
  3. Create a cmake file for ONNX and include it in CMakeLists.txt. When the user wants to use the ONNX module codegen, set USE_ONNX_CODEGEN to “ON” and build the TVM source code. (see ONNX.cmake)
if(USE_ONNX_CODEGEN STREQUAL "ON")
  file(GLOB ONNX_RELAY_CONTRIB_SRC src/relay/backend/contrib/onnx/codegen.cc)
  list(APPEND COMPILER_SRCS ${ONNX_RELAY_CONTRIB_SRC})
  message(STATUS "Build with ONNX codegen.")
endif()

In addition, I have updated my code to the April 28 version. (see source code and example code)

@smallcoscat, Thanks. I also followed this tutorial and was able to create an ONNX codegen for the external runtime. Relevant code in this PR: https://github.com/maheshambule/tvm/pull/9

However, as suggested by @tqchen, when I tried to implement ‘ONNX’ as a target (and not as an external codegen), I ran into a problem. The Relay IR gets converted to lowered TIR, so the target codegen module receives lowered TIR. Since ONNX is a high-level IR, converting lowered TIR to ONNX is difficult.

@tqchen, Please let us know: is there a way to pass high-level Relay IR to the target codegen? If that is currently not possible, should we go ahead with the external codegen approach?

We don’t have to strictly go through the TIR part, as a target only means IRModule -> runtime::Module. It is totally fine for a target to take in an IRModule that contains Relay functions. I agree that it would be useful to have an ONNXModule as a runtime module.

@tqchen, Just to be on the same page, could you please confirm the following? We do NOT need to register ONNXModule as “target.build.onnx”. If registered this way, it would get invoked from here when we specify the target as “onnx”.

Instead, we need to register it as an external codegen, “relay.ext.onnx”. If registered this way, it gets invoked from here when the graph is annotated with “onnx”.

I don’t think we need to do that. Just like the case of SourceModule, they are not registered anywhere.

As the code base is refactored further, we could introduce it to the target build, once it is clear that the ONNX case requires the IRModule to contain Relay functions instead of TIR functions.

Ok. Thanks for the clarification. I will update the PR.

@maheshambule @smallcoscat

Hi! Thank you for implementing this feature.

Could you give us some example code, please?

I want to know how to export from TFLite → Relay (optimized) → ONNX → TFLite.

I find this project great: a compiler that takes ONNX as input can quickly use Relay graph optimizations without writing a TVM backend. Is there any plan to support more ops?

@smallcoscat @tqchen

  • Relay to ONNX is implemented in Python (def to_onnx(relay_ir, params, name, opset_version=11, path=None):), not C++.
  • Do you plan to implement Relay to ONNX in C++?
  • We have no Python environment and cannot call Python functions in our environment, so we must write our ops (Relay to ONNX) in C++. We hope TVM can provide some C++ examples.