[RFC] Relay to ONNX

Motivation:

We want to port DL models to Relay IR. For that, we want to serialize the Relay IR to disk. Once serialized, third-party frameworks and compilers should be able to import it. We want the serialization format to be compact, portable, widely adopted, and to have a well-documented specification.

We may also want to export optimized Relay after running optimization passes on it. We see a few challenges there, which we discuss later in this RFC.

Why ONNX?

The serialization format should meet the following criteria:

  1. Widely adopted
  2. Well documented specification
  3. Import and export support in the source and target systems

ONNX best fits these criteria and hence it is chosen.

What is ONNX?

ONNX provides an open-source format for DL models. It defines an extensible computation graph model, along with definitions of built-in operators and standard data types, with a primary focus on inference. ONNX is widely supported across frameworks and hardware.

Links: https://onnx.ai, https://github.com/onnx/onnx

Design /Implementation Approach

The implementation approach is as follows:

  • An in-memory Relay module (optimized or unoptimized) will be an input
  • Get the topologically sorted list of the Relay nodes
  • Convert the Relay nodes to ONNX nodes
  • Build ONNX graph and serialize it
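The steps above can be sketched in plain Python. The toy `Node` class and op table below are hypothetical, for illustration only; the real exporter would walk `tvm.relay` expressions and emit `onnx.NodeProto` objects via `onnx.helper.make_node`:

```python
# Sketch of the export pipeline on a toy expression graph.
class Node:
    def __init__(self, op, *inputs):
        self.op = op          # operator name, e.g. "add", "nn.relu"
        self.inputs = inputs  # upstream Node objects

def topo_sort(output):
    """Post-order DFS: every node appears after all of its inputs,
    which is the ordering ONNX requires for graph.node."""
    order, visited = [], set()
    def visit(node):
        if id(node) in visited:
            return
        visited.add(id(node))
        for inp in node.inputs:
            visit(inp)
        order.append(node)
    visit(output)
    return order

# Per-operator converter table: Relay op name -> ONNX op type
# (a tiny illustrative subset).
RELAY_TO_ONNX = {"add": "Add", "nn.relu": "Relu", "input": None}

def convert(output):
    onnx_nodes = []
    for node in topo_sort(output):
        op_type = RELAY_TO_ONNX[node.op]
        if op_type is not None:   # graph inputs are not graph nodes
            onnx_nodes.append(op_type)
    return onnx_nodes

x = Node("input")
graph = Node("nn.relu", Node("add", x, x))
print(convert(graph))  # ['Add', 'Relu']
```

In the last step, the ordered node list would be handed to `onnx.helper.make_graph`/`make_model` and serialized with the ONNX protobuf API.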


Strategy for supporting ONNX opsets

  • The ONNX operator set evolves over time, so we will have to add support for different versions.
  • Initially, we will support opset version 11.
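One possible shape for the multi-opset support is a converter registry keyed by the opset version that introduced or last changed an op; dispatch picks the newest registered converter not exceeding the target opset, mirroring how ONNX itself resolves op versions. A minimal sketch (the registry, decorator, and the TopK converters are illustrative, not the PR's actual API):

```python
# Opset-aware converter dispatch (hypothetical registry).
CONVERTERS = {}  # op name -> {since_opset: converter_fn}

def register(op, since_opset):
    def wrap(fn):
        CONVERTERS.setdefault(op, {})[since_opset] = fn
        return fn
    return wrap

def lookup(op, target_opset):
    versions = CONVERTERS[op]
    eligible = [v for v in versions if v <= target_opset]
    if not eligible:
        raise KeyError(f"{op!r} needs opset >= {min(versions)}")
    return versions[max(eligible)]

@register("topk", since_opset=1)
def topk_v1(node):
    return "TopK-1"   # k was a static attribute before opset 10

@register("topk", since_opset=10)
def topk_v10(node):
    return "TopK-10"  # k became a dynamic input in opset 10

print(lookup("topk", 11)(None))  # TopK-10
print(lookup("topk", 9)(None))   # TopK-1
```

New opsets would then only require registering additional converters for the ops whose schemas changed, leaving older versions intact.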

Draft PR

The PR is created with support for a subset of operators. A few models from the ONNX, MXNet, and TF-Slim model zoos have been tested. For details, limitations, and TODOs, refer to the PR below. https://github.com/apache/incubator-tvm/pull/5052

Challenges

  • Support for higher-order features

    • ONNX does not support adding user-defined functions, although it does have some predefined functions. So, we will not be able to map higher-order functions from Relay to ONNX. A proposal to add support for functions was not accepted: https://github.com/onnx/onnx/issues/48
  • Support for Operator Fusion pass

    • We may want to optimize the model using optimization passes before exporting it to ONNX. When we run the FuseOps pass on Relay, each subgraph of nodes that can be fused together gets wrapped into an inline function. It will be difficult to support such inline functions for the reasons listed in the point above. Also, the target runtime would need support for running fused ops.
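A toy illustration of the fusion problem, assuming fused functions are surfaced to the exporter under synthetic names (the `fused_` prefix and the converter table are hypothetical): a per-operator table has no single ONNX op to map a fused add+relu subgraph to, so the exporter must either inline its body or reject it.

```python
# Per-operator conversion breaks down on fused functions, because
# ONNX has no user-defined functions to map them to.
RELAY_TO_ONNX = {"add": "Add", "nn.relu": "Relu"}

def export_op(op):
    if op.startswith("fused_"):
        # A fused primitive function is opaque to a per-op table;
        # we would have to recurse into its body or give up here.
        raise NotImplementedError(f"cannot map fused function {op!r}")
    return RELAY_TO_ONNX[op]

print(export_op("add"))  # Add
try:
    export_op("fused_add_relu")
except NotImplementedError as e:
    print(e)
```

One workaround, at the cost of losing the optimization, would be to export the unfused module and leave fusion to the target runtime.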

CC: @jroesch, @tqchen, @zhiics, @kevinthesun


@jroesch, @tqchen, Regarding the naming convention discussion on the PR, I agree that ‘converter’ does not seem to be the correct word. You suggested either ‘export’ or ‘target’. I think ‘export’ should be used, as it is more in line with other DL frameworks. Please let me know your thoughts.

Thanks for the proposal. It would be great to list the alternatives with labels (see the Target and Attributes RFC for an example) and discuss their pros and cons, so others can share their opinions easily.

Given that ONNX is not rich enough to cover all the operators, such a conversion might be limited to a subset. It would be great to document potential usage cases.

Here are some high level thoughts:

  • It does not seem to make sense to use ONNX as a serialization format, since we can always store the functions in the native format (with complete coverage)
  • If we have a target runtime with the ONNX format in mind, then we should list those runtimes and discuss the scope of the coverage needed.