[DISCUSS] Adding a PyTorch Frontend

@tqchen @alexwong Translating TorchScript to Relay sounds very interesting. TorchScript also seems to be the direction the PyTorch devs are heading. Quoting from a comment by one of the PyTorch devs:

TorchScript is intended as a replacement for PyTorch -> ONNX -> Caffe2 conversion. We think the experience is overall better, as we can precisely preserve the semantics of your model code and you don’t have to work with two separate frameworks. That said, we will continue to support a PyTorch → ONNX conversion step, as it is a standard that we want PyTorch as a framework to interoperate with well. ONNX covers CoreML, ONNX.js etc. which are pretty useful (from what we heard).

For me, if I have to go through ONNX, I have to wait for three pieces to be in place:

  • ONNX standardization of operators that I care about
  • PyTorch to ONNX exporter
  • ONNX to Relay frontend

If instead we consider a direct PyTorch to TVM route, we only need to implement the Relay frontend. This alone makes the ONNX route a deal breaker for me. The particular use case I have is quantization - I want to leverage PyTorch’s quantization capability, but I don’t know how long I would have to wait before the three pieces above become ready for new quantization methods that the PyTorch community develops.

2 Likes

@t-vi

I don’t think PyTorch -> TVM vs. PyTorch -> ONNX is the fair comparison here (it should rather be PyTorch -> ONNX -> TVM). This removes a step from the existing PyTorch -> ONNX -> TVM path. I should also clarify that it’s definitely not my intention to dismiss the PyTorch/TVM project; I just think it’s a separate enough use case that it’s worth having a native PyTorch parser in TVM.

As for how this will fit in with @jonso’s fallback feature, let me read through the proposal and get a better idea :slight_smile:

1 Like

I think @masahi’s comment above illustrates the pain point. MLIR, ONNX, and Relay are all some sort of IR, and it would definitely be nicer to go directly from PT rather than through ONNX. For future PT operator support, we just have to update this parser rather than hope ONNX covers it and then update the ONNX parser.

I’m also not an ONNX person, but the ONNX parser in TVM doesn’t support control flow. From looking at some docs and code (very quickly, so correct me if I’m wrong!) https://pytorch.org/docs/stable/onnx.html, https://github.com/pytorch/pytorch/blob/66f2bba8527d941c7d73d3cae9e9576c601587a6/torch/jit/init.py#L249, the ONNX exporter uses a trace to export a PyTorch graph, which means it doesn’t preserve control flow. This gives more reason to move to a native parser for PT, as we can implement such support in the future rather than wait for the PT -> ONNX path to be updated.
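
For illustration (a minimal sketch assuming the standard torch.onnx.export API; the Gate module and file name are made up), the exporter needs example inputs precisely because it traces the model, so a data-dependent branch gets baked in for that one input:

import torch

class Gate(torch.nn.Module):
    def forward(self, x):
        # Data-dependent branch: a trace only records the path taken
        # for the example input, so the other branch is lost.
        if x.sum() > 0:
            return x * 2
        return x - 1

# torch.onnx.export traces the module with the example input below,
# so the exported graph contains only the `x * 2` branch.
torch.onnx.export(Gate(), torch.ones(3), "gate.onnx")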

1 Like

@alexwong Since traced models and TorchScript-ed models share the same PyTorch IR (correct me if I am wrong, @t-vi), we can use your WIP implementation as is and add control flow ops on top to support loading TorchScript-ed models. Is that right?

TracedModule is a subclass of ScriptModule, so if you have conversion from TracedModule working, you are already converting a subset of ScriptModule.

1 Like

Yes, that is how I envision it. Both tracing and scripting are ways to go from eager-mode PyTorch to TorchScript. While the current implementation grabs a trace, in the future we can switch to scripting, which among other things yields control flow ops that we can then support.
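
As a rough illustration of the two entry points (a minimal sketch assuming the standard torch.jit API; the Net module is just a placeholder):

import torch

class Net(torch.nn.Module):
    def forward(self, x):
        return x * 2 + 1

model = Net()
example = torch.rand(4)

# Tracing runs the module once on example inputs and records the ops.
traced = torch.jit.trace(model, example)

# Scripting compiles the Python source and preserves control flow.
scripted = torch.jit.script(model)

# Both produce ScriptModules that share the same TorchScript IR.
assert isinstance(traced, torch.jit.ScriptModule)
assert isinstance(scripted, torch.jit.ScriptModule)
print(traced.graph)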

We can also consider loading a serialized *.pt ScriptModule (similar to the ONNX workflow, where we assume serialization is already done by the user). That way we don’t have to care whether the model comes from scripting or tracing.
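
Something along these lines, for example (a sketch assuming the torch.jit.save / torch.jit.load API; the Net module and file name are placeholders):

import torch

class Net(torch.nn.Module):
    def forward(self, x):
        return x + 1

# User side: script (or trace) the model and serialize it to a .pt file.
torch.jit.save(torch.jit.script(Net()), "model.pt")

# Frontend side: load the ScriptModule back; at this point we no longer
# need to know whether it came from tracing or scripting.
loaded = torch.jit.load("model.pt")
print(loaded.graph)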

@masahi That is something I thought about and certainly an option.

@t-vi I would PM this but can’t for some reason (maybe because basic trust level -> basic trust level PM is not allowed). I noticed you withdrew your comments which is too bad. I thought it was a good discussion to have and would be helpful for those joining the convo to read through.

Thanks everyone for the good discussion. I want to say that the community welcomes different opinions, even conflicting views. This is exactly what the Apache way is about: resolving technical discussions in a diplomatic way and getting to what is best for the users.

Specifically, we would like to separate how we do it (the technical pros and cons) from the decision (what we finally decide).

Most of the technical discussion can proceed with as many opinions as possible, and people can agree on the result. The decision is then something that can be suggested by summarizing the points from both sides.

4 Likes

This is good work.

I agree this part is not ideal. I think the custom TVM repo thing is totally unnecessary, and I’ll check whether we can just go back to using the upstream TVM repo. As for the conversion code, ideally I think we should make the JIT IR -> Relay conversion code a reusable library, preferably in C++; then it’s easier to bind to Python. But that’s not the current state of the pytorch/tvm repo.

I can see where this proposal is coming from. There are different use scenarios, and for us we would like the PyTorch framework to drive the outer layer of the inference run and use TVM as a sort of acceleration framework (https://sampl.cs.washington.edu/tvmconf/slides/2019/E05-Hao-Lu-Ansha-Yu.pdf). But I can totally see a use case where we just take a pt model, convert it, and run it in the Relay VM without any other runtime dependency.

I think the WIP PR is good. And since we really just have TorchScript, where tracing and scripting are just different ways of converting the Python program to TorchScript, starting from the simpler traced model is a good plan.

3 Likes

@alexwong @tqchen

I’ve added support for translating TorchScript If and Loop nodes to Relay in my modified version of the PyTorch parser based on @alexwong’s PR. I uploaded test cases along with a standalone PyTorch frontend that can be run against the TVM upstream master branch.

For example, given a PyTorch module with a conditional and a loop,

import torch

class LoopWithIf(torch.nn.Module):
    def forward(self, inp):
        a = inp
        for i in range(inp.size(0)):
            b = a * 2
            b = a + b
            if b.sum() > 0.0:
                a += b
            else:
                a -= b
        return a
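
For reference, one can obtain this IR by compiling the module with torch.jit.script and printing its graph (a minimal sketch; assumes the module above is in scope):

script_module = torch.jit.script(LoopWithIf())
print(script_module.graph)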

The TorchScript compiler generates the following IR. Note the prim::If and prim::Loop nodes.

graph(%self : __torch__.LoopWithIf,
      %inp.1 : Tensor):
  %2 : None = prim::Constant()
  %3 : int = prim::Constant[value=1]()
  %4 : bool = prim::Constant[value=1]() # dynamic_test.py:64:8
  %5 : int = prim::Constant[value=0]() # dynamic_test.py:64:32
  %6 : int = prim::Constant[value=2]() # dynamic_test.py:65:20
  %7 : float = prim::Constant[value=0]() # dynamic_test.py:67:25
  %8 : int = aten::size(%inp.1, %5) # dynamic_test.py:64:23
  %a : Tensor = prim::Loop(%8, %4, %inp.1) # dynamic_test.py:64:8
    block0(%i : int, %a.15 : Tensor):
      %b.1 : Tensor = aten::mul(%a.15, %6) # dynamic_test.py:65:16
      %b.3 : Tensor = aten::add(%a.15, %b.1, %3) # dynamic_test.py:66:16
      %14 : Tensor = aten::sum(%b.3, %2) # dynamic_test.py:67:15
      %15 : Tensor = aten::gt(%14, %7) # dynamic_test.py:67:15
      %16 : bool = aten::Bool(%15) # dynamic_test.py:67:15
      %a.14 : Tensor = prim::If(%16) # dynamic_test.py:67:12
        block0():
          %a.4 : Tensor = aten::add_(%a.15, %b.3, %3) # dynamic_test.py:68:16
          -> (%a.4)
        block1():
          %a.7 : Tensor = aten::sub_(%a.15, %b.3, %3) # dynamic_test.py:70:16
          -> (%a.7)
      -> (%4, %a.14)
  return (%a)

My parser can translate the above IR to the Relay equivalent below.

v0.0.4
def @main(%X: Tensor[(10, 20), float32]) -> Tensor[(10, 20), float32] {
  %9 = (
    let %while_loop: fn (int32, Tensor[(10, 20), float32]) -> (int32, Tensor[(10, 20), float32]) = fn (%i: int32, %a.15: Tensor[(10, 20), float32]) -> (int32, Tensor[(10, 20), float32]) {
      %0 = greater_equal(%i, 1 /* ty=int32 */) /* ty=bool */;
      %1 = less_equal(%i, 10 /* ty=int32 */) /* ty=bool */;
      %2 = logical_and(%0, %1) /* ty=bool */;
      if (%2) {
        %3 = add(%i, 1 /* ty=int32 */) /* ty=int32 */;
        %4 = multiply(%a.15, 2f /* ty=float32 */) /* ty=Tensor[(10, 20), float32] */;
        %5 = add(%a.15, %4) /* ty=Tensor[(10, 20), float32] */;
        %6 = sum(%5) /* ty=float32 */;
        %7 = greater(%6, 0f /* ty=float32 */) /* ty=bool */;
        %8 = if (%7) {
          add(%a.15, %5) /* ty=Tensor[(10, 20), float32] */
        } else {
          subtract(%a.15, %5) /* ty=Tensor[(10, 20), float32] */
        };
        %while_loop(%3, %8) /* ty=(int32, Tensor[(10, 20), float32]) */
      } else {
        (%i, %a.15)
      }
    };
    %while_loop
  );
  %10 = %9(1 /* ty=int32 */, %X) /* ty=(int32, Tensor[(10, 20), float32]) */;
  %10.1
}

The conditional maps to Relay If, and the loop is translated to a conditional plus tail recursion via relay.loop.while_loop(...). The translation is straightforward and takes only about 50 lines in my parser.
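
For readers unfamiliar with the pattern, here is a rough sketch (not my parser’s actual code; the shapes and the helper-free construction are just for illustration) of how a counted loop can be expressed in Relay as a conditional plus a tail-recursive, let-bound function:

from tvm import relay

# Loop counter, accumulator, and a variable that will be let-bound
# to the recursive loop function itself.
i = relay.var("i", shape=(), dtype="int32")
acc = relay.var("acc", shape=(10, 20), dtype="float32")
loop = relay.var("while_loop")

# If the condition holds, update the state and make a tail call;
# otherwise return the accumulator.
cond = relay.less(i, relay.const(10, "int32"))
recurse = relay.Call(loop, [relay.add(i, relay.const(1, "int32")),
                            relay.add(acc, acc)])
loop_fn = relay.Function([i, acc], relay.If(cond, recurse, acc))

# Bind the recursive function with Let and kick off the loop at i = 0.
x = relay.var("x", shape=(10, 20), dtype="float32")
main = relay.Function([x], relay.Let(loop, loop_fn,
                                     relay.Call(loop, [relay.const(0, "int32"), x])))
print(main)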

5 Likes

Really nice work @masahi!

1 Like

Hi all,

The PR by @alexwong at https://github.com/apache/incubator-tvm/pull/4497 is close to being ready for merge.

At this point I see no good reason not to add this to TVM, so when it becomes ready I’ll go ahead and merge it. Later I’ll send control flow support to begin work on scripted models. Thanks.

4 Likes

This is really interesting!

I’m thinking this could enable a straightforward implementation of iterative retraining for networks that will be quantized.

1 Like

Hi All,

I am interested in this work, and I am wondering: what is the status of the PyTorch --> Relay IR conversion?

Thanks, S.

NotImplementedError: The following operators are not implemented: [‘aten::values’]

I don’t think the PyTorch frontend supports sparse tensors at this time.

Add a Dict type in Relay?

Ah, sorry. I had checked ATen’s values function. But you already found the thread discussing data types in the runtime / IR, so that bit would seem to come first. :slight_smile:

1 Like

[Frontend][PyTorch] NotImplementedError: The following operators are not implemented: [‘aten::is_floating_point’, ‘aten::true_divide’]