Difference between Relay and NNVM

Hi, I tried to read everything I could find about TVM.

But I am still confused about Relay and NNVM:

In my understanding, Relay is the IR, based on Halide, which is used for most optimizations in TVM. But what exactly is NNVM? I can’t find any information about it in the documentation.

Relay is the higher-level IR that expresses models in terms of operators such as conv2d or dense. Underneath it there’s a lower-level IR called TIR, which operators like conv2d are lowered to. TIR looks quite similar to C if you print it, and it’s what is ultimately lowered to LLVM IR, OpenCL, CUDA, etc. Here’s a random snippet of TIR taken from the tutorials:

primfn(A_1: handle, B_1: handle) -> ()
  attr = {"global_symbol": "main", "tir.noalias": True}
  buffers = {B: Buffer(B_2: handle, float32, [n: int32], [stride: int32], type="auto"),
             A: Buffer(A_2: handle, float32, [n, m: int32], [stride_1: int32, stride_2: int32], type="auto")}
  buffer_map = {A_1: A, B_1: B} {
  for (i: int32, 0, n) {
    B_2[(i*stride)] = 0f32
    for (k: int32, 0, m) {
      B_2[(i*stride)] = ((float32*)B_2[(i*stride)] + (float32*)A_2[((i*stride_1) + (k*stride_2))])
    }
  }
}
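If the printed TIR is hard to read, it may help to see the same loop nest in plain Python. The snippet above zeroes each B[i] and then accumulates A[i, k] over k, which is a row-wise sum over a strided 2-D buffer. The sketch below only mirrors that arithmetic; it does not use TVM, and the function name `row_sum_strided` and the flat-list representation of the buffers are assumptions made for illustration.

```python
def row_sum_strided(a_flat, n, m, stride_1, stride_2, stride=1):
    """Mirror the TIR loop nest: B[i] = sum over k of A[i, k].

    a_flat is the flat storage behind A (A_2 in the TIR); element (i, k)
    lives at index i*stride_1 + k*stride_2. The result is the flat storage
    behind B (B_2 in the TIR), where element i lives at index i*stride.
    """
    b_flat = [0.0] * ((n - 1) * stride + 1 if n > 0 else 0)
    for i in range(n):
        b_flat[i * stride] = 0.0  # B_2[(i*stride)] = 0f32
        for k in range(m):
            # B_2[(i*stride)] = B_2[(i*stride)] + A_2[(i*stride_1) + (k*stride_2)]
            b_flat[i * stride] = b_flat[i * stride] + a_flat[i * stride_1 + k * stride_2]
    return b_flat


# A 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored row-major in a flat list.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
result = row_sum_strided(a, n=2, m=3, stride_1=3, stride_2=1)
print(result)  # [6.0, 15.0]
```

The explicit strides are why the TIR indexes with `i*stride_1 + k*stride_2` instead of assuming a row-major layout: the same loop works for a column-major buffer by passing `stride_1=1, stride_2=2` here.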

It’s TIR, rather than Relay, that has the historical connection to Halide IR. NNVM was the precursor to Relay and, as far as I’m aware, is now deprecated, so you shouldn’t need to worry about it.
