Relay is the higher-level IR, which expresses models in terms of operators such as conv2d or dense. Underneath it is a lower-level IR called TIR, which operators like conv2d are lowered to. Printed TIR looks quite similar to C, and it’s what is ultimately lowered to LLVM IR, OpenCL, CUDA, etc. Here’s a snippet of TIR taken from the tutorials:
primfn(A_1: handle, B_1: handle) -> ()
  attr = {"global_symbol": "main", "tir.noalias": True}
  buffers = {B: Buffer(B_2: handle, float32, [n: int32], [stride: int32], type="auto"),
             A: Buffer(A_2: handle, float32, [n, m: int32], [stride_1: int32, stride_2: int32], type="auto")}
  buffer_map = {A_1: A, B_1: B} {
  for (i: int32, 0, n) {
    B_2[(i*stride)] = 0f32
    for (k: int32, 0, m) {
      B_2[(i*stride)] = ((float32*)B_2[(i*stride)] + (float32*)A_2[((i*stride_1) + (k*stride_2))])
    }
  }
}
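To make the loop nest easier to read, here is a rough plain-Python sketch of what that TIR computes: a sum over the second axis of A, written into B, with both buffers treated as flat arrays indexed through their strides. The function name `row_sum` and the use of Python lists as buffers are my own assumptions for illustration; this is not TVM API and not how TVM executes TIR.

```python
def row_sum(a, n, m, stride, stride_1, stride_2):
    """Mirror the TIR loop nest above on flat Python lists.

    a        : flat buffer holding an n x m matrix (plays the role of A_2)
    stride   : element stride of the output buffer (plays the role of B_2)
    stride_1 : stride of a along axis 0
    stride_2 : stride of a along axis 1
    """
    # Allocate just enough room for the largest index written, (n - 1) * stride.
    b = [0.0] * ((n - 1) * stride + 1 if n > 0 else 0)
    for i in range(n):
        b[i * stride] = 0.0
        for k in range(m):
            b[i * stride] = b[i * stride] + a[i * stride_1 + k * stride_2]
    return b


# A 2 x 3 matrix [[1, 2, 3], [4, 5, 6]] stored row-major: stride_1 = 3, stride_2 = 1.
result = row_sum([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3, 1, 3, 1)
print(result)  # [6.0, 15.0]
```

The explicit strides are why the TIR indexes with `i*stride_1 + k*stride_2` rather than assuming a row-major layout: the same loop nest works for any layout that can be described by two strides.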
It’s TIR, not Relay, that has the historical connection to Halide IR. NNVM was the precursor to Relay and, as far as I’m aware, is now deprecated, so you shouldn’t need to worry about it.