What does codegen do in TVM?


I am confused when reading the code in tvm/src/codegen. I want to know what codegen does to make the output match the corresponding backend.


It translates the TVM IR into another IR that can be compiled into executable code. For example, the code in src/codegen/llvm generates LLVM IR, which is then handed over to LLVM’s optimizer and code generator to produce an object file or assembly. Similarly, codegen_c.cc translates TVM IR into C code, which can then be compiled with a C compiler, and so on.
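To make the "translate one IR into another" idea concrete, here is a minimal sketch of the C-source flavor of codegen. It is a toy, not TVM code: the `Expr` type and the `Var`, `Const`, `Binary`, `EmitC`, and `EmitFunction` helpers are all hypothetical names invented for illustration, and the real codegen_c.cc handles far more cases.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical toy IR (not TVM's): an expression is a variable,
// an integer constant, or a binary operation on two sub-expressions.
struct Expr {
  enum class Kind { Var, Const, Add, Mul };
  Kind kind = Kind::Const;
  std::string name;            // used by Var
  int value = 0;               // used by Const
  std::shared_ptr<Expr> a, b;  // used by Add and Mul
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr Var(const std::string& name) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Kind::Var;
  e->name = name;
  return e;
}

ExprPtr Const(int value) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Kind::Const;
  e->value = value;
  return e;
}

ExprPtr Binary(Expr::Kind kind, ExprPtr a, ExprPtr b) {
  assert(a && b);  // binary nodes need both operands
  auto e = std::make_shared<Expr>();
  e->kind = kind;
  e->a = std::move(a);
  e->b = std::move(b);
  return e;
}

// Translate the toy IR into C source text, one node at a time.
std::string EmitC(const ExprPtr& e) {
  switch (e->kind) {
    case Expr::Kind::Var:   return e->name;
    case Expr::Kind::Const: return std::to_string(e->value);
    case Expr::Kind::Add:   return "(" + EmitC(e->a) + " + " + EmitC(e->b) + ")";
    case Expr::Kind::Mul:   return "(" + EmitC(e->a) + " * " + EmitC(e->b) + ")";
  }
  return "";  // not reached: every Kind is handled above
}

// Wrap an expression in a C function that a C compiler could then build.
std::string EmitFunction(const std::string& name, const std::string& param,
                         const ExprPtr& body) {
  return "int " + name + "(int " + param + ") { return " + EmitC(body) + "; }";
}
```

For example, building `Binary(Expr::Kind::Add, Binary(Expr::Kind::Mul, Var("x"), Const(2)), Const(1))` and passing it to `EmitFunction("f", "x", ...)` yields the text of a small C function. The output is just a string that a separate compiler would consume, which is the same division of labor described above.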


Thank you for your answer. My current question is how to read the codegen code, because I cannot yet read code quickly. For example, I want to read the x86 implementation, but there are many codegen files, such as codegen.c and codegen_c.cc, plus codegen_cpu.cc and codegen_x86_64.cc in the llvm folder. Each file also contains a lot of code. How can I find the key points?


X86-64 uses LLVM, so you could start with codegen_llvm.cc. codegen_cpu.cc is derived from codegen_llvm, and codegen_x86_64.cc is further derived from codegen_cpu. Most of the codegen happens in the first two files: whatever is not handled by codegen_cpu is dealt with in codegen_llvm. The last one, codegen_x86_64, is only there to generate some vector intrinsics.
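The derivation chain described above can be sketched roughly as follows. This is a toy under stated assumptions, not the real classes: `ToyCodeGenLLVM`, `ToyCodeGenCPU`, `ToyCodeGenX86_64`, the `EmitCall` method, the `Lower` helper, and the operation names are all invented for illustration. It only shows how a derived class handles what it knows and defers the rest to its base.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the generic LLVM code generator.
class ToyCodeGenLLVM {
 public:
  virtual ~ToyCodeGenLLVM() = default;
  // Generic lowering: handles anything the derived classes do not.
  virtual std::string EmitCall(const std::string& op) {
    return "generic:" + op;
  }
};

// Hypothetical stand-in for the CPU-specific layer.
class ToyCodeGenCPU : public ToyCodeGenLLVM {
 public:
  std::string EmitCall(const std::string& op) override {
    if (op == "parallel_for") return "cpu:" + op;  // invented CPU-only case
    return ToyCodeGenLLVM::EmitCall(op);           // defer to the base class
  }
};

// Hypothetical stand-in for the x86-64 layer, which adds only a few cases.
class ToyCodeGenX86_64 : public ToyCodeGenCPU {
 public:
  std::string EmitCall(const std::string& op) override {
    if (op == "vector_cast") return "x86_64:" + op;  // invented intrinsic case
    return ToyCodeGenCPU::EmitCall(op);              // defer to the CPU layer
  }
};

// Lower one operation through whichever code generator is passed in.
std::string Lower(ToyCodeGenLLVM& cg, const std::string& op) {
  assert(!op.empty());
  return cg.EmitCall(op);
}
```

With a `ToyCodeGenX86_64` object, `Lower(cg, "vector_cast")` is answered by the most derived class, `Lower(cg, "parallel_for")` falls through to the CPU layer, and anything else reaches the generic layer. That is why reading the base class first usually pays off.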
They all work in a similar way: they have visitor functions that are called for each element (statement or expression) of the TVM/Halide IR, and generate the corresponding LLVM IR. For example,

llvm::Value* CodeGenLLVM::VisitExpr_(const And* op) {
  return builder_->CreateAnd(MakeValue(op->a), MakeValue(op->b));
}

This is one of the simplest functions in there, but illustrates the concept: MakeValue takes an expression (in TVM/Halide IR) and returns the corresponding llvm::Value. Then the call to CreateAnd creates an And operator in LLVM IR, using the llvm::Value values returned by MakeValue.
Many of the translations are not so straightforward, so the “Visit” functions are more complicated, but the concept remains more or less the same.
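If the visitor mechanism itself is unfamiliar, a self-contained toy version may help. Everything here is hypothetical (`ExprVisitor`, `ExprNode`, `VarNode`, `AndNode`, `ToyCodeGen`), and it emits LLVM-like text rather than real LLVM IR, so treat it as a sketch of the pattern only. The `MakeValue` method plays the same role as in the snippet above: expression in, value out.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct VarNode;
struct AndNode;

// Visitor interface: one Visit overload per node type.
struct ExprVisitor {
  virtual ~ExprVisitor() = default;
  virtual std::string Visit(const VarNode& op) = 0;
  virtual std::string Visit(const AndNode& op) = 0;
};

// Base class of the toy IR; Accept dispatches to the matching Visit overload.
struct ExprNode {
  virtual ~ExprNode() = default;
  virtual std::string Accept(ExprVisitor& v) const = 0;
};

struct VarNode : ExprNode {
  std::string name;
  explicit VarNode(std::string n) : name(std::move(n)) {}
  std::string Accept(ExprVisitor& v) const override { return v.Visit(*this); }
};

struct AndNode : ExprNode {
  const ExprNode* a;
  const ExprNode* b;
  AndNode(const ExprNode* lhs, const ExprNode* rhs) : a(lhs), b(rhs) {}
  std::string Accept(ExprVisitor& v) const override { return v.Visit(*this); }
};

// Toy code generator: returns the name of the value holding each result
// and records the instructions it "emits" as text.
class ToyCodeGen : public ExprVisitor {
 public:
  std::string MakeValue(const ExprNode* e) {
    assert(e != nullptr);
    return e->Accept(*this);
  }
  std::string Visit(const VarNode& op) override { return "%" + op.name; }
  std::string Visit(const AndNode& op) override {
    // Translate the operands first, then emit one instruction combining them.
    std::string lhs = MakeValue(op.a);
    std::string rhs = MakeValue(op.b);
    std::string result = "%t" + std::to_string(code_.size());
    code_.push_back(result + " = and " + lhs + ", " + rhs);
    return result;
  }
  const std::vector<std::string>& code() const { return code_; }

 private:
  std::vector<std::string> code_;  // emitted instructions, in order
};
```

Calling `MakeValue` on a nested `AndNode` recursively translates the inner node before the outer one, so the instructions appear in dependency order. The real visitors do the same thing with `llvm::Value*` and an IR builder instead of strings.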


Thank you for your answer; it has helped me a lot. I have been reading the code for the past two days, and I now have a question: is Halide IR converted to LLVM IR in codegen? Is LLVM already being used at this step?


Yes, codegen converts Halide IR into LLVM IR.


Thank you very much for your help, but I still have some questions. If you have WeChat or other communication software, can you give me a convenient way to communicate?


No, sorry. Feel free to ask more questions on this forum; that way, other people with the same questions can benefit from the answers.


OK. Here is my current problem: my overall goal is to add a specific backend supported by LLVM to TVM, so I am trying the existing RISC-V backend. My thinking is that RISC-V is a CPU architecture and I don’t need to support intrinsics for now. What do I need to do?


It may be the case that you don’t need to do anything, although it’s somewhat unlikely. Are you using a RISC-V simulator? Does it simulate the whole system, or just the environment for running applications?

When you create an LLVM target, pass “-target riscv32” (or riscv64) to it, i.e. “llvm -target riscv32” and see what happens. You will need LLVM with RISC-V backend enabled, and one that supports JIT compilation (I’m not sure if that’s available for RISC-V, but you can try and see).

In general, the easiest approach is to pretend that RISC-V is just another CPU, and rely on the CPU code. When something fails, fix it and try again. You’ll get a better understanding of what pieces are missing, and that’s something that’s often difficult to fully predict ahead of time.

I think someone has already tried using TVM on RISC-V, but it didn’t work because LLVM didn’t support PIC for RISC-V at the time. The RISC-V backend is actively developed in LLVM, and so this may no longer be a problem now.