Understanding the internals of relay.build_module.build()?

Dear TVM/Relay experts,

I am trying to understand the internals of the relay.build_module.build() function, and it looks like a complicated puzzle to me (sorry for my lack of experience).

Through my debugging I was able to solve some of this puzzle, and I know that relay.build_module.build() calls the ScheduleGetter(target).Create(source_func) function inside compile_engine.cc. However, I am a little bit lost in my debugging. My goal is to add my own custom (probably optimized) operator or template implementations to the relay/tvm build. For that, I'd like to understand the whole flow in detail.

  1. Could someone explain the sequence of function calls from relay.build_module.build() all the way to code generation?
  2. Could someone explain how to dump the IR of specific operators?

Thanks in advance, John.


Additionally, I am wondering how to dump the Schedule and CachedFunc returned by the CreateSchedule function in compile_engine.cc.

Here is the CreateSchedule function:

    std::pair<Schedule, CachedFunc> CreateSchedule(const Function& source_func,
                                                   const Target& target) {
      return ScheduleGetter(target).Create(source_func);
    }

In my limited experience, relay.build_module.build() is implemented in the BuildRelay() function of relay/backend/build_module.cc. BuildRelay() calls Optimize -> Lookup -> … GetLoweredFunc(). Then it uses tvm.build to get the final Module. You can debug the process using tests/cpp/relay_build_module_test.cc.
I don't know if this will be helpful to you; I am learning too.


Thank you ckmufeng,

I have been able to trace from relay.build_module.build() to BuildRelay(). As you mentioned, I am now able to see that BuildRelay() does two things:

  1. It performs a bunch of optimizations (it calls the Optimize() function).
  2. It generates code with "codegen".

My question is: when is LLVM called to generate the code library (runtime lib)?

I assume LLVM is invoked at some point after the optimizations at the TVM/Relay level, once the build is ready to generate code. This is my understanding; please correct me if I am wrong. So, I am wondering how LLVM is invoked and at what stage (or please point me to the relevant source code).

Thank you so much, John.

Dear John,

Hi, I'm going down a similar path to the one you've been on.

I think LLVM is called after relay.build().

I'm working on TVM + VTA, so I usually use the code in deploy_vision_on_vta.py.

with relay.build_config(opt_level=3, disabled_pass={"AlterOpLayout"}):
    if target.device_name != "vta":
        graph, lib, params = relay.build(
            relay_prog, target=target,
            params=params, target_host=env.target_host)
    else:
        with vta.build_config():
            graph, lib, params = relay.build(
                relay_prog, target=target,
                params=params, target_host=env.target_host)

out_file = open("inf.ll", "w")
out_file.write(lib.get_source())
out_file.close()
# Measure Relay build time
build_time = time.time() - build_start
print(model + " inference graph built in {0:.2f}s!".format(build_time))

# Send the inference library over to the remote RPC server
temp = util.tempdir()
lib.save(temp.relpath("graphlib.o"))

I kept trying to find where LLVM is called while digging into relay.build, and I found that lib.get_source() or lib.save() invokes LLVM.

In the lib.save() case, the path is: module.py -> _SaveToFile -> module.cc -> llvm_module.cc

Now I’m trying to understand the gap between Relay IR and LLVM IR.

I know that the "schedule" stands between these two IRs, and I am trying to dig deeper.

If you have any breakthroughs, let me know.

Regards,

Jake