Profiling a TVM run on CPU

I’m trying to profile a short TVM script:

# script.py
import tvm
from tvm import relay
from tvm.relay.testing.mobilenet import get_workload
import numpy as np

module, params = get_workload()

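# get_workload() defaults to batch_size=1, so the network expects NCHW input with a batch dimension
input_shape = (1, 3, 224, 224)
src_dtype = 'float32'
# named `data` rather than `input` to avoid shadowing the Python builtin
data = tvm.nd.array(np.random.rand(*input_shape).astype(src_dtype))
relay.create_executor("graph", mod=module).evaluate()(data, **params)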

I would like to get a gprof-style call trace over the generated code, but I’m unsure how. I had a few ideas:

  • Profile the entire python invocation. This leads to a lot of noise.
  • Generate C code, compile that binary, and profile it; but I’m not sure the C generator is well-supported, based on what I’ve read.
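For the first option, one way to cut some of the noise is to profile only the call of interest rather than the whole interpreter session. The following is a minimal sketch using Python's standard cProfile and pstats modules; run_inference is a hypothetical stand-in for the executor call in script.py, and profile_call is an illustrative helper name, not part of TVM. Note that cProfile only attributes time to Python-level calls, so the kernels TVM generates would show up as a single opaque native call rather than a per-operator breakdown.

```python
import cProfile
import io
import pstats


def run_inference():
    # Hypothetical stand-in for the executor call in script.py;
    # replace the body with the real evaluate()(...) invocation.
    return sum(i * i for i in range(100_000))


def profile_call(fn, top=10):
    """Profile only fn() and return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = fn()
    finally:
        # Stop collecting even if fn() raises, so later code is not profiled.
        profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()


if __name__ == "__main__":
    _, report = profile_call(run_inference)
    print(report)
```

Keeping imports and model construction outside profile_call means they are not counted, which removes a good part of the noise but still gives no visibility into the generated code itself.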

Any ideas?

How about using the VM profiler? https://github.com/apache/incubator-tvm/blob/master/python/tvm/relay/backend/profiler_vm.py

Or the graph runtime debugger? Follow this tutorial but change one line:

from tvm.contrib import graph_runtime

to

from tvm.contrib.debugger import debug_runtime as graph_runtime

and you will see the time breakdown for each op.

Great, thank you for this! For posterity, I actually ended up using gperftools, Google’s CPU profiling toolkit. It worked pretty well. To use it, I compiled TVM and linked in their libprofiler:

@@ -297,6 +297,11 @@ target_link_libraries(tvm_topi tvm ${TVM_LINKER_LIBS} ${TVM_RUNTIME_LINKER_LIBS}
 target_link_libraries(tvm_runtime ${TVM_RUNTIME_LINKER_LIBS})
 target_link_libraries(nnvm_compiler tvm)
 
+target_link_libraries(tvm /usr/local/Cellar/gperftools/2.7/lib/libprofiler.dylib)
+target_link_libraries(tvm_topi /usr/local/Cellar/gperftools/2.7/lib/libprofiler.dylib)
+target_link_libraries(tvm_runtime /usr/local/Cellar/gperftools/2.7/lib/libprofiler.dylib)
+target_link_libraries(nnvm_compiler /usr/local/Cellar/gperftools/2.7/lib/libprofiler.dylib)
+
 if (HIDE_PRIVATE_SYMBOLS AND NOT ${CMAKE_SYSTEM_NAME} MATCHES "Darwin")
   set(HIDE_SYMBOLS_LINKER_FLAGS "-Wl,--exclude-libs,ALL")
   # Note: 'target_link_options' with 'PRIVATE' keyword would be cleaner

Then I ran the script with the environment variable CPUPROFILE set to the desired output file:

CPUPROFILE=script.out `pyenv which python3` script.py

Finally, I opened the output file with pprof, the tool gperftools provides for parsing the results:

pprof `pyenv which python3` script.out

Using the web command inside pprof, I was able to see a call graph. This was a very coarse way to profile, as it captured the compilation process as well as the program that TVM generated and ran. However, it got me the information I needed!
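For anyone repeating this, the run step could also be wrapped in a few lines of Python. This is only a sketch: run_with_cpuprofile and pprof_command are hypothetical helper names, and a profile file is only written when the child process actually loads a TVM build linked against libprofiler. The demo child process below merely echoes the variable to show that it is passed through.

```python
import os
import subprocess
import sys


def run_with_cpuprofile(cmd, profile_path):
    """Run cmd with CPUPROFILE set in its environment, like the shell one-liner."""
    env = dict(os.environ, CPUPROFILE=profile_path)
    return subprocess.run(cmd, env=env, check=True, capture_output=True, text=True)


def pprof_command(binary, profile_path):
    """Build the pprof invocation; returned rather than run because pprof is interactive."""
    return ["pprof", binary, profile_path]


if __name__ == "__main__":
    # Demo child that prints the variable; use [sys.executable, "script.py"] for a real run.
    done = run_with_cpuprofile(
        [sys.executable, "-c", "import os; print(os.environ['CPUPROFILE'])"],
        "script.out",
    )
    print(done.stdout.strip())
    print(" ".join(pprof_command(sys.executable, "script.out")))
```

Passing sys.executable to both steps keeps the interpreter given to pprof the same as the one that was profiled, which is what the pyenv which python3 substitution does in the shell version.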

Thanks all!


Could you please repost the link to the tutorial? The original one seems to be broken.

https://tvm.apache.org/docs/tutorials/get_started/relay_quick_start.html#sphx-glr-tutorials-get-started-relay-quick-start-py
