I am interested in using NNVM/TVM as a static compiler to generate a runtime image for a hardware accelerator such as NVDLA.
NVDLA accelerates at a much higher level of abstraction than a CPU or GPU: it works on whole operators, e.g. conv2d, batchnorm, etc.
Graph-level optimisations such as the operator fusion done by NNVM would be very useful here.
Can you advise how best to approach this, e.g. which topics in the documentation or tutorials are relevant?
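
To make the fusion idea concrete, here is a toy sketch in plain Python of the kind of operator-level fusion I have in mind. It does not use the NNVM/TVM API at all: the graph is just a linear list of op names, and `FUSABLE` and `fuse_ops` are hypothetical names I made up for illustration. The patterns listed are assumptions, not a statement of what NVDLA actually supports.

```python
# Toy illustration only: this is NOT the NNVM/TVM API.
# A graph is a linear list of op names. FUSABLE is a hypothetical list of
# op sequences that an accelerator might execute as a single hardware op.
FUSABLE = [
    ("conv2d", "batchnorm", "relu"),
    ("conv2d", "batchnorm"),
    ("conv2d", "relu"),
]


def fuse_ops(ops):
    """Greedily replace each matching op sequence with one fused node."""
    fused = []
    i = 0
    while i < len(ops):
        # Patterns are ordered so that the longest candidate is tried first.
        for pattern in FUSABLE:
            if tuple(ops[i:i + len(pattern)]) == pattern:
                fused.append("fused_" + "_".join(pattern))
                i += len(pattern)
                break
        else:
            # No pattern starts at this op, so keep it unchanged.
            fused.append(ops[i])
            i += 1
    return fused


graph = ["conv2d", "batchnorm", "relu", "max_pool2d", "conv2d", "relu", "softmax"]
print(fuse_ops(graph))
# → ['fused_conv2d_batchnorm_relu', 'max_pool2d', 'fused_conv2d_relu', 'softmax']
```

A real compiler pass would work on a dataflow graph with branches rather than a flat list, but the goal is the same: end up with nodes that map one-to-one onto accelerator operations.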