The resnet18 example on VTA uses a custom JSON input file so that the NNVM graph can be converted to run on VTA. I saw a post stating that this file was created in a discontinued internal branch. I also understand that a new graph transformation is being developed for Relay and that NNVM will be discontinued. What is the timeline for a Relay + VTA solution that can be used to port other models to VTA? Also, could you help me understand the transformation applied to the resnet18 JSON file, and whether a similar change can be applied to other models until the Relay-based solution is available?
We are working to release model translation in Relay that will massage off-the-shelf models so they can be compiled and run on VTA. This involves quantizing from fp32 to int8, and then performing bit-packing so that we can take advantage of tensorization.
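To make the two steps concrete, here is a rough sketch in plain Python. It is not the Relay or VTA implementation: the function names `quantize_int8` and `pack_channels`, the symmetric per-tensor scale, and the block size of 4 are all assumptions chosen for illustration. It quantizes fp32 values to the int8 range and then groups them into fixed-size blocks, which is the kind of layout a tensorized instruction would consume.

```python
def quantize_int8(values, scale=None):
    """Symmetric per-tensor quantization of floats to the int8 range.

    Hypothetical helper: if no scale is given, the largest magnitude
    in `values` is mapped to 127.
    """
    if scale is None:
        max_abs = max((abs(v) for v in values), default=0.0)
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to [-127, 127].
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def pack_channels(values, block):
    """Split a flat channel vector into rows of `block` elements.

    Hypothetical helper: the tail is zero-padded so every row is full,
    since a tensorized instruction works on fixed-size blocks.
    """
    padded = list(values) + [0] * (-len(values) % block)
    return [padded[i:i + block] for i in range(0, len(padded), block)]


# Made-up fp32 weights along one channel axis.
weights = [1.27, -0.64, 0.0, 0.32, -1.27]
q, scale = quantize_int8(weights)
packed = pack_channels(q, block=4)  # block size is an assumption
print(q, packed)
```

In a real flow the scale would come from calibration and the block size from the hardware configuration, and the packing would apply to the batch and channel axes of a full tensor rather than a flat list.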
The resnet18 model was produced by a series of custom quantization passes applied in an ad hoc fashion; @ziheng can comment on how this was achieved. However, it is not seen as a sustainable approach to quantization. We want to bank on a push-button compilation flow in Relay moving forward.