[VTA] Support for more generic models

flip1995 · November 6, 2019, 3:37pm

apache/incubator-tvm/blob/86b844b99595c30a2f41cd02796a6b1ccb7c60a6/vta/python/vta/top/graphpack.py#L249-L253


def get_subgraph(expr, start_name, stop_name):
    """ We assume stop_name only appears once for simplicity.
        This constraint will be lifted in the future.
        bitpack_start and bitpack_end are both inclusive.
    """

IIUC, this means that only models containing a layer (operator name) that appears exactly once in the architecture can be run efficiently with VTA? That unique layer should also be one of the last layers, so that as much as possible is offloaded to the FPGA.

So if I have a model with the structure¹

nn.conv2d
nn.leaky_relu
nn.max_pool2d
...
nn.conv2d
nn.leaky_relu
nn.max_pool2d
nn.conv2d

I won’t be able to use graph_pack on this model, since I can’t specify a (good) stop_name for this network, which in turn means that I can’t offload this model to VTA.
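To make the problem concrete, here is a toy sketch in plain Python. It is not the actual graphpack implementation: it assumes the subgraph ends at the first op after the start whose name matches stop_name, and `toy_subgraph` and `tiny_yolo_like` are hypothetical names made up for illustration.

```python
from collections import Counter


def toy_subgraph(layers, start_name, stop_name):
    """Toy stand-in for a start/stop selection (assumption: first match wins).

    Returns the inclusive slice from the first start_name to the first
    stop_name that follows it.
    """
    start = layers.index(start_name)
    stop = layers.index(stop_name, start + 1)
    return layers[start:stop + 1]


# Simplified layer sequence from the post, with the "..." left out.
tiny_yolo_like = [
    "nn.conv2d", "nn.leaky_relu", "nn.max_pool2d",
    "nn.conv2d", "nn.leaky_relu", "nn.max_pool2d",
    "nn.conv2d",
]

# Op names that occur exactly once would be safe stop_name candidates.
unique_names = [name for name, count in Counter(tiny_yolo_like).items() if count == 1]
print("unique op names:", unique_names)

# Any repeated stop_name cuts the selection off at its first occurrence.
offloaded = toy_subgraph(tiny_yolo_like, "nn.conv2d", "nn.max_pool2d")
print(offloaded)
print(len(offloaded), "of", len(tiny_yolo_like), "layers selected")
```

Under that assumption no op name in this structure is unique, and picking nn.max_pool2d as stop_name selects only the first three layers, leaving the rest of the network outside the offloaded region.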

If that’s the case, this restricts the usability of VTA to fairly specific models. Is there already a plan for addressing this limitation?

cc @thierry, since you are the author of this code.

¹ This is the (simplified) structure of TinyYOLO from Tensornets

thierry · December 12, 2019, 3:17am

The compiler support for VTA is very rigid at this point, as rightly pointed out by @flip1995. What we want is a more flexible graph partitioning and rewriting framework that makes it easy to offload dense operators to VTA. The Bring Your Own Codegen work is a step in that direction: Bring Your Own Codegen to TVM

Our plan is to deprecate the error-prone and brittle compilation flow for VTA and use the more robust and modular “plug your own accelerator” interface once it lands.