RFC - Where to call dialect passes

The TVM community is working on adding Relay dialects. One example is QNN, which aims to support pre-quantized framework models. Each dialect will tend to have its own transformation passes. Naturally, the question arises of where those passes should be called.

First, I think we should stick with the same relay.build_module API to avoid confusion. This API can take a graph that contains dialect operators. At the same time, we want to keep build_module.cc clean, without any dialect code creeping in.

To solve this, my proposal is to add a string option, dialect_name, to BuildConfig. Each dialect can have a wrapper around the sequence of passes it wants to run. This wrapper could be registered in some way under a string dialect_name (for example, QNN for the QNN dialect). build_module will look up the registered function using the dialect_name supplied in the Relay BuildConfig and call it before calling any of the Relay passes. This way, the build_module changes stay minimal, with only one call to the registered function.

@jroesch @tqchen @zhiics @yzhliu @vinx13 @shoubhik

Have you looked at the pass manager? We are migrating to it.

Sorry for the delayed response on this; I was on a break for the past few weeks.

I think we want to offer a programmatic experience to users who want to configure passes, and move toward simplifying the relay.build logic. The currently proposed approach adds another layer of indirection. The API is also not very composable, in the sense that we are putting all the options into a single build function.

The alternative would be to allow users to call the QNN legalize API programmatically. Here is an example:

mod = relay.frontend.from_tflite(my_model)
mod = relay.qnn.Legalize()(mod)
module = relay.build(mod)

Having the additional one-liner call into the module-to-module API would be just as convenient as adding a string to the build config, and would be more flexible (we could add additional passes after legalization).
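A rough illustration of why module-to-module passes compose well, again in plain Python rather than real TVM code. The graph is a list of operator names, and legalize_qnn, drop_identity, and sequential are hypothetical helpers invented for this sketch.

```python
from typing import Callable, List

# Toy stand-in for a Relay module: a list of operator names.
Graph = List[str]
Pass = Callable[[Graph], Graph]


def legalize_qnn(graph: Graph) -> Graph:
    # Toy legalization: rewrite "qnn.x" into the plain op "x".
    return [op[len("qnn."):] if op.startswith("qnn.") else op for op in graph]


def drop_identity(graph: Graph) -> Graph:
    # A made-up extra pass the user chooses to run after legalization.
    return [op for op in graph if op != "identity"]


def sequential(*passes: Pass) -> Pass:
    """Compose module-to-module passes into one pass, applied left to right."""
    def run(graph: Graph) -> Graph:
        for p in passes:
            graph = p(graph)
        return graph
    return run


pipeline = sequential(legalize_qnn, drop_identity)
result = pipeline(["qnn.conv2d", "identity", "relu"])
```

Because every pass has the same graph-in, graph-out shape, adding a step after legalization is just one more argument to sequential, with no new option on the build function.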

As an analogy, we want to make customizing pass pipelines just as easy as using layer APIs to construct new models in deep learning frameworks.

I tried to summarize the meta-thoughts here: [DISCUSS] All-in-one Build API and Pass API Composability

@tqchen

Thanks for the response. The problem is that the pass in the middle (relay.qnn.Legalize) is a necessity; otherwise relay.build(mod) will receive a graph that still contains some QNN ops.

A user who just wants to read a pre-quantized framework graph would not know whether or when to call this pass. If the pass is absent, they will see an error.

Zhi mentioned this concern in your post. Let's discuss it there.