The TVM community is working on adding Relay dialects. One such example is QNN, which aims to support pre-quantized framework models. Each dialect will tend to have its own transformation passes. Naturally, the question arises of where those passes should be called.
First, I think we should stick with the same relay.build_module API to avoid any confusion. This API can already take a graph that contains dialect operators. At the same time, we want to keep build_module.cc clean, without any dialect-specific code creeping in.
To solve this issue, my proposal is to add a BuildConfig string option, dialect_name. Each dialect can have a wrapper around the sequence of passes it wants to run. This wrapper could be registered under a string (e.g., QNN for the QNN dialect). build_module would then look up the registered function using the dialect_name string supplied in the Relay BuildConfig and call it before running any of the Relay passes. This way, the build_module changes can be minimal, with only one call to the registered function.
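
To make the idea concrete, here is a rough sketch of the registration and lookup flow in plain Python. It does not use any TVM API: the registry dict, the register_dialect decorator, the qnn_passes wrapper, and the build function are all hypothetical names made up for illustration, and a graph is modeled as a simple list of operator names. The actual mechanism in TVM might look quite different.

```python
from typing import Callable, Dict, List, Optional

Graph = List[str]  # toy stand-in for a Relay graph: a list of operator names

# Hypothetical registry mapping a dialect name to its pass wrapper.
_DIALECT_PASSES: Dict[str, Callable[[Graph], Graph]] = {}


def register_dialect(name: str):
    """Register a dialect's pass wrapper under a string key (hypothetical)."""
    def decorator(fn: Callable[[Graph], Graph]) -> Callable[[Graph], Graph]:
        _DIALECT_PASSES[name] = fn
        return fn
    return decorator


@register_dialect("QNN")
def qnn_passes(graph: Graph) -> Graph:
    # Stand-in for the QNN pass sequence: rewrite "qnn.*" ops to "nn.*" ops.
    return ["nn." + op[len("qnn."):] if op.startswith("qnn.") else op
            for op in graph]


def build(graph: Graph, dialect_name: Optional[str] = None) -> Graph:
    """Toy build entry point with a single dialect hook."""
    if dialect_name is not None:
        if dialect_name not in _DIALECT_PASSES:
            raise ValueError("no passes registered for dialect: " + dialect_name)
        # The only dialect-aware line: one call to the registered wrapper.
        graph = _DIALECT_PASSES[dialect_name](graph)
    # The generic Relay passes would run here, unaware of any dialect.
    return graph


lowered = build(["qnn.conv2d", "nn.relu"], dialect_name="QNN")
```

The point of the sketch is that build only contains one dialect-related call, and everything specific to QNN lives in the wrapper that the dialect registers itself.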