Let’s say I have multiple devices on a single piece of hardware, each of which can accelerate various computations. Is there a way to configure which Relay nodes can be fused for each device without having to write an IR pass? Since there are already IR passes that fuse nodes based on default rules, it seems like this could be extended into something more general and easily configurable (through a config file, perhaps).
This could potentially open up a search space at the high level, much as AutoTVM searches at the low level. Devices could have many overlapping sets of fusible nodes, so finding the best combination may be difficult.
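
To make the idea concrete, here is a rough sketch of what I have in mind. None of this is an existing TVM API: the device names (`dsp`, `npu`), the `FUSION_CONFIG` table, and the `fuse_chain` and `pick_device` helpers are all made up for illustration, and Relay operator names are used only as plain strings. The group count stands in for a real measured cost, which is an assumption on my part.

```python
# Hypothetical per-device fusion rules. In practice this table could be
# loaded from a JSON or YAML config file instead of being hard-coded.
FUSION_CONFIG = {
    "dsp": {"nn.conv2d", "nn.bias_add", "nn.relu"},
    "npu": {"nn.conv2d", "nn.relu", "nn.max_pool2d"},
}


def fuse_chain(ops, device, config):
    """Greedily group consecutive ops that `device` is configured to fuse.

    Ops the device cannot fuse end the current group and are emitted
    as singleton groups.
    """
    fusible = config[device]
    groups, current = [], []
    for op in ops:
        if op in fusible:
            current.append(op)
        else:
            if current:
                groups.append(current)
                current = []
            groups.append([op])
    if current:
        groups.append(current)
    return groups


def pick_device(ops, config):
    """Toy search: choose the device whose rules yield the fewest groups.

    A real search would measure runtime rather than count groups.
    """
    return min(config, key=lambda dev: len(fuse_chain(ops, dev, config)))


# A linear chain of operators, written as strings for illustration.
chain = ["nn.conv2d", "nn.bias_add", "nn.relu", "nn.max_pool2d"]

for dev in FUSION_CONFIG:
    print(dev, fuse_chain(chain, dev, FUSION_CONFIG))
print("best:", pick_device(chain, FUSION_CONFIG))
```

Even in this toy version the two devices overlap on `nn.conv2d` and `nn.relu` but split the chain differently, which is the kind of trade-off a search would have to resolve.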