Semantics of opt_level in relay and across the TVM stack

ramana-arm · November 6, 2019, 12:25am

What kind of optimizations can a user expect by default at the various opt_levels in TVM? It would be good to document this clearly, so that we understand whether TVM's opt_levels match the expectations of people who think of them like -O0, -O1, -O2 and -O3 in static compiler land.

A lot of the developer documentation is about adding a new pass to Relay and alludes to this here, but I don’t see anything obvious about which optimizations are turned on at, for instance, level 2. Should this be documented?

jroesch · November 7, 2019, 11:13pm

Hi Ramana,

Originally the idea was to have an O-level system similar to traditional compilers, but I personally have begun to believe this design doesn’t work very well. My 2c is that in traditional compilers O-levels are a rough trade-off between compile time and compiled performance, with each set of optimizations designed to work well together. Many optimizations are left out of the O-level system entirely and have to be enabled explicitly with -f or some similar flag.

In the ML compiler world the flows are often custom, and many optimizations should never be enabled implicitly by bumping the O-level. Unfortunately, people have begun setting O-levels somewhat arbitrarily and switching rarely used optimizations on and off through high O-levels. I’m personally in favor of removing O-levels from the passes themselves and, if people still want such functionality, having an explicit white-list approach that lets people read which optimizations are turned on at each level in one place.
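A minimal sketch of what such a white-list could look like, in plain Python rather than against the TVM API. Everything here is an assumption for illustration: `OPT_LEVEL_WHITELIST`, `passes_for_level` and the `required`/`disabled` parameters are hypothetical names, and while the pass names are borrowed from Relay, their grouping into levels is made up and does not describe what TVM actually enables.

```python
# Hypothetical central table: each level lists only the passes it ADDS on
# top of the levels below it. The grouping is illustrative, not TVM's own.
OPT_LEVEL_WHITELIST = {
    0: ["SimplifyInference"],
    1: ["FuseOps"],
    2: ["FoldConstant"],
    3: ["EliminateCommonSubexpr", "AlterOpLayout"],
}


def passes_for_level(opt_level, required=(), disabled=()):
    """Return the ordered list of pass names enabled at `opt_level`.

    `required` plays the role of an explicit -f style opt-in, and
    `disabled` the role of an explicit opt-out. Both are hypothetical.
    """
    if opt_level not in OPT_LEVEL_WHITELIST:
        raise ValueError(f"unknown opt_level: {opt_level}")

    enabled = []
    # Levels are cumulative: level N includes everything from 0..N.
    for level in sorted(OPT_LEVEL_WHITELIST):
        if level <= opt_level:
            enabled.extend(OPT_LEVEL_WHITELIST[level])

    # Explicit opt-ins are appended after the level's own passes.
    for name in required:
        if name not in enabled:
            enabled.append(name)

    # Explicit opt-outs win over both the level and the opt-ins.
    return [name for name in enabled if name not in disabled]


if __name__ == "__main__":
    for level in sorted(OPT_LEVEL_WHITELIST):
        print(level, passes_for_level(level))
```

The point of the single table is that the question "what is on at level 2?" is answered by reading one data structure, instead of searching every pass definition for its own level annotation.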

Happy to chat more if you would like to work on a new design.

ramana-arm · November 21, 2019, 10:37am

Hi Jared,

Thanks for that perspective. My background is in traditional compilers, and I agree that the O-levels are a trade-off between compile time, compiled performance, code size and IEEE compliance (-Ofast). Beyond that, they provide ease of use through a simple expectation that people are used to from a compiler or a compiler driver, and they give folks an easy way to test what the compiler does or doesn’t do and to expect a reasonable level of performance. I’ve always viewed -f(option) flags that are not enabled as part of a standard -O or -W option as a bit of a cop-out to help with some point fixes, and I’ve also observed that they are the most at risk of bit-rotting if they aren’t used heavily enough. On top of that, there are other heuristics that you can play with using the --param option.

To my mind, working out a well-defined set of criteria for -O levels sounds useful enough to consider. Given the combinatorial explosion of frameworks x targets, we do have an issue in terms of making things simpler, so a taxonomy for understanding -O<levels> is probably useful, perhaps even as a way to consider different passes for different frameworks.

regards, Ramana