Google's latest work: MLIR Primer


#1

https://drive.google.com/file/d/1hUeAJXcAXwz82RXA5VtO5ZoH8cVQhrOK/view
This is huge, since the author of this work is the author of LLVM. I just went through it quickly and can't wait to discuss it with the folks in TVM. My take-aways:

  1. A principled compiler is definitely the trend going forward. Techniques from traditional compilers, such as SSA, are very important. Also see DLVM 2018: https://openreview.net/pdf?id=HJxPq4ywz
  2. Applying polyhedral techniques will be critical. We might see polyhedral work from Google in the future, since they have the best poly team in the world.

#2

I was at the C4ML workshop, and I would like to share some of my thoughts. MLIR by itself is a meta-way of defining IRs; in the words of the folks there, "XML for IRs". In particular, for Google, there will be dialects like MLIR-XLA, MLIR-TFLite, and MLIR-TFGraph. Concrete compiler solutions still need to be built for each layer of the dialect languages, and they can be very different due to the differences in the semantics of the operators. In short, MLIR itself offers a direction for infrastructure rather than a solution to the problems we are facing. It will be really interesting to see how things move and how the TVM community can learn from and work with MLIR.

I agree that principled compiler treatment of deep learning optimization is the way to go. We also need to bring in novel solutions beyond traditional compiler techniques and put machine learning at the center. The TVM community has been pioneering the direction of deep learning compilation, and I hope we can continue to do so by learning from and working with MLIR.

Here are the things we already do that share the same philosophy as MLIR:

  • Extensible operator definitions and pass infrastructure (in Relay); see the sketch after this list
  • Layers of IR and co-optimization when possible
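
For instance, here is a minimal sketch of what Relay's extensible pass infrastructure looks like from the user side. It is written against a TVM ~0.6-era API, and the exact module names have shifted between relay.ir_pass, relay.transform, and tvm.transform across versions, so treat it as illustrative only:

```python
# Sketch only: API names may differ across TVM versions.
import tvm
from tvm import relay

# A tiny Relay function: y = (x + x) * 2
x = relay.var("x", shape=(2, 2), dtype="float32")
y = relay.multiply(relay.add(x, x), relay.const(2.0))
mod = relay.Module.from_expr(relay.Function([x], y))

# Compose standard passes; user-defined passes plug into the same machinery.
seq = relay.transform.Sequential([
    relay.transform.SimplifyInference(),
    relay.transform.FoldConstant(),
    relay.transform.FuseOps(fuse_opt_level=2),
])
print(seq(mod))
```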

Besides that, here are two things we should learn from MLIR and move toward in the coming year:

  • Unify Relay and the tensor expression optimization layer to bring a unified TVM IR that works across layers of optimization.
  • Make sure that the TVM stack and its IR interoperate with dialects of MLIR.

In the meantime, we can collectively work together to build a principled stack that automatically optimizes models across CPUs, GPUs, and specialized accelerators. Directions like Relay, pass improvements, formalization of symbolic integer analysis, and the hardware stack will contribute a lot to that goal. I hope we as a community can strive to provide an open, full-stack solution that reflects the MLIR principles, works with MLIR, and enables the future of deep learning compilation.


#3

Some of us discussed it. Yes, MLIR is good; however, it is not a total solution like the one TVM provides. We don't know what will happen next. One thing we can take care of is to make sure TVM interoperates with MLIR when it is released.

Polyhedral optimization is also good. Some folks have leveraged LLVM Polly; see http://pollylabs.org/gsoc2017/Enable-Polyhedral-Optimizations-in-XLA-through-LLVM-Polly.html. The key is to modify Polly's SCoP detection so that it can detect conv2d.


#4

@FrozenGene poly on LLVM IR is difficult. Poly on the tensor layer is a different story: it is more powerful, and there is no need to detect conv2d because the loop structure is already explicit (see the sketch below).
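
A quick illustration of that point: in TVM's tensor expression language the affine iteration domain and access functions of a convolution are written down explicitly, so a polyhedral tool could read them off directly instead of recovering a SCoP from LLVM IR. This is a minimal sketch against the classic tvm.compute API (these functions live under tvm.te in newer releases); the shapes are made up:

```python
# Sketch only: classic tvm.compute API (tvm.te.* in newer TVM releases).
import tvm

N, C, H, W = 1, 16, 32, 32      # batch, in-channels, height, width
K, R, S = 32, 3, 3              # out-channels, kernel height, kernel width

data = tvm.placeholder((N, C, H, W), name="data")
weight = tvm.placeholder((K, C, R, S), name="weight")

rc = tvm.reduce_axis((0, C), name="rc")
rr = tvm.reduce_axis((0, R), name="rr")
rs = tvm.reduce_axis((0, S), name="rs")

# The affine iteration domain and access functions are explicit here;
# no SCoP detection is required.
conv = tvm.compute(
    (N, K, H - R + 1, W - S + 1),
    lambda n, k, h, w: tvm.sum(
        data[n, rc, h + rr, w + rs] * weight[k, rc, rr, rs],
        axis=[rc, rr, rs]),
    name="conv")

s = tvm.create_schedule(conv.op)
print(tvm.lower(s, [data, weight, conv], simple_mode=True))
```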


#5

Yeah, I know. I just wanted to point out that someone has already done it this way: http://llvm.org/devmtg/2017-10/slides/Agarwal-Enabling%20Polyhedral%20optimizations%20in%20TensorFlow%20through%20Polly.pdf


#6

I agree with this point. The designs and ideas of MLIR, including its IR and the Polly integration, may be shared with other frameworks. MLIR seems to be built on LLVM IR, but that may be more complicated than TVM needs. A subset of LLVM IR may be enough for the DSL, so Halide remains the best choice for TVM; it would also make Polly's SCoP detection easier, because the flow graph is simpler to analyze.


#7

One more thing that just came to mind: we could have a documentation page on docs.tvm.ai comparing other NN compiler technologies (such as XLA and Glow), the way Clang does on its site (comparing with GCC). I think many people are interested in that. For example, we could list frontend framework support, hardware backend support, performance data on common models, and which companies are involved. @tqchen


#8

+1 for unifying the TVM IR across different layers for optimization. I think we've already seen the need for and benefit of it when working on the Relay runtime. It would be interesting to see Relay (v2) / NNVM (v3) unify the different IRs.


#9

I would also like to point out that this would be great. To be honest, at the beginning of my search for DL compilers I had a hard time trying to grasp what each one was better at than the others.


#10

This presentation is indeed very interesting. Regarding the multi-layer IR, I think it may be Google's answer to ONNX, which, as far as I understand, also aims at establishing a standard for representing ML models. We should probably wait for the specifications and then provide compatibility with Relay.

The really interesting thing is the polyhedral work they announced. I think this needs close attention and action. TVM may benefit from adopting techniques from the polyhedral world. As I mentioned in https://github.com/dmlc/tvm/issues/2588, we may want to include ISL as a third-party dependency and run experiments to become more familiar with its semantics.

Edit: interesting news is coming from the Tiramisu project, which combines the polyhedral model with a scheduling language: https://arxiv.org/pdf/1804.10694.pdf. Note their criticism of Halide's approach to scheduling.


#11

+1 for this. I have notes on some of the alternative deep learning compiler work and can share them.


#12

TVM definitely needs dependency-analysis infrastructure. We can start from this point, since ISL has very powerful dependency analysis; a rough sketch is below.
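
To make the idea concrete, here is a rough sketch of the kind of relation-level dependence analysis ISL enables, using the islpy Python bindings. The loop and the statement/array names are made up for illustration, and this only looks at access relations; a full flow analysis would also take the schedule into account:

```python
# Hypothetical loop:  for (i = 0; i < 100; i++) A[i] = A[i - 1] + 1;   // S[i]
import islpy as isl

writes = isl.Map("{ S[i] -> A[i]     : 0 <= i < 100 }")
reads  = isl.Map("{ S[i] -> A[i - 1] : 0 <= i < 100 }")

# Pairs of statement instances touching the same array element:
# compose the write relation with the inverse of the read relation.
same_elem = writes.apply_range(reads.reverse())

# Keep only pairs where the writer precedes the reader (RAW dependences).
raw = same_elem.intersect(isl.Map("{ S[i] -> S[j] : i < j }"))
print(raw)  # e.g. { S[i] -> S[1 + i] : 0 <= i <= 98 }
```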

