Graph partitioning and Heterogeneous Execution


#21

@zhiics I’m curious about the timeline for this item - this will be particularly useful for our upcoming FPGA backends (especially for PCI-E based FPGA platforms).

Thanks!


#22

@thierry I think it is on the 0.5 roadmap. I have something working now, and we tested it successfully on a machine with an Intel CPU and Intel Graphics. I am focusing more on some design details now.


#23

Thanks, this will be very useful for FPGA backends down the road. Is a PR for this already open? If not, what’s the timeline? Thank you


#24

@thierry It’s currently in my private repo. We are testing it with some use cases. If you want it soon, I can probably add you to the repo and we can start from there. Sounds like a plan?


#25

That would be fantastic! I’d be happy to provide feedback as well with respect to our FPGA examples. My github ID is tmoreau89.


#26

Is it possible to put the WIP in a public fork? I assume many people watching this thread would be interested in what is going on, and it helps make things more accessible to the broader community.


#27

Good idea, if that’s not too much of a hassle, it would be great to have the community also provide feedback on your WIP.


#28

@tqchen @thierry Sounds good. I can check with somebody internally and see how to proceed.


#29

Got the approval and sent the WIP out: https://github.com/dmlc/tvm/pull/1688.


#30

What do you think about having a dedicated module that handles memory syncing across devices, rather than explicit copy nodes in the graph? Maybe too much of a rewrite… @tqchen
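
For readers following along, here is a rough sketch of what the "explicit copy node in graph" approach could look like on a toy graph. This is not the code from the PR and does not use any TVM API: `Node`, `insert_copy_nodes`, and the `copy_<src>_to_<device>` naming are all hypothetical, and the node list is assumed to be in topological order.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical graph node: a name, a device label, and producer names."""
    name: str
    device: str
    inputs: list = field(default_factory=list)


def insert_copy_nodes(nodes):
    """Return a new node list where every cross-device edge goes through
    an explicit copy node. Assumes `nodes` is topologically ordered."""
    by_name = {n.name: n for n in nodes}
    result = []
    copies = {}  # (producer name, target device) -> copy node name
    for node in nodes:
        new_inputs = []
        for src in node.inputs:
            if by_name[src].device == node.device:
                # Same device: keep the edge as it is.
                new_inputs.append(src)
                continue
            key = (src, node.device)
            if key not in copies:
                # First consumer on this device: emit one copy node and
                # reuse it for any later consumer on the same device.
                copy_name = f"copy_{src}_to_{node.device}"
                result.append(Node(copy_name, node.device, [src]))
                copies[key] = copy_name
            new_inputs.append(copies[key])
        result.append(Node(node.name, node.device, new_inputs))
    return result


# Toy graph: input on the CPU, two ops on the GPU, final op back on the CPU.
graph = [
    Node("data", "cpu"),
    Node("conv", "gpu", ["data"]),
    Node("relu", "gpu", ["conv"]),
    Node("softmax", "cpu", ["relu"]),
]
partitioned = insert_copy_nodes(graph)
names = [n.name for n in partitioned]
```

With copy nodes, the data movement is visible in the graph itself, so the executor stays simple. A dedicated syncing module would instead leave the graph untouched and decide at run time when a tensor has to move, which is presumably where the larger rewrite would come from.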