[RFC] Support CCE target name in TVM



We want to add a CCE target to TVM. It will take time to open-source the whole CCE backend, so can we PR the CCE-target-related changes first? We need to add a device_type in dlpack, and also add the target name and device_type in c_runtime_api.cc, build_module.cc, and runtime_ctypes.py.



Please provide a bit more background, as not everyone in the community knows what CCE is.


Sure. Similar to CUDA C, CCE C is a programming language for Huawei’s AI chip, the DaVinci IP core.

We have built a CCE backend on top of TVM with the help of the community; thanks go to the community members.

The Ascend AI IP and chip series (with the unified DaVinci core inside) was released today at HC 2018; you can check it out here: https://www.huawei.com/en/press-events/news/2018/10/huawei-hc-2018-eric-xu-ai

How do you retarget TVM to a new ASIC chip as a device code generator?

@tqchen I sent a PR to dlpack.


Given the current status of CCE support, maybe it makes sense to bring kDLCCE to the tvm repo first, with some background info. Once we have some running examples, we can upstream the change to DLPack.


“bring kDLCCE to tvm repo first” —> TVM uses dlpack as a submodule, so how can I do that?


Add the flag definition to https://github.com/dmlc/tvm/blob/master/include/tvm/runtime/c_runtime_api.h#L63


Great! So I’ll close the PR on dlpack and send another one on TVM.


And please also provide background information, and hopefully a rough timeline for when the community can start to use the CCE backend :slight_smile:


It’s great to enable the programming model for the new AI chip. I think the community will take time to get familiar with the new era of ASIC-based accelerators. @xqdan can you provide more information on the following details regarding CCE C programming and the DaVinci chip?

  • CCE C Programming (Technical Specification, Programming Syntax, Programming Interface, Optimization Guidelines)
  • DaVinci chip (Computational Capacity, Availability of Development Boards)

Without those specs, it would be hard for the community to adopt the new AI chip. There are already many AI chips that don’t have any developer-friendly programming interface.


@liangfu Thanks for your attention. Actually, what we’ve been doing on TVM is trying to reduce developers’ burden of learning this detailed low-level information. Imagine that you just write the TVM DSL and don’t need to take care of the things you mentioned above.


@xqdan One thing that might be nice is to understand the set of hardware intrinsics that TVM should lower a schedule down to (for instance, are we using tensorization intrinsics, or different types of DMA loads/stores?). In addition, it might be good to understand how a programmer can expose more parallelism for the chip to take advantage of. For instance, with the VTA reference design we used virtual threads that would be lowered to low-level, dataflow-like synchronization operations to uncover task-level parallelism within the chip.

It would be nice to highlight these challenges when targeting the DaVinci chip, and perhaps to contrast it with VTA so that programmers can understand how the two relate in terms of challenges.

Overall this is very exciting stuff!