I think BYOC assumes that your new AI accelerator can be programmed in C/C++. If the accelerator doesn't support that, I think these steps might be necessary:
- set `target=ext_dev` and `-device=new_AI_accelerator_name`
- optionally, quantize the input model into a target precision that the new accelerator supports, e.g. int8, bfloat16
- define the instruction layout (load, compute, store, etc.)
- implement the host-device memory interface (e.g. dma_copy: dram->sram, sram->dram)
- implement a runtime and driver for your target device, so that the instruction execution sequence and the dependencies between instructions are handled properly
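
To make the last three steps a bit more concrete, here is a rough toy sketch in plain Python (not TVM code). Everything in it is made up for illustration: the 32-bit instruction layout, the `Opcode` values, the `ToyDevice` class and its `dma_copy` method are hypothetical, and the "compute" step is just an in-place ReLU. A real runtime would also need explicit dependency tracking between the load, compute and store stages; this sketch simply executes instructions in program order, which trivially satisfies the dependencies.

```python
from enum import IntEnum


class Opcode(IntEnum):
    LOAD = 0     # dram -> sram
    COMPUTE = 1  # operate on sram in place
    STORE = 2    # sram -> dram


# Hypothetical 32-bit layout (an assumption, not a real ISA):
# [31:30] opcode | [29:20] sram addr | [19:10] dram addr | [9:0] length
FIELD_MASK = 0x3FF


def encode(op, sram_addr, dram_addr, length):
    """Pack one instruction into a 32-bit word."""
    for field in (sram_addr, dram_addr, length):
        if not 0 <= field <= FIELD_MASK:
            raise ValueError("field does not fit in 10 bits")
    return (int(op) << 30) | (sram_addr << 20) | (dram_addr << 10) | length


def decode(word):
    """Unpack a 32-bit word into (opcode, sram addr, dram addr, length)."""
    return (
        Opcode(word >> 30),
        (word >> 20) & FIELD_MASK,
        (word >> 10) & FIELD_MASK,
        word & FIELD_MASK,
    )


class ToyDevice:
    """Stand-in for the accelerator plus its driver."""

    def __init__(self, dram_size=1024, sram_size=1024):
        self.dram = [0] * dram_size
        self.sram = [0] * sram_size

    def dma_copy(self, src, src_off, dst, dst_off, n):
        # Host-device memory interface: copy n elements between memories.
        dst[dst_off:dst_off + n] = src[src_off:src_off + n]

    def run(self, program):
        # In-order execution: each instruction finishes before the next starts.
        for word in program:
            op, sram_addr, dram_addr, n = decode(word)
            if op == Opcode.LOAD:
                self.dma_copy(self.dram, dram_addr, self.sram, sram_addr, n)
            elif op == Opcode.COMPUTE:
                for i in range(sram_addr, sram_addr + n):
                    self.sram[i] = max(0, self.sram[i])  # toy ReLU
            elif op == Opcode.STORE:
                self.dma_copy(self.sram, sram_addr, self.dram, dram_addr, n)


dev = ToyDevice()
dev.dram[0:4] = [-3, 5, -1, 7]
program = [
    encode(Opcode.LOAD, 0, 0, 4),     # dram[0:4]  -> sram[0:4]
    encode(Opcode.COMPUTE, 0, 0, 4),  # ReLU on sram[0:4]
    encode(Opcode.STORE, 0, 8, 4),    # sram[0:4]  -> dram[8:12]
]
dev.run(program)
print(dev.dram[8:12])
```

On real hardware the compiler backend would emit the encoded instruction stream, and the driver would replace `dma_copy` and `run` with actual DMA transfers and command submission.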