I am a master's student currently working on the design of a deep learning accelerator simulator. I have implemented the physical model of the compute units (mainly matrix-multiply arrays) and have a cache simulator available. What I need to do now is build models for the activation function unit, the pooling unit, and an algebraic computation unit for error backpropagation, plus the datapath that connects all of these components together.

I found that the architecture and datapath in VTA are a good fit, but my research targets new devices based on emerging non-volatile technology, so I mainly plan to reuse the parts of VTA other than its compute unit. My plan is to configure the bus bandwidth according to the size of the DL model and then generate the RTL code to obtain latency, area, and power information.

Is this a plausible plan? Which files should I look into and change in order to remove the compute unit without breaking the HLS flow that generates the RTL, while keeping the design usable for further simulation?
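
To make the bandwidth-sizing step concrete, here is a minimal sketch of the kind of calculation I have in mind. It is only an illustration under stated assumptions: the function name `min_bus_width_bits`, the cycle budget, and the example byte counts are hypothetical and are not taken from VTA or any of its configuration files.

```python
def min_bus_width_bits(total_bytes: int, cycle_budget: int) -> int:
    """Smallest power-of-two bus width (in bits) that can move
    `total_bytes` of model data within `cycle_budget` bus cycles.

    Hypothetical helper: assumes one bus transfer per cycle and
    ignores protocol overhead, bursts, and contention.
    """
    if total_bytes <= 0 or cycle_budget <= 0:
        raise ValueError("total_bytes and cycle_budget must be positive")

    total_bits = total_bytes * 8
    # Ceiling division: bits that must be transferred every cycle.
    bits_per_cycle = (total_bits + cycle_budget - 1) // cycle_budget

    # Round up to the next power of two, since bus widths are
    # usually chosen from powers of two.
    width = 1
    while width < bits_per_cycle:
        width *= 2
    return width


# Example (made-up numbers): 1.2 MB of weights and activations,
# to be moved within 100,000 cycles.
example_width = min_bus_width_bits(1_200_000, 100_000)
print(example_width)
```

The resulting width would then be the value I feed into the hardware configuration before running HLS, assuming the bus width is exposed as a configurable parameter.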