I'm seeing that in ResNet, some operations between the conv2d layers (e.g., add, relu, clip) do not run on the FPGA. Instead, they run on the ARM CPU. How can I force these operations onto the FPGA?
I know I can try to tag these as env.alu, but where should I do that?
vta_conv2d.py already tags all the ewise ops as alu. But I think these ops fall outside the conv2d scope (stop_fusion separates them into another fused_xx).
I'd appreciate any suggestions.