[VTA] How to run add and other elementwise operations on the FPGA/VTA

I’m seeing that in ResNet, between the conv2d layers, there are some operations (e.g., add, relu, clip) that do not run on the FPGA. Instead, they run on the ARM CPU. How can I force these operations onto the FPGA?

I know I can try to tag these as env.alu, but where should I do that?

vta_conv2d.py already tags all the elementwise ops as alu. But I think these ops are outside the conv2d scope (stop_fusion separates them into another fused_xx function).

I’d appreciate any suggestions.

They’re actually running on the FPGA, specifically on the ALU:

  • The ReLU operator, i.e. max(x, 0), is transformed to use the MAX instruction
  • The clip operator is split into two steps: a MIN instruction followed by a MAX instruction
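
To make that decomposition concrete, here is a minimal sketch in plain Python. It is not VTA or TVM code: alu_min and alu_max are hypothetical helper names standing in for an elementwise ALU operation against an immediate value, and the clip bounds 0 and 127 are assumed for illustration only.

```python
# Plain-Python illustration only. alu_min/alu_max are hypothetical stand-ins
# for an elementwise ALU operation against an immediate; they are not part of
# the VTA or TVM APIs.

def alu_min(xs, imm):
    """Elementwise min of each value against an immediate."""
    return [min(x, imm) for x in xs]

def alu_max(xs, imm):
    """Elementwise max of each value against an immediate."""
    return [max(x, imm) for x in xs]

def relu(xs):
    # ReLU is max(x, 0), so it needs a single elementwise pass.
    return alu_max(xs, 0)

def clip(xs, lo, hi):
    # clip(x, lo, hi) = max(min(x, hi), lo): two elementwise passes.
    return alu_max(alu_min(xs, hi), lo)

data = [-200, -5, 0, 42, 300]
relu_out = relu(data)          # negatives become 0
clip_out = clip(data, 0, 127)  # values limited to the assumed range [0, 127]
print(relu_out)
print(clip_out)
```

The point of the sketch is only that clip needs two passes over the data while ReLU needs one; how the real schedule maps these passes to ALU instructions depends on the VTA operator definitions.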

Actually, I mean these ops (as shown in the attached figure). I think they are operating in “int8”?

I tried printing the VTA instructions at runtime. These ops (“fused_add_nn_relu_clip_3”) do not seem to issue any VTA instructions.