I am trying to call cblas libraries when “cblas” is specified in the target libs and the target is x86. This needs to happen in two places: dense and batch_matmul.
Dense is straightforward because x86 already overrides both its compute and its schedule. However, batch_matmul only has an overridden schedule; every target calls the same compute function. Further, batch_matmul doesn’t support autotvm tuning. How can I define a generic compute and override it per target?
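To make the question concrete, here is a minimal pure-Python sketch of the kind of dispatch I have in mind. It does not use TVM's actual API: `GenericFunc`, the `"cpu"` target key, the `libs` argument, and the string return values are all hypothetical stand-ins for a real compute definition.

```python
class GenericFunc:
    """Hypothetical stand-in for a function that can be overridden per target."""

    def __init__(self, default):
        self._default = default      # fallback used when no override is registered
        self._overrides = {}         # target key -> target-specific implementation

    def register(self, target_key):
        """Decorator that registers an override for one target key."""
        def decorator(func):
            self._overrides[target_key] = func
            return func
        return decorator

    def __call__(self, target_key, *args, **kwargs):
        # Pick the target-specific override if there is one, else the default.
        func = self._overrides.get(target_key, self._default)
        return func(*args, **kwargs)


def _default_batch_matmul(x, y, libs=()):
    # Placeholder for the shared compute that every target uses today.
    return "generic batch_matmul compute"


batch_matmul = GenericFunc(_default_batch_matmul)


@batch_matmul.register("cpu")
def _batch_matmul_x86(x, y, libs=()):
    # Assumed behavior: use the external library only when it was requested.
    if "cblas" in libs:
        return "cblas batch_matmul extern call"
    return _default_batch_matmul(x, y, libs)
```

With this shape, `batch_matmul("cpu", x, y, libs=("cblas",))` would take the cblas path, while any other target, or x86 without cblas in its libs, falls back to the shared compute.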
Should I use the same methodology as the generic schedules in topi/python/topi/generic/nn.py? Or should I still use the