Update matmul to support more options

  1. Strides are supported and tested via tvm-clj.
  2. Code is much more clear.
  3. double support as well as float.

I will do a bit more testing. I searched the codebase and it appears the gemm code was just cut/pasted for all the contrib matmul’s with little apparent effort to unify approaches.

I propose:

  1. Header file with unified approach that all implementations can use.
  2. Update all matmuls; they will then mainly just forward to unified approach with different gemm operators.
  3. Add support for optional alpha, beta parameters. Old code would continue to work with no changes; the argvec.

This would update all matmuls in contrib. I cannot test all those different options, however, so I would need help with that. I could test mkl, cpu, cuda, and if the intel or nvidia opencl implementations support it the rocblas pathway.

Does this sound like a good plan? This may allow tvm to be used in systems that, for instance, are using gemm to sum gradients into an accumulator (beta of 1).

yes, I added cublas and rocblas by copy-pasting cblas code with minimal effort :slight_smile:

I like your change. You don’t have to test on rocblas. rocblas is for rocm backend only, so it requires AMDGPU. It is not compatible with opencl.

If we want to introduce something like gemm(with alpha support) maybe we need to rename the external operator as gemm. contributions are more than welcomed

I am hesitant to rename as I do not want to break anyone’s stuff. I could add a new operator called gemm (in the same files) that supports alpha, beta.

The changes I am making are not just adding alpha, beta. It is also verifying the implementation to work with matrixes with strides (submatrixes). Also, for the cuda implementation the stream and handle management were not done as well as contrib.cudnn so I am following that model.

My vote is to change nothing and just have more things work than would have otherwise but if there is some interface that matmul is supposed to adhere to (because, for instance, topi.matmul is only capable of it) then I should rename.

Opening this up for review now.

I haven’t looked at the roc support. Because I don’t have code to build it, I am not comfortable changing that code as I cannot test it.