[RFC][VTA] bfloat16/fp8/fp9/posit tensor core



Over the past year, we (Stillwater and Calligo) have developed several posit data paths and posit data path generators. We are now looking at integrating these data paths into VTA through a Chisel data path generator.

One observation that came out of our research is that, at the arithmetic level, all of these number systems (fp32, fp16, bfloat16, ms-fp8/fp9, and all the posit configurations) can be unified in a single data path. The only differences are the minimum sizes of the scale and the significand that each floating point representation requires.
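To illustrate what "unified at the arithmetic level" can mean, here is a minimal Python sketch that decodes both IEEE-style encodings and posits into the same (sign, scale, significand) triple. The function names `decode_ieee_like`, `decode_posit`, and `to_value` are hypothetical and not part of VTA or of our generators. The sketch uses exact rationals instead of hardware bit vectors, normalizes subnormals, and leaves out infinities, NaN, and NaR.

```python
from fractions import Fraction


def decode_ieee_like(bits, exp_bits, frac_bits):
    """Decode a sign/exponent/fraction encoding into (sign, scale, significand)."""
    sign = (bits >> (exp_bits + frac_bits)) & 1
    exp = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    frac = bits & ((1 << frac_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    if exp == (1 << exp_bits) - 1:
        raise ValueError("infinity/NaN are not handled in this sketch")
    if exp == 0:
        if frac == 0:
            return sign, 0, Fraction(0)
        # Subnormal: shift left until the leading 1 becomes the hidden bit.
        shift = frac_bits - frac.bit_length() + 1
        return sign, 1 - bias - shift, Fraction(frac << shift, 1 << frac_bits)
    return sign, exp - bias, 1 + Fraction(frac, 1 << frac_bits)


def decode_posit(bits, nbits, es):
    """Decode a posit<nbits, es> encoding into (sign, scale, significand)."""
    mask = (1 << nbits) - 1
    bits &= mask
    if bits == 0:
        return 0, 0, Fraction(0)
    if bits == 1 << (nbits - 1):
        raise ValueError("NaR is not handled in this sketch")
    sign = bits >> (nbits - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are stored in two's complement
    # Regime: a run of identical bits starting just below the sign bit.
    i = nbits - 2
    first = (bits >> i) & 1
    run = 0
    while i >= 0 and ((bits >> i) & 1) == first:
        run += 1
        i -= 1
    k = run - 1 if first else -run
    remaining = max(i, 0)  # bits left below the regime terminator
    tail = bits & ((1 << remaining) - 1)
    # Exponent: up to es bits; bits cut off by the regime count as zero.
    e_avail = min(es, remaining)
    exp = (tail >> (remaining - e_avail)) << (es - e_avail)
    frac_bits = remaining - e_avail
    frac = tail & ((1 << frac_bits) - 1)
    return sign, k * (1 << es) + exp, 1 + Fraction(frac, 1 << frac_bits)


def to_value(sign, scale, significand):
    """Reassemble the exact rational value from a triple."""
    return (-1) ** sign * significand * Fraction(2) ** scale
```

For example, `decode_ieee_like(0x3C00, 5, 10)` (fp16), `decode_ieee_like(0x3F80, 8, 7)` (bfloat16), and `decode_posit(0x40, 8, 0)` all produce the triple for 1.0. Everything downstream of the decoder can then operate on triples without knowing which encoding they came from.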

We want to design a unified floating point pipeline for VTA that will be able to support the key floating point formats: fp16/fp32/bfloat16/ms-fp8/ms-fp9/posit<[4-16],[0-3]>.
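As a rough illustration of how a generator might size such a pipeline, the following Python sketch computes the scale width and significand width each format needs and takes the maximum over the selected formats. The helper names and the `FORMATS` table are hypothetical. The sketch assumes subnormals are normalized into the triple, which extends the scale range downward. ms-fp8 and ms-fp9 are left out because their field widths are not given in this post.

```python
def signed_bits(lo, hi):
    """Smallest two's-complement width that holds every integer in [lo, hi]."""
    w = 1
    while -(1 << (w - 1)) > lo or hi > (1 << (w - 1)) - 1:
        w += 1
    return w


def ieee_like_sizes(exp_bits, frac_bits):
    """(scale bits, significand bits) for a sign/exponent/fraction format."""
    bias = (1 << (exp_bits - 1)) - 1
    lo = 1 - bias - frac_bits  # scale of the smallest subnormal once normalized
    hi = bias                  # scale of the largest finite value
    return signed_bits(lo, hi), frac_bits + 1  # +1 for the hidden bit


def posit_sizes(nbits, es):
    """(scale bits, significand bits) for posit<nbits, es>."""
    max_scale = (nbits - 2) << es         # scale of maxpos; minpos is its negation
    frac_bits = max(nbits - 3 - es, 0)    # sign + shortest regime (2 bits) + es
    return signed_bits(-max_scale, max_scale), frac_bits + 1


# Hypothetical selection of formats for one pipeline instance.
FORMATS = {
    "fp16": ieee_like_sizes(5, 10),
    "bfloat16": ieee_like_sizes(8, 7),
    "fp32": ieee_like_sizes(8, 23),
    "posit<8,0>": posit_sizes(8, 0),
    "posit<16,1>": posit_sizes(16, 1),
}

unified_scale_bits = max(s for s, _ in FORMATS.values())
unified_significand_bits = max(m for _, m in FORMATS.values())
```

With this selection the widest requirements come from bfloat16/fp32 for the scale and from fp32 for the significand. Dropping fp32 from the table shrinks the significand path considerably, which is the kind of trade-off a parameterized generator would expose.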

There are several key components whose parameterization will require investigation. The list includes:

1- the super accumulators needed to support the possible fused dot products and mixed-precision tensor operations
2- the register files holding the triple format (sign, scale, fraction)
3- the size and exceptions around the scale processing
4- the size and rounding operations around the significand processing
5- whether intermediate results are rounded or rounding is deferred
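To make items 1 and 5 concrete, here is a small Python sketch of a conservative super accumulator width estimate and of the difference between rounding after every step and deferring rounding to the end of a dot product. All names are hypothetical. Exact rationals stand in for the wide fixed-point accumulator, rounding is round-to-nearest-even on the significand only, and range exceptions are ignored.

```python
from fractions import Fraction


def accumulator_bits(min_lsb_exp, max_scale, guard_bits):
    """Conservative width of a fixed-point register that holds exact products.

    A product of two values occupies bit weights from 2*min_lsb_exp up to
    2*max_scale + 1; guard bits absorb carries, plus one sign bit.
    """
    product_bits = (2 * max_scale + 1) - (2 * min_lsb_exp) + 1
    return product_bits + guard_bits + 1


def round_to_significand(x, p):
    """Round x to p significand bits, ties to even, with unbounded scale."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    mag = abs(Fraction(x))
    # floor(log2(mag)) computed without floating point
    scale = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** scale > mag:
        scale -= 1
    ulp = Fraction(2) ** (scale - (p - 1))
    return sign * round(mag / ulp) * ulp  # round() on Fraction is half-to-even


def dot_rounded_each_step(a, b, p):
    """Dot product that rounds every product and every partial sum."""
    acc = Fraction(0)
    for x, y in zip(a, b):
        acc = round_to_significand(acc + round_to_significand(x * y, p), p)
    return acc


def dot_fused(a, b, p):
    """Dot product accumulated exactly and rounded once at the end."""
    exact = sum((Fraction(x) * Fraction(y) for x, y in zip(a, b)), Fraction(0))
    return round_to_significand(exact, p)


a = [Fraction(16), Fraction(1), Fraction(1), Fraction(-16)]
b = [Fraction(1)] * 4
stepwise = dot_rounded_each_step(a, b, p=4)
fused = dot_fused(a, b, p=4)
```

With a 4-bit significand, the stepwise version loses both small terms (16 + 1 rounds back to 16 each time) and returns 0, while the fused version returns the exact answer 2. For fp16 with subnormals (smallest LSB weight 2^-24, largest scale 15), `accumulator_bits(-24, 15, 0)` gives 81 bits before any guard bits are added. That width is what makes the accumulator parameterization a first-order design question.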

We are looking for collaborators to help create a data path generator for VTA, so that VTA can be used to study learning rates and recall accuracy across all of these number systems.