TSIM Cycle Measurement Question for Parallel Module


#1

Hi @vegaluis,

I just ran test_vta_insn.py and saw performance numbers counted in cycles, for example 8579 cycles for GEMM.

One question: in a task-level pipeline parallelism scenario, the load and compute modules run in parallel, and the store module runs after compute finishes. If we assume the load module takes 4000 cycles, the compute module takes 4500 cycles, and the store module takes 79 cycles, is the overall cycle count simply the sum of the three, as if all three modules ran serially?

Or can TSIM handle the parallel scenario: does it account for synchronization cost and accurately track when each module is waiting or running, even when the simulator thread gets swapped out by the process scheduler? Could you give some details on how TSIM keeps the cycle count accurate in the parallel-module scenario?

Regards
Hua


#2

The granularity and the type of event of interest can be defined by users if they need to. I just happened to implement the most common measurement, which is counting the cycles from launching VTA until it finishes.

See here

If you are interested in counting something else, you can add it (for example, a counter that runs from when some module starts until it is done). You can also add multiple counters; you just need to modify the VCR.
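To illustrate the idea of one counter per module, here is a minimal sketch in plain Python. It is a toy cycle-stepped model, not Chisel and not the actual VCR code; `CycleCounter`, `simulate`, and the `start_after` dependency map are hypothetical names made up for this example. Each counter only ticks on cycles where its module is busy, which is roughly what a start-to-done counter in hardware would do.

```python
class CycleCounter:
    """Counts the cycles during which a module is busy (hypothetical helper)."""

    def __init__(self):
        self.count = 0

    def tick(self, busy):
        if busy:
            self.count += 1


def simulate(durations, start_after):
    """Step a toy clock until every module has finished.

    durations:   module name -> number of busy cycles it needs
    start_after: module name -> module it must wait for (missing means no wait)
    Returns (total_cycles, per-module busy cycle counts).
    """
    remaining = dict(durations)
    counters = {name: CycleCounter() for name in durations}
    total = 0
    while any(r > 0 for r in remaining.values()):
        # Decide who is busy from the state at the start of this cycle,
        # so that all modules advance together like clocked hardware.
        busy = {}
        for name in durations:
            dep = start_after.get(name)
            dep_done = dep is None or remaining[dep] == 0
            busy[name] = remaining[name] > 0 and dep_done
        for name, is_busy in busy.items():
            counters[name].tick(is_busy)
            if is_busy:
                remaining[name] -= 1
        total += 1
    return total, {name: c.count for name, c in counters.items()}


total, per_module = simulate(
    {"load": 4000, "compute": 4500, "store": 79},
    {"store": "compute"},  # store waits for compute, load and compute overlap
)
print(per_module)  # {'load': 4000, 'compute': 4500, 'store': 79}
```

The per-module counts tell you how long each block was active, while `total` is the single launch-to-finish number. In a real design the equivalent counters would be registers exposed to the host, and the dependency handling would come from the hardware itself rather than a Python dictionary.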

Regarding the parallel scenario: TSIM is based on Verilator, which is an RTL simulator, so it simulates the hardware as it inherently is (parallel). If you take task-level parallelism out of the architecture, the cycle count will go up.
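To make that concrete with the numbers from the question, here is a back-of-the-envelope calculation, not TSIM code. It assumes an idealized case where load and compute overlap completely and store starts only once compute is done; the real overlap in VTA depends on how the modules synchronize, so treat the result as illustrative. As far as I understand, the count is in simulated clock cycles, so the host scheduler swapping the simulator thread out changes wall-clock time but not the cycle count. The function names are made up for this example.

```python
def serialized_cycles(load, compute, store):
    """Total cycles if the three modules ran strictly one after another."""
    return load + compute + store


def overlapped_cycles(load, compute, store):
    """Total cycles if load and compute fully overlap and store follows compute.

    Idealized assumption: no stalls between the modules.
    """
    return max(load, compute) + store


load, compute, store = 4000, 4500, 79
print(serialized_cycles(load, compute, store))  # 8579
print(overlapped_cycles(load, compute, store))  # 4579
```

The gap between the two numbers is the kind of difference you would expect to see if task-level parallelism were removed from the design.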


#3

@vegaluis, great to know the simulation is parallel at its core. Thanks for the kind reply.