[AutoTVM] Selective tuning of hotspots

Dear community,

I’m currently trying to reduce overall AutoTVM runtimes by selectively tuning only the kernels that are actual hotspots in the application.

Hotspot detection can be performed fairly easily, e.g. by using the debug runtime which gives a detailed callgraph profile when executing run().

My question is how to match these identified operations to the AutoTVM selected kernels.

On the one hand, the profile information is a prioritized list of nodes, mostly identified by their LLVM IR names.

On the other hand, when selecting the tasks to be tuned, kernels = autotvm.task.extract_from_program(ir["main"], target=target, params=params, ops=None) gives you a list of Task objects, e.g.:

Task(func_name=dense_nopack.x86, args=(('TENSOR', (1, 16), 'float32'), ('TENSOR', (64, 16), 'float32'), None, 'float32'), kwargs={}, workload=('dense_nopack.x86', ('TENSOR', (1, 16), 'float32'), ('TENSOR', (64, 16), 'float32'), None, 'float32'))
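For reference, the workload tuple printed above already carries the two pieces of information a match could use: the operator name and the tensor shapes. Here is a minimal sketch of pulling them out, using plain Python tuples rather than real Task objects. The helper name summarize_workload is hypothetical, and the sketch assumes every tensor argument has the ('TENSOR', shape, dtype) form shown in the example.

```python
def summarize_workload(workload):
    """Split an AutoTVM-style workload tuple into (op name, tensor shapes).

    Assumption: tensor arguments look like ('TENSOR', shape, dtype), as in
    the Task printed above. Everything else (None, dtype strings) is skipped.
    """
    name, *args = workload
    shapes = tuple(
        tuple(arg[1])
        for arg in args
        if isinstance(arg, tuple) and len(arg) == 3 and arg[0] == "TENSOR"
    )
    return name, shapes


# The workload copied from the Task shown above.
workload = (
    "dense_nopack.x86",
    ("TENSOR", (1, 16), "float32"),
    ("TENSOR", (64, 16), "float32"),
    None,
    "float32",
)

print(summarize_workload(workload))
# -> ('dense_nopack.x86', ((1, 16), (64, 16)))
```

With real Task objects, the same tuple should be available as the workload field shown in the printout.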

So, concretely: how can such Tasks be matched to their IR counterparts?

Any help, ideas, suggestions are much appreciated! Thank you & Best regards

It’s a bit tricky. For now you can only match op type and shape.

I see! That’s a pity if convolutions with the same shapes coexist… Maybe they can still be associated somehow… Anyhow, thank you very much for your answer :slight_smile:

Well… if two or more convs have the same shape (both input and weight), then they will be the same tuning task. The tricky part is that it’s not straightforward to see the weight shape from the debug runtime log.
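To illustrate the op type plus shape matching, here is a rough sketch in plain Python. It assumes you have somehow already recovered the operator kind and both input and weight shapes for each profiled node, which is exactly the hard part mentioned above. The profile entry layout (name, op, shapes, time_us), the node names, and the timings are all made up for illustration and are not what the debug runtime prints. Matching the op by name prefix is also just a heuristic.

```python
from collections import defaultdict


def tensor_shapes(workload):
    """Collect shapes of ('TENSOR', shape, dtype) arguments in a workload tuple."""
    return tuple(
        tuple(arg[1])
        for arg in workload[1:]
        if isinstance(arg, tuple) and len(arg) == 3 and arg[0] == "TENSOR"
    )


def match_hotspots(profile, workloads):
    """Sum the profiled time of all nodes that match each workload.

    A node matches when the workload name starts with the node's op kind
    (heuristic) and the tensor shapes are identical.
    """
    totals = defaultdict(float)
    for node in profile:
        node_shapes = tuple(tuple(s) for s in node["shapes"])
        for wl in workloads:
            if wl[0].startswith(node["op"]) and tensor_shapes(wl) == node_shapes:
                totals[wl] += node["time_us"]
    return dict(totals)


# Workloads as they would appear in Task.workload (second one is invented).
wl_a = ("dense_nopack.x86", ("TENSOR", (1, 16), "float32"),
        ("TENSOR", (64, 16), "float32"), None, "float32")
wl_b = ("dense_nopack.x86", ("TENSOR", (1, 64), "float32"),
        ("TENSOR", (10, 64), "float32"), None, "float32")

# Hypothetical, hand-written profile entries (not real debug runtime output).
profile = [
    {"name": "fused_nn_dense", "op": "dense", "shapes": [(1, 16), (64, 16)], "time_us": 40.0},
    {"name": "fused_nn_dense_1", "op": "dense", "shapes": [(1, 16), (64, 16)], "time_us": 35.0},
    {"name": "fused_nn_dense_2", "op": "dense", "shapes": [(1, 64), (10, 64)], "time_us": 5.0},
]

totals = match_hotspots(profile, [wl_a, wl_b])
# Rank workloads by accumulated time, hottest first.
hotspots = sorted(totals, key=totals.get, reverse=True)
```

Note that the two nodes with identical shapes fall onto the same workload, so their times add up and the task only needs to be tuned once.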

Damn, you are right! Hm… matching the debug runtime output to the LLVM IR is fairly easy, but I don’t know whether the shapes are somehow encoded in the LLVM IR. My guess: no…