Hello! In the GPU conv tutorial I see the following two lines:
s[output].pragma(kernel_scope, 'auto_unroll_max_step', cfg['auto_unroll_max_step'].val)
s[output].pragma(kernel_scope, 'unroll_explicit', cfg['unroll_explicit'].val)
I’m curious whether these pragmas cover all the operators/nested loops (e.g. AA and WW in the tutorial) that are attached under kernel_scope, which is the first iter_var of output. More generally, suppose I have a schedule that looks like this:
for A {
  for B {
  }
  for C {
  }
}
If I set kernel_scope to be the for A loop, do the pragmas also take effect on the for B and for C loops? If I want the for B and for C loops to follow different pragmas, should I define two separate groups of tunable knobs, one for each of them?
Thanks in advance!