Question on operator fusion

hashjoin · October 1, 2019, 10:11pm

Hey, I have a general question. Operator fusion is described as combining multiple operators into a single kernel without saving the intermediate results in memory. From my understanding, it achieves efficient communication by avoiding frequent accesses to global memory. Right? I just want to know what’s the difference from using shared memory. Let's say, couldn't we also avoid the inter-kernel communication cost by passing the intermediate results through shared memory?

hashjoin · October 2, 2019, 4:14pm

Could anyone help me?

lygztq · June 24, 2021, 9:15am

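I think shared memory only has the same lifetime as the thread block: it no longer exists once the kernel has finished, so a later kernel cannot read what an earlier kernel stored there. Without fusion, the intermediate result has to go through global memory.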
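
To make that concrete, here is a rough toy model in plain Python, not real GPU code. `GlobalMemory`, `scale_kernel`, `add_one_kernel`, and `fused_kernel` are names made up for this sketch. Each function stands in for one kernel launch, and its local variables play the role of registers or shared memory, which are gone when the function returns. The counters only track global-memory traffic.

```python
class GlobalMemory:
    """Toy stand-in for GPU global memory that counts element accesses."""

    def __init__(self):
        self.buffers = {}
        self.reads = 0
        self.writes = 0

    def alloc(self, name, data):
        # Host-side allocation; not counted as kernel traffic.
        self.buffers[name] = list(data)

    def load(self, name, i):
        self.reads += 1
        return self.buffers[name][i]

    def store(self, name, i, value):
        self.writes += 1
        self.buffers[name][i] = value


def scale_kernel(gmem, src, dst, n):
    # "Kernel 1": dst = src * 2. Locals vanish when the kernel returns,
    # so the result must be written to global memory to survive.
    for i in range(n):
        gmem.store(dst, i, gmem.load(src, i) * 2.0)


def add_one_kernel(gmem, src, dst, n):
    # "Kernel 2": dst = src + 1. It has to re-read the intermediate
    # from global memory.
    for i in range(n):
        gmem.store(dst, i, gmem.load(src, i) + 1.0)


def fused_kernel(gmem, src, dst, n):
    # Fused version: the intermediate stays in a local variable
    # (think register) and never touches global memory.
    for i in range(n):
        tmp = gmem.load(src, i) * 2.0
        gmem.store(dst, i, tmp + 1.0)


def run_unfused(data):
    gmem = GlobalMemory()
    n = len(data)
    gmem.alloc("x", data)
    gmem.alloc("tmp", [0.0] * n)  # intermediate buffer in global memory
    gmem.alloc("y", [0.0] * n)
    scale_kernel(gmem, "x", "tmp", n)
    add_one_kernel(gmem, "tmp", "y", n)
    return gmem.buffers["y"], gmem.reads, gmem.writes


def run_fused(data):
    gmem = GlobalMemory()
    n = len(data)
    gmem.alloc("x", data)
    gmem.alloc("y", [0.0] * n)
    fused_kernel(gmem, "x", "y", n)
    return gmem.buffers["y"], gmem.reads, gmem.writes
```

In this toy model, both versions produce the same output for an input of three elements, but the unfused pair does 6 global loads and 6 global stores while the fused kernel does 3 of each. So shared memory and fusion are not really alternatives: shared memory helps threads within one block of one kernel, and fusion is what puts both operators into the same kernel in the first place.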