How do I write a correct schedule for a simple hybrid script like this:
    for i in range(batch_size):
        valid_count[i] = 0
        for j in bind('threadIdx.x', num_anchors):
            score = data[i, j, 1]
            if score > score_threshold:
                for k in bind('threadIdx.y', box_data_length):
                    out_tensor[i, valid_count[i], k] = data[i, j, k]
                valid_count[i] += 1
            if j > valid_count[i]:
                for k in bind('threadIdx.y', box_data_length):
                    out_tensor[i, j, k] = -1.0
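
A sequential reference may help when checking the output of whatever schedule ends up working. The sketch below mirrors the loops of the script in plain NumPy and ignores the `bind` calls entirely, so it only describes the result when the iterations over `j` run in order. The helper name `get_valid_counts_ref` is hypothetical (not a TVM API), and it assumes `out_tensor` starts zero-initialized, since the script leaves some rows unwritten.

```python
import numpy as np


def get_valid_counts_ref(data, score_threshold):
    """Sequential mirror of the hybrid script (hypothetical helper, not part of TVM)."""
    batch_size, num_anchors, box_data_length = data.shape
    # Assumption: rows the script never writes keep their initial value of zero.
    out_tensor = np.zeros_like(data)
    valid_count = np.zeros(batch_size, dtype="int32")
    for i in range(batch_size):
        for j in range(num_anchors):
            score = data[i, j, 1]
            if score > score_threshold:
                # Copy the whole box; this replaces the inner loop over k.
                out_tensor[i, valid_count[i], :] = data[i, j, :]
                valid_count[i] += 1
            # Same condition as the script, kept as-is.
            if j > valid_count[i]:
                out_tensor[i, j, :] = -1.0
    return valid_count, out_tensor


# Small example: one batch, three anchors, box_data_length of 2.
data = np.array([[[0.0, 0.9], [1.0, 0.1], [2.0, 0.8]]], dtype="float32")
valid_count, out_tensor = get_valid_counts_ref(data, score_threshold=0.5)
```

Because the real script binds `j` to `threadIdx.x`, the iterations there are carried out by different threads, so a GPU result can only be expected to match this reference if the updates to `valid_count[i]` are ordered the same way.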
Currently, the default schedule (https://github.com/dmlc/tvm/blob/master/topi/python/topi/cuda/vision.py#L10), which works for ir_builder, doesn’t work for the hybrid script. It fails with the error: forget binding…
@were