Wrong padding when applying the pad operator to two feature maps inside the same schedule

I need to pad two feature maps so that I can correlate them afterwards. For that I either define the padding computation myself or use the topi pad operator. The problem is that when I pad both feature maps, one is padded correctly but the other is not, which gives meaningless results when I run my correlation operator. The three computations (padding for feature map “A”, padding for feature map “B”, and the correlation operator “C”) are defined inside the same schedule. The shape of the original feature maps is {batch_size: 3, channels: 256, height: 64, width: 48} and pad_size is 20. The second feature map is padded correctly while the first is not. Any suggestions on how to solve this problem?
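To make clear what I mean by "correctly padded", here is a minimal sketch in plain Python of the behaviour I expect for a single 2D feature map. The helper name pad_hw is just illustrative (it is not a TVM or topi function), and I am assuming zero padding of pad on every side of height and width:

```python
def pad_hw(fmap, pad):
    """Zero-pad a 2D feature map (a list of rows) by `pad` on every side.

    Illustrative helper only, not a TVM/topi API.
    """
    h, w = len(fmap), len(fmap[0])
    out = []
    for yy in range(h + 2 * pad):
        row = []
        for xx in range(w + 2 * pad):
            # Inside the original extent: read the shifted input element.
            if pad <= yy < pad + h and pad <= xx < pad + w:
                row.append(fmap[yy - pad][xx - pad])
            else:
                # Border region: fill with zero.
                row.append(0)
        out.append(row)
    return out


small = [[1, 2],
         [3, 4]]
padded = pad_hw(small, 1)
# The output has height 2 + 2*1 = 4 and width 2 + 2*1 = 4.
```

So for my 64x48 maps with pad_size 20, I expect both padded maps to be 104x88.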

I attach the lowered computation for the whole operator (in this case I first tried topi’s operator and afterwards my own padding operator):

// attr [Apad] storage_scope = "global"
allocate Apad[float32 * 2359296]
// attr [Bpad] storage_scope = "global"
allocate Bpad[float32 * 7028736]
produce Apad {
  for (i0, 0, 3) {
    for (i1, 0, 256) {
      for (i2, 0, 64) {
        for (i3, 0, 48) {
          Apad[((((i0*786432) + (i1*3072)) + (i2*48)) + i3)] = tvm_if_then_else(((20 <= i2) && (20 <= i3)), A[(((((i0*786432) + (i1*3072)) + (i2*48)) + i3) - 980)], 0f)
        }
      }
    }
  }
}
produce Bpad {
  for (nn, 0, 3) {
    for (cc, 0, 256) {
      for (yy, 0, 104) {
        for (xx, 0, 88) {
          Bpad[((((nn*2342912) + (cc*9152)) + (yy*88)) + xx)] = tvm_if_then_else(((((20 <= yy) && (yy < 84)) && (20 <= xx)) && (xx < 68)), B[(((((nn*786432) + (cc*3072)) + (yy*48)) + xx) - 980)], 0f)
        }
      }
    }
  }
}
produce C {
  for (nn, 0, 3) {
    for (c, 0, 441) {
      for (yy, 0, 64) {
        for (xx, 0, 48) {
          C[((((nn*1354752) + (c*3072)) + (yy*48)) + xx)] = 0f
          for (rc, 0, 256) {
            C[((((nn*1354752) + (c*3072)) + (yy*48)) + xx)] = (C[((((nn*1354752) + (c*3072)) + (yy*48)) + xx)] + (Apad[((((nn*786432) + (rc*3072)) + (yy*48)) + xx)]*Bpad[((((((nn*2342912) + (rc*9152)) + ((c % 21)*176)) + (yy*88)) + ((c/21)*2)) + xx)]))
          }
        }
      }
    }
  }
}
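As a sanity check on the dump above, the expected element counts can be computed from the shapes. This is a minimal sketch in plain Python; the helper names numel and padded_shape are just illustrative (not part of TVM), and I am assuming zero padding of pad_size on both sides of height and width only:

```python
def numel(shape):
    """Number of elements in a tensor with the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


def padded_shape(shape, pad):
    """NCHW shape after padding height and width by `pad` on each side."""
    n, c, h, w = shape
    return (n, c, h + 2 * pad, w + 2 * pad)


shape = (3, 256, 64, 48)              # batch, channels, height, width
print(numel(shape))                    # 2359296
print(padded_shape(shape, 20))         # (3, 256, 104, 88)
print(numel(padded_shape(shape, 20)))  # 7028736
```

The padded count matches the Bpad allocation (7028736), while the Apad allocation (2359296) equals the unpadded count, so Apad seems to have kept the original 64x48 extent instead of 104x88. The offset 980 in both reads is 20*48 + 20, i.e. a shift of 20 rows and 20 columns in the unpadded input.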

In case you need more information: I’m still implementing the correlation operator, which is based on the following equation from "FlowNet: Learning Optical Flow with Convolutional Networks":

[image: correlation equation from the FlowNet paper]
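For reference, this is how I read the indexing of Bpad in the lowered C loop: the 441 output channels form a 21x21 grid of displacements with a step of 2. A minimal sketch in plain Python that decodes a channel index into the row and column offset added to (yy, xx); the function name and the parameters grid and stride are my own illustrative names, with values taken from the dump:

```python
def channel_to_offset(c, grid=21, stride=2):
    """Map output channel `c` to the (row, col) offset read from Bpad.

    Mirrors the lowered index ((c % 21)*176 + (c/21)*2), where 176 = 2*88
    is two rows of the padded width. Illustrative helper, not a TVM API.
    """
    row_off = (c % grid) * stride
    col_off = (c // grid) * stride
    return row_off, col_off


# Largest offsets, reached at the last channel (c = 440).
max_row, max_col = channel_to_offset(440)
# Largest padded coordinates touched: rows 63 + 40, columns 47 + 40.
print(63 + max_row, 47 + max_col)  # 103 87
```

Both maxima stay inside the 104x88 padded extent of Bpad, so the read from Bpad looks in range to me. Apad, on the other hand, is read in the loop with the unpadded strides (786432, 3072, 48).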

Thanks in advance!