ONNX NonZero operator

The HuggingFace PyTorch pretrained BERT model has an embedding layer that includes the NonZero op. This is the description from the ONNX docs:

Returns the indices of the elements that are non-zero (in row-major order - by dimension). NonZero behaves similar to numpy.nonzero
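For illustration, here is a minimal NumPy sketch of that behavior. It assumes the ONNX output layout is a 2-D int64 tensor of shape (rank, number of non-zero elements); the helper name onnx_style_nonzero is made up for this example and is not an ONNX or TVM API. The point is that two inputs with the same shape can produce outputs of different shapes.

```python
import numpy as np

def onnx_style_nonzero(x):
    """Emulate the ONNX NonZero output layout with NumPy (hypothetical helper)."""
    # np.nonzero returns one index array per dimension, in row-major order;
    # stacking them yields a (rank, num_nonzero) array.
    return np.stack(np.nonzero(x)).astype(np.int64)

# Two inputs with the same static shape (2, 2) but different contents.
a = np.array([[1, 0], [1, 1]])
b = np.array([[0, 0], [0, 1]])

idx_a = onnx_style_nonzero(a)
idx_b = onnx_style_nonzero(b)

print(idx_a)                      # row indices on the first row, column indices on the second
print(idx_a.shape, idx_b.shape)   # the second dimension depends on the data, not the input shape
```

Because the second output dimension is only known once the input values are known, it cannot be fixed at compile time from the input shape alone.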

Given that this op returns a tensor of arbitrary length, is there any way we can support this? @jwfromm do you have any thoughts?

This sounds like a good use case for the dynamic shaping added in PR #3606. Using relay.Any would require that we use the Relay VM rather than the graph runtime for execution, though, and it may have some unpleasant interactions with autotuning. I suspect the dynamic shape won’t be a problem for downstream nodes in the graph, and we can properly get their shapes using infer_value.

Regardless of how smoothly something like this integrates, it’s definitely worth exploring, as dynamism is only going to become more common.

I have filed a feature request to support this operator: https://github.com/apache/incubator-tvm/issues/4568