[RFC] Canonicalizing AutoTVM Log Format

anwang · June 29, 2020, 8:45pm

Addressing @mdw-octoml’s points:

I will add a comment addressing the semantics of the dtype field in the proto.
I will further refine the spec to avoid Any. I originally included google.protobuf.Any to capture the current tuple argument semantics, which seemingly supports arbitrary nesting here https://github.com/apache/incubator-tvm/blob/master/python/tvm/autotvm/task/task.py#L43-L63. It looks like stakeholders prefer to improve the format rather than use it as a snapshot, so this will warrant further discussion.
Re: tightening up the proto semantics: I will add comments to the proto to clarify the following. version refers to the log format schema version as a SemVer string. An example of tvm_version is “0.7.dev1”, which afaik doesn’t follow SemVer, but I will comment on the expected format. I agree that timestamp should be an ISO-8601 formatted timestamp and will make this change.
It looks like Config will have some drastic changes, so I will convert the Config message to contain a oneof field.
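
As a rough illustration of the version and timestamp points above, here is a minimal Python sketch. The regular expression is a simplified SemVer check written for this example (not the full grammar from the SemVer spec), the helper names are hypothetical, and the sketch assumes the existing timestamp is a Unix epoch float.

```python
import re
from datetime import datetime, timezone

# Simplified SemVer pattern: MAJOR.MINOR.PATCH with optional pre-release and
# build suffixes. Illustrative only, not the complete SemVer grammar.
SEMVER_RE = re.compile(r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$")


def is_semver(version: str) -> bool:
    """Return True if the string matches the simplified SemVer pattern."""
    return SEMVER_RE.match(version) is not None


def iso8601_timestamp(epoch_seconds: float) -> str:
    """Convert a Unix epoch timestamp to an ISO-8601 string in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


print(is_semver("0.1.0"))     # a possible log format schema version
print(is_semver("0.7.dev1"))  # the tvm_version example from this thread
print(iso8601_timestamp(0))
```

Under this pattern "0.7.dev1" is rejected because it has only two numeric components, which is why tvm_version needs its own documented format.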

anwang · June 29, 2020, 9:22pm

@comaniac ~~I will change workload to task.~~ Since ansor is not op-based, I think it makes sense to keep the workload syntax to prepare for ansor’s log format changes.

I agree that the list-based representation of arguments is less than ideal – currently it’s hard to understand the semantics of any particular argument. If we go with a “kwargs” approach I think we should not support “arbitrary” kwargs, since the proto would necessarily need to look like

message Task {
  string task_name = 1;
  map<string, google.protobuf.Any> args = 2;
}

or

message Task {
  string task_name = 1;
  map<string, Argument> args = 2;
}

The “arbitrary kwarg” approach doesn’t restrict the type of a particular argument in any meaningful way, and I feel the point of formalizing a schema is to add these restrictions. I think it would be better to have a full enumeration of the possible arguments for the task. @comaniac what do you think? Is the example you provided an exhaustive representation of possible arguments? If not and you agree that we should restrict possible arguments, could you provide or point me to where I can find the right enumeration?
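
To make the difference concrete, here is a small Python sketch of what an enumerated argument list buys over arbitrary kwargs. The DENSE_ARGS table and validate_enumerated helper are hypothetical names invented for this example; they are not part of AutoTVM or of the proposed proto.

```python
# Hypothetical per-task enumeration: argument name -> expected Python type.
DENSE_ARGS = {"data": tuple, "weight": tuple, "out_dtype": str}


def validate_enumerated(args: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means args match the schema."""
    problems = []
    for name, value in args.items():
        if name not in schema:
            problems.append(f"unexpected argument: {name}")
        elif not isinstance(value, schema[name]):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


good = {"data": (1, 512), "weight": (1000, 512), "out_dtype": "float32"}
bad = {"data": (1, 512), "wieght": (1000, 512), "out_dtype": 32}

print(validate_enumerated(good, DENSE_ARGS))
print(validate_enumerated(bad, DENSE_ARGS))
```

With an arbitrary map, both dictionaries would be accepted as-is; the misspelled key and the wrongly typed out_dtype are only caught because the schema enumerates the allowed arguments.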

comaniac · June 29, 2020, 9:31pm

@anwang Ansor also has a “task” concept. A task does not necessarily correspond to a single operator; it just means a “tuning” task. As a result, I still vote for task.

In addition, I don’t think full enumeration is appropriate, for several reasons.

Full enumeration would reduce flexibility when adding new tasks.
It would make the log too long and tedious, because the task arguments (attributes) differ widely between tasks. For example, these are the task arguments for conv2d_NCHWc.x86:

https://github.com/apache/incubator-tvm/blob/master/topi/python/topi/x86/conv2d.py#L149

    kernel = te.compute(
        (oc_chunk, ic_chunk, kh, kw, ic_bn, oc_bn),
        lambda occ, icc, k_h, k_w, icb, ocb:
        kernel[occ * oc_bn + ocb, icc * ic_bn + icb, k_h, k_w],
        name="kernel_vec")

    return data, kernel

@autotvm.register_topi_compute("conv2d_NCHWc.x86")
def conv2d_NCHWc(cfg, data, kernel, strides, padding, dilation, layout, out_layout, out_dtype):
    """Compute conv2d with NCHWc layout."""
    # layout and out_layout are not used here,
    # we keep them for debug convenience when dumping autotvm workload
    if len(data.shape) == 5:
        n, ic_chunk, ih, iw, ic_bn = get_const_tuple(data.shape)
        oc_chunk, ic_chunk_group, kernel_height, kernel_width, _, oc_bn = \
            get_const_tuple(kernel.shape)
        in_channel = ic_chunk * ic_bn
        num_filter = oc_chunk * oc_bn
    else:

And these are the arguments for dense_nopack.x86:

https://github.com/apache/incubator-tvm/blob/master/topi/python/topi/x86/dense.py#L138

    tilek_bn = 1
    for bn in range(vec_width*2, 0, -1):
        if K % bn == 0:
            tilek_bn = bn
            break
    cfg["tile_k"] = SplitEntity([K // tilek_bn, tilek_bn])
    cfg["tile_x"] = SplitEntity([N, 1])
    cfg["tile_y"] = SplitEntity([1, M])

@autotvm.register_topi_compute("dense_nopack.x86")
def dense_nopack(cfg, data, weight, bias=None, out_dtype=None):
    """Compute dense without packing"""
    if out_dtype is None:
        out_dtype = data.dtype
    M, K = get_const_tuple(data.shape)
    N, _ = get_const_tuple(weight.shape)
    # create tuning space
    cfg.define_split("tile_y", 32 if isinstance(M, tvm.tir.Var) else M, num_outputs=2)
    cfg.define_split("tile_x", 32 if isinstance(N, tvm.tir.Var) else N, num_outputs=2)
    cfg.define_split("tile_k", 32 if isinstance(K, tvm.tir.Var) else K, num_outputs=2)
    if cfg.is_fallback:

You can basically search for autotvm.register_topi_compute in TOPI to see all task function arguments. Unless we can also canonicalize the task arguments, it seems impractical to have a fully enumerated argument list.

Consequently, IMHO, supporting arbitrary kwargs arguments would be more practical.

anwang · June 29, 2020, 10:24pm

I see. Thanks for clarifying @comaniac, I agree with your comments.

Addressing @merrymercy’s points:

One possible solution to the redundancy of repeating items such as target string would be to encode something like this: message AutoTVMLogs{ string target; repeated AutoTVMLog; ...} where the inner AutoTVMLog no longer indicates the target string. However, this change would make it more difficult to adhere to the “one record per line” json standard AutoTVM currently holds. For simplicity I prefer keeping the redundancy, but since I haven’t worked very closely with the logs myself, I will defer to others’ takes.
The proposed implementation will allow manipulation of readable json.
As for the major differences you indicated, we can modify the proto as desired once Ansor is ready.
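
The redundancy trade-off in the first point above (repeating the target string versus grouping records under one target) can be sketched in Python as follows. The record fields and cost values are made up for illustration and do not follow the exact proposed schema.

```python
import json

# Illustrative records; the cost values are invented for this example.
records = [
    {"target": "llvm -mcpu=broadwell", "task": "dense_nopack.x86", "cost": 0.0021},
    {"target": "llvm -mcpu=broadwell", "task": "conv2d_NCHWc.x86", "cost": 0.0150},
]

# Flat: one self-contained JSON record per line; the target string repeats.
flat_lines = [json.dumps(r) for r in records]

# Grouped: the target is stored once, but the file is no longer line-oriented.
grouped = {
    "target": records[0]["target"],
    "logs": [{k: v for k, v in r.items() if k != "target"} for r in records],
}

for line in flat_lines:
    print(line)
print(json.dumps(grouped, indent=2))
```

Each flat line parses on its own, so logs can be appended, filtered, or concatenated line by line; the grouped form saves bytes but has to be read and rewritten as a whole.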

Here is an updated proposal of the protobuf given everyone’s feedback.

syntax = "proto3";
package autotvm.log;
import "google/protobuf/any.proto";

message Target {
  // For now this is the string representation of a target; e.g. "llvm -mcpu=broadwell"
  // This should be replaced once the rfc "TVM Target specification" is finalized
  string target_string = 1;
}

message AutoTVMLog {
  // The compilation target
  Target target = 1;
  // Represents a tuning task
  Task task = 2;
  // The configuration used by this task
  Config config = 3;
  // Tuning results
  Result result = 4; 
  // SemVer string describing the AutoTVM log format version
  string version = 5;
  // TVM version string with qualifiers attached as a suffix (not strictly SemVer), e.g. "0.7.dev1"
  string tvm_version = 6;
}

message Task {
  // Human-readable task name
  string task_name = 1;
  // Map of keyword arguments where the key indicates argument name
  map<string, Argument> args = 2;
}

message Argument {
  oneof arg {
    Tensor tensor = 1;
    // Possible tuple values are not well specified and may require more sorting out
    // https://github.com/apache/incubator-tvm/blob/master/python/tvm/autotvm/task/task.py#L43-L63
    Tuple tuple = 2;
    string value = 3;
  }
}

message Tensor {
  repeated uint32 shape = 1;
  // Indicates a numpy dtype
  string dtype = 2;
}

message Tuple {
  repeated google.protobuf.Any values = 1;
}

// Config for AutoTVM v1
message Config_v1 {
  // code hash
  string code_hash = 1;
  repeated Entity entities = 2;
  uint32 index = 3;
}

message Config {
  oneof config {
    Config_v1 config_v1 = 1;
  }
}

message Entity {
  // Entities are previously output as `[["tile_ow", "sp", [-1, 1]], <other_entities>]`
  // The proposed encoding clarifies entity type in the schema itself instead of as a string
  string knob_name = 1;
  oneof entity {
    SplitEntity split = 2;
    ReorderEntity reorder = 3;
    AnnotateEntity annotate = 4;
    OtherOptionEntity other_option = 5;
  }
}

message SplitEntity {
  repeated int32 size = 1;
}

message ReorderEntity {
  repeated uint32 order = 1;
}

message AnnotateEntity {
  repeated string annotations = 1;
}

message OtherOptionEntity {
  google.protobuf.Any value = 1;
}

message Result {
  // The measured runtime costs of this configuration
  repeated float costs = 1;
  // The error type defined by MeasureErrorNo
  int32 error_no = 2;
  // End-to-end cost of benchmarking, including rpc, compilation, test runs
  float all_cost = 3;
  // ISO-8601 formatted timestamp
  string timestamp = 4;
}
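
For illustration, here is a hand-written Python dict that mirrors the message structure above and is serialized as a single JSON line. All values (shapes, costs, knob sizes, version strings, timestamp) are invented for this example, and the keys simply reuse the proto field names; the output of an actual protobuf JSON serializer may differ, for example in field-name casing.

```python
import json

# One AutoTVMLog record, written by hand to mirror the proposed messages.
record = {
    "target": {"target_string": "llvm -mcpu=broadwell"},
    "task": {
        "task_name": "dense_nopack.x86",
        "args": {
            "data": {"tensor": {"shape": [1, 512], "dtype": "float32"}},
            "weight": {"tensor": {"shape": [1000, 512], "dtype": "float32"}},
            "out_dtype": {"value": "float32"},
        },
    },
    "config": {
        "config_v1": {
            "code_hash": "",
            "entities": [
                {"knob_name": "tile_y", "split": {"size": [-1, 1]}},
                {"knob_name": "tile_x", "split": {"size": [-1, 8]}},
            ],
            "index": 0,
        }
    },
    "result": {
        "costs": [0.0021],
        "error_no": 0,
        "all_cost": 1.5,
        "timestamp": "2020-06-29T22:24:00+00:00",
    },
    "version": "0.1.0",
    "tvm_version": "0.7.dev1",
}

# json.dumps without indent emits a single line, matching "one record per line".
line = json.dumps(record)
print(line)
```

Because each oneof is represented by exactly one key ("tensor" or "value", "split", "config_v1"), a reader can tell the argument or entity type from the structure itself rather than from a string tag.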

One further question I have is regarding the Tuple argument. It is serialized arbitrarily in branches that include possible recursion here https://github.com/apache/incubator-tvm/blob/master/python/tvm/autotvm/task/task.py#L53-L54 and it’s unclear to me what these different serializations should map to in logical structures. Could someone (perhaps @haichen) clarify what each branch is meant to represent? Everything that I’ve marked Tuple below represents a structure that is unclear to me.

if isinstance(x, tensor.Tensor):  # message Tensor { shape, dtype }
    return ('TENSOR', get_const_tuple(x.shape), x.dtype)
if isinstance(x, (tuple, list, container.Array)):  # message Tuple { repeated Any } 
    return tuple([_encode(a) for a in x])
if isinstance(x, (str, int, float, np.int, np.float, expr.Var)):  # message Tuple { repeated Any } 
    return x
if isinstance(x, (expr.StringImm, expr.IntImm, expr.FloatImm)):  # message Tuple { repeated Any }
    return x.value
if isinstance(x, runtime.container.String):  # string value
    return str(x)
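
To frame the question, here is a TVM-free Python sketch of how these branches might map onto the proposed Argument oneof. FakeTensor and to_argument are hypothetical names for this example, and mapping scalars to the string value branch is purely an assumption; it is exactly the part of the mapping that is still unclear.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a TVM tensor: just a shape and a dtype."""
    shape: Tuple[int, ...]
    dtype: str


def to_argument(x):
    """Map a Python value onto one branch of the proposed Argument oneof."""
    if isinstance(x, FakeTensor):
        return {"tensor": {"shape": list(x.shape), "dtype": x.dtype}}
    if isinstance(x, (tuple, list)):
        # Nested sequences recurse, which is why Tuple currently holds repeated Any.
        return {"tuple": {"values": [to_argument(a) for a in x]}}
    if isinstance(x, str):
        return {"value": x}
    if isinstance(x, (int, float)):
        # Assumption: scalars have no dedicated branch, so they are stringified here.
        return {"value": str(x)}
    raise TypeError(f"unsupported argument type: {type(x).__name__}")


print(to_argument(FakeTensor((1, 512), "float32")))
print(to_argument((1, 1)))
print(to_argument(((0, 0), "NCHW")))
```

The recursion in the tuple branch shows where arbitrary nesting enters the format; restricting Tuple to a fixed depth or element type would remove the need for Any.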