AutoTVM log format

Relay op strategy #4644 makes two changes to AutoTVM:

  1. Removes the template key from AutoTVM template
  2. Unify the workload name and the task name into one flatten name.

Therefore, the old AutoTVM log won’t be able to use after this PR. Given that, we could also take this chance to improve the current log format.

The current Autotvm log format is

log_record: {
  "i": [target, task_name, args, kwargs, workload, config], // change to "input"
  "r": [costs, error_no, all_cost, timestamp], // change to "result"
  "v": autotvm_log_version // change to "version"
}

config: {
  "i": index, // change to "index"
  "t": template_key, // remove
  "c": code_hash, // change to "code_hash"
  "e": [config_entity, ...] // change to "entity"
}

First, we should expand the one character into a full word to improve the readability, e.g., “i” -> “input”, as commented above, and remove the template key entry in the config json object.

Second, I want to discuss whether we should remove the workload in the record. Previously the workload name (first item in the workload) is op name, which is different from Autotvm task name. Now they share the same name after #4644. Further both input field and workload have args and kwargs. So it seems redundant to keep both (task_name, args, kwargs) and workload in the log, and have both serialize_args and args_to_workload two serialization functions.

Third, do we have other things want to add/change in the log record?

Could you share your thoughts on this?

@tqchen @merrymercy @ziheng @kevinthesun @comaniac @vinx13 @jwfromm

1 Like

Expansion of record key name is a great idea. For workload, as long as we have args_to_workload to get workload from args, I think it’s safe to remove it from log record.

I’m considering replace the args_to_workload by the serialize_args since they serve the same purpose. But of course we still add the workload to the op attrs in the autotvm compute wrapper in order to query the autotvm DispatchContext.

Thanks for the proposal, and I totally agree that the AutoTVM log format should be improved to reduce confusions. For the raised topics:

  1. Using full names is great. I’d also like to discuss the log version. Will that be aligned to TVM version? If not, we should have a mechanism to determine the version other than manual bumping. Something like TVM commit number, timestamp, minor release (although we don’t have it yet) would be reasonable.

  2. I agree to remove either workload or (task_name, args, kwargs). As long as we have serialize and deserialize functions, removing either of them would be fine.

  3. Other opinions:

    • From the code base I think we don’t need code_hash anymore (correct me if I missed any important function of it).
    • The index field is also meaningless. The only purpose I can think of is debugging grid search.
    • If we can remove all fields except for entity in config, then we can simplify config to entity directly.
  4. I raised this issue in the op strategy PR but I think this is a great place to revisit it: The op strategy PR now leverages the measured time to determine which implement should be applied when compilation. To do so, it now adds the cost to input when loading logs, because we will pass result to the Relay build flow. According to this requirement, should we consider changing the log format to a form like the following:

log_record: {
  "input": [target, task_name, args, kwargs],
  "config": [entity, cost],
  "metadata": [error_no, all_cost, timestamp],
  "version": autotvm_log_version
}

The all idea is separating the current result to cost and metadata, and integrate cost to config. As a result, the metadata part is totally flexible and can be simply thrown away when loading records.

I really like all of @comaniac’s suggestions. I’m also in favor of logging the specific TVM commit number used when tuning as it will make finding the cause of performance regressions much easier.

Just to provide more context about this. Relay op strategy needs to compare the performance between different implementation and therefore requires to get the cost from autotvm log. The way to achieve this currently is that I add the cost field in the config and load the cost from autotvm log.

I still prefer to leave the cost in the "result" field along with the error_no, all_cost, timestamp. The cost I added in the config data structure is more related to the DispatchContext in the autotvm, but not necessary to change the log format.

From my perspective, the AutoTVM log version is different from TVM version. I agree that having some sort of TVM version would be good. But it’s hard to get the commit hash, maybe just TVM release version?

@merrymercy @ziheng Could you comment on code_hash and index?

For the version part, we can consider adding more info to version like this topic Could we add more info in tvm.__version__ suggested. On the other hand, if we will release TVM miner versions (as one option in Development Process and Backporting patches to release branches), it would be suitable to use major plus miner version directly.

I can confirm index and code_hash are useless in the current code base.

code_hash was introduced to avoid using a config to a wrong template. But it seems this is not a problem in our current practice.

I agree with other changes.

Let me make a summary of the current proposed autotvm log format:

log_record: {
  "input": [target, task_name, args, kwargs, config_entity],
  "result": [costs, error_no, all_cost, timestamp],
  "version": autotvm_log_version,
  "tvm_version": tvm_version
}

config_entity: [list of configs, e.g., split]

The TVM version will use tvm.__version__, and we can add more info to it in the future. The autotvm log version will change from 0.1 to 0.2

Does it look good to everyone?

vote for this one :+1:

After taking another look, I found that “index” in the config is used by AutoTVM model based tuner. So the config log needs to keep the index. But code_hash can be removed.

config: {
  "index": index,
  "entity": [list of configs, e.g., split],
}

I agree that we can remove code_hash if it simplifies the impl. If the implementation is not that complicated, perhaps it makes sense to keep it as it is because we can use it to validate the template

Agree. Let’s still keep code_hash since it doesn’t increase much complexity.