Currently, AutoTVM logs are serialized quite loosely here, and the exact format often depends on how individual classes scattered across the codebase choose their string representations.
As a result, the serialization can change implicitly when an unrelated change is made to a field involved in AutoTVM log encoding.
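To make the failure mode concrete, here is a minimal sketch. It is not AutoTVM code: `Knob`, `VerboseKnob`, `encode_implicit` and `encode_explicit` are hypothetical names used only for illustration. It shows how a log encoder that relies on a class's string representation changes its output when that representation is edited, while an encoder with an explicit field list does not.

```python
import json


class Knob:
    """Hypothetical stand-in for a class whose string form ends up in a log."""

    def __init__(self, name: str, size: int):
        self.name = name
        self.size = size

    def __repr__(self) -> str:
        return f"Knob({self.name}, {self.size})"


class VerboseKnob(Knob):
    """Same data, but __repr__ was changed for nicer debugging output."""

    def __repr__(self) -> str:
        return f"<Knob name={self.name} size={self.size}>"


def encode_implicit(knob: Knob) -> str:
    # The serialized form is whatever __repr__ happens to return.
    return json.dumps({"knob": repr(knob)})


def encode_explicit(knob: Knob) -> str:
    # The serialized form is defined here, independent of __repr__.
    return json.dumps({"knob": {"name": knob.name, "size": knob.size}})


old, new = Knob("tile_oh", 4), VerboseKnob("tile_oh", 4)
print(encode_implicit(old) == encode_implicit(new))  # False
print(encode_explicit(old) == encode_explicit(new))  # True
```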
We propose to canonicalize the AutoTVM log format and give it a defined structure by constructing typed Python classes that produce logs according to a programmatically defined schema. Encoding method definitions are omitted.
    # Postponed annotation evaluation lets classes refer to types defined later
    # (or, like the *Entity types in Entity.entity_repr, defined elsewhere).
    from __future__ import annotations

    from abc import ABC, abstractmethod
    from typing import Any, Dict, List, Union

    class AutoTVMLog:
        input: Input
        config: Config
        result: Results
        version: str
        tvm_version: str

    class Input:
        target: str
        task_name: str
        args: List[Argument]
        kwargs: Dict[str, Any]

    class Argument(ABC):
        # no fields
        pass

    class Tensor(Argument):
        name: str
        shape: List[int]
        dtype: str

    class Tuple(Argument):
        values: List[Any]

    class String(Argument):
        value: str

    class Config:
        code_hash: str
        entity: List[Entity]
        index: int

    class Entity:
        knob_name: str
        knob_type: str
        entity_repr: Union[SplitEntity, ReorderEntity, AnnotateEntity, OtherOptionEntity]

    class Results:
        costs: List[float]
        error_no: int
        all_cost: float
        timestamp: float
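Since the encoding methods are omitted above, the following is only a sketch of what one could look like for a single class. It assumes the classes are implemented as dataclasses and encoded as JSON objects keyed by field name. Both are assumptions made for illustration, not the existing AutoTVM log format. `encode_results` and `decode_results` are hypothetical helper names.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Results:
    # Field names and types are taken from the schema above.
    costs: List[float]
    error_no: int
    all_cost: float
    timestamp: float


def encode_results(results: Results) -> str:
    # asdict() walks the declared fields in order, so the output depends only
    # on the schema, not on any class's string representation.
    return json.dumps(asdict(results))


def decode_results(line: str) -> Results:
    # Unknown or missing keys raise TypeError, which surfaces schema drift early.
    return Results(**json.loads(line))


example = Results(costs=[0.5, 0.25], error_no=0, all_cost=1.5, timestamp=100.0)
line = encode_results(example)
print(line)
print(decode_results(line) == example)  # True
```

With this shape, changing the log format requires editing the schema class itself, which is the "single source" property the proposal is after.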
Note that adding this code is not necessarily intended to change the output log schema in any way; it is intended to clarify the schema so that there is a single source of truth for future log modifications. However, I have listed below some possible cosmetic changes that might be worth resolving as a drive-by.
Clarifications and fixes:
- tvm_version is a float in the codebase; we will correct this to str. We will need to bump the schema version to "0.3".
- "Config" is currently represented in list form, e.g. [["tile_oh", "ot", 1], ...]. We can modify this format to something more readable but longer, like [{"knob_name": "tile_oh", "knob_type": "ot", "entity_repr": 1}, ...]. Thoughts about keeping the original representation vs. changing to a longer, more readable format?
- autotvm.task.kwargs is unused in AutoTVM, as indicated here. Keep or remove AutotvmRecord.Input.kwargs from the record schema?
- In my own experiments with AutoTVM I have consistently observed that code_hash's value is null in output logs. This is also true for all schedules on tophub. Remove or file an issue to fix?
- AutoTVMRecord or AutoTVMLog? Other naming concerns?
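To help weigh the second item in the list above, here is a small sketch of converting between the current list form of a config entry and the proposed keyed form. The key names come from the Entity class in the schema. The helper names `entity_to_dict` and `entity_to_list` are hypothetical, and the sketch assumes every entry has exactly three elements.

```python
import json
from typing import Any, Dict, List


def entity_to_dict(entry: List[Any]) -> Dict[str, Any]:
    # Assumes the positional order [knob_name, knob_type, entity_repr].
    knob_name, knob_type, entity_repr = entry
    return {
        "knob_name": knob_name,
        "knob_type": knob_type,
        "entity_repr": entity_repr,
    }


def entity_to_list(entry: Dict[str, Any]) -> List[Any]:
    # Inverse mapping, so logs in either form can be read back.
    return [entry["knob_name"], entry["knob_type"], entry["entity_repr"]]


compact = [["tile_oh", "ot", 1]]
readable = [entity_to_dict(e) for e in compact]

print(json.dumps(compact))
print(json.dumps(readable))
# The keyed form is longer per entry, which is the trade-off in question.
print(len(json.dumps(readable)) > len(json.dumps(compact)))  # True
```

Because the mapping is lossless in both directions, a reader could accept both forms during a transition if the longer format is adopted.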