TVM Monthly - Jul 2020


As discussed by the TVM PPMC, our goal is to provide a monthly summary of the project so users and developers can better understand what is happening in the TVM community.

Feedback and suggestions are welcome so that we can further improve these updates.

Community

The forum received 107k pageviews and 2.8k user visits in the last month.

Thomas Viehmann and MathInf GmbH published a new TVM blogpost on Bridging PyTorch and TVM.

Zhi Chen and Cody Yu also published a new TVM blogpost on How to Bring Your Own Codegen to TVM.

Features and Improvements

Over the past month, Lianmin and other contributors submitted PRs for AutoTVM 2.0, which can automatically generate a much larger search space from the compute declaration alone.

Further improvements are listed in detail below.
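To give a flavor of the idea behind AutoTVM 2.0, the toy sketch below (plain Python, not the TVM API; all names are hypothetical) mimics how a search space can be derived automatically from a compute declaration: given only the loop extents of a compute definition, it enumerates every legal tile size per loop, with no hand-written per-operator schedule template.

```python
# Conceptual sketch only - this is NOT the TVM/AutoTVM API.
# AutoTVM 2.0 derives its search space from the compute declaration
# itself; here we reduce a compute declaration to its loop extents and
# enumerate tiling candidates from them alone.

def divisors(n):
    """All positive divisors of n - each is a legal tile size."""
    return [d for d in range(1, n + 1) if n % d == 0]

def tiling_space(loop_extents):
    """Build a per-loop tiling search space purely from loop extents,
    with no per-operator template required."""
    return {name: divisors(extent) for name, extent in loop_extents.items()}

# A 128x64 matmul-like compute declaration, reduced to its loop extents.
compute = {"i": 128, "j": 64}
space = tiling_space(compute)
print(len(space["i"]) * len(space["j"]))  # total tiling combinations: 56
```

A real auto-scheduler explores a far richer space (tiling, fusion, annotation, and more, as the Ansor PRs below show), but the principle is the same: the space is generated from the computation, not written by hand.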

TOPI

  • Fix x86 conv2d template when tuning with unpacked layout #5938
  • Fix the filter width parameter in depthwise_conv2d #6081
  • Using MKL blas for quantized dense #6115
  • Fix conv2d_transpose output padding #6236
  • topi -> tvm/topi #6186

Target

  • Migrate data structure of TargetNode #5960
  • Use TargetNode::attrs for Target serialization #5993
  • each option of target str should only contain one '=' #5988
  • ONNX codegen #5052
  • Rename target_id => target_kind #6199
  • 64-bit RPi4b target #6211

Relay

  • Add resnet-3d & Update network definitions for NHWC layout #5945
  • Small bug fix for Conv1D imports. #5995
  • Fix what looks like bizarre copy-paste issue #6010
  • Add Parser 2.0 #5932
  • Dynamic TopK Op #6008
  • Move invoke_tvm_op and shape_func to vm dialect #5958
  • GRU Layer Support #6020
  • Add pass for getting calibration data from a relay module #5997
  • Dynamic broadcast_to, zeros, ones #6007
  • Merge two consecutive reshape ops #6052
  • Add operation scatter_add to relay, based on scatter implementation. #6030
  • Add dynamic reshape grad #6080
  • i64 indices #5235
  • Keep fixed dim when unifying dynamic shape #5795
  • Port eliminate_common_subexpr to non-recursive form #6134
  • Fix interpreter for dynamic shape input of ndarray_size #6086
  • Allow to config allocator type and refactor vm code structure #6105
  • Support NMSv4 #6085
  • Handle ndarray_size in FoldConstant #6156
  • when converting constant nodes with types of int64 or float64 #6159
  • Add ReshapeTensor instruction in the VM to replace the reshape op #6089
  • Fix bug in transpose_shape_func #6180
  • Basic block normal form #6152
  • pytorch frontend support conv1d #6203
  • OneHot operation #6209
  • Support combine multiple dense op just into dense #6062
  • Add Dynamic Resize Op #6198
  • Add unbiased variance op and corresponding support in pytorch frontend #6232
  • Refine tensorflow frontend 1.x & 2.x compatibility #6240

Runtime

  • If a param is not in input, we should still consume its data #5990
  • Support module based interface runtime #5753
  • init TVMPackedFunc's name #6044
  • Enable auto conversion String->DLDataType #6214
  • fix typo #6230

CI

  • Update ci-cpu to the latest #6031
  • Move CI over to new Rust crates and try to fix flaky test. #6011
  • Add ACL docker installation #5916
  • Temporary disable nmsv4 test #6151
  • Update ci-cpu to the latest #6164
  • add caffe environment #6023
  • Remove topi from the CI cache #6188
  • Enable CI for Ethos-N #6171

BYOC

  • JSON Runtime with DNNL End-to-End Flow #5919
  • Handle one symbol for each runtime #5989
  • Run accelerator specific optimizations #6068
  • Arm Compute Library integration #5915
  • Support asymmetric per-layer quantized operators #6109
  • Retire the example json runtime #6177
  • json_node.h should include data_type.h #6224
  • Improve installation tutorial #6170

Ansor

  • Phase 0: Ansor minimum system for auto schedule generating #5962
  • Phase 1: Access Analyzer #6103
  • Phase 1: Add follow_split and follow_fused_split steps #6142
  • Phase 1: Add pragma/storage_align/rfactor steps #6141
  • Phase 1: Add RPC Runner #6077
  • Phase 1: Add annotation/compute_at/compute_root/compute_inline steps #6073
  • Phase 1: Add cache_read/cache_write steps #6107
  • Phase 1: Rename namespace from auto_schedule to auto_scheduler #6059
  • Phase 1: The base class for cost models #6187

Fix

  • Add missing expr visitor for any #6082
  • Remove the tvm web from version update #6122
  • Clear relay cache after every build & Clear warning message cache after autotvm task extraction #6131
  • avoid unexpected throw in AttrInitEntry #6128
  • Verify that tensor reshape is valid. #6215

Docs

  • improve the doc of release #6091
  • Cleanup docs build instructions. #6094
  • Organize Design and Architectures #6097
  • Reorganize the docs. #6146
  • Clarify Docs Categorization #6155
  • Improve the docs build instructions #6173
  • Added casting to hybrid script doc and fixed pass infra doc #6174
  • Update pass infra tutorial #6193

TIR

  • Improved massive build times caused by tir.floormod and tir.floordiv. Fixed Topi testcase. #5666
  • Buffer logger assert removed #6147
  • Enhance VerifyGPUCode #6194
  • HoistIfThenElse added #6066
  • Hybrid Script Support for TIR #6227

Fixes

  • Improve docker/bash.sh to handle git worktrees #5970
  • Add parser for contrib.box_decode #5967
  • Add Dynamic reshape to a dynamic namespace and add DynamicToStatic Pass #5826
  • Add meshgrid op in Relay, TOPI, Pytorch frontend #5961
  • fix tvm relay testing tf.py typo error #5977
  • Remove redundant function CreateBufferVecPtr #5982
  • VectorType::get with two parameters is deprecated in LLVM 11+ #5984
  • QNN support for TFLite 2.1.0 quantized models #5848
  • Inequalities solver #5618
  • Use LocalRunner by default in the tutorial tune_relay_cuda.py #6001
  • Undefined names: import os for line 324 & import re for line 308 #6003
  • GitHub Actions upgrade to actions/setup-python@v2 #6002
  • Dynamic Tile Op #5983
  • Only pass pythonpath for ci images #6005
  • Auto-convert shuffle with single index to "extract element" #6006
  • Cache object refs in loop partitioner instead of object pointers #6004
  • Fix test_arith_solve_linear_inequality.py::test_multi_equal #6014
  • MXNet frontend support for AMP cast op #5976
  • Remove duplicate line #6017
  • Gather op support added #6013
  • Demo showing how to run a pruned :hugs: model. #5975
  • Move compiler related registry items to vta/build_module.py #6012
  • Pin keras version #6032
  • Fix in arm_cpu/conv2d_alter_op for NHWC quantized #6027
  • Add creation of Hexagon device in RPC client #6035
  • Terminate basic block after "ret" instruction #6036
  • µTVM CRT modifications for on-device RPC server #5921
  • Create TBAA information based on the underlying buffer type #6046
  • Add support for tflite arg_min and arg_max #5992
  • Fix fully_connected converter when batch size is not 1 #6038
  • Fix a primitive check error #5991
  • Refactor to expose MakeOp functions to C++ #6047
  • Fix conv2_gemm after target structure update #6037
  • Remove use of designated initializers from hexagon_module.cc #6055
  • Build crttest and cpptest separately. #6057
  • Fix pytorch frontend prim::Constant issue #6051
  • update frontend tutorials to new model based runtime interface #6063
  • Enable x86 cpu cache flush #5914
  • Remove unnecessary std::cout #6072
  • Fix error message in Buffer::vstore, NFC #6056
  • Fix FSIM Compile Error. #6070
  • Improve vector simplification for float operands #6043
  • Refine LSTMBlockCell to support dynamic rnn #5963
  • Fix LocalBuilder on macOS with python 3.8. #6083
  • Add missing test for fast erf #6058
  • Fixed point multiplication improvements for AArch64 #5980
  • Fix code generation bugs for C/CUDA & Improve VerifyGPUCode pass #6041
  • MXNet pre-quantized BERT #6039
  • Scalar support for te.extern #6079
  • Delete declaration of unused op_node #6102
  • Load configs even it has no entity #6100
  • Update SGX example Cargo.toml #6067
  • Add default value for option USE_DNNL_CODEGEN in the cmake #6099
  • Update installation doc with minor improvements #6104
  • lint: add opencl .cl file type #6092
  • Clean up conversions between TVM and Rust functions #6114
  • Improve reduction schedule on arm CPUs #6110
  • Register Shape Func for Some Operators to Handle Dynamic Shapes #5955
  • Fix variable name conflict with OpenCL keyword #6048
  • Some rust cleanups #6116
  • fix typos in comments and relay tutorial #5999
  • Option to specify alternate directory to output build to #6016
  • Add 'get_num_inputs' to GraphRuntime #6118
  • TFLite quantized conv test #6084
  • Fix autotvm on the conv2d_nchw_winograd.mali operator #6130
  • add attr option mfloat-abi for arm32 #6123
  • Fix CUDA Library Tuning #6132
  • Add missing RPC sources after refactor #6113
  • Add TVM application extension with WASM runtime #5892
  • @t-vi -> Reviewer #6149
  • Correct runtime.load_module #6161
  • Improve error messages in graph tuner, graph runtime, and module loader. #6148
  • Typo in mod creation #6165
  • Fix some shape mismatches between TF and Relay #6166
  • Improve doc string #6176
  • Fix incorrect function signature in header #6172
  • Temporary disable conv2d grad strided flaky test #6183
  • Remove libtopi from the build #6189
  • Create Interpreter for each constant subgraph #6195
  • Fix alignment of note #6181
  • Implemented PADV2 Operator for TFLite and added support for constant values in PAD. #6167
  • Unary ops support added in frontend #6196
  • fix #6205 #6207
  • Change the meaning of conv3d_transpose output_padding to match conv{1,2}d_transpose #6065
  • Fix compile warnings. #6204
  • Fix -mfloat-abi=soft compilation for ARM with OpenCL target #6150
  • match pytorch 1.6 googlenet pretrained model (#6201) #6212
  • Add --runtime=c, remove micro_dev target, enable LLVM backend #6145
  • Mod operator, bug fix #6160
  • RESHAPE with dynamic shape arg in TFLite frontend #6208
  • fix compilation error with cuda 11 #6213
  • fix port_end wrong default value 9199 to 9099 for keeping same with source code #6220
  • Std op without specified dimensions support #6226
  • fix crt building and running error #6231
  • jcf94 -> Reviewer #6241
  • Implemented ONE_HOT Operator for TFLite. #6223

Contributors Who Reviewed Pull Requests

Note: The format is name (number of activities). Disclaimer: the number of activities does not directly correspond to the community’s view about the significance of contributions.

tqchen (103), zhiics (56), junrushao1994 (41), comaniac (27), tmoreau89 (17), jroesch (15), kevinthesun (15), merrymercy (13), MarisaKirisame (13), masahi (13), anijain2305 (11), FrozenGene (11), icemelon9 (10), ZihengJiang (10), liangfu (8), mbrookhart (7), siju-samuel (6), jwfromm (6), electriclilies (5), kazum (4), mbaret (4), yzhliu (3), yongwww (3), weberlo (3), ANSHUMAN87 (3), areusch (3), u99127 (3), jcf94 (3), Laurawly (2), lixiaoquan (2), cbalint13 (2), lhutton1 (2), trevor-m (2), jmorrill (2), binarybana (2), mwillsey (2), srkreddy1238 (1), nhynes (1), mshawcroft (1), kparzysz-quic (1), t-vi (1), cchung100m (1), yidawang (1), wpan11nv (1), shoubhik (1), eric-haibin-lin (1), hcho3 (1), roastduck (1), adityaatluri (1), spectrometerHBH (1), yongfeng-nv (1), Hzfengsy (1), tkonolige (1), ymwangg (1), manupa-arm (1), alexwong (1), leonwanghui (1), TaoLv (1), d-smirnov (1)

Contributors Whose Pull Requests were Updated

Note: The format is name (number of activities)

tqchen (33), kparzysz-quic (11), icemelon9 (10), merrymercy (10), anijain2305 (8), lhutton1 (8), jcf94 (7), comaniac (6), lixiaoquan (6), eric-haibin-lin (6), giuseros (6), mbrookhart (5), zhiics (4), FrozenGene (4), areusch (4), windclarion (4), lsy643 (4), siju-samuel (3), yzhliu (3), jroesch (3), jwfromm (3), trevor-m (3), cclauss (3), tkonolige (3), ymwangg (3), zhanghaohit (3), liangfu (2), junrushao1994 (2), huajsj (2), ANSHUMAN87 (2), csullivan (2), jiangzoi (2), vinx13 (1), tmoreau89 (1), kazum (1), kevinthesun (1), nhynes (1), inadob (1), antinucleon (1), wpan11nv (1), xqdan (1), maheshambule (1), notoraptor (1), pingsutw (1), hzfan (1), binarybana (1), leonwanghui (1), electriclilies (1), seanlatias (1), d-smirnov (1), alexbooth (1), hogepodge (1), fernchen (1), jxx123 (1), Leslie-Fang (1), mwillsey (1), yzwqf (1), samskalicky (1), dprankratz (1), jiuqi-yang (1), qunluo (1), sleepwalker2017 (1)
