TVM Monthly - January 2020

As discussed by the TVM PPMC, our goal is to provide a monthly summary of the project so users and developers can get a better understanding of the goings on of the TVM community.

Feedback and suggestions are welcomed so that we can further improve these updates.

Community

It has been a busy holiday month with both New Year and Lunar New Year happening in January, and with that comes a lot of new changes to TVM. New Year, New TVM!

We welcome 3 new committers this cycle, @wweic, @MarisaKirisame, and @FrozenGene.

This forum got 87.5k page views and 2.7k user visits in the last month.

Unified IR

The biggest changes this month are the large number of refactors in preparation for the Unified IR. The Unified IR is an ongoing effort to improve and standardize both the implementation and the design of the TVM IRs. The goal is to provide a more unified interface for cross-stack optimizations and to facilitate the mixing of microkernels, TVM ops, and Relay programs.

You can find more information about the Unified IR in RFCs 4617 and 4812, which are out now.

Quantization

@ZihengJiang has introduced a second large quantization RFC this cycle as well. The goal of this RFC is to build upon existing quantization work to make it more generic and flexible while also making it hardware aware. Ziheng has written up the details quite clearly in his RFC.

Quantization Figure

Tooling

This monthly newsletter was almost completely generated by a tool I put together from scripts written by numerous community members over the past year.

I encourage future newsletter authors to use it and contribute to it. You can find more details about it here.
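The tool itself is not reproduced in this post. As a rough illustration only of the kind of grouping such a script might do, here is a minimal sketch that sorts pull request titles into newsletter sections by a bracketed tag in the title. The `TAG_TO_AREA` mapping, the function names, and the tags on the sample titles are all assumptions made for this example, not the actual tool or its data.

```python
import re
from collections import defaultdict

# Hypothetical mapping from a [TAG] in a PR title to a newsletter section.
TAG_TO_AREA = {
    "REFACTOR": "Refactor",
    "RELAY": "Relay",
    "QNN": "QNN",
    "VTA": "VTA",
    "RUNTIME": "Runtime",
    "TOPI": "TOPI",
    "CI": "CI",
    "FRONTEND": "Frontend",
    "AUTOTVM": "Autotvm",
}

TAG_RE = re.compile(r"\[([^\]]+)\]")


def group_by_area(prs):
    """Group (number, title) pairs by the first recognized [TAG] in the title.

    Titles without a recognized tag fall into the catch-all "Fixes" section.
    """
    groups = defaultdict(list)
    for number, title in prs:
        tags = [t.upper() for t in TAG_RE.findall(title)]
        area = next((TAG_TO_AREA[t] for t in tags if t in TAG_TO_AREA), "Fixes")
        clean_title = TAG_RE.sub("", title).strip()
        groups[area].append(f"{clean_title} #{number}")
    return dict(groups)


def render(groups):
    """Render the grouped PRs as plain-text sections with bullet items."""
    lines = []
    for area, items in groups.items():
        lines.append(area)
        lines.append("")
        lines.extend(f"  • {item}" for item in items)
        lines.append("")
    return "\n".join(lines)


# Sample input: the tags in these titles are illustrative assumptions.
SAMPLE_PRS = [
    (4627, "[REFACTOR] Introduce SeqStmt to replace ir::Block"),
    (4605, "[RUNTIME] make adt tag signed"),
    (4633, "Improve comments"),
]

print(render(group_by_area(SAMPLE_PRS)))
```

A catch-all section keeps untagged pull requests visible rather than silently dropping them, which is one plausible reason a generated summary ends up with a large mixed section.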

Pull Requests

Below is a high-level summary of the PRs closed in the last month, grouped by area.

Community

Refactor

  • Unify approach to Visitor/Mutator under Functor #4606
  • TVM_REGISTER_API -> TVM_REGISTER_GLOBAL #4621
  • Introduce SeqStmt to replace ir::Block #4627
  • IRPrinter->NodePrinter, move to node/printer.h #4622
  • Initialize Unified IR Type Data Structures #4616
  • Remove un-necessary var sub-field in TypeVars #4615
  • Remove old Low-level Visitor/Mutator #4612
  • Migrate Low-level IR Passes into the New Stmt/Expr Mutator #4607
  • Automatically deduce ftype signature in Registry.set_body_typed #4623
  • Add Node suffix to low-level IR nodes #4649
  • relay::Module Def -> TypeDef #4665
  • tvm::Expr -> PrimExpr(Primitive Expr) #4669
  • Replace TensorObj and TensorValue with NDArray #4643
  • Initialize Unified IR Expr Data Structure #4673
  • Allow Module to store BaseFunc #4678
  • Unified IR Primitive Op and Registry #4687
  • Unified IR IRModule structure. #4699
  • Move error.h into ir #4701
  • Initialize Unified IR Pass Infra #4702
  • Polish ir/type #4705
  • Unify IntImm and UIntImm #4706
  • attrs.h -> ir #4709
  • Move support related code to include/tvm/support #4716
  • Make more clear naming for C API Type codes. #4715
  • Introduce include/tvm/target #4721
  • Unified IR, introduce include/tvm/arith/ #4722
  • top - namespace for Tensor Operation DSL #4727
  • Polish runtime #4729
  • Get rid of packed_func_ext. #4735
  • Unify vm and interpreter objects #4693
  • Establish tir namespace #4740
  • codegen->target, build_module->driver #4742
  • Finish move all types to IR. #4746
  • Establish printer in the source folder #4752
  • top->te #4759
  • driver.h -> driver_api.h #4760

Relay

  • Add half_pixel option to Resize op #4610
  • skip example json runtime test when config is not set #4614
  • test tensor_array in vm #4608
  • Improve memory_allocation pass to support multiple i/o dynamic kernels #4595
  • add unit test for tensor_array_split #4619
  • Add parses support for unary elemwise ops #4634
  • Add parses support for SLICE #4502
  • add op crop_and_resize #4417
  • Added pool autopadding and simplified converters. #4672
  • Relay annotation and partitioning for external compilers #4570
  • Fix meaning of conv2d_transpose output_padding parameter #4318
  • use packed func macro for external codegen #4710
  • fix _parse_param bug #4711
  • Add constant input support for elemwise ops #4666
  • Add parser support for squared difference #4652
  • Add type check to dense #4724
  • Invoke tvm::build from relay compile_engine and interpreter #4723
  • Broadcast condition, x, and y for Where op #4774
  • Add parser support for relational ops #4695
  • Remove duplicated BindParamByName function in VM compiler #4793
  • Use SimplifyInference for L2 Normalization. #4795
  • Expose vm OptimizeModule to Python #4800

QNN

  • Making scale/zero_points as expr instead of attrs. #4611
  • Channel wise quantization - Quantize & Requantize #4629
  • Conv2D type checking for kernel per-channel scales. #4732
  • Add missing nullptr check #4773
  • Doc fix on convolution and dequantize #4799
  • Conv2D with dilation support. #4796

VTA

  • Prevent Chisel VTA linter changing the Scala code #4555
  • Update docker for TSIM based simulation #4674
  • Update Jenkinsfile for VTA test with TSIM #4734
  • Enable TSIM CI Testing #4407
  • Fix an issue in updating uop_idx in the TensorGemm module #4694
  • Support network which have no unique operator as start/stop name for graph pack. #4703

Runtime

  • make adt tag signed #4605
  • Improve TVMBackendPackedCFunc to allow return val #4637
  • EdgeTPU runtime for Coral Boards #4698
  • Fix memory leak when using openMP #4811

TOPI

  • Allow empty tensor for reshape, tile and strided_slice #4618
  • Fix meaning of conv2d_transpose output_padding parameter #4708
  • Remove cpp upsampling and resize op #4769
  • upsample operator 'NCHWinic' format support. #4791

CI

  • Pin python pillow to “<7” due to torchvision 1.2.0 dependency issue #4632
  • Update image version tags in Dockerfile comments #4631
  • better deletion script for pycache #4635
  • Recover Windows Mac Build CI via Github Actions #4662
  • Update deps for chisel #4675
  • Bump to use the new cpu image #4677

Frontend

  • Add support for tf.Keras networks in Relay Keras frontend #4630
  • Add conv3d #4604
  • Fix incorrect calculations in tf SLICE #4518

Autotvm

  • Use VM compile to extract autotvm tasks #4328
  • Download fallback schedule file if it does not exist #4671
  • Ignore error when removing tmpdir #4781
  • Fix a bug in generating the search space #4779
  • Minor bug fixes in AutoTVM for QNN graphs #4797

Fixes

  • Make calibration faster and more memory usage friendly #4589
  • Added declare of aluBits for TensorAlu #4624
  • Improve comments #4633
  • Get around limitation of g++-4.8 #4626
  • Bugfix StmtMutator IfThenElse #4609
  • Remove unecessary rdynamic #4613
  • Resolve constexpr related link error in debug mode #4641
  • Asymmetric padding #4511
  • Reduce data size of asymmetric padding testcase #4658
  • Fix Base64OutStream portability issue #4668
  • fix topi.nn.global_pool layout=“NHWC” #4656
  • Also package core.rly #4679
  • fskip of EliminateCommonSubexpr cannot always return false #4620
  • Fix Python syntax error in start_rpc_server_to_tracker.py #4682
  • os.path --> osp to match the import #4681
  • GitHub actions/checkout@v1 --> v2 #4680
  • Fix Python syntax error AGAIN in start_rpc_server_to_tracker.py #4685
  • Use ==/!= to compare str, bytes, and int literals #4686
  • Rename start_rpc_server_to_tracker.py to start_rpc_server_to_tracker.sh #4689
  • Deploy Quantized Model on CUDA #4667
  • Conv1D #4639
  • 1D Pooling #4663
  • GitHub Action lint Python code for syntax errors #4688
  • Generate blob use LLVM directly #4657
  • reduce input size to fix oom #4653
  • Fix RemoveUnusedFunctions pass #4700
  • link the math library by default #4713
  • Update mainline version to 0.7.dev0 #4720
  • add SizeVar representing non-neg valued variable in a tensor shape #4684
  • Fix the compile problem of cpp_rpc #4725
  • Bring Your Own Codegen Guide – Part 1 #4602
  • Convert Layout pass. #4664
  • JSON upgrader to upgrade serialized json. #4730
  • Fallback schedule for Int8 depthwise. #4733
  • Fix dense x86 schedule #4728
  • Fix demo dockerfile build failed #4744
  • Expose relay BindParamsByName to Python #4751
  • Improve CUDA vectorizer #4736
  • Add .asf.yaml for github info #4761
  • Bring Your Own Codegen Guide – Part 2 #4718
  • Fix padding in pooling op #4738
  • Remove run_infer_type duplicates #4766
  • pooling.cc improvements #4767
  • Export builtin_fp16 on Windows #4731
  • TVM_REGISTER_API -> TVM_REGISTER_GLOBAL #4768
  • Fix Tensorflow conv3d pad bug, add non-cubic data and kernel tests #4772
  • Bump prebuilt-image version in demo dockerfile #4770
  • Update tune_simple_template.py #4778
  • Explicitly link to cublasLt if it exists #4776
  • Implement pass manager tracing API #4782
  • Fix hasattr by extracting Python error type from Windows error message #4780
  • Replace os.path.exists with try…except…else #4784
  • Improve CUDA conv2d_transpose_nchw #4762
  • Add CUDA conv2d for NHWC layout #4737
  • Make sure to visit the arguments of inlined functions #4783
  • Parse additional exception strings #4785
  • conv3d_ndhwc schedule #4775
  • fix #4670: add bias for fc layer #4801
  • Change color channel from BGR to RGB for darknet preprocessing #4794
  • Solve ARM BIG.LITTLE heterogeneous multicores #4747
  • Create a StringImm reference type #4806
  • Fix -Wextra #4804
  • Fix vta tutorial #4809

Contributors Who Reviewed Pull Requests

Note: The format is name (number of activities). Disclaimer: the number of activities does not directly correspond to the community’s view about the significance of contributions.

tqchen (92), ZihengJiang (35), zhiics (30), yzhliu (24), wweic (21), vinx13 (20), masahi (19), FrozenGene (18), tmoreau89 (15), kevinthesun (13), comaniac (13), jroesch (12), icemelon9 (11), MarisaKirisame (9), junrushao1994 (9), yongwww (8), jwfromm (7), Hzfengsy (7), merrymercy (6), anijain2305 (6), soiferj (6), jackwish (5), Laurawly (4), apivovarov (4), kazum (3), liangfu (3), optima2005 (3), u99127 (3), ajtulloch (2), cchung100m (2), shoubhik (2), broune (2), minminsun (2), srkreddy1238 (1), siju-samuel (1), zhreshold (1), sgrechanik-h (1), nishi-t (1), petrex (1), ehsanmok (1), ashutoshparkhi (1), wyc-ruiker (1), yinghai (1), kice (1), mbarrett97 (1), jmorrill (1), TaoLv (1), trevor-m (1), kevinyuan (1), Leo-arm (1)

Contributors Whose Pull Requests were Updated

Note: The format is name (number of activities)

tqchen (54), srkreddy1238 (21), zhiics (15), masahi (9), jwfromm (9), anijain2305 (8), cclauss (8), optima2005 (7), icemelon9 (5), liangfu (5), inadob (5), alexgl-github (5), yzhliu (4), soiferj (4), apivovarov (3), hlu1 (3), FrozenGene (3), abergeron (3), comaniac (3), leandron (3), wweic (2), tmoreau89 (2), kevinthesun (2), yongwww (2), hcho3 (2), zxy844288792 (2), wyc-ruiker (2), jmorrill (2), pingsutw (2), wpan11nv (2), zhigaowu (2), ZihengJiang (1), MarisaKirisame (1), vinx13 (1), Laurawly (1), nhynes (1), jroesch (1), huajsj (1), u99127 (1), BenjaminTu (1), Hzfengsy (1), kice (1), mbarrett97 (1), LiangHao151941 (1), trevor-m (1), broune (1), changkaiyan (1), KeDengMS (1), kevinyuan (1), vexilligera (1), yuliujq (1), qihaitao (1)
