TVM Monthly - August 2019



As discussed with the TVM PMC, we would like to give a summary of the project each month, so people can get a better sense of what is going on in the community.

Feedback and suggestions are welcome so that we can further improve the report.

Community

The community welcomes new committer Hao Lu (@hlu1) and new reviewer MarisaKirisame (@MarisaKirisame).

The forum grew healthily, with 62.4k page views and 2.3k user visits in the last month.

Features and Improvements

Over the past month, the community worked on improving the infrastructure, including Relay symbolic shape support, quantization, sparsity, accelerators, and the VM runtime. We also improved both the coverage and performance of operators for various frameworks.

  • For the compiler infrastructure, we merged shape function support for symbolic shapes, which enables certain broadcast cases where input shapes are only known symbolically.
  • For quantization, multiple QNN operators were added to pave the way for converting pre-quantized networks to Relay. In addition, the Relay quantization pass was refactored to fix model accuracy, and a KL-divergence algorithm was introduced for per-layer calibration.
  • For sparsity, fast transpose for square CSR matrices has now been merged, which is a good starting point for more general sparse type support.
  • For accelerator support, a test infrastructure for Chisel modules was added, VTA test scripts gained an option to run tests in TSIM, and a virtual memory system landed in the TSIM driver, which allows us to run TSIM tests on PYNQ and DE10-Nano.
  • For the Relay VM, we merged a profiler that collects execution statistics for each primitive operator, and we enabled VM execution on devices other than the CPU.

More improvements along with details are listed below.

Compiler Improvements

  • Add shape function for symbolic shape (#3606)
  • Moving Conv, Dense, Concatenate InferTypes to header (#3783)
  • Simplify casts of constants 0 and 1 (#3758)
  • Conditionally replace reduction init axis. (#3408)
  • Improve Partial Evaluator (#3749, #3703)
  • Improve AD for concatenate (#3729)
  • Add a legalize pass to Relay (#3672)
  • Strict mode in Relay pattern matching (#3620)
  • Quit and clean when TVM is interrupted (#3640)
  • Support MKL on Windows (#3837)
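The shape function work for symbolic shapes (#3606) lets the compiler compute output shapes even when some input dimensions are not known until runtime. As a rough illustration of the broadcast rule involved, here is a minimal plain-Python sketch (not TVM's actual implementation), which models symbolic dimensions as strings:

```python
# Conceptual sketch of a broadcast shape function, written in plain
# Python rather than TVM's internal IR. Symbolic (unknown) dimensions
# are modeled as strings such as "n"; integers are static dimensions.

def broadcast_shape(a, b):
    """Return the broadcast of two shapes, right-aligned NumPy-style.
    Raises ValueError on a provable static mismatch."""
    out = []
    # Walk both shapes from the trailing dimension backwards.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da == 1:
            out.append(db)
        elif db == 1 or da == db:
            out.append(da)
        elif isinstance(da, str) or isinstance(db, str):
            # With a symbolic dim we cannot prove a mismatch at compile
            # time; a real shape function defers the check to runtime.
            out.append(da if isinstance(da, str) else db)
        else:
            raise ValueError(f"incompatible dims {da} and {db}")
    return tuple(reversed(out))

print(broadcast_shape(("n", 1, 4), (3, 4)))   # ('n', 3, 4)
```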

Relay/TOPI Operator Support and Improvement

  • Bitserial operations conv2d, dense and bitpack (#3844)
  • Improve numeric gradient check (#3856)
  • Gradient for Conv2d (#3636)
  • Resize rework (#3788)
  • Use cblas for dense and batch_matmul (#3787)
  • Improve conv2d_transpose CUDA schedule template (#3796)
  • Use legalize to handle NHWC layout for arm_cpu (#3754)
  • SpaceToDepth and MirrorPad Operators (#3718)
  • Add variance and layer norm op (#3700)
  • Add sparse_transpose for Square CSR matrices (#3707)
  • Update TOPI softmax compute and CPU schedule (#3680)
  • TOPI: Memoize winograd matrix (#3687)
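The sparse_transpose work (#3707) transposes a square CSR matrix without densifying it. The idea can be sketched as an O(nnz) counting-and-scatter pass over the (data, indices, indptr) arrays; the plain-Python version below is an illustrative sketch, not TVM's kernel:

```python
# Conceptual O(nnz) transpose of a square CSR matrix, in plain Python.
# The matrix is given as the usual (data, indices, indptr) triple.

def csr_transpose(data, indices, indptr, n):
    """Transpose an n x n CSR matrix; returns (data, indices, indptr)
    of the transpose, also in CSR form."""
    nnz = len(data)
    # Count non-zeros in each column: these become row lengths of A^T.
    counts = [0] * n
    for j in indices:
        counts[j] += 1
    # Prefix-sum the counts into the new row pointer array.
    t_indptr = [0] * (n + 1)
    for j in range(n):
        t_indptr[j + 1] = t_indptr[j] + counts[j]
    # Scatter each entry into its transposed position.
    t_data = [0] * nnz
    t_indices = [0] * nnz
    cursor = list(t_indptr[:n])
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            pos = cursor[j]
            t_data[pos] = data[k]
            t_indices[pos] = i
            cursor[j] += 1
    return t_data, t_indices, t_indptr
```

For example, the matrix [[1, 0], [2, 3]] stored as data=[1, 2, 3], indices=[0, 0, 1], indptr=[0, 1, 3] transposes to [[1, 2], [0, 3]].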

User Interface and Frontend

  • TFLite frontend operator support: tile, transpose (#3814, #3705)
  • ONNX frontend operator support: PReLU for NNVM, Not, Sign, Equal (#3813, #3836, #3760)
  • Keras frontend operator support: Dot (#3668)
  • Add more cases to Keras _convert_reshape (#3846)
  • TensorFlow frontend operator support: OneHot, log1p, cos, sin (#3781, #3614)
  • Support BatchMatMul with input dimensions larger than 3 for TensorFlow (#3732)
  • CoreML improvement for image scaler and padding (#3800)
  • Clean up TensorFlow frontend (#3710)

Quantization and AutoTVM/Graph Tuner

  • Qnn Concatenate, quantize, dequantize and requantize operators (#3819, #3730, #3745, #3531)
  • QNNtoRelay & QNNLegalize Pass utility (#3838, #3782)
  • Refactor quantization codebase and fix model accuracy (#3543)
  • KL-divergence-based per-layer calibration (#3538)
  • Improve graph tuner dealing with Tuple (#3649)
  • AutoTVM: Fix hang/crash issues on feature extraction (#3689)
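The KL-divergence calibration (#3538) picks a per-layer clipping threshold by comparing an activation histogram against its clipped-and-requantized version. The simplified plain-Python sketch below illustrates the idea; the actual pass differs in details such as how mass is redistributed over empty bins:

```python
# Simplified sketch of KL-divergence-based threshold selection for
# per-layer calibration. Illustrative only, not the actual TVM pass.
import math

def kl_divergence(p, q):
    """KL(P || Q) between two non-negative histograms, normalized
    internally; bins where p is zero contribute nothing."""
    sp, sq = sum(p), sum(q)
    return sum((pi / sp) * math.log((pi / sp) / (qi / sq))
               for pi, qi in zip(p, q) if pi > 0 and qi > 0)

def quantize_expand(p, nbins):
    """Collapse p into nbins coarse bins, then expand back by spreading
    each coarse bin's mass uniformly over the fine bins it covers."""
    n = len(p)
    q = [0.0] * n
    step = n / nbins
    for b in range(nbins):
        lo, hi = round(b * step), round((b + 1) * step)
        mass = sum(p[lo:hi])
        for k in range(lo, hi):
            q[k] = mass / (hi - lo)
    return q

def best_threshold(hist, nbins=8):
    """Scan clipping points; return the bin count whose clipped,
    requantized histogram minimizes KL divergence to the original."""
    best_i, best_kl = len(hist), float("inf")
    for i in range(nbins, len(hist) + 1):
        clipped = list(hist[:i])
        clipped[-1] += sum(hist[i:])   # fold the clipped tail in
        kl = kl_divergence(clipped, quantize_expand(clipped, nbins))
        if kl < best_kl:
            best_i, best_kl = i, kl
    return best_i
```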

Runtime

  • Add build_create_shared_func to tvm/contrib/cc.py (#3840)
  • Reduce set_input and set_input_zero_copy overhead (#3805)
  • Relay VM Profiler (#3727)
  • Support execution on devices for Relay VM (#3678)
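The Relay VM profiler (#3727) records execution statistics for each primitive operator. The pattern can be illustrated with a small plain-Python stand-in that wraps each call and aggregates per-operator timings; the class and method names here are illustrative, not the actual VM profiler API:

```python
# Illustrative per-operator statistics collector, a plain-Python
# stand-in for what a VM profiler records. Not the Relay VM API.
import time
from collections import defaultdict

class OpProfiler:
    def __init__(self):
        # operator name -> {"calls": int, "total_s": float}
        self.stats = defaultdict(lambda: {"calls": 0, "total_s": 0.0})

    def invoke(self, name, fn, *args):
        """Run one primitive operator and record its wall-clock time."""
        start = time.perf_counter()
        result = fn(*args)
        elapsed = time.perf_counter() - start
        rec = self.stats[name]
        rec["calls"] += 1
        rec["total_s"] += elapsed
        return result

    def report(self):
        """Operators sorted by cumulative time, heaviest first."""
        return sorted(self.stats.items(),
                      key=lambda kv: kv[1]["total_s"], reverse=True)

prof = OpProfiler()
prof.invoke("add", lambda a, b: a + b, 1, 2)
prof.invoke("multiply", lambda a, b: a * b, 3, 4)
```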

Documents, Test, and Build

  • conda recipe (#3791)
  • Allow users to specify download directory (#3803)
  • Update docs for installation for CUDA (#3832)
  • Update hybrid_script.rst (#3799)
  • Acknowledge Halide attributions (#3824)
  • Add psutil dependency (#3780)
  • Temporary disable rust test (#3809)
  • Solve occasional CI issue when pad value is all 0 (#3801)
  • Towards TSIM CI testing (#3704)
  • Use pip3 for python3 (#3742)
  • Update docker image ci_cpu,i386 to include verilator (#3738)
  • Remove sccache from Rust install (#3728)
  • Tutorial: Build a Graph Convolutional Network on TVM (#3681)
  • Update dmlc-core to the latest commit (#3716)
  • Update GPU docker (#3709)
  • Add an option to build with -pthread (#3671)
  • Add DGL to {ci_gpu, demo_cpu, demo_gpu} docker images (#3692)

Accelerator and Microcontroller Support

  • TSIM: add virtual memory support to examples (#3868)
  • Vulkan backend supports Call::reinterpret and vectorized comparison (#3795)
  • TSIM: Introduce Virtual Memory for TSIM Driver (#3686)
  • Parallel TSIM hardware compilation with macOS and debug support (#3797)
  • Chisel: scale dram base address in hardware instead of runtime (#3772)
  • Chisel: run all unittests by default (#3766)
  • Chisel: improved Data Gen, Added ALU Test (#3743)
  • Chisel dependencies for TSIM CI (#3721)
  • Chisel: Added Module Unit Test Infrastructure (#3698)

Fixes

  • Fix infinite recursive device_api.ext_dev call in VTA. (#3843)
  • Fix depth_mult for TensorFlow frontend (#3676)
  • Fix database APIs for AutoTVM (#3821)
  • Fix axis of softmax in Keras (#3834)
  • Fix VTA TensorLoad module (#3841)
  • Fix inconsistent python/cpp API behavior for if_then_else, power (#3829)
  • Fix code comment of operators in ONNX frontend (#3830)
  • Added repo for llvm-9 to fix missing dependency issue (#3826)
  • Fix typo in Relay text parser (#3785)
  • Fix tvm const warnings (#3817)
  • Add gfx906 bc (#3808)
  • Fixed onnx test failures when run on a cpu backend (#3764)
  • Fix ArgBinder assert order (#3794)
  • Fix for NoneType Target for quantization (#3792)
  • Fix out-of-date quantization realize (#3790)
  • Fix Qnn concatenate InferType (#3779)
  • Fix dense tuning (#3768)
  • Fix visit_pattern in ExprMutator (#3769)
  • Fix Chisel Scala style (#3765)
  • Fix some pass docs (#3767)
  • Fix mistype in rpc tutorial (#3763)
  • Fix tvm.scan follow by tvm.compute segfault (#3723)
  • Fix the potential index overflow in where operator (#3751)
  • Revert compile_cmd kwarg name change (#3746)
  • Update tophub (#3752)
  • Fix typo in ir_pass.h (#3741)
  • Bug fix for VME Shell (#3737)
  • Fix missing apt https transport support (#3735)
  • Take zero extent loops as NoOp and remove it (#3724)
  • Fix mxnet converter for hybridblock and add div_sqrt_dim (#3701)
  • Fix partial eval unit test name (#3719)
  • Fix conv2d schedule code (#3648, #3717)
  • Remove thread related headers (#3713)
  • Fix FunctionPass (#3712)
  • Export tvm::relay::OpRegistry::OpRegistry (#3711)
  • Fix Metal reinterpret (#3706)
  • Fix gather_nd in Relay (#3442)
  • Fix error in partial evaluator (#3693)
  • Align the naming rule for OpAttributeUnImplemented (#3695)
  • Enable the sparse schedule (#3651)
  • Fix typo names in Caffe2 frontend (#3685)
  • Make tests multi-process friendly. (#3683)
  • Fix typo in README.md (#3684)

People Who Reviewed Pull Requests:

Note: The format is name (number of activities). Disclaimer: the number of activities does not directly correspond to the community's view of the significance of contributions.

tqchen (61), tmoreau89 (43), zhiics (20), vinx13 (16), icemelon9 (15), kevinthesun (13), vegaluisjose (12), FrozenGene (12), jroesch (11), yzhliu (9), MarisaKirisame (8), liangfu (8), masahi (7), slyubomirsky (7), ajtulloch (7), shoubhik (7), cchung100m (6), hlu1 (5), yongwww (5), anijain2305 (5), weberlo (5), merrymercy (5), u99127 (5), ZihengJiang (4), apivovarov (4), junrushao1994 (4), kazum (3), wweic (3), antinucleon (3), cbalint13 (3), yinghai (3), srkreddy1238 (2), eqy (2), mshawcroft (2), were (2), jwfromm (2), soiferj (2), SWu (2), Huyuwei (1), zhreshold (1), lixiaoquan (1), sgrechanik-h (1), yidawang (1), xqdan (1), grwlf (1), alexeyr (1), Mutinifni (1), yuluny2 (1), reminisce (1), kaitingwang (1), TaoLv (1), kparzysz-quic (1), altanh (1), peterjc123 (1)

People Whose Pull Requests are Updated:

Note: The format is name (number of activities)

MarisaKirisame (15), anijain2305 (13), tmoreau89 (9), icemelon9 (8), cchung100m (7), soiferj (6), ZihengJiang (5), vinx13 (5), BenjaminTu (5), tqchen (4), ajtulloch (4), liangfu (4), weberlo (4), yuluny2 (4), wweic (3), apivovarov (3), merrymercy (3), huajsj (3), FrozenGene (3), petrex (3), umangyadav (3), ZQPei (3), zhiics (2), jroesch (2), kevinthesun (2), vegaluisjose (2), lixiaoquan (2), yongwww (2), sgrechanik-h (2), cbalint13 (2), jwfromm (2), tristan-arm (2), abuccts (2), alexgl-github (2), shoubhik (2), yzhliu (1), Laurawly (1), nhynes (1), PariksheetPinjari909 (1), mshawcroft (1), hlu1 (1), were (1), sxjscience (1), abergeron (1), antinucleon (1), junrushao1994 (1), eric-haibin-lin (1), csarofeen (1), comaniac (1), szha (1), marcelotrevisani (1), ghostplant (1), sf-wind (1), kparzysz-quic (1), henrywoo (1), hqucms (1), Lyken17 (1), Oldpan (1), SWu (1), thatch (1), ethanjyx (1), mingwayzhang (1)