Pure WebAssembly support

kazum · April 27, 2020, 7:30pm

I’m interested in WebAssembly as the next generation of portable and secure binary format, one that can run anywhere and be deployed on, e.g., Krustlet. Pure WASM support without a JavaScript layer looks like an area that other DL frameworks haven’t worked on much yet. I’d like to add more support for it to TVM.

Basically, we already have WASM support through LLVM and our Rust runtime. However, we have no documentation on how to generate and use a WASM library with TVM, and we have no way to autotune the binary.

I have some ideas for these and would like to contribute, but I’m not sure if I’m on the right track. I’d like to hear opinions from experts.

How to generate a deploy lib in the WebAssembly format with TVM

Building a static WASM library is easy. Set the build target to ‘llvm -target=wasm32-unknown-unknown --system-lib’ and save the static lib with Module.save.
Building a shared WASM library looks challenging. clang doesn’t accept the ‘-shared’ option for the WASM target. I managed to create a shared lib somehow, but I still got an error when I used it from Rust programs:
```
rust-lld: error: ../../libtvmwasm.wasm: not a relocatable wasm file
```
Is there any way to generate a relocatable wasm lib?
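
For anyone debugging the same error, here is a minimal sketch of a check that can be run on a candidate file before handing it to rust-lld. It rests on an assumption: as far as I understand, the linker treats a file as relocatable only when it carries the `linking` custom section that LLVM emits for WASM object files. The function names (`custom_section_names`, `looks_relocatable`) are made up for illustration and are not part of TVM or any toolchain.

```rust
/// Read an unsigned LEB128 u32 starting at `pos`.
/// Returns the value and the position just past it.
fn read_leb_u32(bytes: &[u8], mut pos: usize) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    let mut shift = 0u32;
    loop {
        let byte = *bytes.get(pos)?;
        pos += 1;
        if shift >= 32 {
            return None; // too many continuation bytes for a u32
        }
        result |= ((byte & 0x7f) as u32) << shift;
        if byte & 0x80 == 0 {
            return Some((result, pos));
        }
        shift += 7;
    }
}

/// Names of all custom sections in a WASM binary, or None if it is malformed.
pub fn custom_section_names(bytes: &[u8]) -> Option<Vec<String>> {
    // Header: magic "\0asm" followed by version 1 (little endian).
    if bytes.len() < 8 || &bytes[0..4] != b"\0asm" || bytes[4..8] != [1, 0, 0, 0] {
        return None;
    }
    let mut names = Vec::new();
    let mut pos = 8;
    while pos < bytes.len() {
        // Each section is: id byte, LEB128 size, then `size` bytes of body.
        let id = bytes[pos];
        let (size, body_start) = read_leb_u32(bytes, pos + 1)?;
        let body_end = body_start.checked_add(size as usize)?;
        if body_end > bytes.len() {
            return None;
        }
        if id == 0 {
            // A custom section body starts with a length-prefixed UTF-8 name.
            let body = &bytes[..body_end];
            let (name_len, name_start) = read_leb_u32(body, body_start)?;
            let name_end = name_start.checked_add(name_len as usize)?;
            let name = body.get(name_start..name_end)?;
            names.push(String::from_utf8(name.to_vec()).ok()?);
        }
        pos = body_end;
    }
    Some(names)
}

/// Assumption: a "linking" custom section marks a relocatable object file.
pub fn looks_relocatable(bytes: &[u8]) -> bool {
    custom_section_names(bytes)
        .map_or(false, |names| names.iter().any(|n| n == "linking"))
}
```

Reading the file with `std::fs::read` and passing the bytes to `looks_relocatable` shows whether the section is there. If it is missing, the file was most likely produced by a final link step rather than saved as an object file.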

How to use the generated WASM library from other programs

I tried the following two Rust programs with the generated static lib.

Create a WASM binary with WASI and use it from wasmtime.

In my environment, Rust optimized away the link to the deploy lib because we don’t use any symbols in the lib explicitly; we call functions via PackedFunc. To prevent this, I had to add a function that makes it clear we need the lib.
Create a wasCC actor and provide inference serving via HTTP.

In my environment, TVMBackendRegisterSystemLibSymbol of the lib was not called and all of the get_function() calls failed. I had to call __wasm_call_ctors() explicitly to invoke TVMBackendRegisterSystemLibSymbol.
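
The explicit constructor call can be sketched roughly as follows. This is only an illustration under a few assumptions: `__wasm_call_ctors` is the linker-provided symbol on wasm32 targets, the helper names (`run_ctors`, `ensure_system_lib_registered`, `init_calls`) are hypothetical, and the native no-op fallback exists only so the snippet also builds outside of WASM.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

// On wasm32 the linker provides this symbol; it runs the static constructors,
// which is where the system lib registers its functions.
// (On Rust edition 2024, write `unsafe extern "C"` instead.)
#[cfg(target_arch = "wasm32")]
extern "C" {
    fn __wasm_call_ctors();
}

#[cfg(target_arch = "wasm32")]
fn run_ctors() {
    unsafe { __wasm_call_ctors() }
}

// Native fallback so the sketch compiles and can be tested off-target.
#[cfg(not(target_arch = "wasm32"))]
fn run_ctors() {}

static INIT: Once = Once::new();
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Run the constructors at most once, before the first get_function() call.
pub fn ensure_system_lib_registered() {
    INIT.call_once(|| {
        run_ctors();
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    });
}

/// How many times the constructors were actually triggered (0 or 1).
pub fn init_calls() -> usize {
    INIT_CALLS.load(Ordering::SeqCst)
}
```

The `Once` guard is there because running the constructors a second time would register everything again. If the host runtime already runs them at startup, the explicit call should not be needed at all.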

Auto tuning WASM binary

Currently, WASI doesn’t support networking, so it looks impossible for WASM programs to work as an RPC server. For AutoTVM, we need to embed a WASM runtime in a Rust or C program to run the WASM functions.

[A0] Rust: We can use the wasmtime crate. We also have to add RPC support to the Rust frontend. It might be easy to migrate to pure WASM in the future, once WASI supports networking.
[A1] C: We can implement it with the WASM C++ API. I’m not sure how difficult it is, but it looks feasible to me.
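
Whichever option is chosen, the host program has to time each candidate kernel and report the cost back to the tuner. Below is a minimal sketch of that measurement loop in plain Rust. The function name `measure` is hypothetical, the `number` and `repeat` parameters are modeled on TVM's time_evaluator, and the closure stands in for a call into the embedded WASM runtime.

```rust
use std::time::Instant;

/// Take `repeat` measurements. Each one runs `f` `number` times and
/// records the mean wall-clock time per call in seconds.
pub fn measure<F: FnMut()>(mut f: F, number: u32, repeat: u32) -> Vec<f64> {
    assert!(number > 0, "number must be positive");
    let mut costs = Vec::with_capacity(repeat as usize);
    for _ in 0..repeat {
        let start = Instant::now();
        for _ in 0..number {
            // In a real runner this would invoke the exported WASM function.
            f();
        }
        costs.push(start.elapsed().as_secs_f64() / f64::from(number));
    }
    costs
}
```

A call such as `measure(|| run_candidate(), 10, 3)` returns three mean costs, where `run_candidate` is a placeholder for the actual call. The RPC layer would then send these numbers back to the tuner.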

Performance

@tqchen commented that WASM performance might be better than WebGL here, but the currently generated WASM binary is not. It takes a few seconds to run ResNet50 in my environment (Xeon CPU E5-2660).

The situation looks the same for other DL frameworks. For example, I tried ONNX.js in several environments, including mobile phones, and WASM was slower than WebGL (cf. https://microsoft.github.io/onnxjs-demo/#/resnet50).

I guess it is because WASM doesn’t support threading natively yet. Or am I missing something for WASM optimization?

Any comments would be appreciated!

@tqchen, @nhynes, @ehsanmok, @jroesch

tqchen · April 28, 2020, 1:32pm

Thanks for starting the topic. I think one thing we do need to do is reuse the existing CPU AutoTVM templates and possibly tune them for WASM.

The lack of dlopen in WASM is not going to go away for a while due to the special programming model. We recently came up with a rough idea for getting around it and will report back once we have concrete, actionable items after an initial investigation.

tqchen · May 4, 2020, 4:39am

which touches on a related topic (revamping the JS runtime to directly use the WebAssembly standard API). See also how we got around the dlopen problem using the new RPC protocol.