MicroTVM: store parameters as external data structure

Wheest · March 4, 2024, 4:24pm

I’m compiling programs for MicroTVM (for a custom target), so I generate C code, and use my own compilation toolchain (much like this tutorial).

However, my platform has limited memory, and for larger models my linker refuses to link because we run out of memory.

This is because, as far as I can see, all the parameters are stored in the ELF file under .rodata. They are emitted in the generated file codegen/host/src/default_lib0.c, which in my case is over 523MB. Once compiled, this file is a more manageable 131MB, but that is still too large for my linker to accept.

If I had the weights as an external serialised data structure, then my toolchain would not complain.

I’ve tried enabling and disabling USMP (Unified Static Memory Planning), but it does not change how default_lib0.c is handled.

My build configuration looks like this:

    # Use the AOT executor rather than graph or vm executors. Use unpacked API and C calling style.
    EXECUTOR = tvm.relay.backend.Executor(
        "aot",
        {"unpacked-api": True, "interface-api": "c", "workspace-byte-alignment": 8},
    )

I found some additional configuration options here, including link-params which looks promising.

Looking at the docs, link-params is described as:

{"link-params":True} enables parameters to be linked into the generated files rather than provided externally.

Therefore, I tried setting "link-params": False, in my executor configuration.

With this set, the data moves from default_lib0.c to default_lib1.c (which is also compiled in my original flow, but is smaller there).

Is there a canonical way of storing my weights in an external data structure stored on disk, rather than as part of the ELF file?

In theory, maybe I could change my toolchain to allow a larger .rodata section, but I’m unsure what downstream problems that could cause, since I think this data is loaded into ROM when the binary is executed (and I don’t have enough of it).

This post seems relevant, as it talks about the issues of having limited ROM, and about marking things within default_lib1.c as going in different areas of memory. I still need to understand what it’s doing to see if it helps, but in the meantime any help would be appreciated.

Wheest · March 5, 2024, 1:05pm

I’ve gone a bit deeper into this problem, and my understanding is thus:

Many modern operating systems use demand paging on .rodata, so that it is only loaded into memory when needed. Therefore for most systems the approach taken by MicroTVM is more than enough.

However, the system I’m working with doesn’t have that. My storage is memory mapped, so I can edit my linker script to place .rodata there. The issue, of course, is that this is very slow: the data is never actually moved into memory, so every array access goes to storage.

Therefore, the solution I’m going for is to serialise the weights and load and free them on demand. I’m pretty sure MicroTVM doesn’t have support for that yet?
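To make the idea concrete, here is a rough host-side sketch of what "serialise the weights and load them on demand" could look like. This is not a MicroTVM feature or API; the file layout, the `MAGIC` tag, and the names `save_weights` and `LazyWeights` are all hypothetical, and it assumes float32 weights written and read on a machine with the same byte order. On the device, the reader side would need an equivalent written in C.

```python
import json
import struct
from array import array

MAGIC = b"WTS0"  # hypothetical 4-byte tag identifying this made-up format


def save_weights(path, tensors):
    """Write {name: list of floats} as one flat file with a JSON index.

    Layout: MAGIC | uint32 header length | JSON index | raw float32 data.
    Data is written in the host's native byte order (an assumption).
    """
    index = {}
    blobs = []
    offset = 0
    for name, values in tensors.items():
        blob = array("f", values).tobytes()
        index[name] = {"offset": offset, "nbytes": len(blob)}
        blobs.append(blob)
        offset += len(blob)
    header = json.dumps(index).encode("utf-8")
    with open(path, "wb") as f:
        f.write(MAGIC)
        f.write(struct.pack("<I", len(header)))
        f.write(header)
        for blob in blobs:
            f.write(blob)


class LazyWeights:
    """Reads only the index up front; tensors are fetched one at a time."""

    def __init__(self, path):
        self._file = open(path, "rb")
        if self._file.read(4) != MAGIC:
            raise ValueError("not a weights file in the expected format")
        (header_len,) = struct.unpack("<I", self._file.read(4))
        self._index = json.loads(self._file.read(header_len))
        # Tensor data starts after the magic, the length field, and the index.
        self._data_start = 8 + header_len

    def load(self, name):
        """Seek to one tensor and read just its bytes into memory."""
        entry = self._index[name]
        self._file.seek(self._data_start + entry["offset"])
        values = array("f")
        values.frombytes(self._file.read(entry["nbytes"]))
        return values

    def close(self):
        self._file.close()
```

Usage would be along the lines of `w = LazyWeights("weights.bin")`, then `conv1 = w.load("conv1")` just before the layer runs and `del conv1` afterwards, so only one layer's parameters are resident at a time. The generated operator code would still need a way to take the weights through a pointer rather than a linked-in array, which is the part I don't think is supported today.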