Why does inference time increase when speeding up SSD with TVM?

When we speed up SSD by following the tutorial (https://docs.tvm.ai/tutorials/frontend/deploy_ssd_gluoncv.html), the inference time increases instead of decreasing: 40 ms before the speed-up, 420 ms after. What's more, we must compile LLVM for the CUDA target on the x86 platform; otherwise the code can't run successfully. But when we run the code after compiling LLVM for the CUDA target, the CPU target fails to run. So we want to know:

  1. Why does the inference time increase after speeding up SSD with TVM?
  2. Why must we compile LLVM for the CUDA target on the x86 platform?

PS: Our environment: GPU: 1050 Ti; CPU: Intel x86; MXNet: two versions, one is 1.5.1 compiled from the C++ source, the other is the Python package (mxnet_cu100mkl-1.6.0b20191013-py2.py3-none-manylinux1_x86_64.whl); TVM: from the MXNet 1.5.1 C++ source; LLVM: the latest version, 0.9.0.
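
To rule out measurement effects, such as one-time compilation or GPU initialization being counted in the 420 ms, the comparison could be repeated with warm-up runs excluded. Below is a minimal sketch, assuming a hypothetical `run_once` callable that wraps a single inference call (the MXNet forward pass or the TVM module run). The helper name and its parameters are illustrative and not part of either library.

```python
import time
from statistics import mean


def measure_latency_ms(run_once, warmup=3, repeat=10):
    """Return (average, best, worst) latency in milliseconds.

    `run_once` is a hypothetical zero-argument callable that performs
    one inference. For GPU runs it should block until the result is
    ready (for example by copying the output back to the host),
    otherwise only the kernel launch is timed.
    """
    # Warm-up runs are executed but not timed, so one-time costs
    # (lazy initialization, caching) do not distort the result.
    for _ in range(warmup):
        run_once()

    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        run_once()
        timings.append((time.perf_counter() - start) * 1000.0)

    return mean(timings), min(timings), max(timings)


if __name__ == "__main__":
    # Stand-in workload; replace with the real inference call.
    def fake_inference():
        time.sleep(0.005)

    avg, best, worst = measure_latency_ms(fake_inference)
    print(f"avg {avg:.2f} ms, best {best:.2f} ms, worst {worst:.2f} ms")
```

If the gap between the two backends stays the same with warm-up excluded, the slowdown is in the steady-state run and not in the setup.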

Thanks.