[solved] Querying rpc tracker returns empty list (Android device)

Hello,

I am following the instructions from android_rpc,
after launching the tracker on my machine

$ python -m tvm.exec.rpc_tracker --port 9090
INFO:RPCTracker:bind to 0.0.0.0:9090

I run the app and enter the address (tried 0.0.0.0 and real ip), tracker port (here 9090), and key (I use android as recommended), then I query the tracker but get an empty list:

Tracker address localhost:9090

Server List
----------------------------
server-address	key
----------------------------
----------------------------

Queue Status
---------------------------
key   total  free  pending
---------------------------
---------------------------

I am running Ubuntu 16.04 on a company network (internet proxy, etc.).
I can see that port 9090 is listening on both machines. I guess I have missed something, and it is probably not a TVM problem, but I still need your help.
Thank you.

Is the tracker accessible from the phone (e.g., are they on the same NAT)?
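
One rough way to test reachability is a plain TCP connect to the tracker address. This is only a sketch using the Python standard library. The helper name `can_reach` is made up for illustration, and the result only describes the machine it runs on, so it would need to run on a host that sits on the same network as the phone:

```python
import socket

def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

# Example call (the address is a placeholder for the tracker's real IP):
# can_reach("192.168.1.23", 9090)
```

If this returns False from a machine next to the phone, the phone will most likely time out as well.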

Thank you that was the problem.
Now, when trying to run nnvm/tutorials/deploy_model_on_mali_gpu.py, in which I modified the fields related to the backend and device connection, I get:

TVMError: Check failed: sock.Connect(addr): Connect to 192.168.123.107:9090

Context:
I am running an RPC server on my Linux machine bound to 9090.
I am running the Android RPC server with port 5001 forwarded (from netstat -a) as previously mentioned, 9090 reverse forwarded as well, and I set the key in the app to “android”.
In the script I have set the host to the ip address of the android phone (in settings>status>ip address, only available when wifi is activated, 192.168.123.107 here), and the port to 9090.
I have replaced rpc.connect(host, port) by rpc.connect(host, port, key) where key = "android".
And that’s it.

I can move my question to a new discussion if that makes more sense.
Thank you in advance for your help.

I have just checked and in fact no, they are not on the same NAT. They are connected by USB, so forwarding the port with a local address may be the solution. Does 0.0.0.0 correspond to localhost, or is it better to replace it with the ifconfig result?

On some systems 0.0.0.0 is aliased to localhost. You should not set the IP address on the device to 0.0.0.0, as that tells the device that it is itself the tracker.

You can try port forwarding, but I have not tested that approach.

Thanks for the reply.
Port forwarding seems to work. The strange thing is that it works with the localhost address on the phone but not with the real address.
I am speaking in terms of the results of the query_rpc_tracker script: I can see the device in the list with the localhost address on the phone, but I get an empty list with the real address.
I did the port forwarding as mentioned by kparzysz previously.

0.0.0.0 means “any network interface”. When it is used in a listening socket (i.e. in a server), the socket listens on the given port on all IPv4 interfaces.
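
To illustrate, here is a small sketch with the Python standard library (the function name is made up for this example): a socket bound to 0.0.0.0 accepts a connection that arrives through the loopback address, which is what happens when the phone connects to localhost over a forwarded port.

```python
import socket

def bind_any_and_connect_loopback():
    """Listen on 0.0.0.0, then connect to the same port through 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("0.0.0.0", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    # Connect through the loopback interface, one of the "any" interfaces.
    client = socket.create_connection(("127.0.0.1", port), timeout=3.0)
    conn, peer = server.accept()

    bound_addr = server.getsockname()[0]  # address the server is bound to
    peer_addr = peer[0]                   # address the client came from
    for s in (conn, client, server):
        s.close()
    return bound_addr, peer_addr

print(bind_any_and_connect_loopback())
```

The server stays bound to the wildcard address while the client uses a concrete one, which is why 0.0.0.0 makes sense for the tracker's bind address but not as the address entered in the app.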

After some reconfigurations/compilations and better understanding, port forwarding is working now.
Thanks

I also encountered this issue. The server IP is 192.168.1.23 and the Android IP is 192.168.1.19.

After starting the tracker on the server:

python3 -m tvm.exec.rpc_tracker --port 9090

INFO:root:If you are running ROCM/Metal, fork will cause compiler internal error. Try to launch with arg --no-fork
INFO:RPCTracker:bind to 0.0.0.0:9090

I run the app, then query the tracker, but get an empty list.

The app error log is:

07-09 15:31:20.940 6189 6205 W System.err: java.net.SocketTimeoutException: failed to connect to /192.168.1.23 (port 9090) from /192.168.1.19 (port 35636) after 6000ms
07-09 15:31:20.940 6189 6205 W System.err: at libcore.io.IoBridge.connectErrno(IoBridge.java:185)
07-09 15:31:20.940 6189 6205 W System.err: at libcore.io.IoBridge.connect(IoBridge.java:130)
07-09 15:31:20.940 6189 6205 W System.err: at java.net.PlainSocketImpl.socketConnect(PlainSocketImpl.java:129)
07-09 15:31:20.940 6189 6205 W System.err: at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:356)
07-09 15:31:20.940 6189 6205 W System.err: at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
07-09 15:31:20.940 6189 6205 W System.err: at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
07-09 15:31:20.940 6189 6205 W System.err: at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:357)
07-09 15:31:20.940 6189 6205 W System.err: at java.net.Socket.connect(Socket.java:626)
07-09 15:31:20.940 6189 6205 W System.err: at ml.dmlc.tvm.rpc.ConnectTrackerServerProcessor.connectToTracker(ConnectTrackerServerProcessor.java:196)
07-09 15:31:20.940 6189 6205 W System.err: at ml.dmlc.tvm.rpc.ConnectTrackerServerProcessor.run(ConnectTrackerServerProcessor.java:107)
07-09 15:31:20.940 6189 6205 W System.err: at ml.dmlc.tvm.tvmrpc.RPCProcessor.run(RPCProcessor.java:67)
07-09 15:31:20.941 6189 6205 W System.err: java.net.SocketException: getsockname failed: EBADF (Bad file descriptor)
07-09 15:31:20.941 6189 6205 W System.err: at libcore.io.IoBridge.getLocalInetSocketAddress(IoBridge.java:713)
07-09 15:31:20.941 6189 6205 W System.err: at java.net.Socket.close(Socket.java:1553)
07-09 15:31:20.941 6189 6205 W System.err: at ml.dmlc.tvm.rpc.ConnectTrackerServerProcessor.run(ConnectTrackerServerProcessor.java:184)
07-09 15:31:20.941 6189 6205 W System.err: at ml.dmlc.tvm.tvmrpc.RPCProcessor.run(RPCProcessor.java:67)
07-09 15:31:20.941 6189 6205 W System.err: Caused by: android.system.ErrnoException: getsockname failed: EBADF (Bad file descriptor)
07-09 15:31:20.941 6189 6205 W System.err: at libcore.io.Linux.getsockname(Native Method)
07-09 15:31:20.942 6189 6205 W System.err: at libcore.io.ForwardingOs.getsockname(ForwardingOs.java:103)
07-09 15:31:20.942 6189 6205 W System.err: at libcore.io.IoBridge.getLocalInetSocketAddress(IoBridge.java:700)
07-09 15:31:20.942 6189 6205 W System.err: … 3 more
07-09 15:31:20.943 6189 6205 W System.err: 5001
07-09 15:31:20.943 6189 6205 W System.err: java.net.BindException: bind failed: EADDRINUSE (Address already in use)
07-09 15:31:20.943 6189 6205 W System.err: using port: 5002
07-09 15:31:20.943 6189 6205 W System : ClassLoader referenced unknown path: system/framework/mediatek-cta.jar

According to the log, port 5001 is already in use, so the server is listening on port 5002. You’ll need to change the adb port forwarding accordingly when that happens.
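
If it helps, the port can be pulled out of the logcat output automatically. This is only a sketch: the helper names are hypothetical, the `using port: N` pattern is taken from the log above, and the default of 5001 is simply the port mentioned earlier in this thread. It builds an `adb forward` command but does not run it.

```python
import re

def find_server_port(log_text, default=5001):
    """Return the last port announced by a 'using port: N' line, else default."""
    ports = re.findall(r"using port:\s*(\d+)", log_text)
    return int(ports[-1]) if ports else default

def adb_forward_command(port):
    """Build the adb command that forwards host tcp:port to device tcp:port."""
    return ["adb", "forward", f"tcp:{port}", f"tcp:{port}"]

# Two lines copied from the log above.
sample_log = (
    "W System.err: java.net.BindException: bind failed: EADDRINUSE (Address already in use)\n"
    "W System.err: using port: 5002\n"
)
port = find_server_port(sample_log)
print(" ".join(adb_forward_command(port)))
```

The resulting list can be passed to `subprocess.run` once the port looks right.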

Thanks for your professional reply, it really works!