[AutoTVM] LocalRunner not working on Windows


#1

Currently, autotvm.LocalRunner uses the option

use_popen=True

here. When use_popen is True, os.setsid is called inside Server. os.setsid is only available on Unix-like systems, not on Windows, so I get an error when running any of the scripts in tutorial/autotvm that use autotvm.LocalRunner.
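
For reference, here is a minimal sketch of how a launcher could avoid calling os.setsid on platforms that lack it. This is not TVM's code: popen_portable is a hypothetical helper, and the Windows branch assumes that a new process group is an acceptable substitute for a new session.

```python
import os
import subprocess
import sys


def popen_portable(cmd):
    """Start cmd detached from the parent's session or process group."""
    kwargs = {}
    if hasattr(os, "setsid"):
        # POSIX: has the same effect as calling os.setsid() in the child.
        kwargs["start_new_session"] = True
    elif sys.platform == "win32":
        # Windows has no POSIX sessions; fall back to a new process group.
        kwargs["creationflags"] = subprocess.CREATE_NEW_PROCESS_GROUP
    return subprocess.Popen(cmd, **kwargs)


proc = popen_portable([sys.executable, "-c", "print('child started')"])
exit_code = proc.wait()
```

The hasattr check keeps the code importable on Windows, where accessing os.setsid raises AttributeError.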

How can I get around this? My understanding is that RPC is supposed to work on Windows. @merrymercy


#2

One workaround is to start an RPC Tracker and an RPC Server locally. Then change all LocalRunner to RPCRunner.


#3

I’ve modified lines 282-285 of the tutorial tune_simple_template.py like this:

measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder('default'),
    runner=autotvm.RPCRunner("test", host='localhost', port=9090, number=5, timeout=4))

and launched the server: python -m tvm.exec.rpc_server --host="localhost" --port=9090

but it still fails to connect on Windows.
Could you please tell me how to modify the first auto-tuning sample, tune_simple_template.py, so that it runs correctly on Windows? Thanks a lot.


#4
  1. Start an RPC tracker:
python3 -m tvm.exec.rpc_tracker
  2. Start an RPC server:
python3 -m tvm.exec.rpc_server --tracker localhost:9190 --key test
  3. Confirm the connection:
python3 -m tvm.exec.query_rpc_tracker

You are supposed to see a free “test” entry in the queue status.

  4. Use your code.
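
If the client still cannot connect, a quick sanity check is whether anything is listening on the tracker port at all. The sketch below uses only the standard library; port_is_open is a hypothetical helper, and 9190 is the tracker port used in the commands above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Self-contained demonstration against a throwaway local listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
open_while_listening = port_is_open("127.0.0.1", demo_port)
listener.close()
open_after_close = port_is_open("127.0.0.1", demo_port)
```

With the tracker running, port_is_open("localhost", 9190) should return True. A False result points at the tracker process or the port number rather than at the tuning script.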

#5

It seems that there are still some problems. The terminal output looks like this:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "C:\ProgramData\Anaconda3\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\LyndonHu\AppData\Roaming\Python\Python36\site-packages\tvm-0.5.dev0-py3.6-win-amd64.egg\tvm\autotvm\measure\measure_methods.py", line 553, in _check
    remote = request_remote(device_key, host, port, priority)
  File "C:\Users\LyndonHu\AppData\Roaming\Python\Python36\site-packages\tvm-0.5.dev0-py3.6-win-amd64.egg\tvm\autotvm\measure\measure_methods.py", line 520, in request_remote
    tracker = _rpc.connect_tracker(host, port)
  File "C:\Users\LyndonHu\AppData\Roaming\Python\Python36\site-packages\tvm-0.5.dev0-py3.6-win-amd64.egg\tvm\rpc\client.py", line 401, in connect_tracker
    return TrackerSession((url, port))
  File "C:\Users\LyndonHu\AppData\Roaming\Python\Python36\site-packages\tvm-0.5.dev0-py3.6-win-amd64.egg\tvm\rpc\client.py", line 192, in __init__
    self._connect()
  File "C:\Users\LyndonHu\AppData\Roaming\Python\Python36\site-packages\tvm-0.5.dev0-py3.6-win-amd64.egg\tvm\rpc\client.py", line 200, in _connect
    magic = struct.unpack("<i", base.recvall(self._sock, 4))[0]
  File "C:\Users\LyndonHu\AppData\Roaming\Python\Python36\site-packages\tvm-0.5.dev0-py3.6-win-amd64.egg\tvm\rpc\base.py", line 61, in recvall
    raise IOError("connection reset")
OSError: connection reset

Get devices for measurement successfully!
No: 1 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.062399864196777344, timestamp=1540780378.7561827) [(‘tile_y’, [512, 1]), (‘tile_x’, [512, 1])],None,0
No: 2 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.04679989814758301, timestamp=1540780378.7873828) [(‘tile_y’, [1, 512]), (‘tile_x’, [64, 8])],None,39
No: 3 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.07799983024597168, timestamp=1540780379.302183) [(‘tile_y’, [4, 128]), (‘tile_x’, [4, 128])],None,77
No: 4 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.032199859619140625, timestamp=1540780379.5995827) [(‘tile_y’, [4, 128]), (‘tile_x’, [64, 8])],None,37
No: 5 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.04680013656616211, timestamp=1540780379.5995827) [(‘tile_y’, [8, 64]), (‘tile_x’, [4, 128])],None,76
No: 6 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.062400102615356445, timestamp=1540780379.6775827) [(‘tile_y’, [32, 16]), (‘tile_x’, [64, 8])],None,34
No: 7 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.062400102615356445, timestamp=1540780380.3025827) [(‘tile_y’, [4, 128]), (‘tile_x’, [128, 4])],None,27
No: 8 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.062400102615356445, timestamp=1540780380.318183) [(‘tile_y’, [64, 8]), (‘tile_x’, [16, 32])],None,53
No: 9 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.046799659729003906, timestamp=1540780389.9279828) [(‘tile_y’, [512, 1]), (‘tile_x’, [64, 8])],None,30
No: 10 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.031199932098388672, timestamp=1540780389.9435828) [(‘tile_y’, [512, 1]), (‘tile_x’, [1, 512])],None,90
Finish loading 30 records
Cannot find config for target=llvm, workload=(‘matmul’, 512, 512, 512, ‘float32’). A fallback configuration is used, which may bring great performance regression.
Press any key to continue . . .

When the local builder calls the function build(), it raises an error: ‘系统找不到指定的文件。’ (“The system cannot find the file specified.”)


#6

It’s not working for me either. To be fair, anything involving multiprocessing is a pain on Windows…


#7

@stoneforestwhu
The first error seems to be a problem with the RPC Tracker.
Can you see free devices when executing “python3 -m tvm.exec.query_rpc_tracker”?

@masahi
What’s your problem? What is your target? If you tune for CPU, you can pass use_popen=False.
I don’t have a Windows environment. Contributions are welcome.


#8

I have solved the first bug by changing the port from 9090 to 9190. I’m tracking down the second bug and want to know which file it can’t find. I changed the target from “llvm” to “cuda”. The terminal output looks like this:
ConfigSpace (len=100, space_map=
0 tile_y: Split(policy=all, product=512, num_outputs=2) len=10
1 tile_x: Split(policy=all, product=512, num_outputs=2) len=10
)
Get devices for measurement successfully!
No: 1 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.031200885772705078, timestamp=1540880531.8170488) [(‘tile_y’, [128, 4]), (‘tile_x’, [64, 8])],None,32
No: 2 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.0, timestamp=1540880532.175858) [(‘tile_y’, [128, 4]), (‘tile_x’, [512, 1])],None,2
No: 3 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.015600442886352539, timestamp=1540880532.8476753) [(‘tile_y’, [2, 256]), (‘tile_x’, [64, 8])],None,38
No: 4 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.0, timestamp=1540880533.2064846) [(‘tile_y’, [512, 1]), (‘tile_x’, [128, 4])],None,20
No: 5 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.0, timestamp=1540880533.5964947) [(‘tile_y’, [128, 4]), (‘tile_x’, [32, 16])],None,42
No: 6 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.0, timestamp=1540880533.6130948) [(‘tile_y’, [256, 2]), (‘tile_x’, [256, 2])],None,11
No: 7 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.0, timestamp=1540880534.0519063) [(‘tile_y’, [64, 8]), (‘tile_x’, [2, 256])],None,83
No: 8 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.0, timestamp=1540880534.2859123) [(‘tile_y’, [8, 64]), (‘tile_x’, [4, 128])],None,76
No: 9 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.0, timestamp=1540880544.2591732) [(‘tile_y’, [2, 256]), (‘tile_x’, [2, 256])],None,88
No: 10 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(ValueError(‘Direct host side access to device memory is detected in default_function. Did you forget to bind?’,),), error_no=2, all_cost=0.0, timestamp=1540880544.649183) [(‘tile_y’, [16, 32]), (‘tile_x’, [4, 128])],None,75
Finish loading 110 records
Cannot find config for target=llvm, workload=(‘matmul’, 512, 512, 512, ‘float32’). A fallback configuration is used, which may bring great performance regression.
Press any key to continue . . .


#9

You should have different schedules for cuda and cpu. Please refer to the tutorial https://docs.tvm.ai/tutorials/get_started.html#sphx-glr-tutorials-get-started-py


#10

If the target is “llvm”, the terminal output looks like this:

ConfigSpace (len=100, space_map=
0 tile_y: Split(policy=all, product=512, num_outputs=2) len=10
1 tile_x: Split(policy=all, product=512, num_outputs=2) len=10
)
Get devices for measurement successfully!
No: 1 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.07700777053833008, timestamp=1540880974.8627179) [(‘tile_y’, [128, 4]), (‘tile_x’, [64, 8])],None,32
No: 2 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.0590059757232666, timestamp=1540880975.5287843) [(‘tile_y’, [8, 64]), (‘tile_x’, [16, 32])],None,56
No: 3 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.05600571632385254, timestamp=1540880975.6207936) [(‘tile_y’, [512, 1]), (‘tile_x’, [2, 256])],None,80
No: 4 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.06600689888000488, timestamp=1540880976.4048722) [(‘tile_y’, [512, 1]), (‘tile_x’, [1, 512])],None,90
No: 5 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.05500531196594238, timestamp=1540880976.9999316) [(‘tile_y’, [8, 64]), (‘tile_x’, [128, 4])],None,26
No: 6 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.07400727272033691, timestamp=1540880977.0279346) [(‘tile_y’, [128, 4]), (‘tile_x’, [8, 64])],None,62
No: 7 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.05000495910644531, timestamp=1540880977.206952) [(‘tile_y’, [256, 2]), (‘tile_x’, [1, 512])],None,91
No: 8 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.048004865646362305, timestamp=1540880977.223954) [(‘tile_y’, [1, 512]), (‘tile_x’, [16, 32])],None,59
No: 9 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.055005550384521484, timestamp=1540880987.2189534) [(‘tile_y’, [64, 8]), (‘tile_x’, [128, 4])],None,23
No: 10 GFLOPS: 0.00/0.00 result: MeasureResult(costs=(FileNotFoundError(2, ‘系统找不到指定的文件。’, None, 2, None),), error_no=2, all_cost=0.06100630760192871, timestamp=1540880987.2719588) [(‘tile_y’, [64, 8]), (‘tile_x’, [64, 8])],None,33
Finish loading 120 records
Cannot find config for target=llvm, workload=(‘matmul’, 512, 512, 512, ‘float32’). A fallback configuration is used, which may bring great performance regression.
Press any key to continue . . .

It seems that it didn’t work.
My code is from the tutorial “Writing tunable template and Using auto-tuner”:

import logging
import sys

import numpy as np
import tvm

from tvm import autotvm

def matmul_v0(N, L, M, dtype):
    A = tvm.placeholder((N, L), name='A', dtype=dtype)
    B = tvm.placeholder((L, M), name='B', dtype=dtype)

    k = tvm.reduce_axis((0, L), name='k')
    C = tvm.compute((N, M), lambda i, j: tvm.sum(A[i, k] * B[k, j], axis=k), name='C')
    s = tvm.create_schedule(C.op)

    # schedule
    y, x = s[C].op.axis
    k = s[C].op.reduce_axis[0]

    yo, yi = s[C].split(y, 8)
    xo, xi = s[C].split(x, 8)

    s[C].reorder(yo, xo, k, yi, xi)

    return s, [A, B, C]

@autotvm.template  # 1. use a decorator
def matmul_v1(N, L, M, dtype):
    A = tvm.placeholder((N, L), name='A', dtype=dtype)
    B = tvm.placeholder((L, M), name='B', dtype=dtype)

    k = tvm.reduce_axis((0, L), name='k')
    C = tvm.compute((N, M), lambda i, j: tvm.sum(A[i, k] * B[k, j], axis=k), name='C')
    s = tvm.create_schedule(C.op)

    # schedule
    y, x = s[C].op.axis
    k = s[C].op.reduce_axis[0]

    # 2. get the config object
    cfg = autotvm.get_config()

    # 3. define search space
    cfg.define_knob("tile_y", [1, 2, 4, 8, 16])
    cfg.define_knob("tile_x", [1, 2, 4, 8, 16])

    # 4. schedule according to config
    yo, yi = s[C].split(y, cfg['tile_y'].val)
    xo, xi = s[C].split(x, cfg['tile_x'].val)

    s[C].reorder(yo, xo, k, yi, xi)

    return s, [A, B, C]

@autotvm.template
def matmul(N, L, M, dtype):
    A = tvm.placeholder((N, L), name='A', dtype=dtype)
    B = tvm.placeholder((L, M), name='B', dtype=dtype)

    k = tvm.reduce_axis((0, L), name='k')
    C = tvm.compute((N, M), lambda i, j: tvm.sum(A[i, k] * B[k, j], axis=k), name='C')
    s = tvm.create_schedule(C.op)

    # schedule
    y, x = s[C].op.axis
    k = s[C].op.reduce_axis[0]

    ##### define space begin #####
    cfg = autotvm.get_config()
    cfg.define_split("tile_y", y, num_outputs=2)
    cfg.define_split("tile_x", x, num_outputs=2)
    ##### define space end #####

    # schedule according to config
    yo, yi = cfg["tile_y"].apply(s, C, y)
    xo, xi = cfg["tile_x"].apply(s, C, x)

    s[C].reorder(yo, xo, k, yi, xi)

    return s, [A, B, C]

if __name__ == '__main__':
    N, L, M = 512, 512, 512
    task = autotvm.task.create(matmul, args=(N, L, M, 'float32'), target='llvm')
    print(task.config_space)

    logging.getLogger('autotvm').setLevel(logging.DEBUG)
    logging.getLogger('autotvm').addHandler(logging.StreamHandler(sys.stdout))

    measure_option = autotvm.measure_option(
        builder=autotvm.LocalBuilder(),
        runner=autotvm.RPCRunner("test", host='localhost', port=9190, number=5, timeout=4))

    # begin tuning, log records to file `matmul.log`
    tuner = autotvm.tuner.RandomTuner(task)
    tuner.tune(n_trial=10,
               measure_option=measure_option,
               callbacks=[autotvm.callback.log_to_file('matmul.log')])

    # apply history best from log file
    with autotvm.apply_history_best('matmul.log'):
        with tvm.target.create("llvm"):
            s, arg_bufs = matmul(N, L, M, 'float32')
            func = tvm.build(s, arg_bufs)

    # check correctness
    a_np = np.random.uniform(size=(N, L)).astype(np.float32)
    b_np = np.random.uniform(size=(L, M)).astype(np.float32)
    c_np = a_np.dot(b_np)

    c_tvm = tvm.nd.empty(c_np.shape)
    func(tvm.nd.array(a_np), tvm.nd.array(b_np), c_tvm)

    tvm.testing.assert_allclose(c_np, c_tvm.asnumpy(), rtol=1e-2)


#11

Let us make llvm work first.

Set do_fork=False in this line.

Then we can get the traceback of the error.


#12

Thanks a lot. I’ve tried this, but the result is the same.
I traced the code and found this in the file tvm/autotvm/measure/measure_methods.py:

def default_build_func(measure_input, tmp_dir, **kwargs):
    """
    Default build func. This can work for cuda, opencl, llvm backend

    Parameters
    ----------
    measure_input: MeasureInput
        The input of measurement
    tmp_dir: str
        The path of temporary directory to export generated library
    """
    tic = time.time()
    try:
        filename = os.path.join(tmp_dir, "tmp_func_%0x.tar" % getrandbits(64))
        func, arg_info = _build_func_common(measure_input, **kwargs)
        func.export_library(filename)  # <-- the error is raised here
    except Exception as e:  # pylint: disable=broad-except
        return BuildResult(None, None, e, time.time() - tic)
    return BuildResult(filename, arg_info, None, time.time() - tic)

The error is raised at the line func.export_library(filename): the file can’t be found.
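
A FileNotFoundError with this message does not necessarily refer to the output path. One possibility, which this thread does not confirm, is that an external program started during the export (for example an archiver or a compiler) is not on PATH; Python raises the same exception type when a subprocess executable is missing. The sketch below shows that behaviour and a quick PATH check. missing_tools and the tool name are hypothetical and exist only for illustration.

```python
import shutil
import subprocess


def missing_tools(names):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


FAKE_TOOL = "no-such-tool-for-this-demo"  # hypothetical, chosen not to exist

try:
    subprocess.run([FAKE_TOOL, "--version"])
    launch_error = None
except FileNotFoundError as err:
    # Same exception type as in the MeasureResult output above.
    launch_error = err

print(missing_tools([FAKE_TOOL]), type(launch_error).__name__)
```

Which external tools export_library actually needs depends on the TVM version and the file extension, so the list passed to missing_tools would have to be chosen by reading the installed TVM source.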


#13

@merrymercy yes, my target is x86.

My problem is very weird. It seems I’m getting a different kind of error every time I run it. I’ve seen a psutil.AccessDenied error, a timeout error, etc. tvm.exec.rpc_tracker also prints an error in its console after some time, something like the one below. I’m going to look into this further.

C:\Users\mmasuda>python -m tvm.exec.rpc_tracker
INFO:root:If you are running ROCM/Metal, fork will cause compiler internal error. Try to launch with arg ```--no-fork```
INFO:RPCTracker:bind to 0.0.0.0:9190
ERROR:tornado.application:Exception in callback (588, <function wrap.<locals>.null_wrapper at 0x00000230B08228C8>)
Traceback (most recent call last):
  File "C:\Users\mmasuda\AppData\Local\Continuum\Anaconda3_4.4\lib\site-packages\tornado\ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "C:\Users\mmasuda\AppData\Local\Continuum\Anaconda3_4.4\lib\site-packages\tornado\stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "D:\projects\tvm\python\tvm\rpc\tornado_util.py", line 22, in _event_handler
    self._event_handler(events)
  File "D:\projects\tvm\python\tvm\rpc\tornado_util.py", line 59, in _event_handler
    if self._update_read() and (events & self._ioloop.WRITE):
  File "D:\projects\tvm\python\tvm\rpc\tornado_util.py", line 96, in _update_read
    self.on_message(msg)
  File "D:\projects\tvm\python\tvm\rpc\tracker.py", line 198, in on_message
    self.call_handler(json.loads(msg))
  File "D:\projects\tvm\python\tvm\rpc\tracker.py", line 221, in call_handler
    self._tracker.put(key, value)
  File "D:\projects\tvm\python\tvm\rpc\tracker.py", line 298, in put
    self._scheduler_map[key].put(value)
  File "D:\projects\tvm\python\tvm\rpc\tracker.py", line 115, in put
    self._schedule()
  File "D:\projects\tvm\python\tvm\rpc\tracker.py", line 106, in _schedule
    item = heapq.heappop(self._requests)
TypeError: '<' not supported between instances of 'function' and 'function'
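
The TypeError at the end of this traceback is what heapq produces when two heap entries tie on their first element and the next element is a function. Assuming the tracker's request heap holds (priority, callback) tuples, which the traceback suggests but does not show, the following standalone sketch reproduces the failure and shows the usual fix of adding a counter as a tie-breaker. All names are hypothetical; this is not a patch to tracker.py.

```python
import heapq
import itertools


def cb_a():
    return "a"


def cb_b():
    return "b"


# Naive entries: equal priorities force Python to compare the functions.
naive = []
heapq.heappush(naive, (1, cb_a))
try:
    heapq.heappush(naive, (1, cb_b))
    naive_error = None
except TypeError as err:
    naive_error = err

# Fix: a monotonically increasing counter breaks ties before the callback
# is ever compared, and also keeps equal-priority requests in FIFO order.
counter = itertools.count()
fixed = []
for callback in (cb_a, cb_b):
    heapq.heappush(fixed, (1, next(counter), callback))

order = [heapq.heappop(fixed)[2]() for _ in range(2)]
```

This would also explain why the error appears only after some time: it needs two pending requests with the same priority.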

#14

@merrymercy Any idea what is happening in the error below? Is this a timeout error?

$python test.py
Extract tasks...
D:\projects\tvm\python\tvm\tag.py:32: UserWarning: Tag 'broadcast' declared via TagScope was not used.
  warnings.warn("Tag '%s' declared via TagScope was not used." % (self.tag,))
Tuning...
[Task  1/ 9]  Current/Best:    0.00/ 353.29 GFLOPS | Progress: (196/280) | 920.09 sWARNING:autotvm:Too many errors happen in the tuning. Now is in debug mode
DEBUG:autotvm:No: 197   GFLOPS: 0.00/353.29     result: MeasureResult(costs=('',), error_no=7, all_cost=10, timestamp=1540903358.0489917)       [('tile_ic', [1, 3]), ('tile_oc', [4, 16]), ('tile_ow', [32, 7]), ('unroll_kw', False)],,None,191
DEBUG:autotvm:No: 198   GFLOPS: 0.00/353.29     result: MeasureResult(costs=('',), error_no=7, all_cost=10, timestamp=1540903358.895727)        [('tile_ic', [1, 3]), ('tile_oc', [1, 64]), ('tile_ow', [112, 2]), ('unroll_kw', False)],,None,167
DEBUG:autotvm:No: 199   GFLOPS: 0.00/353.29     result: MeasureResult(costs=('',), error_no=7, all_cost=10, timestamp=1540903358.895727)        [('tile_ic', [1, 3]), ('tile_oc', [1, 64]), ('tile_ow', [224, 1]), ('unroll_kw', True)],,None,13
DEBUG:autotvm:No: 200   GFLOPS: 0.00/353.29     result: MeasureResult(costs=('',), error_no=7, all_cost=10, timestamp=1540903358.895727)        [('tile_ic', [1, 3]), ('tile_oc', [16, 4]), ('tile_ow', [28, 8]), ('unroll_kw', False)],,None,201
DEBUG:autotvm:No: 201   GFLOPS: 0.00/353.29     result: MeasureResult(costs=('',), error_no=7, all_cost=10, timestamp=1540903358.895727)        [('tile_ic', [1, 3]), ('tile_oc', [4, 16]), ('tile_ow', [8, 28]), ('unroll_kw', True)],,None,107
DEBUG:autotvm:No: 202   GFLOPS: 0.00/353.29     result: MeasureResult(costs=('',), error_no=7, all_cost=10, timestamp=1540903358.895727)        [('tile_ic', [3, 1]), ('tile_oc', [32, 2]), ('tile_ow', [16, 14]), ('unroll_kw', False)],,None,212
DEBUG:autotvm:No: 203   GFLOPS: 0.00/353.29     result: MeasureResult(costs=('',), error_no=7, all_cost=10, timestamp=1540903359.6766381)       [('tile_ic', [3, 1]), ('tile_oc', [4, 16]), ('tile_ow', [4, 56]), ('unroll_kw', False)],,None,274
...

#15

It seems to be a timeout error. The current default behavior of autotvm is to start displaying more debug information at a certain point. I would not be concerned with this if you can see runs where there are no errors (error_no is 0 or GFLOPS is not 0).
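
To see quickly whether any runs succeeded, one option is to tally the error_no values in the console output. This is a rough sketch that relies only on the "error_no=N" text visible in the log lines quoted in this thread; tally_error_codes is a hypothetical helper, not part of autotvm.

```python
import re
from collections import Counter

ERROR_NO = re.compile(r"error_no=(\d+)")


def tally_error_codes(lines):
    """Count how often each error_no value appears in console output."""
    counts = Counter()
    for line in lines:
        match = ERROR_NO.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Two lines copied from the output above, plus one line without a result.
sample = [
    "DEBUG:autotvm:No: 197   GFLOPS: 0.00/353.29     result: MeasureResult(costs=('',), error_no=7, all_cost=10, timestamp=1540903358.0489917)",
    "DEBUG:autotvm:No: 198   GFLOPS: 0.00/353.29     result: MeasureResult(costs=('',), error_no=7, all_cost=10, timestamp=1540903358.895727)",
    "Tuning...",
]
counts = tally_error_codes(sample)
```

If counts[0] stays at zero over a long stretch of output, no measurement in that stretch completed without error.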


#16

I see GFLOPS > 0 at the beginning, but once it hits the timeout error with GFLOPS = 0, it seems to be stuck there forever.


#17

So my guess is that the timeout is being fired here. I do not have a lot of experience with multiprocessing on Windows. Could it be that there is something wrong with the join() call?


#18

I have no idea. Honestly I don’t want to spend time debugging Windows-only weirdness. If I can’t figure out how to get around this, I’m just going to go ahead and nuke the entire async / rpc / multiprocessing stuff from autotvm.


#19

Ok, that can work too. Originally the main goals of the multiprocessing/async were to tolerate failures and to scale with more hardware devices. If you don’t have too many devices and are reasonably confident that the templates do not have configurations that will crash, then you can try removing these features.
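
To illustrate the failure-tolerance goal mentioned above, here is a small sketch of running a piece of work in a separate interpreter with a timeout, so that a hang or crash does not take down the caller. It uses subprocess rather than autotvm's own machinery; run_isolated and the status strings are hypothetical.

```python
import subprocess
import sys


def run_isolated(snippet, timeout):
    """Run a Python snippet in a child interpreter; return (status, detail)."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return ("timeout", "")
    if done.returncode != 0:
        return ("crash", done.stderr.strip())
    return ("ok", done.stdout.strip())


ok_result = run_isolated("print(6 * 7)", timeout=30)
hang_result = run_isolated("import time; time.sleep(30)", timeout=0.5)
crash_result = run_isolated("raise SystemExit(3)", timeout=30)
```

Removing this kind of isolation makes the tuner simpler, but then a single configuration that hangs or crashes stops the whole tuning run.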