Segmentation fault (core dumped) when using TensorFlow LSTM

When the LSTM max sequence length is small, for example 10, everything is fine. But when I increase it, for example to 50, I get a Segmentation fault (core dumped) without any useful log. Is there a limit (CPU or memory) when using TVM, or is this a bug?

TensorFlow code:

rnn_cell_f = tf.nn.rnn_cell.LSTMCell(self.units, name=self.name + '/rnn_cell_f')
rnn_cell_b = tf.nn.rnn_cell.LSTMCell(self.units, name=self.name + '/rnn_cell_b')
rnn_out, _, _ = tf.nn.static_bidirectional_rnn(rnn_cell_f, rnn_cell_b, x_sequence, dtype=tf.float32)
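
For context, x_sequence is the per-time-step list that static_bidirectional_rnn expects, so the whole sequence gets unrolled into the graph and a longer max sequence length produces a much deeper graph. A sketch of how it might be built (shapes are assumed, not my real values):

seq_len, feature_dim = 50, 128  # assumed values
inputs = tf.placeholder(tf.float32, [None, seq_len, feature_dim])
# static_bidirectional_rnn takes a list with one [batch, features] tensor per time step
x_sequence = tf.unstack(inputs, num=seq_len, axis=1)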

TVM code:

import tensorflow as tf
from tvm import relay
import tvm.relay.testing.tf as tf_testing

with tf.Session() as sess:
    with tf.gfile.FastGFile(tf_model, 'rb') as f:
        # Load the frozen graph and add output shapes for the TVM importer
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')
        graph_def = tf_testing.ProcessGraphDefParam(graph_def)
        graph_def = tf_testing.AddShapesToGraphDef(sess, 'dense_1/Softmax')

        # Convert to Relay and build
        sym, params = relay.frontend.from_tensorflow(graph_def, layout=layout, shape=shape_dict, outputs=['dense_1/Softmax:0'])
        with relay.build_config(opt_level=3):
            graph, lib, params = relay.build(sym, target=target, params=params)
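
The other names above (tf_model, layout, shape_dict, target) are defined elsewhere in my script; roughly they look like this (the input name and shape are placeholders, not my real values):

tf_model = 'bilstm_model.pb'            # path to the frozen TensorFlow graph (placeholder)
layout = None                           # keep the model's NHWC layout on CPU
target = 'llvm'                         # CPU target
shape_dict = {'input_1': (1, 50, 128)}  # input name -> (batch, seq_len, features), placeholder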

It seems that you have hit a stack overflow: the unrolled LSTM produces a deeply nested graph, and Relay's recursive passes exceed the default thread stack size during the build.

You can try running the build in a thread with a larger stack:


import threading


def _build(mod, target, params, ret_dict):
    with relay.build_config(opt_level=3):
        graph, lib, params = relay.build(mod, target=target, params=params)
        ret_dict['graph'] = graph
        ret_dict['lib'] = lib
        ret_dict['params'] = params


ret_dict = {}
build_thread = threading.Thread(target=_build, args=(sym, target, params, ret_dict))
# Give the worker thread a much larger stack (~108 MB), then restore the old size
old_stack_size = threading.stack_size(1024 * 1024 * 108)
build_thread.start()
threading.stack_size(old_stack_size)
build_thread.join()

graph = ret_dict['graph']
lib = ret_dict['lib']
params = ret_dict['params']
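
Once the build succeeds, the returned artifacts can be run with the graph runtime as usual; a sketch (the input name and data are placeholders):

import numpy as np
import tvm
from tvm.contrib import graph_runtime

ctx = tvm.cpu(0)  # assuming target='llvm'
m = graph_runtime.create(graph, lib, ctx)
m.set_input(**params)
m.set_input('input_1', tvm.nd.array(np.random.rand(1, 50, 128).astype('float32')))  # placeholder input
m.run()
out = m.get_output(0).asnumpy()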




That's it, thanks. My program works fine now.