Recurrent network optimization

I searched the forum for information on optimizing recurrent neural networks but didn't find anything. Does anyone have ideas on how to optimize recurrent networks?

For example, pruning an LSTM network, or any other optimization techniques?

There is an example implementation of persistent LSTMs here if that helps.
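
On the pruning part of the question, here is a minimal sketch of magnitude pruning, one common approach, written in plain Python so it doesn't depend on any particular framework. The function name `magnitude_prune` and the toy 2x4 matrix (standing in for one LSTM gate's weights) are made up for illustration and are not taken from any library or from the implementation linked above.

```python
def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of a 2-D weight matrix.

    Hypothetical helper for illustration. Returns (pruned_weights, mask),
    where mask holds 1 for kept weights and 0 for pruned ones.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")

    # Rank every entry by absolute value, remembering its position.
    ranked = sorted(
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    )

    # Number of weights to remove (rounded down).
    k = int(len(ranked) * sparsity)

    mask = [[1] * len(row) for row in weights]
    for _, i, j in ranked[:k]:
        mask[i][j] = 0

    pruned = [
        [w if m else 0.0 for w, m in zip(row, mask_row)]
        for row, mask_row in zip(weights, mask)
    ]
    return pruned, mask


# Toy weights standing in for one gate of an LSTM layer (assumed values).
W = [
    [0.9, -0.05, 0.3, 0.01],
    [-0.7, 0.02, -0.4, 0.6],
]

pruned, mask = magnitude_prune(W, sparsity=0.5)
print(pruned)  # [[0.9, 0.0, 0.0, 0.0], [-0.7, 0.0, -0.4, 0.6]]
print(mask)    # [[1, 0, 0, 0], [1, 0, 1, 1]]
```

In a real setup you would typically apply this per weight matrix of the LSTM, keep the mask, and reapply it after each fine-tuning step so the pruned weights stay at zero. Whether the sparsity actually speeds up inference depends on whether your kernels can exploit it.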