RECURRENT NEURAL NETWORKS And Long Short Term Memory
- Slides: 51
RECURRENT NEURAL NETWORKS And Long Short Term Memory 1 Michael Green & Shaked Perek
OUTLINE Introduction Motivation RNN architecture RNN problems LSTM How LSTM solves the problem Paper experiments Conclusions 2
INTRODUCTION RNN were introduced in the late 80’s. Hochreiter discovers the ‘vanishing gradients’ problem in 1991. Long Short Term Memory published in 1997. LSTM a recurrent network to overcome these problems. 3
MOTIVATION Feed forward networks accept a fixed-sized vector as input and produce a fixed-sized vector as output fixed amount of computational steps recurrent nets allow us to operate over sequences of vectors 4
MOTIVATION Image classification Image captioning Sentiment analysis Synced sequence(video classification) 5 Machine translation
RNN ARCHITECTURE Output Hidden Units Delay Inputs 6
RNN ARCHITECTURE Left: feed forward neural network Middle: a simple recurrent neural network Right: Fully connected recurrent neural network 7
RNN FORWARD PASS 8
RNN ARCHITECTURE 9
RNN ARCHITECTURE The recurrent network can be converted into a feed forward network by unfolding over time 10
BACK PROPAGATION THROUGH TIME BPTT learning algorithm is an extension of standard backpropagation that performs gradients descent on an unfolded network. The gradient descent weight updates have contributions from each time step. The errors have to be back-propagated through time as well as through the network 11
RNN BACKWARD PASS For recurrent networks, the loss function depends on the activation of the hidden layer through its influence on the output layer and through its influence on the hidden layer at the next step. 12
RNN BACKWARD PASS For recurrent networks, the loss function depends on the activation of the hidden layer through its influence on the output layer and through its influence on the hidden layer at the next step. 13
RNN BACKWARD PASS For recurrent networks, the loss function depends on the activation of the hidden layer through its influence on the output layer and through its influence on the hidden layer at the next step. 14
RNN BACKWARD PASS 15
RNN BACKWARD PASS 16
RNN BACKWARD PASS 17
BACK PROPAGATION THROUGH TIME 18
BACK PROPAGATION THROUGH TIME 19
BACK PROPAGATION THROUGH TIME 20
VANISHING GRADIENTS 21
VANISHING GRADIENTS 22
OUTLINE - LSTM Introduction Motivation RNN architecture RNN problems Long Short Term Memory How LSTM solves the problem Paper experiments Conclusions 23
LSTM - INTRODUCTION LSTM was invented to solve the vanishing gradients problem. LSTM maintain a more constant error flow in the backpropogation process. LSTM can learn over more than 1000 time steps , and thus can handle large sequences that are linked remotely. 24
LSTM – SAME IDEA AS RNN LSTM 25
LSTM ARCHITECTURE 26
LSTM ARCHITECTURE 27
LSTM ARCHITECTURE Each memory cell contains a node with a self-connected recurrent edge of fixed weight one, ensuring that the gradient can pass across many time steps without Vanishing – which is called CEC (constant error carousel) 28
LSTM ARCHITECTURE - OVERVIEW 29
LSTM ARCHITECTURE – INPUT 30
LSTM ARCHITECTURE – INPUT GATE 31
LSTM ARCHITECTURE – CEC 32
LSTM ARCHITECTURE – OUTPUT GATE 33
LSTM ARCHITECTURE – OUTPUT GATE 34
LSTM FORWARD PASS 35
LSTM – ACTIVATION FUNCTIONS 36
LSTM – ACTIVATION FUNCTIONS 37
LSTM – ACTIVATION FUNCTIONS 38
LSTM BACKWARD PASS Errors arriving at cell outputs are propogated to the CEC Errors can stay for a long time inside the CEC This ensures non-decaying error Can bridge time lags between input events and target signals 39
LSTM BACKWARD PASS Error from the current output Error from the next cell output 40
LSTM BACKWARD PASS We also need to add the gradient from the next step 41
LSTM BACKWARD PASS 42
LSTM BACKWARD PASS 43
LSTM EXPERIMENTS All experiments involve long minimal time lags Complex tasks that cannot be solved by simple strategies such as weight guessing. Comparison to other RNNs Weight initialization [-0. 2, 0. 2], [-0. 1, 0. 1] 44
LSTM EXPERIMENTS – PREDICT NEXT SYMBOL Prediction of the next symbol in a sequence of symbols Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory. " Neural computation 9. 8 (1997): 1735 -1780. 45
LSTM EXPERIMENTS 46
LSTM EXPERIMENTS - CLASSIFICATION 47
ADVANTAGES OF LSTM Non-decaying error backpropagation. For long time lag problems, LSTM can handle noise and continuous values. No parameter fine tuning. Memory for long time periods 48
LIMITATIONS OF LSTM Small gradients 49
LSTM POSSIBLE REMEDY 50
LSTM CONCLUSIONS RNNs - self connected networks Vanishing gradients and long memory problems LSTM - solves the vanishing gradient and the long memory limitation problem LSTM can learn sequences with more than 1000 time steps. 51
- Long and short
- Long term memory vs short term memory
- Pixel recurrent neural networks.
- Rnn andrew ng
- Long term plan and short term plan
- Example of short term human resources
- Difference between long term and short term liabilities
- Difference between long term and short term liabilities
- Short term goal
- Long term and short term financial planning
- Extensions of recurrent neural network language model
- Lstm colah
- Recurrent neural network based language model
- Visualizing and understanding recurrent networks
- Cs 231 n
- Long term memory example
- Prospective memory examples
- Process of memory encoding
- Example of encoding failure
- Types of ltm
- What part of the brain stores long term memory
- Long term memory
- To establish effective short- and long-term goals ____.
- Visualizing and understanding convolutional networks
- Deep neural networks and mixed integer linear optimization
- Least mean square algorithm in neural network
- Neural networks for rf and microwave design
- Neural networks and fuzzy logic
- Neural networks and learning machines
- Short term memory computer
- Short term memory
- Short term memory in psychology
- Memory is the persistence of learning over time
- Once upon a time there lived an old man
- Vc dimension of neural networks
- Freed et al 2001 ib psychology
- Audio super resolution using neural networks
- Convolutional neural networks for visual recognition
- Image style transfer using convolutional neural networks
- Nvdla
- Cnn ppt
- Matlab neural network toolbox pdf
- 11-747 neural networks for nlp
- Perceptron xor
- Csrmm
- On the computational efficiency of training neural networks
- Threshold logic unit
- Introduction to convolutional neural networks
- Convolutional neural networks
- Few shot learning with graph neural networks
- Deep forest towards an alternative to deep neural networks
- Convolutional neural networks