Deep learning: Recurrent Neural Networks. CV 192. Lecturer: Oren Freifeld; TA: Ron Shapira Weber. Photo by Aphex34 / CC BY-SA 4.0
Contents
• Review of previous lecture – CNN
• Recurrent Neural Networks
• Backpropagation through RNN
• LSTM
Example – Object recognition and localization [Karpathy, A., & Fei-Fei, L. (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions]
MLP with one hidden layer [LeCun, Y., Bengio, Y., & Hinton, G. (2015)]
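Not on the original slide, but as a minimal sketch of the architecture being described, here is a one-hidden-layer MLP forward pass in NumPy (the layer sizes and the tanh/softmax choices are illustrative assumptions):

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: affine -> tanh -> affine -> softmax."""
    h = np.tanh(x @ W1 + b1)            # hidden layer activations
    logits = h @ W2 + b2                # output layer pre-activations
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

# Assumed sizes: 4 inputs, 8 hidden units, 3 classes
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1, b1 = rng.normal(size=(4, 8)) * 0.1, np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)) * 0.1, np.zeros(3)
print(mlp_forward(x, W1, b1, W2, b2))   # class probabilities summing to 1
```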
How big should our hidden layer be? https://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
Convolution
• Discrete convolution: slide a kernel over the input matrix.
http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html
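A naive NumPy sketch of the discrete 2D convolution the slide refers to (as in most deep-learning libraries, this is implemented as cross-correlation, i.e. without flipping the kernel); the sizes are illustrative:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2D convolution over a single-channel image:
    slide the kernel over the image and sum element-wise products."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kH, j:j+kW] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1., 0.], [0., -1.]])   # a toy 2x2 kernel
print(conv2d_valid(image, kernel))         # 3x3 output
```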
CNN – Convolutional Layer. Filters learned by AlexNet (2012): 96 convolution kernels of size 11×11×3 learned by the first convolutional layer on the 224×224×3 input images (ImageNet). Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012).
CNN – Pooling Layer
• Reduce the spatial size of the next layer.
http://cs231n.github.io/convolutional-networks/
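A naive sketch of the pooling idea, using max pooling with an assumed 2×2 window and stride 2:

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Naive max pooling: keep the maximum of each window,
    reducing the spatial size of the next layer."""
    H, W = x.shape
    out = np.zeros((H // stride, W // stride))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = x[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = window.max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2d(x))   # 2x2 output, each entry the max of a 2x2 block
```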
CNN – Fully connected layer
• Usually the last layer before the output layer (e.g., softmax).
Feature Visualization
• Visualizing via optimization of the input.
Olah, et al., "Feature Visualization", Distill, 2017.
Recurrent Neural Networks
Sequence modeling – examples:
• Image classification
• Image captioning
• Sentiment analysis
• Machine translation
• Video frame-by-frame classification
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
DRAW: A recurrent neural network for image generation [Gregor, et al. (2015). DRAW: A Recurrent Neural Network for Image Generation]
Recurrent Neural Networks vs Feedforward Networks
Recurrent Neural Networks – Gradient flow
Recurrent Neural Networks
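The slide's equations did not survive extraction; the standard (vanilla) RNN recurrence presumably shown here is

$$h_t = \tanh(W_{hh}\, h_{t-1} + W_{xh}\, x_t + b_h), \qquad y_t = W_{hy}\, h_t + b_y .$$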
Recurrent Neural Networks – Truncated backpropagation "through time":
• Break the backward pass every K steps.
• Forward propagation stays the same.
• Downside?
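A minimal PyTorch sketch of truncated BPTT under illustrative assumptions (a toy nn.RNN, random data): the forward pass runs over the whole sequence, but the hidden state is detached every K steps, so gradients never flow further back than K steps. The downside the slide hints at is exactly this: dependencies longer than K steps cannot be learned.

```python
import torch
import torch.nn as nn

K, input_size, hidden_size = 20, 10, 32
rnn = nn.RNN(input_size, hidden_size, batch_first=True)
readout = nn.Linear(hidden_size, 1)
opt = torch.optim.SGD(list(rnn.parameters()) + list(readout.parameters()), lr=0.01)

seq = torch.randn(1, 1000, input_size)        # one long toy sequence
target = torch.randn(1, 1000, 1)
h = torch.zeros(1, 1, hidden_size)            # initial hidden state

for t in range(0, seq.size(1), K):
    chunk, y = seq[:, t:t+K], target[:, t:t+K]
    out, h = rnn(chunk, h)                    # forward propagation is unchanged
    loss = ((readout(out) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()                           # gradients flow at most K steps back
    opt.step()
    h = h.detach()                            # cut the graph: the truncation point
```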
Recurrent Neural Networks
Recall the vanishing gradient problem…
Gradient behavior in RNN
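The derivation on this slide did not extract; the standard argument (consistent with the recurrence above) is that backpropagating from step $T$ to an earlier step $t$ multiplies many Jacobians:

$$\frac{\partial h_T}{\partial h_t} = \prod_{k=t+1}^{T} \frac{\partial h_k}{\partial h_{k-1}} = \prod_{k=t+1}^{T} \operatorname{diag}\!\big(1 - \tanh^2(a_k)\big)\, W_{hh}, \qquad a_k = W_{hh}\, h_{k-1} + W_{xh}\, x_k + b_h .$$

Repeated multiplication by $W_{hh}$ makes this product shrink toward zero when its largest singular value is below 1 (vanishing gradient) and blow up when it is above 1 (exploding gradient).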
Vanishing / exploding gradient [Example from: Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning]
Vanishing / exploding gradient
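Not from the slide, but a common remedy for the exploding case (discussed in Goodfellow et al.) is gradient clipping; a minimal PyTorch sketch with a toy model (the vanishing case is what motivates the LSTM, next):

```python
import torch
import torch.nn as nn

# Toy model and loss, only to demonstrate gradient-norm clipping.
model = nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 4), torch.randn(8, 1)

loss = ((model(x) - y) ** 2).mean()
opt.zero_grad()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)  # rescale if norm > 5
opt.step()
```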
Long Short-Term Memory (LSTM) [Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory]
LSTM vs regular RNN
• The repeating module in a standard RNN contains a single layer.
• The repeating module in an LSTM contains four interacting layers.
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
LSTM – http://colah.github.io/posts/2015-08-Understanding-LSTMs/
LSTM – Gradient Flow. Chen, G. (2016). A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation
Input gate – http://colah.github.io/posts/2015-08-Understanding-LSTMs/
(tanh) gate – http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Forget gate – http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Output gate – http://colah.github.io/posts/2015-08-Understanding-LSTMs/
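The per-gate equations on the preceding four slides did not extract; in the notation of the cited colah post they are:

$$\begin{aligned}
f_t &= \sigma(W_f\,[h_{t-1}, x_t] + b_f) && \text{(forget gate)}\\
i_t &= \sigma(W_i\,[h_{t-1}, x_t] + b_i) && \text{(input gate)}\\
\tilde{C}_t &= \tanh(W_C\,[h_{t-1}, x_t] + b_C) && \text{(candidate values, the tanh layer)}\\
o_t &= \sigma(W_o\,[h_{t-1}, x_t] + b_o) && \text{(output gate)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \qquad h_t &= o_t \odot \tanh(C_t)
\end{aligned}$$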
LSTM – http://colah.github.io/posts/2015-08-Understanding-LSTMs/
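One way to see why this helps with vanishing gradients (cf. the gradient-flow slide above): along the direct cell-state path, $\partial C_t / \partial C_{t-1} = \operatorname{diag}(f_t)$, so the gradient is rescaled element-wise by the forget gate rather than repeatedly multiplied by the same recurrent weight matrix as in a vanilla RNN.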
LSTM Summary
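A minimal NumPy sketch of a single LSTM step implementing the equations above (the weight layout and sizes are assumptions made for the example):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM step. W maps the concatenated [h_prev, x_t]
    to all four gate pre-activations stacked together."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    H = h_prev.size
    f = sigmoid(z[0:H])              # forget gate
    i = sigmoid(z[H:2*H])            # input gate
    C_tilde = np.tanh(z[2*H:3*H])    # candidate cell values
    o = sigmoid(z[3*H:4*H])          # output gate
    C = f * C_prev + i * C_tilde     # new cell state
    h = o * np.tanh(C)               # new hidden state
    return h, C

H, D = 8, 4                          # assumed hidden / input sizes
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4*H, H + D)) * 0.1, np.zeros(4*H)
h, C = np.zeros(H), np.zeros(H)
h, C = lstm_step(rng.normal(size=D), h, C, W, b)
```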
Increasing model memory
Increasing model memory
• Adding more hidden layers increases the model capacity while computation scales linearly.
• Increases representational power via non-linearity.
Increasing model memory
• Adding depth through 'time'.
• Does not increase memory.
• Increases representational power via non-linearity.
Neural Image Caption Generation with Visual Attention [Xu et al. (2015). Show, Attend and Tell: Neural Image Caption Generation with Visual Attention]
Neural Image Caption Generation with Visual Attention
• Word generation conditioned on:
  • Context vector (CNN features)
  • Previous hidden state
  • Previously generated word
[Xu et al. (2015). Show, Attend and Tell: Neural Image Caption Generation with Visual Attention]
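A highly simplified sketch of one decoding step under these conditioning choices (the names, sizes, and the LSTMCell decoder are illustrative assumptions, not the exact architecture of Xu et al.):

```python
import torch
import torch.nn as nn

# Predict the next word from the context vector (CNN features),
# the previous hidden state, and the previously generated word.
vocab_size, embed_dim, ctx_dim, hidden_dim = 10000, 256, 512, 512
embed = nn.Embedding(vocab_size, embed_dim)
cell = nn.LSTMCell(embed_dim + ctx_dim, hidden_dim)
to_vocab = nn.Linear(hidden_dim, vocab_size)

prev_word = torch.tensor([1])                  # previously generated word id
context = torch.randn(1, ctx_dim)              # CNN feature / context vector
h, c = torch.zeros(1, hidden_dim), torch.zeros(1, hidden_dim)

inp = torch.cat([embed(prev_word), context], dim=1)
h, c = cell(inp, (h, c))                       # condition on the previous hidden state
next_word = to_vocab(h).argmax(dim=1)          # greedy choice of the next word
```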
Shakespeare – http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Visualizing the predictions and the "neuron" firings in the RNN – http://karpathy.github.io/2015/05/21/rnn-effectiveness/
[http://norman-ai.mit.edu/]
Summary
• Deep learning is a field of machine learning that uses artificial neural networks to learn representations in a hierarchical manner.
• Convolutional neural networks achieved state-of-the-art results in computer vision by:
  • Reducing the number of parameters via shared weights and local connectivity.
  • Using a hierarchical model.
• Recurrent neural networks are used for sequence modelling; the vanishing gradient problem is addressed by a model called the LSTM.
Thanks!