ECE 6504: Deep Learning for Perception
Topics:
  – Recurrent Neural Networks (RNNs)
  – BackProp Through Time (BPTT)
  – Vanishing / Exploding Gradients
  – [Abhishek:] Lua / Torch Tutorial
Dhruv Batra
Virginia Tech
Administrativia
• HW 3
  – Out today
  – Due in 2 weeks
  – Please please start early
  – https://computing.ece.vt.edu/~f15ece6504/homework3/
(C) Dhruv Batra
Plan for Today
• Model
  – Recurrent Neural Networks (RNNs)
• Learning
  – BackProp Through Time (BPTT)
  – Vanishing / Exploding Gradients
• [Abhishek:] Lua / Torch Tutorial
New Topic: RNNs
Image Credit: Andrej Karpathy
Synonyms
• Recurrent Neural Networks (RNNs)
• Recursive Neural Networks
  – General family; think graphs instead of chains
• Types:
  – Long Short Term Memory (LSTMs)
  – Gated Recurrent Units (GRUs)
  – Hopfield network
  – Elman networks
  – …
• Algorithms
  – BackProp Through Time (BPTT)
  – BackProp Through Structure (BPTS)
What’s wrong with MLPs?
• Problem 1: Can’t model sequences
  – Fixed-sized Inputs & Outputs
  – No temporal structure
• Problem 2: Pure feed-forward processing
  – No “memory”, no feedback
Image Credit: Alex Graves, book
Sequences are everywhere…
Image Credit: Alex Graves and Kevin Gimpel
Even where you might not expect a sequence…
Image Credit: Vinyals et al.
Even where you might not expect a sequence…
• Input ordering = sequence
Image Credit: Ba et al.; Gregor et al.
Image Credit: [Pinheiro and Collobert, ICML 14]
Why model sequences? Figure Credit: Carlos Guestrin
Why model sequences?
Image Credit: Alex Graves
Name that model
[Figure: chain of variables Y1, …, Y5, each taking values in {a, …, z}, with corresponding X1, …, X5]
• Hidden Markov Model (HMM)
Figure Credit: Carlos Guestrin
How do we model sequences?
• No input
Image Credit: Bengio, Goodfellow, Courville
How do we model sequences?
• With inputs
Image Credit: Bengio, Goodfellow, Courville
How do we model sequences?
• With inputs and outputs
Image Credit: Bengio, Goodfellow, Courville
How do we model sequences?
• With Neural Nets
Image Credit: Alex Graves
How do we model sequences?
• It’s a spectrum…
  – Input: no sequence, Output: no sequence. Example: “standard” classification / regression problems
  – Input: no sequence, Output: sequence. Example: Im2Caption
  – Input: sequence, Output: no sequence. Example: sentence classification, multiple-choice question answering
  – Input: sequence, Output: sequence. Example: machine translation, video captioning, open-ended question answering, video question answering
Image Credit: Andrej Karpathy
Things can get arbitrarily complex
Image Credit: Herbert Jaeger
Key Ideas
• Parameter Sharing + Unrolling
  – Keeps the number of parameters in check
  – Allows arbitrary sequence lengths!
• “Depth”
  – Measured in the usual sense of layers
  – Not unrolled timesteps
• Learning
  – Is tricky even for “shallow” models due to unrolling
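The parameter-sharing and unrolling ideas can be sketched in a few lines of Python. This is a minimal illustration rather than code from the lecture: it assumes a single hidden unit with a tanh nonlinearity, and the names `rnn_forward`, `w_xh`, `w_hh`, and `b_h` are hypothetical.

```python
import math

def rnn_forward(xs, w_xh, w_hh, b_h, h0=0.0):
    """Unroll a one-unit tanh RNN over a sequence of scalar inputs.

    The same three parameters (w_xh, w_hh, b_h) are reused at every
    timestep, so the parameter count does not depend on len(xs).
    """
    h = h0
    hs = []
    for x in xs:  # one unrolled "copy" of the cell per timestep
        h = math.tanh(w_xh * x + w_hh * h + b_h)
        hs.append(h)
    return hs

# Illustrative parameter values (assumed, not from the slides).
params = dict(w_xh=0.5, w_hh=0.9, b_h=0.0)

# The same parameters handle sequences of different lengths.
short = rnn_forward([1.0, 0.0], **params)
long = rnn_forward([1.0, 0.0, 0.0, 0.0, 0.0], **params)
print(len(short), len(long))
```

Because the loop body is identical at every step, making the sequence longer adds computation (and a longer path for gradients) but no new parameters.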
Plan for Today
• Model
  – Recurrent Neural Networks (RNNs)
• Learning
  – BackProp Through Time (BPTT)
  – Vanishing / Exploding Gradients
• [Abhishek:] Lua / Torch Tutorial
BPTT
Image Credit: Richard Socher
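The slide's figure is not available in this transcript, so here is a hedged sketch of BPTT on the smallest possible case. It assumes a one-unit tanh RNN with a squared-error loss on the hidden state at every timestep; the function names and the numbers are hypothetical. The backward loop walks the unrolled network from the last timestep to the first, accumulating gradients for the shared parameters, and a finite-difference check confirms the result.

```python
import math

def forward(xs, w_xh, w_hh, b_h):
    hs = [0.0]  # hs[0] is the initial state h_0
    for x in xs:
        hs.append(math.tanh(w_xh * x + w_hh * hs[-1] + b_h))
    return hs

def loss(xs, ys, w_xh, w_hh, b_h):
    # Assumed loss: sum over timesteps of 0.5 * (h_t - y_t)^2.
    hs = forward(xs, w_xh, w_hh, b_h)
    return sum(0.5 * (h - y) ** 2 for h, y in zip(hs[1:], ys))

def bptt(xs, ys, w_xh, w_hh, b_h):
    """Return (dL/dw_xh, dL/dw_hh, dL/db_h) via backprop through time."""
    hs = forward(xs, w_xh, w_hh, b_h)
    d_wxh = d_whh = d_b = 0.0
    dh_next = 0.0  # gradient arriving from timestep t + 1
    for t in reversed(range(len(xs))):
        h, h_prev = hs[t + 1], hs[t]
        dh = (h - ys[t]) + dh_next   # local loss term + future terms
        da = dh * (1.0 - h * h)      # back through tanh
        d_wxh += da * xs[t]          # shared parameters: gradients add up
        d_whh += da * h_prev
        d_b += da
        dh_next = da * w_hh          # pass gradient to timestep t - 1
    return d_wxh, d_whh, d_b

xs = [1.0, -0.5, 0.3, 0.8]
ys = [0.2, 0.1, -0.3, 0.5]
w = [0.5, 0.9, 0.1]  # w_xh, w_hh, b_h (illustrative values)

analytic = bptt(xs, ys, *w)

# Central-difference check of each gradient component.
eps = 1e-6
numeric = []
for i in range(3):
    hi, lo = list(w), list(w)
    hi[i] += eps
    lo[i] -= eps
    numeric.append((loss(xs, ys, *hi) - loss(xs, ys, *lo)) / (2 * eps))

max_err = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_err)
```

The line `dh_next = da * w_hh` is where the repeated multiplication by the recurrent weight enters, which is the source of the vanishing and exploding gradient behaviour.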
Illustration [Pascanu et al]
• Intuition
• Error surface of a single hidden unit RNN; High curvature walls
• Solid lines: standard gradient descent trajectories
• Dashed lines: gradient rescaled to fix problem
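The intuition can be made concrete with a deliberately simplified sketch. It assumes a linear single-unit recurrence h_t = w_hh * h_{t-1} (the nonlinearity is dropped for clarity), so the gradient of the final state with respect to the initial state is just w_hh multiplied by itself once per timestep. The function name `backprop_factor` is hypothetical.

```python
def backprop_factor(w_hh, steps):
    """Return d h_T / d h_0 for the linear recurrence h_t = w_hh * h_{t-1}.

    Each step back in time multiplies the gradient by w_hh, so the
    result equals w_hh ** steps.
    """
    factor = 1.0
    for _ in range(steps):
        factor *= w_hh
    return factor

# |w_hh| < 1 shrinks the gradient, |w_hh| > 1 blows it up.
for w in (0.5, 1.0, 1.5):
    print(w, backprop_factor(w, steps=50))
```

With 50 timesteps, a recurrent weight of 0.5 drives the factor toward zero (vanishing), while 1.5 makes it grow by many orders of magnitude (exploding); only a weight of exactly 1.0 leaves it unchanged.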
Fix #1
• Pseudocode
Image Credit: Richard Socher
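The pseudocode on this slide did not survive in the transcript. Given the "gradient rescaled" remark on the preceding illustration, one plausible reading is gradient-norm clipping as described by Pascanu et al.; the sketch below assumes that reading. The function name `clip_by_norm` and the threshold value are hypothetical.

```python
import math

def clip_by_norm(grads, threshold):
    """Rescale the gradient vector if its L2 norm exceeds the threshold.

    The direction is preserved; only the length is capped.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > threshold:
        scale = threshold / norm
        return [g * scale for g in grads]
    return list(grads)  # small gradients pass through unchanged

clipped = clip_by_norm([3.0, 4.0], threshold=1.0)    # norm 5 -> rescaled
untouched = clip_by_norm([0.3, 0.4], threshold=1.0)  # norm 0.5 -> kept
print(clipped, untouched)
```

Clipping addresses exploding gradients only; it does nothing for gradients that have already vanished.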
Fix #2
• Smart Initialization and ReLUs
  – [Socher et al 2013]
  – A Simple Way to Initialize Recurrent Networks of Rectified Linear Units, Le et al. 2015
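As a rough sketch of the initialization idea in Le et al. 2015, the recurrent weight matrix can be set to the identity and the biases to zero, with ReLU as the hidden nonlinearity. The code below is an illustration under assumptions, not the authors' implementation: the helper names are hypothetical, and the small Gaussian used for the input weights is an assumed choice.

```python
import random

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def relu(v):
    return [max(0.0, a) for a in v]

def irnn_step(h, x, w_hh, w_xh, b):
    """One step of h_t = ReLU(W_hh h_{t-1} + W_xh x_t + b)."""
    n = len(h)
    pre = [
        sum(w_hh[i][j] * h[j] for j in range(n))
        + sum(w_xh[i][k] * x[k] for k in range(len(x)))
        + b[i]
        for i in range(n)
    ]
    return relu(pre)

random.seed(0)
n_hidden, n_in = 3, 2
w_hh = identity(n_hidden)   # recurrent weights: identity matrix
b = [0.0] * n_hidden        # biases: zero
w_xh = [[random.gauss(0.0, 0.001) for _ in range(n_in)]
        for _ in range(n_hidden)]  # input weights: assumed small Gaussian

# With zero input, a nonnegative state is copied forward unchanged.
h = [0.5, 0.0, 2.0]
for _ in range(100):
    h = irnn_step(h, [0.0, 0.0], w_hh, w_xh, b)
print(h)
```

At initialization the recurrence neither shrinks nor amplifies a nonnegative hidden state, so information (and gradient) can survive across many timesteps before training moves the weights away from the identity.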