Convolutional Neural Network (CNN): Network Architecture Designed for Images
- Slides: 35
Convolutional Neural Network (CNN): network architecture designed for images 1
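Image Classification: a 100 x 100 image goes into the model, which outputs a distribution over the classes (dog, cat, tree, ...); training minimizes the cross entropy against the correct class. (All the images to be classified have the same size.) 2
Image Classification: a color image is a 3-D tensor, 100 x 100 pixels with 3 channels, flattened into a vector of 100 x 100 x 3 values. Each value represents the intensity of one color at one pixel. 3
Fully Connected Network: with 100 x 100 x 3 inputs and 1000 neurons in the first layer, that layer alone has 100 x 100 x 3 x 1000 = 3 x 10^7 weights. Do we really need “fully connected” in image processing? 4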
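The weight count on this slide can be checked with a few lines of arithmetic. This is only a sketch: the function name is made up for illustration, and biases are left out, as on the slide.

```python
def fully_connected_weights(height, width, channels, neurons):
    """Weights (biases excluded) of a dense layer fed with a flattened image."""
    n_inputs = height * width * channels  # length of the flattened image vector
    return n_inputs * neurons             # every neuron sees every input value


# Numbers from the slide: 100 x 100 image, 3 channels, 1000 neurons.
n_weights = fully_connected_weights(100, 100, 3, 1000)
print(n_weights)  # 30000000, i.e. 3 x 10^7
```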
Observation 1: a network can classify by identifying some critical patterns in the input (Layer 1, Layer 2, ...) and concluding “bird”. Perhaps humans also identify birds in a similar way ... 5
https://www.dcard.tw/f/funny/p/233833012 6
Observation 1: Need to see the whole image? A neuron does not have to see the whole image. Layer 1 holds basic detectors and Layer 2 holds advanced detectors. Some patterns are much smaller than the whole image. 7
Simplification 1: each neuron is connected only to its receptive field, e.g. a 3 x 3 region across all 3 channels, so it has 3 x 3 x 3 weights plus a bias. 8
Simplification 1: several neurons can have the same receptive field, and receptive fields can overlap. • Can different neurons have different sizes of receptive field? • Cover only some channels? • Not square receptive field? 9
Simplification 1 – Typical Setting: Each receptive field covers all channels and has a set of neurons (e.g., 64 neurons). The kernel size is small (e.g., 3 x 3). Moving the receptive field by the stride (e.g., stride = 2) makes neighboring fields overlap, and padding fills in the positions that fall outside the image. The receptive fields cover the whole image. 10
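To see how kernel size, stride, and padding place the receptive fields, here is a small sketch along one axis. The function name and the padding of 1 pixel per side are assumptions made for illustration; the slide only gives kernel size 3 and stride 2.

```python
def receptive_field_starts(size, kernel=3, stride=2, padding=1):
    """First pixel index (along one axis) of every receptive field.

    Indices are relative to the unpadded image, so -1 means the field
    starts one pixel inside the padding.
    """
    padded = size + 2 * padding
    count = (padded - kernel) // stride + 1
    return [i * stride - padding for i in range(count)]


starts = receptive_field_starts(6)
print(starts)  # [-1, 1, 3]: fields span -1..1, 1..3, 3..5 and overlap by one pixel
print(len(receptive_field_starts(100)))  # 50 fields per axis on a 100-pixel side
```

With these assumed settings every pixel from 0 to 5 falls inside at least one field, which is the "cover the whole image" condition.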
Observation 2 • The same patterns appear in different regions. Two neurons with different receptive fields can each say “I detect ‘beak’ in my receptive field.” Does each receptive field need its own “beak” detector? 11
Simplification 2: parameter sharing. Two neurons with different receptive fields use the same 3 x 3 x 3 weights and the same bias. 12
Simplification 2: Two neurons with the same receptive field would not share parameters. 13
Simplification 2 – Typical Setting: Each receptive field has a set of neurons (e.g., 64 neurons). 14
Simplification 2 – Typical Setting: Each receptive field has a set of neurons (e.g., 64 neurons). Each receptive field has the neurons with the same set of parameters; each shared set of parameters is called a filter (filter 1, filter 2, filter 3, filter 4, ...). 15
Benefit of Convolutional Layer: Fully Connected Layer (jack of all trades, master of none) + Receptive Field + Parameter Sharing = Convolutional Layer, which has larger model bias (designed for images). • Some patterns are much smaller than the whole image. • The same patterns appear in different regions. 16
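A rough parameter count shows how much the two simplifications shrink the layer. This sketch assumes the example numbers used earlier in the slides (100 x 100 x 3 input, 1000 dense neurons, 64 filters of size 3 x 3); the function names are made up for illustration.

```python
def dense_params(n_inputs, n_neurons):
    """Weights plus one bias per neuron in a fully connected layer."""
    return n_inputs * n_neurons + n_neurons


def conv_params(kernel, in_channels, n_filters):
    """Weights plus one bias per filter; independent of the image size."""
    return (kernel * kernel * in_channels + 1) * n_filters


dense = dense_params(100 * 100 * 3, 1000)
conv = conv_params(kernel=3, in_channels=3, n_filters=64)
print(dense)  # 30001000
print(conv)   # 1792
```

The convolutional layer is far more restricted, which is the larger model bias mentioned on the slide.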
Another story based on filter: a convolutional layer has a set of filters (Filter 1, Filter 2, ...), each a 3 x 3 x channel tensor. channel = 1 for black and white images; channel = 3 for color images. Each filter detects a small pattern (3 x 3 x channel). 17
Convolutional Layer: consider channel = 1 (black and white image), a 6 x 6 image, and 3 x 3 filters. Filter 1 = [1 -1 -1; -1 1 -1; -1 -1 1] and Filter 2 = [-1 1 -1; -1 1 -1; -1 1 -1]. (The values in the filters are unknown parameters.) 18
Convolutional Layer (stride = 1): Filter 1 is placed on each 3 x 3 region of the 6 x 6 image and the inner product is taken, giving the 4 x 4 output [3 -1 -3 -1; -3 1 0 -3; -3 -3 0 1; 3 -2 -2 -1]. 19
Convolutional Layer (stride = 1): Do the same process for every filter. Filter 2 gives [-1 -1 -1 -1; -1 -1 -2 1; -1 -1 -2 1; -1 0 -4 3]. The outputs together form the Feature Map. 20
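The sliding inner product can be sketched in plain Python. The 6 x 6 pixel values below are an assumption: the slide text does not preserve the image, so these were chosen to be consistent with the outputs quoted on the slides. The function name is also just illustrative.

```python
# Assumed 6 x 6 black and white image (values chosen to match the slide outputs).
IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTER_1 = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]   # responds to a diagonal
FILTER_2 = [[-1, 1, -1], [-1, 1, -1], [-1, 1, -1]]   # responds to a vertical line


def convolve2d(image, kernel, stride=1):
    """Inner product of a square kernel with every patch of a square image (no padding)."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    output = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


map_1 = convolve2d(IMAGE, FILTER_1)
map_2 = convolve2d(IMAGE, FILTER_2)
for row in map_1:
    print(row)
```

The largest responses of Filter 1 (the 3s) sit where the assumed image contains the diagonal pattern the filter encodes.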
Convolutional Layer: with 64 convolution filters, the convolution produces 64 feature maps, which together can be seen as an “image” with 64 channels. 21
Multiple Convolutional Layers: the “image” with 64 channels is the input to the next convolution, so each filter in that layer is a 3 x 3 x 64 tensor. 22
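The channel bookkeeping can be traced with a small multi-channel convolution. This is a sketch under assumptions: the 6 x 6 x 3 input, the random weights, the zero biases, and the function names are all illustrative and not taken from the slides; only the 64 filters and the 3 x 3 kernel are.

```python
import random


def conv_layer(inputs, filters, biases):
    """inputs: [C][H][W]; filters: [F][C][k][k]; returns [F][H-k+1][W-k+1]."""
    channels, height, width = len(inputs), len(inputs[0]), len(inputs[0][0])
    k = len(filters[0][0])
    outputs = []
    for filt, bias in zip(filters, biases):
        fmap = []
        for r in range(height - k + 1):
            row = []
            for c in range(width - k + 1):
                total = bias
                for ch in range(channels):      # a filter spans ALL input channels
                    for i in range(k):
                        for j in range(k):
                            total += inputs[ch][r + i][c + j] * filt[ch][i][j]
                row.append(total)
            fmap.append(row)
        outputs.append(fmap)
    return outputs


def random_filters(n_filters, channels, k, rng):
    """Hypothetical stand-in for learned parameters."""
    return [[[[rng.uniform(-1, 1) for _ in range(k)] for _ in range(k)]
             for _ in range(channels)] for _ in range(n_filters)]


rng = random.Random(0)
image = [[[rng.random() for _ in range(6)] for _ in range(6)] for _ in range(3)]
layer_1 = random_filters(64, 3, 3, rng)    # 64 filters, each 3 x 3 x 3
layer_2 = random_filters(64, 64, 3, rng)   # 64 filters, each 3 x 3 x 64

hidden = conv_layer(image, layer_1, [0.0] * 64)
output = conv_layer(hidden, layer_2, [0.0] * 64)
print(len(hidden), len(hidden[0]), len(hidden[0][0]))  # 64 4 4
print(len(output), len(output[0]), len(output[0][0]))  # 64 2 2
```

The depth of a second-layer filter (64) is fixed by the number of filters in the first layer, not by the original image.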
Multiple Convolutional Layers: 64 convolution filters are applied to the input image, and a second convolution is applied to the resulting 64-channel output. 23
Comparison of Two Stories: the weights a neuron applies to its receptive field are the values of a filter, a 3 x 3 x channel tensor such as [1 -1 -1; -1 1 -1; -1 -1 1] (ignore bias in this slide). 24
The neurons with different receptive fields share the parameters. Each filter convolves over the input image. 25
Convolutional Layer: Neuron Version Story vs. Filter Version Story. Neuron version: Each neuron only considers a receptive field. Filter version: There is a set of filters detecting small patterns. Neuron version: The neurons with different receptive fields share the parameters. Filter version: Each filter convolves over the input image. They are the same story. 26
Observation 3 • Subsampling the pixels will not change the object: a subsampled image of a bird is still a bird. 27
Pooling – Max Pooling: the 4 x 4 outputs of Filter 1 and Filter 2 are divided into 2 x 2 groups, and only the maximum value in each group is kept. 28
Convolutional Layers + Pooling: convolution turns the input into an “image” with 64 channels, and pooling shrinks each channel; here the 4 x 4 outputs become [3 0; 3 1] for Filter 1 and [-1 1; 0 3] for Filter 2. Convolution and pooling can be repeated. 29
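Max pooling over 2 x 2 groups can be sketched as follows, using the two 4 x 4 feature maps from the convolution slides as input. The function name and the non-overlapping 2 x 2 grouping are assumptions for illustration.

```python
def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size group."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


map_1 = [[3, -1, -3, -1], [-3, 1, 0, -3], [-3, -3, 0, 1], [3, -2, -2, -1]]
map_2 = [[-1, -1, -1, -1], [-1, -1, -2, 1], [-1, -1, -2, 1], [-1, 0, -4, 3]]

pooled_1 = max_pool(map_1)
pooled_2 = max_pool(map_2)
print(pooled_1)  # [[3, 0], [3, 1]]
print(pooled_2)  # [[-1, 1], [0, 3]]
```

Pooling has no parameters to learn; it only reduces each channel from 4 x 4 to 2 x 2.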
The whole CNN: image → Convolution → Pooling → Convolution → Pooling → Flatten → Fully Connected Layers → softmax → cat, dog, ... 30
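The last three stages (flatten, fully connected layer, softmax) can be sketched on the pooled 2 x 2 maps from the earlier slides. Everything else here is an assumption for illustration: the two class names, the single dense layer, and its random, untrained weights.

```python
import math
import random


def flatten(feature_maps):
    """Turn [channel][row][col] values into one long vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def dense(vector, weights, biases):
    """One fully connected layer: a weighted sum per output neuron."""
    return [sum(w * x for w, x in zip(ws, vector)) + b
            for ws, b in zip(weights, biases)]


def softmax(logits):
    """Convert scores into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


pooled = [[[3, 0], [3, 1]], [[-1, 1], [0, 3]]]   # pooled outputs of the two filters
classes = ["cat", "dog"]                          # hypothetical label set

rng = random.Random(0)
features = flatten(pooled)
weights = [[rng.uniform(-0.5, 0.5) for _ in features] for _ in classes]  # untrained
biases = [0.0 for _ in classes]

probs = softmax(dense(features, weights, biases))
print(features)                   # [3, 0, 3, 1, -1, 1, 0, 3]
print(dict(zip(classes, probs)))
```

With untrained weights the probabilities carry no meaning; the sketch only shows how the shapes connect.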
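Application: Playing Go. The network input is the board, a 19 x 19 matrix (black: 1, white: -1, none: 0) that can be treated as a 19 x 19 vector or as a 19 x 19 image; Alpha Go uses 48 channels per position. The output is the next move, a classification over 19 x 19 classes (19 x 19 positions). A fully-connected network can be used, but CNN performs much better. 31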
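The board encoding and the move-as-class view can be sketched like this. It is a single-channel illustration of the black 1 / white -1 / none 0 scheme on the slide, not the 48-channel Alpha Go input; the function names and the row-major class numbering are assumptions.

```python
BOARD_SIZE = 19
STONE_VALUES = {"black": 1, "white": -1}   # empty points stay 0


def encode_board(stones):
    """stones maps (row, col) to 'black' or 'white'; returns a 19 x 19 matrix."""
    board = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    for (row, col), color in stones.items():
        board[row][col] = STONE_VALUES[color]
    return board


def move_to_class(row, col):
    """Index of a board position among the 19 x 19 = 361 classes (row-major)."""
    return row * BOARD_SIZE + col


def class_to_move(index):
    """Inverse of move_to_class."""
    return divmod(index, BOARD_SIZE)


board = encode_board({(3, 3): "black", (15, 15): "white"})
print(board[3][3], board[15][15], board[0][0])  # 1 -1 0
print(move_to_class(18, 18))                    # 360
print(class_to_move(360))                       # (18, 18)
```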
Why CNN for Go playing? • Some patterns are much smaller than the whole image: Alpha Go uses 5 x 5 filters for the first layer. • The same patterns appear in different regions. 32
Why CNN for Go playing? • Subsampling the pixels will not change the object, which motivates pooling. How to explain this? Alpha Go does not use pooling. 33
More Applications: Speech https://dl.acm.org/doi/10.1109/TASLP.2014.2339736 Natural Language Processing https://www.aclweb.org/anthology/S15-2079/ 34
To learn more ... • CNN is not invariant to scaling and rotation (we need data augmentation). Spatial Transformer Layer: https://youtu.be/SoCywZ1hZak (in Mandarin) 35