Unsupervised Learning: Generation (Programming Assignment #2)
Creation • Generative Models: https://openai.com/blog/generative-models/ • "What I cannot create, I do not understand." (Richard Feynman; https://www.quora.com/What-did-Richard-Feynman-mean-when-he-said-What-I-cannotcreate-I-do-not-understand)
Creation – Image Processing: now vs. in the future, a machine draws a cat (image: http://www.wikihow.com/Draw-a-Cat-Face)
Generative Models • PixelRNN • Variational Autoencoder (VAE) • Generative Adversarial Network (GAN) – later
LSTM nitty-gritty: the input is a tensor of shape [batch, timesteps, feature]; the number of LSTM units (cells) determines the output feature size.
LSTM nitty-gritty: all batches must have the same feature length, but each batch can have a different number of time steps.
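A minimal sketch of this input convention, assuming tf.keras (the 512 units and 167 features echo numbers used later in these slides; any framework with the same [batch, timesteps, feature] layout works the same way). Declaring the timesteps dimension as None lets different batches have different lengths:

```python
import numpy as np
import tensorflow as tf

n_units = 512      # number of LSTM units (cells); also the output feature size
n_feature = 167    # input feature length, fixed across all batches

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, n_feature)),  # timesteps = None (variable)
    tf.keras.layers.LSTM(n_units),
])

# Two batches with different numbers of time steps but the same feature length:
batch_a = np.random.rand(8, 20, n_feature).astype("float32")
batch_b = np.random.rand(8, 35, n_feature).astype("float32")
print(model(batch_a).shape)  # (8, 512)
print(model(batch_b).shape)  # (8, 512)
```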
PixelRNN (Ref: Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu, Pixel Recurrent Neural Networks, arXiv preprint, 2016): to create an image, generate one pixel at a time, each predicted by a NN from all previously generated pixels (e.g., nine steps for a 3x3 image). It can be trained with just a large collection of images, without any annotation.
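The generation loop might look like the following sketch. Here `model` is a hypothetical trained next-pixel predictor ending in a softmax over the color palette, not the paper's exact architecture, and the zero "start token" is our own convention:

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_image(model, n_pixels, n_colors):
    pixels = []                                  # pixels generated so far
    for _ in range(n_pixels):                    # e.g. 9 for a 3x3 image
        # One-hot encode the sequence so far; position 0 is an all-zero start token.
        seq = np.zeros((1, len(pixels) + 1, n_colors), dtype="float32")
        for t, p in enumerate(pixels):
            seq[0, t + 1, p] = 1.0
        probs = model.predict(seq, verbose=0)[0]  # distribution over the next pixel
        probs = probs / probs.sum()               # guard against float32 rounding
        pixels.append(int(rng.choice(n_colors, p=probs)))
    return np.array(pixels)
```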
PixelRNN (Ref: Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu, Pixel Recurrent Neural Networks, arXiv preprint, 2016): results on real-world images (figure from the paper).
More than images … Audio: Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu, WaveNet: A Generative Model for Raw Audio, arXiv preprint, 2016. Video: Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu, Video Pixel Networks, arXiv preprint, 2016.
Practicing Generation Models: Pokémon Creation • Small images of 792 Pokémon • Can a machine learn to create new Pokémon? Don't catch them! Create them! • Source of images: http://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_by_base_stats_(Generation_VI) • Original images are 40x40; they are downsized to 20x20.
Practicing Generation Models: Pokémon Creation • Tips (?) • Option 1: represent each pixel by 3 numbers corresponding to RGB, e.g. R=50, G=150, B=100 • Option 2 (used here): cluster similar colors and represent each pixel by a 1-of-N encoding over the resulting palette, 167 colors in total (see the sketch below).
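A minimal sketch of the 1-of-N color encoding just described. The palette size of 167 clustered colors comes from the slides; the helper name and shapes are our own:

```python
import numpy as np

def to_one_hot(pixel_indices, n_colors=167):
    """Map each pixel's color index (0..166) to a 1-of-N vector."""
    one_hot = np.zeros((len(pixel_indices), n_colors), dtype="float32")
    one_hot[np.arange(len(pixel_indices)), pixel_indices] = 1.0
    return one_hot

# A 20x20 image flattened to 400 color indices becomes a (400, 167) matrix:
image = np.random.randint(0, 167, size=400)
print(to_one_hot(image).shape)  # (400, 167)
```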
Practicing Generation Models: Pokémon Creation • Original images (40x40): http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Pokemon_creation/image.rar • Pixels (20x20): http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Pokemon_creation/pixel_color.txt (each line corresponds to an image, and each number corresponds to a pixel) • Color map from index (0, 1, 2, …) to color: http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Pokemon_creation/colormap.txt • The following experiments use a 1-layer LSTM with 512 cells.
Real Pokémon, never seen by the machine, with 50% or 75% of each image covered; the model completes the missing part. It is difficult to evaluate generation.
Pokémon Creation: drawing from scratch needs some randomness; instead of always taking the most likely color, sample from the predicted distribution (see the sketch below).
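A minimal sketch of that choice, assuming the model outputs a softmax distribution `probs` over the color palette (the helper name is ours). Always taking the argmax makes every generated image identical; sampling gives varied Pokémon:

```python
import numpy as np

rng = np.random.default_rng(0)

def next_pixel(probs, greedy=False):
    if greedy:
        return int(np.argmax(probs))           # deterministic: same image every time
    probs = probs / probs.sum()                # guard against float32 rounding
    return int(rng.choice(len(probs), p=probs))  # stochastic: varied images
```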
Generative Models • PixelRNN • Variational Autoencoder (VAE) (Diederik P. Kingma, Max Welling, Auto-Encoding Variational Bayes, arXiv preprint, 2013) • Generative Adversarial Network (GAN)
Auto-encoder: an NN Encoder maps the image to a code and an NN Decoder maps the code back, trained so that input and output are as close as possible. Can we randomly generate a vector as the code and feed it to the NN Decoder to produce an image?
Auto-encoder: input → NN Encoder → code → NN Decoder → output, minimizing the reconstruction error. VAE: the NN Encoder instead outputs two vectors, means (m1, m2, m3) and (σ1, σ2, σ3); a noise vector (e1, e2, e3) is drawn from a normal distribution, and the code is ci = exp(σi) · ei + mi, which the NN Decoder turns into the output. Minimize the reconstruction error plus the term Σi (exp(σi) − (1 + σi) + (mi)²).
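A minimal numpy sketch of this code sampling (the reparameterization trick); the names m, σ, and e follow the slide, everything else is an assumption. Writing the code as a deterministic function of m and σ plus external noise lets gradients flow through both:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_code(m, sigma):
    """Draw a VAE code: c_i = exp(sigma_i) * e_i + m_i, with e ~ N(0, I)."""
    e = rng.standard_normal(m.shape)  # noise from a standard normal distribution
    return np.exp(sigma) * e + m      # differentiable in both m and sigma
```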
CIFAR-10: https://github.com/openai/iaf. Source of image: https://arxiv.org/pdf/1606.04934v1.pdf
Pokémon Creation: train a VAE (input → NN Encoder → m, σ → code ci = exp(σi) · ei + mi → NN Decoder → output) with a 10-dim code. Then pick two of the ten dimensions, fix the other eight, vary the chosen two, and feed the codes to the NN Decoder to see what it draws.
Writing Poetry: a sequence-to-sequence VAE, where the NN Encoder maps a sentence to a code and the NN Decoder maps the code back to a sentence. Moving through the code space between two sentences: "i went to the store to buy some groceries." → "i were to buy any groceries." → … → ""come with me," she said." → ""talk to me," she said." → ""don't worry about it," she said." Ref: http://www.wired.co.uk/article/google-artificial-intelligence-poetry; Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, Samy Bengio, Generating Sentences from a Continuous Space, arXiv preprint, 2015.
Why VAE? Intuitive reason: in a plain auto-encoder, a point between two codes may decode to something meaningless; in a VAE, noise is added to the code during encoding, so each code must decode well even after perturbation, and points between codes produce sensible outputs.
Why VAE? Intuitive reason: the code is perturbed with noise before decoding, and the variance of the noise is automatically learned. What will happen if we only minimize the reconstruction error? The network would learn zero variance, i.e. no noise at all, defeating the purpose. Hence a regularization term is added, which keeps exp(σi) away from zero and acts like L2 regularization on mi.
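Written out, the term the slides add to the reconstruction error is the following; up to constants it is the closed-form KL divergence between the encoder's Gaussian and a standard normal prior (cf. Appendix B of the VAE paper), which is an interpretation we supply here:

```latex
% Per dimension this is minimized at sigma_i = 0 and m_i = 0, i.e. unit noise
% variance and zero mean, so the learned noise cannot collapse to zero.
\sum_i \left( \exp(\sigma_i) - (1 + \sigma_i) + (m_i)^2 \right)
```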
Why VAE? Back to what we want to do: estimate the probability distribution P(x), where each Pokémon is a point x in a high-dimensional space.
How to sample? Gaussian Mixture Model: draw a component m (an integer) from a multinomial distribution P(m), then draw x from that component's Gaussian N(μ^m, Σ^m), so P(x) = Σm P(m) P(x|m). Each x you generate comes from one component of the mixture; a distributed representation is better than a cluster.
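A minimal numpy sketch of sampling from such a mixture; all parameter values here are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = np.array([0.3, 0.5, 0.2])                    # P(m), sums to 1
means = [np.zeros(2), 3 * np.ones(2), -2 * np.ones(2)]  # mu^m
covs = [np.eye(2), 0.5 * np.eye(2), 2 * np.eye(2)]      # Sigma^m

def sample_gmm():
    m = rng.choice(len(weights), p=weights)             # m ~ P(m), an integer
    return rng.multivariate_normal(means[m], covs[m])   # x ~ N(mu^m, Sigma^m)
```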
VAE: z is a vector drawn from a normal distribution, and each dimension of z represents an attribute. An NN maps z to the mean μ(z) and variance σ(z) of a Gaussian over x, giving P(x) = ∫z P(z) P(x|z) dz: in effect an infinite mixture of Gaussians.
Maximizing Likelihood: P(z) is a normal distribution, and the NN Decoder represents P(x|z) = N(x; μ(z), σ(z)). We tune the decoder parameters to maximize the likelihood L = Σx log P(x) of the observed x, where P(x) = ∫z P(z) P(x|z) dz. We need another distribution q(z|x), represented by an NN' Encoder mapping x to the mean μ'(x) and variance σ'(x) of q.
Maximizing Likelihood: q(z|x) can be any distribution; since ∫z q(z|x) dz = 1, log P(x) can be rewritten as an integral over q(z|x) and split into a lower bound Lb plus a KL term, as spelled out in the derivation below.
Maximizing Likelihood: because KL(q(z|x) ‖ P(z|x)) ≥ 0, we have log P(x) ≥ Lb; maximizing Lb with respect to both P(x|z) (the NN Decoder) and q(z|x) (the NN' Encoder) raises the likelihood and simultaneously makes q(z|x) close to P(z|x).
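The lower-bound derivation behind these slides, reconstructed in the standard form (Kingma & Welling, 2013), with notation following the slides:

```latex
\begin{aligned}
\log P(x) &= \int_z q(z|x)\,\log P(x)\,dz
  && \text{(any $q$, since } \textstyle\int_z q(z|x)\,dz = 1)\\
&= \int_z q(z|x)\,\log\frac{P(z,x)}{q(z|x)}\,dz
 + \underbrace{\int_z q(z|x)\,\log\frac{q(z|x)}{P(z|x)}\,dz}_{\mathrm{KL}(q(z|x)\,\|\,P(z|x))\;\ge\;0}\\
&\ge L_b = \int_z q(z|x)\,\log\frac{P(x|z)\,P(z)}{q(z|x)}\,dz
 = -\mathrm{KL}(q(z|x)\,\|\,P(z)) + \mathbb{E}_{q(z|x)}\!\left[\log P(x|z)\right]
\end{aligned}
```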
Connection with Network: minimizing KL(q(z|x) ‖ P(z)) gives exactly the regularization term Σi (exp(σi) − (1 + σi) + (mi)²) on the NN' Encoder outputs (refer to Appendix B of the original VAE paper). Maximizing Eq(z|x)[log P(x|z)] means the NN' Encoder produces a code close to x and the NN Decoder reconstructs x from it: this is the auto-encoder.
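A hedged tf.keras-style sketch of these two terms as a training loss (the standard Gaussian-encoder VAE with a 1/2 factor the slide omits; `m`, `log_var`, `x`, and `x_hat` are assumed encoder outputs, input, and reconstruction):

```python
import tensorflow as tf

def vae_loss(x, x_hat, m, log_var):
    # Maximize E_{q(z|x)}[log P(x|z)]  ->  minimize the reconstruction error.
    recon = tf.reduce_sum(tf.square(x - x_hat), axis=-1)
    # Minimize KL(q(z|x) || P(z)) for a Gaussian q and a standard-normal P(z).
    kl = 0.5 * tf.reduce_sum(
        tf.exp(log_var) + tf.square(m) - 1.0 - log_var, axis=-1)
    return tf.reduce_mean(recon + kl)
```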
Conditional VAE: https://arxiv.org/pdf/1406.5298v2.pdf
To learn more … • Carl Doersch, Tutorial on Variational Autoencoders • Diederik P. Kingma, Danilo J. Rezende, Shakir Mohamed, Max Welling, "Semi-supervised Learning with Deep Generative Models," NIPS, 2014 • Kihyuk Sohn, Honglak Lee, Xinchen Yan, "Learning Structured Output Representation using Deep Conditional Generative Models," NIPS, 2015 • Xinchen Yan, Jimei Yang, Kihyuk Sohn, Honglak Lee, "Attribute2Image: Conditional Image Generation from Visual Attributes," ECCV, 2016 • Cool demos: http://vdumoulin.github.io/morphing_faces/ and http://fvae.ail.tokyo/
Problems of VAE • It does not really try to simulate real images: the NN Decoder output is only trained to be as close as possible to the target, so a one-pixel difference from the target costs the same whether the result looks realistic or fake. • VAE may just memorize the existing images instead of generating new ones.
Generative Models • PixelRNN • Variational Autoencoder (VAE) • Generative Adversarial Network (GAN) (Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio, Generative Adversarial Networks, arXiv preprint, 2014)
Yann LeCun's comment: https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-unsupervised-learning
Yann LeCun's comment (continued): https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning
The evolution of mimicry (http://peellden.pixnet.net/blog/post/40406899-2013%E7%AC%AC%E5%9B%9B%E5%AD%A3%EF%BC%8C%E5%86%AC%E8%9D%B6%E5%AF%82%E5%AF%A5): the predator learns "the butterfly is not brown," so butterflies become brown; it learns "the butterfly has no leaf veins," so butterflies develop leaf veins; … the predator's discrimination drives the prey's evolution.
The evolution of generation: NN Generator v1 → v2 → v3, each paired against Discriminator v1 → v2 → v3, which is trained to tell the generator's output from real images.
GAN - Discriminator: the NN Generator v1 (like the decoder in a VAE) turns vectors sampled from a distribution into images; Discriminator v1 takes an image and outputs 1/0 (real or fake), trained with real images labeled 1 and generated images labeled 0.
GAN - Generator: randomly sample a vector and feed it to the NN Generator v1; "tuning" the parameters of the generator so that its output is classified as "real" by Discriminator v1 (as close to 1.0 as possible, e.g. 0.87). Generator + discriminator together form one network; fix the discriminator and use gradient descent to find the parameters of the generator.
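A hedged sketch of one training step as just described, assuming tf.keras and the classic stacked-model idiom: `gan` is assumed to be discriminator(generator(z)), compiled with the discriminator's weights frozen, so step 2 tunes only the generator:

```python
import numpy as np

def train_step(generator, discriminator, gan, real_images, z_dim, batch):
    # 1) Train the discriminator: real images -> 1, generated images -> 0.
    z = np.random.normal(size=(batch, z_dim))
    fake_images = generator.predict(z, verbose=0)
    d_loss_real = discriminator.train_on_batch(real_images, np.ones(batch))
    d_loss_fake = discriminator.train_on_batch(fake_images, np.zeros(batch))

    # 2) Train the generator through gan(z) = D(G(z)) with D fixed:
    #    gradient descent pushes D(G(z)) toward 1 ("real").
    z = np.random.normal(size=(batch, z_dim))
    g_loss = gan.train_on_batch(z, np.ones(batch))
    return d_loss_real, d_loss_fake, g_loss
```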
GAN – Toy Example: z → NN Generator → x; the real data are the black points, the generated distribution is the green one, and the discriminator's 1/0 output is the blue curve. Demo: http://cs.stanford.edu/people/karpathy/gan/
CIFAR-10 • Which one is machine-generated? Ref: https://openai.com/blog/generative-models/
Moving in the code space: Alec Radford, Luke Metz, Soumith Chintala, Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR, 2016
Drawing manga • Ref: https://github.com/mattya/chainer-DCGAN
Drawing manga • Web demo: http://mattya.github.io/chainer-DCGAN/ • Ref: http://qiita.com/mattya/items/e5bfe5e04b9d2f0bbd47
In practice … • GANs are difficult to optimize. • There is no explicit signal of how good the generator is: in standard NNs we monitor the loss, but in GANs we have to keep the generator and discriminator well-matched in a constant contest. • When the discriminator fails, that does not guarantee the generator produces realistic images; the discriminator may just be weak. • Sometimes the generator finds one specific example that fools the discriminator instead of modeling the data. • Making the discriminator more robust may help.
To learn more … • "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks" • "Improved Techniques for Training GANs" • "Autoencoding beyond pixels using a learned similarity metric" • "Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks" • "Super Resolution using GANs" • "Generative Adversarial Text to Image Synthesis"
To learn more … • Basic tutorials: • http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/ • https://bamos.github.io/2016/08/09/deep-completion/ • http://blog.evjang.com/2016/06/generative-adversarial-nets-in.html
Acknowledgement • Thanks to Ryan Sun for writing in to point out typos in the slides.