Generative Adversarial Network
Hung-yi Lee (李宏毅)
Three Categories of GAN
1. Typical GAN: a random vector goes into the Generator, which outputs an image.
2. Conditional GAN: trained on paired data. A condition (e.g. the text "Girl with red hair", or attributes such as blue eyes, red hair, short hair) goes into the Generator together with a random vector, and the Generator outputs a matching image.
3. Unsupervised Conditional GAN: trained on unpaired data. The Generator maps an image from domain X to domain Y (e.g. a photo into Vincent van Gogh's style).
Generative Adversarial Network (GAN)
• Anime face generation as example: the Generator maps a high-dimensional random vector to an image, and the Discriminator maps an image to a score. A larger score means the image looks real; a smaller score means it looks fake.
Algorithm
• Initialize generator G and discriminator D.
• In each training iteration:
Step 1: Fix generator G, and update discriminator D. Sample real objects from the database (labeled 1) and objects generated from randomly sampled vectors (labeled 0). The discriminator learns to assign high scores to real objects and low scores to generated objects.
Step 2: Fix discriminator D, and update generator G. The generator learns to "fool" the discriminator: feed a vector through the generator and then through the fixed discriminator, treat the two as one large network, and update only the generator's weights by backpropagation so that the discriminator's score goes up.
In summary, each training iteration alternates two steps:
• Learning D: sample some real objects and generate some fake objects from random vectors; with G fixed, update D to score real objects 1 and fake objects 0.
• Learning G: with D fixed, update G so that D scores G's outputs as 1.
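The alternating procedure above can be sketched end-to-end on a 1-D toy problem. This is a minimal illustration, not the lecture's model: a linear "generator", a logistic-regression "discriminator", and hand-derived gradients in NumPy; all names and hyperparameters here are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def sample_real(n):
    # "real" data: a 1-D stand-in for the image distribution
    return rng.normal(3.0, 0.5, n)

# Generator G(z) = w_g * z + b_g, with z ~ N(0, 1)
w_g, b_g = 1.0, 0.0
# Discriminator D(x) = sigmoid(w_d * x + b_d); larger score = "real"
w_d, b_d = 0.0, 0.0

lr, batch = 0.1, 64
for step in range(3000):
    # Step 1: fix G, update D (real -> label 1, fake -> label 0)
    z = rng.normal(0.0, 1.0, batch)
    fake, real = w_g * z + b_g, sample_real(batch)
    g_real = sigmoid(w_d * real + b_d) - 1.0   # dBCE/dlogit, label 1
    g_fake = sigmoid(w_d * fake + b_d) - 0.0   # dBCE/dlogit, label 0
    w_d -= lr * np.mean(g_real * real + g_fake * fake)
    b_d -= lr * np.mean(g_real + g_fake)

    # Step 2: fix D, update G so that D scores G's outputs as 1
    z = rng.normal(0.0, 1.0, batch)
    fake = w_g * z + b_g
    g_x = (sigmoid(w_d * fake + b_d) - 1.0) * w_d  # backprop through fixed D
    w_g -= lr * np.mean(g_x * z)
    b_g -= lr * np.mean(g_x)

print(f"mean of generated samples is about {b_g:.2f}; real mean is 3.0")
```

With the discriminator pushing fakes toward the real distribution, the generator's offset drifts toward the real data's mean, which is the whole mechanism of the two-step loop in one small example.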
Demo: https://crypko.ai/#/
GAN is hard to train …
• There is a saying … (I found this joke on 陳柏文's Facebook.)
Recap: Three Categories of GAN. Next up: 2. Conditional GAN.
Text-to-Image
• Traditional supervised approach: train an NN that maps a text condition c (e.g. "a dog is running", "a bird is flying") to an image, with the target that the output be as close as possible to the training image. Because many different images fit the same text (e.g. "train"), the network averages over all of them, and the result is a blurry image!
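Why the blur: the MSE-optimal single output for several equally valid targets is their pixelwise average. A tiny sketch with made-up 4-pixel "images":

```python
import numpy as np

# Two equally valid "images" for the same caption (e.g. a dog running
# left vs. right), as toy 4-pixel vectors.
targets = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0]])

# The output minimizing mean squared error to both targets at once is
# their pixelwise mean: a 50/50 blend that matches neither target.
best = targets.mean(axis=0)
print(best)
```

A GAN sidesteps this because the discriminator rejects the blurry average: the blend looks like neither real image, so the generator must commit to one sharp mode instead.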
Conditional GAN [Scott Reed, et al., ICML, 2016]
The generator takes the condition c (e.g. "train") and a vector z sampled from a normal distribution, and outputs an image x = G(c, z). The original discriminator takes only x and outputs a scalar: is x a real image or not? Real images get 1, generated images get 0. The problem: the generator will learn to generate realistic images but completely ignore the input conditions.
The better discriminator takes both c and x and outputs a scalar judging two things at once: whether x is realistic, and whether c and x are matched. True text-image pairs such as (train, image of a train) get 1; (train, generated image) gets 0; mismatched real pairs such as (cat, image of a train) also get 0.
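The three kinds of training pairs for this discriminator can be written out explicitly. A schematic sketch with made-up placeholder data (strings stand in for images):

```python
# Toy stand-ins: captions are conditions, strings pretend to be images.
real_images = {"train": "real_train_photo", "cat": "real_cat_photo"}
fake_image = "G('train', z)"  # pretend output of the generator

examples = [
    # positive: real image with its matching condition -> 1
    (("train", real_images["train"]), 1),
    # negative: condition paired with a generated image -> 0
    # (penalizes unrealistic output)
    (("train", fake_image), 0),
    # negative: realistic but mismatched pair -> 0
    # (penalizes ignoring the condition)
    (("cat", real_images["train"]), 0),
]
for (c, x), label in examples:
    print(c, x, label)
```

The second negative is the key addition: without mismatched real pairs, the discriminator never punishes a generator that produces realistic images unrelated to c.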
Conditional GAN - Sound-to-image
The condition c is a sound (e.g. "a dog barking sound"), and G outputs an image. Training data can be collected from video, which naturally pairs audio with frames.
Conditional GAN - Sound-to-image
• The images are generated by Chia-Hung Wan and Shun-Po Chuang: https://wjohn1483.github.io/audio_to_scene/index.html (the generated scene changes as the input sound gets louder).
Conditional GAN - Image-to-label
A multi-label image classifier can be viewed as a conditional generator: the image is the input condition and the label set is the generated output. The classifiers can have different architectures, and all are trained as conditional GANs [Tsai, et al., submitted to ICASSP 2019]. Conditional GAN outperforms other models designed for multi-label classification (F1 scores; the Att-RNN entry on NUS-WIDE is not given on the slide):

F1        | VGG-16 | +GAN | Inception | +GAN | Resnet-101 | +GAN | Resnet-152 | +GAN | Att-RNN | RLSD
MS-COCO   | 56.0   | 60.4 | 62.4      | 63.8 | 62.8       | 64.0 | 63.3       | 63.9 | 62.1    | 62.0
NUS-WIDE  | 33.9   | 41.2 | 53.5      | 55.8 | 53.1       | 55.4 | 52.1       | 54.7 | n/a     | 46.9
Talking Head: https://arxiv.org/abs/1905.08233
Recap: Three Categories of GAN. Next up: 3. Unsupervised Conditional GAN.
Cycle GAN
The generator maps an image from domain X to domain Y. The discriminator outputs a scalar: does the input image belong to domain Y or not? Training pushes the generator's outputs to become similar to domain Y. But is that enough?
The problem: the generator can ignore its input entirely and produce any image that belongs to domain Y. The output is similar to domain Y, but it is not what we want.
Cycle GAN [Jun-Yan Zhu, et al., ICCV, 2017]
Cycle consistency: a second generator maps the output back from domain Y to domain X, and the reconstruction must be as close as possible to the original input. If the first generator throws away its input, there is a lack of information for reconstruction; so the generator is forced to preserve the input's content, while the discriminator still checks that the output belongs to domain Y.
The full Cycle GAN trains both directions at once: X → Y → X with a discriminator scoring "belongs to domain Y or not", and Y → X → Y with a discriminator scoring "belongs to domain X or not", each reconstruction as close as possible to its original.
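The two reconstruction terms can be sketched as a single cycle-consistency loss. A toy NumPy sketch with invented 1-D "generators" (the real model uses image-to-image networks):

```python
import numpy as np

def cycle_loss(x, y, G, F):
    # || F(G(x)) - x ||_1 + || G(F(y)) - y ||_1 :
    # both round-trip reconstructions must stay close to the originals
    return np.abs(F(G(x)) - x).mean() + np.abs(G(F(y)) - y).mean()

# Toy "generators": G maps domain X -> Y, F maps Y -> X (hypothetical)
G = lambda x: 2.0 * x + 1.0
F = lambda y: (y - 1.0) / 2.0        # exactly inverts G
F_lossy = lambda y: np.zeros_like(y) # ignores its input entirely

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])
print(cycle_loss(x, y, G, F))        # 0.0: content is preserved both ways
print(cycle_loss(x, y, G, F_lossy))  # 3.0: input-ignoring mapping is punished
```

The second case is exactly the failure mode from the previous slide: a mapping that discards its input cannot reconstruct it, so the cycle loss is large.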
The same idea works on text, e.g. negative-to-positive sentence rewriting: one generator maps a negative sentence (e.g. "It is bad.") to a positive one ("It is good."), a discriminator checks "positive sentence or not", and a second generator maps the result back to a negative sentence that must be as close as possible to the original (e.g. "I hate you." → "I love you." → "I hate you.").
Discrete Issue
The generator here is a seq2seq model whose output is discrete: words are chosen by an argmax/sampling step at the output layer. When we treat generator plus discriminator (e.g. "positive sentence or not") as one large network, fix the discriminator, and try to update the generator by backpropagation, the gradient cannot flow through that discrete output step.
Three Categories of Solutions
• Gumbel-softmax [Matt J. Kusner, et al., arXiv, 2016]
• Continuous input for the discriminator [Sai Rajeswar, et al., arXiv, 2017][Ofir Press, et al., ICML workshop, 2017][Zhen Xu, et al., EMNLP, 2017][Alex Lamb, et al., NIPS, 2016][Yizhe Zhang, et al., ICML, 2017]
• "Reinforcement Learning" [Yu, et al., AAAI, 2017][Li, et al., EMNLP, 2017][Tong Che, et al., arXiv, 2017][Jiaxian Guo, et al., AAAI, 2018][Kevin Lin, et al., NIPS, 2017][William Fedus, et al., ICLR, 2018]
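The Gumbel-softmax trick replaces the non-differentiable argmax/sampling step with a temperature-controlled softmax over logits plus Gumbel noise, so gradients can flow back into the generator. A minimal NumPy sketch of the forward sampling only (function and parameter names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, temperature):
    # Add Gumbel(0, 1) noise, then apply a temperature-scaled softmax.
    # As temperature -> 0 the sample approaches a one-hot vector, yet it
    # stays differentiable w.r.t. the logits for any temperature > 0.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + g) / temperature
    y = np.exp(y - y.max())        # stable softmax
    return y / y.sum()

logits = np.array([1.0, 2.0, 0.5])   # e.g. scores over a 3-word vocabulary
soft = gumbel_softmax(logits, temperature=5.0)   # smooth, spread-out sample
hard = gumbel_softmax(logits, temperature=0.01)  # nearly one-hot sample
print(soft, hard)
```

In training one typically anneals the temperature: start soft so gradients are informative, then sharpen toward discrete word choices.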
Sentence rewriting (thanks to 王耀賢 for providing the experimental results)
Negative sentence to positive sentence:
it's a crappy day -> it's a great day
i wish you could be here -> you could be here
it's not a good idea -> it's good idea
i miss you -> i love you
i don't love you -> i love you
i can't do that -> i can do that
i feel so sad -> i happy
it's a bad day -> it's a good day
it's a dummy day -> it's a great day
sorry for doing such a horrible thing -> thanks for doing a great thing
my doggy is sick -> my doggy is my doggy
my little doggy is sick -> my little doggy is my little doggy
Speech Recognition
• Supervised learning: a human teacher annotates utterances (e.g. "This utterance is 'good morning'"), and the machine can do speech recognition after teaching. Supervised learning needs lots of annotated speech, but most languages are low-resource.
• Unsupervised learning: the machine learns speech recognition automatically, by listening to humans talking and reading text on the Internet, without any paired annotation.
Acoustic Token Discovery
Acoustic tokens can be discovered from an audio collection without text annotation. Acoustic tokens are chunks of acoustically similar audio segments with token IDs. [Zhang & Glass, ASRU 09][Huijbregts, ICASSP 11][Chan & Lee, Interspeech 11]
Acoustic Token Discovery [Wang, et al., ICASSP, 2018]
Phonetic-level acoustic tokens are obtained by a segmental sequence-to-sequence autoencoder.
Unsupervised Speech Recognition [Liu, et al., INTERSPEECH, 2018][Chen, et al., arXiv, 2018]
Phone-level acoustic pattern discovery turns each utterance into a token sequence (e.g. p1 p2 p3 p4), and a Cycle GAN learns the mapping between these token sequences and phoneme sequences collected from text (e.g. p1 = "AY"), without any paired data.
Model
Experimental Results
Accuracy: the progress of supervised learning vs. unsupervised learning on phone recognition. Unsupervised learning today (2019) is as good as supervised learning 30 years ago. The image is modified from: Lopes, C. and Perdigão, F., 2011. Phone recognition on the TIMIT database. Speech Technologies, Vol. 1, pp. 285-302.
Recap: the three categories of GAN are the typical GAN (random vector to image), the conditional GAN (paired data, e.g. text to image), and the unsupervised conditional GAN (unpaired data, e.g. photo to Vincent van Gogh's style).
To Learn More …
You can learn more from the YouTube channel, including:
- Singing
- Quantum generative adversarial learning
- Spectral normalization
- Conditional generator
- Unsupervised image-to-image translation