Conditional GANs: Experimental Review
Sanket Lokegaonkar

Papers Explored/Experimented
- pix2pix: Image-to-Image Translation with Conditional Adversarial Networks. P. Isola, J.-Y. Zhu, T. Zhou, A. A. Efros. arXiv 2016.
- Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space. A. Nguyen, J. Yosinski, Y. Bengio, A. Dosovitskiy, and J. Clune. arXiv 2016.
- Generative Adversarial Text to Image Synthesis. S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, H. Lee. ICML 2016.
- Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv

Differences/Similarities
- GAN training: a min-max game; convergence is argued in terms of reaching a Nash equilibrium
- Pipeline:
  - Generator: receives the input z and generates an image
  - Discriminator: a binary classifier that outputs a real or fake label
- Objective: produce natural images
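The min-max game on this slide is usually written as the value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator maximizes and the generator minimizes. The snippet below is a minimal sketch of that objective in plain Python, not code from any of the reviewed papers; the function names are hypothetical, and the inputs are assumed to be discriminator probabilities that have already been computed.

```python
import math

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator probabilities D(x) on real images
    d_fake: discriminator probabilities D(G(z)) on generated images
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def discriminator_loss(d_real, d_fake):
    # The discriminator maximizes V, so it minimizes -V.
    return -gan_value(d_real, d_fake)

def generator_loss(d_fake):
    # Non-saturating variant often used in practice (an assumption here):
    # minimize -E[log D(G(z))] instead of E[log(1 - D(G(z)))].
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# When D cannot tell real from fake it outputs 0.5 everywhere,
# and the value function equals -log(4).
equilibrium_value = gan_value([0.5, 0.5], [0.5, 0.5])
```

A discriminator that separates the two sets well (for example D(x) = 0.9 and D(G(z)) = 0.1) gets a lower loss than one stuck at 0.5, while the generator's loss falls as D(G(z)) rises.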

Evaluation Metrics
- Qualitative examination (generally)
- Amazon Mechanical Turk
- Classifier scores
- Ablation studies

Experimental Setup
- GPU: Tesla K40 (industry standard); GTX 1060 (local machine)
- Training time:
  - pix2pix on CMP Facade (~200 images for 200 epochs): around 3 hours
  - pix2pix on sat2map (1000 images for 200 epochs): around 20 hours
  - DCGANs (local machine)
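As a rough sanity check on the training times quoted here, the average throughput can be worked out from the image counts, epochs, and hours. This is only a back-of-the-envelope sketch: it assumes every image is processed exactly once per epoch and that the quoted hours cover training only; the function name is hypothetical.

```python
def images_per_second(num_images, epochs, hours):
    """Average training throughput, assuming one pass over every image per epoch."""
    return (num_images * epochs) / (hours * 3600.0)

# Facade run: 200 images x 200 epochs = 40,000 image passes in about 3 hours.
facade_rate = images_per_second(200, 200, 3)      # roughly 3.7 images/s

# sat2map run: 1000 images x 200 epochs = 200,000 image passes in about 20 hours.
sat2map_rate = images_per_second(1000, 200, 20)   # roughly 2.8 images/s
```

Both runs land in the same range of a few images per second, so the longer sat2map time is mostly explained by the larger dataset.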

Contemporary Work
- Image-to-image translation
- Semantic segmentation
- edges2cats
- Image super-resolution
- Text-to-image synthesis
- Synthesizing photo-realistic images
- Sound-to-image synthesis? (Credits: Dr. Huang)

DCGANs Results (In Training):

DCGANs Results (Generative Space Visualization):

pix2pix (Image-to-Image Translation with Conditional Adversarial Networks)
- Idea: image-to-image translation using conditional adversarial networks
- Adds an auxiliary loss (L1 or L2) that keeps the generator output close to the ground truth
- Generator architecture: U-Net (encoder-decoder with skip connections between mirrored layers in the encoder and decoder stacks)
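The auxiliary-loss idea on this slide can be sketched as a generator objective that adds a weighted reconstruction term to the adversarial term. The code below is an illustrative sketch in plain Python rather than the pix2pix implementation: images are flattened to lists of pixel values, the function and parameter names are hypothetical, the adversarial term uses the non-saturating form as an assumption, and the default weight of 100 follows the value reported in the pix2pix paper.

```python
import math

def l1_distance(output, target):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(o - t) for o, t in zip(output, target)) / len(output)

def l2_distance(output, target):
    """Mean squared difference between two equally sized pixel lists."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)

def generator_objective(d_fake, output, target, lambda_aux=100.0, aux=l1_distance):
    """Adversarial term plus a weighted auxiliary reconstruction term.

    d_fake: discriminator probability that the generated image is real
    output: generated pixels; target: ground-truth pixels
    """
    adversarial = -math.log(d_fake)              # rewards fooling the discriminator
    reconstruction = aux(output, target)         # rewards staying near the ground truth
    return adversarial + lambda_aux * reconstruction

generated = [0.2, 0.8]
ground_truth = [0.0, 1.0]
loss_l1 = generator_objective(0.5, generated, ground_truth)
loss_l2 = generator_objective(0.5, generated, ground_truth, aux=l2_distance)
```

Passing `aux=l2_distance` swaps in the L2 variant mentioned on the slide; with the same weight it penalizes small per-pixel errors less than L1 does.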

pix2pix results after training on the CMP Facade Database (panels: input, output, ground truth)

pix2pix results in the sat2map setting (panels: input, output, ground truth)

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
Contributions:
- Finds realistic images by gradient ascent in the latent space, maximizing the activations of a separate classifier network
- Scales to much larger resolutions
- Conditioned on ImageNet classes
- Uses a Markov chain Monte Carlo (MCMC) sampler
- Energy-based model formulation
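The latent-space gradient ascent described here can be illustrated with a toy sampler: each step nudges the latent code toward higher prior probability, toward higher classifier confidence for the target class, and adds a little noise. The sketch below is a heavily simplified stand-in, not the method from the paper: the generator is omitted, the "classifier" is a hypothetical logistic unit acting directly on the latent code, the prior term uses a standard normal in place of a learned model, and all names and step sizes are assumptions.

```python
import math
import random

def class_log_prob_grad(h, w, b):
    """Return (gradient of log p(y|h) w.r.t. h, p(y|h)) for a toy logistic classifier."""
    s = sum(wi * hi for wi, hi in zip(w, h)) + b
    p = 1.0 / (1.0 + math.exp(-s))
    # d/dh log sigmoid(s) = (1 - sigmoid(s)) * w
    return [(1.0 - p) * wi for wi in w], p

def sample_latent(w, b, steps=200, eps_prior=0.01, eps_cond=0.5, eps_noise=0.01, seed=0):
    """Noisy gradient ascent on log p(h) + log p(y|h) in latent space."""
    rng = random.Random(seed)
    h = [0.0] * len(w)
    for _ in range(steps):
        grad_cond, _ = class_log_prob_grad(h, w, b)
        h = [
            hi
            + eps_prior * (-hi)            # gradient of log N(h; 0, I), the stand-in prior
            + eps_cond * g                 # pushes toward the target class
            + rng.gauss(0.0, eps_noise)    # noise that makes this a sampler, not pure ascent
            for hi, g in zip(h, grad_cond)
        ]
    _, p_final = class_log_prob_grad(h, w, b)
    return h, p_final

latent, class_prob = sample_latent(w=[1.0, -2.0], b=0.0)
```

After the loop the latent code points along the classifier's weight direction and the toy class probability is close to 1; setting `eps_noise=0.0` reduces the procedure to plain gradient ascent.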

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space Results on Caption Conditioning:

Generative Adversarial Text to Image Synthesis
- Generator: a deconvolutional network that receives the text encoding as its conditioning input
- Discriminator: also takes the text embedding as input, enabling matching-aware discrimination
- Evaluated on: MS-COCO, CUB (birds dataset)
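Matching-aware discrimination means the discriminator sees three kinds of pairs: a real image with its matching text, a real image with mismatched text, and a generated image with the matching text. Only the first should be scored as real. The snippet below is a minimal sketch of such a loss in plain Python, following the form described in the Reed et al. paper as best recalled; the function and argument names are hypothetical, and the inputs are assumed to be discriminator probabilities.

```python
import math

def matching_aware_d_loss(s_real_match, s_real_mismatch, s_fake_match):
    """Discriminator loss over the three pair types.

    s_real_match:    D(real image, matching text), should approach 1
    s_real_mismatch: D(real image, mismatched text), should approach 0
    s_fake_match:    D(generated image, matching text), should approach 0
    """
    real_term = math.log(s_real_match)
    # The two "should be rejected" cases share the fake side of the loss equally.
    fake_term = 0.5 * (math.log(1.0 - s_real_mismatch) + math.log(1.0 - s_fake_match))
    return -(real_term + fake_term)

def generator_loss(s_fake_match):
    # The generator wants its image, paired with the matching text, scored as real.
    return -math.log(s_fake_match)

good_d = matching_aware_d_loss(0.9, 0.1, 0.1)
confused_d = matching_aware_d_loss(0.5, 0.5, 0.5)
```

The mismatched-text term is what forces the discriminator to check that the image agrees with the caption instead of only judging whether the image looks realistic.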

Generative Adversarial Text to Image Synthesis: results after training on the Caltech-UCSD Birds 200 dataset
- Caption: "this vibrant red bird has a pointed black beak"
- Caption: "this bird is yellowish orange with black wings"

Generative Adversarial Text to Image Synthesis: results after training on the Caltech-UCSD Birds 200 dataset
- Caption: "the bright blue bird has a white colored belly"
- Caption: "blue bird with red eyes and black beak"

Generative Adversarial Text to Image Synthesis: results after training on the VGG 102 Flowers dataset
- Caption: "this vibrant red bird has a pointed black beak"
- Caption: "this bird is yellowish orange with black wings"

Generative Adversarial Text to Image Synthesis: results after training on the VGG 102 Flowers dataset
- Caption: "the bright blue bird has a white colored belly"
- Caption: "blue bird with red eyes and black beak"