Wavelet Transform for Super-resolution and Representation Learning TFW

Wavelet Transform for Super-resolution and Representation Learning
TFW Oral Presentation
R06942074 Yu-Jhe Li

Outline
1. Super-resolution (SR)
2. Application of wavelet transform on SR
3. Learning resolution-invariant representation

Super-resolution
• Super-resolution imaging (SR) is a class of techniques that enhance the resolution of an imaging system.
Reference: Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks. In CVPR, 2017.

Existing methods for SR (e.g., SRCNN)
Reference: C. Dong, C. C. Loy, K. He, and X. Tang, "Learning a deep convolutional network for image super-resolution," in Computer Vision, ECCV, pp. 184–199, Springer, 2014.

Outline
1. Super-resolution (SR)
2. Application of wavelet transform on SR
   • Paper survey: Deep Wavelet Prediction for Image Super-resolution (CVPR 2017 Workshops)
3. Learning resolution-invariant representation

2D Discrete Wavelet Transform (2dDWT)
• To perform a 1D discrete wavelet transform, a signal x[n] ∈ R^N is first passed through a half-band high-pass filter G_H[n] and a low-pass filter G_L[n], which the slide defines for the Haar ("db1") wavelet.
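The filtering step can be sketched in a few lines of NumPy. This is a minimal illustration, not the slide's own definition: it assumes the orthonormal Haar scaling (division by √2) and the sign convention "even sample minus odd sample" for the high-pass branch, and the function names `haar_dwt_1d` and `haar_idwt_1d` are hypothetical.

```python
import numpy as np

def haar_dwt_1d(x):
    """One level of a 1D Haar DWT (orthonormal scaling is an assumption)."""
    x = np.asarray(x, dtype=float)
    if x.size % 2 != 0:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)   # low-pass output, every second sample kept
    high = (even - odd) / np.sqrt(2.0)  # high-pass output, every second sample kept
    return low, high

def haar_idwt_1d(low, high):
    """Invert haar_dwt_1d and interleave the recovered samples."""
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    x = np.empty(2 * low.size)
    x[0::2] = (low + high) / np.sqrt(2.0)
    x[1::2] = (low - high) / np.sqrt(2.0)
    return x

low, high = haar_dwt_1d([4.0, 2.0, 5.0, 5.0])
restored = haar_idwt_1d(low, high)
```

With this scaling the transform is orthogonal, so the inverse recovers the input exactly and the signal energy is split between the two half-length outputs.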

1-level 2dDWT decomposition
• The 2D signal x[n, m] can be treated as 1D signals along the rows x[n, :] at a given n-th row and along the columns x[:, m] at a given m-th column.
• After filtering, half of the samples can be eliminated according to the Nyquist rule, since the signal now has a frequency bandwidth of π/2 radians instead of π.
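A one-level 2D decomposition following this row-then-column view might look like the sketch below. It assumes the orthonormal Haar filters; the helper names `haar_step` and `haar_dwt_2d` are hypothetical, and the LH/HL naming of the mixed sub-bands varies between libraries.

```python
import numpy as np

def haar_step(a, axis):
    """Haar-filter along one axis and keep every second sample."""
    a = np.moveaxis(a, axis, 0)
    low = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    high = (a[0::2] - a[1::2]) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def haar_dwt_2d(image):
    """One-level 2D Haar DWT: filter the rows first, then the columns."""
    image = np.asarray(image, dtype=float)
    if image.shape[0] % 2 or image.shape[1] % 2:
        raise ValueError("both image dimensions must be even")
    low, high = haar_step(image, axis=1)  # 1D transform of every row x[n, :]
    ll, lh = haar_step(low, axis=0)       # 1D transform of every column of the low band
    hl, hh = haar_step(high, axis=0)      # 1D transform of every column of the high band
    return ll, lh, hl, hh

img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt_2d(img)
```

Each of the four sub-bands has half the height and half the width of the input, so the total number of coefficients equals the number of input pixels.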

Deep Wavelet Prediction for Super-resolution (DWSR)
• The 2dDWT and 2dIDWT. A, B, C, D are four example pixels located in a 2×2 grid at the top-left corner of the HR image. a, b, c, d are the corresponding four pixels from the top-left corners of the four sub-bands.
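The relation between the four pixels and the four sub-band coefficients can be written out as a small worked example. The layout (A top-left, B top-right, C bottom-left, D bottom-right) and the unnormalized Haar scaling used here are assumptions, since the figure itself is not part of this transcript; `dwt_2x2` and `idwt_2x2` are hypothetical names.

```python
def dwt_2x2(A, B, C, D):
    """Unnormalized Haar 2dDWT of one 2x2 pixel block (scaling is an assumption)."""
    a = A + B + C + D  # low-pass in both directions
    b = A - B + C - D  # difference between left and right columns
    c = A + B - C - D  # difference between top and bottom rows
    d = A - B - C + D  # diagonal difference
    return a, b, c, d

def idwt_2x2(a, b, c, d):
    """Inverse of dwt_2x2: recover the four pixels from the four coefficients."""
    A = (a + b + c + d) / 4
    B = (a - b + c - d) / 4
    C = (a + b - c - d) / 4
    D = (a - b - c + d) / 4
    return A, B, C, D

coeffs = dwt_2x2(1, 2, 3, 4)
pixels = idwt_2x2(*coeffs)
```

Both directions are fixed linear maps with no loss of information, which is why a network can predict the sub-band coefficients and still recover the HR pixels with the inverse transform.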

Wavelet prediction for SR network structure

Comparison

Outline
1. Super-resolution (SR)
2. Application of wavelet transform on SR
3. Learning resolution-invariant deep representation

Learning Resolution-Invariant Deep Representations for Person Re-Identification
AAAI 2019 (acceptance rate 16.2%)

Person Re-Identification (re-ID)
• Aims at matching images of the same person across different, non-overlapping cameras.
Figure label: Same person

Challenges in Person Re-ID
• Occlusion
• Background clutter
• Pose variations

Resolution Variations
• In the real world, images captured by surveillance cameras are often of low resolution (LR), since image resolution may vary drastically with the distance between the camera and the person of interest.
• Short distance: high resolution
• Long distance: low resolution

Resolution mismatch
• In addition to recognizing images across different camera views, one also needs to match cross-resolution images.

Existing Strategies
Reference:
• Li et al. 2015. Multi-scale learning for low-resolution person re-identification. In ICCV.
• Jing et al. 2015. Super-resolution person re-identification with semi-coupled low-rank discriminant dictionary learning. In CVPR.
• Wang et al. 2016. Scale-adaptive low-resolution person re-identification via learning a discriminating surface. In IJCAI.
• Jiao et al. 2018. Deep low-resolution person re-identification. In AAAI.
• Wang et al. 2018. Cascaded SRGAN for scale-adaptive low resolution person re-identification. In IJCAI.

SR for Cross-Resolution Person Re-ID
Figure labels: Image Recovery; Query; SR 2×; SR 3×; SR 4×; Gallery

Our Motivation Same Identity

Our Goal: Resolution-Invariant Representation
Figure labels: Query; Gallery; Encoder; Resolution-Invariant Representation

Resolution Adaptation and re-Identification Network (RAIN)

Training Set
Figure labels: HR image set; Down-sample; LR image set
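Building the LR set from the HR set can be sketched as follows. The down-sampling operator is not stated on the slide, so block averaging and a nearest-neighbour resize back to the original size are assumptions made for illustration; `make_lr` is a hypothetical helper.

```python
import numpy as np

def make_lr(hr, rate):
    """Down-sample an H x W x C image by an integer rate, then resize back to H x W."""
    h, w, c = hr.shape
    if h % rate or w % rate:
        raise ValueError("height and width must be divisible by the rate")
    # Average every rate x rate block (one simple down-sampling choice).
    small = hr.reshape(h // rate, rate, w // rate, rate, c).mean(axis=(1, 3))
    # Nearest-neighbour resize so HR and LR images share one input size.
    return small.repeat(rate, axis=0).repeat(rate, axis=1)

hr_image = np.random.default_rng(0).random((256, 128, 3))
lr_image = make_lr(hr_image, rate=4)
```

The deck's retrieval examples use down-sampling rates of 2, 4, and 8, so the same helper could be called once per rate to cover those query resolutions.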

Resolution Adversarial Learning
Figure labels: Down-sample; Cross-Resolution Feature Extractor (F); Discriminator (D); Resolution Adversarial Learning
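One common way to set up such an adversarial game is the standard GAN objective with binary cross-entropy, sketched below. The slide does not give the loss, so this form is an assumption rather than the authors' formulation; `d_hr` and `d_lr` stand for the discriminator's predicted probability that a feature came from an HR image, and all function names are hypothetical.

```python
import numpy as np

def bce(prob, target):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    prob = np.clip(np.asarray(prob, dtype=float), 1e-7, 1 - 1e-7)
    target = np.asarray(target, dtype=float)
    return float(-(target * np.log(prob) + (1 - target) * np.log(1 - prob)).mean())

def discriminator_loss(d_hr, d_lr):
    """D learns to output 1 for HR features and 0 for LR features."""
    d_hr, d_lr = np.asarray(d_hr, dtype=float), np.asarray(d_lr, dtype=float)
    return bce(d_hr, np.ones_like(d_hr)) + bce(d_lr, np.zeros_like(d_lr))

def extractor_adv_loss(d_lr):
    """F learns to make LR features look like HR features to D."""
    d_lr = np.asarray(d_lr, dtype=float)
    return bce(d_lr, np.ones_like(d_lr))

confused = discriminator_loss([0.5], [0.5])
confident = discriminator_loss([0.9], [0.1])
```

When the discriminator can no longer tell the two resolutions apart, its outputs approach 0.5 for every input, which is the state in which the extracted features carry little resolution information.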

Preserve Image Representation
Figure labels: Down-sample; High-Resolution Decoder (G)

Person Identity Classification
Figure labels: Down-sample; Global Average Pooling; Re-ID Classifier (C); Identity Classification; Intra-class/Inter-class Discrepancy
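The intra-class/inter-class discrepancy is typically enforced with a triplet loss, and the implementation details later in the deck mention a triplet margin m = 2. The sketch below assumes Euclidean distance between pooled feature vectors, which the slides do not specify; `triplet_loss` is a hypothetical helper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=2.0):
    """Hinge on (same-identity distance) - (different-identity distance) + margin."""
    anchor = np.asarray(anchor, dtype=float)
    d_pos = np.linalg.norm(anchor - np.asarray(positive, dtype=float), axis=1)
    d_neg = np.linalg.norm(anchor - np.asarray(negative, dtype=float), axis=1)
    return float(np.maximum(0.0, d_pos - d_neg + margin).mean())

# One anchor feature, one feature of the same identity, one of another identity.
satisfied = triplet_loss([[0.0, 0.0]], [[1.0, 0.0]], [[5.0, 0.0]])
violated = triplet_loss([[0.0, 0.0]], [[1.0, 0.0]], [[2.0, 0.0]])
```

The loss is zero once the different-identity feature is at least the margin farther from the anchor than the same-identity feature, and positive otherwise.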

The Proposed RAIN (whole model)
Figure labels: Down-sample; Cross-Resolution Feature Extractor (F); Resolution Adversarial Learning; Discriminator (D); High-Resolution Decoder (G); Global Average Pooling; Re-ID Classifier (C); Identity Classification; Intra-class/Inter-class Discrepancy

Evaluation

Implementation Details
• Architecture
  • Cross-resolution feature extractor: ImageNet-pretrained ResNet-50
  • Discriminator: 5 conv layers
  • High-resolution decoder: 5 blocks, each containing 1 conv + 2 deconv layers
  • Re-ID classifier: 1 FC layer
• Initialization: all randomly initialized.
• Optimizer: stochastic gradient descent (SGD) with learning rate 0.003, momentum 0.9, and weight decay 0.0005
• Batch size: 32
• Margin m for the triplet loss: 2
• All images (both HR and LR) are resized to 256 × 128 × 3
• Run time: about 8 hours on a single NVIDIA GeForce GTX 1080 GPU with 12 GB memory
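To make the optimizer settings concrete, a single SGD update with the listed hyperparameters can be sketched as below. The exact update rule depends on the framework; this sketch assumes the common convention in which weight decay is added to the gradient before the momentum buffer is updated, and `sgd_step` is a hypothetical helper.

```python
import numpy as np

LR, MOMENTUM, WEIGHT_DECAY = 0.003, 0.9, 0.0005  # values listed on the slide

def sgd_step(weights, grad, velocity):
    """One SGD update with momentum and L2 weight decay (convention is an assumption)."""
    g = grad + WEIGHT_DECAY * weights    # weight decay acts as an extra gradient term
    velocity = MOMENTUM * velocity + g   # accumulate the momentum buffer
    return weights - LR * velocity, velocity

w = np.array([1.0])
v = np.zeros_like(w)
w, v = sgd_step(w, grad=np.array([0.5]), velocity=v)
```

On the first step the momentum buffer equals the decayed gradient, so the weight moves by the learning rate times that value.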

Experiment settings
• Datasets:
  • MLR-CUHK03 (down-sampled images from CUHK03) (Li et al. 2014)
  • MLR-VIPeR (down-sampled images from VIPeR) (Gray et al. 2008)
  • CAVIAR (containing both HR and LR images) (Cheng et al. 2011)
• Evaluation protocol:
  • Cumulative match characteristic (CMC); Rank 1, Rank 5, Rank 10, and Rank 20 are reported
Reference:
• Li et al. 2014. DeepReID: Deep filter pairing neural network for person re-identification. In CVPR.
• Gray et al. 2008. Viewpoint invariant pedestrian recognition with an ensemble of localized features. In ECCV.
• Cheng et al. 2011. Custom pictorial structures for re-identification. In BMVC.
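The CMC rank-k score is the fraction of queries whose first correct gallery match appears within the top k results. A minimal sketch, assuming a precomputed query-by-gallery distance matrix and ignoring dataset-specific details such as camera filtering or repeated random splits; `cmc` is a hypothetical helper.

```python
import numpy as np

def cmc(dist, query_ids, gallery_ids, ranks=(1, 5, 10, 20)):
    """Rank-k matching rate; dist[i, j] is the distance from query i to gallery image j."""
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    order = np.argsort(dist, axis=1)                    # gallery indices, nearest first
    matches = gallery_ids[order] == query_ids[:, None]  # True where the identity matches
    first_hit = matches.argmax(axis=1)                  # 0-based rank of the first correct match
    has_match = matches.any(axis=1)                     # guard against queries with no match
    return {k: float(np.mean(has_match & (first_hit < k))) for k in ranks}

dist = [[0.1, 0.5, 0.9],
        [0.8, 0.2, 0.3]]
scores = cmc(dist, query_ids=[1, 2], gallery_ids=[1, 3, 2], ranks=(1, 2))
```

In this toy case the first query finds its identity at rank 1 and the second at rank 2, so the rank-1 rate is 0.5 and the rank-2 rate is 1.0.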

Comparisons
• 5 existing methods + 2 baselines + 2 variants of our method
Reference:
• Li et al. 2015. Multi-scale learning for low-resolution person re-identification. In ICCV.
• Jing et al. 2015. Super-resolution person re-identification with semi-coupled low-rank discriminant dictionary learning. In CVPR.
• Wang et al. 2016. Scale-adaptive low-resolution person re-identification via learning a discriminating surface. In IJCAI.
• Jiao et al. 2018. Deep low-resolution person re-identification. In AAAI.
• Wang et al. 2018. Cascaded SRGAN for scale-adaptive low resolution person re-identification. In IJCAI.

Ablation studies
• All loss terms play crucial roles in achieving state-of-the-art performance.

Visualization of Cross-Resolution Features
• (a) 35 different identities, each of which is shown in a unique color.
• (b) Images with the same resolution are shown in the same color.

Top ranked Results • Query with down-sampling rate = 2

Top ranked Results • Query with down-sampling rate = 4

Top ranked Results • Query with down-sampling rate = 8

Thanks for listening~
