Deep Learning applied to Pedestrian Detection: Generative Adversarial Networks and Human Aware Navigation use cases
SPARSIS Project
David Ribeiro
Instituto de Sistemas e Robótica, Instituto Superior Técnico, Universidade de Lisboa

Overview
Presentation of our work:
1. Generation of pedestrian images (a dataset) using Generative Adversarial Networks (GANs)
2. Application of the GAN generated images to train a Pedestrian Detection (PD) method, in a data augmentation (DA) setting
3. Performance comparison between traditional DA, GAN based DA, and a mixture of both procedures
4. Application of PD to the robot Human Aware Navigation (HAN) problem

Application of GANs to the PD problem - Overview
• Data augmentation using the pedestrian samples generated by the GAN
• Training of a PD method with these images included

GAN pipeline
Generator optimization problem and Discriminator optimization problem, as formulated by Goodfellow et al. 1
1 I. Goodfellow et al., “Generative adversarial nets,” in Advances in Neural Information Processing Systems, 2014.
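For reference, a sketch of the two optimization problems in the standard notation of Goodfellow et al. The symbols D (Discriminator), G (Generator), p_data (distribution of real images) and p_z (noise distribution) come from that paper and are assumed here; the slide may have used a variant such as the non-saturating Generator loss.

```latex
\begin{align}
\text{Discriminator:}\quad & \max_{D}\; \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big] \\
\text{Generator:}\quad & \min_{G}\; \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
\end{align}
```

Both problems share the second expectation: the Discriminator tries to push D(G(z)) towards 0, while the Generator tries to push it towards 1.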

GAN generated samples vs the original ones, using the Caltech PD dataset

Evolution of the Generated samples from the training epochs 1 to 90 (1 of 5)

Evolution of the Generated samples from the training epochs 1 to 90 (2 of 5)

Evolution of the Generated samples from the training epochs 1 to 90 (3 of 5)

Evolution of the Generated samples from the training epochs 1 to 90 (4 of 5)

Evolution of the Generated samples from the training epochs 1 to 90 (5 of 5)

Evolution of the Generated samples – comparison between epochs 1 and 90

Noise space neighbourhood interpolation properties
The noise values were perturbed with samples from a uniform distribution. The images displayed above were generated from the modified noise values, after 90 epochs of training.
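A minimal sketch of this kind of perturbation, assuming the noise vector is 100-dimensional and the perturbation is drawn from U(-eps, eps). The dimension, the value of `eps`, and the helper name `perturb_noise` are illustrative assumptions, not values taken from the slides.

```python
import random


def perturb_noise(z, eps, rng):
    """Add an independent U(-eps, eps) sample to each component of z."""
    return [zi + rng.uniform(-eps, eps) for zi in z]


rng = random.Random(0)

# Hypothetical 100-dimensional noise vector that would be fed to the Generator.
z = [rng.uniform(-1.0, 1.0) for _ in range(100)]

# Eight nearby noise vectors; each one would produce a slightly different image.
neighbours = [perturb_noise(z, eps=0.1, rng=rng) for _ in range(8)]
```

If the Generator has learned a smooth mapping, images generated from `neighbours` should look like small variations of the image generated from `z`.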

Nearest Neighbours in the training set
Example images generated with the Generator (90 epochs of training), and the first three Nearest Neighbours (NNs) in the training set (with decreasing similarity from left to right), using features from the layers conv5 (after pooling), fc6 and fc7.
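A small sketch of a nearest-neighbour lookup in feature space. It assumes Euclidean distance between feature vectors, which the slides do not specify, and uses toy 2-D vectors in place of real conv5, fc6 or fc7 features.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbours(query_feat, train_feats, k=3):
    """Indices of the k training features closest to query_feat, nearest first."""
    order = sorted(
        range(len(train_feats)),
        key=lambda i: euclidean(query_feat, train_feats[i]),
    )
    return order[:k]


# Toy stand-ins for CNN features of four training images.
train_feats = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
# Toy stand-in for the features of one generated image.
query = [0.9, 0.1]

top3 = nearest_neighbours(query, train_feats)
```

Here `top3` lists the training images by decreasing similarity, matching the left-to-right ordering described above.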

PD method training – data augmentation
The PD method from Ribeiro et al. 1 is adopted.
Training data augmentation is performed:
1) with GAN generated samples
2) using random deformations and horizontal flipping (traditional procedure)
3) mixing the procedures 1) and 2)
Experimental setup: 17540 negatives and 16376 positives (4094 original and 12282 DA)
1 David Ribeiro, Jacinto C. Nascimento, Alexandre Bernardino, Gustavo Carneiro, Improving the performance of pedestrian detectors using convolutional learning, Pattern Recognition, Volume 61, January 2017, Pages 641-649.

Traditional vs GAN data augmentation procedure
1) Traditional
2) GAN based
In 1): (a) original pedestrians; (b) these samples after horizontal flipping; (c) the randomly deformed original pedestrians; and (d) randomly deformed horizontally flipped images.
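The four traditional variants (a) to (d) can be sketched as follows. The slides do not say which random deformation is used, so `random_shift` (a random horizontal shift with edge replication) is a hypothetical stand-in, and images are plain lists of rows to keep the example self-contained.

```python
import random


def hflip(img):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in img]


def random_shift(img, max_shift, rng):
    """Hypothetical 'deformation': shift columns by a random offset,
    filling the exposed border by replicating the edge pixel."""
    s = rng.randint(-max_shift, max_shift)
    w = len(img[0])
    return [[row[min(max(c - s, 0), w - 1)] for c in range(w)] for row in img]


def traditional_variants(img, rng, max_shift=2):
    """Return (a) original, (b) flipped, (c) deformed, (d) deformed flipped."""
    flipped = hflip(img)
    return [
        img,
        flipped,
        random_shift(img, max_shift, rng),
        random_shift(flipped, max_shift, rng),
    ]


rng = random.Random(0)
img = [[1, 2, 3, 4], [5, 6, 7, 8]]          # toy 2x4 "image"
variants = traditional_variants(img, rng)

# Count check with the 4094 original positives from the experimental setup.
n_original = 4094
n_positives = n_original * len(variants)     # 16376
n_augmented = n_positives - n_original       # 12282
```

Four variants per original give 16376 positives, of which 12282 are augmented. This is consistent with the experimental setup figures, although the slides do not state this breakdown explicitly.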

Adopted PD method (Ribeiro et al. 1)
Changes in training:
CNN architecture: VGG Very Deep 16 (Simonyan and Zisserman 2), pretrained with ImageNet (Russakovsky et al. 3), then adapted to PD and fine-tuned using the Caltech dataset (the figure displayed above was adapted from Ribeiro et al. 1)
1 David Ribeiro, Jacinto C. Nascimento, Alexandre Bernardino, Gustavo Carneiro, Improving the performance of pedestrian detectors using convolutional learning, Pattern Recognition, Volume 61, January 2017, Pages 641-649.
2 K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition”, ICLR, 2015.
3 O. Russakovsky et al., “ImageNet Large Scale Visual Recognition Challenge,” IJCV, 2015.

Results

Human Aware Navigation (HAN)
• Pedestrian Detection
• Image to World: middle point, projected onto the floor plane (assuming known robot position)
• Tracking: associate with targets and use the positions as measurements in the Kalman Filters
• HAN constraints (cost functions, based on Mateus et al. 1): naturalness, social rules and human comfort
• A* path planner (to ensure minimal cost path)
1 A. Mateus et al., “Human-Aware Navigation using External Omnidirectional Cameras,” Iberian Robotics Conference, 2015.
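To illustrate how a minimal cost path can be found over a cost map, here is a small A* sketch on a 4-connected grid. The grid, the cell costs and the function name `astar` are illustrative assumptions; the actual HAN cost functions (naturalness, social rules, human comfort) are not reproduced here, and a single high-cost cell stands in for the area around a person.

```python
import heapq


def astar(cost, start, goal):
    """Minimum-cost 4-connected path on a grid.

    cost[r][c] >= 1 is the price of entering cell (r, c).
    Returns (path, total_cost), or (None, inf) if the goal is unreachable.
    """
    rows, cols = len(cost), len(cost[0])

    def h(cell):
        # Manhattan distance: admissible because every step costs at least 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]        # entries are (f = g + h, g, cell)
    best = {start: 0}
    parent = {start: None}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], g
        if g > best[cell]:
            continue                         # stale queue entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols:
                ng = g + cost[nr][nc]
                if ng < best.get(nxt, float("inf")):
                    best[nxt] = ng
                    parent[nxt] = cell
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return None, float("inf")


# Toy cost map: base cost 1, with one expensive cell near a person.
cost = [
    [1, 1, 1],
    [1, 9, 1],
    [1, 1, 1],
]
path, total = astar(cost, (1, 0), (1, 2))
```

In this toy map the planner goes around the expensive centre cell, because the detour (total cost 4) is cheaper than the direct route through it (total cost 10).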

Pedestrian Detection (Ribeiro et al. 1) (similar to the previous one)
1 David Ribeiro, Jacinto C. Nascimento, Alexandre Bernardino, Gustavo Carneiro, Improving the performance of pedestrian detectors using convolutional learning, Pattern Recognition, Volume 61, January 2017, Pages 641-649.

Tracking
Given the detected pedestrians, project and track in the world coordinate system:
• Associate detections between frames – Nearest Neighbor Joint Probabilistic Data Association (Bar-Shalom et al. 1);
• Estimate the person’s velocity – Kalman filter (one for each person) and constant velocity model assumption (Bitgood and Dukes 2) for the prediction step
1 Y. Bar-Shalom, F. Daum and J. Huang, “The probabilistic data association filter: Estimation in the presence of measurement origin uncertainty”, IEEE Control Systems Magazine, 2009.
2 S. Bitgood and S. Dukes, “Not Another Step! Economy of Movement and Pedestrian Choice Point Behavior in Shopping Malls,” Environment and Behavior, 2006.
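A minimal sketch of a constant velocity Kalman filter for one world axis, with state (position, velocity) and position-only measurements. The class name `AxisKalman`, the noise parameters `q` and `r`, the initial variances and the time step are all illustrative assumptions. Data association is not shown, and a full tracker would run one such filter per axis for each person.

```python
class AxisKalman:
    """Constant velocity Kalman filter along a single world axis."""

    def __init__(self, pos, pos_var=1.0, vel_var=1.0, q=0.01, r=0.1):
        self.p, self.v = pos, 0.0                        # state: position, velocity
        self.a, self.b, self.d = pos_var, 0.0, vel_var   # covariance [[a, b], [b, d]]
        self.q, self.r = q, r                            # process / measurement noise

    def predict(self, dt):
        # x <- F x with F = [[1, dt], [0, 1]]; velocity is assumed constant.
        self.p += self.v * dt
        # P <- F P F^T + diag(q, q), written out for the 2x2 case.
        self.a, self.b, self.d = (
            self.a + 2 * dt * self.b + dt * dt * self.d + self.q,
            self.b + dt * self.d,
            self.d + self.q,
        )

    def update(self, z):
        # Measurement model: z = position + noise, i.e. H = [1, 0].
        s = self.a + self.r                  # innovation variance
        k0, k1 = self.a / s, self.b / s      # Kalman gain
        y = z - self.p                       # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P
        self.a, self.b, self.d = (
            (1 - k0) * self.a,
            (1 - k0) * self.b,
            self.d - k1 * self.b,
        )


# Toy run: a person walking at 1 m/s, observed every 0.1 s without noise.
dt = 0.1
track = AxisKalman(pos=0.0)
for k in range(1, 101):
    track.predict(dt)
    track.update(1.0 * k * dt)
```

After the toy run, `track.v` is close to 1.0: the filter recovers the walking speed even though only positions are measured.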

Overall system: setup
• Camera 1 (ceiling)
• Camera 2 (ceiling)
• MBOT mobile platform
• Camera 3 (ceiling)
• Camera 4 (onboard)
• Environment (ROS Rviz)

Overall system: Experiments

Conclusions
GAN
• Generation of a PD dataset from noise using GANs
• Improvement of the detector’s performance by including the generated dataset in the training DA (and combining it with the normal DA procedure)
HAN
• Successful integration of PD in the robot navigation problem (HAN) with real-time performance
• PD+HAN system is suitable for robot navigation tasks (runtime and accuracy)

Adopted GAN architecture (Radford et al. 1)
Generator architecture and layer outputs: Noise to Generated Image
Discriminator architecture and layer outputs: Image (real or generated) to loss value
1 A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks”, CoRR, vol. abs/1511.06434, 2015.