
Recognition of Faces and Facial Attributes using Accumulative Local Sparse Representations
Domingo Mery, Department of Computer Science, Universidad Católica de Chile
Sandipan Banerjee, Department of Computer Science & Engineering, University of Notre Dame

Agenda
• Motivation
• Proposed method
• Experiments
• Conclusions

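[Figure: generic recognition pipeline. LEARNING: the images of each class (class 1, class 2, class 3) are described, and the descriptions are used in the classifier's design. TESTING: the query image is described and classified, which gives its class.]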

Are all parts of the face important? Important for gender. Important for race. Important for expression.

Are all parts of the face important? Important for Mary.

Are all parts of the face important? Important for Mary and Miguel. Not important at all!!! Important for Miguel.


Our method is based on SRC: Sparse Representation Classification

Gallery = Dictionary: Subject-1, Subject-2, Subject-3, Subject-4, ..., Subject-k

Sparse representation: the query image is represented as a linear combination of a few images of the gallery.

[Figure: Query = 0.3 × (gallery image) + 0.6 × (gallery image) + 0.1 × (gallery image)]
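The representation above can be written compactly. The following is a sketch assuming the standard SRC formulation, in which the gallery images are the columns of a dictionary D and the coefficient vector is found by l1 minimization; the symbols d_a, d_b, d_c (the three selected gallery images) and the tolerance ε are notation introduced here for illustration, not taken from the slides.

```latex
\begin{aligned}
\hat{\mathbf{x}} &= \arg\min_{\mathbf{x}} \|\mathbf{x}\|_1
  \quad \text{subject to} \quad \|\mathbf{y} - \mathbf{D}\mathbf{x}\|_2 \le \varepsilon, \\
\mathbf{y} &\approx \mathbf{D}\hat{\mathbf{x}}
  = 0.3\,\mathbf{d}_a + 0.6\,\mathbf{d}_b + 0.1\,\mathbf{d}_c .
\end{aligned}
```

Here y is the query image and most entries of the solution are zero, so only a few gallery images take part in the reconstruction.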

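[Figure: the query compared with a reconstruction from weighted gallery images (0.3, 0.6, 0.1).] Not similar: reconstruction error is high.

[Figure: the query compared with a reconstruction from weighted gallery images (0.3, 0.6, 0.1).] Very similar: reconstruction error is very low.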

Query is classified as this subject. In SRC, the query is classified as the subject with the lowest reconstruction error.
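The SRC decision rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: orthogonal matching pursuit (OMP) is swapped in as a simple stand-in for the sparse solver, the gallery is a toy set of random vectors, and the names `omp` and `src_classify` are hypothetical.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: approximate y with at most n_nonzero columns of D."""
    support, coef, residual = [], np.zeros(0), y.copy()
    for _ in range(min(n_nonzero, D.shape[1])):
        j = int(np.argmax(np.abs(D.T @ residual)))  # atom most correlated with the residual
        if j in support:
            break
        support.append(j)
        # refit all selected atoms by least squares
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[np.array(support, dtype=int)] = coef
    return x

def src_classify(D, labels, y, n_nonzero=3):
    """Return the label with the lowest class-wise reconstruction error."""
    x = omp(D, y, n_nonzero)
    errors = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)  # keep only the coefficients of subject c
        errors[int(c)] = float(np.linalg.norm(y - D @ x_c))
    return min(errors, key=errors.get), errors

# Toy gallery (assumption): 3 subjects, 2 images each, random unit vectors as columns.
rng = np.random.default_rng(0)
D = rng.standard_normal((50, 6))
D /= np.linalg.norm(D, axis=0)
labels = np.array([0, 0, 1, 1, 2, 2])

# Query built from the two gallery images of subject 1.
y = 0.6 * D[:, 2] + 0.4 * D[:, 3]
predicted, errors = src_classify(D, labels, y)
print(predicted, errors)
```

The reconstruction error is computed once per subject, using only that subject's coefficients, which is what makes the lowest error a class decision rather than a plain nearest-neighbor match.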

ALSR: Accumulative Local Sparse Representation [PROPOSED METHOD]

Our approach uses patches!
[Figure: query image → description → selection → classification → ID]

[Figure: ALSR block diagram, built up over several slides.]
LEARNING: the images of each class (class 1, ..., class i, ..., class k) are described, giving one dictionary per class (dictionary 1, ..., dictionary i, ..., dictionary k), together with a Visual Vocabulary & Stop List.
TESTING: a face mask is applied to the query image, which is then described. For each test patch: selection of best dictionaries, SRC, score. For all test patches: majority vote, which gives the class.

Gallery: Subject-1, Subject-2, Subject-3, Subject-4, ..., Subject-k

For each image of the gallery, the dictionary stores its patches plus the position (x, y) of each patch: patches of Subject-1, patches of Subject-2, ..., patches of Subject-k.

In general, for a gallery with Class-1, Class-2, Class-3, Class-4, ..., Class-k, the patches of Class-1, Class-2, ..., Class-k form the general dictionary D.

[Figure: dictionary and query]
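As an illustration of how such a dictionary of patches and positions could be assembled, here is a minimal sketch. The regular sampling grid, the patch size, the step, and the L2 normalization are all assumptions made for the example, and `extract_patches` and `build_dictionary` are hypothetical helper names; the slides do not state how patches are sampled.

```python
import numpy as np

def extract_patches(image, size, step):
    """Return flattened patches and the (x, y) center of each patch."""
    h, w = image.shape
    patches, positions = [], []
    for top in range(0, h - size + 1, step):
        for left in range(0, w - size + 1, step):
            p = image[top:top + size, left:left + size].astype(float).ravel()
            norm = np.linalg.norm(p)
            patches.append(p / norm if norm > 0 else p)  # unit-norm patch (assumption)
            positions.append((left + size / 2, top + size / 2))
    return np.array(patches), np.array(positions)

def build_dictionary(gallery, size=8, step=4):
    """gallery: list of (image, label). Returns patches (as rows), positions, labels."""
    D, pos, lab = [], [], []
    for image, label in gallery:
        p, xy = extract_patches(image, size, step)
        D.append(p)
        pos.append(xy)
        lab.append(np.full(len(p), label))
    return np.vstack(D), np.vstack(pos), np.concatenate(lab)

# Toy gallery (assumption): three random 16x16 "face" images, one per subject.
rng = np.random.default_rng(0)
gallery = [(rng.random((16, 16)), k) for k in range(3)]
D, positions, labels = build_dictionary(gallery)
print(D.shape, positions.shape, labels.shape)
```

Keeping the position next to each patch is what later allows a query patch to be compared only with gallery patches from the same region of the face.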

For patch i of the query (patch y_i and the neighborhood of y_i):
0. Original dictionary: D
1. Selection of the nearest patches (using (x, y) information): D_n
2. Selection of the most similar patches (using intensity information): D_s
3. Sparse representation of y_i using D_s: x_i (in the example, coefficients 0.5, 0.3 and 0.2)
4. Contribution s_i of patch i to each subject (in the example: 0, 0.8, 0, ..., 0.2)
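Steps 1 to 4 can be sketched for a single query patch as follows. This is an illustration under assumptions, not the authors' implementation: the radius, the number of similar patches, the use of Euclidean distance for similarity, the OMP solver, and the use of absolute coefficient values are all choices made for the example, and `patch_contribution` is a hypothetical name. The toy data reproduces the slide's numbers: coefficients 0.5 and 0.3 fall on one subject (contribution 0.8) and 0.2 on another.

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Orthogonal matching pursuit over the columns of A (stand-in sparse solver)."""
    support, coef, residual = [], np.zeros(0), y.copy()
    for _ in range(min(n_nonzero, A.shape[1])):
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j in support:
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[np.array(support, dtype=int)] = coef
    return x

def patch_contribution(y, xy, D, positions, labels, n_subjects,
                       radius=10.0, n_similar=20, n_nonzero=3):
    # 1. D_n: dictionary patches located near the query patch
    near = np.flatnonzero(np.linalg.norm(positions - xy, axis=1) <= radius)
    # 2. D_s: among those, the patches most similar in intensity to y
    dist = np.linalg.norm(D[near] - y, axis=1)
    sel = near[np.argsort(dist)[:n_similar]]
    # 3. x_i: sparse representation of y using D_s (atoms as columns)
    x = omp(D[sel].T, y, n_nonzero)
    # 4. s_i: absolute coefficients accumulated per subject
    s = np.zeros(n_subjects)
    np.add.at(s, labels[sel], np.abs(x))
    return s

# Toy dictionary (assumption): 6 orthogonal patches from 4 subjects; the last one is far away.
D = np.eye(6)
labels = np.array([0, 1, 1, 2, 3, 3])
positions = np.array([[5, 5]] * 5 + [[100, 100]], dtype=float)

y = 0.5 * D[1] + 0.3 * D[2] + 0.2 * D[4]
s = patch_contribution(y, np.array([5.0, 5.0]), D, positions, labels, n_subjects=4)
print(s)
```

Restricting the dictionary first by position and then by similarity keeps the sparse problem small, since each query patch is only ever explained by a handful of nearby, similar gallery patches.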

Contribution of each patch (z), one row per patch and one column per subject:

Subject      1       2       3       N
            0.1     0.5     0       0.2
            0       0.3     0       0.1
            0.2     0.4     0       0
            0.1     0.6     0       0.2
            0       0.3     0       0
            0.2     0       0       0.1
            0       0       0       0.4
            0       0.5     0.2     0.1
            0       0.6     0.2     0.1
             :       :       :       :
            0.1     0.7     0       0.2
TOTAL       1.1    14.1     0.7     1.5

Query classified as #2.

Patch selection and classification (example with 12 query patches p and 4 subjects):
1. Contribution: each patch p has a contribution to each subject.
2. Mask q: patches that are not discriminative can be removed (q = 0). [Figure: face masks used in our experiments]
3. SCI: the Sparsity Concentration Index is the score of each patch. Only patches with SCI > 0.25 are kept.
4. max: the maximal value of each contribution.
5. Normalization: each contribution is divided by its maximum.
6. The normalized contributions must be greater than 0.2.
7. The normalized contributions are summed per subject, and the query is classified according to the maximal normalized contribution.

 p   q   Contribution (subjects 1-4)   SCI   max   Normalized, > 0.2 (subjects 1-4)
 1   0    -     -     -     -           -     -      -      -      -      -
 2   1    -     -     -     -          0.2    -      -      -      -      -
 3   1   0.1   0.4   0     0           0.3   0.4    0.25   1.0     -      -
 4   0    -     -     -     -           -     -      -      -      -      -
 5   1    -     -     -     -          0.2    -      -      -      -      -
 6   1    -     -     -     -          0.2    -      -      -      -      -
 7   1   0.1   0.6   0.1   0.2         0.4   0.6     -     1.0     -     0.33
 8   1   0     0.1   0.3   0           0.8   0.3     -     0.33   1.0     -
 9   0    -     -     -     -           -     -      -      -      -      -
10   1   0.1   0.5   0.2   0.1         0.5   0.5     -     1.0    0.4     -
11   1    -     -     -     -          0.2    -      -      -      -      -
12   0    -     -     -     -           -     -      -      -      -      -
TOTAL                                               0.25   3.33   1.4    0.33

Patches 1, 4, 9 and 12 are removed by the mask (q = 0); patches 2, 5, 6 and 11 are removed because their SCI is 0.2. The maximum total is 3.33, so the query is classified as subject 2.
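The selection steps of this example (mask, SCI > 0.25, division by the row maximum, normalized value > 0.2, sum per subject) can be checked with a short sketch. The function name `alsr_vote` is hypothetical, the SCI values are taken as given rather than computed, and the 0.0 SCI entries for masked patches are placeholders. Patches 5 and 11, which the slides discard anyway, are left out because their contribution values are incomplete in the slide text.

```python
import numpy as np

def alsr_vote(contrib, mask, sci, sci_min=0.25, norm_min=0.2):
    """Accumulate normalized contributions of the selected patches per subject."""
    contrib = np.asarray(contrib, dtype=float)
    # keep patches inside the face mask whose SCI exceeds the threshold
    keep = (np.asarray(mask) == 1) & (np.asarray(sci) > sci_min)
    kept = contrib[keep]
    # divide each patch's contributions by its maximum
    peak = kept.max(axis=1, keepdims=True)
    peak[peak == 0] = 1.0  # avoid division by zero for all-zero rows
    normalized = kept / peak
    # discard normalized contributions that are not greater than norm_min
    normalized[normalized <= norm_min] = 0.0
    totals = normalized.sum(axis=0)
    return int(np.argmax(totals)), totals

# patch number: (contributions to subjects 1-4, mask q, SCI), values from the slides
patches = {
    1:  ([0.2, 0.1, 0.2, 0.3], 0, 0.0),  # SCI placeholder: masked patch
    2:  ([0.1, 0.3, 0.2, 0.1], 1, 0.2),
    3:  ([0.1, 0.4, 0.0, 0.0], 1, 0.3),
    4:  ([0.1, 0.6, 0.0, 0.1], 0, 0.0),  # SCI placeholder: masked patch
    6:  ([0.3, 0.1, 0.0, 0.1], 1, 0.2),
    7:  ([0.1, 0.6, 0.1, 0.2], 1, 0.4),
    8:  ([0.0, 0.1, 0.3, 0.0], 1, 0.8),
    9:  ([0.0, 0.1, 0.7, 0.0], 0, 0.0),  # SCI placeholder: masked patch
    10: ([0.1, 0.5, 0.2, 0.1], 1, 0.5),
    12: ([0.0, 0.1, 0.2, 0.3], 0, 0.0),  # SCI placeholder: masked patch
}
contrib = [v[0] for v in patches.values()]
mask = [v[1] for v in patches.values()]
sci = [v[2] for v in patches.values()]

winner, totals = alsr_vote(contrib, mask, sci)
print("subject", winner + 1, np.round(totals, 2))
```

Normalizing by the row maximum gives every selected patch the same maximal vote of 1.0, so a single patch with large coefficients cannot dominate the accumulated result.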

The code of the MATLAB implementation is available on our webpage: http://dmery.ing.puc.cl > Material > ALSR


Experiments
• Face Recognition in LFW
• Gender Recognition in AR
• Expression Recognition in Oulu-CASIA

Face Recognition in LFW [ PROTOCOL ] The gallery has 143 subjects with at least 11 images per subject (10 for training, the rest for testing). There are 1430 images for training and 2744 for testing.
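The split described in this protocol can be sketched as below. Which 10 images of each subject go to training is not stated on the slide, so taking the first 10 is an assumption, and `split_gallery` and the file names are hypothetical. The toy data gives every subject exactly 11 images, so only the training count (143 × 10 = 1430) matches the real protocol.

```python
def split_gallery(images_by_subject, n_train=10):
    """Per subject: n_train images for training, the rest for testing."""
    train, test = [], []
    for subject, images in images_by_subject.items():
        if len(images) <= n_train:
            continue  # the protocol requires at least n_train + 1 images
        train += [(img, subject) for img in images[:n_train]]
        test += [(img, subject) for img in images[n_train:]]
    return train, test

# Toy data (assumption): 143 subjects with exactly 11 image names each.
images_by_subject = {
    s: [f"subject{s:03d}_{i:02d}.jpg" for i in range(11)] for s in range(143)
}
train, test = split_gallery(images_by_subject)
print(len(train), len(test))
```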

Example in LFW
[Figure: query image, images of the same subject in the gallery (subject #117), and the contributions per class (classes 1 to 143) with the maximum for #117.]

Example in LFW
[Figure: images of the same subject in the gallery (subject #98), with the contributions per class (classes 1 to 143).]

[Table: face recognition results in LFW, including the value 80.8.] In this table, we do not report deep learning methods that require millions of training images (for the record, VGG-Face achieves 97.7% in this experiment).

Gender Recognition in AR [ PROTOCOL ] 100 subjects (50 women and 50 men). For gender recognition, 14 non-occluded images per subject. In this experiment, the first 25 males and 25 females were used for training and the last 25 males and 25 females were used for testing.

Expression Recognition in Oulu-CASIA [PROTOCOL] Six different facial expressions (surprise, happiness, sadness, anger, fear and disgust) under normal illumination from 80 subjects (59 males and 21 females) ranging from 23 to 58 years in age. The dataset contains 480 sequences: the first 9 images of each sequence are not considered, the first 40 individuals are taken as the training subset and the rest as the testing subset.


Conclusions

We presented a new algorithm that recognizes faces and facial attributes automatically from face images captured under less constrained conditions, including some variability in ambient lighting, pose, expression, size of the face and distance from the camera.

The robustness of our algorithm is due to two reasons:
• The dictionary used in the recognition corresponds to a rich collection of representations of relevant parts, which were selected using closeness and similarity criteria.
• The testing stage is based on accumulative sparse contributions according to location and relevance criteria.

We believe that this new approach can be used to solve other kinds of computer vision problems in which there are similar unconstrained conditions and a huge number of training images is not available.

In the future, we will train our own deep learning network to obtain a better description of the patches, and we will learn the face image masks from training data instead of selecting them manually.
