
Paper list:
• CVPR 19 - Graphonomy: Universal Human Parsing via Graph Transfer Learning
• AAAI 18 - Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition
• BMVC 18 - Part-based Graph Convolutional Network for Action Recognition
• CVPR 19 - Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition
• CVPR 19 - An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition
• NIPS 2017 - Inductive Representation Learning on Large Graphs
• CVPR 19 - Graphical Contrastive Losses for Scene Graph Generation
• ECCV 18 - PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model
• MM 18 - RGCNN: Regularized Graph CNN for Point Cloud Segmentation
• WWW 19 - Learning Graph Pooling and Hybrid Convolutional Operations for Text Representations
• AAAI 19 - Multi-GCN: Graph Convolutional Networks for Multi-View Networks, with Applications to Global Poverty

CVPR 2019

Human Parsing: datasets differ widely in the granularity and number of semantic labels
✓ A single universal human parsing model to tackle all levels of the task (multi-task learning)
✓ Pretrain on one dataset, then transfer to another dataset with graph transfer capability (transfer learning)

Intra-Graph Reasoning
1. Get local feature tensors from the convolution layers
2. Construct a graph with external structure knowledge
3. Feature maps -> graph node features
4. Apply graph convolution three times
5. Re-project the graph nodes to image features
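The intra-graph reasoning steps can be sketched in plain Python as follows. This is a minimal illustration, not the Graphonomy implementation: the softmax pixel-to-node assignment, the row-normalized adjacency with self-loops, the residual re-projection, and all names (`intra_graph_reasoning`, `w_assign`, `w_gcn_layers`) are assumptions made for the example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Numerically stable softmax applied to every row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def normalize_adjacency(adj):
    """Add self-loops, then row-normalize so each row sums to 1 (assumed scheme)."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def intra_graph_reasoning(features, adj, w_assign, w_gcn_layers):
    """features: pixels x channels; adj: nodes x nodes label graph (hypothetical layout)."""
    # Step 3: soft-assign every pixel to the graph nodes, then pool features per node.
    assign = softmax_rows(matmul(features, w_assign))   # pixels x nodes
    nodes = matmul(transpose(assign), features)         # nodes x channels
    # Step 4: propagate over the graph, one layer per weight matrix.
    a_hat = normalize_adjacency(adj)
    for w in w_gcn_layers:
        nodes = relu(matmul(matmul(a_hat, nodes), w))
    # Step 5: re-project node features to pixels and add them to the input.
    back = matmul(assign, nodes)                         # pixels x channels
    return [[f + b for f, b in zip(frow, brow)]
            for frow, brow in zip(features, back)]
```

Passing a list of three weight matrices as `w_gcn_layers` corresponds to applying graph convolution three times, as in step 4.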

Inter-Graph Transfer

AAAI 18

BMVC 18

Geometric features: relative coordinates
Temporal features: temporal displacements
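Both feature types can be computed directly from a skeleton sequence. The sketch below assumes each frame is a list of (x, y, z) joint positions and that relative coordinates are taken with respect to a single reference joint; the function names and the `root` index are hypothetical choices for illustration.

```python
def relative_coordinates(frame, root=0):
    """Geometric feature: each joint's position relative to a reference joint."""
    rx, ry, rz = frame[root]
    return [(x - rx, y - ry, z - rz) for x, y, z in frame]

def temporal_displacements(sequence):
    """Temporal feature: per-joint motion between consecutive frames."""
    return [
        [tuple(c1 - c0 for c0, c1 in zip(j0, j1)) for j0, j1 in zip(prev, curr)]
        for prev, curr in zip(sequence, sequence[1:])
    ]
```

A sequence of T frames yields T relative-coordinate frames but only T - 1 displacement frames, so a real pipeline would need to pad or drop one frame to align the two.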

CVPR 19

Graph Convolutional Neural Network

Attention Enhanced Graph Convolutional LSTM

AGC-LSTM Network
• Joints Feature Representation
• Temporal Hierarchical Architecture: average pooling in the temporal domain to increase the temporal receptive field of the top AGC-LSTM layers
• Learning AGC-LSTM
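The effect of temporal average pooling can be seen in a small sketch: averaging neighboring frames shortens the sequence, so each step seen by a higher layer covers more of the original frames. The window and stride values here are assumptions for illustration, not settings taken from the paper.

```python
def temporal_average_pool(sequence, window=2, stride=2):
    """Average per-frame feature vectors over sliding temporal windows."""
    pooled = []
    for start in range(0, len(sequence) - window + 1, stride):
        chunk = sequence[start:start + window]
        # zip(*chunk) groups the same feature dimension across the window.
        pooled.append([sum(vals) / window for vals in zip(*chunk)])
    return pooled
```

With `window=2, stride=2`, a sequence of 4 frames becomes 2 pooled frames, each summarizing 2 original time steps.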

CVPR 19

Spatio-Temporal GCN

Actional-Structural GCN

Actional-Structural Graph Convolution Block

CVPR 19

NIPS 2017

ECCV 2018

Unified manner:
• multi-person detection
• 2D pose estimation
• instance segmentation
TO DO:
• identify person instances
• localize facial and body keypoints
• estimate instance segmentation masks

Keypoint detection
• Produce heatmaps (one channel per keypoint) and offsets (two channels per keypoint, for displacements in the horizontal and vertical directions)
• Each offset vector points from the image position x to the k-th keypoint of the closest person instance j

Hough voting
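Hough voting combines the heatmap and the offsets for one keypoint type: every confident pixel casts a vote at the location its offset points to, so votes pile up at the true keypoint position. The sketch below is a simplified version that votes into the nearest pixel and skips low-probability pixels; the `threshold` parameter and the data layout are assumptions for illustration.

```python
def hough_vote(heatmap, offsets, threshold=0.5):
    """heatmap[y][x] is a keypoint probability; offsets[y][x] is (dx, dy)."""
    height, width = len(heatmap), len(heatmap[0])
    votes = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            p = heatmap[y][x]
            if p < threshold:
                continue  # ignore pixels that are unlikely to be near the keypoint
            dx, dy = offsets[y][x]
            tx, ty = round(x + dx), round(y + dy)
            if 0 <= tx < width and 0 <= ty < height:
                votes[ty][tx] += p  # vote weighted by the heatmap probability
    return votes
```

Local maxima of the returned score map are the keypoint candidates passed on to the grouping stage.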

Grouping keypoints into person detection instances
• Mid-range pairwise offsets
• Recurrent offset refinement
• Fast greedy decoding
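A rough sketch of the greedy decoding idea: take keypoint candidates in order of decreasing score, skip any candidate that lies close to a keypoint of the same type in an instance already decoded, and otherwise start a new instance and follow the mid-range offsets along the skeleton edges. Recurrent offset refinement is omitted, and the data layout, the `mid_offset` callback, and `nms_radius` are hypothetical names chosen for this example.

```python
import heapq
import math

def greedy_decode(candidates, edges, mid_offset, nms_radius=2.0):
    """candidates: list of (score, keypoint_type, (x, y)).
    edges: (type_a, type_b) pairs forming a tree over keypoint types.
    mid_offset: function (from_type, to_type, (x, y)) -> (dx, dy).
    """
    # heapq is a min-heap, so scores are negated to pop the best candidate first.
    queue = [(-score, k, pos) for score, k, pos in candidates]
    heapq.heapify(queue)
    instances = []
    while queue:
        _, k, pos = heapq.heappop(queue)
        # Skip candidates already explained by a previously decoded instance.
        if any(k in inst and math.dist(inst[k], pos) <= nms_radius
               for inst in instances):
            continue
        inst = {k: pos}
        # Walk the skeleton outward from the seed until no keypoint can be added.
        grew = True
        while grew:
            grew = False
            for a, b in edges:
                for src, dst in ((a, b), (b, a)):
                    if src in inst and dst not in inst:
                        dx, dy = mid_offset(src, dst, inst[src])
                        inst[dst] = (inst[src][0] + dx, inst[src][1] + dy)
                        grew = True
        instances.append(inst)
    return instances
```

Because any keypoint type can act as the seed, a person can still be decoded when some keypoints are occluded or score poorly.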

Keypoint- and instance-level detection scoring

Instance-level person segmentation
• A per-pixel offset vector points from the image position x to the position of the k-th keypoint of the corresponding instance j

HH 2019. 1. 5

Pipeline

Detecting graph elements
• Ground truth for vertices and edges
• Stacked hourglass network
• Two heatmaps from the final tensor

Connecting elements with associative embeddings
• An edge points to a vertex by matching that vertex's output embedding as closely as possible
• The embedding vectors produced for different vertices must be sufficiently different
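These two requirements are commonly expressed as a "pull" term and a "push" term. The sketch below is one plausible form, assuming a squared-distance pull between each edge's predicted embedding and its target vertex embedding, and a hinge-style push with a margin between every pair of vertex embeddings; the function names, the `margin` value, and the exact normalization are assumptions, not the slide's formula.

```python
import math

def pull_loss(vertex_emb, edge_refs):
    """edge_refs: list of (vertex_index, embedding predicted by an edge)."""
    if not edge_refs:
        return 0.0
    total = sum(
        sum((p - v) ** 2 for p, v in zip(pred, vertex_emb[idx]))
        for idx, pred in edge_refs
    )
    return total / len(edge_refs)  # mean squared distance to the target vertex

def push_loss(vertex_emb, margin=1.0):
    """Penalize pairs of vertex embeddings that are closer than the margin."""
    total = 0.0
    n = len(vertex_emb)
    for i in range(n):
        for j in range(i + 1, n):
            total += max(0.0, margin - math.dist(vertex_emb[i], vertex_emb[j]))
    return total
```

Minimizing the pull term lets an edge be matched to the vertex with the nearest embedding, while the push term keeps that nearest-neighbor lookup unambiguous.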

Support for overlapping detections