Learning Semantics-aware Distance Map with Semantics Layering Network


Learning Semantics-aware Distance Map with Semantics Layering Network for Amodal Instance Segmentation
ACM MM 2019 Poster
Ziheng Zhang∗, Anpei Chen∗, Ling Xie, Jingyi Yu, Shenghua Gao† (ShanghaiTech University)
Presented by Zhixuan Li, 2019/11/15

What is Amodal Instance Segmentation?
• The occluded part should also be predicted, beyond the visible (modal) region.
• Amodal region: includes the occluded yellow part of the refrigerator.
[Figure: RGB input and amodal ground truth]

Normal semantic segmentation network
• Predicts labels for all visible pixels.
• Each pixel can have only one label.

A natural variant from modal to amodal
• Predict labels for all visible and invisible pixels.
• Each pixel can have more than one label.
• Occluded regions: ≥ 2 labels per pixel.

Comparison of the two methods’ prediction data structures
• Same: both predictions are H*W*C tensors, where C is the number of classes.
• Different (MODAL): argmax across the C channels at each position → one label per position.
• Different (AMODAL, natural naïve version): keep the predictions from all channels → multiple labels per occluded position, one label per unoccluded position.
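The two decoding rules above can be sketched in a few lines of NumPy. This is my own illustration, not the paper's code; the tensor sizes and the 0.5 decision threshold are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
C = 5                                # number of classes (assumed)
logits = rng.random((4, 4, C))       # an H*W*C score map (H = W = 4 here)

# MODAL: argmax across the C channels -> exactly one label per position.
modal_labels = logits.argmax(axis=-1)        # shape (4, 4)

# AMODAL (natural naive version): keep every channel whose score clears a
# threshold -> possibly several labels at an occluded position.
threshold = 0.5                      # assumed decision threshold
amodal_labels = logits > threshold           # shape (4, 4, C), boolean
```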

What information did we ignore in the GT?
• Relative depth order: the girl is closer to the camera than the surfboard.

Amodal naïve + depth order version
• MODAL: H*W*C; channels correspond to classes; decode with argmax.
• AMODAL (natural naïve version): H*W*C; channels correspond to classes; keep all channels.
• AMODAL (naïve + depth version): H*W*K; channels correspond to relative depth order; within each depth layer, use argmax to get one class.
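A sketch of how the naïve + depth representation might be decoded, assuming the network scores C classes inside each of K relative-depth layers (the extra class axis and all shapes here are my assumptions, chosen so that the per-layer argmax yields the H*W*K label map the slide describes):

```python
import numpy as np

# Illustration-only: score C classes within each of K depth layers, then take
# argmax over classes inside each layer -> one class per depth per pixel.
H, W, K, C = 4, 4, 3, 5
rng = np.random.default_rng(1)
scores = rng.random((H, W, K, C))    # assumed H*W*K*C class scores

per_depth_labels = scores.argmax(axis=-1)    # shape (H, W, K)
```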

Restriction of this naïve + depth version
• Not all areas of one object have the same relative depth.
• Example: the person's left and right legs are in front of and behind the horse, respectively.

SLN from this paper
• Different pixels can have different depths.

Illustrations

SLN Architecture

Sem-dist map
• Visibility level L: at least L objects must be “removed” to make the pixel visible.
• These visibility levels form the sem-dist map M.
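The visibility level can be computed from a stack of amodal instance masks sorted front-to-back. This `visibility_levels` helper is a hypothetical sketch under that assumption, not the paper's implementation:

```python
import numpy as np

def visibility_levels(amodal_masks):
    """Per-pixel visibility levels from amodal instance masks.

    amodal_masks: (N, H, W) boolean array, sorted front-to-back
                  (index 0 = closest to the camera).
    Returns an (N, H, W) int array: for each instance pixel, the number of
    nearer instances covering it, i.e. how many objects must be "removed"
    before that pixel becomes visible (0 = directly visible).
    """
    n, h, w = amodal_masks.shape
    levels = np.zeros((n, h, w), dtype=np.int32)
    occluders = np.zeros((h, w), dtype=np.int32)  # nearer masks seen so far
    for i in range(n):
        levels[i] = np.where(amodal_masks[i], occluders, 0)
        occluders += amodal_masks[i].astype(np.int32)
    return levels
```

For example, if a front object covers the left half of the image and a back object covers the whole image, the back object's left-half pixels get level 1 and its right-half pixels get level 0.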

Experiments
• COCOA dataset: 2500, 1250 and 1250 images for training, validation and testing; natural scenes.
• D2SA dataset: 5600 images; supermarket goods.

Metrics for amodal segmentation
• Mean average precision (AP) and mean average recall (AR).

Results on the COCOA validation set
• AR10 and AR100: 10 and 100 segments per image.
• Things (countable): person, car, etc.
• Stuff (uncountable): sky, grass, etc.

COCOA dataset

Conclusions
• State-of-the-art results on the COCOA and D2SA datasets.
• Good baseline code.