Shifting More Attention to Video Salient Object Detection

Deng-Ping Fan 1, Wenguan Wang 2, Ming-Ming Cheng 1, and Jianbing Shen 2
1 Nankai University  2 Inception Institute of Artificial Intelligence
Dataset, Code, and Result: https://github.com/DengPingFan/DAVSOD

1. Introduction

Problem: Existing video salient object detection (VSOD) datasets:
• Are limited to small scales (dozens of videos).
• Are annotated without reference to real human attention.
• Have quite limited diversity and generality.
• Ignore the saliency shift phenomenon.

Contribution: Based on the new problem, saliency shift, we
• Presented a new DAVSOD dataset.
• Proposed a strong SSAV model.
• Provided the largest-scale benchmark.
• Suggested some potential research directions, e.g., saliency-aware video captioning and video salient object subitizing.

2. Densely Annotated VSOD (DAVSOD) dataset

Abbreviations: Vi.: #videos. AF.: #annotated frames. DL: dense labeling. AS: attention shift. FP: annotate objects according to eye fixation. EF: eye fixation records. IL: instance annotation.

3. Saliency-Shift Aware VSOD (SSAV) model

4. Largest-scale benchmark

5. Result

6. Take-away
• Extensive experiments verified that, even considering top-performing models, VSOD still seems far from being solved.
• Saliency shift is a promising new avenue for video salient object detection.

