ScratchDet: Training Single-Shot Object Detectors from Scratch
ScratchDet: Training Single-Shot Object Detectors from Scratch (CVPR 2019 oral), 2019/8/22
Motivation • Fine-tuning a model pretrained for ImageNet image classification has drawbacks for detection: 1. Classification favors translation invariance and therefore relies on downsampling operations for better performance. In contrast, local texture information is more critical for object detection, so translation-invariant operations must be used with caution. 2. It is inconvenient to change the network architecture (even slightly) during fine-tuning, since the pretrained weights are tied to the original architecture.
Main Contribution • Present ScratchDet, a single-shot object detector trained from scratch, which integrates BatchNorm to help the detector converge well regardless of the network type • Introduce a new Root-ResNet backbone network based on a newly designed root block, which noticeably improves detection accuracy, especially for small objects
BatchNorm for training from scratch • BatchNorm in the backbone subnetwork • Add BatchNorm to each convolution layer in the backbone subnetwork • BatchNorm in the detection head subnetwork • Add BatchNorm to each convolution layer in the detection head subnetwork • BatchNorm in the whole network • Use BatchNorm in both the backbone and detection head subnetworks • BatchNorm for the pretrained network
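The slides do not show how BatchNorm is computed, so the following is only a minimal pure-Python sketch of the training-mode normalization that would sit after each convolution layer. The function name `batch_norm`, the toy `activations`, and the default `gamma`, `beta`, and `eps` values are illustrative assumptions, and running statistics for inference are omitted.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) of `batch` over the batch dimension.

    `batch` is a list of equal-length feature vectors, one per sample.
    Hypothetical helper: training-mode statistics only.
    """
    n = len(batch)
    dims = len(batch[0])
    # Per-feature mean and biased variance across the batch.
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dims)
    ]
    # Scale to zero mean and unit variance, then apply gamma and beta.
    return [
        [
            gamma * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta
            for j in range(dims)
        ]
        for row in batch
    ]

# Toy activations: two features on very different scales (assumed values).
activations = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
normalized = batch_norm(activations)
```

After normalization both features land on a comparable scale, which is one commonly cited reason BatchNorm makes optimization from random initialization more stable.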
Backbone Network Redesign • Backbone Network
Datasets • PASCAL VOC 2007 • PASCAL VOC 2012 • MS COCO
Experiments on PASCAL VOC 2007 • Analysis of BatchNorm • BatchNorm in the backbone network
Experiments on PASCAL VOC 2007 • Analysis of BatchNorm • BatchNorm in the detection head network
Experiments on PASCAL VOC 2007 • Analysis of BatchNorm • BatchNorm in the whole network
Experiments on PASCAL VOC 2007 • Analysis of BatchNorm • BatchNorm for the pretrained network • 77.2% mAP when fine-tuning the pretrained VGG-16-based SSD
Experiments on PASCAL VOC 2007 • Analysis of backbone subnetwork • Kernel size in the first layer
Experiments on PASCAL VOC 2007 • Analysis of backbone subnetwork • Downsampling in the first layer
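To make the effect of first-layer downsampling concrete, the sketch below applies the standard convolution output-size formula to a 300-pixel input. The stem configurations (a 7x7 stride-2 convolution followed by a 3x3 stride-2 max-pool, versus a single 3x3 stride-1 convolution) and the helper name `conv_out` are assumptions chosen for illustration, not settings taken from the slides.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

input_size = 300  # assumed input resolution

# Assumed ResNet-style stem: 7x7 conv, stride 2, then 3x3 max-pool, stride 2.
after_conv = conv_out(input_size, kernel=7, stride=2, padding=3)
after_pool = conv_out(after_conv, kernel=3, stride=2, padding=1)

# Assumed alternative: a 3x3 conv with stride 1 keeps the full resolution.
no_downsample = conv_out(input_size, kernel=3, stride=1, padding=1)

print(after_conv, after_pool, no_downsample)  # 150 75 300
```

Under these assumptions the conventional stem discards three quarters of the spatial resolution along each axis before the first residual stage, which is the kind of early information loss the motivation slide associates with weaker small-object detection.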
Experiments on PASCAL VOC 2007 • Analysis of backbone subnetwork • Number of layers in the root block
Experiments on PASCAL VOC 2007 • Analysis of backbone subnetwork • Number of layers in the root block • ScratchDet300
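One way to see why the number of layers in a root block matters is to compare receptive fields. The sketch below assumes the root block is a stack of stride-1 3x3 convolutions, which the slide does not state explicitly; the helper name `stacked_receptive_field` is hypothetical.

```python
def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convs with the same kernel."""
    return 1 + layers * (kernel - 1)

single_7x7 = stacked_receptive_field(kernel=7, layers=1)
two_3x3 = stacked_receptive_field(kernel=3, layers=2)
three_3x3 = stacked_receptive_field(kernel=3, layers=3)

print(single_7x7, two_3x3, three_3x3)  # 7 5 7
```

Under that assumption, three stacked 3x3 layers cover the same 7x7 input window as a single 7x7 convolution while adding two extra nonlinearities, and fewer layers cover a smaller window.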
Experiments on PASCAL VOC 2007 and 2012 • A
Experiments on MS COCO
Conclusion • Study the effects of BatchNorm in the backbone and detection head subnetworks, and successfully train detectors from scratch • Explore various detector architectures by taking advantage of pretraining-free training, and propose a new Root-ResNet backbone network to further improve detection accuracy
My own thinking • We need to identify the key differences between vision tasks, since they shape the network architecture and can have a large impact on the results • The novelty of this paper is somewhat limited, although the experiments are abundant
- Slides: 18