Efficient Video Classification Using Fewer Frames Shweta Bhardwaj, Mukundhan Srinivasan, Mitesh M. Khapra Indian Institute of Technology Madras, NVIDIA Bangalore
Motivation • Building compact video classification models with a small memory footprint • Most compact models still require a large number of FLOPs. 2020/11/28
Motivation • ECCV 2018 workshop • (i) recurrent neural network (LSTM) • (ii) cluster-and-aggregate (NetVLAD) • (iii) based on C3D • Distillation • Expensive teacher network that processes every frame • Inexpensive student network that processes fewer frames • Same model
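The teacher/student split can be illustrated with a minimal sketch of how a student might pick its frames. The slides do not state the selection rule, so evenly spaced sampling is an assumption here, and `select_frames` is a hypothetical helper, not a function from the authors' code.

```python
from typing import List


def select_frames(num_frames: int, k: int) -> List[int]:
    """Return k roughly evenly spaced frame indices out of num_frames.

    Assumption: uniform spacing; the slides only say "fewer frames".
    """
    if not 0 < k <= num_frames:
        raise ValueError("k must be between 1 and num_frames")
    # Integer arithmetic keeps the indices deterministic and in range.
    return [(i * num_frames) // k for i in range(k)]


teacher_input = list(range(30))        # the teacher processes every frame
student_input = select_frames(30, 6)   # the student processes a fraction
print(student_input)  # [0, 5, 10, 15, 20, 25]
```

With this rule the student always sees the first frame and never reads past the end of the video; other rules (random or learned selection) would fit the same interface.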
Framework
Teacher Network
Student Network
Experiment • Dataset: YouTube-8M (2017 version) • Models: H-RNN, NetVLAD, NeXtVLAD • Skyline: the original teacher model • Baseline: the student model trained without the teacher model
Evaluation
Best baseline and best loss combination
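A loss combination for the student could look like the sketch below. The slides do not list the individual terms, so the two used here are assumptions: a squared L2 distance between the teacher's and the student's final video encodings, and a binary cross-entropy classification loss (YouTube-8M is multi-label). The function names and `rep_weight` are hypothetical.

```python
import math
from typing import Sequence


def representation_loss(teacher_vec: Sequence[float],
                        student_vec: Sequence[float]) -> float:
    """Squared L2 distance between two video encodings (assumed term)."""
    if len(teacher_vec) != len(student_vec):
        raise ValueError("encodings must have the same length")
    return sum((t - s) ** 2 for t, s in zip(teacher_vec, student_vec))


def binary_cross_entropy(labels: Sequence[int],
                         probs: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy over classes (multi-label setting)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def combined_loss(teacher_vec, student_vec, labels, probs,
                  rep_weight: float = 1.0) -> float:
    """Classification loss plus a weighted representation-matching term."""
    return (binary_cross_entropy(labels, probs)
            + rep_weight * representation_loss(teacher_vec, student_vec))


print(combined_loss([1.0, 2.0], [1.0, 0.0], [1, 0], [0.5, 0.5]))
```

Setting `rep_weight` to 0 reduces this to a student trained without the teacher, which corresponds to the baseline described on the Experiment slide.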
Student performance is close to the skyline
Better performance on limited training data
The parallel student takes more epochs to match the serial student
Computational cost and time are lower than the skyline's, with the same performance
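To see why fewer frames lowers the cost, here is a back-of-the-envelope sketch. It assumes the encoder spends a constant amount of work per frame (as a recurrent encoder with one cell update per frame would), so cost grows linearly with the number of frames. The frame counts and the per-frame FLOP figure are made-up illustration values, not results from the slides.

```python
def encoder_flops(num_frames: int, flops_per_frame: float) -> float:
    """Total encoder cost under a constant per-frame cost assumption."""
    return num_frames * flops_per_frame


def relative_saving(total_frames: int, student_frames: int) -> float:
    """Fraction of encoder cost saved when only student_frames are processed."""
    if not 0 < student_frames <= total_frames:
        raise ValueError("student_frames must be in 1..total_frames")
    return 1.0 - student_frames / total_frames


# Hypothetical numbers: a 300-frame video, a student that reads 30 frames.
skyline_cost = encoder_flops(300, 1.0e6)
student_cost = encoder_flops(30, 1.0e6)
print(skyline_cost, student_cost, round(relative_saving(300, 30), 2))
```

The per-frame cost cancels out of the ratio, so under this assumption the saving depends only on the fraction of frames the student reads.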
Training the student to match the teacher's intermediate representations gives the same performance
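Matching intermediate representations needs an alignment between student and teacher time steps. The sketch below assumes the student's state after reading a frame is compared with the teacher's state at that same frame index, using a mean squared L2 distance; both the alignment and the distance are assumptions, and `intermediate_matching_loss` is a hypothetical name.

```python
from typing import List, Sequence


def intermediate_matching_loss(teacher_states: Sequence[Sequence[float]],
                               student_states: Sequence[Sequence[float]],
                               frame_indices: List[int]) -> float:
    """Mean squared L2 distance between aligned intermediate states.

    teacher_states[t] is the teacher's state after frame t (every frame).
    student_states[i] is the student's state after frame frame_indices[i].
    """
    if len(student_states) != len(frame_indices):
        raise ValueError("one frame index is needed per student state")
    total = 0.0
    for s_state, idx in zip(student_states, frame_indices):
        t_state = teacher_states[idx]  # teacher state at the same frame
        total += sum((t - s) ** 2 for t, s in zip(t_state, s_state))
    return total / len(student_states)


teacher_states = [[0.0], [1.0], [2.0], [3.0]]  # toy 1-d states, 4 frames
student_states = [[0.0], [2.5]]                # student read frames 0 and 2
print(intermediate_matching_loss(teacher_states, student_states, [0, 2]))  # 0.125
```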
Other models’ performance
Conclusion • Our model outperforms the baseline and significantly reduces computational time and cost compared to the skyline. • Future work: train a reinforcement learning agent to first select the most favorable k frames.
Comments • ++ Saves a lot of time and memory while achieving the same results. • ++ Good idea. • -- The best baseline's performance is close to that of the proposed model.