• HOF, MBH
  • Can be learned: a single convolutional layer (containing orientation-sensitive filters) followed by rectification and pooling layers
• Trajectory
  • Can be an input, using trajectory stacking
• Still missing:
  • Local pooling over spatio-temporal tubes centered at the trajectories
  • Camera motion compensation
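To make the "convolution, rectification, pooling" reading of HOF concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the network from the slides: the filters are hand-set unit vectors at evenly spaced orientations (a learned layer would fit them from data), the convolution is 1x1 over the two flow channels, and the names `oriented_filters`, `hof_like`, `n_orient`, and `cell` are hypothetical.

```python
import numpy as np

def oriented_filters(n_orient=8):
    """Unit vectors at evenly spaced angles; each one acts as a 1x1 filter
    over the (dx, dy) channels of an optical flow field.
    Assumption: fixed filters stand in for learned ones."""
    angles = 2 * np.pi * np.arange(n_orient) / n_orient
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (n_orient, 2)

def hof_like(flow, n_orient=8, cell=4):
    """flow: array of shape (H, W, 2) holding per-pixel displacement.
    Returns an array of shape (H // cell, W // cell, n_orient)."""
    filters = oriented_filters(n_orient)
    # 1x1 convolution: project every flow vector onto every orientation.
    responses = flow @ filters.T                 # (H, W, n_orient)
    # Rectification: keep only motion that agrees with the filter direction.
    rectified = np.maximum(responses, 0.0)
    # Sum-pooling over non-overlapping cell x cell blocks.
    h_cells, w_cells = flow.shape[0] // cell, flow.shape[1] // cell
    cropped = rectified[:h_cells * cell, :w_cells * cell]
    blocks = cropped.reshape(h_cells, cell, w_cells, cell, n_orient)
    return blocks.sum(axis=(1, 3))

# Usage: a flow field that moves uniformly to the right.
flow = np.zeros((8, 8, 2))
flow[..., 0] = 1.0
pooled = hof_like(flow)
```

With uniform rightward motion, the strongest response in every cell falls in the orientation bin at angle 0, which is the histogram-like behavior the bullet describes. An MBH-style variant would apply the same steps to spatial derivatives of the flow rather than to the flow itself.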
• UCF-101 – optical flow representation
• Two-Stream Conv. Net
X. Wang, A. Farhadi, A. Gupta CVPR 2016
• Loss:
[Figure: spatial stream Conv. Net and temporal stream Conv. Net]
• Initialize network weights with the pre-trained Two-Stream Network.
• Repeat:
  • Forward propagation and feature computation for each frame
  • Search latent variables:
    • such that
  • Calculate the joint loss
  • Perform back-propagation
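The "search latent variables" step in the loop above can be sketched as an exhaustive minimization over candidate assignments. This is only a generic illustration: the slides do not state the loss or the constraint, so the loss is passed in by the caller, and `search_latent`, `toy_loss`, and the "split index" latent variable are all hypothetical stand-ins.

```python
import numpy as np

def search_latent(frame_features, candidates, loss_fn):
    """Return the candidate latent assignment with the lowest loss,
    together with that loss. loss_fn is supplied by the caller;
    the actual loss from the slides is not reproduced here."""
    losses = [loss_fn(frame_features, z) for z in candidates]
    best = int(np.argmin(losses))
    return candidates[best], losses[best]

def toy_loss(features, split):
    """Hypothetical loss for illustration only: prefers the split index
    that maximizes the gap between the mean feature before and after it."""
    before = features[:split].mean(axis=0)
    after = features[split:].mean(axis=0)
    return -float(np.linalg.norm(after - before))

# Usage: four frames with 1-D features; the change happens after frame 2.
features = np.array([[0.0], [0.0], [1.0], [1.0]])
best_split, best_loss = search_latent(features, [1, 2, 3], toy_loss)
```

In a full training loop, the selected assignment would be held fixed while the joint loss is computed and back-propagated, and the search would be repeated on the next iteration with the updated features.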
• Objective:
[Figure: spatial stream Conv. Net with spatial distance score; temporal stream Conv. Net with temporal distance score]
• Model fusion:
  • 2 × TemporalScore + SpatialScore
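The fusion rule is a fixed weighted sum of the per-class scores from the two streams. A minimal sketch, assuming each stream yields one score per class in the same class order; the function name `fuse_scores`, the parameter `temporal_weight`, and the example numbers are illustrative, not taken from the slides.

```python
import numpy as np

def fuse_scores(spatial, temporal, temporal_weight=2.0):
    """Combine per-class scores as temporal_weight * temporal + spatial.
    The default weight of 2 follows the fusion rule on the slide."""
    spatial = np.asarray(spatial, dtype=float)
    temporal = np.asarray(temporal, dtype=float)
    return temporal_weight * temporal + spatial

# Usage: two classes, made-up scores.
fused = fuse_scores(spatial=[0.2, 0.5], temporal=[0.1, 0.4])
```

Whether the predicted class is the one with the largest or the smallest fused value depends on whether the scores are similarities or distances, which the slide text alone does not settle.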