EVA²: Exploiting Temporal Redundancy in Live Computer Vision. Mark Buckler, Philip Bedoukian, Suren Jayasuriya, Adrian Sampson. International Symposium on Computer Architecture (ISCA), Tuesday, June 5, 2018
Convolutional Neural Networks (CNNs)
Embedded Vision Accelerators • FPGA Research: Suda et al., Zhang et al., Qiu et al., Farabet et al., many more… • ASIC Research: ShiDianNao, EIE, Eyeriss, SCNN, many more… • Industry Adoption
Temporal Redundancy: Frame 0, Frame 1, Frame 2, Frame 3 • Input change: High, Low, Low • Cost to process: High, Low
Talk Overview: Background, Algorithm, Hardware, Evaluation, Conclusion
Common Structure in CNNs: Image Classification, Object Detection, Semantic Segmentation, Image Captioning
Common Structure in CNNs • Intermediate activations sit between the CNN prefix (high energy) and the CNN suffix (low energy) • "Key Frame" (Frame 0): CNN Prefix → Intermediate Activations → CNN Suffix • "Predicted Frame" (Frame 1): Motion (low energy) takes the place of the CNN prefix, then CNN Suffix • Example image: #MakeRyanGoslingTheNewLenna
Talk Overview: Background, Algorithm, Hardware, Evaluation, Conclusion
Activation Motion Compensation (AMC) • Key frame (time t): Input Frame → CNN Prefix (~10¹¹ MACs) → Stored Activations → CNN Suffix → Vision Result • Predicted frame (time t+k): Input Frame → Motion Estimation (~10⁷ adds) → Motion Vector Field → Motion Compensation of the stored activations → Predicted Activations → CNN Suffix → Vision Result
AMC Design Decisions • How to perform motion estimation? • How to perform motion compensation? • Which frames are key frames?
Motion Estimation • We need to estimate the motion of activations by using pixels • Motion estimation is performed on pixels • Motion compensation is performed on activations, which then feed the CNN suffix
Pixels to Activations: Receptive Fields • Input Image → 3x3 Conv → Intermediate Activations (diagram labels: C=3, C=64, C=64, w=h=8) • Each activation corresponds to a 5x5 "receptive field" of input pixels • Estimate the motion of activations by estimating the motion of receptive fields
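The 5x5 receptive field on this slide is what two stacked 3x3, stride-1 convolutions would produce. That layer stack is an assumption read off the diagram labels, not something the slide states. A minimal sketch of the standard receptive-field recurrence follows; the function name and the `(kernel, stride)` layer description are hypothetical.

```python
def receptive_field(layers):
    """Return (size, jump) for a stack of layers given as (kernel, stride) pairs.

    size: width in input pixels seen by one output activation.
    jump: distance in input pixels between neighbouring output activations.
    """
    size, jump = 1, 1
    for kernel, stride in layers:
        size += (kernel - 1) * jump  # each layer widens the field by (kernel - 1) jumps
        jump *= stride               # striding spreads output activations apart
    return size, jump


# Assumed stack: two 3x3 convolutions with stride 1.
print(receptive_field([(3, 1), (3, 1)]))
```

The `jump` value also gives the spacing between neighbouring receptive fields in the input image, which is the natural block stride for motion estimation over receptive fields.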
Receptive Field Block Motion Estimation (RFBME) [Figure: receptive-field blocks, labelled 0, 1, 2, 3, matched between the Key Frame and the Predicted Frame]
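The slides do not spell out the matching procedure, so the following is only a sketch of block matching over receptive-field-sized blocks. It assumes an exhaustive search within a small radius and a sum-of-absolute-differences (SAD) cost. The names `rfbme` and `block_sad` and the `block`, `stride`, and `radius` parameters are hypothetical. Vectors follow the convention used for motion compensation in this talk: the matching key-frame block is found by subtracting the vector from the block's position in the new frame.

```python
def block_sad(key, cur, ky, kx, cy, cx, size):
    """Sum of absolute differences between a key-frame block and a current-frame block."""
    return sum(
        abs(key[ky + dy][kx + dx] - cur[cy + dy][cx + dx])
        for dy in range(size)
        for dx in range(size)
    )


def rfbme(key, cur, block=4, stride=4, radius=2):
    """Return ({(y, x): (vy, vx)}, total_error) for grayscale frames given as lists of rows."""
    h, w = len(cur), len(cur[0])
    field, total_error = {}, 0
    for cy in range(0, h - block + 1, stride):
        for cx in range(0, w - block + 1, stride):
            best = None
            for vy in range(-radius, radius + 1):
                for vx in range(-radius, radius + 1):
                    ky, kx = cy - vy, cx - vx  # subtract the vector to find the source block
                    if not (0 <= ky <= h - block and 0 <= kx <= w - block):
                        continue
                    err = block_sad(key, cur, ky, kx, cy, cx, block)
                    # Lowest error wins; ties go to the smaller motion.
                    cand = (err, abs(vy) + abs(vx), vy, vx)
                    if best is None or cand < best:
                        best = cand
            err, _, vy, vx = best
            field[(cy, cx)] = (vy, vx)
            total_error += err
    return field, total_error
```

The summed error returned alongside the vector field is one possible signal for deciding when motion estimation has failed and a new key frame is needed.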
AMC Design Decisions • How to perform motion estimation? • How to perform motion compensation? • Which frames are key frames?
Motion Compensation • Example vector: X = 2.5, Y = 2.5, applied across C=64 channels • Subtract the vector to index into the stored activations and produce the predicted activations • Interpolate when necessary
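A fractional vector such as (2.5, 2.5) lands between stored activations, which is where interpolation comes in. The sketch below assumes bilinear interpolation over the four nearest neighbours and treats out-of-range neighbours as zero. Both choices are assumptions, as are the function names and the per-activation `vector_field` layout.

```python
import math


def sample_bilinear(plane, y, x):
    """Bilinearly sample a 2D plane at fractional (y, x); out-of-range taps count as zero."""
    h, w = len(plane), len(plane[0])
    y0, x0 = math.floor(y), math.floor(x)
    fy, fx = y - y0, x - x0
    total = 0.0
    for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
        for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
            weight = wy * wx
            if weight == 0.0 or not (0 <= yy < h and 0 <= xx < w):
                continue
            total += weight * plane[yy][xx]
    return total


def motion_compensate(stored, vector_field):
    """Predict activations from stored ones.

    stored: list of 2D planes, one per channel.
    vector_field[y][x]: (vy, vx) motion of the activation at (y, x), shared by all channels.
    """
    return [
        [
            [sample_bilinear(plane, y - vy, x - vx) for x, (vy, vx) in enumerate(row)]
            for y, row in enumerate(vector_field)
        ]
        for plane in stored
    ]
```

With integer vectors this reduces to a plain shifted copy; only fractional vectors need the four-neighbour blend.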
AMC Design Decisions • How to perform motion estimation? • How to perform motion compensation? • Which frames are key frames?
When to Compute Key Frame? • System needs a new key frame when motion estimation fails: de-occlusion, new objects, rotation/scaling, lighting changes • So, compute a key frame when the RFBME error exceeds a set threshold • Flow: Input Frame → Motion Estimation → Error > Thresh? Yes: Key Frame → CNN Prefix → CNN Suffix. No: Motion Compensation → CNN Suffix. Either path ends in the Vision Result
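The flowchart on this slide can be sketched as a small control loop. This is an illustration under stated assumptions, not the authors' implementation: `run_amc` and its callable parameters (`prefix`, `suffix`, `estimate_motion`, `compensate`) are hypothetical stand-ins for the CNN prefix, CNN suffix, RFBME, and motion compensation, and the first frame is assumed to always be a key frame.

```python
def run_amc(frames, prefix, suffix, estimate_motion, compensate, threshold):
    """Process frames with adaptive key-frame selection.

    Returns a list of (is_key_frame, vision_result) tuples, one per input frame.
    """
    key_frame, stored = None, None
    results = []
    for frame in frames:
        is_key = key_frame is None  # nothing stored yet: must run the full network
        if not is_key:
            vectors, error = estimate_motion(key_frame, frame)
            is_key = error > threshold  # motion estimation failed: refresh the key frame
        if is_key:
            key_frame, stored = frame, prefix(frame)
            activations = stored
        else:
            activations = compensate(stored, vectors)
        results.append((is_key, suffix(activations)))
    return results
```

Raising `threshold` yields fewer key frames and lower cost at the price of more prediction error; lowering it does the reverse.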
Talk Overview: Background, Algorithm, Hardware, Evaluation, Conclusion
Embedded Vision Accelerator (EVA²) • Baseline: a Global Buffer shared by Eyeriss (conv layers, CNN prefix) and EIE (fully connected layers, CNN suffix) • EVA² adds motion estimation and motion compensation units to this design • References: Y.-H. Chen, T. Krishna, J. S. Emer, and V. Sze, "Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks"; S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally, "EIE: Efficient inference engine on compressed deep neural network"
Embedded Vision Accelerator (EVA²) • Frame 0: key frame • Frame 1: predicted frame, produced by motion estimation followed by motion compensation • EVA² leverages sparse techniques to save 80-87% of storage and computation
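The backup slides describe the activation storage as "Sparse (run length)". The exact on-chip format is not given, so the following is only a generic zero run-length sketch: each nonzero activation is stored with the count of zeros preceding it. The function names and the `(zero_run, value)` pair layout are assumptions.

```python
def rle_encode(values):
    """Encode a sequence as ([(zero_run, nonzero_value), ...], trailing_zero_count)."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs, run


def rle_decode(pairs, trailing):
    """Invert rle_encode."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        out.append(v)
    out.extend([0] * trailing)
    return out


activations = [0, 0, 3, 0, 0, 0, 5, 1, 0, 0]
pairs, trailing = rle_encode(activations)
print(pairs, trailing)
```

Post-ReLU activations tend to contain many zeros, which is why a scheme like this can cut storage substantially.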
Talk Overview: Background, Algorithm, Hardware, Evaluation, Conclusion
Evaluation Details • Train/Validation Datasets: YouTube Bounding Box (object detection & classification) • Evaluated Networks: AlexNet, Faster R-CNN with VGGM and VGG16 • Hardware Baseline: Eyeriss & EIE performance scaled from papers • EVA² Implementation: written in RTL, synthesized with 65 nm TSMC
EVA² Area Overhead • Total 65 nm area: 74 mm² • EVA² takes up only 3.3%
EVA² Energy Savings [Figure: normalized energy (0 to 1) for AlexNet, Faster16, and FasterM, split into Eyeriss, EIE, and EVA² contributions, with bars for the original (orig), predicted-frame (pred), and average (avg) cases]
High Level EVA² Results
Network | Vision Task | Keyframe % | Accuracy Degradation | Average Latency Savings | Average Energy Savings
AlexNet | Classification | 11% | 0.8% top-1 | 86.9% | 87.5%
Faster R-CNN VGG16 | Detection | 36% | 0.7% mAP | 61.7% | 61.9%
Faster R-CNN VGGM | Detection | 37% | 0.6% mAP | 54.1% | 54.7%
• EVA² enables 54-87% savings while incurring <1% accuracy degradation • Adaptive key frame choice metric can be adjusted
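As a rough sanity check on these figures, the key-frame percentage caps the achievable savings. If every key frame costs at least as much as a baseline frame and predicted frames cost something, average savings cannot exceed one minus the key-frame fraction. That cost model is an assumption made here for illustration, and the names below are hypothetical.

```python
# (network, key-frame fraction, reported average energy savings) from the results slide
rows = [
    ("AlexNet", 0.11, 0.875),
    ("Faster R-CNN VGG16", 0.36, 0.619),
    ("Faster R-CNN VGGM", 0.37, 0.547),
]


def savings_bound(key_fraction):
    """Upper bound on average savings if key frames cost at least one baseline frame."""
    return 1.0 - key_fraction


def check(rows):
    """Map network name -> (bound, reported savings, reported <= bound)."""
    return {
        name: (savings_bound(k), s, s <= savings_bound(k))
        for name, k, s in rows
    }


for name, (bound, reported, ok) in check(rows).items():
    print(f"{name}: bound {bound:.1%}, reported {reported:.1%}, consistent={ok}")
```

Under this model the reported savings for all three networks fall below their bounds, and the gap between the two reflects the cost of the predicted frames themselves.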
Talk Overview: Background, Algorithm, Hardware, Evaluation, Conclusion
Conclusion • Temporal redundancy is an entirely new dimension for optimization • AMC & EVA² improve efficiency and are highly general • Applicable to many different: CNN applications (classification, detection, segmentation, etc.); hardware architectures (CPU, GPU, ASIC, etc.); motion estimation/compensation algorithms
Backup Slides
Why not use vectors from video codec/ISP? • We've demonstrated that the ISP can be skipped (Buckler et al. 2017) • No need to compress video which is instantly thrown away • Can save energy by power gating the ISP • Opportunity to set own key frame schedule • However, great idea for pre-stored video!
Why Not Simply Subsample? • If a lower frame rate is needed, simply apply AMC at that frame rate • Warping • Adaptive key frame choice
Different Motion Estimation Methods [Figure: results for Faster16 and FasterM]
Difference from Deep Feature Flow? • Deep Feature Flow does also exploit temporal redundancy, but…
Feature | AMC and EVA² | Deep Feature Flow
Adaptive key frame rate? | Yes | No
On-chip activation cache? | Yes | No
Learned motion estimation? | No | Yes
Motion estimation granularity | Per receptive field | Per pixel (excess granularity)
Motion compensation | Sparse (four-way zero skip) | Dense
Activation storage | Sparse (run length) | Dense
Difference from Euphrates? • Euphrates has a strong focus on SoC integration • Motion estimation from ISP: may want to skip the ISP to save energy and create a more optimal key frame schedule • Motion compensation on bounding boxes: skips the entire network, but is only applicable to object detection
Re-use Tiles in RFBME
Changing Error Threshold
Different Adaptive Key Frame Metrics
Normalized Latency & Energy
How about Re-Training?
Where to cut the network?
#MakeRyanGoslingTheNewLenna • Lenna dates back to 1973 • We need a new test image for image processing!