
Cache and Scratch Pad Memory (SPM)
Nasibeh Teimouri
Embedded System Lab. (ESL)
Department of Electrical and Computer Engineering
Northeastern University, Boston (MA), USA

Memory Wall
• The growing disparity of speed between the CPU and memory outside the CPU chip
  – Bandwidth wall: limited communication bandwidth beyond chip boundaries
• Solution: memory hierarchy
  – Registers
  – Different levels of caches and/or SPM
  – Main memory

Outline
• Cache
  – Benefits and downsides
  – Design alternatives for cache
  – Cache locking
• Scratch Pad Memory (SPM)
  – SPM vs. Cache: organization, characteristics, improvements

Cache
• Cache as the primary solution for the memory wall
• Small memories on or close to the CPU
  – > 50% of chip area
+ Faster than main memory, and transparent to the SW
  – 1~2 cycle access latency and imp
+ Improved average execution time
- Unpredictable Worst Case Execution Time (WCET)
- Power hungry
  – 25% to 45% of total chip power for caches and peripherals in embedded systems
[Figure: memory hierarchy with CPU, internal ROM, internal SRAM, and external DRAM]
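The gap between improved average execution time and unpredictable WCET can be seen in a small worked example. The numbers here are illustrative assumptions, not measurements from the slides: a hit time of 1 cycle (within the 1~2 cycle range above), an assumed miss penalty of 50 cycles, and an assumed miss rate m = 0.05.

```latex
\begin{align*}
T_{\text{avg}}   &= T_{\text{hit}} + m \cdot T_{\text{miss}} = 1 + 0.05 \times 50 = 3.5 \text{ cycles} \\
T_{\text{worst}} &= T_{\text{hit}} + T_{\text{miss}} = 1 + 50 = 51 \text{ cycles}
\end{align*}
```

Under these assumptions the average access is cheap, but a WCET analysis that cannot prove an access hits has to charge it the full 51 cycles.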


Design Alternatives for Cache with High WCET Predictability
✓ Solution 1: cache locking
  • Control over the cache contents by the SW
  - Power hungry caches and peripherals
  - Cache pollution
✓ Solution 2: Scratch Pad Memory (SPM)
  • Small and fast on-chip static RAMs mapped to the processor's address space at a predefined address range
  • Data allocation under SW control
    – Static: SPM allocation doesn't change at runtime
    – Dynamic: SPM allocation can change at runtime
  • Profiling to estimate reuse and put reused data in SPM
  + Low power dissipation due to simple circuitry controlled by the SW
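The profiling-driven, static allocation step can be sketched as a 0/1 knapsack: given each object's size and profiled access count, choose the set that fits in the SPM and covers the most accesses. This is a minimal sketch under stated assumptions; the function name `allocate_to_spm`, the object names, and the profile numbers are hypothetical, and the slides do not say which allocation algorithm is used.

```python
from typing import Dict, List, Tuple


def allocate_to_spm(objects: Dict[str, Tuple[int, int]], spm_bytes: int) -> List[str]:
    """Choose which objects to place in the SPM (hypothetical helper).

    objects maps a name to (size_in_bytes, profiled_access_count).
    Returns the names whose total size fits in spm_bytes and whose
    summed access count is maximal (0/1 knapsack, dynamic programming).
    """
    # best[c] = (total accesses, chosen names) using at most c bytes
    best = [(0, [])] * (spm_bytes + 1)
    for name, (size, accesses) in objects.items():
        # Walk capacities downward so each object is placed at most once.
        for cap in range(spm_bytes, size - 1, -1):
            candidate = best[cap - size][0] + accesses
            if candidate > best[cap][0]:
                best[cap] = (candidate, best[cap - size][1] + [name])
    return sorted(best[spm_bytes][1])


# Illustrative profile (made-up numbers): name -> (bytes, accesses)
profile = {
    "loop_buf": (256, 9000),
    "lut": (512, 4000),
    "log": (1024, 300),
    "state": (128, 2500),
}
chosen = allocate_to_spm(profile, spm_bytes=768)
```

With this made-up profile and a 768-byte SPM, `loop_buf` and `lut` exactly fill the SPM and cover the most accesses, while `log` is too large to fit at all. A dynamic scheme would rerun a decision like this at chosen program points and copy data in and out.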


SPM Organization vs. Cache Organization
• No need for address translation
  – The SPM memory array is statically addressed
• No need for availability checking
  – No comparator and no miss/hit acknowledging signal circuitry
• Similarity between cache and SPM
  – Connected to the same address and data buses
  – Access latency of 1~2 processor cycles
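The organizational difference can be illustrated with a toy software model. This is only a sketch: the class names, the direct-mapped organization, and the sizes are assumptions chosen for brevity, not the hardware described in the slides. The cache must compare a stored tag on every access and report hit or miss; the SPM only decodes a fixed address range.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: every access compares a stored tag."""

    def __init__(self, num_lines: int, line_bytes: int):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines  # tag array; None marks an invalid line

    def access(self, addr: int) -> bool:
        """Return True on a hit, False on a miss (the line is then filled)."""
        block = addr // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        hit = self.tags[index] == tag  # comparator plus hit/miss signal
        if not hit:
            self.tags[index] = tag  # fill from the next memory level
        return hit


class ScratchPad:
    """Toy SPM: a fixed address range, with no tags and no hit/miss outcome."""

    def __init__(self, base: int, size: int):
        self.base = base
        self.size = size

    def contains(self, addr: int) -> bool:
        """Pure address decode: the range is known before the program runs."""
        return self.base <= addr < self.base + self.size


cache = DirectMappedCache(num_lines=4, line_bytes=16)
# Addresses 0 and 64 map to the same line, so they evict each other.
trace = [cache.access(a) for a in (0, 4, 64, 0)]

spm = ScratchPad(base=0x1000, size=256)
```

In the cache model the outcome of an access depends on what ran before it, which is the source of the WCET uncertainty. In the SPM model, whether an address is served by the SPM is a property of the address alone.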

SPM Characteristics vs. Cache Characteristics
• Cache
  – Indirect, hardware-managed addressing
  – Inefficient, cache line based storage
• SPM
  – Not globally addressable
  – Not globally visible

SPM Improvement over Cache
• ATMEL board
• CACTI model
• Same size of cache and SPM
• Different applications, including bubble sort, quick sort, etc., with different sizes from 64 bytes to 2048 bytes
• On average
  – Energy improvement by 40%
  – Performance improvement by 18%
  – Area improvement by 33%


Thank you!