Multi-threaded RTOS: How Multi-threading Can Increase On-chip Parallelism

Outline
  • Introduction
  • Multi-threading models
  • Architectures of multi-threaded processors
  • Simultaneous multi-threading and multiprocessors
  • Cache design
  • Examples of Multi-threaded environments
  • Conclusions

Introduction
  • Two forms of parallelism:
      - instruction-level parallelism (ILP)
      - thread-level parallelism (TLP)
      - Both identify independent instructions that can execute in parallel.
  • Wide-issue superscalar processors exploit ILP by executing multiple instructions from a single program in a single cycle.
  • Multiprocessors exploit TLP by executing different threads in parallel on different processors.
  • The first multi-threaded processor approaches, in the 1970s and 1980s, applied multi-threading at the user-thread level to solve the memory access latency problem.

Introduction
  • Motivations for multi-threaded processor architecture development include chip area, cost, and complexity.
  • Approaches: Simultaneous Multi-threading (SMT), Single-chip multiprocessing (CMP), SMT VLIW architecture, Multithreaded Vector (SMV) architecture.
  • DSP applications inherently benefit from the following architectural characteristics:
      - Parallelization at multiple levels of hierarchy:
          - Instruction: separate instruction memory space
          - Data: separate data memory space
          - Thread: multiple functional units
          - Data transfer: multiple wide data buses

Vertical and Horizontal Waste
  • Vertical waste is introduced when the processor issues no instructions in a cycle.
  • Horizontal waste is introduced when not all issue slots can be filled in a cycle.
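
The two definitions above can be made concrete by counting wasted issue slots over a short trace. The following is a minimal sketch, not taken from the slides: the function name `issue_waste`, the trace values, and the issue width of 4 are all hypothetical, chosen only for illustration. An empty cycle is charged as vertical waste for every slot, and a partly filled cycle is charged as horizontal waste for its unused slots.

```python
from typing import List, Tuple


def issue_waste(issued_per_cycle: List[int], issue_width: int) -> Tuple[int, int, float]:
    """Return (vertical_waste, horizontal_waste, utilization), counted in issue slots.

    issued_per_cycle[c] is the number of instructions issued in cycle c
    (hypothetical input; a real trace would come from a simulator).
    """
    vertical = 0
    horizontal = 0
    for issued in issued_per_cycle:
        if not 0 <= issued <= issue_width:
            raise ValueError("issued count must be between 0 and issue_width")
        if issued == 0:
            # No instruction issued: the whole cycle is vertical waste.
            vertical += issue_width
        else:
            # Some slots filled: the unused remainder is horizontal waste.
            horizontal += issue_width - issued
    total_slots = issue_width * len(issued_per_cycle)
    utilization = sum(issued_per_cycle) / total_slots if total_slots else 0.0
    return vertical, horizontal, utilization


# Hypothetical 6-cycle trace on a 4-wide processor.
trace = [4, 0, 2, 1, 0, 3]
vertical, horizontal, utilization = issue_waste(trace, issue_width=4)
print(vertical, horizontal, round(utilization, 3))  # prints: 8 6 0.417
```

In this trace, 8 of the 24 slots are lost to empty cycles and 6 to partly filled cycles, leaving 10 slots doing useful work.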

Multi-threaded Models
  • Fine-Grain Multithreading
      - Only one thread issues instructions each cycle, but it can use the entire issue width of the processor.
  • SM: Full Simultaneous Issue
  • SM: Single Issue, Dual Issue, Four Issue
  • SM: Limited Connection
      - Each hardware context is connected directly to one of each type of functional unit.
      - Less dynamic
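
The difference between fine-grain multithreading and full simultaneous issue can be illustrated with a toy slot-filling model. This is a simplified sketch under stated assumptions, not the evaluation behind the slides: the `ready` table (how many instructions each thread could issue in each cycle), the round-robin thread choice, and the function names are all hypothetical, and functional-unit conflicts are ignored.

```python
from typing import List


def fine_grain_issue(ready: List[List[int]], issue_width: int) -> int:
    """Slots filled when one thread per cycle (round-robin) may use the full width.

    ready[c][t] is the number of instructions thread t could issue in cycle c.
    """
    issued = 0
    for cycle, per_thread in enumerate(ready):
        thread = cycle % len(per_thread)  # assumed round-robin selection
        issued += min(per_thread[thread], issue_width)
    return issued


def simultaneous_issue(ready: List[List[int]], issue_width: int) -> int:
    """Slots filled when every thread may contribute instructions in each cycle."""
    issued = 0
    for per_thread in ready:
        free = issue_width
        for available in per_thread:
            take = min(available, free)  # fill remaining slots from this thread
            issued += take
            free -= take
    return issued


# Hypothetical workload: 4 cycles, 4 threads, 4-wide issue (16 slots in total).
ready = [
    [2, 1, 0, 3],
    [0, 2, 2, 1],
    [1, 0, 3, 0],
    [2, 2, 0, 1],
]
print(fine_grain_issue(ready, 4), simultaneous_issue(ready, 4))  # prints: 8 16
```

In this made-up workload no single thread can fill the issue width, so the fine-grain model leaves half of the slots empty, while simultaneous issue fills every cycle by drawing on several threads at once.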

Performance

SMT VLIW Architecture

Simultaneous Vector Multi-threaded Architecture (SVMT)

SMT vs. Multiprocessing

Cache design

Examples of Multi-threaded RTOS
  • Analog Devices VDK
  • uClinux
  • The RTXC Quadros RTOS
      - RTXC/ss
  • ThreadX

Conclusions
  • A simultaneous multithreaded architecture is superior in performance to a multiple-issue multiprocessor (multi-issue CMP).
  • SMT boosts utilization by dynamically scheduling functional units among multiple threads.
  • SMT also increases hardware design flexibility.
  • Simultaneous multithreading increases the complexity of instruction scheduling.
  • The increased parallelism it offers makes multi-threading ideal for DSP applications, where each application can run as a separate thread.