Next Generation ISA Itanium / IA-64

Operating Environments
• IA-32: Protected Mode / Real Mode / Virtual Mode, if supported by the OS
• IA-64 Instruction Set

Instruction Set Transition Model Overview
• The processor can execute either IA-32 or IA-64 instructions, depending on the current instruction set (IS), which is switched by the following instructions:
• jmpe (IA-32 instruction) - jump to an IA-64 target instruction and change the IS to IA-64
• br.ia (IA-64 instruction) - branch to an IA-32 target instruction and change the IS to IA-32
• rfi (IA-64 instruction) - "return from interruption"; returns to either an IA-32 or an IA-64 instruction
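The three transition instructions can be pictured as a small state machine. The sketch below is only an illustration: the names `ISA`, `ModeError` and `next_isa` are hypothetical, and the `rfi_target` argument stands in for the saved processor state that tells `rfi` which instruction set was interrupted.

```python
from enum import Enum


class ISA(Enum):
    IA32 = "IA-32"
    IA64 = "IA-64"


class ModeError(Exception):
    """Raised when an instruction is issued in the wrong instruction set."""


def next_isa(current, instruction, rfi_target=None):
    """Return the instruction set in effect after `instruction` executes.

    Hypothetical model: only jmpe, br.ia and rfi change the instruction set.
    """
    if instruction == "jmpe":
        if current is not ISA.IA32:
            raise ModeError("jmpe is an IA-32 instruction")
        return ISA.IA64
    if instruction == "br.ia":
        if current is not ISA.IA64:
            raise ModeError("br.ia is an IA-64 instruction")
        return ISA.IA32
    if instruction == "rfi":
        if current is not ISA.IA64:
            raise ModeError("rfi is an IA-64 instruction")
        if rfi_target is None:
            raise ValueError("rfi needs the interrupted instruction set")
        return rfi_target
    # Any other instruction leaves the instruction set unchanged.
    return current
```

For example, `next_isa(ISA.IA32, "jmpe")` yields `ISA.IA64`, while issuing `br.ia` from IA-32 code raises `ModeError` in this model.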

IA-64 Instruction Set Features
• Explicit parallelism
• Features to enhance instruction-level parallelism:
 - Speculation (minimizes the impact of memory latency)
 - Predication
 - Software pipelining of loops
• Improved high-performance floating-point architecture
• New multimedia instructions
• 2-bit cache hint field, encoded by the compiler, for placement of cache lines in the cache hierarchy
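Predication can be illustrated with a toy interpreter: a compare writes a predicate and its complement, and each following instruction takes effect only if its qualifying predicate is true, so both arms of an if/else are issued without a branch. This is a minimal sketch; the function name `run_predicated` and the chosen operations are assumptions made for illustration.

```python
def run_predicated(r1, r2):
    """Model of:  cmp.eq p1,p2 = r1,r2 ;;
                  (p1) add r3 = r1,r2
                  (p2) sub r3 = r1,r2
    """
    regs = {"r1": r1, "r2": r2, "r3": 0}
    preds = {0: True}  # p0 always reads as true on IA-64

    # The compare sets p1 and, as its complement, p2.
    preds[1] = regs["r1"] == regs["r2"]
    preds[2] = not preds[1]

    # (qualifying predicate, destination, operation)
    program = [
        (1, "r3", lambda r: r["r1"] + r["r2"]),
        (2, "r3", lambda r: r["r1"] - r["r2"]),
    ]
    for qp, dest, op in program:
        if preds[qp]:  # an instruction with a false predicate has no effect
            regs[dest] = op(regs)
    return regs["r3"]
```

Calling `run_predicated(4, 4)` takes the `add` arm, and `run_predicated(7, 2)` takes the `sub` arm, with no branch in the instruction stream.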

IA-64 Instruction Set Features
• Register stack - avoids unnecessary spilling and filling of registers at procedure call and return interfaces through compiler-controlled renaming. The callee executes an 'alloc' instruction specifying the number of registers it expects to use.
• Register rotation - allows concurrent execution of multiple loop iterations.
• Multimedia support - alongside Streaming SIMD Extensions and newer MMX instructions, IA-64 multimedia instructions treat general registers as concatenations of eight 8-bit, four 16-bit, or two 32-bit elements.
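The idea of treating one 64-bit general register as several independent lanes can be sketched in a few lines. The function name `parallel_add` is hypothetical, and wraparound (modular) arithmetic within each lane is assumed here.

```python
def parallel_add(x, y, lane_bits):
    """Add two 64-bit values lane by lane, wrapping within each lane.

    lane_bits is 8, 16 or 32, giving eight, four or two lanes.
    """
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane_bits must be 8, 16 or 32")
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        a = (x >> shift) & lane_mask
        b = (y >> shift) & lane_mask
        # A carry out of one lane is discarded rather than
        # propagated into the neighbouring lane.
        result |= ((a + b) & lane_mask) << shift
    return result
```

With 8-bit lanes, adding `0x00FF` and `0x0001` wraps the low lane to zero; with 16-bit lanes the same inputs give `0x0100`, because the carry stays inside the wider lane.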

Architectural Overview.

IA-64 Execution Environment
• Application Register State (registers available to application programs)
 - 128 general-purpose 64-bit registers (+1 NaT bit each).
 - IA-32 integer and segment registers are contained in GR8-GR31 when in IA-32 mode.
 - 128 82-bit floating-point registers (NaTVal); as with the general registers, IA-32 floating-point state is held in these registers.
 - 64 1-bit predicate registers.
 - 8 64-bit branch registers (used for IA-64 branching).
 - 64-bit instruction pointer (IP).
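The extra NaT ("Not a Thing") bit on each general register is what makes control speculation safe: a speculative load that would fault marks its target as NaT instead, the mark propagates through dependent computation, and a later check runs recovery code. The sketch below is a simplified model; `GR`, `speculative_load`, `add` and `check` are hypothetical names, and memory is a plain dictionary.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass(frozen=True)
class GR:
    """A 64-bit general register value plus its NaT bit."""
    value: int = 0
    nat: bool = False


def speculative_load(memory, address):
    """Like a speculative load: a bad address yields NaT instead of faulting."""
    if address not in memory:
        return GR(0, nat=True)
    return GR(memory[address] & MASK64)


def add(a, b):
    """NaT propagates: the result is NaT if either source is NaT."""
    return GR((a.value + b.value) & MASK64, a.nat or b.nat)


def check(reg, recover):
    """Like a speculation check: run recovery code only if NaT is set."""
    return recover() if reg.nat else reg
```

A load from a valid address behaves normally; a load from a missing address produces a NaT value that survives the `add` and is only dealt with when `check` is reached.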

IA-64 Execution Environment
 - 38-bit Current Frame Marker (CFM): state that describes the current general register stack frame.
 - 128 64-bit Application Registers: special-purpose IA-64 and IA-32 application registers.
 - 64-bit Performance Monitor Data Registers: monitor the performance of the hardware.
 - 6-bit User Mask: independent single-bit values used for performance monitors, alignment traps, and monitoring of floating-point register usage.
 - Processor Identifier (CPUID) registers: describe processor implementation-dependent IA-64 features.
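The 38 bits of the CFM break down into six fields. The slides do not list them; the layout below follows the published IA-64 architecture definition (frame size, locals size, rotating region size, and three rotating register bases), while the Python names are hypothetical.

```python
# (field name, lowest bit position, width in bits)
CFM_FIELDS = (
    ("sof", 0, 7),      # size of frame
    ("sol", 7, 7),      # size of locals portion of the frame
    ("sor", 14, 4),     # size of rotating portion, in units of 8 registers
    ("rrb_gr", 18, 7),  # register rename base, general registers
    ("rrb_fr", 25, 7),  # register rename base, floating-point registers
    ("rrb_pr", 32, 6),  # register rename base, predicate registers
)


def decode_cfm(cfm):
    """Split a 38-bit CFM value into its named fields."""
    if not 0 <= cfm < (1 << 38):
        raise ValueError("CFM is a 38-bit value")
    return {
        name: (cfm >> pos) & ((1 << width) - 1)
        for name, pos, width in CFM_FIELDS
    }
```

A frame with 10 registers, 6 of them locals, and one group of 8 rotating registers would be encoded as `10 | (6 << 7) | (1 << 14)`; the number of output registers is `sof - sol`.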

IA-64 Execution Environment
• Memory
 - Memory is addressed with 64-bit pointers.
 - Memory can be accessed in units of 1, 2, 4, 8, or 16 bytes.
 - The user mask controls whether loads and stores use little-endian or big-endian byte ordering for IA-64 references.
• Instruction encoding and sequencing.
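The effect of the byte-order control on a load can be shown with a short sketch. The function `load` and its `big_endian` flag are hypothetical stand-ins for a memory access and the user-mask byte-order bit.

```python
def load(memory, address, size, big_endian=False):
    """Read `size` bytes at `address` and interpret them as an unsigned integer.

    `memory` is any bytes-like object; `big_endian` models the byte-order
    setting selected through the user mask.
    """
    if size not in (1, 2, 4, 8, 16):
        raise ValueError("access size must be 1, 2, 4, 8 or 16 bytes")
    raw = bytes(memory[address:address + size])
    if len(raw) != size:
        raise IndexError("access runs past the end of memory")
    return int.from_bytes(raw, "big" if big_endian else "little")
```

The same four bytes `11 22 33 44` read as `0x44332211` in little-endian mode and as `0x11223344` in big-endian mode.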

Application Programming Model

Using IA-64 Instructions
• Format: [qp] mnemonic[.comp] dest=srcs
• Expressing parallelism: a double semicolon (;;) marks a stop, which ends an instruction group.
    ld8 r1=[r5] ;;    // first group
    add r4=r5,r6
    sub r5=r7,r8
    st8 [r6]=r12      // second group
• Bundles and templates: bundle boundaries are marked with curly braces; each bundle contains a template specification and three instructions.
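At the encoding level, an IA-64 bundle is 128 bits: a 5-bit template in the low bits followed by three 41-bit instruction slots. The sketch below packs and unpacks that layout; the function names are hypothetical, and the slot contents are treated as opaque integers rather than real instruction encodings.

```python
TEMPLATE_BITS = 5
SLOT_BITS = 41
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_bundle(template, slots):
    """Pack a 5-bit template and three 41-bit slots into a 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three instructions")
    bundle = template
    for index, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each slot must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Recover (template, [slot0, slot1, slot2]) from a 128-bit integer."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = [
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(3)
    ]
    return template, slots
```

The field widths add up exactly: 5 + 3 × 41 = 128, so a packed bundle always fits in 16 bytes.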

IA-64 Optimizations
• Refer to the IA-64 Application Developer's Architecture Guide for details of:
 - Memory references
 - Predication, control flow, and instruction stream
 - Software pipelining and loop support
 - Floating-point applications

The Technology Behind Crusoe™ Processors: Low-power x86-Compatible Processors Implemented with Code Morphing™ Software

Technology Overview
• First practical demonstration that a microprocessor can be implemented as a hardware/software hybrid.
• A hardware engine logically surrounded by a software layer.
• The hardware is a VLIW engine executing up to 4 instructions per clock cycle.
• The software layer surrounding the CPU, called the Code Morphing software, dynamically "morphs" x86 instructions into native VLIW instructions.
• Support for code morphing is built into the underlying hardware.
• Offers the opportunity to improve performance without changing the underlying hardware.

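Hardware Support for Code Morphing
[Figure: a 128-bit molecule whose four atoms are dispatched to separate functional units: FADD to the floating-point unit, ADD to integer ALU #0, LD to the load/store unit, BRCC to the branch unit.]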

Code Morphing Software
• Code Morphing software simplifies the chip hardware.
• Less hardware means lower power consumption and lower heat dissipation.
• With a matching change to the Code Morphing software, the hardware engine's native instruction set can be changed arbitrarily without affecting any x86 software.
• Transparent recompilation and optimization of software.
• Translations are performed over a group of instructions once and then used repeatedly.

Code Morphing Software
• For repeating blocks of code, the Code Morphing software reuses the translations held in the translation buffer while optimizing the block further.
• Execution modes for x86 code range from interpretation, through translation using very simple code generation, to highly optimized code.
• The translator adds code to collect information about block execution and branch history.
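The interpret-then-translate-then-reuse behaviour described above can be sketched as a small cache keyed by block address. This is a toy model, not Transmeta's implementation: the class `CodeMorpher`, the `HOT_THRESHOLD` value and the caller-supplied `translate` function are all hypothetical.

```python
class CodeMorpher:
    """Toy model: interpret cold blocks, translate hot ones once, then reuse."""

    HOT_THRESHOLD = 3  # hypothetical execution count before translating

    def __init__(self, translate):
        self.translate = translate   # x86 block -> native block (caller-supplied)
        self.cache = {}              # stands in for the translation buffer
        self.counts = {}             # per-block execution counts (profiling data)
        self.translations_made = 0

    def execute(self, block_addr, x86_block):
        """Return (mode, native_code) for one execution of the block."""
        self.counts[block_addr] = self.counts.get(block_addr, 0) + 1
        if block_addr in self.cache:
            return "cached", self.cache[block_addr]
        if self.counts[block_addr] < self.HOT_THRESHOLD:
            return "interpreted", None
        # The block is hot: translate it once and keep the result.
        self.cache[block_addr] = self.translate(x86_block)
        self.translations_made += 1
        return "translated", self.cache[block_addr]
```

Executing the same block five times in this model gives two interpreted runs, one run that pays the translation cost, and two runs served from the cache, so the cost of translation is amortized over later executions.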

Hardware Support for Code Morphing
• Exceptions and speculation:
 - Shadow registers.
 - Gated store buffer.
 - Because of the hardware implementation, commit operations are effectively ‘free’.
 - Alias hardware for loads and stores: load-and-protect and store-under-alias-mask.
 - A "translated" bit to handle self-modifying code.
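Shadow registers and the gated store buffer work together: translated code updates working registers and buffers its stores, a commit makes both permanent, and a rollback discards everything since the last commit. The sketch below models that behaviour in software; the class and method names are hypothetical, and the real mechanism is implemented in hardware.

```python
class SpeculativeState:
    """Toy model of working/shadow registers plus a gated store buffer."""

    def __init__(self, registers, memory):
        self.working = dict(registers)  # registers updated by translated code
        self.shadow = dict(registers)   # last committed register state
        self.memory = memory            # committed memory (a dict here)
        self.store_buffer = []          # stores held back until commit

    def write_reg(self, name, value):
        self.working[name] = value

    def store(self, address, value):
        # Stores are gated: they do not reach memory until commit.
        self.store_buffer.append((address, value))

    def commit(self):
        """Make all work since the last commit permanent."""
        self.shadow = dict(self.working)
        for address, value in self.store_buffer:
            self.memory[address] = value
        self.store_buffer.clear()

    def rollback(self):
        """Discard all work since the last commit, e.g. after an exception."""
        self.working = dict(self.shadow)
        self.store_buffer.clear()
```

After a rollback, the registers match the last committed state and no buffered store has touched memory, so the original x86 instructions can be re-executed conservatively from a precise state.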