Intel Itanium Architecture

  • Slides: 66

Intel® Itanium® Architecture
Itanium is a 64-bit microprocessor (IA-64). Its instruction set is neither RISC nor CISC; it is EPIC (Explicitly Parallel Instruction Computing). Itanium also supports IA-32, so it is backward compatible.
[Figure: EPIC Design Philosophy]

Registers
Itanium provides the following registers:
- General registers (integer registers)
- Floating-point registers
- Predicate registers
- Branch registers
- Instruction pointer (IP)
- Current frame marker (CFM)
- User mask
- Performance monitor data registers
- Processor identifiers
- Application registers

Register types in Itanium
General registers (integer registers)
- There are 128 general registers (r0-r127), each 64 bits wide.
- r0-r31 are static, and r0 always reads as 0.
- r32-r127 can be used as rotating registers for software pipelining and as the register stack.

Register types in Itanium
Floating-point registers
- There are 128 floating-point registers (fr0-fr127), each 82 bits wide, able to hold IEEE 754 double-extended format numbers.
- fr0-fr31 are static; fr0 and fr1 always read as +0.0 and +1.0.
- fr32-fr127 are rotating registers, used for software pipelining.

Register types in Itanium
Predicate registers
- There are 64 predicate registers (pr0-pr63), each 1 bit wide, that hold the predicates used for predication.
- pr0-pr15 are static, and pr0 always reads as 1.
- pr16-pr63 are rotating registers, used for software pipelining.

Register types in Itanium
Branch registers
- There are 8 branch registers (br0-br7), each 64 bits wide, used for branches.
Instruction pointer (IP)
- There is one instruction pointer register, 64 bits wide. It holds the address of the bundle that is currently executing.

Register types in Itanium
Performance monitor data registers
- 64 bits wide; they hold performance-monitoring data for software to read.
Processor identifiers
- 64 bits wide; they hold implementation-dependent information about the processor.
Application registers
- There are 128 application registers, each 64 bits wide.

Multiple Execution Units
Itanium has several parallel pipelines for parallel execution, built from 4 types of execution unit:
• I-unit: integer arithmetic, shift-and-add, logical, compare, and integer multimedia instructions
• M-unit: load/store between memory and registers
• B-unit: branch instructions
• F-unit: floating-point instructions

EPIC (Explicitly Parallel Instruction Computing)
In EPIC the compiler identifies independent instruction sequences and states that parallelism explicitly for the processor. Itanium provides a large number of registers, so the compiler can rename registers to keep operations from conflicting with one another.

Modulo Scheduled Software-pipelined Loop
EPIC supports modulo-scheduled loops, which run in 3 phases: Prolog, Kernel, and Epilog. They rely on 3 features: rotating registers, rotating predicates, and loop branch control.
- Rotating registers rename registers on every iteration, in a way comparable to the register stack.
- Rotating predicates phase stages in during the Prolog and phase them out during the Epilog.
- Loop branches tie the two together; they are written by the user or generated by the compiler.
Modulo-scheduled loops use 2 registers with the loop branch: the loop count (LC), which covers the Prolog and Kernel phases, and the epilog count (EC), which covers the Epilog phase.
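The interplay of LC, EC, and the rotating predicates can be sketched with a small simulation. This is a simplified model, not the exact hardware behaviour: the function name `run_modulo_loop`, the list `preds`, and the choice to start with LC = trip count - 1 and EC = number of stages are assumptions made for illustration, following the usual setup of a counted loop.

```python
def run_modulo_loop(trip_count, stages):
    """Return the stage predicates seen on each pass through the kernel.

    Simplified model of a counted, modulo-scheduled loop (an illustration,
    not a cycle-accurate description of the processor).
    """
    if trip_count < 1 or stages < 1:
        raise ValueError("trip_count and stages must be at least 1")
    lc = trip_count - 1                      # loop count: iterations still to start
    ec = stages                              # epilog count: passes needed to drain
    preds = [True] + [False] * (stages - 1)  # preds[0] enables the first stage
    passes = []
    while True:
        passes.append(tuple(preds))
        if lc > 0:        # Prolog/Kernel: another iteration enters the pipeline
            lc -= 1
            incoming = True
        elif ec > 1:      # Epilog: nothing new enters, older stages keep draining
            ec -= 1
            incoming = False
        else:             # pipeline is empty: the loop branch falls through
            break
        preds = [incoming] + preds[:-1]      # rotate the predicates by one stage
    return passes


if __name__ == "__main__":
    for number, stage_preds in enumerate(run_modulo_loop(3, 3), start=1):
        print(number, stage_preds)
```

In this model a loop with N iterations and S stages makes N + S - 1 passes through the kernel, and every stage is enabled exactly N times.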

Itanium Instruction Set Architectures
IA-32 virtual memory is limited to 4 GB. IA-64 supports far more: 16 terabytes is the figure given here, and the architecture's linear address space is 2^64 bytes.
Itanium instructions are divided into 6 types according to the type of execution unit that runs them: A (integer ALU), I (non-ALU integer), M (memory), F (floating-point), B (branch), and L+X (extended).

** PR = Predicate register, GR = General or Floating-point register **

Instruction Set Architecture (ISA): data types
Itanium supports the following data types:
- Integers of 1, 2, 4, and 8 bytes
- Floating-point numbers in single, double, and double-extended format
- Pointers of 8 bytes
Itanium operates on 64 bits (8 bytes) at a time, so integers of 1, 2, or 4 bytes are zero-extended to 64 bits when they are loaded into a register.

Itanium Instruction Format
Itanium (IA-64) instructions are 3-operand instructions of the form:
[(qp)] mnemonic[.comp1][.comp2] dest = srcs

Intel® Itanium® 2 Architecture
Intel Itanium 2 is the successor to Intel Itanium 1.

Intel Itanium Architecture Overview
Itanium is a 64-bit architecture built on Explicitly Parallel Instruction Computing (EPIC). The main features of the Itanium Architecture are:
1. Higher performance than 32-bit processors
2. Higher floating-point performance
3. Support for memory with 64-bit addresses
4. Compatibility with existing 32-bit software
5. Support for high-end applications such as E-Business, Enterprise, and Technical Computing
[Figure 1: Itanium]

EPIC (Explicitly Parallel Instruction Computing)
Predication
Predication works alongside branch prediction: instructions execute under the control of a predicate, which removes branches that would otherwise have to be predicted.

Basic Programming Model
Data types
The Intel Itanium Architecture supports the following data types:
1. Integers: 1, 2, 4, and 8 bytes
2. Floating-point: single, double, and double-extended format
3. Pointers: 8 bytes
[Figure 2: data types]

Intel Itanium Instruction Format
Itanium instructions are 3-operand instructions of the form:
[(qp)] mnemonic[.comp1][.comp2] dests = srcs
Examples:
- Simple instruction: add r1 = r2, r3
- Predicated instruction: (p4) add r1 = r2, r3
- Instruction with immediate: add r1 = r2, r3, 1
- Instruction with completer: cmp.eq p3 = r2, r4
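To make the syntax concrete, here is a minimal sketch that splits one line of this form into its parts. The regular expression and the `parse_instruction` helper are illustrative assumptions, not part of any Intel tool, and they cover only the simple one-line shapes shown in the examples.

```python
import re

# [(qp)] mnemonic[.comp1][.comp2] dests = srcs
INSTRUCTION = re.compile(
    r"^\s*(?:\((?P<qp>\w+)\)\s*)?"      # optional qualifying predicate, e.g. (p4)
    r"(?P<mnemonic>[a-z][a-z0-9]*)"     # mnemonic, e.g. add or cmp
    r"(?P<comps>(?:\.\w+)*)"            # zero or more completers, e.g. .eq
    r"\s+(?P<dests>[^=]+?)\s*=\s*"      # destination operands
    r"(?P<srcs>.+?)\s*$"                # source operands
)


def parse_instruction(line):
    """Split one assembly line into predicate, mnemonic, completers, operands."""
    match = INSTRUCTION.match(line)
    if match is None:
        raise ValueError(f"not in '[(qp)] mnemonic dests = srcs' form: {line!r}")

    def operands(text):
        return [item.strip() for item in text.split(",")]

    return {
        "qp": match.group("qp"),  # None when the instruction is unpredicated
        "mnemonic": match.group("mnemonic"),
        "completers": [c for c in match.group("comps").split(".") if c],
        "dests": operands(match.group("dests")),
        "srcs": operands(match.group("srcs")),
    }


if __name__ == "__main__":
    for text in ("add r1 = r2, r3", "(p4) add r1 = r2, r3", "cmp.eq p3 = r2, r4"):
        print(parse_instruction(text))
```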

Memory Organization
The Intel Itanium Architecture defines a single, uniform, linear address space of 2^64 bytes.
- Single space: code and data share the same memory.
- Uniform space: no address range has a predefined use.
- Linear space: the address space is not divided into segments; it is one space of 2^64 bytes.
Code is always stored in memory in little-endian order. Data in memory is little-endian by default, but the Itanium Architecture also supports big-endian order for data moved between memory and registers.
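The difference between the two byte orders can be shown with plain integers. The helpers `store` and `load` below are hypothetical names chosen for this sketch; they model only the byte layout in memory, not how the processor selects the byte order.

```python
def store(value, size, big_endian=False):
    """Return the bytes that a `size`-byte store of `value` leaves in memory."""
    return value.to_bytes(size, "big" if big_endian else "little")


def load(raw, big_endian=False):
    """Read bytes back into an unsigned integer (smaller sizes are zero-extended)."""
    return int.from_bytes(raw, "big" if big_endian else "little")


if __name__ == "__main__":
    word = 0x01020304
    print(store(word, 4).hex())                   # least significant byte first
    print(store(word, 4, big_endian=True).hex())  # most significant byte first
```

Reading the bytes back with the wrong byte order yields a different number, which is why loads and stores must agree on the order in use.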

Instruction Level Parallelism
Overview
The Intel Itanium architecture supports instruction level parallelism (ILP) in the following ways:
- The compiler or assembler expresses the parallelism explicitly.
- Instructions are packed into a three-instruction-wide word called a bundle, which lets them be issued in parallel.
- A large number of registers is provided, so that instructions do not have to compete for the same register.
[Figure 3: parallelism in Itanium]

Instruction groups
Intel Itanium instructions are organized into instruction groups. An instruction group is a sequence of instructions with no read-after-write or write-after-write dependencies among them, so they can execute in parallel. The processor executes as many instructions of the current instruction group in parallel as its resources allow. An instruction group contains at least 1 instruction, and there is no limit on the number of instructions in an instruction group.
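The grouping rule can be sketched as a greedy pass over a list of instructions. This is a simplified model under stated assumptions: each instruction is given as a hypothetical `(text, dests, srcs)` tuple, only register read-after-write and write-after-write dependencies are checked, and the special cases that the architecture allows inside a group are ignored.

```python
def split_into_groups(instructions):
    """Greedily split instructions into groups free of RAW and WAW dependencies.

    `instructions` is a list of (text, dests, srcs) tuples; a new group starts
    whenever an instruction reads or writes a register that an earlier
    instruction in the current group has written.
    """
    groups, current, written = [], [], set()
    for text, dests, srcs in instructions:
        read_after_write = written & set(srcs)
        write_after_write = written & set(dests)
        if current and (read_after_write or write_after_write):
            groups.append(current)       # close the group (a "stop")
            current, written = [], set()
        current.append(text)
        written |= set(dests)
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    program = [
        ("add r1 = r2, r3", ["r1"], ["r2", "r3"]),
        ("sub r4 = r5, r6", ["r4"], ["r5", "r6"]),
        ("add r7 = r1, r4", ["r7"], ["r1", "r4"]),  # reads r1 and r4: new group
        ("add r7 = r8, r9", ["r7"], ["r8", "r9"]),  # rewrites r7: new group
    ]
    for group in split_into_groups(program):
        print(group)
```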

Instruction groups and bundles
The template field of a bundle marks where an instruction group ends inside the bundle, so an instruction group can span bundles. In the example there are 3 instruction groups: groups A and C extend across bundles, while group B lies inside a single bundle.
[Figure 5: template field and instruction groups]
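A bundle can be modelled as one 128-bit integer. The field widths used below (a 5-bit template in the low bits followed by three 41-bit instruction slots) come from the Itanium architecture manuals rather than from this slide, and the helper names and the values in the usage lines are made up for illustration.

```python
TEMPLATE_BITS = 5   # template field, bits 0-4
SLOT_BITS = 41      # each of the three instruction slots
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_bundle(template, slots):
    """Pack a template and three instruction slots into a 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three instructions")
    bundle = template
    for index, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each slot must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = [
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(3)
    ]
    return template, slots


if __name__ == "__main__":
    packed = pack_bundle(0x10, [0x1, 0x2, 0x3])  # arbitrary example values
    print(hex(packed), unpack_bundle(packed))
```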

Register
The Intel Itanium architecture provides the following registers:
- General registers: 128
- Floating-point registers: 128
- Predicate registers: 64
- Branch registers: 8
- Application registers: 128
A register is named by its mnemonic followed by its number; for example, general register 32 is r32.
[Figure 6: Itanium registers]

Register
Floating-point registers
The Itanium architecture has 128 floating-point registers, each 82 bits wide, for floating-point computation. All floating-point registers are global and fall into two sets:
- 32 static floating-point registers
- 96 rotating floating-point registers, for software pipelining
Two registers (fr0 and fr1) have fixed values:
- fr0 always reads as +0.0
- fr1 always reads as +1.0
Each register is divided into 3 fields:
1. A 64-bit significand field
2. A 17-bit exponent field
3. A 1-bit sign field
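The three fields can be packed into one 82-bit integer as a quick check of the layout. The bit positions (sign in bit 81, exponent in bits 80-64, significand in bits 63-0), the exponent bias of 65535, and the explicit integer bit in the significand are taken from the Itanium architecture manuals, not from this slide; the function names are hypothetical, and only ordinary finite values and zero are covered.

```python
from fractions import Fraction

EXPONENT_BIAS = 0xFFFF  # 65535, bias of the 17-bit exponent field


def pack_fr(sign, exponent, significand):
    """Pack sign (1 bit), exponent (17 bits), significand (64 bits) into 82 bits."""
    if sign not in (0, 1):
        raise ValueError("sign must be 0 or 1")
    if not 0 <= exponent < (1 << 17):
        raise ValueError("exponent must fit in 17 bits")
    if not 0 <= significand < (1 << 64):
        raise ValueError("significand must fit in 64 bits")
    return (sign << 81) | (exponent << 64) | significand


def unpack_fr(bits):
    """Return (sign, exponent, significand) from an 82-bit register image."""
    return (bits >> 81) & 1, (bits >> 64) & 0x1FFFF, bits & ((1 << 64) - 1)


def fr_value(sign, exponent, significand):
    """Exact value of a zero or ordinary finite register image."""
    if exponent == 0 and significand == 0:
        return Fraction(0)
    if exponent == 0 or exponent >= 0x1FFFE or significand == 0:
        raise ValueError("special encodings are not covered by this sketch")
    magnitude = Fraction(significand, 1 << 63) * Fraction(2) ** (exponent - EXPONENT_BIAS)
    return -magnitude if sign else magnitude


FR0 = pack_fr(0, 0, 0)                     # +0.0
FR1 = pack_fr(0, EXPONENT_BIAS, 1 << 63)   # +1.0

if __name__ == "__main__":
    print(fr_value(*unpack_fr(FR0)), fr_value(*unpack_fr(FR1)))
```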

Register
Predicate registers
There are 64 predicate registers, each 1 bit wide, that control whether instructions execute. Predicate registers are used in 2 ways:
1. Validating or invalidating instructions
2. Controlling branches and if/then/else logic blocks
The registers fall into two sets:
- 16 static predicate registers
- 48 rotating predicate registers, for software pipelining
The first predicate register (pr0) always reads as 1 (true).
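How a predicate validates or invalidates an instruction can be sketched with a toy model of an if/then/else block. Everything here is illustrative: the function name, the register names, and the assembly lines in the comments are assumptions chosen for the example, and the model treats an invalidated instruction simply as one whose result is not written.

```python
def predicated_abs_diff(a, b):
    """Compute |a - b| without a branch, in the style of predicated code."""
    preds = {"p0": True}             # the first predicate always reads as 1
    preds["p1"] = a > b              # models: cmp.gt p1, p2 = a, b
    preds["p2"] = not preds["p1"]    # the compare sets complementary predicates
    regs = {"r1": None}
    program = [
        ("p1", "r1", a - b),         # models: (p1) sub r1 = a, b   (the "then" arm)
        ("p2", "r1", b - a),         # models: (p2) sub r1 = b, a   (the "else" arm)
    ]
    for qualifying_predicate, dest, value in program:
        # Both instructions are issued; only the validated one writes its result.
        if preds[qualifying_predicate]:
            regs[dest] = value
    return regs["r1"]


if __name__ == "__main__":
    print(predicated_abs_diff(5, 3), predicated_abs_diff(3, 5))
```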

Register
Branch registers
There are 8 branch registers, each 64 bits wide. A branch register holds the target address of an indirect branch, and branch registers are also used for call/return.
Application registers
There are 128 application registers, each 64 bits wide. Each of these registers has a specific function. An application register can be named in assembly by number or by function: for example, ar66 is the Epilogue Counter (EC), written ar.ec in assembly code.

Branch Handling
Branching in the Intel Itanium Architecture
There are 2 kinds of branch:
1. Relative direct branches use a 21-bit displacement that is added to the instruction pointer of the bundle containing the branch.
2. Indirect branches use a 64-bit address held in a branch register.
[Figure 9: Branch registers]
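The target of a relative direct branch can be worked out as follows. The sketch assumes, following the Itanium architecture manuals rather than this slide, that the 21-bit displacement is signed and counts 16-byte bundles; the function name `relative_branch_target` is hypothetical.

```python
MASK64 = (1 << 64) - 1
BUNDLE_BYTES = 16


def relative_branch_target(branch_ip, imm21):
    """Target address of an IP-relative branch with a signed 21-bit displacement."""
    if not 0 <= imm21 < (1 << 21):
        raise ValueError("imm21 must be a 21-bit field")
    # Sign-extend the 21-bit field.
    displacement = imm21 - (1 << 21) if imm21 & (1 << 20) else imm21
    # The displacement is relative to the bundle that contains the branch.
    bundle_ip = branch_ip & ~(BUNDLE_BYTES - 1)
    return (bundle_ip + displacement * BUNDLE_BYTES) & MASK64


if __name__ == "__main__":
    print(hex(relative_branch_target(0x1000, 1)))         # one bundle forward
    print(hex(relative_branch_target(0x1000, 0x1FFFFF)))  # one bundle backward
```

Under these assumptions the reach of a relative direct branch is about 16 MB in either direction; anything farther away needs an indirect branch through a branch register.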

Reduced Memory Access Costs
Eliminating memory accesses
The large number of registers in the Itanium architecture lets programs keep values in registers instead of memory, which reduces the number of memory accesses. Memory is accessed only through load and store instructions.

Reduced Memory Access Costs
Hiding memory latency
The Itanium architecture hides memory latency with speculative loads. Code can issue a load early so that the memory latency overlaps with other instructions running in parallel. If a speculative load hits an error, the error is not raised immediately; it is deferred until the loaded value is actually needed.
[Figure 11: memory latency hidden by a speculative load]

Floating Point and Multimedia
Overview
Itanium architecture floating point follows the IEEE floating-point standard and supports single, double, and double-extended formats. Itanium supports multimedia and data-parallel applications with:
- Integer SIMD, similar to MMX technology
- Floating-point SIMD (SIMD-FP), similar to the IA-32 Streaming SIMD Extensions

Floating Point and Multimedia
Intel Itanium Architecture floating-point features
The floating-point features of the Itanium architecture include:
- 128 floating-point registers
- A multiply and accumulate instruction (fma) whose operands are 4 floating-point registers (f = a * b + c), so the multiply and the add take one instruction instead of two
- Loads and stores that move floating-point values between memory and the floating-point registers

Floating Point and Multimedia
- Transfers between floating-point registers and general registers
- Register support for speculation in floating-point operations
- Conversion between integer and floating-point values
- Rotating floating-point registers
These features make Itanium suitable for workstations and 3D applications.
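One practical effect of computing f = a * b + c as a single operation is that the result is rounded once instead of twice. The sketch below illustrates this with Python doubles and exact rational arithmetic; it models double precision rather than the 82-bit register format, and both function names are hypothetical.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """a * b + c computed exactly, then rounded once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def separate_multiply_add(a, b, c):
    """a * b rounded to a double, then + c rounded again."""
    return a * b + c


if __name__ == "__main__":
    a = 1.0 + 2.0 ** -30
    b = 1.0 - 2.0 ** -30
    c = -1.0
    # The exact product is 1 - 2**-60, which a separate multiply rounds to 1.0.
    print(separate_multiply_add(a, b, c))
    print(fused_multiply_add(a, b, c))
```

With these inputs the separate version loses the small term entirely, while the fused version keeps it.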

Floating Point and Multimedia
Multimedia support
The Itanium Architecture supports both integer multimedia and floating point multimedia.
An integer multimedia instruction treats a general register as 8 x 8-bit, 4 x 16-bit, or 2 x 32-bit elements and operates on each data element independently. The Itanium architecture provides operations similar to MMX technology, in 3 groups:
1. Addition, subtraction, and multiplication
2. Shifts, both signed and unsigned
3. Pack and unpack, which convert between element sizes
[Figure 12: register]
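Treating one 64-bit register as several independent elements can be sketched as a lane-wise addition. The function `parallel_add` is a hypothetical helper that models wrap-around (modular) addition only; saturating variants are not shown.

```python
def parallel_add(a, b, lane_bits):
    """Add two 64-bit values lane by lane; carries never cross a lane boundary."""
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane_bits must be 8, 16, or 32")
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for lane in range(64 // lane_bits):
        shift = lane * lane_bits
        lane_sum = ((a >> shift) & lane_mask) + ((b >> shift) & lane_mask)
        result |= (lane_sum & lane_mask) << shift  # drop the carry out of the lane
    return result


if __name__ == "__main__":
    # The same bit patterns give different results for different element sizes.
    print(hex(parallel_add(0x00FF, 0x0001, 8)))
    print(hex(parallel_add(0x00FF, 0x0001, 16)))
```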

82-bit floating point register
The Itanium Architecture has 128 floating-point registers, each 82 bits wide. An IEEE double-extended floating-point value needs only 80 bits; the 82-bit registers give extra room so that operations such as square root can be done in software instead of hardware. A square root computed in software still produces a result in the 80-bit IEEE format. Floating point multimedia operations also use the floating-point registers.
