OpenVMS Technical Update Days, September 22nd, 2003, Bad Homburg, Germany
Dr. Herbert Cornelius, Intel EMEA
Intel® Itanium® Architecture Technical Overview: The Enterprise Architecture for the Next Decade
Intel® Itanium® Architecture (Intel Confidential)
*Other brands and names are the property of their respective owners. © Copyright 2002-2003 Intel Corporation. All Rights Reserved.
Agenda
• Enterprise Computing Trends
• Intel® Itanium® Architecture
• Enterprise System Platforms
• Intel Software Tools
Enterprise Computing Areas
• Science
• Engineering
• Business
Some Enterprise Computing History
[Figure: timeline covering the 1960s, 1970s, 1980s, 1990s, and 2000s. Labels: RISC; Proprietary Solutions; Solutions based on Building Blocks using Industry Standards]
Designed for Enterprise Computing
[Figure label: ARCHITECTURE]
A New Architecture for Enterprise Computing
• New architectural features: EPIC, predication, speculation
• Enhanced reliability features
• Enhanced floating-point performance
• Massive resources
• 64-bit instruction set, registers & addressing
[Figure labels: RISC Technology, CISC Technology, IA-32, Enterprise class OS]
Why Intel?
Economy of Scale
[Figure: Intel® Architecture versus RISC. Labels: Eco-System Investment, Performance, Memory Costs, Solution Costs]
Driving the Change in Computing Economics
• 10 GHz, 1 billion transistors, ~2007 (est.)
• Enabling peta-flop computing
• Moore's Law will continue for the next 10 years
• www.intel.com/research/silicon
Driving Performance Vectors
• Silicon Process
• Density
• Frequency
• Manufacturing
• Micro-Architecture
• Execution Units, Caches
• Threading
• Memory Subsystem
• I/O-Subsystem
• System Architecture
• Compilers
• Libraries
• Tools
• ISVs
Fundamental Architecture Challenges
• Sequentiality inherent in traditional architectures
• Complex hardware needed to (re)extract ILP
• Limited ILP available within basic blocks
• Branches make extracting ILP difficult
• Memory dependencies further limit ILP
• Increasing latency exacerbates ILP need
• Limited resources: a fundamental constraint
• Shared resources create more overhead
• Loop ILP extraction costs code size
• And the challenges continue...
Itanium® Architecture overcomes these fundamental challenges!
Characteristics of High-End Processors
High-end processors require significant resources and capabilities
• Intel Itanium® 2 Processor: 3 MB or 6 MB on-die cache, 264 registers, 11 issue ports (source: Intel)
• IBM Power*4: up to 1.5 MB on-die cache, 72 registers, 8 issue ports (source: IBM.com)
• Alpha EV7*: 1.75 MB on-die cache, 152 registers, 4-6 issue ports (source: HP.com)
• Sun US III*: 96 k** on-die cache, 96 registers, 4 issue ports (source: Sun.com)
**CPU connects to external 8 MB L2 cache
Intel® Itanium® Processor Family
• 2001: 800 MHz, 4 MB L3-cache, 460GX chip-set and OEM chip-sets, 180 nm
• 2002: 1 GHz, 3 MB iL3-cache, E8870 chip-set and OEM chip-sets, 180 nm
• 2003 (Madison**): 1.5 GHz, 6 MB iL3-cache, E8870 chip-set and OEM chip-sets, 130 nm
• 2004 (Madison 9M**): >1.5 GHz, 9 MB iL3-cache, E8870 chip-set and OEM chip-sets, 130 nm
• 2005 (Montecito**): >>1.5 GHz, larger L3-cache, enhanced dual-core, E8870 chip-set and OEM chip-sets, 90 nm
Common platform. **codename
All features and dates specified are targets provided for planning purposes only and are subject to change
Performance Advantage over RISC
[Figure: benchmark chart; surviving data labels: 1322 and 2119]
Source: http://www.intel.com/ebusiness/products/itanium/index.htm as of 06/30/2003
Itanium® Processor Architecture: Selected Features
• Instruction Level Parallelism (6-way)
• Large Register Files
• Automatic Register Stack Engine
• Predication
• Software Pipelining Support with Loop Control
• Hardware Register Rotation
• Sophisticated Branch Architecture
• Control & Data Speculation
• Powerful 64-bit Integer Architecture
• Advanced 82-bit Floating Point Architecture
• Multimedia Support (MMX™ Technology)
• 64-bit Addressing
• Flat Memory Model
• IA-32 Binary Execution Support
Itanium® 2 Processor Block Diagram (schematic overview)
[Figure: block diagram; iL3 cache 3-6 MB (24-way, 128 B CL)]
Itanium® 2 Memory Cache Hierarchy, Itanium® 2 Processor (1.5 GHz)
• L1I: 16 KB, 64 B CL, 1 CLK
• L1D: 16 KB, 64 B CL, 1 CLK
• L2-Cache: 256 KB, 128 B CL, 8-way, 5-7 CLKS, 48 GB/s
• L3-Cache: 3-6 MB, 128 B CL, 24-way, 14-17 CLKS, 48 GB/s
• Memory (controller): 6.4 GB/s, ~150 CLKS
3-level caching on Itanium® Architecture
• 1st level cache optimized for latency
• 2nd level cache optimized for bandwidth
• 3rd level cache optimized for size
• All integrated, non-blocking caches at full CPU frequency
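The clock-count latencies on this slide can be converted to wall-clock time as a quick sanity check. The following is a minimal sketch, assuming the 1.5 GHz frequency stated on the slide; the names `LATENCY_CLKS`, `clocks_to_ns`, and `latency_ns` are illustrative and do not come from the deck. Ranges such as 5-7 are stored as (low, high) pairs.

```python
CLOCK_HZ = 1.5e9  # 1.5 GHz, as stated on the slide

# Latencies in clocks as (low, high), copied from the slide.
LATENCY_CLKS = {
    "L1": (1, 1),
    "L2": (5, 7),
    "L3": (14, 17),
    "memory": (150, 150),  # the slide gives "~150"
}


def clocks_to_ns(clocks, clock_hz=CLOCK_HZ):
    """Convert a clock count to nanoseconds at the given frequency."""
    return clocks / clock_hz * 1e9


def latency_ns(level):
    """Return the (low, high) latency of a hierarchy level in nanoseconds."""
    low, high = LATENCY_CLKS[level]
    return clocks_to_ns(low), clocks_to_ns(high)
```

Under these assumptions a memory access of about 150 clocks is roughly 100 ns, while an L1 hit takes well under 1 ns.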
Itanium® 2 Pipelines
• Core pipeline: IPG, ROT, EXP, REN, REG, EXE, DET, WB
• FPU pipeline: FP1, FP2, FP3, FP4, WB
• L2 pipeline: L2N, L2I, L2A, L2M, L2D, L2C, L2W
Stages
• IPG: IP Generate, L1I Cache (6 inst) and TLB access
• ROT: Instruction Rotate and Buffer (6 inst)
• EXP: Expand, Port Assignment and Routing
• REN: Integer and FP Register Rename (6 inst)
• REG: Integer and FP Register File read (6)
• EXE: ALU Execute (6), L1D Cache and TLB access + L2 Cache Tag Access (4)
• DET: Exception Detect, Branch Correction
• WB: Writeback, Integer Register update
• FP1-WB: FP FMAC pipeline (2) + reg write
• L2N-L2I: L2 Queue Nominate/Issue (4)
• L2A-W: L2 Access, Rotate, Correct, Write (4)
Short 8-stage in-order main pipeline
 – In-order issue, out-of-order completion
 – Reduced branch misprediction penalties
 – Fully interlocked, no way-prediction or flush/replay mechanism
Pipelines are designed for very low latency
Large Register Set
• Integer registers GR0-GR127: bits 63..0 plus NaT bit; GR0 = 0; GR0-GR31 static (32), GR32-GR127 framed, rotating (96)
• F.P. registers FR0-FR127: bits 81..0; FR0 = +0.0, FR1 = +1.0; FR0-FR31 static (32), FR32-FR127 rotating (96)
• Branch registers BR0-BR7: bits 63..0
• Predicate registers PR0-PR63: 1 bit each; PR0 = 1; PR0-PR15 static (16), PR16-PR63 rotating (48)
Parallel Execution Units (fully pipelined)
[Figure: Itanium® 2 issue ports/units. Labels: F.P. (MAC), Integer (ALU/INT/MM), Multimedia, Load/Store (ALU/MM/MEM), Branch]
EPIC (Explicit Parallel Instruction Computing)
• Source code is translated by the compiler into instruction groups (series of bundles)
• Instruction bundles: 3 instructions each, 128 bits wide
• Up to 6 instructions executed per clock
Reference: Michael S. Schlansker, B. Ramakrishna Rau: EPIC: Explicit Parallel Instruction Computing; IEEE Computer, February 2000, pp. 37-45
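The 128-bit bundle with three instructions can be illustrated with a small packing routine. This is a sketch only: the split into a 5-bit template field in the low bits followed by three 41-bit instruction slots comes from the published Itanium architecture definition rather than from the slide, and the function names are hypothetical.

```python
TEMPLATE_BITS = 5   # template field (assumption taken from the architecture manual)
SLOT_BITS = 41      # each of the three instruction slots
SLOTS_PER_BUNDLE = 3


def pack_bundle(template, slots):
    """Pack a template and three instruction slots into one 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != SLOTS_PER_BUNDLE:
        raise ValueError("a bundle holds exactly 3 instructions")
    value = template
    for index, slot in enumerate(slots):
        if not 0 <= slot < (1 << SLOT_BITS):
            raise ValueError("each instruction must fit in 41 bits")
        value |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return value


def unpack_bundle(value):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = value & ((1 << TEMPLATE_BITS) - 1)
    mask = (1 << SLOT_BITS) - 1
    slots = [
        (value >> (TEMPLATE_BITS + index * SLOT_BITS)) & mask
        for index in range(SLOTS_PER_BUNDLE)
    ]
    return template, slots
```

The field widths add up to 5 + 3 × 41 = 128 bits, which matches the bundle width given on the slide.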
Itanium® 2 Dispersal Matrix
[Figure: matrix of first-bundle versus second-bundle templates (MII, MLI, MMI, MFI, MMF, MIB, MBB, BBB, MMB, MFB; * = hint in first bundle). Legend: possible Itanium® 2 full issue; possible Itanium® processor and Itanium® 2 full issue]
Itanium® 2 allows more compiler dispersal options
Itanium® Floating-Point Architecture: High Performance and High Precision
• Dual Fused Multiply-Add operation (FMA)
 – An efficient core computation unit
• Abundant register resources
 – 128 registers (32 static, 96 rotating)
• High-precision data computations
 – 82-bit unified internal format for all data types
• Software divide/square-root
 – High throughput achieved via pipelining
Predication: Control Flow to Data Flow
• Traditional architecture: cmp, then br selects either the "then" or the "else" block
• Itanium® Architecture: cmp sets p1 and p2; the "then" instructions run under p1 and the "else" instructions under p2
• 64 predicate registers
• Can be combined with logical ops
Removes/reduces branches and enables parallel execution
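The conversion from control flow to data flow can be modelled in a few lines. The sketch below is a toy illustration, not Itanium code: a compare produces two complementary predicates, both arms are evaluated in straight-line order, and a result is committed only when its qualifying predicate is true. The function names and the absolute-difference example are assumptions chosen for illustration.

```python
def abs_diff_branching(a, b):
    """Traditional form: a compare followed by a branch picks one arm."""
    if a > b:
        return a - b
    return b - a


def abs_diff_predicated(a, b):
    """Predicated form: both arms are issued; predicates decide which commits."""
    # cmp sets two complementary predicates, as on the slide (p1, p2).
    p1 = a > b
    p2 = not p1
    # Each entry is (qualifying predicate, operation); there is no branch.
    program = [
        (p1, lambda: a - b),  # (p1) "then" arm
        (p2, lambda: b - a),  # (p2) "else" arm
    ]
    result = None
    for predicate, operation in program:
        value = operation()   # the operation executes regardless
        if predicate:         # but only a true predicate commits the result
            result = value
    return result
```

Because the two arms no longer depend on a branch outcome, a compiler is free to schedule them in the same instruction group.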
Software Pipelining
[Figure: sequential loop versus software-pipelined loop over time, with load, compute, and store stages]
• Traditional architectures use loop unrolling
 – Results in code expansion and increased cache misses
• Itanium® processor software pipelining uses rotating registers
 – Allows overlapping execution of multiple loop instances
• Predication controls the pipeline stages
Software Pipelining (cont.)
[Figure: loop iterations overlapped across stage 1 to stage 4]
• Special loop control and branch registers, also usable for WHILE loops
• Predicate registers rotate as well and define the pipeline stages
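The overlap of loop iterations can be sketched as a small simulation. This is an illustrative model under stated assumptions, not compiler output: it uses the three stages named on the earlier slide (load, compute, store), one kernel pass per cycle, and a per-stage predicate that is true only while that stage has a valid iteration to work on. All names are hypothetical.

```python
def software_pipelined(values, compute):
    """Run load/compute/store as a 3-stage software pipeline over `values`."""
    n = len(values)
    stages = 3
    loaded = {}    # iteration -> value produced by the load stage
    computed = {}  # iteration -> value produced by the compute stage
    out = [None] * n

    # The kernel runs n + stages - 1 times: prologue, steady state, epilogue.
    for cycle in range(n + stages - 1):
        # Later stages go first so each consumes data from an earlier cycle.
        i = cycle - 2
        if 0 <= i < n:               # stage predicate for "store"
            out[i] = computed.pop(i)
        i = cycle - 1
        if 0 <= i < n:               # stage predicate for "compute"
            computed[i] = compute(loaded.pop(i))
        i = cycle
        if 0 <= i < n:               # stage predicate for "load"
            loaded[i] = values[i]
    return out
```

In the steady state one kernel pass stores iteration i-2, computes iteration i-1, and loads iteration i, so the loop body is not replicated the way it is with unrolling.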
Register Rotation
• GR32-127 and FR32-127 can rotate (specified range)
• Separate rotating register base for each set (GR, FR)
• Loop branches decrement all register rotating bases (RRB)
• Instructions contain a "virtual" register number
 – physical register # = RRB + virtual register #
• Predicate register range also rotates
[Figure: iterations i=0 to i=7; the same physical register appears under a different virtual number in each iteration]
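The renaming rule "physical register # = RRB + virtual register #" can be tried out with a toy register file. The sketch below is a simplified model: the wrap-around inside the rotating region is an added assumption so that register numbers stay in range, and the class and method names are illustrative only.

```python
class RotatingRegisterFile:
    """Toy model: registers below `first` are static, the rest rotate."""

    def __init__(self, first=32, size=96):
        self.first = first            # first rotating register (GR32 / FR32)
        self.size = size              # number of rotating registers
        self.rrb = 0                  # rotating register base
        self.physical = [0] * (first + size)

    def _map(self, virtual):
        if virtual < self.first:
            return virtual            # static registers are not renamed
        offset = (virtual - self.first + self.rrb) % self.size
        return self.first + offset    # physical = RRB + virtual, with wrap

    def write(self, virtual, value):
        self.physical[self._map(virtual)] = value

    def read(self, virtual):
        return self.physical[self._map(virtual)]

    def rotate(self):
        """What a loop branch does: decrement the rotating register base."""
        self.rrb = (self.rrb - 1) % self.size
```

After one rotation a value written through virtual register 32 is read through virtual register 33, which is how a value moves from one pipeline stage to the next without a copy instruction.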
Control & Data Speculation
• Control speculation moves loads above branches/calls
 – Original order: instr. 1, instr. 2, branch (barrier), ld r1=, use =r1
• Data speculation moves loads above possibly conflicting stores
 – Original order: instr. 1, instr. 2, st[?] (barrier), ld r1=, use =r1
Speculation reduces the impact of memory latency
Advanced Load Address Table: ALAT
• ld.a inserts entries
• Conflicting stores remove entries
 – also ld.c.clr, chk.a.clr
• Presence of entry indicates success
 – chk.a branches when no entry is found
[Figure: table of (reg#, addr) entries; ld.a reg#= inserts an entry, st[addr] removes matching entries, chk.a reg# looks the entry up]
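The ALAT protocol listed on this slide can be sketched as a small table keyed by register number. This is a deliberately simplified model: it matches addresses exactly and ignores access sizes, capacity limits, and the ld.c/.clr variants. The class, method, and function names are hypothetical.

```python
class ALAT:
    """Toy Advanced Load Address Table with exact-address matching."""

    def __init__(self):
        self.entries = {}  # register number -> address of its advanced load

    def ld_a(self, memory, reg, addr):
        """Advanced load: record (reg, addr) and return the loaded value."""
        self.entries[reg] = addr
        return memory[addr]

    def store(self, memory, addr, value):
        """Store: write memory and remove every entry for that address."""
        memory[addr] = value
        self.entries = {r: a for r, a in self.entries.items() if a != addr}

    def chk_a(self, reg):
        """Check: True if the entry survived, so the early load is valid."""
        return reg in self.entries


def load_above_store(memory, load_addr, store_addr, store_value):
    """Hoist a load above a possibly conflicting store, then check it."""
    alat = ALAT()
    r1 = alat.ld_a(memory, 1, load_addr)         # load moved above the store
    alat.store(memory, store_addr, store_value)  # may or may not conflict
    if not alat.chk_a(1):                        # no entry: run recovery code
        r1 = memory[load_addr]
    return r1
```

When the store hits a different address the early value is kept; when it hits the same address the check fails and the recovery path reloads the value.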
Itanium® vs. Itanium® 2 Assembly Code
Itanium® (3 clockticks):
  .b1_2:
  { .mmf
    (p16) ldfd   f37=[r8],8
    ...          f45=[r3],8
    (p19) fma.d  f52=f40,f48,f0 ;;
  }
  { .mmi
    (p16) ldfd   f32=[r33]
    ...          f40=[r2],8
          nop.i  0 ;;
  }
  { .mfi
    (p23) stfd   [r40]=f51
    (p20) fma.d  f48=f36,f44,f53
          nop.i  0
  }
  { .mib
    (p16) add    r32=8,r33
          nop.i  0
          br.ctop.sptk .b1_2 ;;
  }
Itanium® 2 (2 clockticks!):
  .b1_2:
  { .mfi
    (p16) ldfd   f43=[r8],8
    (p19) fma.d  f51=f46,f50,f0
          nop.i  0
  }
  { .mmf
    (p16) ldfd   f47=[r3],8
    (p23) stfd   [r32]=f56
    (p21) fma.d  f54=f37,f42,f53 ;;
  }
  { .mii
    (p16) ldfd   f32=[r33]
          nop.i  0
    ...
  }
  { .mmb
    (p16) ldfd   f37=[r2],8
    (p16) add    r32=8,r33
          br.ctop.sptk .b1_2 ;;
  }
A Simple Example
  ...
  double precision, dimension(10000) :: a, b, c, d
  do i=1,10000
    a(i)=a(i)*b(i)+c(i)*d(i)
  enddo
  ...
• DAXPY-like loop over floating-point vectors
• Can be optimized differently for Itanium® and Itanium® 2
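The loop body maps naturally onto two multiply-add operations per element. The sketch below shows that decomposition in Python; it is an illustration, not the compiler's output. The helper `fma` is a stand-in that rounds after the multiply and again after the add, so it is not a true fused operation, and evaluating c*d first is an assumption.

```python
def fma(a, b, c):
    """Stand-in for a multiply-add: a * b + c (not fused, rounds twice)."""
    return a * b + c


def update(a, b, c, d):
    """Compute a[i] = a[i]*b[i] + c[i]*d[i] with two multiply-adds per element."""
    for i in range(len(a)):
        # First multiply-add adds 0.0, i.e. it acts as a plain multiply.
        partial = fma(c[i], d[i], 0.0)
        # Second multiply-add folds in the other product.
        a[i] = fma(a[i], b[i], partial)
    return a
```

Each element therefore needs four loads, two multiply-adds, and one store, which is the same mix that appears in the assembly listing (four ldfd, two fma.d, one stfd).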
Itanium® 2 Processor Roadmap (timeline: 2002, 2003, 2004, 2005, future; process: 180 nm, 130 nm, 90 nm)
MP/DP capable
• Itanium® 2 Processor: 1 GHz, 3 MB iL3 cache
• Processor (Madison**): 1.5 GHz, 6 MB iL3 cache
• Processor (Madison 9M**): >1.5 GHz, 9 MB iL3 cache
• Montecito**: dual core, high frequency, large caches per core
• Tanglewood**: multi core
DP-only
• Low Voltage Itanium® 2 Processor (Deerfield**): 1 GHz, 1.5 MB iL3 cache, 62 W
• Low Voltage Processor (Deerfield** follow-on)
• Montecito-based Low Voltage
• Future Low Voltage processors
Potential enhancements: faster FSB/links and optimized market segment SKUs
**codename
All features and dates specified are targets provided for planning purposes only and are subject to change
New Intel® Itanium® 2 Processors
• 3rd generation Itanium® Architecture processor
• 130 nm process, 410 M transistors
• Up to 1.5 GHz frequency
• 6 GFLOPS DP-F.P. peak performance
• 6 MB integrated L3-cache (48 GB/s)
• 100% software binary compatible
• Pin-compatible to Itanium® 2 processor
• Same thermal envelope
• ~1.3-1.5x faster than Itanium® 2 1 GHz/3 MB
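The peak figures on this slide follow from simple arithmetic. The sketch below assumes two FMA units (the "Dual Fused Multiply-Add" from the floating-point slide) and counts each fused multiply-add as two floating-point operations, which is the usual convention; the function names are illustrative.

```python
def peak_gflops(clock_ghz, fma_units=2, flops_per_fma=2):
    """Peak double-precision GFLOPS = clock * FMA units * flops per FMA."""
    return clock_ghz * fma_units * flops_per_fma


def bytes_per_clock(bandwidth_gb_s, clock_ghz):
    """Cache bandwidth expressed in bytes per clock (GB taken as 10**9 bytes)."""
    return bandwidth_gb_s / clock_ghz
```

With these assumptions 1.5 GHz gives 1.5 × 2 × 2 = 6 GFLOPS, and 48 GB/s at 1.5 GHz corresponds to 32 bytes per clock.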
Available Intel® Itanium® 2 Processors: widening the deployment areas
• Max. performance: 1.5 GHz, 6 MB iL3 cache; 1.4 GHz, 4 MB iL3 cache; 1.3 GHz, 3 MB iL3 cache
• Best $/FLOP: 1.4 GHz, 1.5 MB iL3, compute-optimized DP
• Best FLOP/Watt: 1.0 GHz, 1.5 MB iL3 cache, DP low-voltage
Itanium® 2 Reliability Features
**codename
Potential Future Directions
All features and dates specified are targets provided for planning purposes only and are subject to change
Itanium® Architecture Systems
High-end Itanium® 2-based systems … >2X more than Itanium!
Wide range of choice, e.g. (2003, not drawn to scale)
• 1P/2P WS: shipping
• 2 CPUs: shipping
• 4 CPUs: shipping
• Up to 128 CPUs
Itanium® 2-based Servers: Bringing High-End Data Center Capabilities to Intel® Architecture
• Scalable to high-end multi-processing
 – 32P+ SMP systems
 – 512P+ clustered configurations
• High-end RAS
 – Intelligent platform management
 – Hardware redundancy for fault-tolerance
 – Modular and hot-plug capabilities
• Large memory capacity
 – e.g. 4P node w/48 GB
 – 512P+ system w/512 GB
• Partitioning
 – Multiple system images
 – Static/dynamic domains
• High-bandwidth, flexible I/O
 – Large qty PCI-X slots
 – Dual GbE LAN
 – Ultra320 SCSI
 – Remote I/O capabilities
(Selected examples of some high-end OEM platform capabilities. Not all capabilities found on all platforms)
Performance Scaling: Scale Right
• Scale-Out (Cluster)
• Scale-Up (SMP, ccNUMA)
OSV Support for Itanium® Architecture
[Figure: operating-system vendor logos grouped under "available" and "port to Itanium Architecture underway"; named entries: OpenVMS™, NonStop™ Kernel, Converged Enterprise UNIX*]
IA-32 Execution Layer
[Figure: today versus future. Labels: IPF code, IA-32 EL, native IPF hardware, IA-32 H/W]
Enables increased utilization of Itanium® Architecture features
Software Technologies: Comprehensive Software Toolset
• Windows* and Linux*, IA-32 and Itanium®
• Compilers (C/C++, F77/F95)
• Performance Libraries (MKL, IPP)
• Performance Analyzer (VTune)
• Threading Tools
• Intel® Developer Services (IDS)
• Intel® Early Access Program (EAP)
Software Development Tools
• SW products: Compilers, Performance Libraries, VTune™ Performance Analyzer, Intel® Threading Tools
• Developer Services: www.intel.com/ids
IA-optimized Managed Runtime Choices
• Windows* Server 2003 framework for Itanium® processor family
 – Framework includes CLR, base class libraries, ADO.NET, ASP.NET, Windows Forms
• BEA* WebLogic* and JRockit* JVM for Itanium® Processor Family
 – Shipped Technology Preview on Windows* .NET Server 2003
 – Limited availability on Red Hat* Linux 11/7/02
 – GA for both Windows* .NET Server 2003 and Red Hat: Q1'03
Itanium® Software Solution Support
High-end enterprise applications (databases, business intelligence, ERP/SCM), available or ongoing
Summary
• The economics of enterprise computing are changing.
• Intel® Itanium® Architecture addresses all needs of enterprise computing.
• Intel is playing a key role in accelerating enterprise solutions with technology leadership.
Technology Leadership
www.intel.com
Madison** Processor Features
**codename