HyperThreading Technology BokYoun Lee Computer Science Engineering Department
Hyper-Threading Technology
Bok-Youn, Lee
Computer Science & Engineering Department, Korea University, Korea
wegra@wegra.org
Processor Architecture Basics
- Pipeline
- Superscalar
- Out-of-order Execution
- Hierarchical Cache System
NetBurst Microarchitecture
- Quad-Pumped System Bus
- Advanced Transfer Cache
- Execution Trace Cache
- Hyper-Pipelining
- Rapid Execution Engine
- Streaming SIMD Extensions 2 (SSE2)
NetBurst Microarchitecture: NetBurst Block Diagram
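NetBurst Microarchitecture: Quad-Pumped System Bus
- FSB (front-side bus): the data path between the processor and the rest of the system
- 64 bits wide, 133 MHz (current)
- Quad-pumped (4 signals per clock)
- Peak bandwidth: 8 bytes * 133 MHz * 4 = 4256 MB/s (about 4.2 GB/s)
- With current main memory, a bottleneck occurs unless the memory is configured as dual channel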
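The bandwidth figure on this slide is width times clock times transfers per clock. A minimal sketch of that arithmetic follows; the function name `peak_bandwidth_mb_per_s` is a hypothetical helper, not part of any library, and MB here means 10^6 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: peak transfer rate in MB/s (10^6 bytes per second).
 * bus_width_bits must be a whole number of bytes. */
uint64_t peak_bandwidth_mb_per_s(unsigned bus_width_bits,
                                 unsigned clock_mhz,
                                 unsigned transfers_per_clock)
{
    assert(bus_width_bits % 8 == 0);
    /* bytes per transfer * million clocks per second * transfers per clock */
    return (uint64_t)(bus_width_bits / 8) * clock_mhz * transfers_per_clock;
}
```

Calling `peak_bandwidth_mb_per_s(64, 133, 4)` reproduces the quad-pumped FSB calculation, and `peak_bandwidth_mb_per_s(256, 3066, 1)` reproduces the L2 cache figure from the Advanced Transfer Cache slide.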
NetBurst Microarchitecture: Advanced Transfer Cache
- The on-die L2 cache design (introduced with the Pentium III Coppermine)
- 8-way set associative
- 128-byte cache line (64 + 64)
- 256 bits wide
- Read latency: 7 clocks
- Operating speed: processor core clock
- Bandwidth: (256 / 8) * 3066 = 98112 MB/s (about 98 GB/s)
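To illustrate what "8-way set associative" means for address lookup, the sketch below splits an address into tag, set index, and line offset. It reads the line size as 128 bytes (64 + 64). The slide does not state the cache capacity, so the 512 KB used in the usage note is purely an assumption, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a set-associative cache. */
struct cache_geometry {
    uint32_t size_bytes;  /* total capacity (assumed, not given on the slide) */
    uint32_t ways;        /* associativity, 8 on the slide */
    uint32_t line_bytes;  /* line size, read here as 128 bytes */
};

struct cache_addr {
    uint64_t tag;     /* identifies the line within its set */
    uint32_t set;     /* which set the line maps to */
    uint32_t offset;  /* byte position inside the line */
};

struct cache_addr split_address(struct cache_geometry g, uint64_t addr)
{
    assert(g.ways > 0 && g.line_bytes > 0);
    uint32_t sets = g.size_bytes / (g.ways * g.line_bytes);
    assert(sets > 0);

    uint64_t line = addr / g.line_bytes;  /* line number in memory */
    struct cache_addr out;
    out.offset = (uint32_t)(addr % g.line_bytes);
    out.set = (uint32_t)(line % sets);
    out.tag = line / sets;
    return out;
}
```

With an assumed 512 KB capacity there are 512 sets, so address 0x12345 lands in set 70 at offset 69 with tag 1. Any of the 8 ways in that set may hold the line.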
Barriers to Performance Improvement
- Increasing processor complexity
- Increasing power consumption and heat output
Barriers to Performance Improvement: Examples of Barriers
- AMD Athlon processor power consumption trend
Barriers to Performance Improvement: Examples of Barriers
- Processor cooling systems I
Barriers to Performance Improvement: Examples of Barriers
- Processor cooling systems II
Hyper-Threading Concept
- Major Goal
- Utilization of Processor Resources
Hyper-Threading Technology
- Intel NetBurst Microarchitecture Pipeline
- Resource Management (Replicated/Partitioned/Shared)
- NetBurst with Hyper-Threading
- Front End
- Out-of-order Execution
- NetBurst with Hyper-Threading (review)
- Memory Subsystem
- Two Modes of Hyper-Threading
- What was added?
- Software Optimization
Hyper-Threading Technology: NetBurst's Pipeline
- Conventional pipeline
Hyper-Threading Technology: Resource Management
- Shared
  - Caches, out-of-order execution engine
- Partitioned
  - Re-order/load/store buffers, queues
- Replicated
  - Per-processor architectural state
  - Instruction pointer, ITLB
  - Return stack predictor
  - Register renaming logic
Hyper-Threading Technology: NetBurst with Hyper-Threading
- Pipeline with Hyper-Threading applied
Hyper-Threading Technology: NetBurst with Hyper-Threading
- Each selection point allocates the resource to one of the two logical processors
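The slide says only that each selection point picks one of the two logical processors. The sketch below assumes one plausible policy: alternate when both are ready, otherwise give the slot to whichever is ready. The alternation rule and the name `select_logical_processor` are assumptions for illustration, not a description of the actual hardware arbiter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical selection point.
 * Returns 0 or 1 for the chosen logical processor, or -1 if neither
 * has work. last_selected is the processor chosen on the previous cycle. */
int select_logical_processor(int last_selected, bool ready0, bool ready1)
{
    assert(last_selected == 0 || last_selected == 1);

    if (ready0 && ready1)
        return 1 - last_selected;  /* both ready: alternate for fairness */
    if (ready0)
        return 0;                  /* only LP0 ready: it gets the slot */
    if (ready1)
        return 1;                  /* only LP1 ready: it gets the slot */
    return -1;                     /* nothing to select this cycle */
}
```

Under this assumed policy a stalled logical processor does not hold up the other one, which is the point of choosing at every stage instead of fixing ownership per cycle.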
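Hyper-Threading Technology: Memory Subsystem
- DTLB (Data Translation Lookaside Buffer)
- L1 data cache
- L2 unified cache
- L3 unified cache (Xeon MP only)
- Independent of Hyper-Threading
- The scheduler issues memory requests regardless of which logical processor they come from
What was added?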
Hyper-Threading Technology: Software Optimization
- Spin-Wait Loop
- With Hyper-Threading enabled, a spin-wait loop degrades the execution of the other logical processor.
Hyper-Threading Technology: Software Optimization
- Spin-Wait Loop with PAUSE
- Reduced power consumption
- Productive use of resources
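A minimal sketch of a spin-wait loop that issues PAUSE on each iteration, assuming a C11 compiler with `<stdatomic.h>` and, on x86, the `_mm_pause()` intrinsic from `<immintrin.h>`. The `cpu_relax` macro, the function name, and the spin bound are hypothetical choices for illustration. On non-x86 targets the macro falls back to a no-op.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_relax() ((void)0)     /* no-op fallback on other targets */
#endif

/* Hypothetical helper: spin until *flag becomes true or max_spins
 * iterations have passed. Returns true if the flag was observed set. */
bool spin_wait_until_set(atomic_bool *flag, unsigned long max_spins)
{
    assert(flag != 0);
    for (unsigned long i = 0; i < max_spins; ++i) {
        if (atomic_load_explicit(flag, memory_order_acquire))
            return true;
        /* Hint that this is a spin loop, so the waiting logical
         * processor takes fewer shared execution resources. */
        cpu_relax();
    }
    /* One final check after the last pause. */
    return atomic_load_explicit(flag, memory_order_acquire);
}
```

Another thread would set the flag with `atomic_store_explicit(flag, true, memory_order_release)`. The bound on the spin count lets a caller fall back to blocking in the operating system instead of spinning indefinitely.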
References
- [1] Intel Technology Journal, Volume 06 Issue 01, http://www.intel.com
- [2] Introduction to Hyper-Threading Technology, http://www.intel.com
- [3] Tom's Hardware Guide: Intel's New Pentium 4 Processor, http://www.tomshardware.com
- [4] ExtremeTech: Desktop Hyper-Threading, Coming Soon, http://www.extremetech.com/article2/0,3973,587842,00.asp
- [5] IBM Power4 Processor Review, http://www.digit-life.com/articles/ibmpower4
- [6] UltraSPARC III Cu Processor, http://www.sun.com/processors/UltraSPARC-III/index.html
Appendix: IBM Power4
- CMP (Chip-Multiprocessor)
- Core
  - Data cache: 32 KB
  - Instruction cache: 64 KB
- Chip (2 cores)
  - L2 cache: 1.5 MB
- Module (4 chips, 8 cores in total)
  - L3 cache: 32 MB
  - Power consumption: 500 W
Appendix: Sun UltraSPARC III Cu
- 4-way Superscalar
- 6 Execution Pipelines (2 integer, 2 FP/VIS, 1 load/store, 1 branch)
- 14-Stage, Non-Stalling Pipeline
- L1: 64 KB 4-way Data, 32 KB 4-way Instruction, 2 KB Write, 2 KB Prefetch
- L2: 1/4/8 MB External
- On-chip Memory Controller
- FSB: 150 MHz
- Power: 65 Watts at 900 MHz
AMD Opteron