CPU microarchitectures (2000-2018)
NPRG054 High Performance Software Development, 2016/2017
David Bednárek

Intel NetBurst Microarchitecture [2000]

Intel Core Microarchitecture Pipeline [2006]

Intel Core Microarchitecture

  • In a cycle, the CPU can (in theory) simultaneously perform:
      • Fetch: 16 B (approx. 4 instructions) from L1 instruction cache
      • Decode: 1 to 5 instructions
      • ALU: 3 simple operations (add/mul)
      • Memory load: 1 read (up to 128 bits) from L1 data cache
      • Memory store: 1 write (up to 128 bits) to L1 data cache
  • Latency (in clock cycles): the time between consuming operands and producing results
      • integer add: 1, mul: 3-5
      • FP add: 3, FP mul: 4-5
      • div: data dependent
      • integer load: 3, FP load: 4 (L1 cache)
      • store address: 3
      • store data: 2 (retirement, in-order)
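The gap between latency (FP add: 3 cycles) and per-cycle issue capacity is what makes dependency chains matter. The following is a minimal sketch, not taken from the slides: the function names `sum_chained`, `sum_unrolled4` and `self_check` are hypothetical, and the choice of four accumulators is an assumption for illustration. In `sum_chained` every addition waits for the previous one, so the loop is likely bound by add latency; in `sum_unrolled4` the four partial sums are independent, so the out-of-order core may overlap them. Any actual speedup depends on the compiler, its flags, and the CPU, and should be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One accumulator: each addition depends on the result of the previous one,
// so the additions form a single dependency chain.
double sum_chained(const std::vector<double>& v) {
    double acc = 0.0;
    for (double x : v) {
        acc += x;
    }
    return acc;
}

// Four accumulators: the four chains are independent of each other,
// so additions from different chains can be in flight at the same time.
double sum_unrolled4(const std::vector<double>& v) {
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    // Tail: fewer than four elements remain.
    for (; i < n; ++i) {
        a0 += v[i];
    }
    return (a0 + a1) + (a2 + a3);
}

// Small integers are exactly representable in double, so both summation
// orders give the same result here. For general floating-point input the
// two functions may differ in rounding.
void self_check() {
    const std::vector<double> v{1, 2, 3, 4, 5, 6, 7};
    assert(sum_chained(v) == sum_unrolled4(v));
}
```

Without flags that permit floating-point reassociation, a compiler generally has to keep the single chain in `sum_chained` as written, which is why the manual split can make a difference.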

Intel Core Microarchitecture

  • Branch prediction
      • conditions, indirect branches, call/return pairs
      • speculative execution
  • Instruction decoder
      • loop cache (simple loops up to 18 instructions)
      • conversion to micro-ops (1:1, 1:N, 2:1)
      • stack-pointer simulator
  • Renamer
      • 16 architectural integer registers mapped to 144 physical
          • similarly for FP registers
  • Out-of-order execution
      • 32 micro-ops in simultaneous execution (reservation station, RS) from a window of 96 (reorder buffer, ROB)
      • retirement: memory/register stores in-order in background
      • store forwarding: loads retrieve values from waiting stores
      • speculative loads: no waiting for waiting stores (to unknown addresses)
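Branch prediction and speculative execution pay off only when the branch outcome is predictable; on unpredictable data, mispredicted speculative work has to be discarded. The following is a minimal sketch, not taken from the slides, with hypothetical function names (`count_branchy`, `count_branchless`, `self_check`). It contrasts a data-dependent branch with a formulation that uses the comparison result as a value. This is a source-level illustration only: an optimizing compiler may generate the same machine code for both variants.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Data-dependent conditional branch in the loop body. On sorted or otherwise
// regular input the predictor is likely to guess well; on random input it may not.
std::size_t count_branchy(const std::vector<int>& v, int threshold) {
    std::size_t count = 0;
    for (int x : v) {
        if (x >= threshold) {
            ++count;
        }
    }
    return count;
}

// The comparison yields 0 or 1 and is added as data, so the source contains
// no data-dependent branch inside the loop body.
std::size_t count_branchless(const std::vector<int>& v, int threshold) {
    std::size_t count = 0;
    for (int x : v) {
        count += static_cast<std::size_t>(x >= threshold);
    }
    return count;
}

// Both variants must agree on any input; 9, 7, 5 and 8 are >= 5.
void self_check() {
    const std::vector<int> v{3, 9, 1, 7, 5, 8};
    assert(count_branchy(v, 5) == 4);
    assert(count_branchless(v, 5) == 4);
}
```

Which variant is faster depends on the data distribution and on the generated code, so it is worth inspecting the compiler output and timing both on representative input.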

Intel Nehalem Pipeline [2008]

Intel Sandy Bridge Pipeline [2011]

Intel vs. AMD architectures (realworldtech.com)

Intel Haswell Microarchitecture (2013)

Haswell (2013) vs. Sandy Bridge (2011)

Intel Skylake (2015) [wikichip.org]

AMD Zen+ (2018) [wikichip.org]
