Introduction: Multithreading
- Simultaneous Multithreading (SM)
- Simultaneous Subordinate Microthreading (SSMT)
- Speculative Data Driven Multithreading
Multithreaded
- Synchronization Unit (SU)
- Execution Unit (EU)
Synchronization Unit (SU)
- Responsible for issuing fire signals to the EU
- Fire signal: <fp, ip>
- Ready thread pool
Synchronization Signals
- <'spawn', fp, ip>
- <'sync', fp, ss_off>
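
The interplay between synchronization signals, the ready thread pool and fire signals can be illustrated with a toy model. This is only a sketch under stated assumptions: the slides give the signal formats but not the slot semantics, so the idea that a sync slot holds a countdown plus the ip of the thread it enables is an assumption, and the class name, `init_slot` and `next_fire_signal` are hypothetical.

```python
from collections import deque


class SynchronizationUnit:
    """Toy SU: consumes sync signals and queues fire signals <fp, ip> for the EU."""

    def __init__(self):
        # ASSUMPTION: a sync slot is addressed by (fp, ss_off) and stores
        # [remaining sync count, ip of the thread it enables].
        self.sync_slots = {}
        # Ready thread pool: fire signals waiting to be picked up by the EU.
        self.ready_pool = deque()

    def init_slot(self, fp, ss_off, count, ip):
        # Hypothetical helper: arm a slot that needs `count` sync signals.
        self.sync_slots[(fp, ss_off)] = [count, ip]

    def handle(self, signal):
        kind = signal[0]
        if kind == "spawn":
            # <'spawn', fp, ip>: the thread is ready immediately.
            _, fp, ip = signal
            self.ready_pool.append((fp, ip))
        elif kind == "sync":
            # <'sync', fp, ss_off>: count down; enable the thread at zero.
            _, fp, ss_off = signal
            slot = self.sync_slots[(fp, ss_off)]
            slot[0] -= 1
            if slot[0] == 0:
                self.ready_pool.append((fp, slot[1]))
                del self.sync_slots[(fp, ss_off)]
        else:
            raise ValueError(f"unknown signal kind: {kind!r}")

    def next_fire_signal(self):
        # Hand the oldest ready thread to the EU, or None if the pool is empty.
        return self.ready_pool.popleft() if self.ready_pool else None
```

In this model a thread armed with a count of 2 is fired only after two matching 'sync' signals arrive, while a 'spawn' goes straight to the ready pool.
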
SU Messages
- Data retrieval
  - <'get', addr_val, fp, ip>
  - <'data', addr_val, value, fp, ip>
- Synchronization
  - <'backto', fp, fut_off, ss_off, value>
- Thread migration
  - <'thread', f_size, frame_contents, ip>
Execution Unit (EU)
Simultaneous Multithreading: Machine Models
- Fine-Grain Multithreading
- SM: Full Simultaneous Issue
- SM: Single Issue, SM: Dual Issue, SM: Four Issue
- SM: Limited Connection
Machine Models: Hardware Differences
Cache Design
- Notation: [total I cache size in KB][private or shared].[D cache size][private or shared]
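
To make the cache notation concrete, here is a small parser sketch. It assumes that "private" and "shared" are abbreviated as the letters `p` and `s`, giving labels such as `64s.64p`; the slide does not spell out the abbreviations, and the names `CacheConfig` and `parse_cache_config` are hypothetical.

```python
import re
from typing import NamedTuple


class CacheConfig(NamedTuple):
    icache_kb: int
    icache_shared: bool
    dcache_kb: int
    dcache_shared: bool


# ASSUMPTION: 'p' = private per thread, 's' = shared among threads.
_LABEL = re.compile(r"^(\d+)([ps])\.(\d+)([ps])$")


def parse_cache_config(label: str) -> CacheConfig:
    """Split a label like '64s.64p' into I-cache and D-cache settings."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a cache configuration label: {label!r}")
    i_kb, i_mode, d_kb, d_mode = match.groups()
    return CacheConfig(int(i_kb), i_mode == "s", int(d_kb), d_mode == "s")
```

Under these assumptions, `parse_cache_config("64s.64p")` describes a 64 KB shared instruction cache paired with a 64 KB private data cache.
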
SM vs. Single Chip MP
Microthread Spawning
- Spawn instruction as part of the ISA (Instruction Set Architecture)
- Event spawning: a predefined event during execution of the primary thread triggers the creation of a microthread
The Necessary Ingredients
- ISA support (instructions for the micro RAM, spawn instruction, microcontext)
- Compiler and OS support
- Hardware support (micro RAM, decode/rename, microcontext support, register sets)
Optimized Execution
- BP (branch prediction): compiler synthesis of dynamic branch prediction, taken over by microthreads
- Prefetching: prefetch explicitly (via spawn instruction + routine) or implicitly (spawn on events + routine)
- Cache management: e.g. adaptive replacement policy
Example: BP
- Hybrid scheme: processor hardware predictor + microthread-based predictor (SSMT)
- SPAWN => extra work. What does one need to watch out for?
Experiments
- BP hardware predictor: 16 KB gshare
- Microthread routine: PAg implementation
- Which branches are handled by microthreads? Compiler selection heuristic
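
The hybrid scheme can be sketched in software as follows. This is a simplified illustration, not the experimental setup itself: it assumes 2-bit saturating counters (so a 16 KB gshare table holds 2^16 counters), a 10-bit local history for PAg, and a precomputed set of branch addresses standing in for the compiler selection heuristic. All class and parameter names are hypothetical.

```python
def _bump(counter, taken):
    # 2-bit saturating counter update (ASSUMPTION: 2-bit counters).
    return min(3, counter + 1) if taken else max(0, counter - 1)


class Gshare:
    """Global history XORed with the branch address indexes one counter table."""

    def __init__(self, index_bits=16):  # 2**16 two-bit counters = 16 KB
        self.mask = (1 << index_bits) - 1
        self.counters = [1] * (1 << index_bits)
        self.history = 0

    def _index(self, pc):
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        self.counters[i] = _bump(self.counters[i], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask


class PAg:
    """Per-address history registers index one global pattern table."""

    def __init__(self, history_bits=10):  # ASSUMPTION: history length
        self.mask = (1 << history_bits) - 1
        self.local_history = {}
        self.pattern_table = [1] * (1 << history_bits)

    def predict(self, pc):
        return self.pattern_table[self.local_history.get(pc, 0)] >= 2

    def update(self, pc, taken):
        h = self.local_history.get(pc, 0)
        self.pattern_table[h] = _bump(self.pattern_table[h], taken)
        self.local_history[pc] = ((h << 1) | int(taken)) & self.mask


class HybridPredictor:
    """Selected branches use the microthread (PAg) routine, the rest use gshare."""

    def __init__(self, microthread_branches):
        self.hardware = Gshare()
        self.microthread = PAg()
        self.selected = set(microthread_branches)

    def predict(self, pc):
        chosen = self.microthread if pc in self.selected else self.hardware
        return chosen.predict(pc)

    def update(self, pc, taken):
        # ASSUMPTION: the hardware predictor still observes every branch.
        self.hardware.update(pc, taken)
        if pc in self.selected:
            self.microthread.update(pc, taken)
```

A branch that strictly alternates between taken and not taken is a case where the per-address history of PAg settles quickly, which is the kind of branch a selection heuristic might hand to a microthread.
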
Speculative Data Driven Multithreading
- Microthread -> Data driven thread
- Primary thread -> Control driven thread
- SPAWN -> FORK
- Critical instructions (branches, loads)
- Prediction cache -> integration
Extracting Threads from a Program Trace (algorithm)
- Work backwards: a misbehaving instance of a critical instruction is a DDT candidate (1)
- Add more: a) such instances, or b) their memory dependences (3): I3 -> I2
- Trigger: the last added instruction
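
The backward walk can be illustrated with a simplified dependence slicer. This sketch is not the algorithm from the slides: it assumes a trace of instruction instances with explicit register and memory operands, follows both register and store-to-load dependences, and caps the slice at a hypothetical `max_len`. The names `Instr` and `extract_ddt` are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    op: str
    dst: Optional[str] = None         # destination register, if any
    srcs: Tuple[str, ...] = ()        # source registers
    load_addr: Optional[int] = None   # address read by a load
    store_addr: Optional[int] = None  # address written by a store


def extract_ddt(trace, critical_idx, max_len=8):
    """Walk backwards from a critical instruction instance.

    Returns (slice indices in program order, trigger index).
    """
    slice_idx = [critical_idx]
    needed_regs = set(trace[critical_idx].srcs)
    needed_addrs = set()
    if trace[critical_idx].load_addr is not None:
        needed_addrs.add(trace[critical_idx].load_addr)

    for i in range(critical_idx - 1, -1, -1):
        if len(slice_idx) >= max_len:
            break
        ins = trace[i]
        feeds_reg = ins.dst in needed_regs
        feeds_mem = ins.store_addr in needed_addrs
        if not (feeds_reg or feeds_mem):
            continue
        slice_idx.append(i)
        # The producer is found; now its own inputs must be produced too.
        needed_regs.discard(ins.dst)
        needed_addrs.discard(ins.store_addr)
        needed_regs.update(ins.srcs)
        if ins.load_addr is not None:
            needed_addrs.add(ins.load_addr)

    trigger = slice_idx[-1]  # last instruction added = earliest in the trace
    return sorted(slice_idx), trigger
```

For a pointer-chasing trace in which a missing load depends on an earlier load, which in turn depends on an address increment, the slice contains exactly those three instructions and the increment becomes the trigger.
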
Life cycle of a Data Driven Thread
Performance Evaluation
- 8-wide SMT
- DDTC, cloaking table
- Targeting cache misses: latencies in L1
Performance Evaluation (contd.)
- Targeting branch mispredictions
- Slides: 40